.. _method-sampling:

""""""""
sampling
""""""""


Randomly samples variables according to their distributions


**Topics**


uncertainty_quantification, sampling


.. toctree::
   :hidden:
   :maxdepth: 1

   method-sampling-samples
   method-sampling-seed
   method-sampling-fixed_seed
   method-sampling-sample_type
   method-sampling-refinement_samples
   method-sampling-d_optimal
   method-sampling-variance_based_decomp
   method-sampling-backfill
   method-sampling-principal_components
   method-sampling-wilks
   method-sampling-std_regression_coeffs
   method-sampling-tolerance_intervals
   method-sampling-final_moments
   method-sampling-response_levels
   method-sampling-probability_levels
   method-sampling-reliability_levels
   method-sampling-gen_reliability_levels
   method-sampling-distribution
   method-sampling-rng
   method-sampling-model_pointer


**Specification**

- *Alias:* nond_sampling 

- *Arguments:* None


**Child Keywords:**

+-------------------------+--------------------+----------------------------+---------------------------------------------+
| Required/Optional       | Description of     | Dakota Keyword             | Dakota Keyword Description                  |
|                         | Group              |                            |                                             |
+=========================+====================+============================+=============================================+
| Optional                                     | `samples`__                | Number of samples for sampling-based        |
|                                              |                            | methods                                     |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `seed`__                   | Seed of the random number generator         |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `fixed_seed`__             | Reuses the same seed value for multiple     |
|                                              |                            | random sampling sets                        |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `sample_type`__            | Selection of sampling strategy              |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `refinement_samples`__     | Performs an incremental Latin Hypercube     |
|                                              |                            | Sampling (LHS) study                        |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `d_optimal`__              | Generate a D-optimal sampling design        |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `variance_based_decomp`__  | Activates global sensitivity analysis based |
|                                              |                            | on decomposition of response variance into  |
|                                              |                            | contributions from variables                |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `backfill`__               | Ensures that the samples of discrete        |
|                                              |                            | variables with finite support are unique    |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `principal_components`__   | Activates principal components analysis of  |
|                                              |                            | the response matrix of N samples * L        |
|                                              |                            | responses.                                  |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `wilks`__                  | Number of samples for random sampling using |
|                                              |                            | Wilks statistics                            |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `std_regression_coeffs`__  | Output Standardized Regression Coefficients |
|                                              |                            | and R^2 for samples                         |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `tolerance_intervals`__    | Computes the double sided tolerance         |
|                                              |                            | interval equivalent normal distribuion.     |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `final_moments`__          | Output moments of the specified type and    |
|                                              |                            | include them within the set of final        |
|                                              |                            | statistics.                                 |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `response_levels`__        | Values at which to estimate desired         |
|                                              |                            | statistics for each response                |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `probability_levels`__     | Specify probability levels at which to      |
|                                              |                            | estimate the corresponding response value   |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `reliability_levels`__     | Specify reliability levels at which the     |
|                                              |                            | response values will be estimated           |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `gen_reliability_levels`__ | Specify generalized relability levels at    |
|                                              |                            | which to estimate the corresponding         |
|                                              |                            | response value                              |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `distribution`__           | Selection of cumulative or complementary    |
|                                              |                            | cumulative functions                        |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `rng`__                    | Selection of a random number generator      |
+----------------------------------------------+----------------------------+---------------------------------------------+
| Optional                                     | `model_pointer`__          | Identifier for model block to be used by a  |
|                                              |                            | method                                      |
+----------------------------------------------+----------------------------+---------------------------------------------+

.. __: method-sampling-samples.html
__ method-sampling-seed.html
__ method-sampling-fixed_seed.html
__ method-sampling-sample_type.html
__ method-sampling-refinement_samples.html
__ method-sampling-d_optimal.html
__ method-sampling-variance_based_decomp.html
__ method-sampling-backfill.html
__ method-sampling-principal_components.html
__ method-sampling-wilks.html
__ method-sampling-std_regression_coeffs.html
__ method-sampling-tolerance_intervals.html
__ method-sampling-final_moments.html
__ method-sampling-response_levels.html
__ method-sampling-probability_levels.html
__ method-sampling-reliability_levels.html
__ method-sampling-gen_reliability_levels.html
__ method-sampling-distribution.html
__ method-sampling-rng.html
__ method-sampling-model_pointer.html


**Description**


This method generates parameter values by drawing samples from the
specified uncertain variable probability distributions. The
computational model is executed over all generated parameter values to
compute the responses for which statistics are computed. The
statistics support sensitivity analysis and uncertainty
quantification.

*Default Behavior*

By default, ``sampling`` methods operate on aleatory and epistemic
uncertain variables.  The types of variables can be restricted or
expanded (to include design or state variables) through use of the
``active`` keyword in the :dakkw:`variables` block in the Dakota input
file.  If continuous design and/or state variables are designated as
active, the sampling algorithm will treat them as parameters with
uniform probability distributions between their upper and lower
bounds.  Refer to :ref:`topic-variable_support` for additional
information on supported variable types, with and without correlation.

The following keywords change how the samples are selected:

- sample_type
- fixed_seed
- rng
- samples
- seed
- variance_based_decomp

*Expected Outputs*

As a default, Dakota provides correlation analyses when running LHS.
Correlation tables are printed with the simple, partial, and rank
correlations between inputs and outputs. These can be useful to get a
quick sense of how correlated the inputs are to each other, and how
correlated various outputs are to inputs. ``variance_based_decomp`` is
used to request more sensitivity information, with additional cost.

Additional statistics can be computed from the samples using the following
keywords:

- ``response_levels``
- ``reliability_levels``
- ``probability_levels``
- ``gen_reliability_levels``

``response_levels`` computes statistics at the specified response value.
The other three allow the specification of the statistic value, and will
estimate the corresponding response value.

``distribution`` is used to specify whether the statistic values are
from cumulative or complementary cumulative functions.

*Expected HDF5 Output*

If Dakota was built with HDF5 support and run with the
:dakkw:`environment-results_output-hdf5` keyword, this method
writes the following results to HDF5:


* When :dakkw:`method-sampling-variance_based_decomp` is enabled  
   * :ref:`hdf5_results-vbd`

* For aleatory UQ studies  
   * :ref:`hdf5_results-pdf`
   * :ref:`hdf5_results-level_mappings`
   * :ref:`hdf5_results-sampling_moments`
   * :ref:`hdf5_results-correlations`

* For epistemic UQ studies
   * :ref:`hdf5_results-extreme_responses`
   * :ref:`hdf5_results-correlations`

*Usage Tips*

``sampling`` is a robust approach to doing sensitivity analysis and
uncertainty quantification that can be applied to any problem.  It
requires more simulations than newer, advanced methods.  Thus, an
alternative may be preferable if the simulation is computationally
expensive.

**Active Variables:** By default sampling generates samples only for
the uncertain variables, and treats any design or state variables as
constants.  However, if :dakkw:`variables-active`
:dakkw:`variables-active-all` is specified sampling will be performed
over all variables, including uncertain, design, and state.  In this
case, the sampling algorithm will treat any continuous design or
continuous state variables as parameters with uniform probability
distributions between their upper and lower bounds.

This is similar to the behavior of the design of experiments methods,
since they will also generate samples over all continuous design,
uncertain, and state variables in the variables specification.
However, the design of experiments methods will treat all variables as
being uniformly distributed between their upper and lower bounds,
whereas the sampling method will sample the uncertain variables within
their specified probability distributions. The other ``active``
options can enable sample over other subsets of variables.


**Examples**


.. code-block::

    # tested on Dakota 6.0 on 140501
    
    environment
      tabular_data
        tabular_data_file = 'Sampling_basic.dat'
    
    method
      sampling
        sample_type lhs
        samples = 20
    
    model
      single
    
    variables
      active uncertain
      uniform_uncertain = 2
        descriptors  =   'input1'     'input2'
        lower_bounds =  -2.0     -2.0
        upper_bounds =   2.0      2.0
      continuous_state = 1
        descriptors =   'constant1'
        initial_state = 100
    
    interface
      analysis_drivers 'text_book'
        fork
    
    responses
      response_functions = 1
      no_gradients
      no_hessians


This example illustrates a basic sampling Dakota input file.


- LHS is used instead of purely random sampling.
- The default random number generator is used.
- Without a ``seed`` specified, this will not be reproducable
- In the ``variables`` block, two types of variables are used
- Only the uncertain variables are varied, this is the default     behavior, and is also specified by the ``active`` keyword, w/ the     ``uncertain`` option


**FAQ**


*Q:* Do I need to keep the LHS* and S4 files?
*A:* No