# multilevel_sampling

Multilevel Monte Carlo (MLMC) sampling method for UQ

**Specification**

*Alias:*multilevel_mc mlmc*Arguments:*None

**Child Keywords:**

Required/Optional |
Description of Group |
Dakota Keyword |
Dakota Keyword Description |
---|---|---|---|

Optional |
Sequence of seed values for multi-stage random sampling |
||

Optional |
Reuses the same seed value for multiple random sampling sets |
||

Optional |
Initial set of samples for multilevel sampling methods. |
||

Optional |
Solution mode for multilevel/multifidelity methods |
||

Optional |
Selection of sampling strategy |
||

Optional |
Add control variate weights to the recursive differences using in multilevel Monte Carlo |
||

Optional |
Enable export of multilevel/multifidelity sample sequences to individual files |
||

Optional |
Allocation statistics/target for the MLMC sample allocation. |
||

Optional |
Aggregation strategy for the QoIs statistics for problems with multiple responses in the MLMC algorithm |
||

Optional |
Stopping criterion based on relative error |
||

Optional |
Select absolute or relative convergence tolerance |
||

Optional |
Select target for MLMC sample allocation |
||

Optional |
Stopping criterion based on number of refinement iterations within the multilevel sample allocation |
||

Optional |
Stopping criterion based on maximum function evaluations |
||

Optional |
Selection of a random number generator |
||

Optional |
Identifier for model block to be used by a method |

**Description**

An adaptive sampling method that utilizes multilevel relationships within an ensemble surrogate model in order to improve efficiency through variance reduction.

In the case of a multilevel relationship, multilevel Monte Carlo methods are used to compute an optimal sample allocation per level.

*Multilevel Monte Carlo*

The Monte Carlo estimator for the mean is defined as

In a multilevel method with \(L\) levels, we replace this estimator with a telescoping sum:

This decomposition forms discrepancies for each level greater than 0, seeking reduction in the variance of the discrepancy \(Y\) relative to the variance of the original response \(Q\) . The number of samples allocated for each level ( \(N_l\) ) is based on a total cost minimization procedure that incorporates the relative cost and observed variance for each of the \(Y_\ell\) .

*Weighting and Model Selection*

Similar to MFMC (see `multifidelity_sampling`

), weighted
MLMC is a special case of generalized ACV using the ACV-RD sampling
scheme ([BLWL22]) in combination with a hierarchical DAG
(each approximation node points to the next approximation of higher
fidelity, ending with the truth model at the root node). As such, the
selection of a weighted MLMC approach is promoted to the generalized
ACV solver in order to gain access to both weighting and optional
model selection capabilities by activating the
`weighted`

and
`model_selection`

options, respectively. This results in one or more numerical
solutions for a fixed hierarchical DAG.

*Default Behavior*

The `multilevel_sampling`

method employs a number of important default settings:

By default, MLMC is unweighted and provides a convenient analytic solution for the optimal sample allocations. The option of weighted MLMC is also available as a numerical solution and can be combined with model selection as described above.

Solution mode will be

`online_pilot`

, an approach which iterates toward a set of shared samples whose size is consistent with the optimal allocation.Monte Carlo sample sets are used by default and are most consistent with the underlying theory, but this default can be overridden to use Latin hypercube sample sets using

`sample_type`

`lhs`

. Allocations remain governed by Monte Carlo variance for all cases.

*Expected Output*

The `multilevel_sampling`

method reports estimates of the first four
moments and a summary of the evaluations performed for each model
fidelity and discretization level. The method does not support any
level mappings (response, probability, reliability, generalized
reliability) at this time.

*Expected HDF5 Output*

If Dakota was built with HDF5 support and run with the
`hdf5`

keyword, this method
writes the following results to HDF5:

Sampling Moments (moments only, not confidence intervals)

In addition, the execution group has the attribute `equiv_hf_evals`

, which
records the equivalent number of high-fidelity evaluations.

*Usage Tips*

The multilevel sampling method must be used in combination with an
ordered ensemble surrogate model specification, and supports either a
sequence of model forms or a sequence of discretization levels. For
the former, each model form must provide a scalar
`solution_level_cost`

and for the latter, it is necessary to
identify the variable string descriptor that controls the resolution
levels using `solution_level_control`

as well as the associated
array of relative costs using `solution_level_cost`

. An alternative
to prescribing the cost profile is estimating it on the fly using cost
metadata that is returned from the different simulation instances.

**Examples**

The following method block

```
method,
model_pointer = 'HIERARCH'
multilevel_sampling
pilot_samples = 20 seed = 1237
max_iterations = 10
convergence_tolerance = .001
```

specifies a multilevel Monte Carlo study in combination with the model identified by the HIERARCH pointer. This model specification provides a one-dimensional hierarchy, typically defined by a single model fidelity with multiple discretization levels, but may also be provided as multiple ordered model fidelities, each with a single (or default) discretization level. An example of the former (single model fidelity with multiple discretization levels) follows:

```
model,
id_model = 'HIERARCH'
surrogate ensemble
ordered_model_fidelities = 'SIM1'
correction additive zeroth_order
model,
id_model = 'SIM1'
simulation
solution_level_control = 'N_x'
solution_level_cost = 630. 1260. 2100. 4200.
```

Refer to `dakota/test/dakota_uq_*_mlmc`

.in in the source distribution
for additional examples.