.. _method-multifidelity_sampling:

""""""""""""""""""""""
multifidelity_sampling
""""""""""""""""""""""


Multifidelity Monte Carlo sampling method for UQ



.. toctree::
   :hidden:
   :maxdepth: 1

   method-multifidelity_sampling-seed_sequence
   method-multifidelity_sampling-fixed_seed
   method-multifidelity_sampling-pilot_samples
   method-multifidelity_sampling-solution_mode
   method-multifidelity_sampling-numerical_solve
   method-multifidelity_sampling-search_model_graphs
   method-multifidelity_sampling-sample_type
   method-multifidelity_sampling-export_sample_sequence
   method-multifidelity_sampling-convergence_tolerance
   method-multifidelity_sampling-max_iterations
   method-multifidelity_sampling-max_function_evaluations
   method-multifidelity_sampling-rng
   method-multifidelity_sampling-model_pointer


**Specification**

- *Alias:* multifidelity_mc mfmc 

- *Arguments:* None


**Child Keywords:**

+-------------------------+--------------------+------------------------------+-----------------------------------------------+
| Required/Optional       | Description of     | Dakota Keyword               | Dakota Keyword Description                    |
|                         | Group              |                              |                                               |
+=========================+====================+==============================+===============================================+
| Optional                                     | `seed_sequence`__            | Sequence of seed values for multi-stage       |
|                                              |                              | random sampling                               |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `fixed_seed`__               | Reuses the same seed value for multiple       |
|                                              |                              | random sampling sets                          |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `pilot_samples`__            | Initial set of samples for                    |
|                                              |                              | multilevel/multifidelity sampling methods.    |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `solution_mode`__            | Solution mode for multilevel/multifidelity    |
|                                              |                              | methods                                       |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `numerical_solve`__          | Specify the situations where numerical        |
|                                              |                              | optimization is used for MFMC sample          |
|                                              |                              | allocation                                    |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `search_model_graphs`__      | Perform a search over admissible model        |
|                                              |                              | relationships for a given model ensemble      |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `sample_type`__              | Selection of sampling strategy                |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `export_sample_sequence`__   | Enable export of multilevel/multifidelity     |
|                                              |                              | sample sequences to individual files          |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `convergence_tolerance`__    | Stopping criterion based on relative error    |
|                                              |                              | reduction                                     |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `max_iterations`__           | Number of iterations allowed for optimizers   |
|                                              |                              | and adaptive UQ methods                       |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `max_function_evaluations`__ | Stopping criterion based on maximum function  |
|                                              |                              | evaluations                                   |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `rng`__                      | Selection of a random number generator        |
+----------------------------------------------+------------------------------+-----------------------------------------------+
| Optional                                     | `model_pointer`__            | Identifier for model block to be used by a    |
|                                              |                              | method                                        |
+----------------------------------------------+------------------------------+-----------------------------------------------+

.. __: method-multifidelity_sampling-seed_sequence.html
__ method-multifidelity_sampling-fixed_seed.html
__ method-multifidelity_sampling-pilot_samples.html
__ method-multifidelity_sampling-solution_mode.html
__ method-multifidelity_sampling-numerical_solve.html
__ method-multifidelity_sampling-search_model_graphs.html
__ method-multifidelity_sampling-sample_type.html
__ method-multifidelity_sampling-export_sample_sequence.html
__ method-multifidelity_sampling-convergence_tolerance.html
__ method-multifidelity_sampling-max_iterations.html
__ method-multifidelity_sampling-max_function_evaluations.html
__ method-multifidelity_sampling-rng.html
__ method-multifidelity_sampling-model_pointer.html



**Description**


An adaptive sampling method that utilizes multifidelity
relationships in order to improve efficiency through variance reduction.

Multifidelity sampling is a recursive sampling scheme for which model
ordering is important.  In the case of an ensemble surrogate model
with more than two model instances (either in terms of model forms or
resolutions or both), the multi-model approach of :cite:p:`peherstorfer2016optimal`
is supported for which all model instances can be integrated into the
scheme.  In the special case of two model instances, this collapses to
the approach of :cite:p:`Ng2014`.  The approach can be used with
a model form sequence, a resolution level sequence, or a combination of
both (all specified form/resolution combinations will be enumerated).

*Control Variate Monte Carlo*

In the case of two model instances (low fidelity denoted as LF and
high fidelity denoted as HF), we employ a control variate approach as
described in :cite:p:`Ng2014`:


.. math::  \hat{Q}_{HF}^{CV} = \hat{Q}_{HF}^{MC}
   - \beta (\hat{Q}_{LF}^{MC} - \mathbb{E}[Q_{LF}]) 

As opposed to the traditional control variate approach, we do not know
:math:`\mathbb{E}[Q_{LF}]`  precisely, but rather we estimate it more
accurately than :math:`\hat{Q}_{LF}^{MC}`  based on a sampling increment
applied to the LF model.  This sampling increment is based on
a total cost minimization procedure that incorporates the relative LF
and HF costs and the observed Pearson correlation coefficient
:math:`\rho_{LH}`  between :math:`Q_{LF}`  and :math:`Q_{HF}` .  The
coefficient :math:`\beta` is then determined from the observed LF-HF
covariance and LF variance.

*Multifidelity Monte Carlo*

This approach can be extended to a sequence of low-fidelity
approximations using a recusive sampling approach as in
:cite:p:`peherstorfer2016optimal`.


.. math::  \hat{Q}_{HF}^{CV} = \hat{Q}_{HF}^{MC}
   - \sum_{i=1}^M \beta_i (\hat{Q}_{LF_i}^{MC} - \mathbb{E}[Q_{LF_i}]) 

In this case, the variance in the estimate of the :math:`i^{th}` control
mean is reduced by the :math:`(i+1)^{th}` control variate, such that the
variance reduction is limited by the case of an exact estimate of the
first control mean (referred to as OCV-1 in :cite:p:`GORODETSKY2020109257`).

*Model Selection*

Similar to weighted MLMC (see
:dakkw:`method-multilevel_sampling-weighted`), MFMC is a special case
of generalized ACV (:cite:p:`Bomarito2022`) using the ACV-MF sampling
scheme in combination with a hierarchical DAG (each approximation node
points to the next approximation of higher fidelity, ending with the
truth model at the root node). As such, the MFMC approach can be
promoted to the generalized ACV solver in order to gain access to its
model selection capabilities.  Activating the ``model_selection``
option results in a set of numerical solutions that will enumerate
model combinations for a fixed hierarchical DAG.


*Default Behavior*

The ``multifidelity_sampling`` method employs a number of important default settings:

* The DAG defining control variate pairings is a "hierarchical" DAG where each approximation node points to the next approximation of higher fidelity, ending with the truth model at the root node.  For other DAG options, refer to the approximate control variate method (peer DAG by default) and it's option to ``search_model_graphs`` (enumerates all or some subset of the admissible DAGs).

* If the model QoI are well-ordered, then the analytic solution from :cite:p:`peherstorfer2016optimal` will be used.  If a numerical solution is instead used, whether due to an override specification or due to detection of model misordering, the numerical solver will be ``global_local`` by default, starting with the DIRECT global solver and proceeding to available local solvers (SQP and NIP) in competition.  The numerical solution reorders models on the fly in order to enforce the required ordering constraints for computing the MFMC estimator variance.

* Solution mode will be ``online_pilot``, an approach which iterates toward a set of shared samples whose size is consistent with the optimal allocation

* Monte Carlo sample sets are used by default and are most consistent with the underlying theory, but this default can be overridden to use Latin hypercube sample sets using ``sample_type`` ``lhs``.  Allocations remain governed by Monte Carlo variance for all cases.


*Expected Output*

The ``multifidelity_sampling`` method reports estimates of the first four
moments and a summary of the evaluations performed for each model
fidelity and discretization level.  The method does not support any
level mappings (response, probability, reliability, generalized
reliability) at this time.

*Expected HDF5 Output*

If Dakota was built with HDF5 support and run with the
:dakkw:`environment-results_output-hdf5` keyword, this method
writes the following results to HDF5:


- :ref:`hdf5_results-sampling_moments` (moments only, not confidence intervals)

In addition, the execution group has the attribute ``equiv_hf_evals``, which
records the equivalent number of high-fidelity evaluations.

*Usage Tips*

The ``multifidelity_sampling`` method is used in combination with an
ensemble model specification for a model form sequence, a
discretization level sequence, or both.  For a model form sequence,
each model must provide a scalar ``solution_level_cost``.  For a
discretization level sequence, it is necessary to identify the
variable string descriptor that controls the resolution levels using
``solution_level_control`` as well as the associated array of relative
costs using ``solution_level_cost``.  An alternative to prescribing
the cost profile is estimating it on the fly using cost metadata that
is returned from the different simulation instances.




**Examples**


We provide an example of a multifidelity Monte Carlo study using an
ensemble model specification employing multiple approximations.

The following method block:

.. code-block::

    method,
     model_pointer = 'NONHIER'
     multifidelity_sampling
       pilot_samples = 20 seed = 1237
       max_iterations = 10
       convergence_tolerance = .001

specifies MFMC in combination with the model identified by the
NONHIER pointer.

This NONHIER model specification provides a truth model and a set of
unordered approximation models, each with a single (or default)
discretization level:

.. code-block::

    model,
     id_model = 'NONHIER'
     surrogate ensemble
       truth_model = 'HF'
       unordered_model_fidelities = 'LF1' 'LF2'
    
    model,
     id_model = 'LF1'
     interface_pointer = 'LF1_INT'
     simulation
       solution_level_cost = 0.01
    
    model,
     id_model = 'LF2'
     interface_pointer = 'LF2_INT'
     simulation
       solution_level_cost = 0.1
    
    model,
     id_model = 'HF'
     interface_pointer = 'HF_INT'
     simulation
       solution_level_cost = 1.


Refer to ``dakota/test/dakota_uq_*_cvmc``.in and
``dakota/test/dakota_uq_*_mfmc``.in in the source distribution
for additional examples.