.. _method-efficient_global:

""""""""""""""""
efficient_global
""""""""""""""""


Global surrogate-based optimization, a.k.a. Efficient Global Optimization (EGO)


**Topics**


global_optimization_methods, surrogate_based_optimization_methods


.. toctree::
   :hidden:
   :maxdepth: 1

   method-efficient_global-initial_samples
   method-efficient_global-seed
   method-efficient_global-batch_size
   method-efficient_global-max_iterations
   method-efficient_global-convergence_tolerance
   method-efficient_global-x_conv_tol
   method-efficient_global-gaussian_process
   method-efficient_global-use_derivatives
   method-efficient_global-import_build_points_file
   method-efficient_global-export_approx_points_file
   method-efficient_global-model_pointer


**Specification**

- *Alias:* None

- *Arguments:* None


**Child Keywords:**

+-------------------------+--------------------+-------------------------------+---------------------------------------------+
| Required/Optional       | Description of     | Dakota Keyword                | Dakota Keyword Description                  |
|                         | Group              |                               |                                             |
+=========================+====================+===============================+=============================================+
| Optional                                     | `initial_samples`__           | Initial number of samples for               |
|                                              |                               | sampling-based methods                      |
+----------------------------------------------+-------------------------------+---------------------------------------------+
| Optional                                     | `seed`__                      | Seed of the random number generator         |
+----------------------------------------------+-------------------------------+---------------------------------------------+
| Optional                                     | `batch_size`__                | Total batch size in parallel EGO            |
+----------------------------------------------+-------------------------------+---------------------------------------------+
| Optional                                     | `max_iterations`__            | Number of iterations allowed for optimizers |
|                                              |                               | and adaptive UQ methods                     |
+----------------------------------------------+-------------------------------+---------------------------------------------+
| Optional                                     | `convergence_tolerance`__     | Expected improvement convergence tolerance  |
+----------------------------------------------+-------------------------------+---------------------------------------------+
| Optional                                     | `x_conv_tol`__                | x-convergence tolerance                     |
+----------------------------------------------+-------------------------------+---------------------------------------------+
| Optional                                     | `gaussian_process`__          | Gaussian Process surrogate model            |
+----------------------------------------------+-------------------------------+---------------------------------------------+
| Optional                                     | `use_derivatives`__           | Use derivative data to construct surrogate  |
|                                              |                               | models                                      |
+----------------------------------------------+-------------------------------+---------------------------------------------+
| Optional                                     | `import_build_points_file`__  | File containing points you wish to use to   |
|                                              |                               | build a surrogate                           |
+----------------------------------------------+-------------------------------+---------------------------------------------+
| Optional                                     | `export_approx_points_file`__ | Output file for surrogate model value       |
|                                              |                               | evaluations                                 |
+----------------------------------------------+-------------------------------+---------------------------------------------+
| Optional                                     | `model_pointer`__             | Identifier for model block to be used by a  |
|                                              |                               | method                                      |
+----------------------------------------------+-------------------------------+---------------------------------------------+

.. __: method-efficient_global-initial_samples.html
__ method-efficient_global-seed.html
__ method-efficient_global-batch_size.html
__ method-efficient_global-max_iterations.html
__ method-efficient_global-convergence_tolerance.html
__ method-efficient_global-x_conv_tol.html
__ method-efficient_global-gaussian_process.html
__ method-efficient_global-use_derivatives.html
__ method-efficient_global-import_build_points_file.html
__ method-efficient_global-export_approx_points_file.html
__ method-efficient_global-model_pointer.html


**Description**


The Efficient Global Optimization (EGO) method was first developed by
Jones, Schonlau, and Welch :cite:p:`Jon98`. In EGO,
a stochastic response surface approximation of the objective function
is built from a set of sample points evaluated with the "true" simulation.

Note that two major differences exist between our implementation
and that of :cite:p:`Jon98`. First, rather than
using a branch and bound method to find the point that maximizes the
expected improvement function (EIF), we use the DIRECT global
optimization method.

Second, we support both global optimization and global nonlinear least
squares as well as general nonlinear constraints through abstraction
and subproblem recasting.

The efficient global method is in prototype form. Apart from the choice
of Gaussian process implementation and the ``use_derivatives`` option,
we currently expose no specification controls for the underlying
Gaussian process model or for the optimization of the expected
improvement function (which is currently performed by the NCSU DIRECT
algorithm using its internal defaults).

By default, EGO uses the Surfpack GP (Kriging) model, but the Dakota
GP implementation may be selected instead through the
``gaussian_process`` keyword. If ``use_derivatives`` is specified, the
GP model will be built using available derivative data (Surfpack GP
only).

*Expected HDF5 Output*

If Dakota was built with HDF5 support and run with the
:dakkw:`environment-results_output-hdf5` keyword, this method
writes the following results to HDF5:


- :ref:`hdf5_results-best_params`
- :ref:`hdf5_results-best_obj_fncs` (when :dakkw:`responses-objective_functions` are specified)
- :ref:`hdf5_results-best_constraints`
- :ref:`hdf5_results-calibration` (when :dakkw:`responses-calibration_terms` are specified)


**Theory**


The particular response surface used is a Gaussian process (GP). The
GP allows one to calculate the prediction at a new input location as
well as the uncertainty associated with that prediction. The key idea
in EGO is to maximize the Expected Improvement Function (EIF). The EIF
is used to select the location at which a new training point should be
added to the Gaussian process model by maximizing the amount of
improvement in the objective function that can be expected by adding
that point. A point could be expected to produce an improvement in the
objective function if its predicted value is better than the current
best solution, or if the uncertainty in its prediction is such that
the probability of it producing a better solution is high. Because the
uncertainty is higher in regions of the design space with few
observations, this provides a balance between exploiting areas of the
design space that predict good solutions, and exploring areas where
more information is needed. EGO trades off this "exploitation
vs. exploration." The general procedure for these EGO-type methods is:


- Build an initial Gaussian process model of the objective function
- Find the point that maximizes the EIF. If the EIF value at this point is sufficiently small, stop.
- Evaluate the objective function at the point where the EIF is maximized. Update the Gaussian process model using this new point.
- Return to the EIF maximization step and repeat until the stopping criterion is met.
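
For reference, the expected improvement has a closed form under the GP
model :cite:p:`Jon98`. It is written here for minimization, in notation
assumed for this illustration rather than taken from the Dakota source:
:math:`\hat{f}(x)` and :math:`s(x)` are the GP predicted mean and
standard deviation, :math:`f^*` is the best objective value observed so
far, and :math:`\Phi` and :math:`\phi` are the standard normal CDF and
PDF.

.. math::

   \mathrm{EI}(x) = \left(f^* - \hat{f}(x)\right) \Phi(z) + s(x)\, \phi(z),
   \qquad z = \frac{f^* - \hat{f}(x)}{s(x)},

with :math:`\mathrm{EI}(x) = 0` where :math:`s(x) = 0`.

The procedure above can be illustrated with a minimal, self-contained
Python sketch. It is not Dakota's implementation. The 1-D objective, the
squared-exponential GP with fixed hyperparameters, the evenly spaced
initial samples, and the grid search standing in for DIRECT are all
assumptions chosen to keep the example short, and every function name
is hypothetical.

.. code-block:: python

   import math

   def objective(x):
       """Hypothetical 1-D test function standing in for the "true" simulation."""
       return math.sin(3.0 * x) + x * x - 0.7 * x

   def kernel(a, b, length=0.5):
       """Squared-exponential covariance with a fixed (assumed) length scale."""
       return math.exp(-0.5 * ((a - b) / length) ** 2)

   def cholesky(a):
       """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
       n = len(a)
       low = [[0.0] * n for _ in range(n)]
       for i in range(n):
           for j in range(i + 1):
               s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
               low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
       return low

   def forward_solve(low, b):
       """Solve low * y = b for a lower-triangular matrix low."""
       y = []
       for i, row in enumerate(low):
           y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
       return y

   def back_solve(low, y):
       """Solve transpose(low) * x = y using the lower-triangular factor."""
       n = len(y)
       x = [0.0] * n
       for i in range(n - 1, -1, -1):
           s = sum(low[k][i] * x[k] for k in range(i + 1, n))
           x[i] = (y[i] - s) / low[i][i]
       return x

   def fit_gp(xs, ys, jitter=1e-10):
       """Condition a constant-mean GP on the training data."""
       mean = sum(ys) / len(ys)
       cov = [[kernel(a, b) + (jitter if i == j else 0.0)
               for j, b in enumerate(xs)] for i, a in enumerate(xs)]
       low = cholesky(cov)
       alpha = back_solve(low, forward_solve(low, [y - mean for y in ys]))
       return mean, low, alpha

   def predict(gp, xs, x):
       """Return the GP posterior mean and standard deviation at x."""
       mean, low, alpha = gp
       k = [kernel(x, a) for a in xs]
       mu = mean + sum(ki * ai for ki, ai in zip(k, alpha))
       v = forward_solve(low, k)
       var = max(kernel(x, x) - sum(vi * vi for vi in v), 0.0)
       return mu, math.sqrt(var)

   def expected_improvement(mu, sigma, f_best):
       """Closed-form EIF for minimization; zero where the GP has no uncertainty."""
       if sigma <= 0.0:
           return 0.0
       z = (f_best - mu) / sigma
       cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
       pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
       return (f_best - mu) * cdf + sigma * pdf

   def ego(lower=-1.0, upper=2.0, n_grid=151, n_init=4, max_iter=20, tol=1e-4):
       """Run the EGO loop on a candidate grid and return the best (x, f) found."""
       grid = [lower + (upper - lower) * i / (n_grid - 1) for i in range(n_grid)]
       # Step 1: evaluate the initial samples (evenly spaced in this sketch).
       step = (n_grid - 1) // (n_init - 1)
       xs = grid[::step][:n_init]
       ys = [objective(x) for x in xs]
       for _ in range(max_iter):
           gp = fit_gp(xs, ys)
           f_best = min(ys)
           # Step 2: maximize the EIF over grid points not yet evaluated.
           best_ei, x_next = 0.0, None
           for x in grid:
               if x in xs:
                   continue
               ei = expected_improvement(*predict(gp, xs, x), f_best)
               if ei > best_ei:
                   best_ei, x_next = ei, x
           if x_next is None or best_ei < tol:
               break  # the EIF is sufficiently small, so stop
           # Step 3: evaluate the true function there and extend the GP data.
           xs.append(x_next)
           ys.append(objective(x_next))
       i_best = ys.index(min(ys))
       return xs[i_best], ys[i_best]

   x_best, f_best = ego()
   print(f"best x = {x_best:.3f}, best f = {f_best:.4f}")

Skipping grid points that were already evaluated avoids duplicate
training points, which would make the covariance matrix nearly
singular. The small ``jitter`` added to the diagonal serves the same
purpose for points that are merely close together.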