Surrogate-Based Local Minimization
A generally-constrained nonlinear programming problem takes the form
where
Original Objective |
Lagrangian |
Augmented Lagrangian |
|
---|---|---|---|
No constraints |
inappropriate |
acceptable |
TRAL |
Linearized constraints |
acceptable |
SQP-like |
acceptable |
Original constraints |
Direct surrogate |
acceptable |
IPTRSAO |
Initial approaches to nonlinearly-constrained SBO optimized an approximate merit function which incorporated the nonlinear constraints [ALG+00, RRW98]:
where the surrogate merit function is denoted as
While this approach does address infeasible iterates, it still shares the feature that the surrogate merit function may reflect inaccurate relative weightings of the objective and constraints prior to convergence of the Lagrange multipliers and penalty parameters. That is, one may benefit from more feasible intermediate iterates, but the process may still be slow to converge to optimality. The concept of this approach is similar to that of SQP-like SBO approaches [ALG+00] which use linearized constraints:
in that the primary concern is minimizing a composite merit function of the objective and constraints, but under the restriction that the original problem constraints may not be wildly violated prior to convergence of Lagrange multiplier estimates. Here, the merit function selection of the Lagrangian function (row 2, column 2 in Table 20; see also Eq. (247)) is most closely related to SQP, which includes the use of first-order Lagrange multiplier updates (Eq. (253)) that should converge more rapidly near a constrained minimizer than the zeroth-order updates (Eqs. (250) and (251)) used for the augmented Lagrangian.
All of these previous constrained SBO approaches involve a recasting of the approximate subproblem objective and constraints as a function of the original objective and constraint surrogates. A more direct approach is to use a formulation of:
This approach has been termed the direct surrogate approach since it optimizes surrogates of the original objective and constraints (row 3, column 1 in Table 20) without any recasting. It is attractive both from its simplicity and potential for improved performance, and is the default approach taken in Dakota. Other Dakota defaults include the use of a filter method for iterate acceptance, an augmented Lagrangian merit merit function), Lagrangian hard convergence assessment), and no constraint relaxation.
While the formulation of Eq. (240) (and others from row 1 in Table 20) can suffer from infeasible intermediate iterates and slow convergence to constrained minima, each of the approximate subproblem formulations with explicit constraints (Eqs. (241)-(243), and others from rows 2-3 in Table 20) can suffer from the lack of a feasible solution within the current trust region. Techniques for dealing with this latter challenge involve some form of constraint relaxation. Homotopy approaches [PerezEaR04, PerezRW04] or composite step approaches such as Byrd-Omojokun [Omo89], Celis-Dennis-Tapia [CDT85], or MAESTRO [ALG+00] may be used for this purpose (see Constraint relaxation).
After each of the
The formulation in Eq. (243) may also form a merit function for computing the trust region ratio; however, the omission of this merit function from explicit use in the approximate optimization cycles can lead to synchronization problems with the optimizer.
Once computed, the value for
Ratio Value |
Surrogate Accuracy |
Iterate Acceptance |
Trust Region Sizing |
---|---|---|---|
poor |
reject step |
shrink |
|
marginal |
accept step |
shrink |
|
moderate |
accept step |
retain |
|
good |
accept step |
expand |
Iterate acceptance logic

Fig. 83 Illustration of slanting filter
When a surrogate optimization is completed and the approximate solution has been validated, then the decision must be made to either accept or reject the step. The traditional approach is to base this decision on the value of the trust region ratio, as outlined previously in Table 21. An alternate approach is to utilize a filter method [FLT02], which does not require penalty parameters or Lagrange multiplier estimates. The basic idea in a filter method is to apply the concept of Pareto optimality to the objective function and constraint violations and only accept an iterate if it is not dominated by any previous iterate. Mathematically, a new iterate is not dominated if at least one of the following:
is true for all
The use of a filter method is compatible with any of the SBO formulations in Eqs. (240)-(243).
Merit functions
The merit function
The penalty function employed in this paper uses a quadratic penalty with the penalty schedule linked to SBO iteration number
The adaptive penalty function is identical in form to
Eq. (245), but adapts
The Lagrangian merit function is
for which the Lagrange multiplier estimation is discussed in Convergence Assessment. Away from the optimum, it is possible for the least squares estimates of the Lagrange multipliers for active constraints to be zero, which equates to omitting the contribution of an active constraint from the merit function. This is undesirable for tracking SBO progress, so usage of the Lagrangian merit function is normally restricted to approximate subproblems and hard convergence assessments.
The augmented Lagrangian employed in this paper follows the sign conventions described in [Van84]
where
The updating of multipliers and penalties is carefully orchestrated [CGT00] to drive reduction in constraint violation of the iterates. The penalty updates can be more conservative than in Eq. (246), often using an infrequent application of a constant multiplier rather than a fixed exponential progression.
Convergence assessment
To terminate the SBO process, hard and soft convergence metrics are monitored. It is preferable for SBO studies to satisfy hard convergence metrics, but this is not always practical (e.g., when gradients are unavailable or unreliable). Therefore, simple soft convergence criteria are also employed which monitor for diminishing returns (relative improvement in the merit function less than a tolerance for some number of consecutive iterations).
To assess hard convergence, one calculates the norm of the projected gradient of a merit function whenever the feasibility tolerance is satisfied. The best merit function for this purpose is the Lagrangian merit function from Eq. (247). This requires a least squares estimation for the Lagrange multipliers that best minimize the projected gradient:
where gradient portions directed into active global variable bounds have been removed. This can be posed as a linear least squares problem for the multipliers:
where
Constraint relaxation
The goal of constraint relaxation is to achieve efficiency through the balance of feasibility and optimality when the trust region restrictions prevent the location of feasible solutions to constrained approximate subproblems (Eqs. (241)-(243), and other formulations from rows 2-3 in Table 20). The SBO algorithm starting from infeasible points will commonly generate iterates which seek to satisfy feasibility conditions without regard to objective reduction [PerezEaR04].
One approach for achieving this balance is to use relaxed constraints when iterates are infeasible with respect to the surrogate constraints. We follow Perez, Renaud, and Watson [PerezRW04], and use a global homotopy mapping the relaxed constraints and the surrogate constraints. For formulations in Eqs. (241) and (243) (and others from row 3 Table 20), the relaxed constraints are defined from
For Eq. (242) (and others from row 2 in
Table 20), the original surrogate constraints
in place of the corresponding subproblems in
Eqs. (241)-(243).
Alternatively, since the relaxation terms are constants for the
At the start of the SBO algorithm,
starting at
The adjustment parameter
After

Fig. 84 Illustration of SBO iterates using surrogate (red) and relaxed(blue) constraints.
Fig. 84 illustrates the SBO
algorithm on a two-dimensional problem with one inequality constraint
starting from an infeasible point,
The behavior illustrated in
Fig. 84 is an example where
using the relaxed constraints over the surrogate constraints may improve
the overall performance of the SBO algorithm by reducing the number of
iterations performed. This improvement comes at the cost of solving the
minimization subproblem in Eq. (257), which can
be significant in some cases (i.e., when the cost of evaluating