Building a Black-Box Interface to a Simulation Code
To interface a simulation code to Dakota using one of the black-box interfaces (system call or fork), pre- and post-processing functionality typically needs to be supplied (or developed) in order to transfer the parameters from Dakota to the simulator input file and to extract the response values of interest from the simulator’s output file for return to Dakota (see Fig. 28 and Fig. 30). This is often managed through the use of scripting languages, such as C-shell [AA86], Bourne shell [Bli96], Perl [WCS96], or Python [Mar03]. While these are common and convenient choices for simulation drivers/filters, it is important to recognize that any executable file can be used. If the user prefers, the desired pre- and post-processing functionality may also be compiled or interpreted from any number of programming languages (C, C++, F77, F95, JAVA, Basic, etc.).
In the dakota/share/dakota/examples/official/drivers/bash/
directory, a simple example uses the Rosenbrock test function as
a mock engineering simulation code. Several scripts have been included
to demonstrate ways to accomplish the pre- and post-processing needs.
Actual simulation codes will, of course, have different pre- and
post-processing requirements, and as such, this example serves only to
demonstrate the issues associated with interfacing a simulator.
Modifications will almost surely be required for new applications.
Generic Script Interface Files
The dakota/share/dakota/examples/official/drivers/bash/ directory
contains four important files: dakota_rosenbrock.in (the Dakota
input file), simulator_script.sh (the simulation driver script),
templatedir/ros.template (a template simulation input file), and
templatedir/rosenbrock_bb.py (the Rosenbrock simulator).
The file dakota_rosenbrock.in specifies the study that Dakota will
perform and, in the interface section, describes the components to be
used in performing function evaluations. In particular, it identifies
simulator_script.sh as its analysis_driver, as shown in Listing 20.
# DAKOTA INPUT FILE - dakota_rosenbrock.in
# This sample Dakota input file optimizes the Rosenbrock function.
# See p. 95 in Practical Optimization by Gill, Murray, and Wright.
method
  conmin_frcg
variables
  continuous_design = 2
    cdv_initial_point -1.0 1.0
    cdv_lower_bounds -2.0 -2.0
    cdv_upper_bounds 2.0 2.0
    cdv_descriptor 'x1' 'x2'
interface
  fork
    analysis_driver = 'simulator_script.sh'
    parameters_file = 'params.in'
    results_file = 'results.out'
    file_save
    work_directory named 'workdir'
      directory_tag directory_save
      link_files = 'templatedir/*'
    deactivate active_set_vector
responses
  num_objective_functions = 1
  analytic_gradients
  no_hessians
The simulator_script.sh listed in Listing 21 is a short driver shell
script that Dakota executes to perform each function evaluation. The
names of the parameters and results files are passed to the script on
its command line; they are referenced in the script by $1 and $2,
respectively. The script is divided into three parts: pre-processing,
analysis, and post-processing.
#!/bin/sh
# Sample simulator to Dakota system call script
# The first and second command line arguments to the script are the
# names of the Dakota parameters and results files.
params=$1
results=$2
# --------------
# PRE-PROCESSING
# --------------
# Incorporate the parameters from Dakota into the template, writing ros.in
dprepro3 $params ros.template ros.in
# ---------
# EXECUTION
# ---------
./rosenbrock_bb.py
# ---------------
# POST-PROCESSING
# ---------------
# extract function value from the simulation output
grep 'Function value' ros.out | cut -c 18- > results.tmp
# extract gradients from the simulation output (in this case will be ignored
# by Dakota if not needed)
grep -i 'Function g' ros.out | cut -c 21- >> results.tmp
mv results.tmp $results
In the pre-processing portion, simulator_script.sh uses dprepro, a
template processing utility, to extract the current variable values from
the parameters file ($1) and combine them with the simulator template
input file (ros.template) to create a new input file (ros.in) for
the simulator.
Dakota also provides a second, more general-purpose template processing
tool named pyprepro, which provides many of the same features and
functions as dprepro. Both pyprepro and dprepro permit parameter
substitution and execution of arbitrary Python scripting within templates.
Both tools are documented in detail in the main dprepro and pyprepro section.
Note
Internal to Sandia, the APREPRO utility is also often used for pre-processing. Other preprocessing tools of potential interest are the BPREPRO utility (see [Wal]), and at Lockheed Martin sites, the JPrePost utility, a Java pre- and post-processor [Fla].
For simplicity, the remainder of this discussion uses dprepro. The
dprepro script can read either Dakota’s aprepro parameters file format
or Dakota’s standard format, so either option may be selected in the
interface section of the Dakota input file.
The ros.template file in Listing 22 is a template simulation input file
that contains targets for the incoming variable values, identified by
the strings “{x1}” and “{x2}”. These identifiers match the variable
descriptors specified in dakota_rosenbrock.in. The template input file
is contrived, as Rosenbrock has nothing to do with finite element
analysis; it only mimics a finite element code to demonstrate the
simulator template process. The dprepro script searches the simulator
template input file for fields marked with curly brackets and then
creates a new file (ros.in) by replacing these targets with the
corresponding numerical values for the variables. As shown in
simulator_script.sh, the names of the Dakota parameters file ($1), the
template file (ros.template), and the generated input file (ros.in)
must be specified in the dprepro command line arguments.
Title of Model: Rosenbrock black box
***************************************************************************
* Description: This is an input file to the Rosenbrock black box
* Fortran simulator. This simulator is structured so
* as to resemble the input/output from an engineering
* simulation code, even though Rosenbrock's function
* is a simple analytic function. The node, element,
* and material blocks are dummy inputs.
*
* Input: x1 and x2
* Output: objective function value
***************************************************************************
node 1 location 0.0 0.0
node 2 location 0.0 1.0
node 3 location 1.0 0.0
node 4 location 1.0 1.0
node 5 location 2.0 0.0
node 6 location 2.0 1.0
node 7 location 3.0 0.0
node 8 location 3.0 1.0
element 1 nodes 1 3 4 2
element 2 nodes 3 5 6 4
element 3 nodes 5 7 8 6
element 4 nodes 7 9 10 8
material 1 elements 1 2
material 2 elements 3 4
variable 1 {x1}
variable 2 {x2}
end
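The substitution step can be illustrated with a minimal Python sketch. This is a simplified stand-in for the idea, not the dprepro implementation: the function name fill_template is hypothetical, only plain {name} fields are handled, and the ten-significant-digit output format is an assumption made for the illustration.

```python
import re


def fill_template(template_text, values):
    """Replace each {name} field with the matching entry from `values`.

    Hypothetical helper for illustration only; dprepro itself supports
    far more (expressions, alternate delimiters, formatting control).
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Leave unknown fields untouched so the problem stays visible.
            return match.group(0)
        # Assumed output format: ten significant digits.
        return format(values[name], ".10g")

    return re.sub(r"\{(\w+)\}", substitute, template_text)


template = "variable 1 {x1}\nvariable 2 {x2}\nend\n"
params = {"x1": 4.664752623441543e-01, "x2": 2.256400864298234e-01}
filled = fill_template(template, params)
print(filled)
```

Leaving unmatched fields in place, rather than silently deleting them, makes a descriptor mismatch between the Dakota input file and the template easy to spot in the generated input file.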
The second part of the script executes the rosenbrock_bb.py simulator.
The input and output file names, ros.in and ros.out, respectively, are
hard-coded into the simulator. When the ./rosenbrock_bb.py simulator is
executed, the values for x1 and x2 are read in from ros.in, the
Rosenbrock function is evaluated, and the function value and gradient
are written out to ros.out.
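For reference, the computation at the core of such a simulator can be sketched in a few lines of Python. This uses the standard two-variable Rosenbrock function and its analytic gradient; it is not the shipped rosenbrock_bb.py, and the file reading and writing around it are omitted.

```python
def rosenbrock(x1, x2):
    """Return the Rosenbrock function value and its analytic gradient."""
    residual = x2 - x1 * x1
    value = 100.0 * residual ** 2 + (1.0 - x1) ** 2
    gradient = (
        -400.0 * x1 * residual - 2.0 * (1.0 - x1),  # d f / d x1
        200.0 * residual,                           # d f / d x2
    )
    return value, gradient


value, gradient = rosenbrock(0.4664752623, 0.2256400864)
print(f"Function value = {value:.16e}")
print(f"Function gradient = [ {gradient[0]:.16e} {gradient[1]:.16e} ]")
```

The print statements imitate the labeled output lines that the post-processing step searches for; the exact column layout of the real simulator output is not reproduced here.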
The third part performs the post-processing and writes the response
results to a file for Dakota to read. Using the UNIX “grep” utility,
the particular response values of interest are extracted from the raw
simulator output and saved to a temporary file (results.tmp). When
complete, this file is renamed $2, which in this example is always
results.out. Because the results file appears under its final name only
after it is complete, Dakota never reads a partially written file, which
avoids any problems with read race conditions (see the section on
system call synchronization).
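The same write-then-rename pattern can be sketched in Python. The function name write_results and the regular expressions are assumptions for this illustration; they target the “Function value” and “Function gradient” labels used by the example simulator and would need to change for another code.

```python
import os
import re
import tempfile


def write_results(sim_output_path, results_path):
    """Extract value and gradient, then publish them under the final name."""
    with open(sim_output_path) as f:
        text = f.read()
    value = re.search(r"Function value\s*=\s*(\S+)", text)
    gradient = re.search(r"Function gradient\s*=\s*(\[.*?\])", text)
    if value is None or gradient is None:
        raise ValueError("expected response lines not found in simulator output")

    # Write under a temporary name first, then rename, so a reader polling
    # for the results file never sees it half written.
    tmp_path = results_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(f"{value.group(1)}\n{gradient.group(1)}\n")
    os.replace(tmp_path, results_path)


with tempfile.TemporaryDirectory() as workdir:
    ros_out = os.path.join(workdir, "ros.out")
    results = os.path.join(workdir, "results.out")
    with open(ros_out, "w") as f:
        f.write(
            "Computing solution...Done\n"
            "Function value = 2.9111427884970176e-01\n"
            "Function gradient = [ -2.5674048470887652e+00 1.6081832124292317e+00 ]\n"
        )
    write_results(ros_out, results)
    with open(results) as f:
        results_text = f.read()
print(results_text)
```

Raising an error when a label is missing is a deliberate choice in this sketch: a silent empty results file is harder to diagnose than a failed evaluation.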
Because the Dakota input file dakota_rosenbrock.in (Listing 20)
specifies work_directory and directory_tag in its interface section,
each invocation of simulator_script.sh starts in its own temporary
directory, which Dakota has populated with the contents of directory
templatedir/. Having a separate directory for each invocation of
simulator_script.sh simplifies the script when the Dakota input file
specifies asynchronous (so several instances of simulator_script.sh
might run simultaneously), as fixed names such as ros.in, ros.out, and
results.tmp can be used for intermediate files. If neither asynchronous
nor file_tag is specified, and if there is no need (e.g., for debugging)
to retain intermediate files having fixed names, then directory_tag
offers no benefit and can be omitted. An alternative to directory_tag
is to proceed as earlier versions of this chapter recommended, prior to
Dakota 5.0’s introduction of work_directory: add two more steps to
simulator_script.sh, an initial one to create a temporary directory
explicitly and copy templatedir/ to it if needed, and a final one to
remove the temporary directory and any files in it.
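The manual alternative (create a scratch directory, copy the template files, run, clean up) might look like the following Python sketch. The helper name run_in_scratch_dir and the callback signature are hypothetical, and shutil.copytree with dirs_exist_ok requires Python 3.8 or later.

```python
import os
import shutil
import tempfile


def run_in_scratch_dir(template_dir, evaluate):
    """Run `evaluate(workdir)` in a fresh copy of `template_dir`, then clean up."""
    workdir = tempfile.mkdtemp(prefix="eval_")
    try:
        # Initial step: populate the scratch directory from the template.
        shutil.copytree(template_dir, workdir, dirs_exist_ok=True)
        return evaluate(workdir)
    finally:
        # Final step: remove the scratch directory and everything in it.
        shutil.rmtree(workdir, ignore_errors=True)


with tempfile.TemporaryDirectory() as template_dir:
    with open(os.path.join(template_dir, "ros.template"), "w") as f:
        f.write("variable 1 {x1}\n")

    seen = {}

    def evaluate(workdir):
        seen["workdir"] = workdir
        return sorted(os.listdir(workdir))

    files = run_in_scratch_dir(template_dir, evaluate)
print(files)
```

The try/finally block ensures the directory is removed even when the evaluation raises, which matters when many evaluations run concurrently.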
When work_directory is specified, Dakota adjusts the $PATH seen by
simulator_script.sh so that simple program names (i.e., names not
containing a slash) that are visible in Dakota’s directory will also be
visible in the work directory. Relative path names (involving an
intermediate slash but not an initial one, such as ./rosenbrock_bb.py
or a/bc/rosenbrock_bb) will only be visible in the work directory if a
link_files or copy_files specification (see 1.5.5) has made them
visible there.
As an example of the data flow on a particular function evaluation, consider evaluation 60. The parameters file for this evaluation consists of:
2 variables
4.664752623441543e-01 x1
2.256400864298234e-01 x2
1 functions
3 ASV_1:obj_fn
2 derivative_variables
1 DVV_1:x1
2 DVV_2:x2
0 analysis_components
60 eval_id
This file is called workdir/workdir.60/params.in when the
work_directory named 'workdir', file_save, and directory_save
specifications shown in Listing 20 are active. The first portion of the
file indicates that there are two variables, followed by new values for
variables x1 and x2, and one response function (an objective function),
followed by an active set vector (ASV) value of 3. This ASV value
indicates the need to return both the value and the gradient of the
objective function for these parameters (see the Active Set Vector section).
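A driver written in Python could read this file directly. The sketch below assumes exactly the standard-format layout shown above (count lines followed by value/tag pairs, ending with eval_id); the function names are hypothetical, and newer Dakota versions may append sections that this sketch does not handle. The ASV decoding uses Dakota's bit convention of 1 for the value, 2 for the gradient, and 4 for the Hessian.

```python
PARAMS_TEXT = """\
2 variables
4.664752623441543e-01 x1
2.256400864298234e-01 x2
1 functions
3 ASV_1:obj_fn
2 derivative_variables
1 DVV_1:x1
2 DVV_2:x2
0 analysis_components
60 eval_id
"""


def parse_parameters(text):
    """Parse the standard-format parameters text into a dict of sections."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    sections = {}
    i = 0
    while i < len(lines):
        first, tag = lines[i]
        if tag == "eval_id":
            sections["eval_id"] = int(first)
            i += 1
            continue
        count = int(first)
        entries = lines[i + 1 : i + 1 + count]
        # Each entry is "<value> <label>"; values are kept as strings here.
        sections[tag] = {label: value for value, label in entries}
        i += 1 + count
    return sections


def decode_asv(code):
    """Translate one ASV entry into the quantities Dakota is requesting."""
    code = int(code)
    return {
        "value": bool(code & 1),
        "gradient": bool(code & 2),
        "hessian": bool(code & 4),
    }


sections = parse_parameters(PARAMS_TEXT)
variables = {name: float(v) for name, v in sections["variables"].items()}
requested = decode_asv(sections["functions"]["ASV_1:obj_fn"])
print(variables, requested, sections["eval_id"])
```

Keeping the raw strings until the last moment avoids an unnecessary text-to-float-to-text round trip when the values are only being passed through to a template.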
The dprepro script reads the variable values from this file, namely
4.664752623441543e-01 and 2.256400864298234e-01 for x1 and x2
respectively, and substitutes them in the {x1} and {x2} fields of the
ros.template file. The final three lines of the resulting input file
(ros.in) then appear as follows:
variable 1 0.4664752623
variable 2 0.2256400864
end
where all other lines are identical to the template file. The
rosenbrock_bb.py simulator accepts ros.in as its input file and
generates the following output to the file ros.out:
Beginning execution of model: Rosenbrock black box
Set up complete.
Reading nodes.
Reading elements.
Reading materials.
Checking connectivity...OK
*****************************************************
Input value for x1 = 4.6647526230000003e-01
Input value for x2 = 2.2564008640000000e-01
Computing solution...Done
*****************************************************
Function value = 2.9111427884970176e-01
Function gradient = [ -2.5674048470887652e+00 1.6081832124292317e+00 ]
Next, the appropriate values are extracted from the raw simulator output
and returned in the results file. This post-processing is relatively
trivial in this case: simulator_script.sh uses the grep and cut
utilities to extract the values from the “Function value” and
“Function gradient” lines of the ros.out output file and save them to
$results, which is the results.out file for this evaluation. These
values provide the objective function value and gradient requested by
the ASV.
After 132 of these function evaluations, the following Dakota output
shows the final solution using the rosenbrock_bb.py simulator:
Exit NPSOL - Optimal solution found.
Final nonlinear objective value = 0.1165704E-06
NPSOL exits with INFORM code = 0 (see "Interpretation of output" section in NPSOL manual)
NOTE: see Fortran device 9 file (fort.9 or ftn09)
for complete NPSOL iteration history.
<<<<< Iterator npsol_sqp completed.
<<<<< Function evaluation summary: 132 total (132 new, 0 duplicate)
<<<<< Best parameters =
9.9965861667e-01 x1
9.9931682203e-01 x2
<<<<< Best objective function =
1.1657044253e-07
<<<<< Best evaluation ID: 130
<<<<< Iterator npsol_sqp completed.
<<<<< Single Method Strategy completed.
Dakota execution time in seconds:
Total CPU = 0.12 [parent = 0.116982, child = 0.003018]
Total wall clock = 1.47497
Adapting These Scripts to Another Simulation
To adapt this approach for use with another simulator, several steps need to be performed:
1. Create a template simulation input file by identifying the fields in an existing input file that correspond to the variables of interest and then replacing them with {} identifiers (e.g. {cdv_1}, {cdv_2}, etc.) which match the Dakota variable descriptors. Copy this template input file to a templatedir that will be used to create working directories for the simulation.
2. Modify the dprepro arguments in simulator_script.sh to reflect the names of the Dakota parameters file (previously $1), the template file (previously ros.template), and the generated input file (previously ros.in). Alternatively, use APREPRO, BPREPRO, or JPrePost to perform this step (and adapt the syntax accordingly).
3. Modify the analysis section of simulator_script.sh to replace the rosenbrock_bb.py call with the new simulator name and command line syntax (typically including the input and output file names).
4. Change the post-processing section in simulator_script.sh to reflect the revised extraction process. At a minimum, this would involve changing the grep command to reflect the name of the output file, the string to search for, and the characters to cut out of the captured output line. For more involved post-processing tasks, invocation of additional tools may have to be added to the script.
5. Modify the dakota_rosenbrock.in input file to reflect, at a minimum, updated variables and responses specifications.
These nonintrusive interfacing approaches can be used to rapidly interface with simulation codes. While generally custom for each new application, typical interface development time is on the order of an hour or two. Thus, this approach is scalable when dealing with many different application codes. Weaknesses of this approach include the potential for loss of data precision (if care is not taken to preserve precision in pre- and post-processing file I/O), a lack of robustness in post-processing (if the data capture is too simplistic), and scripting overhead (only noticeable if the simulation time is on the order of a second or less).
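The precision concern can be made concrete with a short Python check, using the x1 value from evaluation 60. The two format strings are illustrative choices, not settings taken from dprepro: ten significant digits loses information, while seventeen significant digits is enough to recover an IEEE double exactly.

```python
x1 = 4.664752623441543e-01

short_text = format(x1, ".10g")  # ten significant digits
full_text = format(x1, ".17g")   # enough digits to round-trip a double

# Compare the value read back from text with the original value.
print(short_text, float(short_text) == x1)
print(full_text, float(full_text) == x1)
```

Whether the truncated form matters depends on the study; for gradient-based optimization near convergence or for finite-difference steps, the lost digits can be comparable to the changes Dakota is trying to resolve.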
If the application scope at a particular site is more focused and only a small number of simulation codes are of interest, then more sophisticated interfaces may be warranted. For example, the economy of scale afforded by a common simulation framework justifies additional effort in the development of a high quality Dakota interface. In these cases, more sophisticated interfacing approaches could involve a more thoroughly developed black box interface with robust support of a variety of inputs and outputs, or it might involve intrusive interfaces such as the direct simulation interface discussed below in the “Developing a Direct Simulation Interface” section or the “SAND Simulation Codes” section of the main page on interface coupling.
Additional Examples
A variety of additional examples of black-box interfaces to simulation
codes are maintained in the dakota/share/dakota/examples/official/drivers
directory.