Building a Black-Box Interface to a Simulation Code

To interface a simulation code to Dakota using one of the black-box interfaces (system call or fork), pre- and post-processing functionality typically needs to be supplied (or developed) to transfer the parameters from Dakota to the simulator input file and to extract the response values of interest from the simulator’s output file for return to Dakota (see Fig. 28 and Fig. 30). This is often managed with scripting languages such as C-shell [AA86], Bourne shell [Bli96], Perl [WCS96], or Python [Mar03]. While these are common and convenient choices for simulation drivers and filters, any executable file can be used. If the user prefers, the pre- and post-processing functionality may instead be written in any compiled or interpreted programming language (C, C++, F77, F95, Java, Basic, etc.).

In the dakota/share/dakota/examples/official/drivers/bash/ directory, a simple example uses the Rosenbrock test function as a mock engineering simulation code. Several scripts have been included to demonstrate ways to accomplish the pre- and post-processing needs. Actual simulation codes will, of course, have different pre- and post-processing requirements, and as such, this example serves only to demonstrate the issues associated with interfacing a simulator. Modifications will almost surely be required for new applications.

Generic Script Interface Files

The dakota/share/dakota/examples/official/drivers/bash/ directory contains four important files: dakota_rosenbrock.in (the Dakota input file), simulator_script.sh (the simulation driver script), templatedir/ros.template (a template simulation input file), and templatedir/rosenbrock_bb.py (the Rosenbrock simulator).

The file dakota_rosenbrock.in specifies the study that Dakota will perform and, in the interface section, describes the components to be used in performing function evaluations. In particular, it identifies simulator_script.sh as its analysis_driver, as shown in Listing 16.

Listing 16 The dakota_rosenbrock.in input file.
#  DAKOTA INPUT FILE - dakota_rosenbrock.in
#  This sample Dakota input file optimizes the Rosenbrock function.
#  See p. 95 in Practical Optimization by Gill, Murray, and Wright.

method
  conmin_frcg

variables
  continuous_design = 2
    cdv_initial_point   -1.0      1.0
    cdv_lower_bounds    -2.0     -2.0
    cdv_upper_bounds     2.0      2.0
    cdv_descriptor       'x1'     'x2'

interface
  fork
    analysis_driver = 'simulator_script.sh'
    parameters_file = 'params.in'
    results_file    = 'results.out'
      file_save
    work_directory named 'workdir'
      directory_tag directory_save
      link_files = 'templatedir/*'
    deactivate active_set_vector
     
responses
  num_objective_functions = 1
  analytic_gradients
  no_hessians

The simulator_script.sh listed in Listing 17 is a short driver shell script that Dakota executes to perform each function evaluation. The names of the parameters and results files are passed to the script on its command line; they are referenced in the script by $1 and $2, respectively. The simulator_script.sh is divided into three parts: pre-processing, analysis, and post-processing.

Listing 17 The simulator_script.sh sample driver script.
#!/bin/sh
# Sample simulator to Dakota system call script

# The first and second command line arguments to the script are the
# names of the Dakota parameters and results files.
params=$1
results=$2

# --------------
# PRE-PROCESSING
# --------------
# Incorporate the parameters from Dakota into the template, writing ros.in

dprepro3 $params ros.template ros.in

# ---------
# EXECUTION
# ---------

./rosenbrock_bb.py

# ---------------
# POST-PROCESSING
# ---------------

# extract function value from the simulation output
grep 'Function value' ros.out | cut -c 18- > results.tmp
# extract gradients from the simulation output (in this case will be ignored
# by Dakota if not needed)
grep -i 'Function g' ros.out | cut -c 21- >> results.tmp
mv results.tmp $results

In the pre-processing portion, simulator_script.sh uses dprepro (invoked as dprepro3 in Listing 17), a template processing utility, to extract the current variable values from the parameters file ($1) and combine them with the simulator template input file (ros.template) to create a new input file (ros.in) for the simulator.

Dakota also provides a second, more general-purpose template processing tool named pyprepro, which provides many of the same features and functions as dprepro. Both pyprepro and dprepro permit parameter substitution and execution of arbitrary Python scripting within templates.

This pair of tools is extensively documented in the main dprepro and pyprepro section.

Note

Internal to Sandia, the APREPRO utility is also often used for pre-processing. Other preprocessing tools of potential interest are the BPREPRO utility (see [Wal]), and at Lockheed Martin sites, the JPrePost utility, a Java pre- and post-processor [Fla].

The dprepro script is used in the remainder of this section for simplicity of discussion. dprepro can read either Dakota’s aprepro parameters file format or Dakota’s standard format, so either option may be selected in the interface section of the Dakota input file.

The ros.template file in Listing 18 is a template simulation input file which contains targets for the incoming variable values, identified by the strings “{x1}” and “{x2}”. These identifiers match the variable descriptors specified in dakota_rosenbrock.in. The template input file is contrived as Rosenbrock has nothing to do with finite element analysis; it only mimics a finite element code to demonstrate the simulator template process. The dprepro script will search the simulator template input file for fields marked with curly brackets and then create a new file (ros.in) by replacing these targets with the corresponding numerical values for the variables. As shown in simulator_script.sh, the names for the Dakota parameters file ($1), template file (ros.template), and generated input file (ros.in) must be specified in the dprepro command line arguments.

Listing 18 Listing of the ros.template file
Title of Model: Rosenbrock black box
***************************************************************************
* Description:  This is an input file to the Rosenbrock black box
*               Fortran simulator.  This simulator is structured so
*               as to resemble the input/output from an engineering
*               simulation code, even though Rosenbrock's function
*               is a simple analytic function.  The node, element,
*               and material blocks are dummy inputs.
*
* Input:  x1 and x2
* Output: objective function value
*************************************************************************** 
node 1 location 0.0 0.0
node 2 location 0.0 1.0
node 3 location 1.0 0.0
node 4 location 1.0 1.0
node 5 location 2.0 0.0
node 6 location 2.0 1.0
node 7 location 3.0 0.0
node 8 location 3.0 1.0
element 1 nodes 1 3 4 2
element 2 nodes 3 5 6 4
element 3 nodes 5 7 8 6
element 4 nodes 7 9 10 8
material 1 elements 1 2
material 2 elements 3 4
variable 1 {x1}
variable 2 {x2}
end
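
The substitution performed on this template can be illustrated with a minimal Python sketch. It is not dprepro's implementation: the function name fill_template is hypothetical, and the sketch handles only plain {name} fields, not the expressions and Python scripting that dprepro and pyprepro support.

```python
import re


def fill_template(template_text, values):
    """Replace each {name} field with the value of the matching variable.

    Illustration only: real dprepro/pyprepro templates support much more
    than plain name substitution.
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Fail loudly rather than leave a stale field in the simulator input.
            raise KeyError(f"no value supplied for template field {{{name}}}")
        return repr(values[name])  # repr() keeps full double precision

    return re.sub(r"\{(\w+)\}", substitute, template_text)


# Last three lines of ros.template, filled with illustrative values.
template = "variable 1 {x1}\nvariable 2 {x2}\nend\n"
filled = fill_template(template, {"x1": 0.4664752623441543, "x2": 0.2256400864298234})
print(filled)
```

Raising an error for an unknown field is a design choice in this sketch; it prevents the simulator from silently running with an unfilled placeholder.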

The second part of the script executes the rosenbrock_bb.py simulator. The input and output file names, ros.in and ros.out, respectively, are hard-coded into the simulator. When the ./rosenbrock_bb.py simulator is executed, the values for x1 and x2 are read in from ros.in, the Rosenbrock function is evaluated, and the function value and gradient are written out to ros.out.
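
For reference, the computation that the simulator wraps is small. The following is a sketch of the standard two-variable Rosenbrock function and its analytic gradient, not the contents of rosenbrock_bb.py; the function names are illustrative.

```python
def rosenbrock(x1, x2):
    """Standard Rosenbrock function: f = 100*(x2 - x1^2)^2 + (1 - x1)^2."""
    return 100.0 * (x2 - x1**2) ** 2 + (1.0 - x1) ** 2


def rosenbrock_gradient(x1, x2):
    """Analytic gradient [df/dx1, df/dx2] of the function above."""
    df_dx1 = -400.0 * x1 * (x2 - x1**2) - 2.0 * (1.0 - x1)
    df_dx2 = 200.0 * (x2 - x1**2)
    return [df_dx1, df_dx2]


# The minimum is at (1, 1), where the function value is zero.
print(rosenbrock(1.0, 1.0), rosenbrock_gradient(1.0, 1.0))
```

Evaluated at the x1 and x2 values echoed in the ros.out listing later in this section, these expressions should reproduce the reported function value and gradient to within rounding.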

The third part performs the post-processing and writes the response results to a file for Dakota to read. Using the UNIX “grep” utility, the particular response values of interest are extracted from the raw simulator output and saved to a temporary file (results.tmp). When complete, this file is renamed $2, which in this example is always results.out. Note that moving or renaming the completed results file avoids any problems with read race conditions (see the section on system call synchronization).

Because the Dakota input file dakota_rosenbrock.in (Listing 16) specifies work_directory and directory_tag in its interface section, each invocation of simulator_script.sh runs in its own temporary directory, which Dakota has populated with the contents of directory templatedir/. Having a separate directory for each invocation simplifies the script when the Dakota input file specifies asynchronous (so that several instances of simulator_script.sh might run simultaneously), because fixed names such as ros.in, ros.out, and results.tmp can be used for intermediate files. If neither asynchronous nor file_tag is specified, and if there is no need (e.g., for debugging) to retain intermediate files having fixed names, then directory_tag offers no benefit and can be omitted. An alternative to directory_tag is to proceed as earlier versions of this chapter recommended before Dakota 5.0 introduced work_directory: add two more steps to simulator_script.sh, an initial one to create a temporary directory explicitly and copy templatedir/ to it if needed, and a final one to remove the temporary directory and any files in it.
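
The explicit create-and-remove alternative can be sketched in Python as follows. This is an illustration under stated assumptions, not the script from earlier Dakota versions: the helper name run_in_scratch_dir and the evaluate callable are hypothetical stand-ins for the driver's pre-processing, analysis, and post-processing steps.

```python
import os
import shutil
import tempfile


def run_in_scratch_dir(template_dir, evaluate):
    """Copy template_dir into a fresh temporary directory, run evaluate()
    there, and remove the directory afterwards.

    `evaluate` is a hypothetical callable that performs one evaluation
    using fixed intermediate file names in the current directory.
    """
    start_dir = os.getcwd()
    # TemporaryDirectory deletes the directory and its contents on exit,
    # which plays the role of the final clean-up step.
    with tempfile.TemporaryDirectory(prefix="workdir.") as scratch:
        work_dir = os.path.join(scratch, "run")
        shutil.copytree(template_dir, work_dir)  # initial step: populate
        os.chdir(work_dir)
        try:
            return evaluate()
        finally:
            os.chdir(start_dir)  # leave before the directory is deleted
```

With this arrangement the results file has to be written outside the scratch directory, or moved out of it before clean-up, so that Dakota can still read it.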

When work_directory is specified, Dakota adjusts the $PATH seen by simulator_script.sh so that simple program names (i.e., names not containing a slash) that are visible in Dakota’s directory will also be visible in the work directory. Relative path names (those involving an intermediate slash but not an initial one, such as ./rosenbrock_bb.py or a/bc/rosenbrock_bb) will only be visible in the work directory if a link_files or copy_files specification (see 1.5.5) has made them visible there.

As an example of the data flow on a particular function evaluation, consider evaluation 60. The parameters file for this evaluation consists of:

                    2 variables
4.664752623441543e-01 x1
2.256400864298234e-01 x2
                    1 functions
                    3 ASV_1:obj_fn
                    2 derivative_variables
                    1 DVV_1:x1
                    2 DVV_2:x2
                    0 analysis_components
                   60 eval_id

Because Listing 16 specifies file_save together with work_directory named 'workdir', directory_tag, and directory_save, this file is retained after the evaluation as workdir/workdir.60/params.in. The first portion of the file indicates that there are two variables, followed by new values for variables x1 and x2, and one response function (an objective function), followed by an active set vector (ASV) value of 3. An ASV value of 3 requests both the value (1) and the gradient (2) of the objective function for these parameters (see the Active Set Vector section). The derivative variables vector (DVV) entries that follow identify x1 and x2 as the variables with respect to which the gradient is taken.
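
A driver written in a general-purpose language can read this file directly. The following Python sketch parses the value/label layout shown above and decodes the ASV bit flags (1 = function value, 2 = gradient, 4 = Hessian). The function names are hypothetical, and the parser assumes the standard (non-aprepro) format with one "value label" pair per line, continuous variables only, and an integer eval_id.

```python
def parse_parameters(text):
    """Parse a standard-format Dakota parameters file into a dict.

    Assumes every non-blank line is '<value> <label>' and that block
    headers such as '2 variables' give the number of entries that follow.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    pos = 0

    def read_block(block_label, convert):
        nonlocal pos
        count, label = rows[pos]
        if label != block_label:
            raise ValueError(f"expected {block_label!r} block, found {label!r}")
        entries = rows[pos + 1 : pos + 1 + int(count)]
        pos += 1 + int(count)
        return [(name, convert(value)) for value, name in entries]

    params = {
        "variables": dict(read_block("variables", float)),
        "asv": [code for _, code in read_block("functions", int)],
        "dvv": [index for _, index in read_block("derivative_variables", int)],
        "analysis_components": [c for _, c in read_block("analysis_components", str)],
    }
    value, label = rows[pos]
    if label != "eval_id":
        raise ValueError(f"expected 'eval_id', found {label!r}")
    params["eval_id"] = int(value)
    return params


def decode_asv(code):
    """Translate one ASV entry into the data it requests."""
    return {"value": bool(code & 1), "gradient": bool(code & 2), "hessian": bool(code & 4)}


sample = """\
                    2 variables
4.664752623441543e-01 x1
2.256400864298234e-01 x2
                    1 functions
                    3 ASV_1:obj_fn
                    2 derivative_variables
                    1 DVV_1:x1
                    2 DVV_2:x2
                    0 analysis_components
                   60 eval_id
"""
params = parse_parameters(sample)
print(params["variables"], [decode_asv(c) for c in params["asv"]], params["eval_id"])
```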

The dprepro script reads the variable values from this file, namely 4.664752623441543e-01 and 2.256400864298234e-01 for x1 and x2 respectively, and substitutes them in the {x1} and {x2} fields of the ros.template file. The final three lines of the resulting input file (ros.in) then appear as follows:

variable 1 0.4664752623
variable 2 0.2256400864
end

where all other lines are identical to the template file. The rosenbrock_bb.py simulator reads ros.in as its input file and writes the following output to the file ros.out:

Beginning execution of model: Rosenbrock black box
Set up complete.
Reading nodes.
Reading elements.
Reading materials.
Checking connectivity...OK
*****************************************************

Input value for x1 =  4.6647526230000003e-01
Input value for x2 =  2.2564008640000000e-01

Computing solution...Done
*****************************************************
Function value =   2.9111427884970176e-01
Function gradient = [ -2.5674048470887652e+00   1.6081832124292317e+00 ]

Next, the appropriate values are extracted from the raw simulator output and returned in the results file. This post-processing is relatively trivial in this case: simulator_script.sh uses the grep and cut utilities to extract the value from the “Function value” line and the gradient from the “Function gradient” line of the ros.out output file, collects them in results.tmp, and then renames that file to $results, which is the results.out file for this evaluation. These supply the objective function value requested by the ASV, along with the gradient, which Dakota ignores if it is not needed.
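
The same extraction can be written in Python, which avoids depending on fixed character columns. This is a sketch, not part of the distributed example: the function name extract_responses is hypothetical, and the regular expressions assume the exact “Function value =” and “Function gradient = [ ... ]” lines shown in ros.out above.

```python
import re

# Decimal or exponent-notation number, e.g. -2.5674048470887652e+00
NUMBER = r"[-+]?\d+\.?\d*(?:[eE][-+]?\d+)?"


def extract_responses(sim_output):
    """Return (value_text, [gradient_texts]) from simulator output text.

    Numbers are kept as strings so that no precision is lost in transit.
    """
    value = re.search(r"^Function value\s*=\s*(" + NUMBER + ")", sim_output, re.M)
    grad = re.search(r"^Function gradient\s*=\s*\[(.*?)\]", sim_output, re.M)
    if value is None:
        raise ValueError("no 'Function value' line found in simulator output")
    gradient = re.findall(NUMBER, grad.group(1)) if grad else []
    return value.group(1), gradient


sample = (
    "Computing solution...Done\n"
    "*****************************************************\n"
    "Function value =   2.9111427884970176e-01\n"
    "Function gradient = [ -2.5674048470887652e+00   1.6081832124292317e+00 ]\n"
)
value, gradient = extract_responses(sample)
print(value)
print("[ " + " ".join(gradient) + " ]")
```

Raising an error when the value line is missing makes a failed simulation visible instead of producing an empty results file.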

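After 132 of these function evaluations, the following Dakota output shows the final solution using the rosenbrock_bb.py simulator. This sample output was produced with the npsol_sqp method, so the method-specific lines and evaluation counts may differ for the conmin_frcg method specified in Listing 16: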

Exit NPSOL - Optimal solution found.

Final nonlinear objective value =   0.1165704E-06

NPSOL exits with INFORM code = 0 (see "Interpretation of output" section in NPSOL manual)

NOTE: see Fortran device 9 file (fort.9 or ftn09)
      for complete NPSOL iteration history.

<<<<< Iterator npsol_sqp completed.
<<<<< Function evaluation summary: 132 total (132 new, 0 duplicate)
<<<<< Best parameters          =
                      9.9965861667e-01 x1
                      9.9931682203e-01 x2
<<<<< Best objective function  =
                   1.1657044253e-07
<<<<< Best evaluation ID: 130

<<<<< Iterator npsol_sqp completed.
<<<<< Single Method Strategy completed.
Dakota execution time in seconds:
  Total CPU        =       0.12 [parent =   0.116982, child =   0.003018]
  Total wall clock =    1.47497

Adapting These Scripts to Another Simulation

To adapt this approach for use with another simulator, several steps need to be performed:

  1. Create a template simulation input file by identifying the fields in an existing input file that correspond to the variables of interest and then replacing them with {} identifiers (e.g. {cdv_1}, {cdv_2}, etc.) which match the Dakota variable descriptors. Copy this template input file to a templatedir that will be used to create working directories for the simulation.

  2. Modify the dprepro arguments in simulator_script.sh to reflect names of the Dakota parameters file (previously $1), template file name (previously ros.template) and generated input file (previously ros.in). Alternatively, use APREPRO, BPREPRO, or JPrePost to perform this step (and adapt the syntax accordingly).

  3. Modify the analysis section of simulator_script.sh to replace the rosenbrock_bb.py invocation with the new simulator name and command line syntax (typically including the input and output file names).

  4. Change the post-processing section in simulator_script.sh to reflect the revised extraction process. At a minimum, this would involve changing the grep command to reflect the name of the output file, the string to search for, and the characters to cut out of the captured output line. For more involved post-processing tasks, invocation of additional tools may have to be added to the script.

  5. Modify the dakota_rosenbrock.in input file to reflect, at a minimum, updated variables and responses specifications.
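
For users who prefer a Python driver over a shell script, the simulator-specific pieces named in steps 2 through 4 can be gathered into one function call. The skeleton below is a sketch under stated assumptions: run_driver and its arguments are hypothetical names, and the commands in the trailing comment (my_sim.template, my_sim.in, ./my_simulator) are placeholders to be replaced for a real application.

```python
import os
import subprocess


def run_driver(results_file, preprocess_cmd, simulator_cmd, extract):
    """Run the three driver stages for one function evaluation.

    preprocess_cmd, simulator_cmd: argument lists passed to subprocess.run
    extract: callable returning the lines to write to the results file
    """
    subprocess.run(preprocess_cmd, check=True)  # step 2: pre-processing
    subprocess.run(simulator_cmd, check=True)   # step 3: analysis
    lines = extract()                           # step 4: post-processing

    # Write under a temporary name, then rename, so that Dakota never
    # reads a partially written results file.
    tmp = results_file + ".tmp"
    with open(tmp, "w") as fh:
        fh.writelines(line + "\n" for line in lines)
    os.replace(tmp, results_file)


# Example wiring inside a driver script (placeholder names throughout):
#   import sys
#   params, results = sys.argv[1], sys.argv[2]
#   run_driver(
#       results,
#       preprocess_cmd=["dprepro", params, "my_sim.template", "my_sim.in"],
#       simulator_cmd=["./my_simulator", "my_sim.in", "my_sim.out"],
#       extract=my_extract_function,
#   )
```

Passing check=True makes a failing pre-processor or simulator abort the evaluation rather than letting post-processing run on stale output.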

These nonintrusive interfacing approaches can be used to rapidly interface with simulation codes. While generally custom for each new application, typical interface development time is on the order of an hour or two. Thus, this approach is scalable when dealing with many different application codes. Weaknesses of this approach include the potential for loss of data precision (if care is not taken to preserve precision in pre- and post-processing file I/O), a lack of robustness in post-processing (if the data capture is too simplistic), and scripting overhead (only noticeable if the simulation time is on the order of a second or less).
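
The precision concern can be made concrete with the x1 value from the evaluation 60 walkthrough. The sketch below is illustrative only: it shows that formatting with 10 significant digits, which is consistent with the ros.in values shown earlier, discards digits, whereas 17 significant digits round-trip a double exactly.

```python
x1 = 4.664752623441543e-01  # value Dakota wrote to params.in

truncated = format(x1, ".10g")  # 10 significant digits
full = format(x1, ".17g")       # enough digits to round-trip a double

print(truncated, float(truncated) == x1)
print(full, float(full) == x1)
print("difference after truncation:", abs(float(truncated) - x1))
```

Whether a loss of this size matters depends on the study; it is most likely to be noticeable when finite-difference gradients are computed with small step sizes.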

If the application scope at a particular site is more focused and only a small number of simulation codes are of interest, then more sophisticated interfaces may be warranted. For example, the economy of scale afforded by a common simulation framework justifies additional effort in the development of a high-quality Dakota interface. In these cases, more sophisticated interfacing approaches could involve a more thoroughly developed black-box interface with robust support for a variety of inputs and outputs, or it might involve intrusive interfaces such as the direct simulation interface discussed below in the “Developing a Direct Simulation Interface” section or the “SAND Simulation Codes” section of the main page on interface coupling.

Additional Examples

A variety of additional examples of black-box interfaces to simulation codes are maintained in the dakota/share/dakota/examples/official/drivers directory.