.. _responses-calibration_terms-calibration_data_file:

"""""""""""""""""""""
calibration_data_file
"""""""""""""""""""""

Supply scalar calibration data only

.. toctree::
   :hidden:
   :maxdepth: 1

   responses-calibration_terms-calibration_data_file-custom_annotated
   responses-calibration_terms-calibration_data_file-annotated
   responses-calibration_terms-calibration_data_file-freeform
   responses-calibration_terms-calibration_data_file-num_experiments
   responses-calibration_terms-calibration_data_file-num_config_variables
   responses-calibration_terms-calibration_data_file-experiment_variance_type

**Specification**

- *Alias:* least_squares_data_file

- *Arguments:* STRING

- *Default:* none

**Child Keywords:**

+-------------------------+--------------------+------------------------------+---------------------------------------------+
| Required/Optional       | Description of     | Dakota Keyword               | Dakota Keyword Description                  |
|                         | Group              |                              |                                             |
+=========================+====================+==============================+=============================================+
| Optional (Choose One)   | Tabular Format     | `custom_annotated`__         | Selects custom-annotated tabular file       |
|                         |                    |                              | format for experiment data                  |
|                         |                    +------------------------------+---------------------------------------------+
|                         |                    | `annotated`__                | Selects annotated tabular file format for   |
|                         |                    |                              | experiment data                             |
|                         |                    +------------------------------+---------------------------------------------+
|                         |                    | `freeform`__                 | Selects free-form tabular file format for   |
|                         |                    |                              | experiment data                             |
+-------------------------+--------------------+------------------------------+---------------------------------------------+
| Optional                                     | `num_experiments`__          | Add context to data: number of different    |
|                                              |                              | experiments                                 |
+----------------------------------------------+------------------------------+---------------------------------------------+
| Optional                                     | `num_config_variables`__     | Add context to data: number of              |
|                                              |                              | configuration variables.                    |
+----------------------------------------------+------------------------------+---------------------------------------------+
| Optional                                     | `experiment_variance_type`__ | Add context to data: specify the type of    |
|                                              |                              | experimental error                          |
+----------------------------------------------+------------------------------+---------------------------------------------+

.. __: responses-calibration_terms-calibration_data_file-custom_annotated.html
__ responses-calibration_terms-calibration_data_file-annotated.html
__ responses-calibration_terms-calibration_data_file-freeform.html
__ responses-calibration_terms-calibration_data_file-num_experiments.html
__ responses-calibration_terms-calibration_data_file-num_config_variables.html
__ responses-calibration_terms-calibration_data_file-experiment_variance_type.html

**Description**

Enables text file import of experimental observations for use in
calibration, for scalar responses only, with optional scalar variance
information. For more complex data import cases see
:dakkw:`responses-calibration_terms-calibration_data`.

Dakota will calibrate model variables to best match these data. Key
options include:

- format: whether the data file is in ``annotated``,
  ``custom_annotated``, or ``freeform`` format

- content: where ``num_experiments``, ``num_config_variables``, and
  ``experiment_variance_type`` indicate which columns appear in the
  data.

In the most general case, the content of the data file is described by
the arguments of three optional parameters.

- ``num_experiments`` ( :math:`N_{exp}` )

  Default: :math:`N_{exp} = 1`

  This indicates that the data represent multiple experiments, where
  each experiment might be conducted with different values of
  configuration variables. An experiment can also be thought of as a
  replicate, where the experiments are run at the same values of the
  configuration variables.

- ``num_config_variables`` ( :math:`N_{cfg}` )

  Configuration variables specify the values of experimental
  conditions at which data were collected. The variables in these
  columns must correspond to state variables in the calibration
  study. The simulation model will be run at each configuration and
  compared to the appropriate experiment data.

- ``experiment_variance_type`` ('none' or 'scalar')

  This indicates whether the data file contains variances for the
  measurement error of the experimental data. The default is 'none'.

While some components may be omitted, the most complete version of an
annotated calibration data file could include columns corresponding to
experiment ID, configuration variables, function value observations,
and variances (observation errors), shown here in annotated format:

.. code-block::

    exp_id | configuration xvars | y data observations | y data variances
    1        7.8  7                21.9372  1.8687       0.25  0.04
    2        8.6  2                19.0779  4.8976       0.25  0.04
    3        8.4  8                38.2758  4.4559       0.25  0.04
    4        4.2  1                39.7600  6.4631       0.25  0.04

Each row in the file corresponds to an experiment, or a replicate
observation of an experiment, to be compared to the model output. This
example shows 4 experiments, governed by two configuration variables
(one real-valued and one integer-valued), two responses (QOIs), and
corresponding observation error variances of 0.25 and 0.04, that is,
standard deviations of 0.5 and 0.2.

*Usage Tips*

- The ``calibration_data_file`` keyword is used when *only* scalar
  calibration terms are present. If there are field calibration
  terms, instead use
  :dakkw:`responses-calibration_terms-calibration_data`. For mixed
  scalar and field calibration terms, one may use the
  :dakkw:`responses-calibration_terms-calibration_data-scalar_data_file`
  specification, which uses the format described on this page.

- *Attention:* In versions of Dakota prior to 6.14, string-valued
  configuration variables were specified in data files with 0-based
  indices into the admissible values. As of Dakota 6.14, strings must
  be specified by value.
  For example, a string-valued configuration variable for an
  experimental condition might appear in the file as ``low_pressure``
  vs. ``high_pressure``.

**Examples**

*Simple Case:*

In the simplest case, no data content descriptors are specified:

.. code-block::

    responses
      calibration_terms = 2
        descriptors = 'volts' 'amps'
        calibration_data_file = 'circuit.dat'
          annotated

The data file ``circuit.dat`` must then contain only the
:math:`y^{Data}` observations, which represent a single experimental
observation. In this case, the data file should have
:math:`N_{terms} = 2` columns (for volts, amps) and 1 row, where
:math:`N_{terms}` is the value of
:dakkw:`responses-calibration_terms`. The data file is shown here in
annotated format:

.. code-block::

    exp_id | y data observations
    1        21.9372  1.8687

For each function evaluation, Dakota will run the analysis driver,
which must return :math:`N_{terms} = 2` model responses. Then the
residuals are computed as:

.. math:: R_{i} = y^{Model}_i - y^{Data}_{i}.

These residuals can be weighted using
:dakkw:`responses-calibration_terms-weights`.

*Multiple experiments:*

One might specify ``num_experiments`` :math:`N_E`, indicating that
there are multiple experiments. When multiple experiments are present,
Dakota replicates the model responses once per experiment and
differences each copy with the corresponding row of data, so the
number of residuals grows with the number of experiments. For example,
if the user has :math:`N_E = 4` experiments in the example above with
2 calibration terms, the input file would contain

.. code-block::

    responses
      calibration_terms = 2
        descriptors = 'volts' 'amps'
        calibration_data_file = 'circuit.dat'
          annotated
          num_experiments = 4

The ``calibration_data_file`` would then need to contain 4 rows (one
for each experiment), and each row should contain 2 experimental data
values that will be differenced with respect to the appropriate model
response:

.. code-block::

    exp_id | y data observations
    1        21.9372  1.8687
    2        19.0779  4.8976
    3        38.2758  4.4559
    4        39.7600  6.4631

To summarize, Dakota will calculate the sum of the squared residuals
as:

.. math:: f = \sum_{i=1}^{N_E}R_{i}^2

where the residuals now are calculated as:

.. math:: R_{i} = y^{Model}_i(\theta) - y^{Data}_{i}.

*With experimental variances:*

If the measurement error is known, its variance can be supplied to
Dakota. In this case, the keyword ``experiment_variance_type`` is
added, followed by a string of variance types of length one or of
length :math:`N_{terms}`, where :math:`N_{terms}` is the value of
:dakkw:`responses-calibration_terms`. The
``experiment_variance_type`` for each response can be 'none' or
'scalar'. NOTE: you must specify the same
``experiment_variance_type`` for all scalar terms. That is, they will
all be 'none' or all be 'scalar'.

.. code-block::

    responses
      calibration_terms = 2
        descriptors = 'volts' 'amps'
        calibration_data_file = 'circuit.dat'
          annotated
          experiment_variance_type 'scalar'

With the 'scalar' variance type, each row of the data file will now
have :math:`N_{terms} = 2` columns of :math:`y` data values (volts,
amps) followed by :math:`N_{terms} = 2` columns that specify the
measurement error (in units of variance, not standard deviation) for
volts, amps. An example with a single experiment in annotated format:

.. code-block::

    exp_id | y data observations | y data variances
    1        21.9372  1.8687       0.25  0.04

Dakota will run the analysis driver, which must return
:math:`N_{terms}` responses. Then the residuals are computed as:

.. math:: R_{i} = \frac{y^{Model}_i - y^{Data}_{i}}{\sqrt{{var}_i}}

for :math:`i = 1 \dots N_{terms}`.

*Putting all the options together:*

Specifying all these options together might look like

.. code-block::

    responses
      calibration_terms = 2
        descriptors = 'volts' 'amps'
        calibration_data_file = 'circuit.dat'
          annotated
          num_experiments = 4
          num_config_variables = 2
          experiment_variance_type 'scalar'

Dakota will expect a data file

.. code-block::

    exp_id | configuration xvars | y data observations | y data variances
    1        7.8  7                21.9372  1.8687       0.25  0.04
    2        8.6  2                19.0779  4.8976       0.25  0.04
    3        8.4  8                38.2758  4.4559       0.25  0.04
    4        4.2  1                39.7600  6.4631       0.25  0.04

To compute residuals for each experiment, e.g., exp_id = 4, Dakota
will

1. Evaluate the computational model at the specified configuration
   (state variables = [4.2, 1]).

2. Difference the resulting 2 function values with the data
   [39.7600 volts, 6.4631 amps].

3. Divide each difference by the corresponding standard deviation,
   sqrt([0.25, 0.04]) = [0.5, 0.2].
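
The three steps above can be sketched in a few lines of Python. This
is only an illustration of the arithmetic, not Dakota's
implementation: ``toy_model``, the parameter values in ``theta``, and
the column labels in the header row are hypothetical placeholders,
while the numeric rows are copied from the data file shown above.

.. code-block:: python

    import math

    # Rows copied from the example data file; header labels are placeholders.
    DATA_FILE_TEXT = """\
    exp_id cfg_real cfg_int volts amps var_volts var_amps
    1 7.8 7 21.9372 1.8687 0.25 0.04
    2 8.6 2 19.0779 4.8976 0.25 0.04
    3 8.4 8 38.2758 4.4559 0.25 0.04
    4 4.2 1 39.7600 6.4631 0.25 0.04
    """

    NUM_CONFIG = 2  # matches num_config_variables
    NUM_TERMS = 2   # matches calibration_terms


    def parse_rows(text):
        """Split each row into (exp_id, config, observations, variances)."""
        rows = []
        for line in text.strip().splitlines()[1:]:  # skip the header row
            fields = line.split()
            exp_id = int(fields[0])
            values = [float(v) for v in fields[1:]]
            config = values[:NUM_CONFIG]
            obs = values[NUM_CONFIG:NUM_CONFIG + NUM_TERMS]
            var = values[NUM_CONFIG + NUM_TERMS:]
            rows.append((exp_id, config, obs, var))
        return rows


    def toy_model(theta, config):
        """Hypothetical stand-in for the analysis driver: [volts, amps]."""
        return [theta[0] * config[0], theta[1] * config[1]]


    def weighted_residuals(theta, rows):
        """Return (model - data) / sqrt(variance) for every experiment and term."""
        residuals = []
        for _exp_id, config, obs, var in rows:
            model = toy_model(theta, config)  # step 1: run at this configuration
            for m, d, v in zip(model, obs, var):
                residuals.append((m - d) / math.sqrt(v))  # steps 2 and 3
        return residuals


    rows = parse_rows(DATA_FILE_TEXT)
    residuals = weighted_residuals([3.0, 1.0], rows)  # theta chosen arbitrarily
    sum_of_squares = sum(r * r for r in residuals)
    print(len(residuals), sum_of_squares)

With 4 experiments and 2 calibration terms the sketch produces 8
residuals, one per observation in the data file.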