histogram_point_uncertain

Aleatory uncertain variable - discrete histogram

Topics

discrete_variables, aleatory_uncertain_variables

Specification

  • Alias: None

  • Arguments: None

  • Default: no histogram point uncertain variables

Child Keywords:

Required/Optional

Description of Group

Dakota Keyword

Dakota Keyword Description

Optional

integer

Integer valued point histogram variable

Optional

string

String (categorical) valued point histogram variable

Optional

real

Real valued point histogram variable

Description

Histogram uncertain variables are typically used to model a set of empirical data. When the variables take on only discrete values or categories, a discrete, or point histogram is used to describe their probability mass function (one could think of this as a variables-histogram_bin_uncertain variable with “bins” of zero width). Dakota supports integer-, string-, and real-valued point histograms.

Point histograms are similar to variables-discrete_design_set and variables-discrete_state_set, but as they are uncertain variables, include the relative probabilities of observing the different values within the set.

The histogram_point_uncertain keyword is followed by one or more of integer, string, or real, each of which specify the number of variables to be characterized as discrete histograms of that sub-type.

Each discrete histogram variable is specified by one or more abscissa/count pairs. The abscissas, are the possible values the variable can take on (\(x\) coordinates of type integer, string, or real), and must be specified in increasing order. These are paired with counts \(c\) which provide the frequency of the given value or string, relative to other possible values/strings.

Thus, to fully specify a point-based histogram with \(n\) points, \(n\) \((x,c)\) pairs must be specified with the following features:

  • \(x\) is the point value (integer, string, or real) and \(c\) is the corresponding count for that value.

  • the \(x\) values must be strictly increasing (lexicographically for strings).

  • all \(c\) values must be positive.

  • a minimum of one pair must be specified for each point-based histogram.

Examples

The pairs_per_variable specification provides for the proper association of multiple sets of \((x,c)\) or \((x,y)\) pairs with individual histogram variables. For example, in the following specification,

histogram_point_uncertain
  integer            = 2
  pairs_per_variable = 2     3
  abscissas          = 3 4   100 200 300
  counts             = 1 1   1   2   1

pairs_per_variable associates the \((x,c)\) pairs {(3,1),(4,1)} with one point-based histogram variable (where the values 3 and 4 are equally probable) and associates the \((x,c)\) pairs {(100,1),(200,2),(300,1)} with a second point-based histogram variable (where the value 200 is twice as probable as either 100 or 300).

FAQ

Difference between bin and point histograms: A (continuous) bin histogram specifies bins of non-zero width, whereas a (discrete) point histogram specifies individual point values, which can be thought of as bins with zero width. In the terminology of LHS [WJ98], the bin pairs specification defines a “continuous linear” distribution and the point pairs specification defines a “discrete histogram” distribution (although the points are real-valued, the number of possible values is finite).