histogram_bin_uncertain

Aleatory uncertain variable - continuous histogram

Topics

continuous_variables, aleatory_uncertain_variables

Specification

  • Alias: None

  • Arguments: INTEGER

  • Default: no histogram bin uncertain variables

Child Keywords:

Required/Optional

Description of Group

Dakota Keyword

Dakota Keyword Description

Optional

pairs_per_variable

Number of pairs defining each histogram bin variable

Required

abscissas

Real abscissas for a bin histogram

Required (Choose One)

Density Values

ordinates

Ordinates specifying a “skyline” probability density function

counts

Frequency or relative probability of each bin

Optional

initial_point

Initial values for variables

Optional

descriptors

Labels for the variables

Description

Histogram uncertain variables are typically used to model a set of empirical data. The bin histogram (contrast: histogram_point_uncertain) is a continuous aleatory distribution characterized by bins of non-zero width where the uncertain variable may lie, together with the relative frequencies of each bin. Hence it can be used to specify a marginal probability density function arising from data.

The histogram_bin_uncertain keyword specifies the number of variables to be characterized as continuous histograms. The required sub-keywords are: abscissas (ranges of values the variable can take on) and either ordinates or counts (characterizing each variable’s frequency information). When using histogram bin variables, each variable must be defined by at least one bin (with two bounding value pairs). When more than one histogram bin variable is active, pairs_per_variable can be used to specify unequal apportionment of provided bin pairs among the variables.

The abscissas specification defines abscissa values (\(x\) coordinates) for the probability density function of each histogram variable. When paired with counts, the specifications provide sets of \((x,c)\) pairs for each histogram variable where \(c\) defines a count (i.e., a frequency or relative probability) associated with a bin. If using bins of unequal width and specification of probability densities is more natural, then the counts specification can be replaced with an ordinates specification (\(y\) coordinates) in order to support interpretation of the input as \((x,y)\) pairs defining the profile of a “skyline” probability density function.

Conversion between the two specifications is straightforward: a count/frequency is a cumulative probability quantity defined from the product of the ordinate density value and the \(x\) bin width. Thus, in the cases of bins of equal width, ordinate and count specifications are equivalent. In addition, ordinates and counts may be relative values; it is not necessary to scale them as all user inputs will be normalized.

To fully specify a bin-based histogram with \(n\) bins (potentially of unequal width), \(n+1\) \((x,c)\) or \((x,y)\) pairs must be specified with the following features:

  • \(x\) is the parameter value for the left boundary of a histogram bin and \(c\) is the corresponding count for that bin. Alternatively, \(y\) defines the ordinate density value for this bin within a skyline probability density function. The right boundary of the bin is defined by the left boundary of the next pair.

  • the final pair specifies the right end of the last bin and must have a \(c\) or \(y\) value of zero.

  • the \(x\) values must be strictly increasing.

  • all \(c\) or \(y\) values must be positive, except for the last which must be zero.

  • a minimum of two pairs must be specified for each bin-based histogram variable.

Examples

The pairs_per_variable specification provides for the proper association of multiple sets of \((x,c)\) or \((x,y)\) pairs with individual histogram variables. For example, in this input snippet

histogram_bin_uncertain = 2
  pairs_per_variable = 3           4
  abscissas          = 5  8  10    .1 .2 .3 .4
  counts             = 17 21 0     12 24 12 0
  descriptors        = 'hbu_1'     'hbu_2'

pairs_per_variable associates the first 3 \((x,c)\) pairs from abscissas and counts {(5,17),(8,21),(10,0)} with one bin-based histogram variable, where one bin is defined between 5 and 8 with a count of 17 and another bin is defined between 8 and 10 with a count of 21. The following set of 4 \((x,c)\) pairs {(.1,12),(.2,24),(.3,12),(.4,0)} defines a second bin-based histogram variable containing three equal-width bins with counts 12, 24, and 12 (middle bin is twice as probable as the other two).

FAQ

Difference between bin and point histograms: A (continuous) bin histogram specifies bins of non-zero width, whereas a (discrete) point histogram specifies individual point values, which can be thought of as bins with zero width. In the terminology of LHS [WJ98], the bin pairs specification defines a “continuous linear” distribution and the point pairs specification defines a “discrete histogram” distribution (although the points are real-valued, the number of possible values is finite).