kde

Calculate the Kernel Density Estimate of the posterior distribution

Specification

  • Alias: None

  • Arguments: None

Description

A kernel density estimate (KDE) is a non-parametric, smooth approximation of the probability density function of a random variable. It is calculated from a set of samples of the random variable. If \(X\) is a univariate random variable with unknown density \(f\) and independent and identically distributed samples \(x_{1}, x_{2}, \ldots, x_{n}\), the KDE of \(f\) at a point \(x\) is given by

\[\hat{f}(x) = \frac{1}{nh} \sum_{i = 1}^{n} K \left( \frac{x - x_{i}}{h} \right).\]

The kernel \(K\) is a non-negative function that integrates to one. Although the kernel can take many forms, such as uniform or triangular, Dakota uses a normal kernel. The bandwidth \(h\) is a smoothing parameter that controls the trade-off between bias and variance. Choosing a large value of \(h\) yields a wide, oversmoothed KDE with large bias, while choosing a small value of \(h\) yields a choppy KDE with large variance. Rather than optimizing \(h\) directly, Dakota approximates the bandwidth using Silverman’s rule of thumb,

\[h = \hat{\sigma} \left( \frac{4}{3n} \right)^{1/5},\]

where \(\hat{\sigma}\) is the standard deviation of the sample set \(\left\{ x_{i} \right\}\).
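The two formulas above can be sketched in a few lines of Python with NumPy. This is an illustration only, not Dakota's implementation: the function names `silverman_bandwidth` and `normal_kde`, the use of the sample standard deviation (`ddof=1`) for \(\hat{\sigma}\), and the evaluation grid are all assumptions made for the example.

```python
import numpy as np

def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth h = sigma_hat * (4 / (3 n))**(1/5)."""
    x = np.asarray(samples, dtype=float)
    n = x.size
    # Assumption: sigma_hat is the sample standard deviation (ddof=1).
    sigma_hat = x.std(ddof=1)
    return sigma_hat * (4.0 / (3.0 * n)) ** 0.2

def normal_kde(samples, points, h=None):
    """Evaluate f_hat(x) = 1/(n h) * sum_i K((x - x_i) / h) with a normal kernel K."""
    x = np.asarray(samples, dtype=float)
    pts = np.atleast_1d(np.asarray(points, dtype=float))
    if h is None:
        h = silverman_bandwidth(x)
    # u[j, i] = (points[j] - x[i]) / h
    u = (pts[:, None] - x[None, :]) / h
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (x.size * h)

# Synthetic samples standing in for a posterior sample set.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=500)

grid = np.linspace(-10.0, 10.0, 2001)
density = normal_kde(samples, grid)

# A valid density estimate is non-negative and integrates to about one.
area = np.sum(density) * (grid[1] - grid[0])
print(f"h = {silverman_bandwidth(samples):.4f}, area = {area:.4f}")
```

Passing an explicit `h` to `normal_kde` makes the bias and variance trade-off easy to see: a much larger value flattens the estimate, while a much smaller value produces a spiky curve that follows individual samples.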

For multivariate cases, the random variables are treated as independent, and a separate KDE is calculated for each.
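As a rough sketch of this per-variable treatment, the snippet below computes one univariate KDE for each column of a sample matrix. It is not Dakota's code: the function name `marginal_kdes`, the `(n_samples, n_variables)` layout, the use of `ddof=1`, and the choice of evaluation grid (the sample range padded by three bandwidths) are assumptions made for illustration.

```python
import numpy as np

def marginal_kdes(sample_matrix, grid_points=100):
    """Return one (grid, density) pair per column, each from a univariate normal-kernel KDE.

    sample_matrix is assumed to have shape (n_samples, n_variables).
    """
    data = np.asarray(sample_matrix, dtype=float)
    n = data.shape[0]
    results = []
    for column in data.T:
        # Silverman's rule of thumb, applied to this variable only.
        h = column.std(ddof=1) * (4.0 / (3.0 * n)) ** 0.2
        # Hypothetical evaluation grid: sample range padded by 3 bandwidths.
        grid = np.linspace(column.min() - 3.0 * h, column.max() + 3.0 * h, grid_points)
        u = (grid[:, None] - column[None, :]) / h
        density = np.exp(-0.5 * u**2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))
        results.append((grid, density))
    return results

# Two variables with very different scales; each gets its own bandwidth.
rng = np.random.default_rng(1)
chain = np.column_stack([
    rng.normal(0.0, 1.0, size=400),
    rng.normal(5.0, 0.1, size=400),
])
for index, (grid, density) in enumerate(marginal_kdes(chain)):
    area = np.sum(density) * (grid[1] - grid[0])
    print(f"variable {index}: area under KDE = {area:.3f}")
```

Because each variable is handled separately, any correlation between the variables is not represented in the resulting estimates.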

Expected Output

If kde is specified, calculated values of \(\hat{f}\) will be output to the file kde_posterior.dat. Example output is given below.