bw.rot {smoothemplik}    R Documentation

Silverman's rule-of-thumb bandwidth

Description

A fail-safe function that returns a Silverman-type rule-of-thumb bandwidth suggestion, even for data whose standard deviation is NA or 0.

Usage

bw.rot(
  x,
  kernel = c("gaussian", "uniform", "triangular", "epanechnikov", "quartic"),
  na.rm = FALSE,
  robust = TRUE,
  discontinuous = FALSE
)

Arguments

x

A numeric vector, or a numeric matrix with one variable per column, without non-finite values.

kernel

A character string: "gaussian", "uniform", "triangular", "epanechnikov", or "quartic".

na.rm

Logical: should missing values be removed? Setting it to TRUE may cause issues: NAs are removed variable by variable, so the returned bandwidth may be inappropriate for the final data set to which it is applied.

robust

Logical: safeguard against extreme observations? If TRUE, uses min(sd(x), IQR(x)/1.34) to estimate the spread.

discontinuous

Logical: is the true density discontinuous (i.e. does it have jumps)? If TRUE, the formula for the optimal density-estimation bandwidth changes accordingly.
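
The spread estimate described under robust can be illustrated with a short sketch. The helper name robust_spread is hypothetical and is not part of smoothemplik; it only reproduces the expression min(sd(x), IQR(x)/1.34) quoted above. For normal data, IQR(x)/1.34 approximates the standard deviation, so the minimum guards against a standard deviation inflated by a few extreme observations.

```r
# Hypothetical helper for illustration; not part of smoothemplik.
# Spread estimate: the smaller of the standard deviation and the scaled IQR.
robust_spread <- function(x) {
  min(stats::sd(x), stats::IQR(x) / 1.34)
}

x_clean   <- c(-2, -1, 0, 1, 2)
x_outlier <- c(x_clean, 100)   # one extreme observation added

sd_outlier     <- stats::sd(x_outlier)      # dominated by the outlier
spread_outlier <- robust_spread(x_outlier)  # bounded by the IQR-based term
```

With the outlier present, the IQR-based term is the smaller of the two, so the spread estimate stays on the scale of the bulk of the data.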

Details

The rule-of-thumb bandwidth is obtained under the assumption that the true density is multivariate normal with zero covariances (i.e. a diagonal variance-covariance matrix \Sigma = \mathrm{diag}(\sigma^2_k) with \det\Sigma = \prod_k \sigma^2_k and \Sigma^{-1} = \mathrm{diag}(1/\sigma^2_k)). Then, formula 4.12 in Silverman (1986) depends only on \alpha and \beta (which depend only on the kernel and are fixed for a multivariate normal) and on the L2-norm of the second derivative of the density. The (i, i)th element of the Hessian of the multivariate normal density \phi(X) = \phi(x_1, \ldots, x_d) is \phi(X)(x_i^2 - \sigma^2_i)/\sigma_i^4.

For details, see Silverman (1986).
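
For intuition, the classical univariate special case with a Gaussian kernel can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name rot_gaussian_1d and its spread argument are hypothetical, and the sketch covers only one dimension and the Gaussian kernel, where the normal-reference bandwidth is (4/3)^(1/5) * spread * n^(-1/5), i.e. roughly 1.06 * spread * n^(-1/5).

```r
# Hypothetical helper for illustration; not the implementation of bw.rot.
# Classical univariate normal-reference rule for a Gaussian kernel:
#   h = (4/3)^(1/5) * spread * n^(-1/5)
rot_gaussian_1d <- function(x, spread = stats::sd(x)) {
  n <- length(x)
  (4 / 3)^(1 / 5) * spread * n^(-1 / 5)
}

set.seed(1)
x <- stats::rnorm(100)
h <- rot_gaussian_1d(x)

# Plug the bandwidth into a kernel density estimate; for stats::density,
# bw is the standard deviation of the smoothing kernel.
fit <- stats::density(x, bw = h, kernel = "gaussian")
```

Passing a different spread (for example, an IQR-based estimate) changes only the scale factor; the n^(-1/5) rate is unchanged.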

Value

A numeric vector of bandwidths that are a reasonable starting point for optimal non-parametric density estimation of x.

References

Silverman BW (1986). Density estimation for statistics and data analysis. New York: Chapman and Hall.

Examples

set.seed(1); bw.rot(stats::rnorm(100)) # Should be 0.3787568 in R version 4.0.4
set.seed(1); bw.rot(matrix(stats::rnorm(500), ncol = 10)) # 0.4737872 ... 0.7089850

[Package smoothemplik version 0.0.14 Index]