estim_tilde_AMSE {ElliptCopulas} | R Documentation |
Estimate the part of the AMSE of the elliptical density generator that only depends
on the parameter "a" assuming h
has been optimally chosen
Description
A continuous elliptical distribution has a density of the form
f_X(x) = {|\Sigma|}^{-1/2}
g\left( (x-\mu)^\top \, \Sigma^{-1} \, (x-\mu) \right),
where x \in \mathbb{R}^d
,
\mu \in \mathbb{R}^d
is the mean,
\Sigma
is a d \times d
positive-definite matrix
and a function g: \mathbb{R}_+ \rightarrow \mathbb{R}_+
, called the
density generator of X
.
The goal is to estimate g
at some point \xi
, by
\widehat{g}_{n,h,a}(\xi)
:= \dfrac{\xi^{\frac{-d+2}{2}} \psi_a'(\xi)}{n h s_d}
\sum_{i=1}^n
K\left( \dfrac{ \psi_a(\xi) - \psi_a(\xi_i) }{h} \right)
+ K\left( \dfrac{ \psi_a(\xi) + \psi_a(\xi_i) }{h} \right),
where
s_d := \pi^{d/2} / \Gamma(d/2)
,
\Gamma
is the Gamma function,
h
and a
are tuning parameters (respectively the bandwidth and a
parameter controlling the bias at \xi = 0
),
\psi_a(\xi) := -a + (a^{d/2} + \xi^{d/2})^{2/d},
\xi \in \mathbb{R}
, K
is a kernel function and
\xi_i := (X_i - \mu)^\top \, \Sigma^{-1} \, (X_i - \mu),
for a sample X_1, \dots, X_n
.
Thanks to Proposition 2.2 in (Ryan and Derumigny, 2024), the asymptotic
mean square error of \widehat{g}_{n,h,a}(\xi)
can be decomposed into
a product of a constant (that depends on the true g
) and a term that
depends on g
and a
. This function computes this term. It can be
useful to find out the best value of the parameter a
to be used.
Usage
estim_tilde_AMSE(
X,
mu = 0,
Sigma_m1 = diag(NCOL(X)),
grid,
h,
Kernel = "gaussian",
a = 1,
mpfr = FALSE,
precBits = 100,
dopb = TRUE
)
Arguments
X |
a matrix of size |
mu |
mean of X. This can be the true value or an estimate. It must be
a vector of dimension |
Sigma_m1 |
inverse of the covariance matrix of X.
This can be the true value or an estimate. It must be
a matrix of dimension |
grid |
grid of values of |
h |
bandwidth of the kernel. Can be either a number or a vector of the
size |
Kernel |
name of the kernel. Possible choices are
|
a |
tuning parameter to improve the performance at 0.
Can be either a number or a vector of the
size |
mpfr |
if |
precBits |
number of precBits used for floating point precision
(only used if |
dopb |
a Boolean value.
If |
Value
a vector of the same size as the grid, with the corresponding value
for the \widetilde{AMSE}
.
Author(s)
Alexis Derumigny, Victor Ryan
References
Ryan, V., & Derumigny, A. (2024). On the choice of the two tuning parameters for nonparametric estimation of an elliptical distribution generator arxiv:2408.17087.
Examples
# Comparison between the estimated and true generator of the Gaussian distribution
n = 50000
d = 3
X = matrix(rnorm(n * d), ncol = d)
grid = seq(0, 5, by = 0.1)
a = 1.5
AMSE_est = estim_tilde_AMSE(X = X, grid = grid, a = a, h = 0.09)
plot(grid, abs(AMSE_est), type = "l")
# Computation of true values
g = exp(-grid/2)/(2*pi)^{3/2}
gprime = (-1/2) *exp(-grid/2)/(2*pi)^{3/2}
A = a^(d/2)
psia = -a + (A + grid^(d/2))^(2/d)
psiaprime = grid^(d/2 - 1) * (A + grid^(d/2))^(2/d - 1)
psiasecond = psiaprime * ( (d-2)/2 ) * grid^{-1} * A *
( grid^(d/2) + A )^(-1)
rhoprimexi = ((d-2) * grid^((d-4)/2) * psiaprime
- 2 * grid^((d-2)/2) * psiasecond) / (2 * psiaprime^3) * g +
grid^((d-2)/2) / (psiaprime^2) * gprime
AMSE = rhoprimexi / psiaprime
lines(grid, abs(AMSE), col = "red")
# Comparison as a function of $a$
n = 50000
d = 3
X = matrix(rnorm(n * d), ncol = d)
grid = 0.1
vec_a = c(0.001, 0.002, 0.005,
0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.8, 1, 1.5, 2)
AMSE_est = rep(NA, length = length(vec_a))
for (i in 1:length(vec_a)){
AMSE_est[i] = estim_tilde_AMSE(X = X, grid = grid, a = vec_a[i], h = 0.09,
dopb = FALSE)
}
plot(vec_a, abs(AMSE_est), type = "l", log = "x")
# Computation of true values
a = vec_a
g = exp(-grid/2)/(2*pi)^{3/2}
gprime = (-1/2) *exp(-grid/2)/(2*pi)^{3/2}
A = a^(d/2)
psia = -a + (A + grid^(d/2))^(2/d)
psiaprime = grid^(d/2 - 1) * (A + grid^(d/2))^(2/d - 1)
psiasecond = psiaprime * ( (d-2)/2 ) * grid^{-1} * A *
( grid^(d/2) + A )^(-1)
rhoprimexi = ((d-2) * grid^((d-4)/2) * psiaprime
- 2 * grid^((d-2)/2) * psiasecond) / (2 * psiaprime^3) * g +
grid^((d-2)/2) / (psiaprime^2) * gprime
AMSE = rhoprimexi / psiaprime
yliminf = min(c(abs(AMSE_est), abs(AMSE)))
ylimsup = max(c(abs(AMSE_est), abs(AMSE)))
plot(vec_a, abs(AMSE_est), type = "l", log = "xy",
ylim = c(yliminf, ylimsup))
lines(vec_a, abs(AMSE), col = "red")