weightedEL0 {smoothemplik} | R Documentation |
Uni-variate empirical likelihood via direct lambda search
Description
Empirical likelihood with counts to solve one-dimensional problems efficiently with Brent's root search algorithm.
Conducts an empirical likelihood ratio test of the hypothesis that the mean of z
is mu
.
The names of the elements in the returned list are consistent with the original R code in (Owen 2017).
Usage
weightedEL0(
z,
mu = NULL,
ct = NULL,
shift = NULL,
return.weights = FALSE,
SEL = FALSE,
weight.tolerance = NULL,
boundary.tolerance = 1e-09,
trunc.to = 0,
chull.fail = c("taylor", "wald", "adjusted", "balanced", "none"),
uniroot.control = list(),
verbose = FALSE
)
Arguments
z |
Numeric data vector. |
mu |
Hypothesized mean of |
ct |
Numeric count variable with non-negative values that indicates the multiplicity of observations.
Can be fractional. Very small counts below the threshold |
shift |
The value to add in the denominator (useful in case there are extra Lagrange multipliers): 1 + lambda'Z + shift. |
return.weights |
Logical: if TRUE, individual EL weights are computed and returned. Setting this to FALSE gives huge memory savings in large data sets, especially when smoothing is used. |
SEL |
If |
weight.tolerance |
Weight tolerance for counts to improve numerical stability (similar to the ones in Art B. Owen's 2017 code, but adapting to the sample size). |
boundary.tolerance |
Relative tolerance for determining when the lambda is not an interior solution because it is too close to the boundary. Corresponds to a fraction of the interval range length. |
trunc.to |
Counts under |
chull.fail |
A character: what to do if the convex hull of |
uniroot.control |
A list passed to the |
verbose |
Logical: if |
Details
This function provides the core functionality for univariate empirical likelihood. The technical details is given in (Cosma et al. 2019), although the algorithm used in that paper is slower than the one provided by this function.
Since we know that the EL probabilities belong to (0, 1), the interval (bracket) for \lambda
search
can be determined in the spirit of formula (2.9) from (Owen 2001). Let
z^*_i := z_i - \mu
be the recentred observations.
p_i = c_i/N \cdot (1 + \lambda z^*_i + s)^{-1}
The probabilities are bounded from above: p_i<1
for all i, therefore,
c_i/N \cdot (1 + \lambda z^*_i + s)^{-1} < 1
c_i/N - 1 - s < \lambda z^*_i
Two cases: either z^*_i<0
, or z^*_i>0
(cases with z^*_i=0
are trivially excluded because they do not affect the EL). Then,
(c_i/N - 1 - s)/z^*_i > \lambda,\ \forall i: z^*_i<0
(c_i/N - 1 - s)/z^*_i < \lambda,\ \forall i: z^*_i>0
which defines the search bracket:
\lambda_{\min} := \max_{i: z^*_i>0} (c_i/N - 1 - s)/z^*_i
\lambda_{\max} := \min_{i: z^*_i<0} (c_i/N - 1 - s)/z^*_i
\lambda_{\min} < \lambda < \lambda_{\max}
(This derivation contains s, which is the extra shift that extends the
function to allow mixed conditional and unconditional estimation;
Owen's textbook formula corresponds to s = 0
.)
The actual tolerance of the lambda search in brentZero
is
2 |\lambda_{\max}| \epsilon_m + \mathrm{tol}/2
,
where tol
can be set in uniroot.control
and
\epsilon_m
is .Machine$double.eps
.
The sum of log-weights is maximised without Taylor expansion, forcing mu
to be inside
the convex hull of z
. If a violation is happening, consider using the chull.fail
argument
or switching to Euclidean likelihood via [weightedEuL()].
Value
A list with the following elements:
- logelr
Logarithm of the empirical likelihood ratio.
- lam
The Lagrange multiplier.
- wts
Observation weights/probabilities (of the same length as
z
).- converged
TRUE
if the algorithm converged,FALSE
otherwise (usually means thatmu
is not within the range ofz
, i.e. the one-dimensional convex hull ofz
).- iter
The number of iterations used (from
brentZero
).- bracket
The admissible interval for lambda (that is, yielding weights between 0 and 1).
- estim.prec
The approximate estimated precision of lambda (from
brentZero
).- f.root
The value of the derivative of the objective function w.r.t. lambda at the root (from
brentZero
). Values> sqrt(.Machine$double.eps)
indicate convergence problems.- exitcode
An integer indicating the reason of termination.
- message
Character string describing the optimisation termination status.
References
Chen J, Variyath AM, Abraham B (2008).
“Adjusted empirical likelihood and its properties.”
Journal of Computational and Graphical Statistics, 17(2), 426–443.
doi:10.1198/106186008x321068.
Cosma A, Kostyrka AV, Tripathi G (2019).
“Inference in conditional moment restriction models when there is selection due to stratification.”
In Huynh KP, Jacho-Chavez DT, Tripathi G (eds.), The Econometrics of Complex Survey Data: Theory and Applications, 137–171.
Emerald Publishing Limited.
ISBN 978-1-78756-726-9.
Emerson SC, Owen AB (2009).
“Calibration of the empirical likelihood method for a vector mean.”
Electronic Journal of Statistics, 3, 1161–1192.
ISSN 1935-7524, doi:10.1214/09-ejs518.
Owen AB (2001).
Empirical Likelihood.
Chapman and Hall/CRC, New York, USA.
Owen AB (2017).
A weighted self-concordant optimization for empirical likelihood.
https://artowen.su.domains/empirical/countnotes.pdf.
See Also
[weightedEL()]
Examples
# Figure 2.4 from Owen (2001) -- with a slightly different data point
earth <- c(
5.5, 5.61, 4.88, 5.07, 5.26, 5.55, 5.36, 5.29, 5.58, 5.65, 5.57, 5.53, 5.62, 5.29,
5.44, 5.34, 5.79, 5.1, 5.27, 5.39, 5.42, 5.47, 5.63, 5.34, 5.46, 5.3, 5.75, 5.68, 5.85
)
set.seed(1)
system.time(r1 <- replicate(40, weightedEL(sample(earth, replace = TRUE), mu = 5.517)))
set.seed(1)
system.time(r2 <- replicate(40, weightedEL0(sample(earth, replace = TRUE), mu = 5.517)))
plot(apply(r1, 2, "[[", "logelr"), apply(r1, 2, "[[", "logelr") - apply(r2, 2, "[[", "logelr"),
bty = "n", xlab = "log(ELR) computed via dampened Newthon method",
main = "Discrepancy between weightedEL and weightedEL0", ylab = "")
abline(h = 0, lty = 2)
# Handling the convex hull violation differently
weightedEL0(1:9, chull.fail = "none")
weightedEL0(1:9, chull.fail = "taylor")
weightedEL0(1:9, chull.fail = "wald")
# Interpolation to well-defined branches outside the convex hull
mu.seq <- seq(-1, 7, 0.1)
wEL1 <- -2*sapply(mu.seq, function(m) weightedEL0(1:9, mu = m, chull.fail = "none")$logelr)
wEL2 <- -2*sapply(mu.seq, function(m) weightedEL0(1:9, mu = m, chull.fail = "taylor")$logelr)
wEL3 <- -2*sapply(mu.seq, function(m) weightedEL0(1:9, mu = m, chull.fail = "wald")$logelr)
plot(mu.seq, wEL1)
lines(mu.seq, wEL2, col = 2)
lines(mu.seq, wEL3, col = 4)
# Warning: depending on the compiler, the discrepancy between weightedEL and weightedEL0
# can be one million (1) times larger than the machine epsilon despite both of them
# being written in pure R
# The results from Apple clang-1400.0.29.202 and Fortran GCC 12.2.0 are different from
# those obtained under Ubuntu 22.04.4 + GCC 11.4.0-1ubuntu1~22.04,
# Arch Linux 6.6.21 + GCC 14.1.1, and Windows Server 2022 + GCC 13.2.0
out1 <- weightedEL(earth, mu = 5.517)[1:4]
out2 <- weightedEL0(earth, mu = 5.517, return.weights = TRUE)[1:4]
print(c(out1$lam, out2$lam), 16)
# Value of lambda weightedEL weightedEL0
# aarch64-apple-darwin20 -1.5631313955?????? -1.5631313957?????
# Windows, Ubuntu, Arch -1.563131395492627 -1.563131395492627