testcov {MDCcure} | R Documentation |
Covariate Hypothesis Test of the Cure Probability based on Martingale Difference Correlation
Description
Performs nonparametric hypothesis tests to evaluate the association between a covariate and the cure probability in mixture cure models. Several test statistics are supported, including martingale difference correlation (MDC)-based tests and an alternative GOFT test.
Usage
testcov(
x,
time,
delta,
h = NULL,
method = "FMDCU",
P = 999,
parallel = TRUE,
ncores = -1
)
Arguments
x |
A numeric vector representing the covariate of interest. |
time |
A numeric vector of observed survival times. |
delta |
A binary vector indicating censoring status: |
h |
Bandwidth parameter for kernel smoothing. Either a positive numeric value, |
method |
Character string specifying the test to perform. One of:
Default is |
P |
Integer. Number of permutations or bootstrap replications used to compute the null distribution of the test statistic.
For methods |
parallel |
Logical. If |
ncores |
Integer. Number of cores to use for parallel computing. If |
Details
The function computes a statistic, based on the methodology proposed by Monroy-Castillo et al., to test whether a covariate \boldsymbol{X} has an effect on the cure probability. The hypotheses are:
\mathcal{H}_0 : \mathbb{E}(\nu | \boldsymbol{X}) \equiv 1 - p \quad \text{a.s.} \quad \text{vs} \quad \mathcal{H}_1 : \mathbb{E}(\nu | \boldsymbol{X}) \not\equiv 1 - p \quad \text{a.s.}
The main problem is that the response variable (the cure indicator \nu) is only partially observed due to censoring. This is addressed by estimating the cure indicator using the methodology of Amico et al. (2021).
We define \tau = \sup_x \tau(x), with \tau(x) = \inf\{t: S_0(t|x) = 0\}. We assume \tau < \infty and that follow-up is long enough so that \tau < \tau_{G(x)} for all x. Therefore, individuals with censored observed times greater than \tau are considered cured (\nu = 1).
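The rule above can be illustrated with a small sketch in base R. It assumes, purely for illustration, that \tau is approximated by the largest uncensored time; the package may estimate it differently. The censored observations below that threshold keep an unknown status (NA here), and these are the entries that have to be estimated.

```r
## Toy sample: observed times and uncensoring indicator (1 = event observed)
time  <- c(0.4, 1.1, 0.7, 2.9, 1.8, 3.5, 0.9, 2.2)
delta <- c(1,   1,   0,   0,   1,   0,   1,   0)

## Assumption: tau is approximated by the largest uncensored time
tau_hat <- max(time[delta == 1])

## Partially observed cure indicator:
##   0  for uncensored subjects (event observed, hence not cured)
##   1  for censored subjects with time > tau_hat (treated as cured)
##   NA for censored subjects with time <= tau_hat (status unknown)
nu <- ifelse(delta == 1, 0, ifelse(time > tau_hat, 1, NA))
data.frame(time, delta, nu)
```

In this toy sample only the subject censored at 0.7 has an unknown status; the three subjects censored after the last event time are treated as cured.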
Four tests are proposed, three of which are based on the martingale difference correlation (MDC). For the MDCU and MDCV tests, the null distribution is approximated via a permutation procedure. As a faster alternative, a chi-squared approximation is implemented for the MDCU test statistic (FMDCU). The fourth test (GOFT) is a modified version of the goodness-of-fit test proposed by Müller and Van Keilegom (2019). The GOFT test statistic is given by:
\widehat{\mathcal{T}}_n = nh^{1/2}\frac{1}{n}\sum_{i = 1}^{n}\left\{\hat{p}_h(X_i) - \hat{p}\right\}^2,
where \hat{p}_h(X_i) denotes the nonparametric estimator of the cure probability under the alternative hypothesis, and \hat{p} denotes the nonparametric estimator of the cure probability under the null hypothesis.
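As a rough sketch of how this statistic could be computed, the base R code below estimates the cure probability by the value of a product-limit survival estimator at the last event time: unweighted for \hat{p}, and with kernel weights (a Beran-type estimator) for \hat{p}_h(X_i). The helper `cure_pl`, the Gaussian kernel, the fixed bandwidth `h = 0.5`, and the simulated data are all assumptions made for illustration and are not taken from the package.

```r
## Product-limit estimate of the survival function at the last event time.
## Equal weights give the Kaplan-Meier plateau; kernel weights give a
## Beran-type conditional estimate. Assumes no tied times.
cure_pl <- function(time, delta, w = rep(1, length(time))) {
  ord <- order(time)
  delta <- delta[ord]
  w <- w[ord]
  at_risk <- rev(cumsum(rev(w)))          # weighted risk set at each ordered time
  prod(ifelse(delta == 1 & w > 0, 1 - w / at_risk, 1))
}

## Simulated data (assumed design)
set.seed(2)
n <- 80
x <- runif(n, -2, 2)
susceptible <- runif(n) < plogis(2 * x)   # TRUE if the subject is not cured
lifetime <- rweibull(n, shape = 2)
censoring <- rexp(n, rate = 0.5)
time <- ifelse(susceptible, pmin(lifetime, censoring), censoring)
delta <- as.integer(susceptible & lifetime < censoring)

h <- 0.5                                  # assumed bandwidth
p_hat <- cure_pl(time, delta)             # estimate under the null (no covariate)
p_hat_h <- vapply(x, function(x0) {
  cure_pl(time, delta, w = dnorm((x0 - x) / h))   # Gaussian kernel weights
}, numeric(1))

## n * h^(1/2) * (1/n) * sum of squared differences
T_n <- n * sqrt(h) * mean((p_hat_h - p_hat)^2)
T_n
```

The value of `T_n` alone is not interpretable; it has to be compared with a critical value.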
The approximation of the critical value for the test is done using the bootstrap procedure given in Section 3 of Müller and Van Keilegom (2019).
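For the permutation-based MDC tests (MDCU and MDCV), the general idea of a permutation null distribution can be sketched as follows. The code uses a plain plug-in estimate of the squared martingale difference divergence and a fully observed binary response as a stand-in for the cure indicator. Both are simplifying assumptions, so this is not the package's implementation, and the helper names `mdd2` and `perm_test` are hypothetical.

```r
## Plug-in estimate of the squared martingale difference divergence of y given x:
## -(1/n^2) * sum_{k,l} |x_k - x_l| * (y_k - ybar) * (y_l - ybar)
mdd2 <- function(x, y) {
  yc <- y - mean(y)
  -mean(abs(outer(x, x, "-")) * outer(yc, yc))
}

## Permutation p-value: shuffling y breaks any dependence on x,
## which mimics the null hypothesis
perm_test <- function(x, y, P = 199) {
  stat <- mdd2(x, y)
  perm <- replicate(P, mdd2(x, sample(y)))
  list(statistic = stat, p.value = (1 + sum(perm >= stat)) / (P + 1))
}

set.seed(3)
n <- 100
x <- runif(n, -2, 2)
nu <- rbinom(n, 1, plogis(-2 * x))   # fully observed stand-in for the cure indicator
out <- perm_test(x, nu)
out
```

Adding 1 to both the numerator and the denominator keeps the p-value strictly positive, so the smallest attainable value is 1 / (P + 1).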
Value
A list containing:
- test_results: A list with the results (e.g., test statistics and p-values) of the selected test(s).
- nu_hat: A numeric vector of estimated cure probabilities.
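A hedged sketch of how the returned list might be inspected is shown below. The simulated data and the argument values (`method = "MDCU"`, `P = 199`, `parallel = FALSE`) are illustrative assumptions, the internal layout of `test_results` is not specified here (hence `str()`), and the call is skipped when MDCcure is not installed.

```r
## Simulated data (assumed design); delta = 1 means the event was observed
set.seed(4)
n <- 60
x <- runif(n, -2, 2)
susceptible <- runif(n) < plogis(2 * x)
lifetime <- rweibull(n, shape = 2)
censoring <- rexp(n)
time <- ifelse(susceptible, pmin(lifetime, censoring), censoring)
delta <- as.integer(susceptible & lifetime < censoring)

## Run the test and look at both components of the returned list
if (requireNamespace("MDCcure", quietly = TRUE)) {
  res <- MDCcure::testcov(x, time, delta, method = "MDCU",
                          P = 199, parallel = FALSE)
  str(res$test_results)   # results of the selected test
  head(res$nu_hat)        # first entries of the nu_hat component
}
```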
References
Amico, M., Van Keilegom, I., & Han, B. (2021). Assessing cure status prediction from survival data using receiver operating characteristic curves. Biometrika, 108(3), 727–740. doi:10.1093/biomet/asaa080
López-Cheda, A., Cao, R., Jácome, M. A., & Van Keilegom, I. (2016). Nonparametric incidence estimation and bootstrap bandwidth selection in mixture cure models. Computational Statistics & Data Analysis, 100, 490–502. doi:10.1016/j.csda.2016.04.006
Müller, U. U., & Van Keilegom, I. (2019). Goodness-of-fit tests for the cure rate in a mixture cure model. Biometrika, 106, 211–227. doi:10.1093/biomet/asy058
Shao, X., & Zhang, J. (2014). Martingale difference correlation and its use in high-dimensional variable screening. Journal of the American Statistical Association, 109(507), 1302–1318. doi:10.1080/01621459.2014.887012
See Also
Examples
## Some artificial data
set.seed(123)
n <- 50
x <- runif(n, -2, 2) ## Covariate values
y <- rweibull(n, shape = .5*(x + 4)) ## True lifetimes
cens <- rexp(n) ## Censoring times (named 'cens' to avoid masking base::c)
p <- exp(2*x)/(1 + exp(2*x)) ## Probability of being susceptible
u <- runif(n)
obs <- ifelse(u < p, pmin(y, cens), cens) ## Observed times (named 'obs' to avoid masking base::t)
d <- ifelse(u < p, ifelse(y < cens, 1, 0), 0) ## Uncensoring indicator (1 = event observed)
data <- data.frame(x = x, t = obs, d = d)
testcov(x, obs, d)