dataexample.unstratified {CaseCohortCoxSurvival}    R Documentation
Example of case-cohort data with unstratified sampling of the subcohort, and a set of auxiliary variables
Description
List with cohort and A.
cohort
is a simulated cohort with 20 000 subjects. It contains:
id
is the subject identifier.
X1
is a continuous baseline covariate. Its measurements are only available for subjects in the case-cohort, i.e., subjects with subcohort = 1 and/or status = 1.
X2
is a categorical baseline covariate, with categories 0, 1, and 2. It is measured on all cohort subjects.
X3
is a continuous baseline covariate. Its measurements are only available for subjects in the case-cohort.
status
indicates case status.
event.time
gives the event or censoring time. status indicates whether the subject experienced the event of interest or was censored.
subcohort
indicates the subjects included in the subcohort. 1053 subjects were sampled (independently of case status) from the cohort. The case-cohort (phase-two sample) consists of the subcohort and any other cases not in the subcohort.
n
gives the number of subjects in the cohort.
m
gives the number of subjects sampled from the cohort (i.e., 1053).
m and n would be used to compute the design weights of non-cases. Because all the cases were included in the case-cohort, they would be assigned a design weight of 1.
n.cases
gives the number of cases in the entire cohort.
X1.proxy
is a continuous baseline covariate. It is a proxy for X1, with correlation 0.8. It is measured on all cohort subjects. It can be used for design-weight calibration through the argument predictors.cox.phase2 of function caseCohortCoxSurvival, because X1 must be predicted on the entire cohort.
X3.proxy
is a continuous baseline covariate. It is a proxy for X3, with correlation 0.8. It is measured on all cohort subjects. It can be used for design-weight calibration through the argument predictors.cox.phase2 of function caseCohortCoxSurvival, because X3 must be predicted on the entire cohort.
X1.pred
is a prediction of X1, available for all cohort subjects. The predictions were obtained by weighted linear regression on X1.proxy, with the design weights.
X3.pred
is a prediction of X3, available for all cohort subjects. The predictions were obtained by weighted linear regression on X1.proxy, X2, and X3.proxy, with the design weights.
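The design weights and the weighted regression behind X1.pred can be sketched on a toy cohort. This is only an illustration under stated assumptions: the data are simulated here (they are not the packaged data), the non-case weight is taken to be n / m (the inverse of the subcohort sampling fraction), and the sizes n = 2000 and m = 100 are arbitrary.

```r
# Toy cohort reusing the documented column names; all values are simulated
set.seed(123)
n <- 2000   # cohort size (20 000 in the packaged data)
m <- 100    # subcohort size (1053 in the packaged data)

X1 <- rnorm(n)
cohort <- data.frame(
  id        = seq_len(n),
  X1        = X1,
  X1.proxy  = 0.8 * X1 + sqrt(1 - 0.8^2) * rnorm(n),  # correlation about 0.8 with X1
  status    = rbinom(n, 1, 0.05),
  subcohort = 0
)
cohort$subcohort[sample(n, m)] <- 1   # sampled independently of case status

# Case-cohort (phase-two sample): subcohort members plus all other cases
in.phase2 <- cohort$subcohort == 1 | cohort$status == 1

# X1 is only observed in the phase-two sample
cohort$X1[!in.phase2] <- NA

# Design weights: 1 for cases, n / m for non-cases (assumed form)
cohort$weight <- ifelse(cohort$status == 1, 1, n / m)

# Weighted linear regression fitted on phase-two data,
# then used to predict X1 for every cohort subject
fit <- lm(X1 ~ X1.proxy, data = cohort[in.phase2, ], weights = weight)
cohort$X1.pred <- predict(fit, newdata = cohort)
```

Only the phase-two rows enter the fit, but the prediction uses X1.proxy, which is measured on everyone, so X1.pred has no missing values.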
A
contains auxiliary variables, obtained as proposed by Breslow et al. (2009) and Shin et al. (2020). A can be used with argument aux.var of function caseCohortCoxSurvival.
Predictions of X1 were obtained by weighted linear regression on X1.proxy and X2, with the design weights. Predictions of X3 were obtained by weighted linear regression on X1.proxy, X2, and X3.proxy, with the design weights. A Cox model with X2 and the predicted values of X1 and X3 (available for all cohort subjects) was then fitted. A.X1, A.X2, and A.X3 contain the influences on the estimated log relative hazards (log-RHs), available for all cohort subjects.
Next, the design weights were calibrated based on A.1, A.X1, A.X2, and A.X3, where A.1 is identically equal to 1. The log-RH parameter was then estimated from the case-cohort data with these calibrated weights. Finally, the log-RH estimate was combined with X2 and the predicted values of X1 and X3 (available for all cohort subjects), and the result was exponentiated. A.Shin contains the product of this quantity with the total follow-up time on interval (0,8].
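The construction of the auxiliary variables can be illustrated roughly as follows. This is a sketch and not the package's own code. It makes these assumptions: the cohort is simulated; influences are approximated by the dfbeta residuals of survival::coxph; X2 enters the model as a single numeric term; and the weight-calibration step is skipped, so the whole-cohort coefficients stand in for the calibrated-weight log-RH estimate when forming the A.Shin-like column.

```r
library(survival)  # recommended package distributed with R

# Simulated cohort with predicted covariates available for everyone
set.seed(42)
n <- 1000
cohort <- data.frame(
  X1.pred = rnorm(n),
  X2      = sample(0:2, n, replace = TRUE),
  X3.pred = rnorm(n)
)
# Simulated event times with administrative censoring at time 10
true.time <- rexp(n, rate = 0.02 * exp(0.3 * cohort$X1.pred + 0.2 * cohort$X2))
cohort$event.time <- pmin(true.time, 10)
cohort$status     <- as.integer(true.time <= 10)

# Cox model on the whole cohort, using the predicted covariates
fit <- coxph(Surv(event.time, status) ~ X1.pred + X2 + X3.pred, data = cohort)

# Influence of each subject on each log-RH estimate (dfbeta residuals):
# one row per subject, one column per coefficient
infl <- residuals(fit, type = "dfbeta")
A <- data.frame(A.1 = 1, A.X1 = infl[, 1], A.X2 = infl[, 2], A.X3 = infl[, 3])

# A.Shin-like variable: exp(linear predictor) times follow-up time on (0, 8].
# Simplification: coef(fit) replaces the calibrated-weight estimate.
covars   <- as.matrix(cohort[, c("X1.pred", "X2", "X3.pred")])
lin.pred <- as.vector(covars %*% coef(fit))
followup <- pmin(cohort$event.time, 8)
A$A.Shin <- followup * exp(lin.pred)
```

The resulting data frame has one row per cohort subject, which is what calibration requires: the auxiliary variables must be known for everyone, not only for the phase-two sample.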
References
Etievant, L., Gail, M. H. (2024). Cox model inference for relative hazard and pure risk from stratified weight-calibrated case-cohort data. Lifetime Data Analysis, 30, 572-599.
Etievant, L., Gail, M. H. (2024). Software Application Profile: CaseCohortCoxSurvival: an R package for case-cohort inference for relative hazard and pure risk under the Cox model. Submitted.
Shin, Y.E., Pfeiffer, R.M., Graubard, B.I., Gail, M.H. (2020). Weight calibration to improve the efficiency of pure risk estimates from case-control samples nested in a cohort. Biometrics, 76, 1087-1097.
Breslow, N.E., Lumley, T., Ballantyne, C.M., Chambless, L.E. and Kulich, M. (2009). Improved Horvitz-Thompson Estimation of Model Parameters from Two-phase Stratified Samples: Applications in Epidemiology. Statistics in Biosciences, 1, 32-49.
Examples
data(dataexample.unstratified, package="CaseCohortCoxSurvival")
# Display some of the data
dataexample.unstratified$cohort[1:5, ]
dataexample.unstratified$A[1:5, ] # auxiliary variable values in the cohort