cp_est {cutpoint} | R Documentation |
Estimate cutpoints in a multivariable setting for survival data
Description
One or two cutpoints of a metric variable are estimated using either the AIC (Akaike Information Criterion) or the LRT (Likelihood-Ratio Test statistic) within a multivariable Cox proportional hazards model. These cutpoints are used to create two or three groups with different survival probabilities.
The cutpoints are estimated by dichotomising the variable of interest, which is then incorporated into the Cox regression model. The cutpoint of this variable is the value at which the AIC reaches its lowest value or the LRT statistic achieves its maximum for the corresponding Cox-regression model.
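The search described above can be sketched as a simple grid search. This is a minimal illustration under stated assumptions, not the code used inside cp_est(): the data are simulated, the 10% minimum group size is chosen for illustration, and the names d, cand, aic and best_cp are hypothetical. Only survival::coxph(), Surv() and stats::AIC() are used.

```r
library(survival)

set.seed(1)
n <- 300
# Simulated data (hypothetical): the hazard jumps when biomarker > 0.5
d <- data.frame(
  biomarker   = runif(n),
  covariate_1 = rnorm(n)
)
rate    <- exp(1.2 * (d$biomarker > 0.5) + 0.3 * d$covariate_1)
t_event <- rexp(n, rate = rate)
t_cens  <- rexp(n, rate = 0.3)
d$time  <- pmin(t_event, t_cens)
d$event <- as.integer(t_event <= t_cens)

# Candidate cutpoints: observed values that leave at least 10% of the
# sample on each side (assumed minimum group size)
cand <- sort(unique(d$biomarker))
cand <- cand[cand >= quantile(d$biomarker, 0.1) &
             cand <= quantile(d$biomarker, 0.9)]

# Fit one Cox model per candidate cutpoint, adjusted for the covariate,
# and record its AIC
aic <- vapply(cand, function(cp) {
  d$grp <- as.integer(d$biomarker > cp)
  AIC(coxph(Surv(time, event) ~ grp + covariate_1, data = d))
}, numeric(1))

# The estimated cutpoint is the candidate with the lowest AIC
best_cp <- cand[which.min(aic)]
round(best_cp, 2)
```

In this sketch every candidate model has the same number of parameters, so ranking the candidates by the lowest AIC or by the largest likelihood-ratio statistic against a common null model selects the same cutpoint.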
The search takes place within a multivariable framework, because other covariates and/or factors are taken into account while the cutpoints are estimated. Cutpoints can also be estimated when the variable of interest shows a U-shaped or inverted U-shaped relationship to the hazard ratio of time-to-event data. The argument symtails allows two cutpoints to be estimated such that the two outer tails form groups of equal size.
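One way to picture outer tails of equal size is to place the two cutpoints at complementary quantiles. The sketch below is an assumption about what symmetric tails mean, not a description of how cp_est() implements symtails; the variable x, the tail proportion p and the group labels are hypothetical.

```r
set.seed(2)
x <- rnorm(200)   # hypothetical continuous variable of interest
p <- 0.2          # assumed proportion of observations in each outer tail

# Two cutpoints at complementary quantiles give tails of equal size
cps <- quantile(x, c(p, 1 - p))

# Form three groups: lower tail, middle, upper tail
grp <- cut(x, breaks = c(-Inf, cps, Inf),
           labels = c("low", "middle", "high"))
table(grp)
```

A search with symmetric tails would then only need to vary p, because the second cutpoint is determined by the first.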
Usage
cp_est(
  cpvarname,
  time = "time",
  event = "event",
  covariates = NULL,
  data = data,
  nb_of_cp = 1,
  bandwith = 0.1,
  est_type = "AIC",
  cpvar_strata = FALSE,
  ushape = FALSE,
  symtails = FALSE,
  dp = 2,
  plot_splines = TRUE,
  all_splines = TRUE,
  print_res = TRUE,
  verbose = TRUE
)
Arguments
cpvarname |
character, the name of the variable for which the cutpoints are estimated. |
time |
character, the name of the variable containing the follow-up time. |
event |
character, the name of the status indicator variable, normally coded 0 = no event, 1 = event. |
covariates |
character vector with the names of the covariates and/or
factors. If no covariates are used, set |
data |
a data.frame, contains the following variables:
|
nb_of_cp |
numeric, number of cutpoints to be estimated (1 or 2). The
default is: |
bandwith |
numeric, minimum group size per group in percent of the total
sample size, |
est_type |
character, the method used to estimate the cutpoints. The default is 'AIC' (Akaike information criterion). The other option is 'LRT' (likelihood ratio test statistic). |
cpvar_strata |
logical value: if |
ushape |
logical value: if |
symtails |
logical value: if |
dp |
numeric, number of decimal places the cutpoints are rounded to.
Default is |
plot_splines |
logical value: if |
all_splines |
logical value: if |
print_res |
logical value: if |
verbose |
logical value: if |
Value
Returns the cpobj object with the cutpoints and the characteristics of the formed groups.
References
Govindarajulu, U., & Tarpey, T. (2020). Optimal partitioning for the proportional hazards model. Journal of Applied Statistics, 49(4), 968–987. https://doi.org/10.1080/02664763.2020.1846690
See Also
cp_splines_plot() for penalized spline plots, cp_value_plot() for Value plots and Index plots
Examples
# Example 1:
# Estimate two cutpoints of the variable biomarker.
# The dataset data1 is included in this package and contains
# the variables time, event, biomarker, covariate_1, and covariate_2.
cpobj <- cp_est(
  cpvarname = "biomarker",
  covariates = c("covariate_1", "covariate_2"),
  data = data1,
  nb_of_cp = 2,
  plot_splines = FALSE
)
# Example 2:
# Search for cutpoints when the variable shows a U-shaped or
# inverted U-shaped relationship to the hazard ratio.
# The dataset data2_ushape is included in this package and contains
# the variables time, event, biomarker, and covariate_1.
cpobj <- cp_est(
  cpvarname = "biomarker",
  covariates = c("covariate_1"),
  data = data2_ushape,
  nb_of_cp = 2,
  bandwith = 0.2,
  ushape = TRUE,
  plot_splines = FALSE
)