cure_rate_MC3 {bayesCureRateModel} | R Documentation |
Main function of the package
Description
Runs a Metropolis Coupled MCMC (MC^3
) sampler in order to estimate the joint posterior distribution of the model.
Usage
cure_rate_MC3(formula, data, nChains = 12, mcmc_cycles = 15000,
alpha = NULL,nCores = 1, sweep = 5, a_g = 1, b_g = 1,
a_l = 2.1, b_l = 1.1, mu_b = NULL,
Sigma = NULL, g_prop_sd = 0.045,
lambda_prop_scale = 0.03, b_prop_sd = NULL,
initialValues = NULL, plot = TRUE, adjust_scales = FALSE,
verbose = FALSE, tau_mala = 1.5e-05, mala = 0.15,
promotion_time = list(family = "weibull",
prior_parameters = matrix(rep(c(2.1, 1.1), 2), byrow = TRUE, 2, 2),
prop_scale = c(0.1, 0.2)), single_MH_in_f = 0.2, c_under = 1e-9)
Arguments
formula |
an object of class |
data |
a data frame containing all variable names included in |
nChains |
Positive integer corresponding to the number of heated chains in the MC |
mcmc_cycles |
Length of the generated MCMC sample. Default value: 15000. Note that each MCMC cycle consists of |
alpha |
A decreasing sequence in |
nCores |
The number of cores used for parallel processing. In case where |
sweep |
The number of usual MCMC iterations per MCMC cycle. Default value: 10. |
a_g |
Parameter |
b_g |
Parameter |
a_l |
Shape parameter |
b_l |
Scale parameter |
mu_b |
Mean ( |
Sigma |
Covariance matrix of the multivariate normal prior distribution of regression coefficients. |
g_prop_sd |
The scale of the proposal distribution for single-site updates of the |
lambda_prop_scale |
The scale of the proposal distribution for single-site updates of the |
b_prop_sd |
The scale of the proposal distribution for the update of the |
initialValues |
A list of initial values for each parameter (optional). |
plot |
Plot MCMC sample on the run. Default: TRUE. |
adjust_scales |
Boolean. If TRUE the MCMC sampler runs an initial phase of a small number of iterations in order to tune the scale of the proposal distributions in the Metropolis-Hastings steps. |
verbose |
Print progress on the terminal if TRUE. |
tau_mala |
Scale of the Metropolis adjusted Langevin diffussion proposal distribution. |
mala |
A number between |
promotion_time |
A list with details indicating the parametric family of distribution describing the promotion time and corresponding prior distributions. See 'details'. |
single_MH_in_f |
The probability for attempting a series of single site updates in the typical Metropolis-Hastings move. Otherwise, with probability 1 - |
c_under |
A small positive number (much smaller than 1) which is used as a threshold in the CDF of the promotion time for avoiding underflows in the computation of the log-likelihood function. Default value: 1e-9. |
Details
It is advised to scale all continuous explanatory variables in the design matrix, so their sample mean and standard deviations are equal to 0 and 1, respectively. No missing or infinite values are accepted.
The promotion_time
argument should be a list containing the following entries
family
Character string specifying the family of distributions
\{F(\cdot;\boldsymbol\alpha);\boldsymbol\alpha\in\mathcal A\}
describing the promotion time.prior_parameters
Values of hyper-parameters in the prior distribution of the parameters
\boldsymbol\alpha
.prop_scale
The scale of the proposal distributions for each parameter in
\boldsymbol\alpha
.K
Optional. The number of mixture components in case where a mixture model is fitted, that is, when setting
distribution
to either'gamma_mixture'
or'user_mixture'
.dirichlet_concentration_parameter
Optional. Relevant only in the case of the
'gamma_mixture'
or'user_mixture'
. Positive scalar (typically, set to 1) determining the (common) concentration parameter of the Dirichlet prior distribution of mixing proportions.
The family
specifies the distributional family of promotion time and corresponds to a character string with available choices: 'exponential'
, 'weibull'
, 'gamma'
, 'logLogistic'
, 'gompertz'
, 'lomax'
, 'dagum'
, 'gamma_mixture'
. User defined promotion time distributions and finite mixtures of them can be also fitted using the options family = 'user'
and family = 'user_mixture'
, respectively. See the 'Note' below for further details.
The joint prior distribution of \boldsymbol\alpha = (\alpha_1,\ldots,\alpha_d)
factorizes into products of inverse Gamma distributions for all (positive) parameters of F
. Moreover, in the case of 'gamma_mixture'
or 'user_mixture'
, the joint prior also consists of another term to the Dirichlet prior distribution on the mixing proportions.
The prop_scale
argument should be a vector with length equal to the length of vector d
(number of elements in \boldsymbol\alpha
), containing (positive) numbers which correspond to the scale of the proposal distribution. Note that these scale parameters are used only as initial values in case where adjust_scales = TRUE
.
Value
An object of class bayesCureModel
, i.e. a list with the following entries
mcmc_sample |
Object of class
|
latent_status_censored |
A data frame with the simulated binary latent status for each censored item. |
complete_log_likelihood |
The complete log-likelihood for the target chain. |
swap_accept_rate |
the acceptance rate of proposed swappings between adjacent MCMC chains. |
all_cll_values |
The complete log-likelihood for all chains. |
input_data_and_model_prior |
the input data, model specification and selected prior parameters values. |
log_posterior |
the logarithm of the (non-augmented) posterior distribution (after integrating the latent cured-status parameters out), up to a normalizing constant. |
map_estimate |
The Maximum A Posterior estimate of parameters. |
BIC |
Bayesian Information Criterion of the fitted model. |
AIC |
Akaike Information Criterion of the fitted model. |
residuals |
The Cox-Snell residuals of the fitted model. |
logLik |
The maximum log-likelihood value for the target chain. |
n_parameters |
The number of parameters of the model. |
initial_values |
The initial values per chain. |
Note
User-defined promotion time distributions and finite mixtures of them can be fitted using the options family = 'user'
and family = 'user_mixture'
, respectively, in the promotion_time
argument.
In this case, the user should specify the distributional family in a separate argument named define
which is passed as an additional entry in the promotion_time
. This function should accept two input arguments y
and a
corresponding to the observed data (vector of positive numbers) and the parameters of the distribution (vector of positives). Pay attention to the positivity requirement of the parameters: if this is not the case, the user should suitably reparameterize the distribution in terms of positive parameters. The define
function should return in the form of a list two arguments: log_f
and log_F
, corresponding to the the logarithm of the probability density function and the logarithm of the cumulative density function, respectively, per observation. See Papastamoulis and Milienos (2024b) for examples of the implementation.
Author(s)
Panagiotis Papastamoulis
References
Papastamoulis and Milienos (2024). Bayesian inference and cure rate modeling for event history data. TEST doi: 10.1007/s11749-024-00942-w.
Papastamoulis and Milienos (2024b). bayesCureRateModel: Bayesian Cure Rate Modeling for Time to Event Data in R. arXiv:2409.10221
Plummer M, Best N, Cowles K, Vines K (2006). "CODA: Convergence Diagnosis and Output Analysis for MCMC." R News, 6(1), 7-11.
Therneau T (2024). A Package for Survival Analysis in R. R package version 3.7-0, https://CRAN.R-project.org/package=survival.
See Also
Examples
# simulate toy data just for cran-check purposes
set.seed(10)
n = 4
# censoring indicators
stat = rbinom(n, size = 1, prob = 0.5)
# covariates
x <- matrix(rnorm(2*n), n, 2)
# observed response variable
y <- rexp(n)
# define a data frame with the response and the covariates
my_data_frame <- data.frame(y, stat, x1 = x[,1], x2 = x[,2])
fit1 <- cure_rate_MC3(survival::Surv(y, stat) ~ x1 + x2,
data = my_data_frame,
promotion_time = list(family = 'weibull'),
nChains = 2, nCores = 1,
mcmc_cycles = 3, sweep = 2)