aloocv {baclava}    R Documentation

Approximate Leave-One-Out Cross-Validation

Description

Approximate leave-one-out cross-validation computed from the posterior draws of the Markov chain Monte Carlo sampler as implemented in fit_baclava().

Usage

aloocv(
  object,
  data.clinical,
  data.assess,
  J.increment = 75L,
  J.max = 225L,
  ess.target = 50L,
  n.core = 1L,
  verbose = TRUE,
  lib = NULL
)

Arguments

object

The value object returned by fit_baclava().

data.clinical

A data.frame object. The clinical data on which the model is assessed. The data must be structured as for fit_baclava(); specifically, it must contain

  • id: A character, numeric, or integer object. The unique participant id to which the record pertains. Note that these must include every id provided in data.assess. There must be exactly one record for each participant.

  • age_entry: A numeric object. The age at time of entry into the study. Note that this data is used to calculate the normalization; to expedite numerical integration, it is recommended that the ages be rounded to minimize repeated calculations. Optional input 'round.age.entry' can be set to FALSE if this approximation is not desired; however, the computation time will significantly increase.

  • endpoint_type: A character object. Must be one of {"clinical", "censored", "preclinical"}. Type "clinical" indicates that disease was diagnosed in the clinical compartment (i.e., symptomatic). Type "preclinical" indicates that disease was diagnosed in the pre-clinical compartment (i.e., during an assessment). Type "censored" indicates that the participant was censored.

  • age_endpoint: A numeric object. The participant's age at the time the endpoint was evaluated.

If the sensitivity parameter (beta) is arm-specific, an additional column arm is required indicating the study arm to which each participant is assigned. Similarly, if the preclinical Weibull distribution is group-specific, an additional column grp.rateP is required. See Details for further information.
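The column requirements above can be sketched with a small toy data.frame. The object name toy_clinical and all values are hypothetical and purely illustrative. The check that age_endpoint is not below age_entry is an assumption, not a rule stated in this documentation.

```r
# Hypothetical toy data; values are illustrative only.
toy_clinical <- data.frame(
  id            = c("p1", "p2", "p3"),
  age_entry     = c(50, 55, 60),
  endpoint_type = c("clinical", "censored", "preclinical"),
  age_endpoint  = c(58.2, 70.0, 63.5),
  stringsAsFactors = FALSE
)

# Structural checks mirroring the requirements listed above.
required <- c("id", "age_entry", "endpoint_type", "age_endpoint")
stopifnot(
  all(required %in% names(toy_clinical)),
  !anyDuplicated(toy_clinical$id),          # one record per participant
  all(toy_clinical$endpoint_type %in%
        c("clinical", "censored", "preclinical")),
  # Assumption (not stated in this page): endpoint is not before entry.
  all(toy_clinical$age_endpoint >= toy_clinical$age_entry)
)
```

Running checks of this kind before calling aloocv() may make structural problems easier to locate than an error raised inside the sampler.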

data.assess

A data.frame object. The disease status assessment data on which the model is assessed. The data must be structured as for fit_baclava(); specifically, the data must contain

  • id: A character, numeric, or integer object. The unique participant id to which the record pertains.

  • age_assess: A numeric object. The participant's age at time of assessment.

  • disease_detected: An integer object. Must be binary 0/1, where 1 indicates that disease was detected at the assessment; 0 otherwise.

If the sensitivity parameter (beta) is screen-type specific, an additional column screen_type is required indicating the type of each screen.
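As a rough illustration of the assessment data layout, the sketch below builds a toy data.frame with several assessments per participant and checks the listed requirements. The names toy_assess and toy_clinical_ids and all values are hypothetical. The id check stands in for the requirement that data.clinical contain every id found in data.assess.

```r
# Hypothetical ids assumed to be present in the clinical data.
toy_clinical_ids <- c("p1", "p2", "p3")

# Hypothetical assessment records; a participant may have several rows.
toy_assess <- data.frame(
  id               = c("p1", "p1", "p2", "p3", "p3"),
  age_assess       = c(51, 53, 56, 61, 63.5),
  disease_detected = c(0L, 0L, 0L, 0L, 1L),
  stringsAsFactors = FALSE
)

# Structural checks mirroring the requirements listed above.
stopifnot(
  all(c("id", "age_assess", "disease_detected") %in% names(toy_assess)),
  is.integer(toy_assess$disease_detected),
  all(toy_assess$disease_detected %in% c(0L, 1L)),
  all(toy_assess$id %in% toy_clinical_ids)
)

# Number of assessments recorded for each participant.
n_assess <- table(toy_assess$id)
```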

J.increment

An integer object. The number of replicates of each participant to generate in each iteration of the importance sampling procedure to attain the desired effective sample size.

J.max

An integer object. The maximum number of samples to be drawn.

ess.target

An integer object. The target effective sample size in the importance sampling procedure.

n.core

An integer object. The number of cores to use. If greater than 1, the outer loop across participants is run in parallel using foreach().

verbose

A logical object. If TRUE, progress information will be printed. This input will be ignored if n.core > 1.

lib

An optional character vector specifying the library path to be provided to the cluster.

Details

Computes the predictive fit of a model. For each individual and each MCMC draw, the function approximates the marginal likelihood via importance sampling. It samples J.increment values of the individual's latent variables using the Metropolis-Hastings proposal distributions and computes the effective sample size (ESS) of the importance sampling procedure. If the target ESS is not met, J.increment additional samples are drawn and the ESS is re-evaluated. This is repeated until either the target ESS is reached or J.max samples have been drawn.
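The stopping rule described above can be sketched on a toy problem. This is not the package's implementation. The function adaptive_is(), the normal prior, likelihood, and proposal, and the Kish formula for the ESS are all assumptions chosen for illustration. Only the roles of J.increment, J.max, and ess.target follow the description above.

```r
set.seed(1)

# Kish effective sample size of importance weights (an assumed definition).
ess <- function(w) sum(w)^2 / sum(w^2)

# Toy stand-ins (hypothetical, not from baclava):
#   latent z ~ N(0, 1), observation y | z ~ N(z, 1), proposal z ~ N(y / 2, 1).
adaptive_is <- function(y, J.increment = 75L, J.max = 225L, ess.target = 50) {
  log_w <- numeric(0)
  repeat {
    # Draw another batch of latent values from the proposal.
    z <- rnorm(J.increment, mean = y / 2, sd = 1)
    # Log importance weight: likelihood * prior / proposal.
    log_w <- c(log_w,
               dnorm(y, mean = z, sd = 1, log = TRUE) +
                 dnorm(z, log = TRUE) -
                 dnorm(z, mean = y / 2, sd = 1, log = TRUE))
    w <- exp(log_w - max(log_w))  # rescaled for numerical stability
    J <- length(log_w)
    # Stop when the target ESS is reached or J.max draws have been used.
    if (ess(w) >= ess.target || J >= J.max) break
  }
  # The marginal likelihood estimate is the mean unnormalised weight.
  list(likelihood = exp(max(log_w)) * mean(w), ess = ess(w), J = J)
}

out <- adaptive_is(y = 0.3)
```

In this toy model the exact marginal likelihood is available as dnorm(0.3, 0, sqrt(2)), so out$likelihood can be compared against it. In the setting described above, this kind of loop would be repeated for each participant and each MCMC draw.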

Value

A list object. Element summary contains the min, mean, and the 1 likelihood; and the individual-level and estimated predictive fit. Element result contains the likelihood, ESS, and J for each MCMC sample for each participant.

Examples


data(screen_data)

theta_0 <- list("rate_H" = 7e-4, "shape_H" = 2.0,
                "rate_P" = 0.5, "shape_P" = 1.0,
                "beta" = 0.9, "psi" = 0.4)
prior <- list("rate_H" = 0.01, "shape_H" = 1,
              "rate_P" = 0.01, "shape_P" = 1,
              "a_psi" = 1/2, "b_psi" = 1/2,
              "a_beta" = 38.5, "b_beta" = 5.8)

# This is for illustration only -- the number of MCMC samples should be
# significantly larger and the epsilon values should be tuned.
example <- fit_baclava(data.assess = data.screen,
                       data.clinical = data.clinical,
                       t0 = 30.0,
                       theta_0 = theta_0,
                       prior = prior,
                       thin = 10L)

res <- aloocv(example, data.clinical, data.screen)


[Package baclava version 1.1 Index]