runVICatMixVarSelAvg {VICatMix}    R Documentation

runVICatMixVarSelAvg

Description

An extension of 'runVICatMixVarSel' to incorporate model averaging/summarisation over multiple initialisations.

Usage

runVICatMixVarSelAvg(
  data,
  K,
  alpha,
  a = 2,
  maxiter = 2000,
  tol = 5e-08,
  outcome = NA,
  inits = 25,
  loss = "VoIcomp",
  var_threshold = 0.95,
  parallel = FALSE,
  cores = getOption("mc.cores", 2L),
  verbose = FALSE
)

Arguments

data

A data frame or data matrix with N rows of observations, and P columns of covariates.

K

Maximum number of clusters desired. Must be an integer greater than 1.

alpha

The Dirichlet prior parameter. Must be > 0; a value < 1 is recommended.

a

Hyperparameter for variable selection hyperprior. Default is 2.

maxiter

The maximum number of iterations for the algorithm. Default is 2000.

tol

A convergence parameter. Default is 5e-08.

outcome

Optional outcome variable. Default is NA; supplying an outcome triggers semi-supervised profile regression.

inits

The number of initialisations included in the co-clustering matrix. Default is 25.

loss

The loss function to be used with the co-clustering matrix. Options are "VoIavg", "VoIcomp" and "medv". Default is "VoIcomp".

var_threshold

Threshold on the selection proportion used to determine which variables are selected under the averaged model. Must be a number n with 0 < n <= 1. Default is 0.95.

parallel

Logical value indicating whether to run initialisations in parallel. Default is FALSE.

cores

Number of cores to use for parallelisation when parallel = TRUE. Default is getOption("mc.cores", 2L). The package automatically uses the user's parallel backend if one has already been registered.

verbose

Logical value indicating whether to output ELBO values for each iteration. Default is FALSE.
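
The inits and loss arguments both refer to a co-clustering matrix. As a rough illustration of that idea, the base R sketch below builds such a matrix from several label vectors, assuming it records the proportion of initialisations in which each pair of observations shares a cluster. The labels and the helper co_clustering() are hypothetical and are not part of VICatMix; the package's internal computation may differ.

```r
# Hypothetical cluster labels from three initialisations for five observations.
init_labels <- list(
  c(1, 1, 2, 2, 3),
  c(2, 2, 1, 1, 1),
  c(1, 1, 1, 2, 2)
)

# co_clustering() is an illustrative helper, not a VICatMix function.
# Entry [i, j] is the proportion of initialisations in which
# observations i and j were assigned to the same cluster.
co_clustering <- function(label_list) {
  # One 0/1 indicator matrix per initialisation.
  mats <- lapply(label_list, function(z) outer(z, z, "==") * 1)
  # Average the indicator matrices across initialisations.
  Reduce(`+`, mats) / length(label_list)
}

psm <- co_clustering(init_labels)
print(psm)
```

Because only co-membership is compared, the matrix is unaffected by the different label numbering used in each initialisation.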

Value

A list with the following components:

labels_avg

A numeric N-vector listing the cluster assignments for the observations in the averaged model.

varsel_avg

A numeric P-vector with a variable selection indicator for the covariates in the averaged model.

init_results

A list where each entry is the cluster assignments for one of the initialisations included in the model averaging.

init_varsel_results

A list where each entry is the expected value for the variable selection parameters ('c') for one of the initialisations included in the model averaging.
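
To make the role of var_threshold more concrete, the sketch below shows one plausible way a selection proportion could be derived from values shaped like init_varsel_results. It rests on two assumptions that this page does not confirm: a covariate counts as selected in an initialisation when its expected 'c' exceeds 0.5, and it is selected in the averaged model when its selection proportion is at least var_threshold. All values are made up for illustration.

```r
# Hypothetical expected values of 'c' for P = 4 covariates
# across 4 initialisations (one vector per initialisation).
init_varsel <- list(
  c(0.99, 0.98, 0.10, 0.60),
  c(0.97, 0.95, 0.05, 0.40),
  c(1.00, 0.40, 0.20, 0.70),
  c(0.96, 0.99, 0.01, 0.55)
)
var_threshold <- 0.95

# Assumption: a covariate is "selected" in one initialisation if E[c] > 0.5.
# sapply() returns a P x inits logical matrix (one column per initialisation).
selected <- sapply(init_varsel, function(cvals) cvals > 0.5)

# Proportion of initialisations in which each covariate is selected.
selection_prop <- rowMeans(selected)

# Assumption: selected in the averaged model if the proportion
# reaches var_threshold.
varsel <- as.numeric(selection_prop >= var_threshold)

print(selection_prop)
print(varsel)
```

Under these assumptions, lowering var_threshold to 0.75 would also select the second and fourth covariates, which shows how the threshold trades off strictness against sensitivity.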

See Also

runVICatMixVarSel

Examples

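# Generate sample data, then fit the averaged model with at most
# 10 clusters, alpha = 0.01 and 10 initialisations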

set.seed(12)
generatedData <- generateSampleDataBin(500, 4, c(0.1, 0.2, 0.3, 0.4), 40, 10)
result <- runVICatMixVarSelAvg(generatedData$data, 10, 0.01, inits = 10)

print(result$labels_avg)
print(result$varsel_avg)
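
A further usage sketch, using only the arguments and return components documented on this page, shows a non-default loss and threshold and a few ways of summarising the output. It assumes the VICatMix package is installed and that varsel_avg codes selected covariates as 1; the particular values of loss and var_threshold are arbitrary choices for illustration.

```r
# Run only if the VICatMix package is available.
if (requireNamespace("VICatMix", quietly = TRUE)) {
  library(VICatMix)

  set.seed(12)
  generatedData <- generateSampleDataBin(500, 4, c(0.1, 0.2, 0.3, 0.4), 40, 10)

  # Same model as above, with a different loss and a looser threshold.
  result <- runVICatMixVarSelAvg(
    generatedData$data, 10, 0.01,
    inits = 10,
    loss = "medv",
    var_threshold = 0.9
  )

  # Cluster sizes in the averaged model.
  print(table(result$labels_avg))

  # Indices of covariates flagged as selected
  # (assumes the indicator is coded as 1 for selected).
  print(which(result$varsel_avg == 1))

  # Number of initialisations retained in the averaging.
  print(length(result$init_results))
}
```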




[Package VICatMix version 1.0 Index]