runVICatMixVarSelAvg {VICatMix} | R Documentation |
runVICatMixVarSelAvg
Description
An extension of 'runVICatMixVarSel' to incorporate model averaging/summarisation over multiple initialisations.
Usage
runVICatMixVarSelAvg(
data,
K,
alpha,
a = 2,
maxiter = 2000,
tol = 5e-08,
outcome = NA,
inits = 25,
loss = "VoIcomp",
var_threshold = 0.95,
parallel = FALSE,
cores = getOption("mc.cores", 2L),
verbose = FALSE
)
Arguments
data |
A data frame or data matrix with N rows of observations, and P columns of covariates. |
K |
Maximum number of clusters desired. Must be an integer greater than 1. |
alpha |
The Dirichlet prior parameter. Must be > 0; a value < 1 is recommended. |
a |
Hyperparameter for variable selection hyperprior. Default is 2. |
maxiter |
The maximum number of iterations for the algorithm. Default is 2000. |
tol |
A convergence parameter. Default is 5x10^-8. |
outcome |
Optional outcome variable. Default is NA; supplying an outcome triggers semi-supervised profile regression. |
inits |
The number of initialisations included in the co-clustering matrix. Default is 25. |
loss |
The loss function to be used with the co-clustering matrix. Options are "VoIavg", "VoIcomp" and "medv". Default is "VoIcomp". |
var_threshold |
Threshold on the selection proportion used to determine which variables are selected under the averaged model. Must satisfy 0 < var_threshold <= 1. Default is 0.95. |
parallel |
Logical value indicating whether to run initialisations in parallel. Default is FALSE. |
cores |
Number of cores to use for parallelisation when parallel = TRUE. Default is getOption("mc.cores", 2L). The package automatically uses the user's parallel backend if one has already been registered. |
verbose |
Logical value indicating whether to output ELBO values for each iteration. Default is FALSE. |
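To make the roles of inits and loss more concrete, the following base-R sketch builds a co-clustering matrix from several label vectors. The three label vectors are made up for illustration, and the code shows the general idea only; it is not taken from the package's internals.

```r
# Hypothetical cluster labels from three initialisations (5 observations each)
init_labels <- list(
  c(1, 1, 2, 2, 3),
  c(2, 2, 1, 1, 1),
  c(1, 1, 1, 2, 2)
)

# For each initialisation, build an N x N indicator matrix that is TRUE where
# two observations share a cluster, then average over the initialisations
coclust <- Reduce(`+`, lapply(init_labels, function(z) outer(z, z, "=="))) /
  length(init_labels)

# Entry [i, j] is the proportion of initialisations placing i and j together
print(round(coclust, 2))
```

Because only pairwise co-membership is recorded, relabelling the clusters within any one initialisation leaves this matrix unchanged. A loss function is then used to pick a single summary clustering from it.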
Value
A list with the following components: (maxNCat refers to the maximum number of categories for any covariate in the data)
labels_avg |
A numeric N-vector listing the cluster assignments for the observations in the averaged model. |
varsel_avg |
A numeric P-vector with a variable selection indicator for the covariates in the averaged model. |
init_results |
A list where each entry is the cluster assignments for one of the initialisations included in the model averaging. |
init_varsel_results |
A list where each entry is the expected value for the variable selection parameters ('c') for one of the initialisations included in the model averaging. |
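As a rough illustration of how varsel_avg could relate to init_varsel_results and var_threshold, the sketch below thresholds the proportion of initialisations in which each variable appears selected. The 0.5 cut-off on the expected value of 'c' and all the numbers are assumptions made for this example; the package's exact rule may differ.

```r
# Hypothetical expected values of 'c' for P = 4 variables over 4 initialisations
init_varsel <- list(
  c(0.99, 0.98, 0.10, 0.97),
  c(0.97, 0.96, 0.05, 0.20),
  c(0.99, 0.99, 0.60, 0.95),
  c(0.98, 0.97, 0.02, 0.99)
)
var_threshold <- 0.95

# Assumption: a variable counts as selected in one initialisation if E[c] > 0.5
selected <- do.call(rbind, init_varsel) > 0.5

# Proportion of initialisations selecting each variable, then apply the threshold
sel_prop <- colMeans(selected)
varsel_avg <- as.numeric(sel_prop >= var_threshold)

print(sel_prop)
print(varsel_avg)
```

With these made-up values, only the first two variables are selected in every initialisation, so only they pass a 0.95 threshold.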
See Also
runVICatMixVarSel
Examples
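# Simulate binary data: 500 observations in 4 clusters with the given mixing
# weights, and 40 covariates of which 10 are irrelevant to the clustering
set.seed(12)
generatedData <- generateSampleDataBin(500, 4, c(0.1, 0.2, 0.3, 0.4), 40, 10)
# Fit with at most K = 10 clusters and alpha = 0.01, averaging over 10 initialisations
result <- runVICatMixVarSelAvg(generatedData$data, 10, 0.01, inits = 10)
# Cluster assignments and variable selection indicators under the averaged model
print(result$labels_avg)
print(result$varsel_avg)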
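A further sketch shows how the parallel and cores arguments might be used. It only calls the package if VICatMix is installed, and capping the run at two cores is an arbitrary choice made for illustration.

```r
# Pick a core count; detectCores() can return NA on some platforms
detected <- parallel::detectCores()
n_cores <- if (is.na(detected)) 1L else min(2L, detected)

# The default for 'cores' reads the "mc.cores" option, falling back to 2
options(mc.cores = n_cores)

# Run the averaged model in parallel only if the package is available
if (requireNamespace("VICatMix", quietly = TRUE)) {
  set.seed(12)
  generatedData <- VICatMix::generateSampleDataBin(500, 4, c(0.1, 0.2, 0.3, 0.4), 40, 10)
  result <- VICatMix::runVICatMixVarSelAvg(generatedData$data, 10, 0.01, inits = 10,
                                           parallel = TRUE, cores = n_cores)
  # Cluster sizes under the averaged model
  print(table(result$labels_avg))
}
```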