glmmsel {glmmsel} | R Documentation |
Generalised linear mixed model selection
Description
Fits the regularisation path for a sparse generalised linear mixed model (GLMM).
Usage
glmmsel(
x,
y,
cluster,
family = c("gaussian", "binomial"),
local.search = FALSE,
max.nnz = 100,
nlambda = 100,
lambda.step = 0.99,
lambda = NULL,
alpha = 0.8,
intercept = TRUE,
random.intercept = TRUE,
standardise = TRUE,
eps = 1e-04,
max.cd.iter = 10000,
max.ls.iter = 100,
max.bls.iter = 30,
t.init = 1,
t.scale = 0.5,
max.pql.iter = 100,
active.set = TRUE,
active.set.count = 3,
sort = TRUE,
screen = 100,
warn = TRUE
)
Arguments
x |
a predictor matrix |
y |
a response vector |
cluster |
a vector of length |
family |
the likelihood family to use; 'gaussian' for a continuous response or 'binomial' for a binary response |
local.search |
a logical indicating whether to perform local search after coordinate descent; typically leads to higher quality solutions |
max.nnz |
the maximum number of predictors ever allowed to be active |
nlambda |
the number of regularisation parameters to evaluate when |
lambda.step |
the step size taken when computing |
lambda |
an optional vector of regularisation parameters |
alpha |
the hierarchical parameter |
intercept |
a logical indicating whether to include a fixed intercept |
random.intercept |
a logical indicating whether to include a random intercept; applies only when intercept = TRUE |
standardise |
a logical indicating whether to scale the data to have unit root mean square; all parameters are returned on the original scale of the data |
eps |
the convergence tolerance; convergence is declared when the relative maximum difference in consecutive parameter values is less than eps |
max.cd.iter |
the maximum number of coordinate descent iterations allowed |
max.ls.iter |
the maximum number of local search iterations allowed |
max.bls.iter |
the maximum number of backtracking line search iterations allowed |
t.init |
the initial value of the gradient step size during backtracking line search |
t.scale |
the scaling parameter of the gradient step size during backtracking line search |
max.pql.iter |
the maximum number of penalised quasi-likelihood iterations allowed |
active.set |
a logical indicating whether to use active set updates; typically lowers the run time |
active.set.count |
the number of consecutive coordinate descent iterations in which a subset should appear before running active set updates |
sort |
a logical indicating whether to sort the coordinates before running coordinate descent; typically leads to higher quality solutions |
screen |
the number of predictors to keep after gradient screening; smaller values typically lower the run time |
warn |
a logical indicating whether to print a warning if the algorithms fail to converge |
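The roles of t.init, t.scale and max.bls.iter can be illustrated with a generic backtracking line search. The following is a minimal sketch in base R, not the package's internal routine: the helper name backtrack_step, the sufficient decrease condition with constant 0.5, and the toy quadratic objective are all assumptions made for illustration.

```r
# Generic backtracking line search for one gradient step (illustrative only;
# backtrack_step is a hypothetical helper, not part of glmmsel).
backtrack_step <- function(f, grad, par, t.init = 1, t.scale = 0.5,
                           max.bls.iter = 30) {
  g <- grad(par)
  f0 <- f(par)
  t <- t.init                      # start from the initial step size
  for (iter in seq_len(max.bls.iter)) {
    candidate <- par - t * g
    # Assumed sufficient decrease (Armijo-type) condition with constant 0.5
    if (f(candidate) <= f0 - 0.5 * t * sum(g^2)) {
      return(list(par = candidate, t = t, iter = iter, converged = TRUE))
    }
    t <- t * t.scale               # shrink the step size and try again
  }
  # Iteration budget exhausted without an acceptable step
  list(par = par, t = t, iter = max.bls.iter, converged = FALSE)
}

# Toy objective: a quadratic with minimiser c(1, -2)
f <- function(b) 5 * sum((b - c(1, -2))^2)
grad <- function(b) 10 * (b - c(1, -2))
step <- backtrack_step(f, grad, par = c(0, 0))
```

In this sketch a smaller t.scale shrinks the step size faster, so fewer trial steps are needed but the accepted step may be shorter than necessary, while max.bls.iter bounds the number of trial steps.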
Value
An object of class glmmsel; a list with the following components:
beta0 |
a vector of fixed intercepts |
gamma0 |
a vector of random intercept variances |
beta |
a matrix of fixed slopes |
gamma |
a matrix of random slope variances |
u |
an array of random coefficient predictions |
sigma2 |
a vector of residual variances |
loss |
a vector of loss function values |
cd.iter |
a vector indicating the number of coordinate descent iterations for convergence |
ls.iter |
a vector indicating the number of local search iterations for convergence |
pql.iter |
a vector indicating the number of penalised quasi-likelihood iterations for convergence |
nnz |
a vector of the number of nonzeros |
lambda |
a vector of regularisation parameters used for the fit |
family |
the likelihood family used |
clusters |
a vector of cluster identifiers |
alpha |
the value of the hierarchical parameter used for the fit |
intercept |
whether a fixed intercept is included in the model |
random.intercept |
whether a random intercept is included in the model |
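The path-level components can be tabulated side by side to compare fits. The sketch below uses a hand-built stand-in list rather than a real fit, so it runs without the package: the numbers are made up, and it assumes that lambda, nnz and loss are aligned element by element along the regularisation path.

```r
# Stand-in for a fitted object (hypothetical values); only the documented
# component names lambda, nnz and loss are used.
fit_like <- list(
  lambda = c(50, 20, 10, 5),
  nnz    = c(0, 2, 4, 7),
  loss   = c(310, 180, 150, 149)
)

# One row per point on the regularisation path
path <- data.frame(
  lambda = fit_like$lambda,
  nnz    = fit_like$nnz,
  loss   = fit_like$loss
)

# Among fits with at most 5 nonzeros, pick the one with the smallest loss
candidates <- path[path$nnz <= 5, ]
best <- candidates[which.min(candidates$loss), ]
```

With a real fit, the same table could presumably be built from fit$lambda, fit$nnz and fit$loss, and the chosen value passed as the lambda argument of coef or predict as in the Examples.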
Author(s)
Ryan Thompson <ryan.thompson-1@uts.edu.au>
References
Thompson, R., Wand, M. P., and Wang, J. J. J. (2025). 'Scalable subset selection in linear mixed models'. arXiv: 2506.20425.
Examples
# Generate data
set.seed(1234)
n <- 100
m <- 4
p <- 10
s <- 5
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, s), rep(0, p - s))
u <- cbind(matrix(rnorm(m * s), m, s), matrix(0, m, p - s))
cluster <- sample(1:m, n, replace = TRUE)
xb <- rowSums(x * sweep(u, 2, beta, '+')[cluster, ])
y <- rnorm(n, xb)
# Fit sparse linear mixed model
fit <- glmmsel(x, y, cluster)
plot(fit)
fixef(fit, lambda = 10)
ranef(fit, lambda = 10)
coef(fit, lambda = 10)
predict(fit, x[1:3, ], cluster[1:3], lambda = 10)
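The line that computes xb in the data generation step is compact, so the following base R sketch spells it out. It repeats the simulation above and checks the vectorised expression against an explicit loop in which each observation uses the fixed slopes plus the random slopes of its own cluster. The names xb_loop and coef_i are introduced here for illustration only.

```r
# Same simulated data as in the example above
set.seed(1234)
n <- 100
m <- 4
p <- 10
s <- 5
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, s), rep(0, p - s))
u <- cbind(matrix(rnorm(m * s), m, s), matrix(0, m, p - s))
cluster <- sample(1:m, n, replace = TRUE)

# Vectorised form: sweep() adds beta[j] to column j of u, giving one row of
# cluster-specific coefficients per cluster; indexing by cluster expands
# this to one row per observation
xb <- rowSums(x * sweep(u, 2, beta, '+')[cluster, ])

# Equivalent explicit loop
xb_loop <- numeric(n)
for (i in seq_len(n)) {
  coef_i <- beta + u[cluster[i], ]   # fixed plus random slopes for cluster of i
  xb_loop[i] <- sum(x[i, ] * coef_i)
}

stopifnot(isTRUE(all.equal(xb, xb_loop)))
```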