priorityelasticnet {priorityelasticnet}    R Documentation

Priority Elastic Net for High-Dimensional Data

Description

This function performs penalized regression analysis using the elastic net method, tailored for high-dimensional data with a known group structure. It also includes an optional feature to launch a Shiny application for model evaluation with weighted threshold optimization.

Usage

priorityelasticnet(
  X,
  Y,
  weights = NULL,
  family = c("gaussian", "binomial", "cox", "multinomial"),
  alpha = 0.5,
  type.measure,
  blocks,
  max.coef = NULL,
  block1.penalization = TRUE,
  lambda.type = "lambda.min",
  standardize = TRUE,
  nfolds = 10,
  foldid = NULL,
  cvoffset = FALSE,
  cvoffsetnfolds = 10,
  mcontrol = missing.control(),
  scale.y = FALSE,
  return.x = TRUE,
  adaptive = FALSE,
  initial_global_weight = TRUE,
  verbose = FALSE,
  ...
)

Arguments

X

A numeric matrix of predictors.

Y

A response vector. For family = "multinomial", Y should be a factor with more than two levels.

weights

Optional observation weights. Default is NULL.

family

A character string specifying the model type. Options are "gaussian", "binomial", "cox", and "multinomial". Default is "gaussian".

alpha

The elastic net mixing parameter, with 0 \le \alpha \le 1. The penalty is defined as (1-\alpha)/2 ||\beta||_2^2 + \alpha ||\beta||_1, so alpha = 1 gives the lasso penalty and alpha = 0 gives the ridge penalty. Default is 0.5.
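As a rough illustration of the penalty formula above, the following base R sketch evaluates it for a given coefficient vector. The helper name enet_penalty and the example values are hypothetical and not part of the package; the overall factor lambda that multiplies the penalty during fitting is left out.

```r
# Hypothetical helper (not part of priorityelasticnet): evaluates
# (1 - alpha)/2 * ||beta||_2^2 + alpha * ||beta||_1 for a coefficient vector.
enet_penalty <- function(beta, alpha = 0.5) {
  stopifnot(alpha >= 0, alpha <= 1)
  ridge_part <- (1 - alpha) / 2 * sum(beta^2)  # squared L2 norm term
  lasso_part <- alpha * sum(abs(beta))         # L1 norm term
  ridge_part + lasso_part
}

beta <- c(1, -2, 0, 0.5)
enet_penalty(beta, alpha = 1)    # pure lasso: sum(abs(beta))
enet_penalty(beta, alpha = 0)    # pure ridge: sum(beta^2) / 2
enet_penalty(beta, alpha = 0.5)  # equal mix of the two terms
```

Values of alpha between 0 and 1 trade off the sparsity of the lasso against the grouping behaviour of ridge regression.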

type.measure

Loss function for cross-validation. Options are "mse", "deviance", "class", "auc". Default depends on the family.

blocks

A list where each element is a vector of indices indicating the predictors in that block.
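A minimal sketch of how a blocks list could be built and sanity-checked before fitting. The helper check_blocks is hypothetical and not part of the package. It assumes, in line with the example at the end of this page, that the blocks together cover each column of X exactly once; whether the function itself enforces this is not stated here.

```r
# Hypothetical helper: checks that the blocks form a partition of 1:p,
# i.e. every column index appears in exactly one block.
check_blocks <- function(blocks, p) {
  idx <- unlist(blocks, use.names = FALSE)
  !anyDuplicated(idx) && setequal(idx, seq_len(p))
}

p <- 50
blocks <- list(bp1 = 1:10, bp2 = 11:30, bp3 = 31:50)

lengths(blocks)                                  # block sizes: 10, 20, 20
check_blocks(blocks, p)                          # TRUE
check_blocks(list(bp1 = 1:10, bp2 = 10:50), p)   # FALSE: column 10 appears twice
```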

max.coef

A numeric vector specifying the maximum number of non-zero coefficients allowed in each block. Default is NULL, meaning no limit.

block1.penalization

Logical. If FALSE, the first block will not be penalized. Default is TRUE.

lambda.type

Type of lambda to select. Options are "lambda.min" or "lambda.1se". Default is "lambda.min".
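The difference between the two options can be sketched on a toy cross-validation curve. This follows the glmnet convention, where "lambda.min" minimizes the mean cross-validated error and "lambda.1se" is the largest lambda whose error lies within one standard error of that minimum. The numbers below are made up for illustration and the variable names are hypothetical.

```r
# Toy cross-validation curve (made-up numbers, for illustration only).
lambda <- c(1.0, 0.5, 0.25, 0.125, 0.0625)
cvm    <- c(2.0, 1.4, 1.1, 1.0, 1.05)      # mean CV error per lambda
cvsd   <- c(0.2, 0.15, 0.15, 0.15, 0.15)   # standard error of cvm

i_min      <- which.min(cvm)
lambda_min <- lambda[i_min]

# Largest lambda whose error is within one standard error of the minimum.
lambda_1se <- max(lambda[cvm <= cvm[i_min] + cvsd[i_min]])

lambda_min
lambda_1se
```

Because lambda_1se is never smaller than lambda_min, it penalizes more strongly and tends to give sparser models.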

standardize

Logical. If TRUE, the predictors are standardized before the model is fitted. Default is TRUE.

nfolds

Number of folds for cross-validation. Default is 10.

foldid

Optional vector of values between 1 and nfolds identifying the fold to which each observation belongs. Default is NULL.
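One possible way to build such a vector in base R is sketched below. The balanced random assignment is an illustrative choice, not something the package prescribes. The resulting vector could then be supplied through the foldid argument so that the same folds are reused across runs.

```r
set.seed(1)
n      <- 100   # number of observations (rows of X)
nfolds <- 5

# Repeat the fold labels 1..nfolds up to length n, then shuffle them so that
# each observation gets a random fold and the folds are equally sized.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

table(foldid)   # 20 observations in each of the 5 folds
```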

cvoffset

Logical. If TRUE, a cross-validated offset is used. Default is FALSE.

cvoffsetnfolds

Number of folds for cross-validation of the offset. Default is 10.

mcontrol

Control parameters for handling missing data. Default is missing.control().

scale.y

Logical. If TRUE, the response variable Y is scaled. Default is FALSE.

return.x

Logical. If TRUE, the function returns the input matrix X. Default is TRUE.

adaptive

Logical. If TRUE, the adaptive elastic net is used, where penalties are adjusted based on the importance of the coefficients from an initial model fit. Default is FALSE.

initial_global_weight

Logical. If TRUE (the default), global initial weights will be calculated based on all predictors. If FALSE, initial weights will be calculated separately for each block.

verbose

Logical. If TRUE, detailed logs of the fitting process are printed. Default is FALSE.

...

Additional arguments to be passed to cv.glmnet.

Value

A list with the following components:

lambda.ind

Indices of the selected lambda values.

lambda.type

Type of lambda used.

lambda.min

Selected lambda values.

min.cvm

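Mean cross-validated error at the selected lambda for each block, measured on the scale given by type.measure (for example, mean squared error when type.measure = "mse").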

nzero

Number of non-zero coefficients for each block.

glmnet.fit

Fitted glmnet objects for each block.

name

Name of the model.

block1unpen

Fitted model for the unpenalized first block, if applicable.

coefficients

Coefficients of the fitted models.

call

The function call.

X

The input matrix X, if return.x is TRUE.

missing.data

Logical vector indicating missing data.

imputation.models

Imputation models used, if applicable.

blocks.used.for.imputation

Blocks used for imputation, if applicable.

missingness.pattern

Pattern of missing data, if applicable.

y.scale.param

Parameters for scaling Y, if applicable.

blocks

The input blocks.

mcontrol

Control parameters for handling missing data.

family

The model family.

dim.x

Dimensions of the input matrix X.

Note

Ensure that glmnet version >= 2.0.13 is installed. The function does not support single missing values within a block.

Examples



  # Simulation of multinomial data:
  set.seed(123)
  n <- 100
  p <- 50
  k <- 3
  x <- matrix(rnorm(n * p), n, p)
  y <- sample(1:k, n, replace = TRUE)
  y <- factor(y)
  blocks <- list(bp1 = 1:10, bp2 = 11:30, bp3 = 31:50)
  
  # Run priorityelasticnet:
  fit <- priorityelasticnet(x, y, family = "multinomial", alpha = 0.5,
                            type.measure = "class", blocks = blocks,
                            block1.penalization = TRUE, lambda.type = "lambda.min",
                            standardize = TRUE, nfolds = 5,
                            adaptive = FALSE)

  # Inspect the estimated coefficients:
  fit$coefficients


[Package priorityelasticnet version 0.1.0 Index]