TSLA.fit {TSLA}    R Documentation
Solve the TSLA optimization problem
Description
Find the solutions with a Smoothing Proximal Gradient (SPG) algorithm for a sequence of \alpha and \lambda values.
Usage
TSLA.fit(
  y,
  X_1 = NULL,
  X_2,
  treemat,
  family = c("ls", "logit"),
  penalty = c("CL2", "RFS-Sum"),
  gamma.init = NULL,
  weight = NULL,
  group.weight = NULL,
  feature.weight = NULL,
  control = list(),
  modstr = list()
)
Arguments
y: Response in matrix form; continuous for family = "ls" and binary (0/1) for family = "logit".
X_1: Design matrix for unpenalized features (excluding the intercept). Needs to be in matrix form.
X_2: Expanded design matrix for the penalized features, in matrix form, from the tree-guided expansion (the x.expand component returned by getetmat(); see Examples).
treemat: Expanded tree structure in matrix form from the tree-guided expansion (the tree.expand component returned by getetmat(); see Examples).
family: Two options. Use "ls" for least squares problems and "logit" for logistic regression problems.
penalty: Two options for the group penalty on \gamma: "CL2" and "RFS-Sum". See Details.
gamma.init: Initial value of \gamma for the optimization. Default is a zero vector. The length should equal 1 + ncol(X_1) + ncol(X_2) (intercept, unpenalized features, and expanded features).
weight: A vector of length two, used for logistic regression only. The first element is the weight for y = 1 and the second is the weight for y = 0.
group.weight: User-defined weights for the group penalty. Needs to be a vector whose length equals the number of groups.
feature.weight: User-defined weights for each predictor after expansion.
control: A list of parameters controlling algorithm convergence, e.g., maxit, mu, tol, and verbose (see the sketch after this list and the Examples).
modstr: A list specifying the tuning parameters, e.g., lambda and alpha (see the sketch after this list and the Examples).
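As a minimal, hedged sketch, the two lists might be populated as below. The component names (maxit, mu, tol, verbose, lambda, alpha) are taken from the Examples section; the commented interpretations are assumptions and should be checked against the package documentation, as are the default values used when the lists are left empty.
# Sketch only: component names follow the Examples below
control <- list(
  maxit   = 100,    # maximum number of iterations (assumed meaning)
  mu      = 1e-3,   # smoothing parameter used by the SPG algorithm (assumed meaning)
  tol     = 1e-5,   # convergence tolerance (assumed meaning)
  verbose = FALSE   # whether to print progress (assumed meaning)
)
modstr <- list(
  lambda = 1,       # overall penalty strength
  alpha  = 0.1      # tuning parameter for the generalized lasso penalty (see Details)
)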
Details
We adopt the warm start technique to speed up the calculation. The warm start is applied with a fixed value of \alpha and a descending sequence of \lambda (see the sketch at the end of this section).
The objective function for "ls" is
1/2 RSS + \lambda(\alpha P(\beta) + (1 - \alpha) P(\gamma)),
subject to \beta = A\gamma.
The objective function for "logit" is
-loglik + \lambda(\alpha P(\beta) + (1 - \alpha) P(\gamma)),
subject to \beta = A\gamma.
Note that, in this package, the input parameter "alpha" is the tuning parameter for the generalized lasso penalty.
Details for "penalty" option:
For penalty = "CL2"
, see details for the
"Child-l2" penalty in the main paper.
For penalty = "RFS-Sum"
, the theoretical optimal weights are used.
Please check the details in paper
"Rare feature selection in high dimensions".
Value
A list of model fitting results.
gammacoef: Estimates of \gamma.
groupnorm: Weighted norms for each group.
lambda.seq: Sequence of \lambda values used in the fit.
alpha.seq: Tuning parameter sequence for the generalized lasso penalty.
rmid: Column indices of all-zero features.
family: The family option used, "ls" or "logit".
cov.name: Names of the unpenalized features.
bin.name: Names of the binary features.
tree.object: Outputs from the tree-guided expansion.
References
Chen, J., Aseltine, R. H., Wang, F., & Chen, K. (2024).
Tree-Guided Rare Feature Selection and Logic Aggregation with
Electronic Health Records Data. Journal of the American Statistical Association 119(547), 1765–1777,
doi:10.1080/01621459.2024.2326621.
Chen, X., Q. Lin, S. Kim, J. G. Carbonell, and E. P. Xing (2012).
Smoothing proximal gradient method for general structured sparse regression.
The Annals of Applied Statistics 6(2), 719–752,
doi:10.1214/11-AOAS514.
Yan, X. and J. Bien (2021).
Rare feature selection in high dimensions.
Journal of the American Statistical Association 116(534), 887–900,
doi:10.1080/01621459.2020.1796677.
Examples
# Load the synthetic data
data(RegressionExample)
tree.org <- RegressionExample$tree.org # original tree structure
x2.org <- RegressionExample$x.org # original design matrix
y <- RegressionExample$y # response
# Do the tree-guided expansion
expand.data <- getetmat(tree.org, x2.org)
x2 <- expand.data$x.expand # expanded design matrix
tree.expand <- expand.data$tree.expand # expanded tree structure
# specify some model parameters
set.seed(100)
control <- list(maxit = 100, mu = 1e-3, tol = 1e-5, verbose = FALSE)
# fit model with a pair of lambda and alpha
modstr <- list(lambda = 1, alpha = 0.1)
x1 <- NULL
fit1 <- TSLA.fit(y, X_1 = x1, X_2 = x2, treemat = tree.expand,
                 family = 'ls', penalty = 'CL2',
                 gamma.init = NULL, weight = NULL,
                 group.weight = NULL, feature.weight = NULL,
                 control = control, modstr = modstr)
# get group norms from fit1
fit1$groupnorm
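# A few more illustrative accesses of the components documented under Value;
# the shapes of the returned objects depend on the lambda/alpha settings,
# so treat this as a sketch rather than expected output.
names(fit1)            # all components of the fitted object
fit1$lambda.seq        # sequence of lambda values used
fit1$alpha.seq         # tuning parameter(s) for the generalized lasso penalty
str(fit1$gammacoef)    # estimates of gamma
fit1$rmid              # column indices of all-zero features, if any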