spar.cv {spareg} | R Documentation |
Sparse Projected Averaged Regression
Description
Apply Sparse Projected Averaged Regression (SPAR) to high-dimensional data, where the number of models and the threshold parameter are chosen using a cross-validation procedure.
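The role of the two tuning parameters can be illustrated in base R. This is a minimal sketch, assuming that coefficients with absolute value below the threshold ν are set to zero in each marginal model before the models are averaged; threshold_average() and the toy matrix are hypothetical and not part of spareg.

```r
# Hypothetical helper, not part of spareg: threshold the standardized
# coefficients of the first `nummod` marginal models at `nu`, then average.
threshold_average <- function(betas, nu, nummod) {
  stopifnot(nummod >= 1, nummod <= ncol(betas), nu >= 0)
  b <- betas[, seq_len(nummod), drop = FALSE]
  b[abs(b) < nu] <- 0   # assumption: entries below the threshold are zeroed
  rowMeans(b)           # average across the marginal models
}

# Toy 3 x 4 coefficient matrix (rows: predictors, columns: marginal models).
betas <- matrix(c(0.5,  0.00, -0.1,
                  0.7,  0.20,  0.0,
                  0.6,  0.00,  0.3,
                  0.4, -0.05,  0.0), nrow = 3)

# Evaluate every (nu, nummod) pair on a small grid and count active variables.
grid <- expand.grid(nu = c(0, 0.25), nummod = c(2, 4))
grid$n_active <- mapply(
  function(nu, nummod) sum(threshold_average(betas, nu, nummod) != 0),
  grid$nu, grid$nummod
)
print(grid)
```

A cross-validation procedure would additionally compute a validation measure for each row of such a grid on held-out folds and pick the pair with the best value.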
Usage
spar.cv(
  x,
  y,
  family = gaussian("identity"),
  model = spar_glmnet(),
  rp = NULL,
  screencoef = NULL,
  nfolds = 10,
  nnu = 20,
  nus = NULL,
  nummods = c(20),
  measure = c("deviance", "mse", "mae", "class", "1-auc"),
  parallel = FALSE,
  seed = NULL,
  set.seed.iteration = FALSE,
  ...
)
spareg.cv(
  x,
  y,
  family = gaussian("identity"),
  model = spar_glmnet(),
  rp = NULL,
  screencoef = NULL,
  nfolds = 10,
  nnu = 20,
  nus = NULL,
  nummods = c(20),
  measure = c("deviance", "mse", "mae", "class", "1-auc"),
  parallel = FALSE,
  seed = NULL,
  set.seed.iteration = FALSE,
  ...
)
Arguments
x |
n x p numeric matrix of predictor variables. |
y |
quantitative response vector of length n. |
family |
a 'family' object for the marginal models; defaults to gaussian("identity"). |
model |
function creating a model object used to fit the marginal models; defaults to spar_glmnet(). |
rp |
function creating a 'randomprojection' object; defaults to NULL. |
screencoef |
function creating a 'screeningcoef' object; defaults to NULL. |
nfolds |
number of folds to use for cross-validation; should be at least 2, defaults to 10. |
nnu |
number of different threshold values ν to consider for thresholding; defaults to 20. |
nus |
optional vector of ν's to consider for thresholding; defaults to NULL. |
nummods |
vector of numbers of marginal models to consider for validation; defaults to c(20). |
measure |
loss to use for validation; defaults to "deviance". |
parallel |
a logical indicating whether the estimation of the marginal models should be parallelized; this assumes that a parallel backend is loaded and available. Defaults to FALSE. |
seed |
integer seed to be set at the beginning of the SPAR algorithm. Defaults to NULL, in which case no seed is set. |
set.seed.iteration |
a boolean indicating whether a different seed should be set in each marginal model. Defaults to FALSE. |
... |
further arguments, mainly to ensure backward compatibility. |
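How nfolds and seed interact can be sketched in base R. The helper below is hypothetical and only illustrates a common way of splitting n observations into folds reproducibly; it is not taken from spareg, whose internal fold assignment may differ.

```r
# Hypothetical helper, not part of spareg: assign each of n observations
# to one of nfolds roughly equal-sized folds.
make_fold_ids <- function(n, nfolds = 10, seed = NULL) {
  stopifnot(nfolds >= 2, nfolds <= n)
  if (!is.null(seed)) set.seed(seed)  # the RNG is only touched when a seed is given
  sample(rep(seq_len(nfolds), length.out = n))
}

ids_a <- make_fold_ids(n = 25, nfolds = 10, seed = 123)
ids_b <- make_fold_ids(n = 25, nfolds = 10, seed = 123)

identical(ids_a, ids_b)  # same seed, same split
table(ids_a)             # fold sizes differ by at most one
```

Leaving seed at NULL in this sketch keeps the caller's random number stream untouched, which mirrors the documented behaviour that no seed is set in that case.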
Value
object of class 'spar.cv' with elements
- betas: p x max(nummods) sparse matrix of class 'Matrix::dgCMatrix' containing the standardized coefficients from each marginal model
- intercepts: vector of length max(nummods) containing the intercepts used in each marginal model
- scr_coef: p-vector of coefficients used for screening for standardized predictors
- inds: list of length max(nummods) of index vectors corresponding to the variables kept after screening in each marginal model
- RPMs: list of length max(nummods) of projection matrices used in each marginal model
- val_sum: a data.frame with CV results (mean and sd of the validation measure and mean number of active variables) for each element of nus and nummods
- nus: vector of ν's considered for thresholding
- nummods: vector of numbers of marginal models considered for validation
- ycenter: empirical mean of the initial response vector
- yscale: empirical standard deviation of the initial response vector
- xcenter: p-vector of empirical means of the initial predictor variables
- xscale: p-vector of empirical standard deviations of the initial predictor variables
- rp: an object of class 'randomprojection'
- screencoef: an object of class 'screeningcoef'
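The centering and scaling elements make it possible to relate standardized coefficients to the original data scale. The base R sketch below assumes that both predictors and response were standardized with these quantities, and uses ordinary least squares as a stand-in for the SPAR fit; whether spareg applies exactly this convention for every family is an assumption.

```r
set.seed(1)
n <- 50; p <- 3
x <- matrix(rnorm(n * p, mean = 5, sd = 2), n, p)
y <- drop(1 + x %*% c(2, -1, 0.5) + rnorm(n))

# Centering and scaling quantities, named after the returned elements.
xcenter <- colMeans(x); xscale <- apply(x, 2, sd)
ycenter <- mean(y);     yscale <- sd(y)

# Fit on the standardized scale (least squares stands in for SPAR here).
z  <- scale(x, center = xcenter, scale = xscale)
ys <- (y - ycenter) / yscale
fit_std <- lm(ys ~ z)
a0 <- unname(coef(fit_std)[1])
a  <- unname(coef(fit_std)[-1])

# Map the standardized coefficients back to the original scale.
beta_orig      <- yscale * a / xscale
intercept_orig <- ycenter + yscale * a0 - sum(xcenter * beta_orig)

# The result agrees with a least-squares fit on the raw data.
all.equal(c(intercept_orig, beta_orig), unname(coef(lm(y ~ x))))
```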
See Also
spar, coef.spar.cv, predict.spar.cv, plot.spar.cv, print.spar.cv
Examples
# Simulate a data set with many more predictors than observations
example_data <- simulate_spareg_data(n = 200, p = 2000, ntest = 100)
# Cross-validate over several numbers of marginal models
spar_res <- spar.cv(example_data$x, example_data$y,
                    nummods = c(5, 10, 15, 20, 25, 30))
spar_res
# Coefficients and predictions from the cross-validated fit
coefs <- coef(spar_res)
pred <- predict(spar_res, example_data$x)
# Validation measure and number of active variables along nummod and nu
plot(spar_res)
plot(spar_res, plot_type = "Val_Meas", plot_along = "nummod", nu = 0)
plot(spar_res, plot_type = "Val_Meas", plot_along = "nu", nummod = 10)
plot(spar_res, plot_type = "Val_numAct", plot_along = "nummod", nu = 0)
plot(spar_res, plot_type = "Val_numAct", plot_along = "nu", nummod = 10)
# Residuals versus fitted values on the test data, and a coefficient plot
plot(spar_res, plot_type = "res-vs-fitted", xfit = example_data$xtest,
     yfit = example_data$ytest, opt_par = "1se")
plot(spar_res, "coefs", prange = c(1, 400))
# spareg.cv() has the same signature as spar.cv()
spar_res <- spareg.cv(example_data$x, example_data$y,
                      nummods = c(5, 10, 15, 20, 25, 30))
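The opt_par = "1se" option used above refers to a one-standard-error style choice. The base R sketch below illustrates one common form of that rule on a made-up summary table: among settings whose mean validation measure is within one standard deviation of the best, take the one with the fewest active variables. The column names and values are hypothetical, and the exact rule spareg applies is an assumption here.

```r
# Hypothetical CV summary; column names and numbers are illustrative only.
val_sum <- data.frame(
  nu        = c(0, 0.1, 0.2, 0, 0.1, 0.2),
  nummod    = c(10, 10, 10, 20, 20, 20),
  mean_meas = c(1.00, 0.95, 1.10, 0.98, 0.90, 1.05),
  sd_meas   = c(0.08, 0.07, 0.09, 0.06, 0.06, 0.08),
  mean_act  = c(400, 120, 40, 600, 200, 60)
)

# Setting with the smallest mean validation measure.
best <- val_sum[which.min(val_sum$mean_meas), ]

# One-standard-error rule: among settings within one sd of the best,
# take the one with the fewest active variables.
cutoff     <- best$mean_meas + best$sd_meas
candidates <- val_sum[val_sum$mean_meas <= cutoff, ]
one_se     <- candidates[which.min(candidates$mean_act), ]

best[, c("nu", "nummod")]
one_se[, c("nu", "nummod")]
```

In this toy table the best setting uses 20 marginal models, while the one-standard-error choice moves to a sparser setting with 10 models at a small cost in the validation measure.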