data.frame2glmnet.matrix {easy.glmnet} | R Documentation |
Convert a data.frame into a matrix ready for glmnet
Description
Function to convert categorical variables into dummy variables ready for glmnet_fit
and glmnet_predict
. Additionally, it also removes constant columns.
Usage
data.frame2glmnet.matrix_fit(x)
data.frame2glmnet.matrix(m, x)
Arguments
m |
model to conduct the conversion, obtained with |
x |
data.frame to be converted. |
Details
Note that the returned matrix might differ from the design matrix of a linear model because for categoric variables with more than two levels, it creates as many dummy variables as levels (which is ok for lasso).
Value
A matrix ready for glmnet_fit
and glmnet_predict
.
Author(s)
Joaquim Radua and Aleix Solanes
See Also
glmnet_predict
for obtaining predictions,
cv
for conducting a cross-validation.
Examples
# Create random x (predictors) and y (binary)
x = cbind(
as.data.frame(matrix(rnorm(10000), ncol = 20)),
matrix(sample(letters, 2500, TRUE), ncol = 5)
)
y = 1 * (plogis(apply(x[,1:5], 1, sum) + rnorm(500, 0, 0.1)) > 0.5)
# Predict y via cross-validation, including conversion to matrix
fit_fun = function (x_training, y_training) {
m = list(
matrix = data.frame2glmnet.matrix_fit(x_training)
)
x_mat = data.frame2glmnet.matrix(m$matrix, x_training)
m$lasso = glmnet_fit(x_mat, y_training, family = "binomial")
m
}
predict_fun = function (m, x_test) {
x_mat = data.frame2glmnet.matrix(m$matrix, x_test)
glmnet_predict(m$lasso, x_mat)
}
# Only 2 folds to ensure the example runs quickly
res = cv(x, y, family = "binomial", fit_fun = fit_fun, predict_fun = predict_fun, nfolds = 2)
# Show accuracy
se = mean(res$predictions$y.pred[res$predictions$y == 1] > 0.5)
sp = mean(res$predictions$y.pred[res$predictions$y == 0] < 0.5)
bac = (se + sp) / 2
cat("Sensitivity:", round(se, 2), "\n")
cat("Specificity:", round(sp, 2), "\n")
cat("Balanced accuracy:", round(bac, 2), "\n")
[Package easy.glmnet version 1.0 Index]