ensemble_model_spec {modeltime.ensemble}    R Documentation
Creates a Stacked Ensemble Model from a Model Spec
Description
A 2-stage stacking regressor that follows:

Stage 1: Sub-models are trained and predicted using modeltime.resample::modeltime_fit_resamples().

Stage 2: A meta-learner (model_spec) is trained on the out-of-sample sub-model predictions using ensemble_model_spec().
Usage
ensemble_model_spec(
  object,
  model_spec,
  kfolds = 5,
  param_info = NULL,
  grid = 6,
  control = control_grid()
)
Arguments
object
    A Modeltime Table. Used for the ensemble sub-models.

model_spec
    A model_spec object defining the meta-learner. Can be either (see the sketch after these arguments):
    - A non-tunable model_spec: all parameters are specified, and no tuning is performed.
    - A tunable model_spec: parameters flagged with tune::tune() are optimized via K-Fold Cross Validation (see Details).

kfolds
    K-Fold Cross Validation for tuning the meta-learner.
    Controls the number of folds used in the meta-learner's cross-validation.
    Gets passed to rsample::vfold_cv().

param_info
    A dials::parameters() object or NULL. If none is given, a parameter set is derived from other arguments.

grid
    Grid specification or grid size for tuning the meta-learner.
    Gets passed to tune::tune_grid().

control
    An object used to modify the tuning process.
    Uses tune::control_grid() by default.
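For reference, the two accepted forms of model_spec might look like this (a minimal sketch using parsnip's linear_reg(), mirroring the Examples below):

library(parsnip)
library(tune)

# Non-tunable meta-learner: all parameters fixed, no tuning is performed
meta_lm <- linear_reg() %>%
    set_engine("lm")

# Tunable meta-learner: tune() placeholders trigger K-Fold grid tuning
meta_glmnet <- linear_reg(
    penalty = tune(),
    mixture = tune()
) %>%
    set_engine("glmnet")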
Details
Stacked Ensemble Process

Start with a Modeltime Table to define your sub-models.

Step 1: Use modeltime.resample::modeltime_fit_resamples() to perform the sub-model resampling procedure.

Step 2: Use ensemble_model_spec() to define and train the meta-learner.
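In code, the two steps chain together as in this condensed sketch (my_modeltime_table and my_resamples are hypothetical placeholders; the Examples section below gives a complete, runnable version):

library(tidymodels)
library(modeltime.ensemble)   # assumes modeltime and modeltime.resample are also available

ensemble_fit <- my_modeltime_table %>%                      # Modeltime Table of sub-models
    modeltime_fit_resamples(resamples = my_resamples) %>%   # Step 1: resample predictions
    ensemble_model_spec(                                    # Step 2: train the meta-learner
        model_spec = linear_reg() %>% set_engine("lm")
    )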
What goes on inside the Meta-Learner?

The meta-learner ensembling process uses the following basic steps:

1. Make Cross-Validation Predictions. Cross-validation predictions are made for each sub-model with modeltime.resample::modeltime_fit_resamples(). The out-of-sample sub-model predictions contained in .resample_results are used as the input to the meta-learner.

2. Train a Stacked Regressor (Meta-Learner). The sub-model out-of-sample cross-validation predictions are then modeled using a model_spec with two options (a conceptual sketch of the tuning path follows this list):

   - Tuning: If the model_spec includes tuning parameters via tune::tune(), the meta-learner is hyperparameter-tuned using K-Fold Cross Validation. The parameters and grid can be adjusted using kfolds, grid, and param_info.

   - No-Tuning: If the model_spec does not include tuning parameters via tune::tune(), the meta-learner is not hyperparameter-tuned, and the model is simply fitted to the sub-model predictions.

3. Final Model Selection.

   - If tuned, the final model is selected based on RMSE, then retrained on the full set of out-of-sample predictions.

   - If not tuned, the fitted model from Step 2 is used.
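Conceptually, the tuning path is equivalent to a standard tune grid search run on the table of out-of-sample sub-model predictions (a sketch only, not the package's internal code; submodel_prediction_tbl and target are hypothetical names):

library(rsample)
library(tune)
library(parsnip)

# submodel_prediction_tbl: hypothetical table with one column of out-of-sample
# predictions per sub-model, plus the actual target values
folds <- vfold_cv(submodel_prediction_tbl, v = 5)    # kfolds = 5

tuned <- tune_grid(
    meta_glmnet,             # a tunable model_spec (see the sketch under Arguments)
    target ~ .,              # target regressed on the sub-model predictions
    resamples = folds,
    grid      = 6,           # grid = 6
    control   = control_grid()
)

best_params <- select_best(tuned, metric = "rmse")   # final model chosen by RMSE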
Progress
The best way to follow the training process and watch progress is to use control = control_grid(verbose = TRUE).
Parallelize
Portions of the process can be parallelized. To parallelize, set up parallelization using tune via one of the backends such as doFuture. Then set control = control_grid(allow_par = TRUE), as in the sketch below.
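A typical setup with the doFuture backend might look like this (a sketch; the backend choice and worker count are assumptions, not requirements):

library(doFuture)            # attaches future and foreach
registerDoFuture()           # register the foreach adapter that tune uses
plan(multisession, workers = 2)

ensemble_fit <- submodel_predictions %>%
    ensemble_model_spec(
        model_spec = meta_glmnet,   # tunable meta-learner from the sketch above
        control    = control_grid(allow_par = TRUE)
    )

plan(sequential)             # release the workers when finished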
Value
A mdl_time_ensemble object.
Examples
library(tidymodels)
library(modeltime)
library(modeltime.ensemble)
library(dplyr)
library(timetk)
library(glmnet)

# Step 1: Make resample predictions for submodels ----
resamples_tscv <- training(m750_splits) %>%
    time_series_cv(
        assess      = "2 years",
        initial     = "5 years",
        skip        = "2 years",
        slice_limit = 1
    )

submodel_predictions <- m750_models %>%
    modeltime_fit_resamples(
        resamples = resamples_tscv,
        control   = control_resamples(verbose = TRUE)
    )

# Step 2: Metalearner ----

# * No Metalearner Tuning ----
ensemble_fit_lm <- submodel_predictions %>%
    ensemble_model_spec(
        model_spec = linear_reg() %>% set_engine("lm"),
        control    = control_grid(verbose = TRUE)
    )

ensemble_fit_lm

# * With Metalearner Tuning ----
ensemble_fit_glmnet <- submodel_predictions %>%
    ensemble_model_spec(
        model_spec = linear_reg(
            penalty = tune(),
            mixture = tune()
        ) %>%
            set_engine("glmnet"),
        grid    = 2,
        control = control_grid(verbose = TRUE)
    )

ensemble_fit_glmnet