stsc {hdflex} | R Documentation |
Signal-Transformed-Subset-Combination (STSC)
Description
stsc()
is a time series forecasting method designed to handle
vast sets of predictive signals, many of which may be irrelevant or
short-lived. This method transforms heterogeneous scalar-valued signals into
candidate density forecasts via time-varying coefficient models (TV-C),
and subsequently, combines them into an ultimate aggregate density forecast
via dynamic subset combinations (DSC).
Usage
stsc(
y,
X,
Ext_F,
init,
lambda_grid,
kappa_grid,
bias,
gamma_grid,
psi_grid,
delta,
burn_in,
burn_in_dsc,
metric,
equal_weight,
incl,
parallel = FALSE,
n_threads = parallel::detectCores() - 2,
portfolio_params = NULL
)
Arguments
y |
A matrix of dimension |
X |
A matrix with |
Ext_F |
A matrix with |
init |
An integer that denotes the number of observations used to initialize the observational variance and the coefficients' variance in the TV-C models. |
lambda_grid |
A numeric vector which takes values between 0 and 1
denoting the discount factor(s) that control the dynamics of the time-varying
coefficients. Each signal in combination with each value of
lambda provides a separate candidate forecast.
Constant coefficients are nested for the case |
kappa_grid |
A numeric vector which takes values between 0 and 1
to accommodate time-varying volatility in the TV-C models.
The observational variance is estimated via
Exponentially Weighted Moving Average and kappa denotes the underlying
decay factor. Constant variance is nested for the case |
bias |
A boolean to indicate whether the TV-C-models
allow for a bias correction to F-signals.
|
gamma_grid |
A numeric vector containing potential discount factors between 0 and 1 to exponentially down-weight the past predictive performance of the candidate forecasting models. The values of this tuning parameter are chosen in a procedure that amounts to leave-one-out cross-validation, taking into account the time series structure of the data. For details, see Adaemmer et al. (2023). |
psi_grid |
An integer vector that controls the (possible) sizes of the subsets. The values of this tuning parameter are chosen in a procedure that amounts to leave-one-out cross-validation, taking taking into account the time series structure of the data. For details, see Adaemmer et al. (2023). |
delta |
A numeric value between 0 and 1 denoting the discount factor applied to down-weight the past predictive performance of the aggregate predictive densities. |
burn_in |
An integer value |
burn_in_dsc |
An integer value |
metric |
An integer from the set |
equal_weight |
A boolean that denotes whether equal weights are used to
combine the candidate forecasts within a subset. If |
incl |
An optional integer vector that denotes signals that
must be included in the subset combinations. For example, |
parallel |
A boolean indicating whether the function should be parallelized. |
n_threads |
An integer that denotes the number of cores used
for parallelization. Only necessary if |
portfolio_params |
A numeric vector of length 3 containing the following elements:
This parameter is only required if |
Value
A list containing:
- Forecasts
A list containing:
- Realization
-
A vector with the actual values of the target variable.
- Point_Forecast
-
A vector with the first moments of the aggregate predictive densities of the STSC model.
- Variance_Prediction
-
A vector with the second moments of the aggregate predictive densities of the STSC model.
- Tuning_Parameters
A list containing:
- Gamma
-
A vector containing the selected values for the tuning parameter gamma.
- Psi
-
A vector containing the selected values for the tuning parameter psi.
- Signals
-
A matrix containing the selected signals.
- Lambda
-
A matrix containing the selected values for the tuning parameter lambda.
- Kappa
-
A matrix containing the selected values for the tuning parameter kappa.
- Model
A list containing:
- Lambda_grid
The grid of lambda values used in the model.
- Kappa_grid
The grid of kappa values used in the model.
- Gamma_grid
The grid of gamma values used in the model.
- Psi_grid
The grid of psi values used in the model.
- Delta
The delta value used in the model.
- Init
The init value used in the model.
- Burn_in
The burn-in period used in the model.
- Burn_in_dsc
The burn-in period used in the model.
- metric
The ranking metric used in the model.
- Equal_weight
A boolean indicating if equal weighting was used.
- Bias
A boolean indicating if bias correction was applied.
- Incl
Additional included parameters.
Author(s)
Philipp Adämmer, Sven Lehmann, Rainer Schüssler
References
Beckmann, J., Koop, G., Korobilis, D., and Schüssler, R. A. (2020) "Exchange rate predictability and dynamic bayesian learning." Journal of Applied Econometrics, 35 (4): 410–421.
Dangl, T. and Halling, M. (2012) "Predictive regressions with time-varying coefficients." Journal of Financial Economics, 106 (1): 157–181.
Del Negro, M., Hasegawa, R. B., and Schorfheide, F. (2016) "Dynamic prediction pools: An investigation of financial frictions and forecasting performance." Journal of Econometrics, 192 (2): 391–405.
Koop, G. and Korobilis, D. (2012) "Forecasting inflation using dynamic model averaging." International Economic Review, 53 (3): 867–886.
Koop, G. and Korobilis, D. (2023) "Bayesian dynamic variable selection in high dimensions." International Economic Review.
Raftery, A. E., Kárn'y, M., and Ettler, P. (2010) "Online prediction under model uncertainty via dynamic model averaging: Application to a cold rolling mill." Technometrics, 52 (1): 52–66.
West, M. and Harrison, J. (1997) "Bayesian forecasting and dynamic models" Springer, 2nd edn.
See Also
https://github.com/lehmasve/hdflex#readme
Examples
#########################################################
######### Forecasting quarterly U.S. inflation ##########
#### Please see Koop & Korobilis (2023) for further ####
#### details regarding the data & external forecasts ####
#########################################################
# Load Package
library("hdflex")
library("ggplot2")
library("cowplot")
########## Get Data ##########
# Load Package Data
inflation_data <- inflation_data
# Set Target Variable
y <- inflation_data[, 1]
# Set 'P-Signals'
X <- inflation_data[, 2:442]
# Set 'F-Signals'
Ext_F <- inflation_data[, 443:462]
# Get Dates and Number of Observations
tdates <- rownames(inflation_data)
tlength <- length(tdates)
# First complete observation (no missing values)
first_complete <- which(complete.cases(inflation_data))[1]
########## Rolling AR2-Benchmark ##########
# Set up matrix for predictions
benchmark <- matrix(NA, nrow = tlength,
ncol = 1, dimnames = list(tdates, "AR2"))
# Set Window-Size (15 years of quarterly data)
window_size <- 15 * 4
# Time Sequence
t_seq <- seq(window_size, tlength - 1)
# Loop with rolling window
for (t in t_seq) {
# Split Data for Training Train Data
x_train <- cbind(int = 1, X[(t - window_size + 1):t, 1:2])
y_train <- y[(t - window_size + 1):t]
# Split Data for Prediction
x_pred <- cbind(int = 1, X[t + 1, 1:2, drop = FALSE])
# Fit AR-Model
model_ar <- .lm.fit(x_train, y_train)
# Predict and store in benchmark matrix
benchmark[t + 1, ] <- x_pred %*% model_ar$coefficients
}
########## STSC ##########
# Set TV-C-Parameter
init <- 5 * 4
lambda_grid <- c(0.90, 0.95, 1.00)
kappa_grid <- c(0.94, 0.96, 0.98)
bias <- TRUE
# Set DSC-Parameter
gamma_grid <- c(0.40, 0.50, 0.60, 0.70, 0.80, 0.90,
0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99, 1.00)
n_tvc <- (ncol(X) + ncol(Ext_F)) * length(lambda_grid) * length(kappa_grid)
psi_grid <- c(1:100, sapply(1:4, function(i) floor(i * n_tvc / 4)))
delta <- 0.95
burn_in <- first_complete + init / 2
burn_in_dsc <- 1
metric <- 5
equal_weight <- TRUE
incl <- NULL
parallel <- FALSE
n_threads <- NULL
# Apply STSC-Function
results <- hdflex::stsc(y,
X,
Ext_F,
init,
lambda_grid,
kappa_grid,
bias,
gamma_grid,
psi_grid,
delta,
burn_in,
burn_in_dsc,
metric,
equal_weight,
incl,
parallel,
n_threads,
NULL)
########## Evaluation ##########
# Define Evaluation Period (OOS-Period)
eval_period <- which(tdates >= "1991-04-01" & tdates <= "2021-12-01")
# Get Evaluation Summary for STSC
eval_results <- summary(obj = results, eval_period = eval_period)
# Calculate (Mean-)Squared-Errors for AR2-Benchmark
se_ar2 <- (y[eval_period] - benchmark[eval_period, 1])^2
mse_ar2 <- mean(se_ar2)
# Create CSSED-Plot
cssed <- cumsum(se_ar2 - eval_results$MSE[[2]])
plot_cssed <- ggplot(
data = data.frame(eval_period, cssed),
aes(x = eval_period, y = cssed)
) +
geom_line() +
ylim(-0.0008, 0.0008) +
ggtitle("Cumulative Squared Error Differences") +
xlab("Time Index") +
ylab("CSSED") +
geom_hline(yintercept = 0, linetype = "dashed", color = "darkgray") +
theme_minimal(base_size = 15) +
theme(
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_rect(colour = "black", fill = NA),
axis.ticks = element_line(colour = "black"),
plot.title = element_text(hjust = 0.5)
)
# Show Plots
options(repr.plot.width = 15, repr.plot.height = 15)
plots_list <- eval_results$Plots
plots_list <- c(list(plot_cssed), plots_list)
cowplot::plot_grid(plotlist = plots_list,
ncol = 2,
nrow = 3,
align = "hv")
# Relative MSE
print(paste("Relative MSE:", round(eval_results$MSE[[1]] / mse_ar2, 4)))