pinterval_bootstrap {pintervals}    R Documentation

Bootstrap prediction intervals

Description

This function computes bootstrapped prediction intervals with a confidence level of 1 - alpha for a vector of (continuous) predicted values by bootstrapping prediction errors. The errors to bootstrap from are either computed from a calibration set of predicted and true values, or supplied directly as pre-computed prediction errors from a calibration dataset or other data on which the model was not trained (e.g. OOB errors from a model using bagging). The function returns a tibble containing the predicted values along with the lower and upper bounds of the prediction intervals.
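
The general idea can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name sketch_bootstrap_interval, the error definition (truth minus prediction), the equal-tailed quantiles, and the output column names are all assumptions made for illustration.

```r
# Hypothetical helper, not part of pintervals: illustrates the general idea only.
sketch_bootstrap_interval <- function(pred, calib, calib_truth,
                                      alpha = 0.1, n_bootstraps = 1000) {
  # Assumption: errors are defined as truth minus prediction.
  errors <- calib_truth - calib

  # For each prediction, resample errors with replacement, add them to the
  # prediction, and take equal-tailed quantiles of the simulated values.
  bounds <- vapply(unname(pred), function(p) {
    sims <- p + sample(errors, size = n_bootstraps, replace = TRUE)
    unname(quantile(sims, probs = c(alpha / 2, 1 - alpha / 2)))
  }, numeric(2))

  # Column names here are illustrative, not necessarily those of the package.
  data.frame(pred = unname(pred),
             lower_bound = bounds[1, ],
             upper_bound = bounds[2, ])
}

set.seed(1)
res <- sketch_bootstrap_interval(
  pred = c(1, 2, 3),
  calib = c(1, 2, 3, 4),
  calib_truth = c(1.1, 1.9, 3.2, 3.8)
)
res
```

Because the same pool of errors is resampled for every prediction, intervals from a sketch like this have roughly the same width everywhere; they do not adapt to regions where the model is more or less accurate.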

Usage

pinterval_bootstrap(
  pred,
  calib = NULL,
  calib_truth = NULL,
  error = NULL,
  error_type = c("raw", "absolute"),
  alpha = 0.1,
  n_bootstraps = 1000,
  lower_bound = NULL,
  upper_bound = NULL
)

Arguments

pred

Vector of predicted values

calib

A numeric vector of predicted values in the calibration partition, or a two-column tibble or matrix whose first column holds the predicted values and whose second column holds the true values

calib_truth

A numeric vector of true values in the calibration partition. Only required if calib is a numeric vector

error

An optional numeric vector of pre-computed prediction errors from a calibration partition or other test data. If provided, calib will be ignored

error_type

The type of error to use for the prediction intervals. Can be 'raw' or 'absolute'. If 'raw', bootstrapping will be done on the raw prediction errors. If 'absolute', bootstrapping will be done on the absolute prediction errors with random signs. Default is 'raw'

alpha

The significance level (miscoverage rate) for the prediction intervals, which have a confidence level of 1 - alpha. Must be a single numeric value between 0 and 1. Default is 0.1

n_bootstraps

The number of bootstraps to perform. Default is 1000

lower_bound

Optional minimum value for the prediction intervals. If not provided, the minimum (true) value of the calibration partition will be used

upper_bound

Optional maximum value for the prediction intervals. If not provided, the maximum (true) value of the calibration partition will be used
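
The difference between the two error_type settings, and the effect of the bounds, can be sketched in base R. The error values below are made up, and the clamp helper is hypothetical; how the package applies the bounds internally is an assumption here.

```r
set.seed(42)
errors <- c(-0.5, 0.2, 1.5, -0.1, 0.8)  # made-up prediction errors
n_draws <- 1000

# 'raw': resample the signed errors directly, preserving any asymmetry.
raw_draws <- sample(errors, size = n_draws, replace = TRUE)

# 'absolute': resample the error magnitudes, then attach a random sign to
# each draw, which makes the resampled errors symmetric around zero.
abs_draws <- sample(abs(errors), size = n_draws, replace = TRUE) *
  sample(c(-1, 1), size = n_draws, replace = TRUE)

# Hypothetical helper: restrict interval bounds to [lower, upper].
clamp <- function(x, lower, upper) pmin(pmax(x, lower), upper)

clamped <- clamp(c(-1, 0.5, 3), lower = 0, upper = 2)
clamped
```

With skewed errors such as these, 'raw' draws keep the heavier positive tail, while 'absolute' draws spread it evenly across both sides of the prediction.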

Value

A tibble with the predicted values, lower bounds, and upper bounds of the prediction intervals

Examples


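library(dplyr)
library(tibble)
library(pintervals)

# Simulate a log-normal outcome driven by two uniform predictors
set.seed(1)
x1 <- runif(1000)
x2 <- runif(1000)
y <- rlnorm(1000, meanlog = x1 + x2, sdlog = 0.5)
df <- tibble(x1, x2, y)

# Split into training, calibration, and test partitions
df_train <- df %>% slice(1:500)
df_cal <- df %>% slice(501:750)
df_test <- df %>% slice(751:1000)

# Fit on the log scale, then back-transform predictions to the original scale
mod <- lm(log(y) ~ x1 + x2, data = df_train)
calib <- exp(predict(mod, newdata = df_cal))
calib_truth <- df_cal$y
pred_test <- exp(predict(mod, newdata = df_test))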

pinterval_bootstrap(
  pred = pred_test,
  calib = calib,
  calib_truth = calib_truth,
  error_type = 'raw',
  alpha = 0.1,
  lower_bound = 0
)

[Package pintervals version 0.7.7 Index]