winsorize {staccuracy} | R Documentation |
Winsorize a numeric vector
Description
Winsorization means truncating the extremes of a numeric range by replacing extreme values with a predetermined minimum and maximum. winsorize() returns the input vector with every value below the provided minimum replaced by that minimum and every value above the provided maximum replaced by that maximum.
win_mae() and win_rmse() return the MAE and RMSE, respectively, computed on winsorized predictions. The fundamental idea underlying the winsorization of predictions is that if the actual data has well-defined bounds, then models should not be penalized for being overzealous in predicting beyond the extremes of the data. Models that are overzealous at the boundaries might sometimes be superior within normal ranges, and their out-of-bounds predictions can be easily corrected by winsorization.
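The replacement rule described above can be sketched in base R with pmax() and pmin(). This is only an illustration: winsorize_sketch() is a hypothetical name, the package's own implementation may differ, and the sketch assumes win_range is given as c(minimum, maximum).

```r
# Hypothetical helper for illustration; not the staccuracy implementation.
# Assumes win_range is ordered as c(minimum, maximum).
winsorize_sketch <- function(x, win_range) {
  stopifnot(is.numeric(x), is.numeric(win_range), length(win_range) == 2)
  # pmax() raises values below the minimum; pmin() lowers values above the maximum.
  pmin(pmax(x, win_range[1]), win_range[2])
}

x <- c(3, 5, 2, 7, 9, 4, 6, 8, 2, 10)
winsorize_sketch(x, c(2, 8))
# 3 5 2 7 8 4 6 8 2 8
```

Values already inside the range are left unchanged; only 9 and 10 are pulled down to 8.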
Usage
winsorize(x, win_range)
win_mae(actual, pred, win_range = range(actual), na.rm = FALSE)
win_rmse(actual, pred, win_range = range(actual), na.rm = FALSE)
Arguments
x |
numeric vector. |
win_range |
numeric(2). The minimum and maximum allowable values for the vector being winsorized (x in winsorize(), pred in win_mae() and win_rmse()). |
actual |
numeric vector. Actual (true) values of target outcome data. |
pred |
numeric vector. Predictions corresponding to each respective element in actual. |
na.rm |
logical(1). |
Value
winsorize() returns a winsorized vector.
win_mae() returns the mean absolute error (MAE) of winsorized predicted values pred compared to the actual values. See mae() for details.
win_rmse() returns the root mean squared error (RMSE) of winsorized predicted values pred compared to the actual values. See rmse() for details.
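As a rough illustration of these return values, the following base R sketch clamps the predictions to win_range and then applies the usual MAE and RMSE formulas. The names clamp(), win_mae_sketch() and win_rmse_sketch() are hypothetical, the na.rm argument is left out, and the package functions may be implemented differently.

```r
# Hypothetical helpers for illustration; not the staccuracy implementation.
# Assumes win_range is ordered as c(minimum, maximum); na.rm handling is omitted.
clamp <- function(x, win_range) pmin(pmax(x, win_range[1]), win_range[2])

win_mae_sketch <- function(actual, pred, win_range = range(actual)) {
  # Mean absolute error after winsorizing the predictions
  mean(abs(actual - clamp(pred, win_range)))
}

win_rmse_sketch <- function(actual, pred, win_range = range(actual)) {
  # Root mean squared error after winsorizing the predictions
  sqrt(mean((actual - clamp(pred, win_range))^2))
}

actual <- c(3, 5, 2, 7, 9, 4, 6, 8, 2, 10)
pred <- c(2.5, 5.5, 1.5, 6.5, 10.5, 3.5, 6, 7.5, 0.5, 11.5)

mean(abs(actual - pred))        # plain MAE: 0.75
win_mae_sketch(actual, pred)    # winsorized MAE: 0.35
sqrt(mean((actual - pred)^2))   # plain RMSE: about 0.908
win_rmse_sketch(actual, pred)   # winsorized RMSE: about 0.474
```

With the default win_range = range(actual), the four predictions that fall outside [2, 10] are pulled back to the nearest bound before the error is computed, which is why both winsorized metrics are smaller here.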
Examples
library(staccuracy) # provides winsorize(), win_mae(), win_rmse(), mae() and rmse()
a <- c(3, 5, 2, 7, 9, 4, 6, 8, 2, 10)
p <- c(2.5, 5.5, 1.5, 6.5, 10.5, 3.5, 6, 7.5, 0.5, 11.5)
a # the original data
winsorize(a, c(2, 8)) # a winsorized on defined boundaries
# range of the original data
a
range(a)
# some overzealous predictions
p
range(p)
# MAE penalizes overzealous predictions
mae(a, p)
# Winsorized MAE forgives overzealous predictions
win_mae(a, p)
# RMSE penalizes overzealous predictions
rmse(a, p)
# Winsorized RMSE forgives overzealous predictions
win_rmse(a, p)