reg_aucroc {staccuracy} | R Documentation |
Area under the ROC curve for regression target outcomes
Description
Area under the ROC curve (AUCROC) is a classification measure. By dichotomizing the range of actual values, reg_aucroc() turns regression evaluation into classification evaluation for any regression model. Note that the model that generates the predictions is assumed to be a regression model; however, any numeric inputs are allowed for the pred argument, so there is no check for the nature of the source model.
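The dichotomization idea can be sketched in base R. This is a minimal illustration and not the package's implementation: rank_auc() is a hypothetical helper (the rank-based, Mann-Whitney form of AUC), the linear model stands in for any regression model, and labelling values at or above each cut point as the positive class is an assumption.

```r
# Hypothetical helper (not part of staccuracy): rank-based AUC,
# i.e. the Mann-Whitney U statistic rescaled to [0, 1].
rank_auc <- function(label, score) {
  n_pos <- sum(label)
  n_neg <- sum(!label)
  r <- rank(score)  # average ranks, so tied scores count as half
  (sum(r[label]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Any regression model will do; a linear model keeps the sketch in base R
aq <- na.omit(airquality)
fit <- lm(Ozone ~ Solar.R + Wind + Temp, data = aq)
pred <- predict(fit)

# Dichotomize the actual values at a few quantiles (assumed: >= cut is TRUE)
# and compute one AUC per cut point from the same regression predictions
cut_points <- quantile(aq$Ozone, probs = c(0.25, 0.5, 0.75))
auc_by_cut <- vapply(
  cut_points,
  function(cp) rank_auc(aq$Ozone >= cp, pred),
  numeric(1)
)
auc_by_cut
```

Each element of auc_by_cut answers a classification question ("is the actual value above this cut point?") using predictions that came from a regression model.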
Usage
reg_aucroc(
  actual,
  pred,
  num_quants = 100,
  ...,
  cuts = NULL,
  imbalance = 0.05,
  na.rm = FALSE,
  sample_size = 10000,
  seed = 0
)
Arguments
actual |
numeric vector. Actual label values from a dataset. They must be numeric. |
pred |
numeric vector. Predictions corresponding to each respective element in |
num_quants |
scalar positive integer. If |
... |
Not used. Forces explicit naming of the arguments that follow. |
cuts |
numeric vector. If |
imbalance |
numeric(1) in (0, 0.5]. The result element |
na.rm |
See documentation for |
sample_size |
See documentation for |
seed |
See documentation for |
Details
The ROC data and AUCROC values are calculated with aucroc().
Value
List with the following elements:
- rocs: List of results for aucroc() for each dichotomized segment of actual.
- auc: named numeric vector of AUC extracted from each element of rocs. Named by the percentile that the AUC represents.
- mean_auc: named numeric(3). The average AUC over the low, middle, and high quantiles of dichotomization:
  - lo: average AUC with imbalance% (e.g., 5%) or less of the actual target values;
  - mid: average AUC in between lo and hi;
  - hi: average AUC with (1 - imbalance)% (e.g., 95%) or more of the actual target values.
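The lo, mid, and hi averaging described above might look roughly like the following. This is only a sketch under stated assumptions: the AUC values are made-up illustrative numbers, percentiles are represented as the integers 1 to 99, and whether the band boundaries are inclusive is a guess rather than something the documentation confirms.

```r
# Illustrative (made-up) AUC values, one per percentile cut from 1 to 99
pct <- 1:99
auc <- setNames(0.9 - 0.002 * abs(pct - 50), paste0(pct, "%"))

# imbalance = 0.05 expressed in percent; boundary inclusion is an assumption
imbalance_pct <- 5
is_lo <- pct <= imbalance_pct
is_hi <- pct >= 100 - imbalance_pct
is_mid <- !is_lo & !is_hi

# Average the per-percentile AUC values within each band
mean_auc <- c(
  lo = mean(auc[is_lo]),
  mid = mean(auc[is_mid]),
  hi = mean(auc[is_hi])
)
mean_auc
```

Reporting the three bands separately keeps the heavily imbalanced splits at either extreme from being blended into the performance on the more balanced middle splits.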
Examples
# Remove rows with missing values from airquality dataset
airq <- airquality |>
  na.omit()
# Create binary version where the target variable 'Ozone' is dichotomized based on its median
airq_bin <- airq
airq_bin$Ozone <- airq_bin$Ozone >= median(airq_bin$Ozone)
# Create a generic regression model; use autogam
req_aq <- autogam::autogam(airq, 'Ozone', family = gaussian())
req_aq$perf$sa_wmae_mad # Standardized accuracy for regression
# Create a generic classification model; use autogam
class_aq <- autogam::autogam(airq_bin, 'Ozone', family = binomial())
class_aq$perf$auc # AUC (standardized accuracy for classification)
# Compute AUC for regression predictions
reg_auc_aq <- reg_aucroc(
  airq$Ozone,
  predict(req_aq)
)
# Average AUC over the lo, mid, and hi quantiles of dichotomization:
reg_auc_aq$mean_auc