tuneandtrainRobustTuneCSVM {RobustPrediction} | R Documentation |
Tune and Train RobustTuneC Support Vector Machine (SVM)
Description
This function tunes and trains a Support Vector Machine (SVM) classifier using the "RobustTuneC" method. It performs K-fold cross-validation on the training data (with K specified by the user) and selects the best model based on the Area Under the ROC Curve (AUC).
Usage
tuneandtrainRobustTuneCSVM(
data,
dataext,
K = 5,
seed = 123,
kernel = "linear",
cost_seq = 2^(-15:15),
scale = FALSE
)
Arguments
data | A data frame containing the training data. The first column should be the response variable (factor), and the remaining columns should be the predictor variables. |
dataext | A data frame containing the external validation data. The first column should be the response variable (factor), and the remaining columns should be the predictor variables. |
K | Number of folds to use in cross-validation. Default is 5. |
seed | An integer specifying the random seed for reproducibility. Default is 123. |
kernel | A character string specifying the kernel type to be used in the SVM. It can be "linear", "polynomial", "radial", or "sigmoid". Default is "linear". |
cost_seq | A numeric vector of cost values to be evaluated. Default is '2^(-15:15)'. |
scale | A logical value indicating whether to scale the predictor variables. Default is 'FALSE'. |
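The expected input layout and the default cost grid can be checked with a few lines of base R. This is a minimal sketch: the data frame 'toy_train' and its column names are hypothetical and are not shipped with the package.

```r
# Hypothetical toy data in the documented layout: the response factor comes
# first, followed by numeric predictors.
set.seed(42)
toy_train <- data.frame(
  y = factor(sample(c("control", "case"), 20, replace = TRUE)),
  gene1 = rnorm(20),
  gene2 = rnorm(20)
)

# The first column must be a factor; the remaining columns are predictors.
stopifnot(is.factor(toy_train[[1]]))
stopifnot(all(vapply(toy_train[-1], is.numeric, logical(1))))

# Default cost grid: 31 values spaced evenly on a log2 scale.
cost_seq <- 2^(-15:15)
length(cost_seq)  # 31
range(cost_seq)   # 2^-15 up to 2^15 (32768)
```

A coarser grid such as 2^(-5:5) shortens the run time when the full default range is not needed.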
Details
In Support Vector Machines, the cost parameter controls the trade-off between a wide margin and the penalty for misclassified training observations. A large cost fits the training data closely and risks overfitting, while a small cost regularizes more strongly.
This function trains SVM models on the training dataset, performs K-fold cross-validation to evaluate the values in cost_seq, and selects the one that yields the highest AUC. The final model is trained using the selected cost value, and its performance is reported using the AUC metric on the external validation dataset.
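The cross-validation step described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the helper names 'auc_rank', 'ridge_scores' and 'cv_auc_by_cost' are hypothetical, the data are simulated, and the scorer is a ridge regression stand-in rather than an SVM, used only so that the sketch runs without extra packages. The external-data part of the method is not shown.

```r
# Rank-based (Mann-Whitney) AUC. The second factor level is treated as the
# positive class, which is an assumption of this sketch.
auc_rank <- function(scores, labels) {
  pos <- labels == levels(labels)[2]
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  r <- rank(scores)  # ties receive average ranks
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Stand-in scorer (NOT an SVM): ridge regression on the 0/1 response with
# penalty 1 / cost, so a larger cost means weaker regularization.
ridge_scores <- function(train, test, cost) {
  x <- cbind(1, as.matrix(train[, -1, drop = FALSE]))
  y <- as.numeric(train[[1]] == levels(train[[1]])[2])
  penalty <- diag(c(0, rep(1 / cost, ncol(x) - 1)))  # intercept unpenalized
  beta <- solve(crossprod(x) + penalty, crossprod(x, y))
  drop(cbind(1, as.matrix(test[, -1, drop = FALSE])) %*% beta)
}

# Mean K-fold cross-validated AUC for every value in cost_seq.
cv_auc_by_cost <- function(data, K, seed, cost_seq, scorer) {
  set.seed(seed)
  folds <- sample(rep(seq_len(K), length.out = nrow(data)))
  sapply(cost_seq, function(cost) {
    mean(sapply(seq_len(K), function(k) {
      train <- data[folds != k, , drop = FALSE]
      test <- data[folds == k, , drop = FALSE]
      auc_rank(scorer(train, test, cost), test[[1]])
    }))
  })
}

# Simulated two-class data in the documented layout (response first).
set.seed(1)
n <- 100
toy <- data.frame(
  y = factor(rep(c("a", "b"), each = n / 2)),
  x1 = rnorm(n, mean = rep(c(0, 1), each = n / 2)),
  x2 = rnorm(n)
)

cost_seq <- 2^(-3:3)
cv_auc <- cv_auc_by_cost(toy, K = 5, seed = 123, cost_seq = cost_seq,
                         scorer = ridge_scores)
best_cost <- cost_seq[which.max(cv_auc)]  # cost with the highest mean AUC
```

Replacing 'ridge_scores' with a function that fits an SVM and returns decision values would bring the sketch closer to the setting documented here; the fold assignment and the selection by highest mean AUC stay the same.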
Value
A list containing the best cost value ('best_cost'), the final trained model ('best_model'), and the chosen c value ('best_c').
Examples
# Load sample data
data(sample_data_train)
data(sample_data_extern)
# Example usage
result <- tuneandtrainRobustTuneCSVM(sample_data_train, sample_data_extern, K = 5, seed = 123,
                                     kernel = "linear", cost_seq = 2^(-15:15), scale = FALSE)
result$best_cost
result$best_model
result$best_c