tuneandtrainRobustTuneCSVM {RobustPrediction}    R Documentation

Tune and Train RobustTuneC Support Vector Machine (SVM)

Description

This function tunes and trains a Support Vector Machine (SVM) classifier using the "RobustTuneC" method. It performs K-fold cross-validation (with K specified by the user) to select the best model based on the Area Under the Curve (AUC) metric.

Usage

tuneandtrainRobustTuneCSVM(
  data,
  dataext,
  K = 5,
  seed = 123,
  kernel = "linear",
  cost_seq = 2^(-15:15),
  scale = FALSE
)

Arguments

data

A data frame containing the training data. The first column should be the response variable (factor), and the remaining columns should be the predictor variables.

dataext

A data frame containing the external validation data. The first column should be the response variable (factor), and the remaining columns should be the predictor variables.

K

Number of folds to use in cross-validation. Default is 5.

seed

An integer specifying the random seed for reproducibility. Default is 123.

kernel

A character string specifying the kernel type to be used in the SVM. It can be "linear", "polynomial", "radial", or "sigmoid". Default is "linear".

cost_seq

A numeric vector of cost values to be evaluated. Default is '2^(-15:15)'.

scale

A logical value indicating whether to scale the predictor variables. Default is 'FALSE'.
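As a minimal sketch of the input layout described above (response factor in the first column, predictors in the remaining columns), the following base R code builds toy training and external data frames and checks them. The names make_toy, check_layout, toy_train, toy_extern, and coarse_grid are illustrative assumptions and are not part of RobustPrediction; the coarser cost grid is only one possible choice for reducing the search.

```r
# Toy data in the documented layout:
# column 1 = factor response, remaining columns = predictors.
# make_toy() is a hypothetical helper, not part of RobustPrediction.
make_toy <- function(n, seed) {
  set.seed(seed)
  data.frame(
    y  = factor(rep(c("0", "1"), length.out = n)),
    x1 = rnorm(n),
    x2 = rnorm(n)
  )
}
toy_train  <- make_toy(40, seed = 1)
toy_extern <- make_toy(30, seed = 2)

# Hypothetical layout check: a data frame whose first column is a factor
# and which has at least one predictor column.
check_layout <- function(df) {
  is.data.frame(df) && is.factor(df[[1]]) && ncol(df) >= 2
}
stopifnot(check_layout(toy_train), check_layout(toy_extern))

# The default grid covers 31 powers of two; a coarser grid
# (an assumption, not a package recommendation) is cheaper to search.
default_grid <- 2^(-15:15)
coarse_grid  <- 2^seq(-5, 5, by = 2)
length(default_grid)  # 31
length(coarse_grid)   # 6
```

Data frames of this shape could then be passed as data and dataext, with coarse_grid supplied as cost_seq.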

Details

In Support Vector Machines, the cost parameter controls the trade-off between a wide margin and a low training error: small values tolerate more margin violations (stronger regularization), whereas large values penalize them heavily and may overfit the training data. This function evaluates each value in cost_seq by K-fold cross-validation on the training dataset and selects the one that yields the highest AUC. The final model is trained using the selected cost value, and its performance is reported as the AUC on the external validation dataset.
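The cross-validated AUC criterion described above can be illustrated in base R. This is a sketch under stated assumptions, not the package's implementation: auc_rank, make_folds, and cv_auc are hypothetical helpers, and fit_predict stands for any user-supplied function(train, test, cost) that fits a classifier on train and returns numeric scores for test. The second factor level is treated as the positive class, which is also an assumption.

```r
# Rank-based AUC (Mann-Whitney form): the probability that a randomly
# chosen positive receives a higher score than a randomly chosen negative;
# ties count one half because rank() averages tied ranks.
auc_rank <- function(scores, is_pos) {
  r <- rank(scores)
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Assign each of n rows to one of K folds, reproducibly.
make_folds <- function(n, K = 5, seed = 123) {
  set.seed(seed)
  sample(rep(seq_len(K), length.out = n))
}

# Mean cross-validated AUC for each candidate cost value.
# A fold that lacks one of the two classes yields NaN for that fold.
cv_auc <- function(data, cost_seq, fit_predict, K = 5, seed = 123) {
  folds <- make_folds(nrow(data), K, seed)
  pos <- levels(data[[1]])[2]  # assumption: second level is "positive"
  vapply(cost_seq, function(cost) {
    mean(vapply(seq_len(K), function(k) {
      train <- data[folds != k, , drop = FALSE]
      test  <- data[folds == k, , drop = FALSE]
      auc_rank(fit_predict(train, test, cost), test[[1]] == pos)
    }, numeric(1)))
  }, numeric(1))
}

# Small check of the AUC helper on hand-made scores.
auc_rank(c(0.1, 0.4, 0.35, 0.8), c(FALSE, FALSE, TRUE, TRUE))  # 0.75
```

Given such a vector of mean AUC values, cost_seq[which.max(...)] picks the cost with the highest cross-validated AUC; which.max returns the first maximum, so ties resolve toward the smaller cost when cost_seq is increasing.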

Value

A list containing the best cost value ('best_cost'), the final trained model ('best_model'), and the chosen c value ('best_c').

Examples


# Load sample data
data(sample_data_train)
data(sample_data_extern)

# Example usage
result <- tuneandtrainRobustTuneCSVM(sample_data_train, sample_data_extern, K = 5, seed = 123, 
                                     kernel = "linear", cost_seq = 2^(-15:15), scale = FALSE)
result$best_cost
result$best_model
result$best_c


[Package RobustPrediction version 0.1.7 Index]