tuneandtrainRobustTuneCRF {RobustPrediction}    R Documentation
Tune and Train RobustTuneC Random Forest
Description
This function tunes and trains a Random Forest classifier using the ranger
package and the "RobustTuneC" method.
The function uses K-fold cross-validation to evaluate different min.node.size
values on the training dataset and selects the best model based on the area
under the ROC curve (AUC).
Usage
tuneandtrainRobustTuneCRF(data, dataext, K = 5, num.trees = 500)
Arguments
data |
A data frame containing the training data. The first column should be the response variable (factor), and the remaining columns should be the predictor variables. |
dataext |
A data frame containing the external validation data. The first column should be the response variable (factor), and the remaining columns should be the predictor variables. |
K |
Number of folds to use in cross-validation. Default is 5. |
num.trees |
An integer specifying the number of trees to grow in the Random Forest. Default is 500. |
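As an illustration of the layout that data and dataext are expected to have, here is a minimal sketch that builds a toy data frame with a factor response in the first column. The data, the column names (y, x1, x2), and the object name toy are hypothetical and are not part of the package.

```r
# Hypothetical toy data; names and values are for illustration only.
set.seed(1)
n <- 40
toy <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  y  = sample(c("0", "1"), n, replace = TRUE)
)

# The response must be a factor ...
toy$y <- factor(toy$y)

# ... and must sit in the first column, followed by the predictors.
toy <- toy[, c("y", setdiff(names(toy), "y"))]

# Quick checks of the expected layout.
stopifnot(is.factor(toy[[1]]), ncol(toy) >= 2)
```

The same checks can be run on both the training and the external data before calling the function.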
Details
Random Forest constructs multiple decision trees and aggregates their predictions.
The min.node.size parameter controls the minimum number of samples in each
terminal node and thereby model complexity: larger values give shallower,
simpler trees.
This function evaluates candidate min.node.size values through cross-validation
on the training data and then applies the best model to the external validation
dataset (dataext). The min.node.size value that results in the highest AUC on
the external validation dataset is selected.
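The cross-validation step described above could look roughly like the following sketch. It is not the package's implementation: the helper names auc_rank and cv_auc, the candidate grid c(1, 5, 10), and the choice of the second factor level as the positive class are all assumptions made for illustration. The model-fitting part runs only if the ranger package is installed.

```r
# Rank-based (Mann-Whitney) AUC; `is_pos` is TRUE for the positive class.
auc_rank <- function(score, is_pos) {
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(rank(score)[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Mean K-fold AUC for each candidate min.node.size (hypothetical helper).
cv_auc <- function(data, sizes, K = 5, num.trees = 500) {
  resp  <- names(data)[1]                 # response is the first column
  pos   <- levels(data[[1]])[2]           # assumption: second level is "positive"
  folds <- sample(rep(seq_len(K), length.out = nrow(data)))
  sapply(sizes, function(s) {
    mean(sapply(seq_len(K), function(k) {
      fit <- ranger::ranger(
        dependent.variable.name = resp,
        data = data[folds != k, , drop = FALSE],
        num.trees = num.trees,
        min.node.size = s,
        probability = TRUE                # class probabilities are needed for AUC
      )
      test <- data[folds == k, , drop = FALSE]
      prob <- predict(fit, data = test)$predictions[, pos]
      auc_rank(prob, test[[1]] == pos)
    }))
  })
}

if (requireNamespace("ranger", quietly = TRUE)) {
  set.seed(1)
  toy <- data.frame(y = factor(rep(c("a", "b"), 50)),
                    x1 = rnorm(100), x2 = rnorm(100))
  sizes <- c(1, 5, 10)                    # illustrative grid, not the package's
  aucs  <- cv_auc(toy, sizes, K = 5, num.trees = 50)
  sizes[which.max(aucs)]                  # candidate with the highest mean CV AUC
}
```

The sketch covers only the cross-validation on the training data; the external validation step and the choice of c are handled by the function itself and are not reproduced here.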
Value
A list containing the best minimum node size ('best_min_node_size'), the final trained model ('best_model'), and the chosen c value ('best_c').
Examples
# Load sample data
data(sample_data_train)
data(sample_data_extern)
# Example usage
result <- tuneandtrainRobustTuneCRF(sample_data_train, sample_data_extern, K = 5, num.trees = 500)
result$best_min_node_size
result$best_model
result$best_c
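A possible follow-up to the example above is to use the returned model for prediction. This sketch assumes that best_model is a ranger object, which this page does not state explicitly; the inherits() check guards that assumption, and the whole block is skipped if either package is missing. The name have_pkgs is introduced here for illustration.

```r
# Run only if both packages are installed.
have_pkgs <- requireNamespace("RobustPrediction", quietly = TRUE) &&
  requireNamespace("ranger", quietly = TRUE)

if (have_pkgs) {
  data(sample_data_train, package = "RobustPrediction")
  data(sample_data_extern, package = "RobustPrediction")

  result <- RobustPrediction::tuneandtrainRobustTuneCRF(
    sample_data_train, sample_data_extern, K = 5, num.trees = 500
  )

  # Assumption: the returned model is a ranger object.
  if (inherits(result$best_model, "ranger")) {
    pred <- predict(result$best_model, data = sample_data_extern)
    head(pred$predictions)
  }
}
```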