tuneandtrainIntRF {RobustPrediction}    R Documentation
Tune and Train Internal Random Forest
Description
This function tunes and trains a Random Forest classifier using the ranger
package with internal cross-validation. It evaluates a sequence of
min.node.size values on the training dataset and selects the best model
based on the Area Under the Curve (AUC).
Usage
tuneandtrainIntRF(data, num.trees = 500, nfolds = 5, seed = 123)
Arguments
data        A data frame containing the training data. The first column should be the response variable (factor), and the remaining columns should be the predictor variables.
num.trees   An integer specifying the number of trees in the Random Forest. Default is 500.
nfolds      An integer specifying the number of folds for cross-validation. Default is 5.
seed        An integer specifying the random seed for reproducibility. Default is 123.
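The layout expected for data (response as a factor in the first column, predictors after it) can be prepared along the following lines. This is only a sketch: the data frame raw and its column names gene_a, gene_b and outcome are made up for illustration and do not come from the package.

```r
# Hypothetical raw data: the outcome is stored as text in the last column.
raw <- data.frame(
  gene_a  = c(0.2, 1.5, 0.7, 2.1),
  gene_b  = c(3.1, 0.4, 2.2, 0.9),
  outcome = c("control", "case", "control", "case"),
  stringsAsFactors = FALSE
)

# Convert the response to a factor and move it to the first column,
# keeping all remaining columns as predictors.
raw$outcome <- factor(raw$outcome)
train <- raw[, c("outcome", setdiff(names(raw), "outcome"))]

str(train)
```

A data frame arranged this way could then be passed as the data argument.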
Details
Random Forest constructs multiple decision trees and aggregates their predictions.
The min.node.size parameter controls the minimum node size: larger values yield
shallower trees and therefore a less complex model.
This function performs cross-validation within the training dataset to evaluate
each min.node.size value in the sequence.
The value with the highest AUC is selected and used for the final trained model.
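The tuning procedure described above can be sketched roughly as follows. This is not the package's source code: the helpers auc_rank and cv_min_node_size, the grid of candidate values, and the choice of the second factor level as the positive class are all assumptions made for illustration. The folds are drawn at random without stratification, so the sketch assumes every fold contains both classes.

```r
library(ranger)

# Rank-based (Mann-Whitney) AUC. `scores` are predicted probabilities of the
# positive class; `labels` is a logical vector (TRUE = positive).
auc_rank <- function(scores, labels) {
  n_pos <- sum(labels)
  n_neg <- sum(!labels)
  r <- rank(scores)  # average ranks handle ties
  (sum(r[labels]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical helper: mean cross-validated AUC for each candidate
# min.node.size. The response is assumed to be a two-level factor in column 1.
cv_min_node_size <- function(data, grid, num.trees = 500, nfolds = 5, seed = 123) {
  set.seed(seed)
  y_name <- names(data)[1]
  positive <- levels(data[[1]])[2]  # assumption: second level is the positive class
  folds <- sample(rep(seq_len(nfolds), length.out = nrow(data)))
  mean_auc <- sapply(grid, function(mns) {
    mean(sapply(seq_len(nfolds), function(k) {
      # Fit on all folds except k, with class probabilities as output.
      fit <- ranger(
        dependent.variable.name = y_name,
        data = data[folds != k, , drop = FALSE],
        num.trees = num.trees,
        min.node.size = mns,
        probability = TRUE,
        seed = seed
      )
      # Score the held-out fold k.
      held_out <- data[folds == k, , drop = FALSE]
      prob <- predict(fit, data = held_out)$predictions[, positive]
      auc_rank(prob, held_out[[1]] == positive)
    }))
  })
  data.frame(min.node.size = grid, mean_auc = mean_auc)
}

# Usage on simulated data (toy is made up for this example).
set.seed(1)
n <- 200
toy <- data.frame(
  y  = factor(rep(c("no", "yes"), each = n / 2)),
  x1 = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 1)),
  x2 = rnorm(n)
)
res <- cv_min_node_size(toy, grid = c(1, 5, 10, 20), num.trees = 100)
best <- res$min.node.size[which.max(res$mean_auc)]
```

In this sketch, best plays the role of best_min_node_size; a final model would then be refitted on the full training data with that value.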
Value
A list containing the best 'min.node.size' value ('best_min_node_size') and the final trained model ('best_model').
Examples
# Load sample data
data(sample_data_train)
# Example usage
result <- tuneandtrainIntRF(sample_data_train, num.trees = 500, nfolds = 5, seed = 123)
result$best_min_node_size
result$best_model