tuneandtrainIntRF {RobustPrediction}    R Documentation

Tune and Train Internal Random Forest

Description

This function tunes and trains a Random Forest classifier with the ranger package, using internal cross-validation. It evaluates a sequence of min.node.size values on the training dataset, selects the value with the highest cross-validated Area Under the Curve (AUC), and returns the model trained with that value.

Usage

tuneandtrainIntRF(data, num.trees = 500, nfolds = 5, seed = 123)

Arguments

data

A data frame containing the training data. The first column should be the response variable (a factor); the remaining columns are the predictor variables.

num.trees

An integer specifying the number of trees in the Random Forest. Default is 500.

nfolds

An integer specifying the number of folds for cross-validation. Default is 5.

seed

An integer specifying the random seed for reproducibility. Default is 123.
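
As a minimal sketch of the layout expected for data, the base R snippet below moves the response to the first column and converts it to a factor. The data frame raw and its column names (outcome, x1, x2) are hypothetical and only serve as an illustration.

```r
# Hypothetical raw data: the response ("outcome") is the last column
# and is stored as character rather than factor.
raw <- data.frame(
  x1 = c(0.2, 1.5, -0.7, 0.9, -1.1, 0.4),
  x2 = c(3.1, 2.8, 3.9, 2.2, 4.0, 3.3),
  outcome = c("no", "yes", "no", "yes", "no", "yes"),
  stringsAsFactors = FALSE
)

# Convert the response to a factor, as the function expects.
raw$outcome <- factor(raw$outcome)

# Reorder columns so the response comes first, predictors after it.
train <- raw[, c("outcome", setdiff(names(raw), "outcome"))]

str(train)
```

A data frame prepared this way can then be passed as the data argument.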

Details

Random Forest constructs multiple decision trees and aggregates their predictions. In ranger, the min.node.size parameter sets the minimum node size at which a split is still attempted, so larger values produce shallower trees and a less complex model. This function performs cross-validation within the training dataset to evaluate different min.node.size values. The value with the highest cross-validated AUC is selected, and the model trained with it is returned as the best model.
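
The tuning procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual implementation: the helper names (auc_rank, cv_tune_min_node_size), the candidate grid, the choice of the second factor level as the positive class, the rank-based AUC computation, and the toy data are all hypothetical. Only ranger() and its predict() method are real API calls.

```r
library(ranger)

# Rank-based (Mann-Whitney) AUC; `score` is the predicted probability
# of the `positive` class. Assumed helper, not part of RobustPrediction.
auc_rank <- function(truth, score, positive) {
  is_pos <- truth == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  if (n_pos == 0 || n_neg == 0) return(NA_real_)
  r <- rank(score)  # average ranks handle tied scores
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Assumed sketch of the tuning loop: the response is the first column of `data`.
cv_tune_min_node_size <- function(data, grid, num.trees = 500,
                                  nfolds = 5, seed = 123) {
  set.seed(seed)
  response <- names(data)[1]
  positive <- levels(data[[1]])[2]  # assumption: second level is "positive"
  fold_id <- sample(rep(seq_len(nfolds), length.out = nrow(data)))

  # Mean AUC across folds for each candidate min.node.size
  mean_auc <- sapply(grid, function(mns) {
    fold_auc <- sapply(seq_len(nfolds), function(k) {
      train <- data[fold_id != k, , drop = FALSE]
      test <- data[fold_id == k, , drop = FALSE]
      fit <- ranger(dependent.variable.name = response, data = train,
                    num.trees = num.trees, min.node.size = mns,
                    probability = TRUE, seed = seed)
      prob <- predict(fit, data = test)$predictions[, positive]
      auc_rank(test[[1]], prob, positive)
    })
    mean(fold_auc, na.rm = TRUE)
  })

  # Refit on the full training data with the best value
  best <- grid[which.max(mean_auc)]
  best_model <- ranger(dependent.variable.name = response, data = data,
                       num.trees = num.trees, min.node.size = best,
                       probability = TRUE, seed = seed)
  list(best_min_node_size = best, best_model = best_model, cv_auc = mean_auc)
}

# Toy data (hypothetical): x1 is informative, x2 is noise.
set.seed(1)
n <- 120
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- factor(ifelse(x1 + rnorm(n, sd = 0.5) > 0, "yes", "no"))
toy <- data.frame(y = y, x1 = x1, x2 = x2)

res <- cv_tune_min_node_size(toy, grid = c(1, 5, 10, 20), num.trees = 100)
res$best_min_node_size
```

Averaging the AUC across folds before picking the maximum is one reasonable design choice; the package may aggregate fold results differently.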

Value

A list containing the best 'min.node.size' value ('best_min_node_size') and the final trained model ('best_model').

Examples


# Load sample data
data(sample_data_train)

# Example usage
result <- tuneandtrainIntRF(sample_data_train, num.trees = 500, nfolds = 5, seed = 123)
result$best_min_node_size
result$best_model


[Package RobustPrediction version 0.1.7 Index]