tuneandtrainRobustTuneCRF {RobustPrediction}    R Documentation

Tune and Train RobustTuneC Random Forest

Description

This function tunes and trains a Random Forest classifier using the ranger package and the "RobustTuneC" method. It uses K-fold cross-validation on the training data to evaluate candidate min.node.size values and selects the best model based on the Area Under the Curve (AUC).

Usage

tuneandtrainRobustTuneCRF(data, dataext, K = 5, num.trees = 500)

Arguments

data

A data frame containing the training data. The first column should be the response variable (factor), and the remaining columns should be the predictor variables.

dataext

A data frame containing the external validation data. The first column should be the response variable (factor), and the remaining columns should be the predictor variables.

K

Number of folds to use in cross-validation. Default is 5.

num.trees

An integer specifying the number of trees to grow in the Random Forest. Default is 500.
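
Both data and dataext expect the response as a factor in the first column. A minimal sketch of preparing such a data frame, assuming a hypothetical raw table whose outcome is stored as 0/1 in a column named outcome (the column names and values here are illustrative only):

```r
# Hypothetical raw data: outcome stored as 0/1 in the last column.
raw <- data.frame(
  gene1 = c(0.2, 1.5, -0.3, 0.8),
  gene2 = c(1.1, -0.7, 0.4, 0.0),
  outcome = c(0, 1, 0, 1)
)

# Convert the response to a factor and move it to the first column.
raw$outcome <- factor(raw$outcome)
prepared <- raw[, c("outcome", setdiff(names(raw), "outcome"))]

# Quick checks before passing `prepared` as `data` or `dataext`.
stopifnot(is.factor(prepared[[1]]), names(prepared)[1] == "outcome")
```

The same reordering would apply to the external dataset, which should use the same predictor columns as the training data.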

Details

Random Forest constructs multiple decision trees and aggregates their predictions. The min.node.size parameter sets the minimum size a node must have to be split further, which affects model complexity: larger values yield shallower, simpler trees. This function evaluates candidate min.node.size values through K-fold cross-validation on the training data (data) and then applies the best model to the external validation dataset (dataext). The min.node.size value that results in the highest AUC on the external validation dataset is selected.
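
The cross-validation step described above can be sketched as follows. This is a minimal illustration, not the package's internal code: the helper names auc_rank and cv_auc_by_node_size, the candidate grid c(1, 5, 10), the stratified fold assignment, and the choice of the second factor level as the positive class are all assumptions made for the example. It covers only the cross-validation on the training data and omits the step involving dataext and the c value. It assumes the ranger package is installed.

```r
library(ranger)

# Rank-based (Mann-Whitney) AUC; `positive` is the level treated as the event.
auc_rank <- function(score, truth, positive) {
  pos <- truth == positive
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  r <- rank(score)  # average ranks handle ties
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical helper: mean K-fold CV AUC for each candidate min.node.size.
# The response is expected in the first column of `data`, as a factor.
cv_auc_by_node_size <- function(data, node_sizes, K = 5, num.trees = 500) {
  y_name <- names(data)[1]
  positive <- levels(data[[1]])[2]

  # Stratified folds so every fold contains both classes.
  folds <- integer(nrow(data))
  for (lv in levels(data[[1]])) {
    idx <- which(data[[1]] == lv)
    folds[idx] <- sample(rep(seq_len(K), length.out = length(idx)))
  }

  sapply(node_sizes, function(ns) {
    fold_auc <- sapply(seq_len(K), function(k) {
      train <- data[folds != k, , drop = FALSE]
      test  <- data[folds == k, , drop = FALSE]
      fit <- ranger(
        dependent.variable.name = y_name,
        data = train,
        num.trees = num.trees,
        min.node.size = ns,
        probability = TRUE  # return class probabilities for AUC
      )
      prob <- predict(fit, data = test)$predictions[, positive]
      auc_rank(prob, test[[1]], positive)
    })
    mean(fold_auc)
  })
}

# Toy data: x1 is informative, x2 is noise.
set.seed(1)
n <- 120
x1 <- rnorm(n)
x2 <- rnorm(n)
toy <- data.frame(
  y = factor(ifelse(x1 + rnorm(n, sd = 0.5) > 0, "case", "control")),
  x1 = x1,
  x2 = x2
)

node_sizes <- c(1, 5, 10)
cv_auc <- cv_auc_by_node_size(toy, node_sizes, K = 5, num.trees = 100)
best_by_cv <- node_sizes[which.max(cv_auc)]
```

Here best_by_cv is simply the candidate with the highest mean cross-validated AUC. The rank-based AUC is used only to keep the sketch free of further dependencies.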

Value

A list containing the best minimum node size ('best_min_node_size'), the final trained model ('best_model'), and the chosen c value ('best_c').

Examples


# Load sample data
data(sample_data_train)
data(sample_data_extern)

# Example usage
result <- tuneandtrainRobustTuneCRF(sample_data_train, sample_data_extern, K = 5, num.trees = 500)
result$best_min_node_size
result$best_model
result$best_c


[Package RobustPrediction version 0.1.7 Index]