createDisMatrix {e2tree}R Documentation

Dissimilarity matrix

Description

The function createDisMatrix creates a dissimilarity matrix among observations from an ensemble tree.

Usage

createDisMatrix(
  ensemble,
  data,
  label,
  parallel = list(active = FALSE, no_cores = 1),
  verbose = FALSE
)

Arguments

ensemble

is an ensemble tree object (for the moment ensemble works only with random forest objects)

data

is a data frame containing the variables in the model. It is the data frame used for ensemble learning.

label

is a character. It indicates the response label.

parallel

A list with two elements: active (logical) and no_cores (integer). If active = TRUE, the function performs parallel computation using the number of cores specified in no_cores. If no_cores is NULL or equal to 0, it defaults to using all available cores minus one. If active = FALSE, the function runs on a single core. Default: list(active = FALSE, no_cores = 1).

verbose

Logical. If TRUE, the function prints progress messages and other information during execution. If FALSE (the default), messages are suppressed.

Value

A dissimilarity matrix. This is a dissimilarity matrix measuring the discordance between two observations concerning a given classifier of a random forest model.

Examples


## Classification
data("iris")

# Create training and validation set:
smp_size <- floor(0.75 * nrow(iris))
train_ind <- sample(seq_len(nrow(iris)), size = smp_size)
training <- iris[train_ind, ]
validation <- iris[-train_ind, ]
response_training <- training[,5]
response_validation <- validation[,5]

# Perform training:
ensemble <- randomForest::randomForest(Species ~ ., data=training,
importance=TRUE, proximity=TRUE)

D <- createDisMatrix(ensemble, data=training, 
                     label = "Species", 
                     parallel = list(active=FALSE, no_cores = 1))


## Regression
data("mtcars")

# Create training and validation set:
smp_size <- floor(0.75 * nrow(mtcars))
train_ind <- sample(seq_len(nrow(mtcars)), size = smp_size)
training <- mtcars[train_ind, ]
validation <- mtcars[-train_ind, ]
response_training <- training[,1]
response_validation <- validation[,1]

# Perform training
ensemble = randomForest::randomForest(mpg ~ ., data=training, ntree=1000, 
importance=TRUE, proximity=TRUE)

D = createDisMatrix(ensemble, data=training, 
                        label = "mpg", 
                       parallel = list(active=FALSE, no_cores = 1))  



[Package e2tree version 0.1.2 Index]