DTSR {DTSR} | R Documentation |
Distributed Trimmed Scores Regression (DTSR) for Handling Missing Data
Description
This function performs DTSR to handle missing data: it divides the dataset into D blocks, applies the Trimmed Scores Regression (TSR) method to each block, and then combines the block-wise results into a single imputed dataset. It also computes evaluation metrics for the imputation (RMSE, MAE, RE, and GCV) and records the execution time.
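The divide, impute, and combine pattern described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `impute_block()` and `blockwise_impute()` are hypothetical names, column-mean imputation stands in for TSR, and splitting the rows into contiguous blocks is an assumption about how the D blocks are formed.

```r
# Hypothetical stand-in for TSR: fill each NA with the column mean of its block.
impute_block <- function(block) {
  for (j in seq_len(ncol(block))) {
    miss <- is.na(block[, j])
    if (any(miss)) block[miss, j] <- mean(block[, j], na.rm = TRUE)
  }
  block
}

# Split the rows into D contiguous blocks (an assumption), impute each block
# independently, and write the results back into one matrix.
blockwise_impute <- function(X, D) {
  block_id <- cut(seq_len(nrow(X)), breaks = D, labels = FALSE)
  out <- X
  for (d in seq_len(D)) {
    rows <- which(block_id == d)
    out[rows, ] <- impute_block(X[rows, , drop = FALSE])
  }
  out
}

set.seed(1)
X <- matrix(rnorm(40), nrow = 10)
X[c(2, 7), 1] <- NA          # one missing value in each of the two blocks
X_filled <- blockwise_impute(X, D = 2)
anyNA(X_filled)              # FALSE
```

Because each block is imputed using only its own rows, the blocks could be processed independently, which is the usual motivation for a distributed scheme of this kind.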
Usage
DTSR(data0, data.sample, data.copy, mr, km, D)
Arguments
data0 |
The original dataset containing the response variable and features. |
data.sample |
The dataset used for sampling, which may contain missing values. |
data.copy |
A copy of the original dataset, used for comparison or validation. |
mr |
Indices of the rows with missing values that need to be predicted. |
km |
The number of clusters for k-means clustering. |
D |
The number of blocks to divide the data into. |
Value
A list containing:
XDTSR |
The imputed dataset. |
RMSEDTSR |
The Root Mean Squared Error. |
MAEDTSR |
The Mean Absolute Error. |
REDTSR |
The Relative Error. |
GCVDTSR |
The Generalized Cross-Validation (GCV) value for DTSR. |
timeDTSR |
The DTSR algorithm execution time. |
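For orientation, the two most common of these metrics can be written with their textbook definitions as below. This is a minimal sketch: `rmse()` and `mae()` are hypothetical helper names, and DTSR may compute its values over a different set of entries or with a different scaling. RE and GCV are not sketched because this page does not give their formulas.

```r
# Textbook definitions; not necessarily the exact formulas used inside DTSR.
rmse <- function(truth, imputed) sqrt(mean((truth - imputed)^2))
mae  <- function(truth, imputed) mean(abs(truth - imputed))

truth   <- matrix(c(1, 2, 3, 4), nrow = 2)
imputed <- matrix(c(1, 2, 3, 6), nrow = 2)  # one entry is off by 2

rmse(truth, imputed)  # 1
mae(truth, imputed)   # 0.5
```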
See Also
TSR
for the original TSR function.
Examples
# Create a sample matrix with random values and introduce missing values
set.seed(123)
n <- 100
p <- 10
D <- 2
data.sample <- matrix(rnorm(n * p), nrow = n)
data.sample[sample(1:(n-10), (p-2))] <- NA
data.copy <- data.sample
data0 <- data.frame(data.sample, response = rnorm(n))
mr <- sample(1:n, 10) # Sample rows for evaluation
km <- 3 # Number of clusters
# Perform DTSR imputation
result <- DTSR(data0, data.sample, data.copy, mr, km, D)
# Print the results
print(result$XDTSR)