PPCR {OPCreg}    R Documentation

Perturbation-based Principal Component Regression

Description

This function performs Perturbation-based Principal Component Regression (PPCR) on the provided dataset. It combines Principal Component Analysis (PCA) with linear regression and adds random perturbations to the principal components to improve robustness.

Usage

PPCR(data, eta = 0.0035, m = 3, alpha = 0.05, perturbation_factor = 0.1)

Arguments

data

A data frame containing the response variable and predictors.

eta

A proportion (between 0 and 1) of the observations used as the initial sample for PCA.

m

The number of principal components to retain.

alpha

Significance level (currently not used in the function).

perturbation_factor

A factor controlling the magnitude of perturbation added to the principal components.
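
To get a feel for eta, the short sketch below computes the initial sample size it would imply for a given number of rows. The rounding rule and the name n0 are assumptions made for illustration; the function may derive the size differently.

```r
n <- 2000              # number of observations, as in the Examples section
eta <- 0.0035          # default value of eta
n0 <- round(eta * n)   # assumption: initial size is eta * n, rounded
n0
```

Under this assumption the default eta selects only a small number of rows, so a larger eta may be worth trying on small data sets.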

Details

The function first standardizes the predictors, then performs PCA on an initial subset of the data. It iteratively updates the principal components by incorporating new observations and adding random perturbations. Finally, it fits a linear regression model using the principal components as predictors and transforms the coefficients back to the original space.
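
The steps above can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: the name pcr_sketch, the choice of the first column as the response, the use of the first n0 rows as the initial subset, and Gaussian noise followed by QR re-orthonormalization as the perturbation are all assumptions. The iterative update with new observations is omitted, so the perturbation is applied once.

```r
# Hypothetical helper: principal component regression with one perturbation step.
pcr_sketch <- function(data, eta = 0.1, m = 3, perturbation_factor = 0.1) {
  y <- data[[1]]                              # assumption: response is column 1
  X <- scale(as.matrix(data[, -1]))           # standardize the predictors
  ctr <- attr(X, "scaled:center")
  s <- attr(X, "scaled:scale")
  n <- nrow(X); p <- ncol(X)
  n0 <- max(m + 1, round(eta * n))            # assumption: first n0 rows are the initial subset

  # PCA on the initial subset via the eigen-decomposition of its covariance
  eig <- eigen(cov(X[seq_len(n0), , drop = FALSE]), symmetric = TRUE)
  V <- eig$vectors[, seq_len(m), drop = FALSE]
  lambda <- eig$values[seq_len(m)]

  # Assumption: perturb the loadings with Gaussian noise, then re-orthonormalize
  noise <- matrix(rnorm(p * m, sd = perturbation_factor), p, m)
  V <- qr.Q(qr(V + noise))

  # Regress the response on the component scores
  scores <- X %*% V
  fit <- lm(y ~ scores)
  gamma <- coef(fit)[-1]

  # Map the coefficients back to the original (unstandardized) predictors
  beta <- drop(V %*% gamma) / s
  intercept <- coef(fit)[1] - sum(beta * ctr)
  yhat <- fitted(fit)

  list(Bhat = c(intercept, beta), RMSE = sqrt(mean((y - yhat)^2)),
       Vhat = V, lambdahat = lambda, yhat = yhat)
}

# Usage on a small simulated data set
set.seed(1)
toy <- data.frame(y = rnorm(50), matrix(rnorm(50 * 4), 50, 4))
res <- pcr_sketch(toy, m = 2)
str(res)
```

Because the regression is fitted on scores of standardized predictors, the back-transformation divides each slope by that predictor's standard deviation and adjusts the intercept by the predictor means, so that Bhat can be applied directly to the raw columns.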

Value

A list containing the following components:

Bhat

Estimated regression coefficients in the original space.

RMSE

Root Mean Squared Error of the regression model.

summary

Summary of the linear regression model.

Vhat

Estimated principal components.

lambdahat

Estimated eigenvalues.

yhat

Predicted values from the regression model.

See Also

lm: For linear regression models.

prcomp: For principal component analysis.

Examples

## Not run: 
# Example data (mvrnorm() comes from the MASS package)
library(MASS)
set.seed(1234)
n <- 2000
p <- 10
mu0 <- as.matrix(runif(p, 0))
sigma0 <- as.matrix(runif(p, 0, 10))
ro <- as.matrix(c(runif(round(p / 2), -1, -0.8), runif(p - round(p / 2), 0.8, 1)))
R0 <- ro %*% t(ro)
diag(R0) <- 1
Sigma0 <- sigma0 %*% t(sigma0) * R0
x <- mvrnorm(n, mu0, Sigma0)
colnames(x) <- paste("x", 1:p, sep = "")
e <- rnorm(n, 0, 1)
B <- sample(1:3, (p + 1), replace = TRUE)
en <- matrix(rep(1, n * 1), ncol = 1)
y <- cbind(en, x) %*% B + e
colnames(y) <- "y"
data <- data.frame(cbind(y, x))

# Call the PPCR function
result <- PPCR(data, eta = 0.0035, m = 3, alpha = 0.05, perturbation_factor = 0.1)

# Print results
print(result$Bhat)  # Estimated regression coefficients
print(result$RMSE)  # RMSE of the model
print(result$summary)  # Summary of the regression model

## End(Not run)


[Package OPCreg version 3.0.0 Index]