penalized.pls {ppls}    R Documentation

Predict New Data Using a Penalized PLS Model

Description

new.penalized.pls: Given a fitted penalized PLS model and new test data, predicts the response for each number of components. If true response values are provided, it also returns the mean squared error (MSE) for each number of components.

penalized.pls: Computes the regression coefficients for a Penalized Partial Least Squares (PPLS) model, using either the classical NIPALS algorithm or a kernel-based version. Optionally allows block-wise variable selection.

penalized.pls.cv: Performs k-fold cross-validation to select the optimal penalization parameter lambda and the number of components ncomp of a PPLS model.

penalized.pls.default: Computes the regression coefficients using the standard (NIPALS-based) version of penalized PLS. Typically called internally by penalized.pls.

penalized.pls.kernel: Computes the regression coefficients using the kernel-based version of penalized PLS, which is especially useful when the number of predictors exceeds the number of observations (p >> n).

penalized.pls.select: Computes the regression coefficients of a PPLS model with block-wise selection, where each component is restricted to variables from a single block.

Usage

new.penalized.pls(ppls, Xtest, ytest = NULL)

penalized.pls(
  X,
  y,
  P = NULL,
  ncomp = NULL,
  kernel = FALSE,
  scale = FALSE,
  blocks = 1:ncol(X),
  select = FALSE
)

penalized.pls.cv(
  X,
  y,
  P = NULL,
  lambda = 1,
  ncomp = NULL,
  k = 5,
  kernel = FALSE,
  scale = FALSE
)

penalized.pls.default(X, y, M = NULL, ncomp)

penalized.pls.kernel(X, y, M = NULL, ncomp)

penalized.pls.select(X, y, M = NULL, ncomp, blocks)

Arguments

ppls

A fitted penalized PLS model, as returned by penalized.pls.

Xtest

A numeric matrix of new input data for prediction.

ytest

Optional. A numeric response vector corresponding to Xtest, for evaluating prediction error.

X

A numeric matrix of centered (and optionally scaled) predictor variables.

y

A centered numeric response vector.

P

Optional penalty matrix. If NULL, ordinary PLS is computed (i.e., no penalization).

ncomp

Integer. Number of PLS components to compute.

kernel

Logical. If TRUE, uses the kernel representation of PPLS. Default is FALSE.

scale

Logical. If TRUE, scales predictors in X to unit variance. Default is FALSE.

blocks

An integer vector of length ncol(X) that defines the block structure of the variables. All variables sharing the same value in blocks belong to the same block.

select

Logical. If TRUE, block-wise variable selection is applied in each iteration. Only one block contributes to the latent direction per component. Default is FALSE.

lambda

A numeric vector of candidate penalty parameters. Default is 1.

k

Integer. Number of cross-validation folds. Default is 5.

M

Optional penalty transformation matrix M = (I + P)^{-1}. If NULL, no penalization is applied.

Details

For new.penalized.pls: the fitted model ppls contains intercepts and regression coefficients for each number of components (from 1 to ncomp). The prediction for new data Xtest is computed as:

\hat{y}^{(i)} = X_\text{test} \cdot \beta^{(i)} + \text{intercept}^{(i)},

for each number of components i = 1, \ldots, ncomp.
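As an illustration, the columns of ypred can be reconstructed by hand from the coefficients and intercept components of the fitted model (a minimal sketch on simulated data; the object layout is described in the Value section):

set.seed(1)
Xtrain <- matrix(rnorm(60 * 5), ncol = 5)
ytrain <- rnorm(60)
Xtest <- matrix(rnorm(20 * 5), ncol = 5)
fit <- penalized.pls(Xtrain, ytrain, ncomp = 3)
i <- 2  # number of components
yhat <- Xtest %*% fit$coefficients[, i] + fit$intercept[i]
all.equal(as.vector(yhat), new.penalized.pls(fit, Xtest)$ypred[, i])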

penalized.pls centers X and y, and optionally scales X, then computes the PPLS components with one of penalized.pls.default (NIPALS), penalized.pls.kernel (kernel representation), or penalized.pls.select (block-wise selection), depending on the kernel and select arguments.

When a penalty matrix P is supplied, a transformation M = (I + P)^{-1} is computed internally. The algorithm then maximizes the penalized covariance between Xw and y:

\text{argmax}_w \; \text{Cov}(Xw, y)^2 - \lambda \cdot w^\top P w
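For a concrete case, M can be formed explicitly. A minimal sketch in base R with a second-order difference penalty (the weight lambda is folded into the penalty, as in the Examples below where P = lambda * P is passed):

p <- 10
D <- diff(diag(p), differences = 2)  # second-order difference operator
P <- crossprod(D)                    # penalty matrix P = D'D
lambda <- 5
M <- solve(diag(p) + lambda * P)     # M = (I + lambda * P)^{-1}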

The block-wise selection strategy (when select = TRUE) restricts the weight vector w at each iteration to be non-zero in a single block, selected greedily.

penalized.pls.cv splits the data into k cross-validation folds and, for each value of lambda and each number of components up to ncomp, computes the mean squared prediction error.

The optimal parameters are selected as those minimizing the prediction error across all folds. Internally, for each fold and lambda value, the function calls penalized.pls to fit the model and new.penalized.pls to evaluate predictions.
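Schematically, the internal loop for a single lambda value looks as follows (a simplified sketch; the actual function additionally stores the per-fold coefficients for the jackknife):

set.seed(1)
X <- matrix(rnorm(60 * 8), ncol = 8)
y <- rnorm(60)
k <- 5
ncomp <- 3
folds <- sample(rep(1:k, length.out = nrow(X)))
err <- matrix(0, k, ncomp)
for (f in 1:k) {
  fit <- penalized.pls(X[folds != f, ], y[folds != f], ncomp = ncomp)
  err[f, ] <- new.penalized.pls(fit, X[folds == f, ], y[folds == f])$mse
}
colMeans(err)  # cross-validated MSE per number of components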

The returned object can be further used for statistical inference (e.g., via jackknife) or prediction.

For penalized.pls.default, the method iteratively computes latent directions that maximize the penalized covariance with the response y. At each step:

  1. A weight vector w proportional to M X^\top y (with the current, deflated X) is computed and normalized.

  2. The latent component t = Xw is formed and orthogonalized with respect to the previous components.

  3. This is repeated for ncomp components.

The final regression coefficients are computed via a triangular system using the bidiagonal matrix R = T^\top X W, and backsolving:

\beta = W L (T^\top y),

where L = R^{-1}.
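The following self-contained sketch traces these steps for ordinary PLS (M = NULL, i.e. M = I), using the deflation formulation; the names W, TT and R mirror the notation above:

set.seed(1)
X <- scale(matrix(rnorm(30 * 4), ncol = 4), scale = FALSE)
y <- as.vector(scale(rnorm(30), scale = FALSE))
ncomp <- 2
W <- matrix(0, ncol(X), ncomp)   # weight vectors
TT <- matrix(0, nrow(X), ncomp)  # latent components
Xd <- X
for (i in 1:ncomp) {
  w <- crossprod(Xd, y)
  w <- w / sqrt(sum(w^2))              # normalized weight vector
  ti <- Xd %*% w
  ti <- ti / sqrt(sum(ti^2))           # normalized latent component
  W[, i] <- w
  TT[, i] <- ti
  Xd <- Xd - ti %*% crossprod(ti, Xd)  # deflation
}
R <- crossprod(TT, X %*% W)                   # bidiagonal matrix T'XW
beta <- W %*% backsolve(R, crossprod(TT, y))  # beta = W R^{-1} T'y
# check: fitted values equal the projection of y onto the components
all.equal(as.vector(X %*% beta), as.vector(TT %*% crossprod(TT, y)))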

The kernel PPLS algorithm is based on representing the model in terms of the Gram matrix K = X M X^\top (or simply K = X X^\top if M = NULL). The algorithm iteratively computes orthogonal latent components t_i in sample space.

Steps:

  1. Initialize the residual vector u = y, then compute t = Ku and normalize it.

  2. Orthogonalize t with respect to previous components (if needed).

  3. Repeat for ncomp components.

The regression coefficients are recovered as:

\beta = X^\top A, \quad \text{where } A = U L (T^\top y),

with U and T the matrices whose columns are the latent vectors u_i and the components t_i, and L = R^{-1} obtained by backsolving the triangular system.
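Since the kernel and NIPALS versions implement the same estimator, their coefficients should agree up to numerical error. A quick consistency check on centered data (a sketch, not part of the package tests):

set.seed(1)
X <- scale(matrix(rnorm(15 * 40), nrow = 15), scale = FALSE)  # p >> n
y <- as.vector(scale(rnorm(15), scale = FALSE))
b1 <- penalized.pls.default(X, y, M = NULL, ncomp = 2)$coefficients
b2 <- penalized.pls.kernel(X, y, M = NULL, ncomp = 2)$coefficients
all.equal(b1, b2, tolerance = 1e-6)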

penalized.pls.select implements a sparse selection strategy inspired by sparse or group PLS. At each component iteration, it computes the penalized covariance between X and y, and selects the block k for which the mean squared weight of its variables is maximal:

\text{score}_k = \frac{1}{|B_k|} \sum_{j \in B_k} w_j^2

Only the weights corresponding to the selected block are retained, and all others are set to zero. The rest of the algorithm follows the classical NIPALS-like PLS with orthogonal deflation.

This procedure enhances interpretability by selecting only one block per component, making it suitable for structured variable selection (e.g., grouped predictors).
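The selection step itself is compact. A small sketch for a given weight vector (here w is a random stand-in for the penalized covariance M X'y):

set.seed(1)
w <- rnorm(40)                       # stand-in weight vector
blocks <- rep(1:4, each = 10)        # 4 blocks of 10 variables
scores <- tapply(w^2, blocks, mean)  # mean squared weight per block
keep <- which.max(scores)            # block retained for this component
w[blocks != keep] <- 0               # all other weights are set to zero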

Value

new.penalized.pls returns a list containing:

ypred

A numeric matrix of predicted responses. Each column corresponds to a different number of PLS components.

mse

A numeric vector of mean squared errors, if ytest is provided. Otherwise NULL.

penalized.pls returns a list with components:

intercept

A numeric vector of intercepts for 1 to ncomp components.

coefficients

A numeric matrix of size ncol(X) x ncomp, each column being the coefficient vector for the corresponding number of components.

penalized.pls.cv returns an object of class "mypls", a list with the following components:

error.cv

A matrix of mean squared errors. Rows correspond to different lambda values; columns to different numbers of components.

lambda

The vector of candidate lambda values.

lambda.opt

The lambda value giving the minimum cross-validated error.

index.lambda

The index of lambda.opt in lambda.

ncomp.opt

The optimal number of PLS components.

min.ppls

The minimum cross-validated error.

intercept

Intercept of the optimal model (fitted on the full dataset).

coefficients

Coefficient vector for the optimal model.

coefficients.jackknife

An array of dimension ncol(X) x ncomp x length(lambda) x k, containing the coefficients from each cross-validation split and parameter setting.

penalized.pls.default returns a list with:

coefficients

A matrix of size ncol(X) x ncomp, with column i containing the regression coefficients of the model with i components.

penalized.pls.kernel returns a list with:

coefficients

A matrix of size ncol(X) x ncomp, containing the estimated regression coefficients for each number of components.

penalized.pls.select returns a list with:

coefficients

A matrix of size ncol(X) x ncomp, containing the regression coefficients after block-wise selection.

References

N. Kraemer, A.-L. Boulesteix, and G. Tutz (2008). Penalized Partial Least Squares with Applications to B-Spline Transformations and Functional Data. Chemometrics and Intelligent Laboratory Systems, 94(1), 60–69. doi:10.1016/j.chemolab.2008.06.009

See Also

penalized.pls, new.penalized.pls, penalized.pls.cv, penalized.pls.default, penalized.pls.kernel, jack.ppls, ppls.splines.cv, Penalty.matrix, normalize.vector

Examples

set.seed(123)
X <- matrix(rnorm(50 * 200), ncol = 50)
y <- rnorm(200)

Xtrain <- X[1:100, ]
ytrain <- y[1:100]
Xtest <- X[101:200, ]
ytest <- y[101:200]

pen.pls <- penalized.pls(Xtrain, ytrain, ncomp = 10)
pred <- new.penalized.pls(pen.pls, Xtest, ytest)
head(pred$ypred)
pred$mse

## Example from Kraemer et al. (2008)
data(BOD)
X <- BOD[, 1]
y <- BOD[, 2]

Xtest <- seq(min(X), max(X), length = 200)
dummy <- X2s(X, Xtest, deg = 3, nknot = 20)  # Spline transformation
Z <- dummy$Z
Ztest <- dummy$Ztest
size <- dummy$sizeZ
P <- Penalty.matrix(size, order = 2)
lambda <- 200
number.comp <- 3

ppls <- penalized.pls(Z, y, P = lambda * P, ncomp = number.comp)
new.ppls <- new.penalized.pls(ppls, Ztest)$ypred

# Plot fitted values for 2 components
plot(X, y, lwd = 3, xlim = range(Xtest))
lines(Xtest, new.ppls[, 2], col = "blue")

set.seed(42)
X <- matrix(rnorm(20 * 100), ncol = 20)
y <- rnorm(100)

# Example without a penalty matrix (P = NULL, so the lambda values have no effect)
result <- penalized.pls.cv(X, y, lambda = c(0, 1, 10), ncomp = 5)
result$lambda.opt
result$ncomp.opt
result$min.ppls

# Using jackknife estimation after CV
jack <- jack.ppls(result)
coef(jack)

set.seed(123)
X <- matrix(rnorm(20 * 50), nrow = 50)
y <- rnorm(50)
M <- diag(ncol(X))  # M = (I + P)^{-1} with P = 0, i.e. no penalty
# The low-level functions expect centered inputs, hence the explicit centering
coef <- penalized.pls.default(scale(X, TRUE, FALSE), scale(y, TRUE, FALSE),
  M, ncomp = 3)$coefficients
coef[, 1]  # coefficients for 1st component

set.seed(123)
X <- matrix(rnorm(100 * 10), nrow = 100)
y <- rnorm(100)
# The Gram matrix K = X M X^T is formed internally; center the inputs first
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- as.vector(scale(y, center = TRUE, scale = FALSE))
coef <- penalized.pls.kernel(Xc, yc, M = NULL, ncomp = 2)$coefficients
head(coef[, 1])  # coefficients for 1st component

set.seed(321)
X <- matrix(rnorm(40 * 30), ncol = 40)
y <- rnorm(30)

# Define 4 blocks of 10 variables each
blocks <- rep(1:4, each = 10)
# Center X and y, as the low-level functions expect centered inputs
result <- penalized.pls.select(scale(X, TRUE, FALSE), y - mean(y),
  M = NULL, ncomp = 2, blocks = blocks)
result$coefficients[, 1]  # Coefficients for first component


[Package ppls version 2.0.0 Index]