penalized.pls {ppls}    R Documentation
Predict New Data Using a Penalized PLS Model
Description
new.penalized.pls: Given a fitted penalized PLS model and new test data, predicts the response for every number of components. If true response values are provided, it also returns the mean squared error (MSE) for each number of components.
penalized.pls: Computes the regression coefficients of a Penalized Partial Least Squares (PPLS) model, using either the classical NIPALS algorithm or a kernel-based version. Optionally performs block-wise variable selection.
penalized.pls.cv: Performs k-fold cross-validation to evaluate and select the optimal penalization parameter lambda and number of components ncomp of a PPLS model.
penalized.pls.default: Computes the regression coefficients using the standard (NIPALS-based) version of penalized PLS. This function is typically called internally by penalized.pls.
penalized.pls.kernel: Computes the regression coefficients using the kernel-based version of penalized PLS, which is especially useful when the number of predictors exceeds the number of observations (p >> n).
penalized.pls.select: Computes the regression coefficients of a PPLS model with block-wise selection, where each component is restricted to variables from a single block.
Usage
new.penalized.pls(ppls, Xtest, ytest = NULL)
penalized.pls(
  X,
  y,
  P = NULL,
  ncomp = NULL,
  kernel = FALSE,
  scale = FALSE,
  blocks = 1:ncol(X),
  select = FALSE
)
penalized.pls.cv(
  X,
  y,
  P = NULL,
  lambda = 1,
  ncomp = NULL,
  k = 5,
  kernel = FALSE,
  scale = FALSE
)
penalized.pls.default(X, y, M = NULL, ncomp)
penalized.pls.kernel(X, y, M = NULL, ncomp)
penalized.pls.select(X, y, M = NULL, ncomp, blocks)
Arguments
ppls: A fitted penalized PLS model, as returned by penalized.pls.
Xtest: A numeric matrix of new input data for prediction.
ytest: Optional. A numeric response vector corresponding to Xtest; if supplied, the MSE is computed for each number of components.
X: A numeric matrix of centered (and optionally scaled) predictor variables.
y: A centered numeric response vector.
P: Optional penalty matrix. If NULL, no penalization is applied and ordinary PLS is computed.
ncomp: Integer. Number of PLS components to compute.
kernel: Logical. If TRUE, the kernel-based algorithm is used. Default is FALSE.
scale: Logical. If TRUE, the columns of X are scaled to unit variance. Default is FALSE.
blocks: An integer vector of length ncol(X) assigning each predictor variable to a block. Default is 1:ncol(X), i.e. each variable forms its own block.
select: Logical. If TRUE, block-wise variable selection is performed. Default is FALSE.
lambda: A numeric vector of candidate penalty parameters. Default is 1.
k: Integer. Number of cross-validation folds. Default is 5.
M: Optional penalty transformation matrix, typically M = (I + P)^{-1}. If NULL, no penalization is applied.
Details
The fitted model ppls contains intercepts and regression coefficients for each number of components (from 1 to ncomp). The function computes the matrix of predicted values (one column per number of components) and, if ytest is provided, a vector of mean squared errors for each number of components.
The prediction is performed as
\hat{y}^{(i)} = X_\text{test} \cdot \beta^{(i)} + \text{intercept}^{(i)},
for each number of components i = 1, \ldots, ncomp.
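This can be reproduced by hand from the intercepts and coefficients listed under Value. The following minimal sketch assumes a fitted object ppls.fit (hypothetical name) returned by penalized.pls, a numeric test matrix Xtest, and an arbitrary choice of i:

i <- 2                                         # hypothetical number of components
beta.i <- ppls.fit$coefficients[, i]           # coefficients for i components
intercept.i <- ppls.fit$intercept[i]           # matching intercept
yhat.i <- as.vector(Xtest %*% beta.i + intercept.i)
# should agree with new.penalized.pls(ppls.fit, Xtest)$ypred[, i]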
This function centers X and y, and optionally scales X, then computes the PPLS components using either the classical NIPALS algorithm (kernel = FALSE) or the kernel representation (kernel = TRUE), which is often faster when p > n (the high-dimensional case).
When a penalty matrix P is supplied, the transformation M = (I + P)^{-1} is computed internally. The algorithm then maximizes the penalized covariance between Xw and y:
\text{argmax}_w \; \text{Cov}(Xw, y)^2 - \lambda \cdot w^\top P w
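A rough sketch of this transformation and of the resulting weight direction for the first component (hypothetical variable names; X and y are assumed to be centered, and P to be a p x p penalty matrix that, as in the Examples below, already contains the factor lambda):

p <- ncol(X)
M <- solve(diag(p) + P)        # M = (I + P)^{-1}
w <- M %*% crossprod(X, y)     # penalized weight direction, proportional to M X' y
w <- w / sqrt(sum(w^2))        # normalize
t1 <- X %*% w                  # first latent component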
The block-wise selection strategy (when select = TRUE) restricts the weight vector w at each iteration to be non-zero within a single block, which is selected greedily.
The function splits the data into k cross-validation folds and, for each value of lambda and each number of components up to ncomp, computes the mean squared prediction error.
The optimal parameters are selected as those minimizing the prediction error across all folds. Internally, for each fold and lambda value, the function calls penalized.pls to fit the model and new.penalized.pls to evaluate predictions, as sketched below.
The returned object can be further used for statistical inference (e.g., via jackknife) or prediction.
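A simplified version of the cross-validation loop (a sketch only, with a hypothetical random fold assignment; P is assumed to be a non-NULL penalty matrix, and the actual function additionally stores the jackknife coefficients):

folds <- sample(rep(1:k, length.out = nrow(X)))        # hypothetical fold assignment
error.cv <- matrix(0, nrow = length(lambda), ncol = ncomp)
for (i in seq_along(lambda)) {
  for (fold in 1:k) {
    train <- folds != fold
    fit  <- penalized.pls(X[train, ], y[train], P = lambda[i] * P, ncomp = ncomp)
    pred <- new.penalized.pls(fit, X[!train, ], y[!train])
    error.cv[i, ] <- error.cv[i, ] + pred$mse / k      # average MSE over folds
  }
}
which(error.cv == min(error.cv), arr.ind = TRUE)       # row: lambda index, column: ncomp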
The method is based on iteratively computing latent directions that maximize the covariance with the response y. At each step:
a weight vector w is computed as w = M X^\top y (if penalization is used),
the latent component t = X w is extracted and normalized,
the matrix X is deflated orthogonally with respect to t.
The final regression coefficients are computed via a triangular system based on the bidiagonal matrix R = T^\top X W, by back-solving:
\beta = W L (T^\top y),
where L = R^{-1}.
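A rough sketch of this iteration (not the package code; names are hypothetical, and X and y are assumed to be centered):

pls.nipals.sketch <- function(X, y, M = NULL, ncomp) {
  W  <- matrix(0, ncol(X), ncomp)   # weight vectors
  TT <- matrix(0, nrow(X), ncomp)   # latent components
  Xd <- X                           # working (deflated) copy of X
  for (i in 1:ncomp) {
    w <- crossprod(Xd, y)                   # w = X' y
    if (!is.null(M)) w <- M %*% w           # w = M X' y when a penalty is used
    tt <- Xd %*% w
    tt <- tt / sqrt(sum(tt^2))              # normalize the latent component
    W[, i]  <- w
    TT[, i] <- tt
    Xd <- Xd - tt %*% crossprod(tt, Xd)     # orthogonal deflation of X
  }
  R <- crossprod(TT, X %*% W)               # bidiagonal matrix R = T' X W
  W %*% backsolve(R, crossprod(TT, y))      # beta = W R^{-1} T' y
}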
The kernel PPLS algorithm is based on representing the model in terms of the Gram matrix K = X M X^\top (or simply K = X X^\top if M = NULL). The algorithm iteratively computes orthogonal latent components t_i in sample space:
initialize the residual vector u = y, then normalize t = Ku,
orthogonalize t with respect to the previous components (if needed),
repeat for ncomp components.
The regression coefficients are recovered as
\beta = X^\top A, \quad \text{where } A = UU \, L \, (T^\top y),
with UU and TT the matrices of latent vectors and components, and L = R^{-1} the back-solved triangular system.
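A minimal sketch of this representation for the unpenalized case K = X X^\top (hypothetical names; with a penalty, K = X M X^\top is used instead; taking u as the residual of y on the previous components is one concrete way to realize the iteration above):

pls.kernel.sketch <- function(X, y, ncomp) {
  K  <- tcrossprod(X)                 # Gram matrix K = X X'
  UU <- matrix(0, nrow(X), ncomp)     # latent vectors u
  TT <- matrix(0, nrow(X), ncomp)     # latent components t
  for (i in 1:ncomp) {
    u <- y
    if (i > 1) {
      Ti <- TT[, 1:(i - 1), drop = FALSE]
      u  <- u - Ti %*% crossprod(Ti, y)            # residual of y on previous components
    }
    tt <- K %*% u
    if (i > 1) tt <- tt - Ti %*% crossprod(Ti, tt) # orthogonalize against previous components
    tt <- tt / sqrt(sum(tt^2))                     # normalize
    UU[, i] <- u
    TT[, i] <- tt
  }
  R <- crossprod(TT, K %*% UU)             # triangular system R = T' K U
  A <- UU %*% solve(R, crossprod(TT, y))   # A = U R^{-1} T' y
  crossprod(X, A)                          # beta = X' A
}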
This function implements a sparse selection strategy inspired by sparse or group PLS. At each component iteration, it computes the penalized covariance between X and y and selects the block k whose variables have the largest mean squared weight:
\text{score}_k = \frac{1}{|B_k|} \sum_{j \in B_k} w_j^2
Only the weights corresponding to the selected block are retained, and all others are set to zero. The rest of the algorithm follows the classical NIPALS-like PLS with orthogonal deflation.
This procedure enhances interpretability by selecting only one block per component, making it suitable for structured variable selection (e.g., grouped predictors).
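The selection step can be sketched as follows (illustration only; the weight vector w and the block membership vector blocks are hypothetical stand-ins for the quantities used inside the algorithm):

w <- rnorm(40)                                        # hypothetical weight vector
blocks <- rep(1:4, each = 10)                         # hypothetical block membership
block.scores <- tapply(w^2, blocks, mean)             # mean squared weight per block
k.star <- as.integer(names(which.max(block.scores)))  # block with the largest score
w[blocks != k.star] <- 0                              # zero out all other blocks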
Value
For new.penalized.pls, a list containing:
- ypred: A numeric matrix of predicted responses. Each column corresponds to a different number of PLS components.
- mse: A numeric vector of mean squared errors if ytest is provided; otherwise NULL.
For penalized.pls, a list with components:
- intercept: A numeric vector of intercepts for 1 to ncomp components.
- coefficients: A numeric matrix of size ncol(X) x ncomp, each column being the coefficient vector for the corresponding number of components.
For penalized.pls.cv, an object of class "mypls", a list with the following components:
- error.cv: A matrix of mean squared errors. Rows correspond to different lambda values; columns to different numbers of components.
- lambda: The vector of candidate lambda values.
- lambda.opt: The lambda value giving the minimum cross-validated error.
- index.lambda: The index of lambda.opt in lambda.
- ncomp.opt: The optimal number of PLS components.
- min.ppls: The minimum cross-validated error.
- intercept: Intercept of the optimal model (fitted on the full dataset).
- coefficients: Coefficient vector of the optimal model.
- coefficients.jackknife: An array of dimensions ncol(X) x ncomp x length(lambda) x k, containing the coefficients from each cross-validation split and parameter setting.
For penalized.pls.default, a list with:
- coefficients: A matrix of size ncol(X) x ncomp whose i-th column contains the regression coefficients based on the first i components.
For penalized.pls.kernel, a list with:
- coefficients: A matrix of size ncol(X) x ncomp, containing the estimated regression coefficients for each number of components.
For penalized.pls.select, a list with:
- coefficients: A matrix of size ncol(X) x ncomp, containing the regression coefficients after block-wise selection.
References
N. Kraemer, A.-L. Boulesteix, and G. Tutz (2008). Penalized Partial Least Squares with Applications to B-Spline Transformations and Functional Data. Chemometrics and Intelligent Laboratory Systems, 94(1), 60–69. doi:10.1016/j.chemolab.2008.06.009
See Also
penalized.pls, penalized.pls.cv, new.penalized.pls, penalized.pls.default, penalized.pls.kernel, jack.ppls, ppls.splines.cv, Penalty.matrix, normalize.vector
Examples
## Predicting new data with new.penalized.pls
set.seed(123)
X <- matrix(rnorm(50 * 200), ncol = 50)
y <- rnorm(200)
Xtrain <- X[1:100, ]
ytrain <- y[1:100]
Xtest <- X[101:200, ]
ytest <- y[101:200]
pen.pls <- penalized.pls(Xtrain, ytrain, ncomp = 10)
pred <- new.penalized.pls(pen.pls, Xtest, ytest)
head(pred$ypred)
pred$mse
## Example from Kraemer et al. (2008)
data(BOD)
X <- BOD[, 1]
y <- BOD[, 2]
Xtest <- seq(min(X), max(X), length = 200)
dummy <- X2s(X, Xtest, deg = 3, nknot = 20) # Spline transformation
Z <- dummy$Z
Ztest <- dummy$Ztest
size <- dummy$sizeZ
P <- Penalty.matrix(size, order = 2)
lambda <- 200
number.comp <- 3
ppls <- penalized.pls(Z, y, P = lambda * P, ncomp = number.comp)
new.ppls <- new.penalized.pls(ppls, Ztest)$ypred
# Plot fitted values for 2 components
plot(X, y, lwd = 3, xlim = range(Xtest))
lines(Xtest, new.ppls[, 2], col = "blue")
## Cross-validation with penalized.pls.cv
set.seed(42)
X <- matrix(rnorm(20 * 100), ncol = 20)
y <- rnorm(100)
# Example with no penalty
result <- penalized.pls.cv(X, y, lambda = c(0, 1, 10), ncomp = 5)
result$lambda.opt
result$ncomp.opt
result$min.ppls
# Using jackknife estimation after CV
jack <- jack.ppls(result)
coef(jack)
## NIPALS-based fit with penalized.pls.default
set.seed(123)
X <- matrix(rnorm(20 * 50), nrow = 50)
y <- rnorm(50)
M <- diag(ncol(X)) # No penalty
coef <- penalized.pls.default(scale(X, TRUE, FALSE), scale(y, TRUE, FALSE),
M, ncomp = 3)$coefficients
coef[, 1] # coefficients for 1st component
## Kernel-based fit with penalized.pls.kernel
set.seed(123)
X <- matrix(rnorm(100 * 10), nrow = 100)
y <- rnorm(100)
K <- X %*% t(X)  # Gram matrix; the kernel algorithm works with K = X M X' internally
coef <- penalized.pls.kernel(X, y, M = NULL, ncomp = 2)$coefficients
head(coef[, 1]) # coefficients for 1st component
## Block-wise selection with penalized.pls.select
set.seed(321)
X <- matrix(rnorm(40 * 30), ncol = 40)
y <- rnorm(30)
# Define 4 blocks of 10 variables each
blocks <- rep(1:4, each = 10)
result <- penalized.pls.select(X, y, M = NULL, ncomp = 2, blocks = blocks)
result$coefficients[, 1] # Coefficients for first component