Dpca {FPCdpca} | R Documentation |
Distributed Principal Component Analysis (DPCA)
Description
Performs distributed PCA on a data matrix partitioned into subsets.
Usage
Dpca(data, K, nk)
Arguments
data | A numeric matrix or data frame containing the data, where rows are observations and columns are variables. |
K | Integer, the number of subsets to partition the data into. |
nk | Integer, the size of each subset (number of rows per subset). |
Details
The function splits the input data matrix into K subsets of size nk each. The parameters n (number of rows) and p (number of columns) are automatically derived from the input data matrix as n = nrow(data) and p = ncol(data).
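The partitioning described above can be sketched in base R. This is only an illustration, not the package's internal code: it assumes that the subsets are consecutive blocks of nk rows (the page does not say how rows are assigned), and the names subset_id, subsets and dims are hypothetical.

```r
# Hypothetical illustration; not the internal code of Dpca().
set.seed(1)
K  <- 4                                   # number of subsets
nk <- 25                                  # rows per subset
data <- matrix(rnorm(K * nk * 3), ncol = 3)

n <- nrow(data)                           # derived from the input
p <- ncol(data)                           # derived from the input

# Assumption: subset k holds rows ((k - 1) * nk + 1) to (k * nk).
subset_id <- rep(seq_len(K), each = nk)
subsets <- lapply(split(seq_len(n), subset_id),
                  function(idx) data[idx, , drop = FALSE])

# Row and column counts of each subset (a 2 x K integer matrix).
dims <- vapply(subsets, dim, integer(2))
```

In this sketch n equals K * nk, as in the example at the end of this page, so every row lands in exactly one subset.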
Value
A list containing:
- MSEXp: Minimum squared reconstruction error.
- MSEvp: MSE of eigenvectors.
- MSESp: MSE of covariance matrix.
- kopt: Optimal subset index.
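The formulas behind these values are not given on this page. As a rough sketch of what a per-subset reconstruction error, a covariance error and a best-subset index could look like, the base R code below makes several assumptions: subsets are consecutive row blocks, d components are retained, and errors are measured against the full data. The names d, blocks, errs and k_best are hypothetical, and the values actually returned by Dpca() may be computed differently.

```r
# Hypothetical illustration; the formulas used by Dpca() may differ.
set.seed(1)
K <- 4; nk <- 50; p <- 5
d <- 2                                    # assumed number of retained components

# Column-centred data and its covariance matrix
X <- scale(matrix(rnorm(K * nk * p), ncol = p), center = TRUE, scale = FALSE)
S_full <- cov(X)

# Assumption: consecutive blocks of nk rows form the subsets.
blocks <- split(seq_len(nrow(X)), rep(seq_len(K), each = nk))

per_subset <- lapply(blocks, function(idx) {
  S_k <- cov(X[idx, , drop = FALSE])      # covariance of subset k
  V_k <- eigen(S_k, symmetric = TRUE)$vectors[, seq_len(d), drop = FALSE]
  X_hat <- X %*% V_k %*% t(V_k)           # full data projected onto subset k's components
  c(recon = mean((X - X_hat)^2),          # mean squared reconstruction error
    cov   = mean((S_full - S_k)^2))       # mean squared covariance difference
})

errs <- do.call(rbind, per_subset)        # K x 2 matrix of errors
k_best <- which.min(errs[, "recon"])      # subset with the smallest reconstruction error
```

An eigenvector error is left out of the sketch because eigenvectors are only defined up to sign, so any comparison would first need a sign-alignment convention.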
Examples
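K <- 20       # number of subsets
nk <- 50      # rows per subset
nr <- 10      # controls how many outlying values are generated (nr * p)
p <- 8        # number of variables
n <- K * nk   # total number of observations
d <- 6        # not passed to Dpca() in this call
# Standard normal values, with the final nr * p entries drawn from Poisson(100)
data <- matrix(c(rnorm((n - nr) * p, 0, 1), rpois(nr * p, 100)), ncol = p)
Dpca(data = data, K = K, nk = nk)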
[Package FPCdpca version 0.3.0 Index]