SFclust.permute {FKmL} | R Documentation |
Perform Permutation-Based Clustering Evaluation for SFclust
Description
Performs a permutation-based analysis to evaluate clustering results across different values of the \ell_1 norm constraint (s). This function is designed to help determine the most appropriate \ell_1 norm value by comparing the observed clustering outcome with those obtained under random permutations.
The function computes gap statistics for each \ell_1 norm constraint value based on permuted versions of the input distance array, and identifies the optimal s as the one maximizing the gap statistic. Two ggplot objects are returned to visualize the gap patterns.
Usage
SFclust.permute(dist.ary, k, nperms, l1b)
Arguments
dist.ary |
A 3-dimensional distance array representing pairwise distances
between trajectories across multiple variables. Follows the same format used in |
k |
An integer specifying the number of clusters. |
nperms |
An integer specifying the number of permutations to perform. |
l1b |
A numeric vector of \ell_1 norm constraint values to evaluate. |
Details
This function helps assess the robustness of clustering structure and select an optimal level of sparsity.
If any clustering attempt fails (e.g., due to convergence issues or weight update errors), the affected l1b values are reported in failed_l1b and their indices in failed_j.
This function returns two ggplot objects (gapplot.l1b and gapplot.nnz) that can be used to visualize the gap statistics. These are not automatically printed, allowing users to decide when and how to display them.
This function involves random sampling internally. For reproducible results, set the random seed with set.seed() before calling the function.
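The workflow described above (set a seed, run the permutation analysis, check for failures, then display the plots on demand) might look roughly like the sketch below. The call to SFclust.permute() is left commented out because it needs the FKmL package and a real distance array; the argument values shown are purely illustrative. The helper summarise_permute() and the mock list are hypothetical and not part of FKmL. The sketch also assumes that gaps, sdgaps, and nnonzerowss line up element by element with l1bounds, which this page does not state explicitly.

```r
# Intended usage (requires FKmL and a suitable 3-dimensional dist.ary):
# set.seed(123)                       # make the permutations reproducible
# res <- SFclust.permute(dist.ary, k = 3, nperms = 20,
#                        l1b = c(1.5, 2, 2.5, 3))
# print(res$gapplot.l1b)              # plots are only drawn when printed
# print(res$gapplot.nnz)

# Hypothetical helper (not part of FKmL): report failed l1b values and
# tabulate the gap statistics for the values that were processed.
summarise_permute <- function(res) {
  if (length(res$failed_l1b) > 0) {
    message("Clustering failed for l1b = ",
            paste(res$failed_l1b, collapse = ", "),
            " (index ", paste(res$failed_j, collapse = ", "), ")")
  }
  # Assumption: these components are aligned with l1bounds.
  data.frame(l1b     = res$l1bounds,
             gap     = res$gaps,
             sd      = res$sdgaps,
             nonzero = res$nnonzerowss)
}

# Mock result with the documented component names (made-up numbers),
# as if l1b = c(1.5, 2, 2.5, 3) had been requested and 2.5 had failed.
mock <- list(l1bounds    = c(1.5, 2, 3),
             gaps        = c(0.10, 0.35, 0.20),
             sdgaps      = c(0.02, 0.03, 0.05),
             nnonzerowss = c(2, 4, 7),
             bestl1b     = 2,
             failed_j    = 3L,
             failed_l1b  = 2.5)

tab <- summarise_permute(mock)
print(tab)
```

Checking failed_l1b before interpreting the gaps is worthwhile because a failed value is absent from l1bounds, so the plotted curve may have fewer points than the l1b vector that was passed in.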
Value
A list containing the following components:
- totss
A numeric vector of total within-cluster sum of squared distances for each \ell_1 norm value.
- permtotss
A matrix of total sum of squared distances for each permutation and each \ell_1 norm value.
- nnonzerowss
A numeric vector of the number of nonzero weights for each \ell_1 norm value.
- gaps
A numeric vector of gap statistics: the difference between observed and permuted clustering results.
- sdgaps
A numeric vector of standard deviations of the gaps across permutations.
- l1bounds
A vector of \ell_1 norm constraint values that were successfully processed without error.
- bestl1b
The \ell_1 norm constraint value that yielded the largest gap.
- failed_j
Indices of l1b values that caused errors during the clustering process.
- failed_l1b
The actual \ell_1 norm values that caused errors.
- gapplot.l1b
A ggplot object showing the gap statistics plotted against \ell_1 norm constraint values.
- gapplot.nnz
A ggplot object showing the gap statistics plotted against the number of nonzero weights.
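To make the relationship between totss, permtotss, gaps, sdgaps, and bestl1b concrete, the following base R sketch computes a gap statistic from mock numbers. It assumes a log-scale gap (log of the observed criterion minus the mean log of the permuted criterion), which is a common form for permutation-based gap statistics; this page does not confirm that SFclust.permute() uses exactly this formula. The layout of permtotss (one row per permutation, one column per \ell_1 value) is also an assumption, and all values are invented for illustration.

```r
set.seed(1)

l1b    <- c(1.5, 2, 3, 4)      # candidate constraint values (illustrative)
nperms <- 5

# Mock observed criterion, one entry per l1b value.
totss <- c(10, 30, 42, 50)

# Mock permuted criterion: nperms rows, one column per l1b value,
# built from fixed column baselines with small multiplicative noise.
baseline  <- rep(c(9, 15, 30, 44), each = nperms)
permtotss <- matrix(baseline * exp(rnorm(nperms * length(l1b), sd = 0.05)),
                    nrow = nperms)

# Assumed gap definition: observed log criterion minus mean permuted log criterion.
gaps    <- log(totss) - colMeans(log(permtotss))
sdgaps  <- apply(log(permtotss), 2, sd)

# The selected value is the one with the largest gap.
bestl1b <- l1b[which.max(gaps)]

print(data.frame(l1b = l1b, gap = gaps, sd = sdgaps))
print(bestl1b)
```

In this mock setup the observed criterion exceeds its permuted counterpart by the widest margin at the second candidate, so that value is the one selected.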