DR.SC_fit {DR.SC}    R Documentation

Joint dimension reduction and spatial clustering

Description

Performs joint dimension reduction and spatial clustering for scRNA-seq and spatial transcriptomics data.

Usage

  DR.SC_fit(X, Adj_sp = NULL, q = 15, K = NULL, error.heter = TRUE, K_set = seq(2, 10),
    beta_grid = seq(0.5, 5, by = 0.5), maxIter = 25, epsLogLik = 1e-5, verbose = FALSE,
    maxIter_ICM = 6, pen.const = 1, wpca.int = FALSE, parallel = 'parallel', num_core = 5)

Arguments

X

a sparse matrix of class dgCMatrix: the log-normalized gene expression matrix used by the DR-SC model. In the example below it has spots in rows and genes in columns.

Adj_sp

an optional sparse matrix of class dgCMatrix: the adjacency matrix used by the DR-SC model. This argument lets users supply an adjacency matrix of their own definition.

q

a positive integer: the number of latent features to extract. Defaults to 15.

K

a positive integer: the number of clusters. Defaults to NULL, in which case K is selected automatically by the MBIC criterion.

K_set

a vector of positive integers: the candidate numbers of clusters evaluated by MBIC. Defaults to seq(2, 10).

error.heter

an optional logical value: whether to use heterogeneous errors in the DR-SC model. Defaults to TRUE. If error.heter=FALSE, homogeneous errors are used.

beta_grid

an optional vector of positive values: the candidate set for the smoothing parameter, searched by grid search. Defaults to seq(0.5, 5, by=0.5).

maxIter

an optional positive integer: the maximum number of EM iterations. Defaults to 25.

epsLogLik

an optional positive value: the tolerance on the relative change of the observed pseudo log-likelihood. Defaults to 1e-5.

verbose

an optional logical value: whether to print progress information from the ICM-EM algorithm. Defaults to FALSE.

maxIter_ICM

an optional positive integer: the maximum number of ICM iterations. Defaults to 6.

pen.const

an optional positive value: the adjustment constant used in the MBIC criterion. It usually takes a value between 0.1 and 1. Defaults to 1.

wpca.int

an optional logical value: whether to use weighted PCA to obtain the initial values of the loadings and other parameters. Defaults to FALSE, meaning conventional PCA is used.

parallel

an optional string: the parallelization method used when choosing the number of clusters by MBIC. Two options are provided: (1) parallel="parallel" uses the parallel R package; (2) parallel=NULL disables parallel computation.

num_core

an optional positive integer: the number of cores used in parallel computation. Defaults to 5.
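
To illustrate what a user-defined Adj_sp could look like, here is a minimal sketch that builds a 4-neighbour lattice adjacency matrix of class dgCMatrix with the Matrix package. The helper lattice_adj is hypothetical and not part of DR.SC. It assumes that spots are numbered column by column on the grid, in the same order as the rows of X, and it is not claimed to reproduce the neighbourhood definition used by getAdj.

```r
library(Matrix)

# Hypothetical helper (not part of DR.SC): adjacency matrix of a
# height x width lattice where each spot is linked to its up/down/left/right
# neighbours. Spots are assumed to be numbered column by column.
lattice_adj <- function(height, width) {
  n <- height * width
  idx <- matrix(seq_len(n), nrow = height, ncol = width)
  # pairs (spot, spot directly below it)
  down_from <- as.vector(idx[-height, , drop = FALSE])
  down_to <- as.vector(idx[-1, , drop = FALSE])
  # pairs (spot, spot directly to its right)
  right_from <- as.vector(idx[, -width, drop = FALSE])
  right_to <- as.vector(idx[, -1, drop = FALSE])
  i <- c(down_from, right_from)
  j <- c(down_to, right_to)
  # store every edge in both directions so the matrix is symmetric
  sparseMatrix(i = c(i, j), j = c(j, i), x = 1, dims = c(n, n))
}

Adj_sp <- lattice_adj(height = 10, width = 10)
class(Adj_sp)
```

A matrix built this way could then be passed as the Adj_sp argument, provided its row order matches the rows of X.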

Details

Nothing

Value

DR.SC_fit returns a list with the following components:

cluster

inferred class labels.

hZ

extracted latent features.

beta

estimated smoothing parameter.

Mu

mean vectors of the mixture components.

Sigma

covariance matrices of the mixture components.

W

estimated loading matrix.

Lam_vec

estimated error variances in the probabilistic PCA model.

loglik

observed pseudo log-likelihood.

Note

nothing

Author(s)

Wei Liu

References

None

See Also

None

Examples

## Generate spatial transcriptomics data with a lattice neighborhood, i.e. the ST platform.
seu <- gendata_RNAExp(height=10, width=10, p=50, K=4)
library(Seurat)
seu <- NormalizeData(seu)
# choose 40 variable features using Seurat
seu <- FindVariableFeatures(seu, nfeatures = 40)
# define the adjacency matrix
Adj_sp <- getAdj(seu, platform = 'ST')
var.features <- seu@assays$RNA@var.features
# log-normalize the counts of the variable features and transpose the result
X <- Matrix::t(LogNormalize(seu@assays$RNA@counts[var.features,]))
# maxIter = 2 is used only for illustration; users can keep the default.
drscList <- DR.SC_fit(X, Adj_sp=Adj_sp, K=4, maxIter=2, verbose=TRUE)
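
The call above fixes K, so no search over K_set takes place. When K is left as NULL, the search can run in parallel, and the default num_core=5 may exceed what a small machine offers. The sketch below shows one possible way to pick a value with the base parallel package. The variable names and the "leave one core free" rule are assumptions made for illustration, not package recommendations.

```r
# Number of logical cores reported by the base 'parallel' package (may be NA)
n_detected <- parallel::detectCores()
if (is.na(n_detected)) n_detected <- 1L

# Leave one core free, stay at or below the documented default of 5,
# and never go below 1
num_core <- max(1L, min(5L, n_detected - 1L))
num_core
```

The resulting value could then be supplied as num_core together with K=NULL and parallel='parallel'.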


[Package DR.SC version 2.3 Index]