HetseqDoubleML {HetSeq} | R Documentation |
Heterogeneity-seq: Classifying cellular response by gene expression values including causal inference by DoubleML
Description
Classifies the cellular response of control cells using the expression of single genes (plus additional informative features) to identify the features with the strongest predictive power, and applies causal inference via a DoubleML approach.
Usage
HetseqDoubleML(
  object,
  trajectories,
  score.group = NULL,
  score.name = NULL,
  quantiles = c(0.25, 0.75),
  compareGroups = c("Low", "High"),
  posClass = NULL,
  basefeatures = NULL,
  genes = NULL,
  background = NULL,
  assay = NULL,
  split = NULL,
  cross = 10,
  num_cores = 1
)
Arguments
object |
Seurat object |
trajectories |
Matrix of cell-cell trajectories. Columns represent time points, rows represent trajectories of connected cells over time points. |
score.group |
A named vector of response groups. Names represent cells, and values represent the score groups. If score.group is not set, the score.name and quantiles parameters must be set to define the score groups. |
score.name |
The name of a numeric Seurat meta data column, which will be used to calculate score groups. Only used if no score.group is given. |
quantiles |
Quantile thresholds applied to the score.name meta data column to define three response groups: Low, Middle, and High. |
compareGroups |
The two score groups to compare. Default: Low vs. High. |
posClass |
The positive class for classification. |
basefeatures |
Additional informative features to include in the classification. Must be meta data available in the Seurat object. |
genes |
Vector of genes to test. |
background |
A set of genes that will be considered as potential confounding factors in the DoubleML analysis. Must contain all genes set in the genes parameter. By default, all genes are used. |
assay |
The name of the Seurat assay to perform Heterogeneity-seq on. If NULL, the default assay will be used. |
split |
Fraction defining the training-test data split. Must be in [0,1]. |
cross |
Number of cross-validations. |
num_cores |
The number of cores used in parallel processing. |
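As a rough illustration of how score.name and quantiles could translate into the three score groups, the following base R sketch derives Low, Middle, and High labels from a numeric score and stores them as a named vector in the shape expected by score.group. The scores and cell names are made up, and the handling of values that fall exactly on a threshold is an assumption, not a statement about the package internals.

```r
# Hypothetical per-cell scores; the cell names are illustrative only.
set.seed(1)
score <- setNames(runif(20), paste0("cell", 1:20))

# Quantile thresholds, using the documented default quantiles = c(0.25, 0.75).
thr <- quantile(score, probs = c(0.25, 0.75))

# Partition the scores into three groups. Assumption: values equal to a
# threshold fall into the lower group (right-closed intervals).
group <- cut(score, breaks = c(-Inf, thr, Inf),
             labels = c("Low", "Middle", "High"))

# Named character vector: names are cells, values are score groups.
score.group <- setNames(as.character(group), names(score))

table(score.group)
```

A vector built this way could be passed as score.group when custom grouping is preferred over the score.name and quantiles route.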
Value
Table of log2FC and AUC values for each gene and an additional AUC value for the baseline features.
Examples
# Full vignette available on https://grandr.erhard-lab.de/articles/web/hetseq.html
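# Score groups derived from a numeric meta data column
res <- HetseqDoubleML(data, trajectories, score.name = "score")
# Score groups supplied directly as a named vector
res <- HetseqDoubleML(data, trajectories, score.group = group_vector,
                      compareGroups = c("Weak", "Strong"))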
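The layout described for the trajectories argument (one row per trajectory, one column per time point) can be sketched in base R as follows. The cell names and time point labels are hypothetical, and the use of cell names as matrix entries is an assumption made for illustration; the linked vignette shows how trajectories are obtained in practice.

```r
# Hypothetical trajectory matrix: each row links one cell at the first
# time point to its connected cell at the second time point.
traj <- cbind(
  "0h" = c("cellA_0h", "cellB_0h", "cellC_0h"),
  "2h" = c("cellA_2h", "cellB_2h", "cellC_2h")
)

# Rows are trajectories, columns are time points.
dim(traj)
colnames(traj)
```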