snQTL_test_corrnet {snQTL} | R Documentation |
Spectral network quantitative trait loci (snQTL) test
Description
A spectral framework to detect network QTLs affecting co-expression networks. This is the main function for the snQTL test.
Given a list of expression data matrices from samples with different genotypes, we test whether there are significant differences among the three co-expression networks. Statistically, we consider the hypothesis testing task:
H_0: N_A = N_B = N_H,
where A, B, H refer to different genotypes, and N refers to the adjacency matrix of the corresponding co-expression network.
We provide four options for the test statistic, each composed of sparse matrix/tensor eigenvalues. We perform a permutation test to obtain the empirical p-value for the hypothesis test.
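The permutation idea can be illustrated with a minimal base-R sketch. This is not the package's internal code: the statistic `toy_stat` is a hypothetical stand-in for the spectral statistics, and the p-value convention used here (the proportion of permuted statistics at least as large as the observed one) is an assumption.

```r
# Sketch of a permutation p-value; toy_stat is a hypothetical stand-in
# for the snQTL statistics and is NOT part of the snQTL package.
set.seed(1)
group <- rep(c("A", "B", "H"), times = c(20, 20, 20))  # genotype labels
x <- matrix(rnorm(60 * 5), nrow = 60)                   # 60 samples, 5 genes

# Toy statistic (assumption): largest absolute entry among all pairwise
# differences of the group-wise Pearson correlation matrices.
toy_stat <- function(x, group) {
  nets <- lapply(split(seq_len(nrow(x)), group),
                 function(idx) cor(x[idx, , drop = FALSE]))
  pairs <- combn(length(nets), 2)
  max(apply(pairs, 2,
            function(ij) max(abs(nets[[ij[1]]] - nets[[ij[2]]]))))
}

obs <- toy_stat(x, group)   # statistic on the non-permuted labels

# Recompute the statistic after shuffling the group labels,
# using one seed per permutation.
npermute <- 50
perm <- vapply(seq_len(npermute), function(b) {
  set.seed(b)
  toy_stat(x, sample(group))
}, numeric(1))

# Assumed convention: share of permuted statistics >= observed statistic.
emp_p <- mean(perm >= obs)
emp_p
```

Shuffling the labels breaks any association between genotype and co-expression, so the permuted statistics approximate the null distribution of the chosen statistic.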
NOTE: This function is also applicable to the generalized case of comparing multiple (K > 3) biological networks. Instead of separating the samples by genotype, users can separate the samples into K groups based on other metrics of interest, e.g., locations or treatments. The generalized hypothesis testing problem becomes
H_0: N_1 = ... = N_K,
where N_k refers to the correlation-based network corresponding to group k.
For consistency, we stick with the original genotype-based setting in this help document.
See the GitHub manual at https://github.com/Marchhu36/snQTL for details and examples of the generalization.
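For the generalized setting, the input list can be built by splitting one sample-by-gene matrix on a grouping variable. The sketch below uses base R only; the names `expr`, `treatment`, and `exp_list_k` are hypothetical, and the commented-out call simply reuses arguments from the Usage section.

```r
# Sketch: build a K-group exp_list from a single expression matrix.
# expr, treatment, and exp_list_k are illustrative names (assumptions).
set.seed(2)
n <- 80
p <- 10
expr <- matrix(rnorm(n * p), nrow = n)   # n samples by p genes
treatment <- factor(rep(c("ctrl", "low", "mid", "high"), each = 20))

# Split the row indices by group, then subset the rows of expr.
exp_list_k <- unname(lapply(split(seq_len(n), treatment),
                            function(idx) expr[idx, , drop = FALSE]))

sapply(exp_list_k, dim)   # one column per group: rows n_k, columns p

# With the snQTL package installed, the list would then be passed on, e.g.:
# result <- snQTL_test_corrnet(exp_list = exp_list_k, method = "sum",
#                              npermute = 30, seeds = 1:30)
```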
Usage
snQTL_test_corrnet(
  exp_list,
  method = c("sum", "sum_square", "max", "tensor"),
  npermute = 100,
  seeds = 1:100,
  stats_seed = NULL,
  rho = 1000,
  sumabs = 0.2,
  niter = 20,
  trace = FALSE,
  adj.beta = -1,
  tensor_iter = 20,
  tensor_tol = 10^(-3),
  trans = FALSE,
  location = NULL
)
Arguments
exp_list | list, a list of expression data from samples with different genotypes; the dimensions of the data matrices are n1-by-p, n2-by-p, and n3-by-p, respectively; see "details" |
method | character, the choice of test statistic; see "details" |
npermute | number, the number of permutations used to obtain the empirical p-value |
seeds | vector, the random seeds for permutation; length of the vector is equal to the |
stats_seed | number, the random seed for the test statistic calculation with non-permuted data |
rho | number, a large positive constant added to the diagonal elements to ensure positive definiteness in the symmetric matrix spectral decomposition |
sumabs | number, specifies the sparsity level in the matrix/tensor eigenvector; |
niter | integer, the number of iterations to use in the PMD algorithm (see |
trace | logical, whether to trace the progress of the PMD algorithm (see |
adj.beta | number, the power transformation applied to the correlation matrices (see |
tensor_iter | integer, the maximum number of iterations in the SSTD algorithm (see |
tensor_tol | number, a small positive constant for the error difference that indicates SSTD convergence (see |
trans | logical, whether to consider only the trans-correlation (between genes from two different chromosomes or regions); see "details" |
location | vector, the (chromosome) locations for genes if |
Details
In exp_list, the data matrices are usually ordered by the marker's genotypes AA, BB, and AB.
The expression data are usually normalized. We use the expression data to generate Pearson correlation co-expression networks.
Given the list of co-expression networks, we generate pairwise differential networks
D_{AB} = N_A - N_B, D_{AH} = N_H - N_A, D_{BH} = N_H - N_B.
We use pairwise differential networks to generate the snQTL test statistics.
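As a minimal sketch of this step, the pairwise differential networks can be formed in base R from plain Pearson correlation matrices. The list names `A`, `B`, `H` are illustrative, and the sketch assumes no power transformation (`adj.beta`) and no trans masking are applied.

```r
# Sketch: Pearson co-expression networks and their pairwise differences.
# Assumes plain correlations (no adj.beta transformation, no trans mask).
set.seed(3)
p <- 6
exp_list <- list(A = matrix(rnorm(30 * p), nrow = 30),
                 B = matrix(rnorm(40 * p), nrow = 40),
                 H = matrix(rnorm(50 * p), nrow = 50))

# One p-by-p correlation matrix per genotype.
nets <- lapply(exp_list, cor)

# Pairwise differential networks, with the signs used in the text above.
D_AB <- nets$A - nets$B
D_AH <- nets$H - nets$A
D_BH <- nets$H - nets$B

round(D_AB, 2)
```

Each differential network is symmetric with a zero diagonal, since every correlation matrix has ones on its diagonal.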
We provide four options for the test statistic, selected by method:
sum, the sum of the sparse leading matrix eigenvalues (sLMEs) of all pairwise differential networks:
Stat_sum = \lambda(D_{AB}) + \lambda(D_{AH}) + \lambda(D_{BH}),
where \lambda refers to the sLME operation with the sparsity level set by sumabs.
sum_square, the sum of squared sLMEs:
Stat_sumsquare = \lambda^2(D_{AB}) + \lambda^2(D_{AH}) + \lambda^2(D_{BH}).
max, the maximum of the sLMEs:
Stat_max = \max(\lambda(D_{AB}), \lambda(D_{AH}), \lambda(D_{BH})).
tensor, the sparse leading tensor eigenvalue (sLTE) of the differential tensor:
Stat_tensor = \Lambda(\mathcal{D}),
where \Lambda refers to the sLTE operation with the sparsity level set by sumabs, and \mathcal{D} is the differential tensor composed by stacking the three pairwise differential networks.
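The three matrix-based statistics combine the same per-network eigenvalues in different ways, which the following base-R sketch illustrates. It is not the package's computation: `lead_eig` is a hypothetical helper that uses the dense eigenvalue of largest magnitude as a stand-in for the sparse sLME, so `sumabs` plays no role here, and the sLTE itself is not computed (only the stacking into a tensor is shown).

```r
# Sketch: combining leading eigenvalues into the sum / sum_square / max
# statistics. lead_eig is a dense stand-in (assumption), not the sLME.
set.seed(4)
p <- 6

# Random symmetric matrices with zero diagonal, standing in for
# the differential networks D_AB, D_AH, D_BH.
sym_noise <- function(p) {
  m <- matrix(rnorm(p * p), p, p)
  d <- (m + t(m)) / 2
  diag(d) <- 0
  d
}
diffs <- list(D_AB = sym_noise(p), D_AH = sym_noise(p), D_BH = sym_noise(p))

# Dense leading eigenvalue in absolute value (no sparsity constraint).
lead_eig <- function(D) {
  max(abs(eigen(D, symmetric = TRUE, only.values = TRUE)$values))
}
lambda <- vapply(diffs, lead_eig, numeric(1))

stat_sum        <- sum(lambda)     # analogue of Stat_sum
stat_sum_square <- sum(lambda^2)   # analogue of Stat_sumsquare
stat_max        <- max(lambda)     # analogue of Stat_max

# Differential tensor: stack the three p-by-p matrices into p x p x 3.
D_tensor <- array(unlist(diffs), dim = c(p, p, 3))
dim(D_tensor)
```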
Additionally, if trans = TRUE, we only consider the trans-correlation between genes from two different chromosomes or regions in the co-expression networks.
The entries of the correlation matrices are set to N_{ij} = 0 if gene i and gene j are from the same chromosome or region.
The gene location information is required if trans = TRUE.
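The trans-only masking can be sketched in base R as follows. This is an illustration of the rule stated above rather than the package's internal code; `same_region` and `N_trans` are hypothetical names.

```r
# Sketch: zero out correlations between genes in the same region.
# same_region and N_trans are illustrative names (assumptions).
location <- c(1, 1, 2, 2, 2, 3)   # region label for each of 6 genes
set.seed(5)
N <- cor(matrix(rnorm(40 * length(location)), nrow = 40))

# TRUE where gene i and gene j share a region (includes the diagonal).
same_region <- outer(location, location, "==")

N_trans <- N
N_trans[same_region] <- 0   # keep only trans-correlations

round(N_trans, 2)
```

The result is block-structured: entries within a region are zero, and only entries linking different regions keep their correlation values.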
Value
A list containing the following:
method | character, the chosen test statistic, echoed from the input |
res_original | list, the test result for the non-permuted data, including the chosen method, the test statistic, and the decomposition components |
res_permute | list, the test results for each permuted data set, including the chosen method, the test statistic, and the decomposition components |
emp_p_value | number, the empirical p-value from the permutation test |
References
Hu, J., Weber, J. N., Fuess, L. E., Steinel, N. C., Bolnick, D. I., & Wang, M. (2025). A spectral framework to map QTLs affecting joint differential networks of gene co-expression. PLOS Computational Biology, 21(4), e1012953.
Examples
### artificial example
n1 = 50
n2 = 60
n3 = 100
p = 200
location = c(rep(1,20), rep(2, 50), rep(3, 100), rep(4, 30))
## expression data from null
set.seed(0416) # random seeds for example data
exp1 = matrix(rnorm(n1*p, mean = 0, sd = 1), nrow = n1)
exp2 = matrix(rnorm(n2*p, mean = 0, sd = 1), nrow = n2)
exp3 = matrix(rnorm(n3*p, mean = 0, sd = 1), nrow = n3)
exp_list = list(exp1, exp2, exp3)
result = snQTL_test_corrnet(exp_list = exp_list, method = 'tensor',
                            npermute = 30, seeds = 1:30, stats_seed = 0416,
                            trans = TRUE, location = location)
result$emp_p_value