VanValen {smsets} | R Documentation |
van Valen's test
Description
Computes van Valen's test for comparing the variation in two multivariate samples. The comparison is based on the distances of each individual's standardized values from the corresponding sample medians of the standardized variables. This yields two sets of pooled distances, one per sample, whose means are then compared with a two-sample t-test.
Usage
VanValen(x, group, level1, alternative = "two.sided", var.equal = FALSE)
Arguments
x |
a data frame with one two-level factor and p response variables. |
group |
two-level factor defining groups. It must be one of the columns in x. |
level1 |
a character string identifying Sample 1. The string must be one of the factor levels in group. |
alternative |
a character string specifying the alternative hypothesis in the t-test for the comparison of mean pooled distances. Must be one of "two.sided" (default), "greater" or "less". |
var.equal |
a logical variable indicating whether to treat the two variances of pooled distances as being equal. The default is FALSE. |
Details
To ensure that all variables are given equal weight, van Valen's test first standardizes each variable so that its mean is zero and its variance is one for both samples combined. The pooled distances are then calculated as
d_{ij} = \sqrt{\sum_{k = 1}^{p}{(x_{ijk}-M_{jk})^2}}
where x_{ijk} is the value of the standardized variable X_{k} for the i-th individual in sample j, and M_{jk} is the median of the same standardized variable in the j-th sample.
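The calculation of the pooled distances can be sketched in base R. This is a minimal illustration and not necessarily how VanValen() is implemented: it assumes that scale() (which divides by the sample standard deviation) matches the standardization intended here, and it uses two species from R's built-in iris data purely as a stand-in for a two-sample data set. The helper name pooled_dist is hypothetical.

```r
# Illustrative only: two species from the built-in iris data stand in
# for a two-sample multivariate data set.
dat <- droplevels(subset(iris, Species %in% c("setosa", "versicolor")))
vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Standardize each variable over both samples combined (mean 0, variance 1).
z <- scale(dat[, vars])

# Split the standardized rows by sample.
z_by_group <- split(as.data.frame(z), dat$Species)

# Hypothetical helper: d_ij is the Euclidean distance of each individual
# from the medians of its own sample.
pooled_dist <- function(zj) {
  med <- apply(zj, 2, median)          # M_jk, one median per variable
  dev <- sweep(as.matrix(zj), 2, med)  # x_ijk - M_jk
  sqrt(rowSums(dev^2))
}
d <- lapply(z_by_group, pooled_dist)

# Mean pooled distance per sample.
sapply(d, mean)
```

Each element of d holds one distance per individual, so the two vectors can be passed directly to a two-sample t-test.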
The sample means of the d_{ij} values are compared with a t-test. If one sample is more variable than the other, then the mean d_{ij} value will tend to be higher in that sample. The expression for d_{ij} in van Valen's test rests on an implicit assumption: if the two samples differ, then one sample is more variable than the other for all variables. A significant result cannot be expected when, for example, X_1 and X_2 are more variable in sample 1 but X_3 and X_4 are more variable in sample 2, because the effects of the differing variances then tend to cancel out in the calculation of d_{ij}. Thus, van Valen's test is not appropriate when changes in the level of variation are not expected to be consistent across all variables.
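The cancellation effect described above can be illustrated with a small simulation. This is a sketch under stated assumptions: the data are simulated (not real), the seed and sample size are arbitrary, the helper pooled_dist is hypothetical, and base R's t.test() is assumed to be a reasonable stand-in for the t-test applied to the pooled distances.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
n <- 200     # arbitrary sample size per group

# Simulated data: X1 and X2 are more variable in sample 1,
# X3 and X4 are more variable in sample 2.
s1 <- cbind(matrix(rnorm(n * 2, sd = 2), n), matrix(rnorm(n * 2, sd = 1), n))
s2 <- cbind(matrix(rnorm(n * 2, sd = 1), n), matrix(rnorm(n * 2, sd = 2), n))
group <- factor(rep(c("s1", "s2"), each = n))

# Standardize over both samples combined, then compute d_ij per sample.
z <- scale(rbind(s1, s2))
pooled_dist <- function(zj) {
  dev <- sweep(zj, 2, apply(zj, 2, median))  # deviations from sample medians
  sqrt(rowSums(dev^2))
}
d1 <- pooled_dist(z[group == "s1", , drop = FALSE])
d2 <- pooled_dist(z[group == "s2", , drop = FALSE])

# The two mean distances are expected to be close, even though every
# variable differs in spread between the samples.
c(mean(d1), mean(d2))

# Two-sample t-test on the pooled distances (Welch version).
t.test(d1, d2, alternative = "two.sided", var.equal = FALSE)
```

Because the opposing differences in spread contribute symmetrically to d_{ij}, the test has little chance of detecting them in a setting like this one.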
Value
Returns an object of class "VanValen", a list containing the following components:
name | A character string describing the function. |
std.data | A list with two data frames, matlevel1 and matlevel2, containing the values of the standardized variables for samples 1 and 2, respectively. |
medians.std | A list containing two vectors. The first vector, medians.std1, contains the medians of all standardized variables in sample 1, as declared in parameter level1, and the second vector, medians.std2, holds the corresponding medians for the other sample. |
dev.median | A list with two data frames, dev.median1 and dev.median2, containing the deviations from sample medians for samples 1 and 2, respectively. |
d.list | A list with two data frames, d.level1 and d.level2, containing the pooled distances of standardized variables from their corresponding medians for samples 1 and 2, respectively. |
means.d | A named numeric vector carrying the mean pooled distances for samples 1 and 2, respectively. |
vars.d | A named numeric vector carrying the variances of pooled distances for samples 1 and 2, respectively. |
t.vec | A named numeric vector containing the t-statistic, the degrees of freedom and the p-value for the test, respectively. |
alternative | A character string specifying the alternative hypothesis chosen. |
var.equal | A logical variable indicating whether the two variances were treated as being equal (TRUE) or not (FALSE). |
group | A character string specifying the name of the two-level factor defining groups. |
levels.group | A vector of length two, showing the two levels in factor group. |
data.name | A character string giving the name of the data. |
variables | A character string vector containing the variable names. |
data | The data frame analyzed. |
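Individual components can be extracted from the returned list with the usual $ operator. The following sketch assumes that the smsets package is installed and that it provides the sparrows data set used in the Examples; the guard variable has_smsets is hypothetical, and the code is skipped when the package is not available.

```r
# Runs only if the smsets package is installed.
has_smsets <- requireNamespace("smsets", quietly = TRUE)
if (has_smsets) {
  data("sparrows", package = "smsets")
  res <- smsets::VanValen(sparrows, "Survivorship", "S",
                          alternative = "less", var.equal = TRUE)
  print(res$means.d)  # mean pooled distance per sample
  print(res$vars.d)   # variance of pooled distances per sample
  print(res$t.vec)    # t-statistic, degrees of freedom and p-value
}
```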
Author(s)
Jorge Navarro Alberto, ganava4@gmail.com
References
Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edn. CRC Press.
van Valen, L. (1978) The statistics of variation. Evolutionary Theory 4: 33-43. (Erratum Evolutionary Theory 4: 202.)
Examples
library(smsets)
data(sparrows)
res.VanValen <- VanValen(sparrows, "Survivorship", "S",
                         alternative = "less", var.equal = TRUE)
# Brief output
res.VanValen