weighted.median {spatstat.univar} R Documentation

Weighted Median, Quantiles or Variance

Description

Compute the median, quantiles or variance of a set of numbers which have weights associated with them.

Usage

weighted.median(x, w, na.rm = TRUE, type=2, collapse=FALSE)

weighted.quantile(x, w, probs=seq(0,1,0.25), na.rm = TRUE, type=4, collapse=FALSE)

weighted.var(x, w, na.rm = TRUE)

Arguments

x

Data values. A vector of numeric values, for which the median, quantiles or variance are required.

w

Weights. A vector of nonnegative numbers, of the same length as x. If w is missing or NULL, the default weights are all equal to 1.

probs

Probabilities for which the quantiles should be computed. A numeric vector of values between 0 and 1.

na.rm

Logical. Whether to ignore NA values.

type

Integer specifying the rule for calculating the median or quantile, corresponding to the rules available for quantile. The currently available choices are type=1, 2, 3 and 4. See Details.

collapse

Logical value specifying whether duplicated values in x should be pooled (replacing them by a unique x value whose weight is the sum of the associated weights).
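The pooling described for collapse can be illustrated in base R. This is only a sketch of the idea with made-up data; the names ux and uw are hypothetical, and the package's internal code may differ.

```r
# Hypothetical data with repeated values in x
x <- c(3, 1, 3, 2, 1)
w <- c(0.5, 1, 1.5, 2, 1)

# Distinct data values, in increasing order
ux <- sort(unique(x))

# For each distinct value, add up the weights of all its copies
uw <- vapply(ux, function(v) sum(w[x == v]), numeric(1))

# The pooled data are (ux, uw); the total weight is unchanged
data.frame(x = ux, w = uw)
```

Pooling leaves the weighted empirical distribution unchanged, because every copy of a value contributes its weight at the same point.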

Details

The ith observation x[i] is treated as having a weight proportional to w[i].

The weighted sample median is a value m such that the total weight of data less than or equal to m is equal to half the total weight. More generally, the weighted sample quantile with probability p is a value q such that the total weight of data less than or equal to q is equal to p times the total weight.

Define the weighted empirical cumulative distribution function

F(x) = \frac{\sum_{i: x_i \le x} w_i}{\sum_{i=1}^n w_i}

That is, F(x) is the fraction of total weight that is associated with data values x_i less than or equal to the value x.
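As an illustration, the weighted empirical cumulative distribution function can be evaluated directly from this definition in base R. The helper name weighted_ecdf and the data are hypothetical, chosen only for this sketch.

```r
# Weighted empirical CDF at a single point t:
# the fraction of total weight carried by data values <= t
weighted_ecdf <- function(t, x, w = rep(1, length(x))) {
  sum(w[x <= t]) / sum(w)
}

# Hypothetical data and weights
x <- c(1, 2, 3, 4)
w <- c(1, 1, 2, 4)

# Evaluate F at each data value
Fx <- vapply(x, function(t) weighted_ecdf(t, x, w), numeric(1))
Fx
# 0.125 0.250 0.500 1.000
```

Here F(3) = 0.5 exactly, so the value 3 carries half of the total weight at or below it.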

The weighted quantile for probability p is a number q defined so that F(q) = p wherever possible. There are different definitions of the quantile depending on how this should be achieved.

For unweighted data, there are 9 different definitions of the sample median and sample quantile, with slightly different properties. These 9 definitions are explained in the help for quantile.default. The definition to be used is selected by the argument type.

For weighted data, the first 4 of the 9 definitions of sample quantile have been generalised to weighted quantiles. The functions weighted.median and weighted.quantile documented here provide these definitions of weighted sample quantile. The definition to be used is again selected by the argument type.

Suppose the data values have been arranged in increasing order as x_{[1]} \le x_{[2]} \le \ldots \le x_{[n]}. If one of the data values x_{[k]} satisfies F(x_{[k]})=p exactly, then

If there is no data value satisfying F(x_{[k]})=p exactly, then the code finds the two adjacent values x_{[k]} and x_{[k+1]} which satisfy F(x_{[k]}) < p and F(x_{[k+1]}) > p, and defines the quantile as follows:

For very small probabilities p < F(x_{[1]}) the value x_{[1]} is returned. For very large probabilities p > F(x_{[n]}) the value x_{[n]} is returned.

Type 1 is the left-continuous quantile function.
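For orientation, the left-continuous quantile function of a distribution function F is the generalised inverse Q(p) = min{x : F(x) >= p}. The following base R sketch applies that textbook definition to weighted data. The helper name left_quantile is hypothetical, and this is not the package's own code, which may differ in details such as tie handling and numerical tolerance.

```r
# Left-continuous generalised inverse of the weighted empirical CDF:
# the smallest data value whose cumulative weight fraction reaches p
left_quantile <- function(p, x, w = rep(1, length(x))) {
  o  <- order(x)
  xs <- x[o]
  Fs <- cumsum(w[o]) / sum(w)
  k  <- which(Fs >= p)[1]
  # Guard against rounding error when p is (numerically) 1
  if (is.na(k)) k <- length(xs)
  xs[k]
}

# Hypothetical data and weights
x <- c(1, 2, 3, 4)
w <- c(1, 1, 2, 4)

left_quantile(0.5, x, w)  # F(3) = 0.5 exactly, so 3 is returned
left_quantile(0.6, x, w)  # F(3) < 0.6 < F(4), so 4 is returned
left_quantile(0.1, x, w)  # 0.1 < F(1), so the smallest value 1 is returned
```

A function of this kind always returns one of the input data values and never interpolates between them.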

Type 2 is consistent with the traditional definition of the sample median.

Types 1 and 3 always return a value selected from the input data x, while types 2 and 4 sometimes return values that are interpolated between the input data values.

Note that the default settings are different for weighted.median and weighted.quantile.

The implementation of type 3 is experimental and may be changed.

Value

weighted.median returns a numeric value. weighted.quantile returns a numeric vector of the same length as probs.

Author(s)

Adrian Baddeley Adrian.Baddeley@curtin.edu.au.

See Also

quantile, median.

Examples

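  x <- 1:20
  # random weights, so the results differ from run to run
  w <- runif(20)
  # weighted median (default type=2)
  weighted.median(x, w)
  # weighted quartiles (default type=4)
  weighted.quantile(x, w, probs=(0:4)/4)
  # weighted variance
  weighted.var(x, w)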

[Package spatstat.univar version 3.1-4 Index]