pcf.ppp {spatstat.explore} | R Documentation |
Pair Correlation Function of Point Pattern
Description
Estimates the pair correlation function of a point pattern using kernel methods.
Usage
## S3 method for class 'ppp'
pcf(X, ..., r = NULL,
adaptive=FALSE,
kernel="epanechnikov", bw=NULL, h=NULL,
bw.args=list(), stoyan=0.15, adjust=1,
correction=c("translate", "Ripley"),
divisor = c("r", "d", "a", "t"),
zerocor=c("weighted", "reflection", "convolution",
"bdrykern", "JonesFoster", "none"),
gref = NULL,
tau = 0,
fast = TRUE,
var.approx = FALSE,
domain=NULL,
ratio=FALSE, close=NULL)
Arguments
X |
A point pattern (object of class |
... |
Arguments passed to |
r |
Optional. Vector of values for the argument |
adaptive |
Logical value specifying whether to use adaptive kernel smoothing
( |
kernel |
Choice of smoothing kernel, passed to |
bw |
Bandwidth for smoothing kernel. Either a single numeric value giving the standard deviation of the kernel, or a character string specifying a bandwidth selection rule, or a function that computes the selected bandwidth. See Details. |
h |
Kernel halfwidth |
bw.args |
Optional. List of additional arguments to be passed to |
stoyan |
Coefficient for Stoyan's bandwidth selection rule; see Details. |
adjust |
Numerical adjustment factor for the bandwidth.
The bandwidth actually used is |
correction |
Edge correction. A character vector specifying the choice (or choices) of edge correction. See Details. |
divisor |
Choice of divisor in the estimation formula:
either |
zerocor |
String (partially matched) specifying a correction for the boundary effect
bias at |
gref |
Optional. A pair correlation function that will be used as the
reference for the transformation to uniformity, when
|
tau |
Optional shrinkage coefficient. A single numeric value. |
fast |
Logical value specifying whether to compute the kernel smoothing
using a Fast Fourier Transform algorithm ( |
var.approx |
Logical value indicating whether to compute an analytic approximation to the variance of the estimated pair correlation. |
domain |
Optional. Calculations will be restricted to this subset of the window. See Details. |
ratio |
Logical.
If |
close |
Advanced use only. Precomputed data. See section on Advanced Use. |
Details
The pair correlation function g(r)
is a summary of the dependence between points in a spatial point
process. The best intuitive interpretation is the following: the probability
p(r)
of finding two points at locations x
and y
separated by a distance r
is equal to
p(r) = \lambda^2 g(r) \,{\rm d}x \, {\rm d}y
where \lambda
is the intensity of the point process.
For a completely random (uniform Poisson) process,
p(r) = \lambda^2 \,{\rm d}x \, {\rm d}y
so g(r) = 1
.
Formally, the pair correlation function of a stationary point process
is defined by
g(r) = \frac{K'(r)}{2\pi r}
where K'(r)
is the derivative of K(r)
, the
reduced second moment function (aka “Ripley's K
function”)
of the point process. See Kest
for information
about K(r)
.
For a stationary Poisson process, the
pair correlation function is identically equal to 1. Values
g(r) < 1
suggest inhibition between points;
values greater than 1 suggest clustering.
This routine computes an estimate of g(r)
by kernel smoothing.
-
If
divisor="r"
(the default), then the standard kernel estimator (Stoyan and Stoyan, 1994, pages 284–285) is used. By default, the recommendations of Stoyan and Stoyan (1994) are followed exactly. -
If
divisor="d"
then a modified estimator is used (Guan, 2007): the contribution from an interpoint distanced_{ij}
to the estimate ofg(r)
is divided byd_{ij}
instead of dividing byr
. This usually improves the bias of the estimator whenr
is close to zero. -
If
divisor="a"
then improved method of Baddeley, Davies and Hazelton (2025) is used. The distancesd_{ij}
are first converted to disc areasa_{ij}=\pi d_{ij}^2
, and smoothing is performed on the area scale, then the result is back-transformed to the original scale. -
If
divisor="t"
then the distancesd_{ij}
are transformed to uniformity using the reference pair correlation functiongref
as described in Baddeley, Davies and Hazelton (2025). -
If
divisor
is afunction
in the R language, then it will be applied to the point patternX
and should return one of the strings"r"
,"d"
,"a"
or"t"
listed above. This option makes it possible to specify a rule which decides which estimator to use, based on the data.
There is also a choice of spatial edge corrections (which are needed to avoid bias due to edge effects associated with the boundary of the spatial window):
-
If
correction="translate"
orcorrection="translation"
then the translation correction is used. Fordivisor="r"
the translation-corrected estimate is given in equation (15.15), page 284 of Stoyan and Stoyan (1994). -
If
correction="Ripley"
orcorrection="isotropic"
then Ripley's isotropic edge correction is used. Fordivisor="r"
the isotropic-corrected estimate is given in equation (15.18), page 285 of Stoyan and Stoyan (1994). -
If
correction="none"
then no edge correction is used, that is, an uncorrected estimate is computed.
Multiple corrections can be selected. The default is
correction=c("translate", "Ripley")
.
Alternatively correction="all"
selects all options;
correction="best"
selects the option which has the best
statistical performance; correction="good"
selects the option
which is the best compromise between statistical performance and speed
of computation.
Argument zerocor
determines the correction
to the one-dimensional kernel-smoothed estimate
on the real number line, to correct bias close to the boundary r=0
.
The argument zerocor
is passed to
densityBC
.
Options include:
-
zerocor="none"
: no correction. -
zerocor="convolution"
: the convolution, uniform or renormalization kernel. -
zerocor="weighted"
: the cut-and-normalization method. -
zerocor="reflection"
: the reflection method. -
zerocor="bdrykern"
: the linear boundary kernel. -
zerocor="JonesFoster"
: the Jones-Foster modification of the linear boundary kernel.
The choice of smoothing kernel is controlled by the
argument kernel
which is passed
to density.default
.
The default is the Epanechnikov kernel, recommended by
Stoyan and Stoyan (1994, page 285).
The bandwidth of the smoothing kernel can be controlled by the
argument bw
. Bandwidth is defined as the standard deviation
of the kernel; see the documentation for density.default
.
For the Epanechnikov kernel with half-width h
,
the argument bw
is equivalent to h/\sqrt{5}
.
Stoyan and Stoyan (1994, page 285) recommend using the Epanechnikov
kernel with support [-h,h]
chosen by the rule of thumn
h = c/\sqrt{\lambda}
,
where \lambda
is the (estimated) intensity of the
point process, and c
is a constant in the range from 0.1 to 0.2.
See equation (15.16).
If bw
is missing or NULL
,
then this rule of thumb will be applied.
The argument stoyan
determines the value of c
.
The smoothing bandwidth that was used in the calculation is returned
as an attribute of the final result.
The argument bw
can be
-
missing or null. In this case, the default value for
bw
is"stoyan"
whenadaptive=FALSE
and"bw.abram"
whenadaptive=TRUE
. -
a single numeric value giving the bandwidth.
-
a character string specifying a bandwidth selection rule. String names of rules applicable when
adaptive=FALSE
include"stoyan"
,"fiksel"
and any rules recognised bydensity.default
. String names applicable whenadaptive=TRUE
include"bw.abram"
and"bw.pow"
. -
a function that computes the selected bandwidth.
-
If
adaptive=FALSE
, the functionbw
will be applied to the point patternX
to determine the bandwidth. Examples includebw.pcf
andbw.stoyan
. The functionbw
should accept the point patternX
as its first argument. Additional arguments tobw
may be specified in the listbw.args
. Ifbw
recognises any of the argumentskernel
,correction
,divisor
,zerocor
andadaptive
, then these arguments will be passed tobw
as well. The functionbw
should return a single numeric value. -
If
adaptive=TRUE
, the functionbw
will be applied to the vector of pairwise distances between data points (or the transformed distances ifdivisor="a"
ordivisor="t"
). Examples includebw.abram.default
andbw.pow
. The functionbw
should accept the vector of pairwise distances as its first argument. Additional arguments tobw
may be specified in the listbw.args
.
-
Note that if bw.args
is a function, it will be applied to
the point pattern X
to determine the list of arguments
(whether adaptive
is TRUE
or FALSE
).
The argument r
is the vector of values for the
distance r
at which g(r)
should be evaluated.
There is a sensible default.
If it is specified, r
must be a vector of increasing numbers
starting from r[1] = 0
,
and max(r)
must not exceed half the diameter of
the window.
If the argument domain
is given, estimation will be restricted
to this region. That is, the estimate of
g(r)
will be based on pairs of points in which the first point lies
inside domain
and the second point is unrestricted.
The argument domain
should be a window (object of class "owin"
) or something acceptable to
as.owin
. It must be a subset of the
window of the point pattern X
.
To compute a confidence band for the true value of the
pair correlation function, use lohboot
.
If var.approx = TRUE
, the variance of the
estimate of the pair correlation will also be calculated using
an analytic approximation (Illian et al, 2008, page 234)
which is valid for stationary point processes which are not
too clustered. This calculation is not yet implemented when
the argument domain
is given.
If fast=TRUE
, the calculation uses the Fast Fourier Transform
to the maximum extent possible for the chosen boundary correction.
If fast=FALSE
(the default), the entire calculation uses
analytic formulas written in C
code.
Value
A function value table
(object of class "fv"
).
Essentially a data frame containing the variables
r |
the vector of values of the argument |
theo |
vector of values equal to 1,
the theoretical value of |
trans |
vector of values of |
iso |
vector of values of |
v |
vector of approximate values of the variance of
the estimate of |
as required.
If ratio=TRUE
then the return value also has two
attributes called "numerator"
and "denominator"
which are "fv"
objects
containing the numerators and denominators of each
estimate of g(r)
.
The return value also has an attribute "bw"
giving the
smoothing bandwidth that was used, and an attribute "info"
containing details of the algorithm parameters.
Advanced Use
To perform the same computation using several different bandwidths bw
,
it is efficient to use the argument close
.
This should be the result of closepairs(X, rmax)
for a suitably large value of rmax
, namely
rmax >= max(r) + 3 * bw
.
Author(s)
Adrian Baddeley Adrian.Baddeley@curtin.edu.au, Rolf Turner rolfturner@posteo.net, Ege Rubak rubak@math.aau.dk, Martin Hazelton Martin.Hazelton@otago.ac.nz and Tilman Davies Tilman.Davies@otago.ac.nz.
References
Baddeley, A., Davies, T.M. and Hazelton, M.L. (2025) An improved estimator of the pair correlation function of a spatial point process. Biometrika, to appear.
Guan, Y. (2007) A least-squares cross-validation bandwidth selection approach in pair correlation function estimation. Statistics and Probability Letters 77 (18) 1722–1729.
Illian, J., Penttinen, A., Stoyan, H. and Stoyan, D. (2008) Statistical Analysis and Modelling of Spatial Point Patterns. Wiley.
Stoyan, D. and Stoyan, H. (1994) Fractals, random shapes and point fields: methods of geometrical statistics. John Wiley and Sons.
See Also
Kest
,
pcf
,
density.default
,
bw.stoyan
,
bw.pcf
,
lohboot
.
Examples
pr <- pcf(redwood, divisor="r")
plot(pr, main="pair correlation function for redwoods")
# compare estimates
pd <- pcf(redwood, divisor="d")
pa <- pcf(redwood, divisor="a")
plot(pr, cbind(iso, theo) ~ r, col=c("red", "black"),
ylim.covers=0, main="Estimates of PCF",
lwd=c(2,1), lty=c(1,3), legend=FALSE)
plot(pd, iso ~ r, col="blue", lwd=2, add=TRUE)
plot(pa, iso ~ r, col="green", lwd=2, add=TRUE)
legend("center", col=c("red", "blue", "green"), lty=1, lwd=2,
legend=c("divisor=r","divisor=d", "divisor=a"))