rd2d.dist {rd2d}    R Documentation

Local Polynomial RD Estimation on Distance-Based Running Variables

Description

rd2d.dist implements distance-based local polynomial boundary regression discontinuity (RD) point estimators with robust bias-corrected pointwise confidence intervals and uniform confidence bands, developed in Cattaneo, Titiunik and Yu (2025a), with a companion software article in Cattaneo, Titiunik and Yu (2025b). For robust bias correction, see Calonico, Cattaneo and Titiunik (2014).

Companion command: rdbw2d.dist for data-driven bandwidth selection.

For other packages for RD designs, visit https://rdpackages.github.io/

Usage

rd2d.dist(
  Y,
  D,
  h = NULL,
  b = NULL,
  p = 1,
  q = 2,
  kink = c("off", "on"),
  kernel = c("tri", "triangular", "epa", "epanechnikov", "uni", "uniform", "gau",
    "gaussian"),
  level = 95,
  cbands = TRUE,
  side = c("two", "left", "right"),
  repp = 1000,
  bwselect = c("mserd", "imserd", "msetwo", "imsetwo", "user provided"),
  vce = c("hc1", "hc0", "hc2", "hc3"),
  rbc = c("on", "off"),
  bwcheck = 50 + p + 1,
  masspoints = c("check", "adjust", "off"),
  C = NULL,
  scaleregul = 1,
  cqt = 0.5
)

Arguments

Y

Dependent variable; a numeric vector of length N, where N is the sample size.

D

Distance-based scores \mathbf{D}_i=(\mathbf{D}_{i}(\mathbf{b}_1),\cdots,\mathbf{D}_{i}(\mathbf{b}_J)); a matrix of dimension N \times J, where N is the sample size and J is the number of cutoffs. A non-negative value indicates that the observation is in the treatment group; a negative value indicates that it is in the control group.
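The layout of D can be illustrated with a short base R sketch. The helper name signed_distances and the toy coordinates are assumptions made for illustration only; they are not part of the package.

```r
# Hypothetical helper: N x J matrix of signed Euclidean distances.
# b is a J x 2 matrix (or data frame) of boundary points; treated is a
# logical vector of length N.
signed_distances <- function(x1, x2, b, treated) {
  b <- as.matrix(b)
  dist <- sapply(seq_len(nrow(b)), function(j) {
    sqrt((x1 - b[j, 1])^2 + (x2 - b[j, 2])^2)
  })
  # Positive sign for treated units, negative sign for control units;
  # the sign vector is recycled down each column, i.e. applied row by row.
  dist * ifelse(treated, 1, -1)
}

# Toy data: three observations, two boundary points
x1 <- c(0.5, -0.3, 1.0)
x2 <- c(0.0,  0.4, 1.0)
b  <- rbind(c(0, 0), c(0, 1))
D  <- signed_distances(x1, x2, b, treated = x1 >= 0)
D
```

Each row of D corresponds to one observation and each column to one boundary point, matching the N \times J layout described above.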

h

Bandwidth(s); if a single value is provided, the same bandwidth is used for both groups; if a matrix of size J \times 2 is provided, each row contains (h_{\text{control}}, h_{\text{tr}}) for the corresponding evaluation point; if not specified, bandwidths are selected via rdbw2d.dist().
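As an illustration of the matrix form of h, the sketch below builds a J \times 2 bandwidth matrix for J = 2 evaluation points. The numeric values and the column names are assumptions chosen for illustration; the commented call follows the Usage signature above.

```r
# Hypothetical bandwidths for J = 2 evaluation points (values are made up).
# Column order follows the documented layout: (h_control, h_tr) per row.
h_mat <- matrix(
  c(0.40, 0.50,   # evaluation point 1: control, treatment
    0.45, 0.60),  # evaluation point 2: control, treatment
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("h.control", "h.tr"))
)

# Basic shape checks before passing the matrix on
stopifnot(is.numeric(h_mat), ncol(h_mat) == 2, all(h_mat > 0))

# With Y, D and b prepared as in the Examples section, the call would be:
# result <- rd2d.dist(Y, D, h = h_mat, b = b)
```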

b

Optional evaluation points; a matrix or data frame specifying boundary points \mathbf{b}_j = (b_{1j}, b_{2j}), dimension J \times 2.

p

Polynomial order for point estimation. Default is p = 1.

q

Polynomial order for bias-corrected estimation. Must satisfy q \geq p. Default is q = p + 1.
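To make the role of p concrete, the following base R sketch computes a distance-based local polynomial estimate at a single boundary point by hand: a kernel-weighted polynomial of order p in the distance is fitted separately on each side, and the treatment effect estimate is the difference of the two intercepts. The helper side_intercept, the triangular weights and the bandwidth 0.5 are illustrative assumptions; this is a simplified sketch of the idea, not the package's implementation, and it omits bias correction and inference.

```r
set.seed(1)
n   <- 20000
x1  <- rnorm(n)
x2  <- rnorm(n)
trt <- as.numeric(x1 >= 0)
y   <- 3 + 2 * x1 + 1.5 * x2 + 1.5 * trt + rnorm(n, sd = 0.5)

# Signed distance to the boundary point b = (0, 0)
dist_b <- sqrt(x1^2 + x2^2) * (2 * trt - 1)

# Hypothetical helper: intercept of a kernel-weighted polynomial fit of
# order p in the absolute distance, using observations within bandwidth h.
side_intercept <- function(y, d, h, p = 1) {
  u    <- abs(d) / h
  keep <- u < 1
  dat  <- data.frame(y = y[keep], r = abs(d[keep]), w = 1 - u[keep])
  fit  <- lm(y ~ poly(r, degree = p, raw = TRUE), data = dat, weights = w)
  unname(coef(fit)[1])
}

h <- 0.5  # assumed bandwidth, not data-driven
tau_hat <- side_intercept(y[trt == 1], dist_b[trt == 1], h, p = 1) -
           side_intercept(y[trt == 0], dist_b[trt == 0], h, p = 1)
tau_hat   # should be near the simulated jump of 1.5
```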

kink

Whether to apply kink adjustment. Options: "off" (default) or "on".

kernel

Kernel function to use. Options are "tri" or "triangular" (triangular, default), "epa" or "epanechnikov" (Epanechnikov), "uni" or "uniform" (uniform), and "gau" or "gaussian" (Gaussian).
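For reference, the textbook forms of these kernels can be written in a few lines of base R. The function kernel_weight is a hypothetical helper that accepts only the short kernel names from the Usage signature; the normalizing constants shown are the standard ones and may differ from those used internally by the package, which does not affect weighted least squares fits.

```r
# Standard kernel shapes evaluated at u = distance / bandwidth.
kernel_weight <- function(u, kernel = c("tri", "epa", "uni", "gau")) {
  kernel <- match.arg(kernel)
  inside <- abs(u) <= 1          # compact support for tri, epa and uni
  switch(kernel,
    tri = (1 - abs(u)) * inside,
    epa = 0.75 * (1 - u^2) * inside,
    uni = 0.5 * inside,
    gau = dnorm(u)               # unbounded support
  )
}

# Weights at a few scaled distances, one column per kernel
u <- c(0, 0.5, 1, 1.5)
sapply(c("tri", "epa", "uni", "gau"), function(k) kernel_weight(u, k))
```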

level

Nominal confidence level for intervals/bands, between 0 and 100 (default is 95).

cbands

Logical. If TRUE, also compute uniform confidence bands (default is TRUE).

side

Type of confidence interval. Options: "two" (two-sided, default), "left" (left tail), or "right" (right tail).

repp

Number of bootstrap repetitions used for critical value simulation. Default is 1000.
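The idea behind a simulated critical value for uniform bands can be sketched as follows: draw repp Gaussian vectors whose correlation matches that of the estimates across evaluation points, record the largest absolute coordinate of each draw, and take the level quantile. The covariance matrix cov_q below is made up for illustration, and the Gaussian simulation is an assumed stand-in for whatever resampling scheme the package uses.

```r
set.seed(42)

# Hypothetical covariance matrix of the estimates at J = 3 evaluation points
cov_q <- matrix(c(0.040, 0.020, 0.010,
                  0.020, 0.050, 0.025,
                  0.010, 0.025, 0.060), nrow = 3, byrow = TRUE)
level <- 95
repp  <- 1000

corr_q <- cov2cor(cov_q)
R <- chol(corr_q)                      # upper triangular, t(R) %*% R = corr_q
Z <- matrix(rnorm(repp * ncol(corr_q)), nrow = repp) %*% R

# Supremum of the absolute standardized process in each repetition
sup_abs <- apply(abs(Z), 1, max)

cval_uniform   <- unname(quantile(sup_abs, level / 100))
cval_pointwise <- qnorm(1 - (1 - level / 100) / 2)
c(pointwise = cval_pointwise, uniform = cval_uniform)
```

The uniform critical value exceeds the pointwise one because it must cover all evaluation points simultaneously, so bands are wider than pointwise intervals.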

bwselect

Bandwidth selection strategy. Options:

  • "mserd". One common MSE-optimal bandwidth selector for the boundary RD treatment effect estimator for each evaluation point (default).

  • "imserd". IMSE-optimal bandwidth selector for the boundary RD treatment effect estimator based on all evaluation points.

  • "msetwo". Two different MSE-optimal bandwidth selectors (control and treatment) for the boundary RD treatment effect estimator for each evaluation point.

  • "imsetwo". Two IMSE-optimal bandwidth selectors (control and treatment) for the boundary RD treatment effect estimator based on all evaluation points.

  • "user provided". User-provided bandwidths. If h is not NULL, bwselect is automatically set to "user provided".

vce

Variance-covariance estimator for standard errors. Options:

"hc0"

Heteroskedasticity-robust variance estimator without small sample adjustment (White robust).

"hc1"

Heteroskedasticity-robust variance estimator with degrees-of-freedom correction (default).

"hc2"

Heteroskedasticity-robust variance estimator using leverage adjustments.

"hc3"

More conservative heteroskedasticity-robust variance estimator (similar to jackknife correction).
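The four options differ only in how squared residuals are rescaled inside the sandwich formula. The base R sketch below shows the standard HC0 to HC3 definitions for an ordinary (unweighted) least squares fit on simulated data; the package applies the analogous idea within kernel-weighted local polynomial fits, so this is an illustration of the formulas rather than the package's code.

```r
set.seed(7)
n <- 200
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5 + x)   # heteroskedastic errors
X <- cbind(1, x)
k <- ncol(X)

XtX_inv <- solve(crossprod(X))
beta    <- XtX_inv %*% crossprod(X, y)
e       <- as.vector(y - X %*% beta)        # residuals
lev     <- rowSums((X %*% XtX_inv) * X)     # leverage values h_ii

# Sandwich estimator (X'X)^{-1} X' diag(omega) X (X'X)^{-1}
sandwich_vcov <- function(omega) {
  XtX_inv %*% crossprod(X * sqrt(omega)) %*% XtX_inv
}

vc <- list(
  hc0 = sandwich_vcov(e^2),                    # no adjustment
  hc1 = sandwich_vcov(e^2) * n / (n - k),      # degrees-of-freedom correction
  hc2 = sandwich_vcov(e^2 / (1 - lev)),        # leverage adjustment
  hc3 = sandwich_vcov(e^2 / (1 - lev)^2)       # squared leverage adjustment
)
se_slope <- sapply(vc, function(V) sqrt(V[2, 2]))
se_slope
```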

rbc

Whether to apply robust bias correction. Options: "on" (default) or "off". When kink = "off", turning on rbc sets q to p + 1. When kink = "on", turning on rbc shrinks the selected bandwidth to be proportional to N^{-1/3}.

bwcheck

If a positive integer is provided, the preliminary bandwidth used in the calculations is enlarged so that at least bwcheck observations are used. The program stops with “not enough observations” if sample size N < bwcheck. Default is 50 + p + 1.
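The enlargement rule can be sketched in base R as follows. The simulated distances, the preliminary bandwidth, and the choice to count observations over a single vector of absolute distances are assumptions for illustration; the documentation does not state whether the package counts observations per group or in total.

```r
set.seed(3)
d_abs    <- abs(rnorm(500))   # hypothetical absolute distances to one boundary point
h_prelim <- 0.05              # hypothetical preliminary bandwidth
p        <- 1
bwcheck  <- 50 + p + 1

if (length(d_abs) < bwcheck) stop("not enough observations")

# Smallest bandwidth that contains at least bwcheck observations
h_min  <- sort(d_abs)[bwcheck]
h_used <- max(h_prelim, h_min)

n_in <- sum(d_abs <= h_used)
c(h_prelim = h_prelim, h_used = h_used, n_in = n_in)
```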

masspoints

Strategy for handling mass points in the running variable. Options:

"check"

Check for repeated values and adjust inference if needed (default).

"adjust"

Adjust bandwidths to guarantee a sufficient number of unique support points.

"off"

Ignore mass points completely.
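A simple way to see whether a running variable has mass points is to compare the number of unique values with the number of observations. The toy vector below is made up, and this check is only an assumed illustration of the idea, not the package's internal rule.

```r
# Hypothetical absolute distances on one side of the boundary
d_side <- c(0.10, 0.10, 0.25, 0.25, 0.25, 0.40, 0.55, 0.55, 0.70, 0.90)

n_total  <- length(d_side)
n_unique <- length(unique(d_side))

share_repeated <- 1 - n_unique / n_total   # share of observations that repeat a value
has_masspoints <- n_unique < n_total

c(n_total = n_total, n_unique = n_unique, share_repeated = share_repeated)
```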

C

Cluster ID variable used for cluster-robust variance estimation. Default is C = NULL.

scaleregul

Scaling factor for the regularization term in bandwidth selection. Default is 1.

cqt

Constant controlling subsample fraction for initial bias estimation. Default is 0.5.

Details

MSE bandwidth selection for geometrical RD design

Value

An object of class "rd2d.dist", a list containing:

results

A data frame with point estimates, variances, p-values, confidence intervals, confidence bands, and bandwidths at each evaluation point. Its columns (b1 through Nh1) are described in the entries that follow.

b1

First coordinate of the evaluation point.

b2

Second coordinate of the evaluation point.

Est.p

Point estimate \widehat{\tau}_{\text{dist},p}(\mathbf{b}) with polynomial order p.

Var.p

Variance of \widehat{\tau}_{\text{dist},p}(\mathbf{b}).

Est.q

Bias-corrected estimate \widehat{\tau}_{\text{dist},q}(\mathbf{b}) with polynomial order q.

Var.q

Variance of \widehat{\tau}_{\text{dist},q}(\mathbf{b}).

pvalue

Two-sided p-value based on T_{\text{dist},q}(\mathbf{b}).

CI.lower

Lower bound of confidence interval.

CI.upper

Upper bound of confidence interval.

CB.lower

Lower bound of uniform confidence band (if cbands=TRUE).

CB.upper

Upper bound of uniform confidence band (if cbands=TRUE).

h0

Bandwidth used for control group (D_i(\mathbf{b}) < 0).

h1

Bandwidth used for treatment group (D_i(\mathbf{b}) \geq 0).

Nh0

Effective sample size for control group.

Nh1

Effective sample size for treatment group.

results.A0

Same structure as results but for control group outcomes.

results.A1

Same structure as results but for treatment group outcomes.

tau.hat

Vector of point estimates \widehat{\tau}_p(\mathbf{b}).

se.hat

Standard errors corresponding to \widehat{\tau}_p(\mathbf{b}).

cb

Confidence intervals and uniform bands.

cov.q

Covariance matrix of the bias-corrected estimates \widehat{\tau}_{\text{dist},q}(\mathbf{b}) across all evaluation points \mathbf{b}.

opt

List of options used in the function call.
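The relationship between the Est.q, Var.q, pvalue, CI.lower and CI.upper columns can be illustrated with a short sketch. It assumes a standard normal approximation and a two-sided interval centered at the bias-corrected estimate; the data frame res is a made-up stand-in for the results component, not output from the package.

```r
# Hypothetical stand-in for the results data frame; the numbers are made up.
res <- data.frame(
  b1    = c(0, 0),
  b2    = c(0, 1),
  Est.q = c(1.48, 1.55),
  Var.q = c(0.010, 0.016)
)

level <- 95
z     <- qnorm(1 - (1 - level / 100) / 2)   # two-sided normal critical value

se_q <- sqrt(res$Var.q)
t_q  <- res$Est.q / se_q                    # test statistic for a zero effect

res$pvalue   <- 2 * pnorm(-abs(t_q))
res$CI.lower <- res$Est.q - z * se_q
res$CI.upper <- res$Est.q + z * se_q
res
```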

Author(s)

Matias D. Cattaneo, Princeton University. cattaneo@princeton.edu
Rocío Titiunik, Princeton University. titiunik@princeton.edu
Ruiqi Rae Yu, Princeton University. rae.yu@princeton.edu

References

See Also

rdbw2d.dist, rd2d, print.rd2d.dist, summary.rd2d.dist

Examples

set.seed(123)
n <- 5000

# Generate running variables x1 and x2
x1 <- rnorm(n)
x2 <- rnorm(n)

# Define treatment assignment: treated if x1 >= 0
d <- as.numeric(x1 >= 0)

# Generate outcome variable y with some treatment effect
y <- 3 + 2 * x1 + 1.5 * x2 + 1.5 * d + rnorm(n, sd = 0.5)

# Define evaluation points (e.g., at the origin and another point)
eval <- data.frame(x.1 = c(0, 0), x.2 = c(0, 1))

# Compute Euclidean distances to evaluation points
dist.a <- sqrt((x1 - eval$x.1[1])^2 + (x2 - eval$x.2[1])^2)
dist.b <- sqrt((x1 - eval$x.1[2])^2 + (x2 - eval$x.2[2])^2)

# Combine distances into a data frame (one column per evaluation point)
D <- as.data.frame(cbind(dist.a, dist.b))

# Assign positive distances for treatment group, negative for control
d_expanded <- matrix(rep(2 * d - 1, times = ncol(D)), nrow = nrow(D), ncol = ncol(D))
D <- D * d_expanded

# Run the rd2d.dist function
result <- rd2d.dist(y, D, b = eval)

# View the estimation results
print(result)
summary(result)

[Package rd2d version 0.0.2 Index]