rdbw2d.dist {rd2d}    R Documentation

Bandwidth Selection for Distance-Based RD Designs

Description

rdbw2d.dist implements a bandwidth selector for distance-based local polynomial boundary regression discontinuity (RD) point estimators with robust bias-corrected pointwise confidence intervals and uniform confidence bands, developed in Cattaneo, Titiunik and Yu (2025a), with a companion software article in Cattaneo, Titiunik and Yu (2025b). For robust bias-correction, see Calonico, Cattaneo and Titiunik (2014).

Usage

rdbw2d.dist(
  Y,
  D,
  b = NULL,
  p = 1,
  kink = c("off", "on"),
  kernel = c("tri", "triangular", "epa", "epanechnikov", "uni", "uniform", "gau",
    "gaussian"),
  bwselect = c("mserd", "imserd", "msetwo", "imsetwo"),
  vce = c("hc1", "hc0", "hc2", "hc3"),
  bwcheck = 20 + p + 1,
  masspoints = c("check", "adjust", "off"),
  C = NULL,
  scaleregul = 1,
  cqt = 0.5
)

Arguments

Y

Dependent variable; a numeric vector of length N, where N is the sample size.

D

Distance-based scores \mathbf{D}_i=(\mathbf{D}_{i}(\mathbf{b}_1),\cdots,\mathbf{D}_{i}(\mathbf{b}_J)); an N \times J matrix, where N is the sample size and J is the number of cutoffs. Non-negative values indicate observations in the treatment group; negative values indicate observations in the control group.
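As a rough illustration of this sign convention, the base R sketch below builds a signed distance matrix for J boundary points. The helper name signed_distance and the choice of Euclidean distance are assumptions made for this illustration; the helper is not part of rd2d.

```r
# Hypothetical helper (not part of rd2d): signed Euclidean distances from each
# observation to each boundary point. Rows are observations, columns are cutoffs.
signed_distance <- function(X, b, treated) {
  X <- as.matrix(X)
  b <- as.matrix(b)
  # Unsigned N x J distance matrix, one column per boundary point
  dmat <- sapply(seq_len(nrow(b)), function(j) {
    sqrt((X[, 1] - b[j, 1])^2 + (X[, 2] - b[j, 2])^2)
  })
  dmat <- matrix(dmat, nrow = nrow(X), ncol = nrow(b))
  # Non-negative for treated units, negative for controls
  s <- ifelse(treated, 1, -1)
  dmat * s
}

set.seed(1)
X <- cbind(rnorm(10), rnorm(10))
b <- rbind(c(0, 0), c(0, 1))
treated <- X[, 1] >= 0
D_signed <- signed_distance(X, b, treated)
dim(D_signed)
```

The result has one row per observation and one column per boundary point, matching the N \times J layout described above.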

b

Optional evaluation points; a matrix or data frame specifying boundary points \mathbf{b}_j = (b_{1j}, b_{2j}), dimension J \times 2.

p

Polynomial order for point estimation. Default is p = 1.

kink

Whether to apply the kink adjustment. Options: "off" (default) or "on".

kernel

Kernel function to use. Options are "tri" or "triangular" (triangular, default), "epa" or "epanechnikov" (Epanechnikov), "uni" or "uniform" (uniform), and "gau" or "gaussian" (Gaussian).
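For orientation, the kernels named above can be sketched in their standard textbook forms on the scaled distance u = d / h. The function name kernel_weight is hypothetical, and the normalizing constants used inside rd2d may differ; this is an illustration, not the package's implementation.

```r
# Illustrative textbook kernels on the scaled distance u = d / h
# (hypothetical helper, not rd2d internals).
kernel_weight <- function(u, kernel = c("tri", "epa", "uni", "gau")) {
  kernel <- match.arg(kernel)
  switch(kernel,
    tri = pmax(1 - abs(u), 0),        # triangular
    epa = 0.75 * pmax(1 - u^2, 0),    # Epanechnikov
    uni = 0.5 * (abs(u) <= 1),        # uniform
    gau = dnorm(u)                    # Gaussian (unbounded support)
  )
}

u <- c(0, 0.5, 1, 1.5)
kernel_weight(u, "tri")  # 1.0 0.5 0.0 0.0
```

The compactly supported kernels give zero weight to observations farther than one bandwidth from the evaluation point, whereas the Gaussian kernel weights every observation.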

bwselect

Bandwidth selection strategy. Options:

  • "mserd". One common MSE-optimal bandwidth selector for the boundary RD treatment effect estimator for each evaluation point (default).

  • "imserd". IMSE-optimal bandwidth selector for the boundary RD treatment effect estimator based on all evaluation points.

  • "msetwo". Two different MSE-optimal bandwidth selectors (control and treatment) for the boundary RD treatment effect estimator for each evaluation point.

  • "imsetwo". Two IMSE-optimal bandwidth selectors (control and treatment) for the boundary RD treatment effect estimator based on all evaluation points.

  • "user provided". User-provided bandwidths. If h is not NULL, then bwselect is overwritten to "user provided".

vce

Variance-covariance estimator for standard errors. Options:

"hc0"

Heteroskedasticity-robust variance estimator without small sample adjustment (White robust).

"hc1"

Heteroskedasticity-robust variance estimator with degrees-of-freedom correction (default).

"hc2"

Heteroskedasticity-robust variance estimator using leverage adjustments.

"hc3"

More conservative heteroskedasticity-robust variance estimator (similar to jackknife correction).
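The four options correspond to the standard heteroskedasticity-consistent sandwich estimators. The sketch below computes them for an ordinary least-squares fit in base R, assuming the textbook definitions; the function name hc_vcov is hypothetical, and the computation inside rd2d (which works with kernel-weighted local polynomial fits) may differ in detail.

```r
# Textbook HC0-HC3 sandwich estimators for a least-squares fit
# (illustration only; not the rd2d implementation).
hc_vcov <- function(X, y, type = c("hc1", "hc0", "hc2", "hc3")) {
  type <- match.arg(type)
  n <- nrow(X)
  k <- ncol(X)
  XtX_inv <- solve(crossprod(X))
  beta <- XtX_inv %*% crossprod(X, y)
  e <- as.vector(y - X %*% beta)          # residuals
  h <- rowSums((X %*% XtX_inv) * X)       # leverage values
  w <- switch(type,
    hc0 = e^2,                            # no small-sample adjustment
    hc1 = e^2 * n / (n - k),              # degrees-of-freedom correction
    hc2 = e^2 / (1 - h),                  # leverage adjustment
    hc3 = e^2 / (1 - h)^2                 # jackknife-like adjustment
  )
  meat <- crossprod(X * sqrt(w))          # X' diag(w) X
  XtX_inv %*% meat %*% XtX_inv
}

set.seed(2)
n_obs <- 50
X_demo <- cbind(1, rnorm(n_obs))
y_demo <- 1 + 2 * X_demo[, 2] + rnorm(n_obs) * (1 + abs(X_demo[, 2]))
v0 <- hc_vcov(X_demo, y_demo, "hc0")
v1 <- hc_vcov(X_demo, y_demo, "hc1")
v2 <- hc_vcov(X_demo, y_demo, "hc2")
v3 <- hc_vcov(X_demo, y_demo, "hc3")
sqrt(diag(v1))  # standard errors under the default "hc1"
```

Because leverage values lie between 0 and 1, the variances on the diagonal are ordered hc0 <= hc2 <= hc3, which is why "hc3" is described as the more conservative choice.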

bwcheck

If a positive integer is provided, the preliminary bandwidth used in the calculations is enlarged so that at least bwcheck observations are used. The program stops with “not enough observations” if the sample size N < bwcheck. Default is 20 + p + 1.
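The enlargement rule can be pictured with a small sketch. The logic below, which takes the bwcheck-th smallest absolute distance as a lower bound on the bandwidth, is an assumption made for illustration; the function name enlarge_bandwidth is hypothetical, and rd2d may apply the check differently (for example, separately on each side of the boundary).

```r
# Sketch of the bwcheck idea (assumed logic, not rd2d source): enlarge h so
# that at least `bwcheck` observations satisfy |d| <= h.
enlarge_bandwidth <- function(d, h, bwcheck) {
  if (length(d) < bwcheck) stop("not enough observations")
  h_min <- sort(abs(d))[bwcheck]  # distance to the bwcheck-th nearest point
  max(h, h_min)
}

d_demo <- c(-0.9, -0.4, -0.1, 0.05, 0.3, 0.7, 1.2)
enlarge_bandwidth(d_demo, h = 0.2, bwcheck = 4)  # 0.4
enlarge_bandwidth(d_demo, h = 1.0, bwcheck = 4)  # 1 (already wide enough)
```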

masspoints

Strategy for handling mass points in the running variable. Options:

"check"

Check for repeated values and adjust inference if needed (default).

"adjust"

Adjust bandwidths to guarantee a sufficient number of unique support points.

"off"

Ignore mass points completely.
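Mass points are repeated values of the running variable. A quick way to gauge their extent before calling the selector is to compare the number of unique values with the sample size, as in the sketch below. The helper mass_point_summary is hypothetical and is not the check performed by rd2d.

```r
# Hypothetical diagnostic (not rd2d internals): count repeated values in a
# running variable.
mass_point_summary <- function(d) {
  n <- length(d)
  m <- length(unique(d))
  list(n = n, n_unique = m, share_repeated = 1 - m / n)
}

d_mass <- c(0.1, 0.1, 0.2, 0.3, 0.3, 0.3)
mp <- mass_point_summary(d_mass)
mp$n_unique        # 3
mp$share_repeated  # 0.5
```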

C

Cluster ID variable used for cluster-robust variance estimation with degrees-of-freedom weights. Default is C = NULL.

scaleregul

Scaling factor for the regularization term in bandwidth selection. Default is 1.

cqt

Constant controlling subsample fraction for initial bias estimation. Default is 0.5.

Value

An object of class "rdbw2d.dist", containing:

bws

Data frame of optimal bandwidths for each evaluation point:

b1

First coordinate of the evaluation point b1.

b2

Second coordinate of the evaluation point b2.

h0

Bandwidth for observations with distance D_{i}(\mathbf{b}) < 0.

h1

Bandwidth for observations with distance D_{i}(\mathbf{b}) \geq 0.

Nh0

Effective sample size for D_{i}(\mathbf{b}) < 0.

Nh1

Effective sample size for D_{i}(\mathbf{b}) \geq 0.

mseconsts

Data frame of intermediate bias and variance constants used for MSE/IMSE calculations.

opt

List of options used in the function call.

Author(s)

Matias D. Cattaneo, Princeton University. cattaneo@princeton.edu
Rocío Titiunik, Princeton University. titiunik@princeton.edu
Ruiqi Rae Yu, Princeton University. rae.yu@princeton.edu

References

See Also

rd2d.dist, rd2d, summary.rdbw2d.dist, print.rdbw2d.dist

Examples

set.seed(123)
n <- 5000

# Generate running variables x1 and x2
x1 <- rnorm(n)
x2 <- rnorm(n)

# Define treatment assignment: treated if x1 >= 0
d <- as.numeric(x1 >= 0)

# Generate outcome variable y with some treatment effect
y <- 3 + 2 * x1 + 1.5 * x2 + 1.5 * d + rnorm(n, sd = 0.5)

# Define evaluation points (e.g., at the origin and another point)
eval <- data.frame(x.1 = c(0, 0), x.2 = c(0, 1))

# Compute Euclidean distances to evaluation points
dist.a <- sqrt((x1 - eval$x.1[1])^2 + (x2 - eval$x.2[1])^2)
dist.b <- sqrt((x1 - eval$x.1[2])^2 + (x2 - eval$x.2[2])^2)

# Combine distances into a matrix
D <- as.data.frame(cbind(dist.a, dist.b))

# Assign positive distances for treatment group, negative for control
d_expanded <- matrix(rep(2 * d - 1, times = ncol(D)), nrow = nrow(D), ncol = ncol(D))
D <- D * d_expanded

# Run the rdbw2d.dist function
bws <- rdbw2d.dist(y, D, b = eval)

# View the bandwidth selection results
print(bws)
summary(bws)

[Package rd2d version 0.0.2 Index]