cont_did {contdid} | R Documentation |
Difference-in-differences with a Continuous Treatment
Description
A function for difference-in-differences with a continuous treatment in a staggered treatment adoption setting.
cont_did currently supports staggered treatment with continuous treatments using B-splines under the hood.
Usage
cont_did(
  yname,
  dname,
  gname = NULL,
  tname,
  idname,
  xformula = ~1,
  data,
  target_parameter = c("level", "slope"),
  aggregation = c("dose", "eventstudy", "none"),
  treatment_type = c("continuous", "discrete"),
  dose_est_method = c("parametric", "cck"),
  dvals = NULL,
  degree = 3,
  num_knots = 0,
  allow_unbalanced_panel = FALSE,
  control_group = c("notyettreated", "nevertreated", "eventuallytreated"),
  anticipation = 0,
  weightsname = NULL,
  alp = 0.05,
  bstrap = TRUE,
  cband = FALSE,
  boot_type = "multiplier",
  biters = 1000,
  clustervars = NULL,
  est_method = NULL,
  base_period = "varying",
  print_details = FALSE,
  cl = 1,
  ...
)
Arguments
yname |
The name of the outcome variable |
dname |
The name of the treatment variable in the data. The functionality of
|
gname |
The name of the timing-group variable, i.e., when treatment starts for a particular unit. The value of this variable should be set to be 0 for units that do not participate in the treatment in any time period. |
tname |
The name of the column containing the time periods |
idname |
The individual (cross-sectional unit) id name |
xformula |
A formula for additional covariates. This is not currently supported. |
data |
The name of the data.frame that contains the data |
target_parameter |
Two options are "level" and "slope". In the first case, the function reports level effects, i.e., ATTs. In the second case, the function reports slope effects, i.e., ACRTs. |
aggregation |
"dose" averages across timing-groups and time periods and provides results as a function of the dose. "eventstudy" averages across timing-groups and doses and reports results as a function of the length of exposure to the treatment. "none" is a stub for reporting fully disaggregated results that can be processed as desired by the user. This is not currently supported though. The combination of the arguments |
treatment_type |
"continuous" or "discrete" depending on the nature of the treatment. Default is "continuous". "discrete" is not yet supported. |
dose_est_method |
The method used to estimate the dose-specific effects. The default
is "parametric", where the user needs to specify the number of knots and degree for
a B-spline which is assumed to be correctly specified. The other option is "cck"
which uses a data-driven nonparametric method to estimate the dose-specific effects
based on the |
dvals |
The values of the treatment at which to compute dose-specific effects. If it is not specified, the default is to use the percentiles of the dose among all ever-treated units. |
degree |
The degree of the B-Spline used in estimation. The default is 3, which in
combination with the default choice for the |
num_knots |
The number of knots to include for the B-Spline. The default is 0 so that the spline is global (i.e., this will amount to fitting a global polynomial). There is a bias-variance tradeoff in including more or fewer knots. |
allow_unbalanced_panel |
Whether or not the function should
"balance" the panel with respect to time and id. The default
values if |
control_group |
Which units to use as the control group.
The default is "nevertreated" which sets the control group
to be the group of units that never participate in the
treatment. This group does not change across groups or
time periods. The other option is to set
|
anticipation |
The number of time periods before treatment in which units can anticipate participating in the treatment, so that anticipation can affect their untreated potential outcomes. |
weightsname |
The name of the column containing the sampling weights. If not set, all observations have the same weight. |
alp |
The significance level. The default is 0.05. |
bstrap |
Boolean for whether or not to compute standard errors using
the multiplier bootstrap. If standard errors are clustered, then one
must set |
cband |
Boolean for whether or not to compute a uniform confidence
band that covers all of the group-time average treatment effects
with fixed probability |
boot_type |
Should be one of "multiplier" (the default) or "empirical".
The multiplier bootstrap is generally much faster, but |
biters |
The number of bootstrap iterations to use. The default is 1000,
and this is only applicable if |
clustervars |
A vector of variable names to cluster on. At most, there
can be two variables (otherwise will throw an error) and one of these
must be the same as idname which allows for clustering at the individual
level. By default, we cluster at individual level (when |
est_method |
The method to compute group-time average treatment effects. The default is "dr" which uses the doubly robust
approach in the |
base_period |
Whether to use a "varying" base period or a "universal" base period. Either choice results in the same post-treatment estimates of ATT(g,t)'s. In pre-treatment periods, using a varying base period amounts to computing a pseudo-ATT in each treatment period by comparing the change in outcomes for a particular group relative to its comparison group in the pre-treatment periods (i.e., in pre-treatment periods this setting computes changes from period t-1 to period t, but repeatedly changes the value of t). A universal base period fixes the base period to always be (g-anticipation-1). This does not compute pseudo-ATT(g,t)'s in pre-treatment periods, but rather reports average changes in outcomes from period t to (g-anticipation-1) for a particular group relative to its comparison group. This is analogous to what is often reported in event study regressions. Using a varying base period results in an estimate of ATT(g,t) being reported in the period immediately before treatment. Using a universal base period normalizes the estimate in the period right before treatment (or earlier when the user allows for anticipation) to be equal to 0, but adds one extra estimate in an earlier period. |
print_details |
Whether or not to show details/progress of computations.
Default is |
cl |
Number of clusters to be used when bootstrapping. The default is 1. |
... |
extra arguments that can be passed to create the correct subsets
of the data (depending on |
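The difference between the "varying" and "universal" settings of base_period can be made concrete with a small base R sketch. The helper base_period_for below is hypothetical and is not part of contdid. It only encodes the rule described above, under the assumption that periods with t >= g - anticipation count as post-treatment.

```r
# Hypothetical helper (not a contdid function): returns the base period
# against which outcomes in period t are compared for timing-group g.
base_period_for <- function(t, g, anticipation = 0,
                            type = c("varying", "universal")) {
  type <- match.arg(type)
  reference <- g - anticipation - 1
  # Post-treatment periods, and every period under a universal base period,
  # are compared with the last period before (anticipated) treatment.
  # Under a varying base period, pre-treatment periods use t - 1 instead.
  if (type == "universal" || t >= g - anticipation) reference else t - 1
}

g <- 4          # assumed timing group: treatment starts in period 4
periods <- 2:6  # assumed periods for which a comparison is formed

varying <- vapply(periods, base_period_for, numeric(1),
                  g = g, type = "varying")
universal <- vapply(periods, base_period_for, numeric(1),
                    g = g, type = "universal")

# Side-by-side view of the base period used in each period t
print(data.frame(t = periods, varying = varying, universal = universal))
```

In this toy setup the two settings agree from period 4 onward and differ only in pre-treatment periods, which matches the statement that post-treatment estimates are the same under either choice.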
Value
cont_did_obj
Examples
# build small simulated data
set.seed(1234)
df <- simulate_contdid_data(
  n = 1000,
  num_time_periods = 4,
  num_groups = 4,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)
# estimate effects of continuous treatment
cd_res <- cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df,
  gname = "G",
  target_parameter = "slope",
  aggregation = "dose",
  treatment_type = "continuous",
  control_group = "notyettreated",
  biters = 50,
  cband = TRUE,
  num_knots = 1,
  degree = 3
)
summary(cd_res)
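
The example above sets num_knots = 1 and degree = 3. As a rough illustration of what these choices mean for a B-spline, the following sketch uses splines::bs from base R on made-up data. It does not reproduce how cont_did builds its basis internally: the knot placement (the median dose) and the percentile grid used for the evaluation points are assumptions made only for this illustration.

```r
library(splines)

set.seed(1)
n <- 300
dose <- runif(n, min = 0.1, max = 1)             # hypothetical positive doses
y <- 1 + 2 * dose - dose^2 + rnorm(n, sd = 0.1)  # hypothetical outcome changes

# degree = 3 with no interior knots: a 3-column basis that, together with
# an intercept, spans the same space as a global cubic polynomial
basis_global <- bs(dose, degree = 3)

# one interior knot (placed at the median here, as an assumption)
# adds one basis function
basis_knot <- bs(dose, degree = 3, knots = median(dose))

# The knot-free spline fit and a cubic polynomial fit give the same
# fitted values up to numerical error
fit_bs <- lm(y ~ bs(dose, degree = 3))
fit_poly <- lm(y ~ poly(dose, 3))
max_gap <- max(abs(fitted(fit_bs) - fitted(fit_poly)))

# Evaluate the fitted curve on a grid of dose percentiles, in the spirit
# of dvals (the exact grid cont_did uses by default is not assumed here)
dvals_sketch <- quantile(dose, probs = seq(0.1, 0.9, by = 0.1), names = FALSE)
curve_at_dvals <- predict(fit_bs, newdata = data.frame(dose = dvals_sketch))

print(c(ncol(basis_global), ncol(basis_knot)))
print(max_gap)
```

Adding knots makes the basis more flexible at the cost of more parameters to estimate, which is the bias-variance tradeoff noted for num_knots.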