simulateData {fetwfe} | R Documentation |
Generate Random Panel Data for FETWFE Simulations
Description
Generates a random panel data set for simulation studies of the fused extended two-way fixed
effects (FETWFE) estimator by taking an object of class "FETWFE_coefs" (produced by genCoefs())
and using it to simulate data. The function creates a balanced panel with N units over T time
periods, assigns treatment status across R treated cohorts (with equal marginal probabilities
for treatment and non-treatment), and constructs a design matrix along with the corresponding
outcome. The covariates are generated according to the specified distribution: by default,
covariates are drawn from a normal distribution; if distribution = "uniform", they are drawn
uniformly from [-\sqrt{3}, \sqrt{3}]. When d = 0 (i.e. no covariates), no covariate-related
columns or interactions are generated. See the simulation studies section of Faletto (2025)
for details.
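As an illustration of the assignment step, the following base R sketch draws a group label for each unit. It assumes that "equal marginal probabilities" means each of the R + 1 groups (never-treated plus R treated cohorts) is equally likely. It is not the package's internal code, and the names N_units, R_cohorts, cohort, and assignments are hypothetical.

```r
# Minimal sketch (an assumption, not the fetwfe internals): assign each of
# N_units units to the never-treated group (label 0) or to one of R_cohorts
# treated cohorts (labels 1..R_cohorts), each with equal probability.
set.seed(1)
N_units <- 120
R_cohorts <- 5

cohort <- sample(0:R_cohorts, size = N_units, replace = TRUE)

# Counts per group, ordered as never-treated, cohort 1, ..., cohort R.
# factor() with explicit levels keeps groups that received zero units.
assignments <- as.integer(table(factor(cohort, levels = 0:R_cohorts)))

length(assignments)  # R_cohorts + 1 groups
sum(assignments)     # N_units, since every unit is in exactly one group
```

Because the draw is random, individual group counts vary from run to run; only the number of groups and the total are fixed.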
Usage
simulateData(coefs_obj, N, sig_eps_sq, sig_eps_c_sq, distribution = "gaussian")
Arguments
coefs_obj |
An object of class "FETWFE_coefs", as produced by genCoefs(). |
N |
Integer. Number of units in the panel. |
sig_eps_sq |
Numeric. Variance of the idiosyncratic (observation-level) noise. |
sig_eps_c_sq |
Numeric. Variance of the unit-level random effects. |
distribution |
Character. Distribution used to generate covariates.
Defaults to "gaussian". |
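To make the roles of sig_eps_sq and sig_eps_c_sq concrete, here is a base R sketch of a two-component error term. It assumes the unit-level random effect and the idiosyncratic noise are independent, Gaussian, and additive, and that rows are ordered by unit; this page does not spell out that structure, so treat the sketch as an illustration only. All variable names other than sig_eps_sq and sig_eps_c_sq are hypothetical.

```r
# Minimal sketch (assumed error structure, not the fetwfe internals).
set.seed(1)
N_units <- 200
T_periods <- 10
sig_eps_sq <- 5     # variance of the idiosyncratic noise
sig_eps_c_sq <- 5   # variance of the unit-level random effects

# One random effect per unit, repeated across that unit's T_periods rows
unit_effect <- rep(rnorm(N_units, sd = sqrt(sig_eps_c_sq)), each = T_periods)

# One idiosyncratic draw per unit-period observation
idio <- rnorm(N_units * T_periods, sd = sqrt(sig_eps_sq))

err <- unit_effect + idio

# Under these assumptions the total error variance is the sum of the two
# components, so the sample variance should be close to 10.
var(err)
```

The unit-level component is what makes observations from the same unit correlated over time, while the idiosyncratic component is independent across all observations.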
Details
This function extracts simulation parameters from the FETWFE_coefs object and passes them,
along with additional simulation parameters, to the internal function simulateDataCore().
It validates that all necessary components are returned and assigns the S3 class
"FETWFE_simulated" to the output.
The argument distribution controls the generation of covariates. For "gaussian", covariates
are drawn from rnorm; for "uniform", they are drawn from runif on the interval
[-\sqrt{3}, \sqrt{3}] (which ensures that the covariates have unit variance regardless of
which distribution is chosen).
When d = 0 (i.e. no covariates), the function omits any covariate-related columns and their
interactions.
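The unit-variance claim can be checked directly: a uniform distribution on [a, b] has variance (b - a)^2 / 12, which equals 1 for a = -\sqrt{3}, b = \sqrt{3}. The base R sketch below compares that value with sample variances; the variable names are hypothetical and the code is independent of the package.

```r
# Minimal sketch: both covariate distributions have variance (about) 1.
set.seed(1)
n <- 1e5

x_gauss <- rnorm(n)                               # standard normal draws
x_unif <- runif(n, min = -sqrt(3), max = sqrt(3)) # uniform draws

# Theoretical variance of Uniform(a, b) is (b - a)^2 / 12
theory_var <- (2 * sqrt(3))^2 / 12  # equals 1 up to floating-point rounding

var(x_gauss)  # sample variance, close to 1
var(x_unif)   # sample variance, close to 1
```

Matching the variances means that switching distribution changes the shape of the covariates but not their scale.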
Value
An object of class "FETWFE_simulated", which is a list containing:
- pdata
A data frame containing generated data that can be passed to fetwfe().
- X
The design matrix X, with p columns, including interactions.
- y
A numeric vector of length N \times T containing the generated responses.
- covs
A character vector containing the names of the generated features (if d > 0), or simply an
empty vector (if d = 0).
- time_var
The name of the time variable in pdata.
- unit_var
The name of the unit variable in pdata.
- treatment
The name of the treatment variable in pdata.
- response
The name of the response variable in pdata.
- coefs
The coefficient vector \beta used for data generation.
- first_inds
A vector of indices indicating the first treatment effect for each treated cohort.
- N_UNTREATED
The number of never-treated units.
- assignments
A vector of counts (of length R+1) indicating how many units fall into the never-treated
group and each of the R treated cohorts.
- indep_counts
Independent cohort assignments (for auxiliary purposes).
- p
The number of columns in the design matrix X.
- N
Number of units.
- T
Number of time periods.
- R
Number of treated cohorts.
- d
Number of covariates.
- sig_eps_sq
The idiosyncratic noise variance.
- sig_eps_c_sq
The unit-level noise variance.
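The dimensions listed above imply a few consistency relations between the components. The sketch below wraps them in a hypothetical helper, check_sim_dims(), which is not part of fetwfe, and runs it on a toy list that only mimics the documented component names; it is not real simulateData() output.

```r
# Hypothetical helper: checks the dimensions implied by the component list.
# [[ ]] is used instead of $ to avoid partial matching of names such as
# "N" against "N_UNTREATED" or "p" against "pdata".
check_sim_dims <- function(sim) {
  stopifnot(
    length(sim[["y"]]) == sim[["N"]] * sim[["T"]],   # y has length N x T
    ncol(sim[["X"]]) == sim[["p"]],                  # X has p columns
    length(sim[["assignments"]]) == sim[["R"]] + 1,  # never-treated + R cohorts
    sum(sim[["assignments"]]) == sim[["N"]]          # each unit is in one group
  )
  invisible(TRUE)
}

# Toy stand-in for a "FETWFE_simulated" list (illustrative values only)
toy <- list(
  N = 6, T = 3, R = 2, p = 4,
  X = matrix(0, nrow = 18, ncol = 4),
  y = numeric(18),
  assignments = c(2, 3, 1)
)

ok <- check_sim_dims(toy)  # returns TRUE invisibly when all checks pass
```

With the package installed, the same helper could presumably be applied to the object returned by simulateData(), provided the components are named as listed above.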
References
Faletto, G. (2025). Fused Extended Two-Way Fixed Effects for Difference-in-Differences with Staggered Adoptions. arXiv preprint arXiv:2312.05985. https://arxiv.org/abs/2312.05985
Examples
## Not run:
# Generate coefficients
coefs <- genCoefs(R = 5, T = 30, d = 12, density = 0.1, eff_size = 2, seed = 123)
# Simulate data using the coefficients
sim_data <- simulateData(coefs, N = 120, sig_eps_sq = 5, sig_eps_c_sq = 5)
## End(Not run)