generate_qualitative_data_iv {causalQual} | R Documentation |
Generate Qualitative Data (Instrumental Variables)
Description
Generate a synthetic data set with qualitative outcomes under an instrumental variables design. The data include a binary treatment indicator and a binary instrument. Potential outcomes and potential treatments are independent of the instrument. Moreover, the instrument does not directly impact potential outcomes, has an impact on treatment probability, and can only increase the probability of treatment.
Usage
generate_qualitative_data_iv(n, outcome_type)
Arguments
n | Sample size. |
outcome_type | String controlling the outcome type. Must be either "multinomial" or "ordered". |
Details
Outcome type
Potential outcomes are generated differently according to outcome_type. If outcome_type == "multinomial", generate_qualitative_data_iv computes linear predictors for each class using the covariates:
\eta_{mi} (d) = \beta_{m1}^d X_{i1} + \beta_{m2}^d X_{i2} + \beta_{m3}^d X_{i3}, \quad d = 0, 1,
and then transforms \eta_{mi} (d) into valid probability distributions using the softmax function:
P(Y_i(d) = m | X_i) = \frac{\exp(\eta_{mi} (d))}{\sum_{m'} \exp(\eta_{m'i}(d))}, \quad d = 0, 1.
It then generates potential outcomes Y_i(1) and Y_i(0) by sampling from \{1, 2, 3\} using P(Y_i(d) = m | X_i), \, d = 0, 1.
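The multinomial step can be sketched in a few lines of base R. This is only an illustration of the softmax-and-sample logic described above, not the package's internal code: the coefficient matrix `beta` is hypothetical (the documentation does not report the values used), and a single treatment state d is shown.

```r
set.seed(1)
n <- 5
X <- matrix(runif(n * 3), nrow = n, ncol = 3)   # three U(0,1) covariates

# Hypothetical coefficients (assumption): rows are classes m = 1, 2, 3,
# columns are the three covariates, for one fixed treatment state d.
beta <- matrix(c(0.5, -1.0,  0.2,
                 0.0,  0.3,  0.8,
                 1.0,  0.1, -0.5), nrow = 3, byrow = TRUE)

eta <- X %*% t(beta)                    # n x 3 matrix of linear predictors
eta <- eta - apply(eta, 1, max)         # subtract each row's max for numerical stability
probs <- exp(eta) / rowSums(exp(eta))   # softmax: every row sums to one

# Draw one class per unit from that unit's own probability vector.
y <- apply(probs, 1, function(p) sample(1:3, size = 1, prob = p))
```

Repeating the same steps with a second, treatment-specific coefficient matrix would give the other potential outcome.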
If instead outcome_type == "ordered", generate_qualitative_data_iv first generates latent potential outcomes:
Y_i^* (d) = \tau d + X_{i1} + X_{i2} + X_{i3} + N (0, 1), \quad d = 0, 1,
with \tau = 2. It then constructs Y_i (d) by discretizing Y_i^* (d) using threshold parameters \zeta_1 = 2 and \zeta_2 = 4, so that Y_i(d) = m whenever \zeta_{m-1} < Y_i^*(d) \leq \zeta_m, with the conventions \zeta_0 = -\infty and \zeta_3 = +\infty. Then,
P(Y_i(d) = m | X_i) = P(\zeta_{m-1} < Y_i^*(d) \leq \zeta_m | X_i) = \Phi (\zeta_m - \sum_j X_{ij} - \tau d) - \Phi (\zeta_{m-1} - \sum_j X_{ij} - \tau d), \quad d = 0, 1,
which allows us to analytically compute the local probabilities of shift.
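As a rough illustration, the class probabilities in the ordered case can be evaluated directly with `pnorm`. The sketch below assumes the outer thresholds are -Inf and +Inf and uses made-up covariate values for a single unit; `class_probs` is a hypothetical helper, not a function exported by the package.

```r
tau  <- 2
zeta <- c(-Inf, 2, 4, Inf)   # outer thresholds assumed infinite; inner ones are zeta_1 = 2, zeta_2 = 4
x    <- c(0.2, 0.5, 0.9)     # hypothetical covariate values for one unit

# Hypothetical helper: P(Y(d) = m | X = x) for m = 1, 2, 3.
class_probs <- function(x, d) {
  mu <- sum(x) + tau * d     # mean of the latent outcome
  diff(pnorm(zeta - mu))     # differences of the normal CDF at consecutive thresholds
}

p0 <- class_probs(x, d = 0)
p1 <- class_probs(x, d = 1)
p1 - p0                      # change in each class probability induced by treatment
```

Because \tau is positive, treatment moves probability mass toward the higher classes for every covariate value.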
Treatment assignment and instrument
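The instrument is always generated as Z_i \sim \text{Bernoulli}(0.5). Treatment is always modeled as D_i \sim \text{Bernoulli}(\pi(X_i, Z_i)), with
\pi(X_i, Z_i) = P ( D_i = 1 | X_i, Z_i) = (X_{i1} + X_{i3} + Z_i) / 3.
Since the covariates lie in [0, 1], \pi(X_i, Z_i) is always a valid probability. Switching Z_i from 0 to 1 raises the probability of treatment intake by 1/3 for every unit, so Z_i can increase the probability of treatment intake but cannot decrease it.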
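The assignment mechanism is simple enough to reproduce by hand. The following sketch mirrors the formulas above in base R; the variable names (`pi_xz`, `first_stage`) are illustrative choices and are not taken from the package.

```r
set.seed(1)
n <- 1000
X <- matrix(runif(n * 3), nrow = n)       # three independent U(0,1) covariates
Z <- rbinom(n, size = 1, prob = 0.5)      # binary instrument

pi_xz <- (X[, 1] + X[, 3] + Z) / 3        # P(D = 1 | X, Z)
D <- rbinom(n, size = 1, prob = pi_xz)    # treatment indicator

# Effect of the instrument on the treatment probability, unit by unit:
# it equals 1/3 for every unit, so it is never negative.
first_stage <- (X[, 1] + X[, 3] + 1) / 3 - (X[, 1] + X[, 3] + 0) / 3
range(first_stage)
```

Comparing `mean(D[Z == 1])` with `mean(D[Z == 0])` in a large sample should give a difference close to 1/3.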
Other details
The function always generates three independent covariates from U(0,1). Observed outcomes Y_i are always constructed using the usual observational rule, Y_i = D_i Y_i(1) + (1 - D_i) Y_i(0).
Value
A list storing a data frame with the observed data, the true propensity score, the true instrument propensity score, and the true local probabilities of shift.
Author(s)
Riccardo Di Francesco
See Also
generate_qualitative_data_soo
generate_qualitative_data_rd
generate_qualitative_data_did
Examples
## Generate synthetic data.
set.seed(1986)
data <- generate_qualitative_data_iv(100, outcome_type = "ordered")
data$local_pshifts