sim_A {BMIselect} | R Documentation |
Simulate dataset A: Independent continuous covariates with MCAR/MAR missingness
Description
Generates a dataset for Scenario A used in Bayesian MI-LASSO benchmarking. Covariates are drawn iid from a standard normal distribution, and the outcome is a linear function of the covariates with a fixed true coefficient vector. Missingness is then imposed on specified columns under MCAR or MAR, and multiple imputations are generated via predictive mean matching.
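The data-generating steps described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the coefficient vector `beta_true` is hypothetical, and the convention that the error variance equals the signal variance divided by `SNP` is an assumption made for the example.

```r
set.seed(123)
n <- 100
p <- 20
SNP <- 1.5

# Covariates: iid standard normal
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Hypothetical sparse coefficient vector (not the one used by sim_A)
beta_true <- c(rep(1, 5), rep(0, p - 5))

# Linear signal; noise variance chosen so that var(signal) / var(noise) = SNP
# (assumed convention)
signal <- drop(X %*% beta_true)
sigma2 <- var(signal) / SNP
y <- signal + rnorm(n, sd = sqrt(sigma2))

# Logical indicator of the truly nonzero coefficients
important <- beta_true != 0
```

Under this convention, a larger `SNP` gives a smaller error variance and therefore an easier variable-selection problem.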
Usage
sim_A(
n = 100,
p = 20,
type = "MAR",
SNP = 1.5,
low_missing = TRUE,
n_imp = 5,
seed = NULL
)
Arguments
n |
Integer. Number of observations. |
p |
Integer. Number of covariates (columns). Takes values in {20, 40}. |
type |
Character. Missingness mechanism: "MCAR" or "MAR". |
SNP |
Numeric. Signal-to-noise ratio controlling error variance. |
low_missing |
Logical. If TRUE, use low missingness rates; if FALSE, use higher missingness rates. |
n_imp |
Integer. Number of multiple imputations to generate. |
seed |
Integer or NULL. Random seed for reproducibility. |
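The difference between the two values of `type` can be illustrated with a small base R sketch. The missingness rate of 0.2 and the logistic intercept and slope below are hypothetical choices for the example; the rates and models that sim_A actually uses are not stated here.

```r
set.seed(1)
n <- 1000
x1 <- rnorm(n)   # fully observed covariate
x2 <- rnorm(n)   # covariate that will receive missing values

# MCAR: every entry of x2 is missing with the same probability
miss_mcar <- rbinom(n, size = 1, prob = 0.2) == 1

# MAR: the probability that x2 is missing depends on the observed x1
# (logistic model with hypothetical intercept and slope)
p_mar <- plogis(-1.5 + 1.0 * x1)
miss_mar <- rbinom(n, size = 1, prob = p_mar) == 1

x2_mcar <- ifelse(miss_mcar, NA, x2)
x2_mar  <- ifelse(miss_mar,  NA, x2)

# Under MAR, units with missing x2 tend to have larger x1
c(mean(x1[miss_mar]), mean(x1[!miss_mar]))
```

Under MCAR the complete cases remain a random subsample, whereas under MAR they are systematically different in the observed covariate, which is why complete-case analysis and multiple imputation can behave differently in the two settings.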
Value
A list with components:
- data_O
A list containing the complete covariate matrix and outcomes before missingness is imposed.
- data_mis
A list containing the covariate matrix and outcomes with missing values.
- data_MI
A list containing an array of imputed covariates (n_imp × n × p) and a matrix of imputed outcomes (n_imp × n).
- data_CC
A list containing the complete-case covariate matrix and outcomes.
- important
Logical vector indicating which coefficients are truly nonzero.
- covmat
True covariance matrix used for X.
- beta
True coefficient vector.
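To give a feel for how imputed values such as those in data_MI might arise, here is a simplified, single-variable sketch of predictive mean matching in base R. It is an illustration under stated assumptions, not the routine sim_A calls: the helper `pmm_impute_once` and the donor-pool size `k = 5` are hypothetical, and the sketch omits the parameter-draw step that full PMM implementations include.

```r
set.seed(42)
n <- 200
n_imp <- 5
x <- rnorm(n)
z <- 0.8 * x + rnorm(n, sd = 0.5)   # variable to be imputed
z[sample(n, 40)] <- NA              # impose missingness on 40 entries

# Hypothetical helper: one simplified PMM imputation of z given x
pmm_impute_once <- function(z, x, k = 5) {
  obs <- !is.na(z)
  fit <- lm(z ~ x, subset = obs)
  pred <- predict(fit, newdata = data.frame(x = x))
  z_imp <- z
  for (i in which(!obs)) {
    # k observed units whose predicted means are closest to unit i's
    donors <- which(obs)[order(abs(pred[obs] - pred[i]))[seq_len(k)]]
    # Borrow the observed value of one randomly chosen donor
    z_imp[i] <- z[donors[sample.int(k, 1)]]
  }
  z_imp
}

# Stack n_imp imputations into an n_imp x n matrix
Z_imp <- t(replicate(n_imp, pmm_impute_once(z, x)))
dim(Z_imp)
```

Because each imputed value is copied from an observed donor, the imputations always stay within the range of the observed data; observed entries are left unchanged across all imputations.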
Examples
sim <- sim_A(n = 100, p = 20, type = "MAR", SNP = 1.5,
low_missing = TRUE, n_imp = 5, seed = 123)
str(sim)