generate_simulated_data {modgo}    R Documentation

Generate a new data set using a previous correlation matrix

Description

This function is used internally by modgo. It computes the correlation matrix of the transformed variables, which are assumed to follow a multivariate normal distribution.
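As a rough illustration of the multivariate normal assumption, the sketch below draws correlated normal values from a given correlation matrix using only base R. It is not modgo's internal code: the matrix `sigma`, the sample size, and the Cholesky-based approach are assumptions chosen for illustration.

```r
# Sketch only: base R, not modgo's internal implementation.
set.seed(1)

# Hypothetical 3 x 3 correlation matrix of the transformed variables.
sigma <- matrix(c(1.0, 0.5, 0.2,
                  0.5, 1.0, 0.3,
                  0.2, 0.3, 1.0), nrow = 3, byrow = TRUE)

n_samples <- 5000

# chol() returns the upper-triangular factor R with t(R) %*% R equal to sigma.
chol_factor <- chol(sigma)

# Independent standard normal draws, one column per variable.
z <- matrix(rnorm(n_samples * ncol(sigma)), nrow = n_samples)

# Right-multiplying by R gives rows whose covariance is t(R) %*% R = sigma.
sim <- z %*% chol_factor

# The sample correlation should be close to sigma, up to sampling error.
round(cor(sim), 2)
```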

Usage

generate_simulated_data(
  data,
  df_sim,
  variables,
  bin_variables,
  categ_variables,
  count_variables,
  n_samples,
  generalized_mode,
  generalized_mode_lmbds,
  multi_sugg_prop,
  pertr_vec,
  var_infl,
  infl_cov_stable
)

Arguments

data

a data frame with original variables.

df_sim

a data frame with simulated values.

variables

a character vector indicating which columns of data should be used.

bin_variables

a character vector listing the binary variables.

categ_variables

a character vector listing the ordinal categorical variables.

count_variables

a character vector listing the count variables, a subcategory of the categorical variables. Count variables must also be included in categ_variables. They are treated differently when gldex is used to simulate them.

n_samples

Number of rows of each simulated data set. Default is the number of rows of data.

generalized_mode

A logical value indicating whether generalized lambda/Poisson distributions or set-up thresholds will be used to generate the simulated values.

generalized_mode_lmbds

A matrix that contains the lmbds values for each variable of the data set, to be used for the Generalized Lambda Distribution, the Generalized Poisson Distribution, or setting up thresholds.

multi_sugg_prop

A named vector that provides a target proportion of value = 1 for specific binary variables (given by the names of the vector). The proportion of this value in the simulated data sets will be close to the target.

pertr_vec

A named vector. The names of the vector are the continuous variables that the user wants to perturb. The variance of the simulated data set mimics the variance of the original data.

var_infl

A named vector. The names of the vector are the continuous variables that the user wants to perturb while increasing their variance.

infl_cov_stable

Logical value. If TRUE, perturbation is applied to the original data set and the simulated values mimic the perturbed original data set. The covariance matrix used for simulation equals the original data's correlations. If FALSE, perturbation is applied to the simulated data sets.
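The named-vector arguments above might be built along the following lines. The variable names (`sex`, `age`, `bmi`) and all numeric values are hypothetical, and this page does not state what the values of pertr_vec and var_infl mean, so they appear here only as placeholders. The second half shows, in plain base R rather than modgo's own code, why thresholding a standard normal variable at `qnorm(1 - p)` gives a proportion of ones close to `p`.

```r
# Hypothetical variable names and values, for illustration only.
multi_sugg_prop <- c(sex = 0.4)  # target proportion of 1s for the binary variable `sex`
pertr_vec       <- c(age = 0.1)  # names: continuous variables to perturb (value is a placeholder)
var_infl        <- c(bmi = 0.2)  # names: variables to perturb with inflated variance (placeholder)

# For a standard normal Z, P(Z > qnorm(1 - p)) = p.
set.seed(2)
p <- multi_sugg_prop[["sex"]]
threshold <- qnorm(1 - p)

# Dichotomise simulated normal values at the threshold.
z <- rnorm(10000)
sex_sim <- as.integer(z > threshold)

# The observed proportion of 1s is close to p, up to sampling error.
mean(sex_sim)
```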

Value

A data frame with simulated values.
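One way to sanity-check a simulated data frame is to compare its columns and correlation matrix with those of the original data. The sketch below uses stand-in objects built with base R: `data` is a made-up original data set and `df_sim` is a bootstrap resample used as a placeholder, not actual output of this function.

```r
set.seed(3)

# Stand-in for the original data (hypothetical variables).
data <- data.frame(age = rnorm(500, mean = 50, sd = 10))
data$bmi <- 20 + 0.1 * data$age + rnorm(500, mean = 0, sd = 2)

# Placeholder for a simulated data frame: a bootstrap resample, NOT modgo output.
df_sim <- data[sample(nrow(data), replace = TRUE), ]
rownames(df_sim) <- NULL

# The simulated data frame should carry the same columns as the original.
same_columns <- identical(names(df_sim), names(data))

# Largest absolute difference between the two correlation matrices.
max_cor_diff <- max(abs(cor(df_sim) - cor(data)))

same_columns
max_cor_diff
```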

Author(s)

Francisco M. Ojeda, George Koliopanos


[Package modgo version 1.0.1 Index]