fngendata {GCEstim} R Documentation

Data generating function

Description

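Generates simulated data: a set of independent variables and a dependent variable y built from user-specified coefficients, an intercept, and an error term.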

Usage

fngendata(
  n,
  bin.k = 0,
  bin.prob = NULL,
  cont.k = 5,
  y.gen.bin.k = 0,
  y.gen.bin.beta = NULL,
  y.gen.bin.prob = NULL,
  y.gen.cont.beta = c(2, 4, 6, 8, 10),
  y.gen.cont.mod.k = 0,
  y.gen.cont.mod.beta = matrix(c(-2, 2), 1, 2, byrow = TRUE),
  y.gen.bin.mod.prob = c(0.5),
  y.gen.cont.sp.k = 0,
  y.gen.cont.sp.groups = 2,
  y.gen.cont.sp.rho = 0.2,
  y.gen.cont.sp.dif = 1,
  intercept.beta = 0,
  Xgenerator.method = "simstudy",
  corMatrix = 100,
  rho = NULL,
  corstr = NULL,
  condnumber = 1,
  mu = 0,
  muvect = NULL,
  sd = 1,
  sdvect = NULL,
  error.dist = "normal",
  error.dist.mean = 0,
  error.dist.sd = 1,
  error.dist.snr = NULL,
  error.dist.df = 2,
  dataframe = TRUE,
  seed = NULL
)

Arguments

n

Number of individuals.

bin.k

Number of binary variables not used for generating y.

bin.prob

A vector of probabilities with length equal to bin.k.

cont.k

Number of continuous variables not used for generating y.

y.gen.bin.k

Number of binary variables used for generating y.

y.gen.bin.beta

A vector of coefficients with length equal to y.gen.bin.k used to generate y.

y.gen.bin.prob

A vector of probabilities with length equal to y.gen.bin.k.

y.gen.cont.beta

A vector of coefficients, one for each continuous variable used to generate y.

y.gen.cont.mod.k

Experimental

y.gen.cont.mod.beta

Experimental

y.gen.bin.mod.prob

Experimental

y.gen.cont.sp.k

Experimental

y.gen.cont.sp.groups

Experimental

y.gen.cont.sp.rho

Experimental

y.gen.cont.sp.dif

Experimental

intercept.beta

Value for the constant used to generate y.

Xgenerator.method

Method used to generate X data ("simstudy" or "svd").

corMatrix

A positive number for alphad (see rcorrmatrix), NULL, or a correlation matrix. Used when Xgenerator.method is "simstudy".

rho

Correlation coefficient, -1 <= rho <= 1. Used when Xgenerator.method is "simstudy" and corMatrix is NULL.

corstr

Correlation structure ("ind", "cs" or "ar1") (see genCorData). Used when Xgenerator.method is "simstudy" and corMatrix is NULL.

condnumber

A value for the condition number of the X matrix. Used when Xgenerator.method is "svd".

mu

The mean of the variables. To be used when all variables have the same mean.

muvect

A vector of means. To be used when variables have different means. The length of muvect must be k.

sd

Standard deviation of the variables. To be used when all variables have the same standard deviation.

sdvect

A vector of standard deviations. To be used when variables have different standard deviations. The length of sdvect must be k.

error.dist

Distribution of the error: "normal" for the normal distribution or "t" for Student's t distribution.

error.dist.mean

Mean value used when error.dist is "normal".

error.dist.sd

Standard deviation value used when error.dist is "normal".

error.dist.snr

Signal-to-noise ratio. If not NULL, the value of error.dist.sd is ignored and the error standard deviation is determined from this ratio.

error.dist.df

Degrees of freedom used when error.dist is "t".

dataframe

Logical. If TRUE (the default), returns a data.frame; otherwise returns a list.

seed

A seed for reproducibility.
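
The condnumber argument sets the condition number of X when Xgenerator.method is "svd". As a rough illustration of the idea, and not necessarily the procedure GCEstim uses internally, the base R sketch below rescales the singular values of a random matrix so that the ratio of the largest to the smallest equals a target value. The names Z, target and d_new are hypothetical.

```r
# Sketch only: build a matrix with a prescribed 2-norm condition number.
set.seed(1)
n <- 100
k <- 3
target <- 50                          # desired condition number (assumed value)

Z <- matrix(rnorm(n * k), n, k)       # arbitrary starting matrix
s <- svd(Z)                           # Z = U diag(d) V'

# Replace the singular values so that max(d_new) / min(d_new) == target
d_new <- seq(target, 1, length.out = k)
X <- s$u %*% diag(d_new) %*% t(s$v)

# kappa(..., exact = TRUE) returns the ratio of extreme singular values
kappa(X, exact = TRUE)
```

Any set of singular values with the same extreme ratio would give the same condition number; the linear spacing here is just one convenient choice.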

Value

A data.frame or, if dataframe is FALSE, a list composed of a matrix of independent variable values (X), a vector of dependent variable values (y), a vector of coefficient values (coefficients), a vector of non-zero coefficients (y.coefficients), and a vector of error values (epsilon).
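
To see how these components could fit together, the following base R sketch builds y from an intercept, a linear combination of the columns of X, and an error term, and derives the error standard deviation from a signal-to-noise ratio. It assumes the ratio is defined as var(signal) / var(error); the definition used inside fngendata may differ, and all object names are illustrative.

```r
# Sketch only: linear data-generating process with a target signal-to-noise ratio.
set.seed(1)
n <- 100
beta <- c(3, 6, 9)                    # coefficients of the variables that generate y
intercept <- 1
snr <- 5                              # assumed definition: var(signal) / var(error)

X <- matrix(rnorm(n * length(beta)), n, length(beta))
signal <- drop(X %*% beta)            # noise-free part of y, without the intercept

# Choose the error standard deviation that yields the requested ratio
sd_eps <- sqrt(var(signal) / snr)
epsilon <- rnorm(n, mean = 0, sd = sd_eps)

y <- intercept + signal + epsilon
```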

Author(s)

Jorge Cabral, jorgecabral@ua.pt

Examples


dataGCEstim <- fngendata(
  n = 100, cont.k = 2,
  y.gen.cont.beta = c(3, 6, 9),
  intercept.beta = 1,
  Xgenerator.method = "svd", condnumber = 50,
  mu = 0, sd = 1,
  error.dist = "normal", error.dist.mean = 0, error.dist.snr = 5,
  dataframe = TRUE, seed = 230676)

summary(dataGCEstim)
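
As a variation on the example above, the sketch below requests a list instead of a data.frame by setting dataframe = FALSE, then inspects the top-level components. According to the Value section, these should be X, y, coefficients, y.coefficients and epsilon. The object name simList is arbitrary, and the call is guarded so that nothing runs if GCEstim is not installed.

```r
# Same settings as the example above, but returning a list
if (requireNamespace("GCEstim", quietly = TRUE)) {
  simList <- GCEstim::fngendata(
    n = 100, cont.k = 2,
    y.gen.cont.beta = c(3, 6, 9),
    intercept.beta = 1,
    Xgenerator.method = "svd", condnumber = 50,
    mu = 0, sd = 1,
    error.dist = "normal", error.dist.mean = 0, error.dist.snr = 5,
    dataframe = FALSE, seed = 230676)

  # Show the names and types of the returned components
  str(simList, max.level = 1)
}
```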


[Package GCEstim version 0.1.0 Index]