synthesize {synthesizer}    R Documentation

Create synthetic version of a dataset

Description

Create n values or records based on the empirical (multivariate) distribution of x. For data frames, the synthetic variables can be decorrelated from the original variables by lowering the value of the rankcor parameter.
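As a rough illustration of what sampling from an empirical distribution can look like for a single numeric vector, the base R sketch below draws values by inverse transform sampling from the empirical quantile function. It is not taken from the package source: the helper name sample_empirical is hypothetical, and both the linear interpolation (type = 7) and the handling of missing values are assumptions made for this sketch.

```r
# Hypothetical helper, not part of the synthesizer package: draw n values
# from the empirical distribution of a numeric vector.
sample_empirical <- function(x, n = length(x)) {
  x <- x[!is.na(x)]            # assumption: this sketch drops missing values
  u <- runif(n)                # uniform probabilities in [0, 1]
  # quantile() with type = 7 interpolates linearly between order statistics
  unname(quantile(x, probs = u, type = 7))
}

set.seed(1)
syn <- sample_empirical(cars$speed, n = 10)
```

Because the draws are interpolated between observed values, they always fall within the range of the original data but need not coincide with any observed value.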

Usage

synthesize(x, na.rm = FALSE, n = NROW(x), rankcor = 1)

Arguments

x

[vector|data.frame] data to synthesize.

na.rm

[logical] Remove missing values before creating a synthesizer. Set to TRUE to avoid synthesizing missing values.

n

[integer] Number of values or records to synthesize.

rankcor

[numeric] in [0,1]. Either a single rank correlation value that is applied to all variables, or a named vector of the form c(variable1=utility1,...). Variables not explicitly mentioned get rankcor=1. See also the Note below. Ignored unless x is of class data.frame.

Value

A data object of the same type and structure as x.

Note

The utility of a synthetic variable is lowered by reducing the rank correlation between the real and synthetic data. If rankcor=1, the synthetic data will be ordered such that it has the same rank order as the original data. If rankcor=0, no such reordering will take place. For values between 0 and 1, blocks of data are randomly selected and randomly permuted iteratively until the rank correlation between original and synthetic data drops below the parameter.
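The reordering described in this note can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helpers match_ranks and decorrelate are hypothetical, a "block" is taken to be a random set of positions, and the block size and iteration cap are arbitrary choices.

```r
# Hypothetical helpers, not part of the synthesizer package.

# Reorder synthetic values so that they follow the rank order of the original.
match_ranks <- function(original, synthetic) {
  sort(synthetic)[rank(original, ties.method = "first")]
}

# Permute randomly chosen blocks until the Spearman rank correlation with
# the original drops below `target` (or max_iter is reached).
decorrelate <- function(original, synthetic, target,
                        block = 5L, max_iter = 1000L) {
  n <- length(synthetic)
  for (i in seq_len(max_iter)) {
    if (cor(original, synthetic, method = "spearman") < target) break
    idx <- sample.int(n, block)              # assumption: block = random positions
    synthetic[idx] <- synthetic[sample(idx)] # shuffle values within the block
  }
  synthetic
}

set.seed(42)
orig <- cars$dist
syn  <- sample(orig, replace = TRUE)          # stand-in for a synthetic draw
syn1 <- match_ranks(orig, syn)                # analogue of rankcor = 1
syn2 <- decorrelate(orig, syn1, target = 0.5) # analogue of rankcor = 0.5
```

Both steps only rearrange the synthetic values, so the marginal distribution of the synthetic variable is unchanged; only its alignment with the original records is weakened.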

See Also

Other synthesis: make_synthesizer()

Examples

synthesize(cars$speed, n = 10)
synthesize(cars)
synthesize(cars, n = 25)

s1 <- synthesize(iris, rankcor=1)
s2 <- synthesize(iris, rankcor=0.5)
s3 <- synthesize(iris, rankcor=c("Species"=0.5))

oldpar <- par(mfrow=c(2,2), pch=16, las=1)
plot(Sepal.Length ~ Sepal.Width, data=iris, col=iris$Species, main="Iris")
plot(Sepal.Length ~ Sepal.Width, data=s1, col=s1$Species, main="Synthetic Iris")
plot(Sepal.Length ~ Sepal.Width, data=s2, col=s2$Species, main="Low utility Iris")
plot(Sepal.Length ~ Sepal.Width, data=s3, col=s3$Species, main="Low utility Species")
par(oldpar)



[Package synthesizer version 0.5.0 Index]