create_synthetic_data {spect}    R Documentation

Generate a synthetic survival data set of streaming service subscriptions

Description

Generates a synthetic survival data set describing streaming service subscriptions. The survival event is cancellation of the subscription, which is modeled as a function of household income and the average number of hours watched in the prior month. The supplied parameters control the level of censoring and the variance in the data; calling the function with no arguments produces a default distribution.

Usage

create_synthetic_data(
  sample_size = 250,
  minimum_income = 5000,
  median_income = 50000,
  income_variance = 10000,
  min_watchhours = 0,
  max_watchhours = 6,
  censor_percentage = 0,
  min_censor_amount = 0,
  max_censor_amount = 0,
  study_time_in_months = 48,
  perturbation_shift = 0
)

Arguments

sample_size

optional - size of the sample population to generate

minimum_income

optional - minimum household income used to generate the distribution

median_income

optional - median household income used to generate the distribution

income_variance

optional - variance to use when generating the household income distribution

min_watchhours

optional - minimum average number of hours watched used to generate the distribution

max_watchhours

optional - maximum average number of hours watched used to generate the distribution

censor_percentage

optional - percentage of population to artificially censor

min_censor_amount

optional - minimum number of months of censoring to apply to the censored population

max_censor_amount

optional - maximum number of months of censoring to apply to the censored population

study_time_in_months

optional - observation horizon in months

perturbation_shift

optional - bound on the amount by which the formulaic result is randomly perturbed; zero means no perturbation

Value

A survival data set suitable for modeling using spect_train.

Author(s)

Stephen Abrams, stephen.abrams@louisville.edu

Examples

data <- create_synthetic_data()
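
The default call above can also be varied. The following is a minimal sketch that overrides a few of the documented arguments; the specific values are illustrative only, and the use of set.seed assumes the generator draws from R's standard random number stream, which this page does not confirm. The structure of the returned object is not described here, so the sketch only inspects it with str.

```r
library(spect)

# Assumption: the function uses R's standard RNG, so seeding makes the
# result repeatable. This is not stated in the documentation above.
set.seed(42)

# Override a subset of the documented arguments; all others keep their defaults.
# The values below are arbitrary illustrations, not recommended settings.
custom_data <- create_synthetic_data(
  sample_size = 500,
  median_income = 60000,
  min_watchhours = 1,
  max_watchhours = 4,
  study_time_in_months = 24
)

# Inspect whatever structure comes back before passing it to spect_train.
str(custom_data)
```

The censoring arguments (censor_percentage, min_censor_amount, max_censor_amount) are left at their defaults here because this page does not state the scale expected for censor_percentage.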


[Package spect version 1.0 Index]