create_synthetic_data {spect} | R Documentation |
Description
Generates a synthetic survival data set of streaming service subscriptions. The survival event is cancellation of the subscription, which is modeled as a function of household income and the average number of hours watched in the prior month. Users can adjust the level of censoring and variance in the data with the supplied parameters, or call the function with no arguments to get a default distribution of data.
Usage
create_synthetic_data(
sample_size = 250,
minimum_income = 5000,
median_income = 50000,
income_variance = 10000,
min_watchhours = 0,
max_watchhours = 6,
censor_percentage = 0,
min_censor_amount = 0,
max_censor_amount = 0,
study_time_in_months = 48,
perturbation_shift = 0
)
Arguments
sample_size |
optional - size of the sample population to generate |
minimum_income |
optional - minimum household income used to generate the distribution |
median_income |
optional - median household income used to generate the distribution |
income_variance |
optional - variance to use when generating the household income distribution |
min_watchhours |
optional - minimum average number of hours watched used to generate the distribution |
max_watchhours |
optional - maximum average number of hours watched used to generate the distribution |
censor_percentage |
optional - percentage of population to artificially censor |
min_censor_amount |
optional - minimum number of months of censoring to apply to the censored population |
max_censor_amount |
optional - maximum number of months of censoring to apply to the censored population |
study_time_in_months |
optional - observation horizon in months |
perturbation_shift |
optional - bound on the amount by which the formulaic result is randomly perturbed; use zero for no perturbation |
Value
A survival data set suitable for modeling using spect_train.
Author(s)
Stephen Abrams, stephen.abrams@louisville.edu
Examples
data <- create_synthetic_data()
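
A call that overrides some of the defaults might look like the sketch below. It uses only the argument names listed under Usage. The specific values are illustrative assumptions; in particular, this page does not say whether censor_percentage expects a fraction or a value between 0 and 100, so the 0.25 here is a guess. The seed is set on the assumption that the generator relies on R's standard random number functions. The name custom_data is hypothetical.

```r
# Assumes the spect package is already installed.
library(spect)

# Fix the seed so repeated runs produce the same draw (assumes the
# generator uses R's standard random number functions).
set.seed(42)

custom_data <- create_synthetic_data(
  sample_size = 500,            # larger sample than the default of 250
  censor_percentage = 0.25,     # ASSUMPTION: treated as a fraction; scale not confirmed here
  min_censor_amount = 1,        # censor by at least 1 month
  max_censor_amount = 12,       # censor by at most 12 months
  study_time_in_months = 36,    # shorter observation horizon than the default of 48
  perturbation_shift = 3        # non-zero, so results are randomly perturbed
)

# Inspect the structure of the result before passing it to a model.
str(custom_data)
```

Inspecting the result with str() before modeling is a cheap way to confirm how the censoring arguments were interpreted.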