remove_constants {cleanepi} | R Documentation
Remove constant data, including empty rows, empty columns, and columns with constant values.
Description
The function iteratively removes constant data until none remain. It records details of the removed constant data as a data frame within the report object.
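The iteration matters because one removal can expose another: dropping a constant column may leave a row with nothing but missing values. The following base R sketch illustrates that idea only. It is not the cleanepi implementation, it has no cut-off and no report, and `drop_constants_sketch` and the `toy` data are hypothetical names made up for this illustration.

```r
# Hypothetical helper, NOT part of cleanepi: a base R sketch of
# iterative removal of empty rows, empty columns, and constant columns.
drop_constants_sketch <- function(df) {
  repeat {
    dims_before <- dim(df)
    # drop rows in which every value is missing
    df <- df[rowSums(!is.na(df)) > 0, , drop = FALSE]
    # drop columns in which every value is missing
    df <- df[, colSums(!is.na(df)) > 0, drop = FALSE]
    # drop columns with at most one distinct non-missing value
    is_constant <- vapply(
      df,
      function(x) length(unique(x[!is.na(x)])) <= 1,
      logical(1)
    )
    df <- df[, !is_constant, drop = FALSE]
    # stop once a full pass removes nothing
    if (identical(dim(df), dims_before)) break
  }
  df
}

# toy data (assumed for illustration): row 4 only becomes empty
# after the constant column `site` has been dropped
toy <- data.frame(
  id    = c("a", "b", "c", NA),
  site  = c("x", "x", "x", "x"),
  empty = NA,
  age   = c(10, 20, 30, NA)
)
result <- drop_constants_sketch(toy)
```

In this toy case the first pass drops `empty` and `site`, and only the second pass can drop row 4, so `result` keeps the columns `id` and `age` and three rows.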
Usage
remove_constants(data, cutoff = 1)
Arguments
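data |
The input dataset. |
cutoff |
A numeric value between 0 and 1 giving the proportion of constant values at or above which a row or column is removed. The default, 1, removes only rows and columns that are entirely constant. |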
Value
The input dataset with the constant data removed according to the specified cut-off.
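As a rough illustration of how a cut-off below 1 could be applied, the sketch below flags rows and columns whose proportion of missing values is equal to or greater than the cut-off. It assumes the cut-off is compared against the share of missing values, which is an interpretation of the examples on this page rather than a confirmed description of the package internals. The `toy` data and all variable names are hypothetical.

```r
# Assumption: the cut-off is compared with the proportion of missing
# values per row and per column. This is an illustration only and is
# not taken from the cleanepi source code.
toy <- data.frame(
  a = c(1, NA, NA, NA),
  b = c(1, 2, NA, 4),
  c = c("u", "v", NA, "w")
)
cutoff <- 0.5

# proportion of missing values in each row and in each column
row_missing <- rowMeans(is.na(toy))
col_missing <- colMeans(is.na(toy))

# flag anything at or above the cut-off
rows_to_drop <- row_missing >= cutoff
cols_to_drop <- col_missing >= cutoff

filtered <- toy[!rows_to_drop, !cols_to_drop, drop = FALSE]
```

With `cutoff <- 0.5`, only row 3 (all values missing) and column `a` (three of four values missing) are flagged. With `cutoff <- 1`, column `a` would be kept because it still holds one observed value.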
Examples
data <- readRDS(system.file("extdata", "test_df.RDS", package = "cleanepi"))
# introduce an empty column
data$empty_column <- NA
# inject some missing values across some columns
data$study_id[3] <- NA_character_
data$date.of.admission[3] <- NA_character_
data$date.of.admission[4] <- NA_character_
data$dateOfBirth[3] <- NA_character_
data$dateOfBirth[4] <- NA_character_
data$dateOfBirth[5] <- NA_character_
# with cutoff = 1, rows 3, 4, and 5 are not removed
cleaned_df <- remove_constants(
data = data,
cutoff = 1
)
# drop rows or columns with a percentage of constant values
# equal to or more than 50%
cleaned_df <- remove_constants(
data = cleaned_df,
cutoff = 0.5
)
# drop rows or columns with a percentage of constant values
# equal to or more than 25%
cleaned_df <- remove_constants(
data = cleaned_df,
cutoff = 0.25
)
# drop rows or columns with a percentage of constant values
# equal to or more than 15%
cleaned_df <- remove_constants(
data = cleaned_df,
cutoff = 0.15
)
# check the report to see what has happened
print_report(cleaned_df, "constant_data")
[Package cleanepi version 1.1.1 Index]