ft_filter_freq {fctutils} | R Documentation |
Filter Factor Levels by Frequency and Recalculate Character Frequencies
Description
Removes factor levels that occur fewer times than a specified frequency threshold, then recalculates character frequencies over the remaining levels and reorders the levels by those frequencies. NA values can optionally be removed first, and the function can also return the removed levels and the character frequency table.
Usage
ft_filter_freq(
factor_vec,
min_freq = 1,
na.rm = FALSE,
case = FALSE,
decreasing = TRUE,
return_info = FALSE
)
Arguments
factor_vec |
A factor vector to be filtered. |
min_freq |
A positive integer specifying the minimum frequency threshold. Factor levels occurring fewer than this many times are dropped. |
na.rm |
Logical. Should NA values be removed before filtering and frequency calculation? Default is FALSE. |
case |
Logical. Should the character frequency count be case-sensitive? Default is FALSE. |
decreasing |
Logical. Should the levels be ordered by decreasing total character frequency? Default is TRUE. |
return_info |
Logical. Should the function return additional information such as removed levels and character frequencies? Default is FALSE. |
Value
If return_info is FALSE, returns a factor vector with levels filtered by the specified frequency threshold and reordered by recalculated total character frequency. If return_info is TRUE, returns a list containing the filtered factor vector, the removed levels, and the character frequency table.
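The behavior described above can be approximated in base R. The following is a minimal sketch, not the fctutils implementation: the helper name sketch_filter_freq is hypothetical, and it assumes that elements belonging to removed levels (and NA values) are dropped from the result and that a level's "total character frequency" is the sum of the frequencies of its characters across the remaining elements. The list element names mirror those used in the Examples.

```r
# Hypothetical helper (not part of fctutils): base-R sketch of the documented behavior.
sketch_filter_freq <- function(x, min_freq = 1, case = FALSE, decreasing = TRUE) {
  stopifnot(is.factor(x))
  # Count occurrences of each level; table() ignores NA by default.
  counts <- table(x)
  keep <- names(counts)[counts >= min_freq]
  removed <- setdiff(levels(x), keep)
  # Assumption: elements of removed levels, and NA values, are dropped.
  kept <- as.character(x[!is.na(x) & x %in% keep])
  # Recalculate character frequencies over the remaining elements only.
  txt <- if (case) kept else tolower(kept)
  char_freq <- table(unlist(strsplit(txt, "")))
  # Assumption: a level's score is the summed frequency of its characters.
  lvl_txt <- if (case) keep else tolower(keep)
  score <- vapply(
    strsplit(lvl_txt, ""),
    function(ch) as.numeric(sum(char_freq[ch])),
    numeric(1)
  )
  ordered_levels <- keep[order(score, decreasing = decreasing)]
  list(
    filtered_factor = factor(kept, levels = ordered_levels),
    removed_levels = removed,
    char_freq_table = char_freq
  )
}

x <- factor(c('apple', 'banana', 'cherry', 'date', 'banana', 'apple', 'fig', NA))
res <- sketch_filter_freq(x, min_freq = 2)
levels(res$filtered_factor)
res$removed_levels
```

Under these assumptions, the sketch keeps 'apple' and 'banana', reports 'cherry', 'date', and 'fig' as removed, and places 'banana' before 'apple' because its characters have the larger summed frequency. The actual output of ft_filter_freq may differ where the package makes different choices.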
Author(s)
Kai Guo
Examples
# Example factor vector
factor_vec <- factor(c('apple', 'banana', 'cherry', 'date', 'banana', 'apple', 'fig', NA))
# Drop levels occurring fewer than 2 times and reorder by character frequency
ft_filter_freq(factor_vec, min_freq = 2)
# Filter levels, remove NA values, and return additional information
result <- ft_filter_freq(factor_vec, min_freq = 2, na.rm = TRUE, return_info = TRUE)
result$filtered_factor
result$removed_levels
result$char_freq_table