correct_misspelled_values {cleanepi} | R Documentation |
Correct misspelled values by using approximate string matching techniques to compare them against the expected values.
Description
Correct misspelled values by using approximate string matching techniques to compare them against the expected values.
Usage
correct_misspelled_values(
  data,
  target_columns,
  wordlist,
  max_distance = 1,
  confirm = rlang::is_interactive(),
  ...
)
Arguments
data |
The input |
target_columns |
A |
wordlist |
A |
max_distance |
An |
confirm |
A |
... |
Details
When used interactively (see interactive()), the user is presented with a menu
to confirm that the words detected using approximate string matching are not
false positives, and can decide whether to proceed with the
spelling corrections. In non-interactive sessions, all misspelled values are
replaced by their closest values within the provided vector of expected
values.
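The idea of replacing each value with its closest expected word can be sketched with base R alone. This is only an illustration: it assumes the Levenshtein edit distance computed by utils::adist() as the similarity measure, and the variable names are hypothetical. It is not the package's internal implementation.

```r
# Illustrative sketch only: base R edit distances, not cleanepi's internals.
values <- c("confirmed", "confermed", "probable", "susspected")
wordlist <- c("confirmed", "probable", "suspected")
max_distance <- 1

# Rows are the observed values, columns are the expected words.
distances <- utils::adist(values, wordlist)

# For each value, find the closest expected word and check whether it lies
# within the allowed distance; otherwise leave the value unchanged.
closest <- apply(distances, 1, which.min)
within_limit <- apply(distances, 1, min) <= max_distance
corrected <- ifelse(within_limit, wordlist[closest], values)
corrected
```

In this sketch, a value whose nearest expected word is further away than max_distance is kept as it is.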
If multiple words supplied in the wordlist equally match a word in the
data and confirm is TRUE, the user is presented with a menu to choose the
replacement word. When not used interactively, multiple equal matches
throw a warning.
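A tie of this kind can be reproduced with a small base R sketch. It again assumes Levenshtein distance via utils::adist(), and the value and word list are made up for illustration. It only shows how two expected words can be equally close to one misspelled value, not how the package resolves the tie.

```r
# Illustrative sketch only: detecting equally close candidates in base R.
value <- "fmale"
wordlist <- c("male", "female", "unknown")

# Edit distance from the single value to every expected word.
distances <- drop(utils::adist(value, wordlist))

# All expected words that share the smallest distance.
tied <- wordlist[distances == min(distances)]
tied
```

Here both "male" (one deletion) and "female" (one insertion) are one edit away from "fmale", so a choice between them has to be made.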
Value
The corrected input data according to the user-specified wordlist.
Examples
# Example data with misspelled values in both columns
df <- data.frame(
  case_type = c("confirmed", "confermed", "probable", "susspected"),
  outcome = c("died", "recoverd", "did", "recovered")
)
df

# Correct the misspellings against the expected words without prompting
correct_misspelled_values(
  data = df,
  target_columns = c("case_type", "outcome"),
  wordlist = c("confirmed", "probable", "suspected", "died", "recovered"),
  confirm = FALSE
)