find_duplicates {cleanepi} | R Documentation |
Identify and return duplicated rows in a data frame or linelist.
Description
Identify and return duplicated rows in a data frame or linelist.
Usage
find_duplicates(data, target_columns = NULL)
Arguments
data |
The input data frame or linelist. |
target_columns |
A |
Value
A <data.frame> or <linelist> of all duplicated rows, with the following 2 additional columns:
- row_id
The indices of the duplicated rows in the input data. Users can use these indices to choose which row in each group of duplicates they consider redundant.
- group_id
A unique identifier associated with each group of duplicates.
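To make the shape of the returned value concrete, the base-R sketch below builds a table with the same two extra columns on a toy data frame. It is only an illustration under stated assumptions: `sketch_find_duplicates()` and the `toy` data are hypothetical and are not part of cleanepi, the grouping logic is not taken from the package's source, and treating `target_columns = NULL` as "compare on every column" is an assumption.

```r
# Hypothetical helper, NOT cleanepi's implementation: a base-R sketch that
# returns duplicated rows together with `row_id` and `group_id` columns.
sketch_find_duplicates <- function(data, target_columns = NULL) {
  # Assumption: NULL means "compare rows on every column".
  if (is.null(target_columns)) {
    target_columns <- names(data)
  }
  # Build one text key per row from the target columns.
  key_parts <- unname(as.list(data[target_columns]))
  keys <- do.call(paste, c(key_parts, sep = "\r"))
  # A row counts as duplicated when its key occurs more than once.
  is_dup <- keys %in% keys[duplicated(keys)]
  dups <- data[is_dup, , drop = FALSE]
  # Position of each duplicated row in the input data.
  dups$row_id <- which(is_dup)
  # Number the groups in order of first appearance.
  dups$group_id <- match(keys[is_dup], unique(keys[is_dup]))
  rownames(dups) <- NULL
  dups
}

# Hypothetical toy data: rows 1 and 3 match, rows 2 and 4 match, row 5 is unique.
toy <- data.frame(
  case = c("a", "b", "c", "d", "e"),
  sex = c("f", "m", "f", "m", "f"),
  outcome = c("died", "recovered", "died", "recovered", "alive"),
  stringsAsFactors = FALSE
)
toy_dups <- sketch_find_duplicates(toy, target_columns = c("sex", "outcome"))
```

In this sketch, `toy_dups` holds four rows: `row_id` points back to rows 1 to 4 of `toy`, and `group_id` pairs row 1 with row 3 and row 2 with row 4.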
Examples
dups <- find_duplicates(
  data = readRDS(
    system.file("extdata", "test_linelist.RDS", package = "cleanepi")
  ),
  target_columns = c("dt_onset", "dt_report", "sex", "outcome")
)
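A possible follow-up step is to use `row_id` and `group_id` to drop the rows judged redundant. The sketch below uses base R only and a hand-built table in the documented shape, so it runs without the package. The `toy` and `toy_dups` objects are hypothetical stand-ins, and keeping the first row of each group is just one choice, not a rule imposed by the function.

```r
# Hypothetical input data (not the package's test linelist).
toy <- data.frame(
  case = c("a", "b", "c", "d", "e"),
  sex = c("f", "m", "f", "m", "f"),
  outcome = c("died", "recovered", "died", "recovered", "alive"),
  stringsAsFactors = FALSE
)

# Hand-built stand-in for the returned table: the duplicated rows plus
# the two documented columns, `row_id` and `group_id`.
toy_dups <- data.frame(
  case = c("a", "b", "c", "d"),
  sex = c("f", "m", "f", "m"),
  outcome = c("died", "recovered", "died", "recovered"),
  row_id = c(1L, 2L, 3L, 4L),
  group_id = c(1L, 2L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Keep the first row of every group; flag the later ones as redundant.
redundant <- toy_dups$row_id[duplicated(toy_dups$group_id)]

# Drop the redundant rows from the input. setdiff() keeps this safe when
# there is nothing to drop, unlike negative indexing with an empty vector.
keep <- setdiff(seq_len(nrow(toy)), redundant)
deduplicated <- toy[keep, , drop = FALSE]
```

Here `redundant` contains rows 3 and 4, so `deduplicated` retains cases "a", "b" and "e".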
[Package cleanepi version 1.1.0 Index]