add_shuffle {deident} | R Documentation |
De-identification via random sampling
Description
add_shuffle() adds a shuffling step to a transformation pipeline.
When run as a transformation, each specified variable is randomly sampled without
replacement (that is, its values are permuted), so that summary metrics on a single
variable are unchanged but inter-variable metrics are rendered spurious.
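As a rough illustration of the idea described above, and not the package's actual implementation, the effect of shuffling a single variable can be sketched in base R with sample(). The toy data frame and its column names are hypothetical and are not part of deident.

```r
set.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical toy data, not shipped with the deident package
toy <- data.frame(
  employee = c("A", "B", "C", "D", "E"),
  hours    = c(10, 20, 30, 40, 50)
)

# Shuffle one column: sampling without replacement returns a permutation
shuffled <- toy
shuffled$employee <- sample(toy$employee, size = nrow(toy), replace = FALSE)

# The set of values in the shuffled column is unchanged,
# so single-variable summaries (counts, frequencies) are preserved
same_values <- identical(sort(shuffled$employee), sort(toy$employee))

# The other column is untouched, so the row-wise pairing between
# employee and hours no longer reflects the original data
untouched <- identical(shuffled$hours, toy$hours)
```

Because only the listed column is permuted, any statistic computed across columns (for example, hours per employee) no longer describes the original records.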
Usage
add_shuffle(object, ..., limit = 0)
Arguments
object |
Either a |
... |
variables to be transformed. |
limit |
integer - the minimum number of observations a variable needs to
have for shuffling to be performed. If the variable has length less than |
Value
A 'DeidentList' representing the untrained transformation pipeline. The object contains fields:
- deident_methods: a list of each step in the pipeline (consisting of variables and method)
and methods:
- mutate: apply the pipeline to a new data set
- to_yaml: serialize the pipeline to a '.yml' file
See Also
add_group() for usage under aggregation
Examples
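library(deident)

# Basic usage: shuffle the Employee variable
pipe.shuffle <- add_shuffle(ShiftsWorked, Employee)
pipe.shuffle$mutate(ShiftsWorked)

# Shuffle only if the variable has at least `limit` observations
pipe.shuffle.limit <- add_shuffle(ShiftsWorked, Employee, limit = 1)
pipe.shuffle.limit$mutate(ShiftsWorked)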