int_encoding_errors {dataquieR} | R Documentation |
Encoding Errors
Description
Detects errors in the character encoding of string variables.
Usage
int_encoding_errors(
  resp_vars = NULL,
  study_data,
  label_col,
  meta_data_dataframe = "dataframe_level",
  item_level = "item_level",
  ref_encs,
  meta_data = item_level,
  meta_data_v2,
  dataframe_level
)
Arguments
resp_vars |
variable the names of the measurement variables, if
missing or |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
meta_data_dataframe |
data.frame the data frame that contains the metadata for the data frame level |
item_level |
data.frame the data frame that contains metadata attributes of study data |
ref_encs |
reference encodings (names are |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
dataframe_level |
data.frame alias for |
Details
Strings are stored based on code tables, nowadays typically UTF-8. However, other code systems are still in use, so strings from different systems are sometimes mixed in the data. This indicator checks for such problems and returns, for each variable, the count of entries that do not match the reference coding system, which is estimated from the study data (addition of a metadata field is planned).
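The idea of counting entries that do not match a reference encoding can be illustrated with base R alone. The following is a minimal sketch, not the implementation used by this function: it assumes UTF-8 as the reference encoding, and the data frame and its column names are hypothetical.

```r
# Hypothetical study data; "name" mixes UTF-8 text with raw Latin-1 bytes.
study_data <- data.frame(
  id   = c("a1", "a2", "a3"),
  name = c("M\u00fcller", "Meier", "Schr\xf6der"),  # "\xf6" is o-umlaut in Latin-1
  stringsAsFactors = FALSE
)

# Assumption for this sketch: the reference encoding is UTF-8.
# validUTF8() returns FALSE for byte sequences that are not valid UTF-8.
flagged <- as.data.frame(lapply(study_data, validUTF8))

# Count of non-matching entries per variable.
n_errors <- vapply(flagged, function(ok) sum(!ok), integer(1))
```

Here `flagged` has the same dimensions as `study_data`, and `n_errors` holds one count per column. A check against other reference encodings would need a different validity test than `validUTF8()`.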
If not specified in the metadata (column ENCODING at item level or data frame level), the encoding is guessed from the data. Otherwise, it may be any supported encoding as returned by iconvlist().
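Whether a given encoding name is available can be checked against iconvlist() before it is written into the metadata. The sketch below uses a hypothetical helper, is_supported_encoding(), which is not part of dataquieR; the case-insensitive comparison is an assumption, and the set of names returned by iconvlist() depends on the platform.

```r
# Hypothetical helper (not part of dataquieR): is `enc` known to iconv here?
# Comparing case-insensitively is an assumption of this sketch.
is_supported_encoding <- function(enc) {
  toupper(enc) %in% toupper(iconvlist())
}

# A made-up name is not in the list.
bogus_ok <- is_supported_encoding("NOT-A-REAL-ENCODING")

# Once the source encoding is known, raw bytes can be re-encoded to UTF-8.
latin1_bytes <- "Schr\xf6der"  # "\xf6" is o-umlaut in Latin-1
as_utf8 <- iconv(latin1_bytes, from = "latin1", to = "UTF-8")
utf8_ok <- validUTF8(as_utf8)
```

After the conversion, the string passes validUTF8(), whereas the raw Latin-1 bytes would not.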
Value
a list with:
- SummaryTable: data.frame with information on such problems
- SummaryData: data.frame, human-readable version of SummaryTable
- FlaggedStudyData: data.frame that tells for each entry in the study data whether its encoding is OK. It has the same dimensions as study_data