int_encoding_errors {dataquieR}    R Documentation

Encoding Errors

Description

Detects errors in the character encoding of string variables.

Indicator

Usage

int_encoding_errors(
  resp_vars = NULL,
  study_data,
  label_col,
  meta_data_dataframe = "dataframe_level",
  item_level = "item_level",
  ref_encs,
  meta_data = item_level,
  meta_data_v2,
  dataframe_level
)

Arguments

resp_vars

variable the names of the measurement variables; if missing or NULL, all variables will be checked

study_data

data.frame the data frame that contains the measurements

label_col

variable attribute the name of the column in the metadata with labels of variables

meta_data_dataframe

data.frame the data frame that contains the metadata for the data frame level

item_level

data.frame the data frame that contains metadata attributes of study data

ref_encs

reference encodings (names are resp_vars)

meta_data

data.frame old name for item_level

meta_data_v2

character path to a workbook-like metadata file; see prep_load_workbook_like_file for details. ALL LOADED DATAFRAMES WILL BE PURGED, using prep_purge_data_frame_cache, if you specify meta_data_v2.

dataframe_level

data.frame alias for meta_data_dataframe

Details

Strings are stored according to code tables, nowadays typically UTF-8. However, other encodings are still in use, so strings from different systems are sometimes mixed in the data. This indicator checks for such problems and returns, per variable, the count of entries that do not match the reference encoding. The reference encoding is estimated from the study data if it is not given in the metadata.

If no encoding is specified in the metadata (column ENCODING at the item level or the data frame level), the encoding is guessed from the data. Otherwise, it may be any supported encoding as returned by iconvlist().
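The kind of check described above can be illustrated with base R alone. The following is a minimal sketch, not the implementation used by dataquieR: the helper count_encoding_errors() and its arguments ref_encs and default_enc are hypothetical names introduced only for this example. It assumes that an entry counts as an error if it is not valid in the reference encoding of its variable, using validUTF8() for UTF-8 and iconv() for any other encoding.

```r
# Hypothetical helper (not part of dataquieR): for each character column,
# count the entries that are not valid in that column's reference encoding.
count_encoding_errors <- function(df, ref_encs = character(0),
                                  default_enc = "UTF-8") {
  chr_cols <- names(df)[vapply(df, is.character, logical(1))]
  vapply(chr_cols, function(col) {
    # Use the reference encoding named after the column, if one was given
    enc <- if (col %in% names(ref_encs)) ref_encs[[col]] else default_enc
    x <- df[[col]]
    invalid <- if (toupper(enc) %in% c("UTF-8", "UTF8")) {
      !validUTF8(x)
    } else {
      # iconv() yields NA for input that cannot be converted from `enc`;
      # single-byte encodings that define all 256 byte values never fail
      is.na(iconv(x, from = enc, to = "UTF-8"))
    }
    # Missing values are not counted as encoding errors
    sum(invalid & !is.na(x))
  }, integer(1))
}

# Example data: one Latin-1 encoded entry among UTF-8 strings
latin1_entry <- rawToChar(as.raw(c(0x4d, 0xfc, 0x6c, 0x6c, 0x65, 0x72)))
study_data <- data.frame(
  name = c("Meier", "M\u00fcller", latin1_entry, NA),
  id = 1:4,
  stringsAsFactors = FALSE
)
errors_per_variable <- count_encoding_errors(study_data)
```

The result is a named integer vector with one element per character variable. Non-character columns such as id are skipped, and a different reference encoding could be passed per variable, for example ref_encs = c(name = "ASCII").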

Value

a list with:


[Package dataquieR version 2.5.1 Index]