harcode {sdtm.oak} | R Documentation |
Derive an SDTM variable with a hardcoded value
Description
- hardcode_no_ct() maps a hardcoded value to a target SDTM variable that has no terminology restrictions.
- hardcode_ct() maps a hardcoded value to a target SDTM variable with controlled terminology recoding.
Usage
hardcode_no_ct(
  tgt_dat = NULL,
  tgt_val,
  raw_dat,
  raw_var,
  tgt_var,
  id_vars = oak_id_vars()
)
hardcode_ct(
  tgt_dat = NULL,
  tgt_val,
  raw_dat,
  raw_var,
  tgt_var,
  ct_spec,
  ct_clst,
  id_vars = oak_id_vars()
)
Arguments
tgt_dat |
Target dataset: a data frame to be merged against raw_dat. |
tgt_val |
The target SDTM value to be hardcoded into the variable indicated in tgt_var. |
raw_dat |
The raw dataset (dataframe); must include the variables passed in id_vars and raw_var. |
raw_var |
The raw variable: a single string indicating the name of the raw variable in raw_dat. |
tgt_var |
The target SDTM variable: a single string indicating the name of the variable to be derived. |
id_vars |
Key variables to be used in the join between the raw dataset (raw_dat) and the target dataset (tgt_dat). |
ct_spec |
Study controlled terminology specification: a dataframe with a
minimal set of columns, see |
ct_clst |
A codelist code indicating which subset of the controlled
terminology to apply in the derivation. This parameter is optional, if left
as |
Value
The returned data set depends on the value of tgt_dat:
- If no target dataset is supplied, meaning that tgt_dat defaults to NULL, then the returned data set is raw_dat, selected for the variables indicated in id_vars, and a new extra column: the derived variable, as indicated in tgt_var.
- If the target dataset is provided, then it is merged with the raw data set raw_dat by the variables indicated in id_vars, with a new column: the derived variable, as indicated in tgt_var.
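The case where tgt_dat is NULL can be illustrated with a minimal base R sketch. It is not the sdtm.oak implementation: sketch_hardcode() is a hypothetical helper, the default key variables are assumed to be oak_id, raw_source and patient_number, and the rule that the value is assigned only where the raw variable is non-missing is inferred from the examples on this page.

```r
# Hypothetical helper (not part of sdtm.oak): mimics the `tgt_dat = NULL` case.
# Assumption: the default key variables are oak_id, raw_source, patient_number.
sketch_hardcode <- function(raw_dat, raw_var, tgt_var, tgt_val,
                            id_vars = c("oak_id", "raw_source", "patient_number")) {
  # Keep only the key variables from the raw dataset.
  out <- raw_dat[, id_vars, drop = FALSE]
  # Assign the hardcoded value only where the raw variable is non-missing.
  out[[tgt_var]] <- ifelse(is.na(raw_dat[[raw_var]]), NA_character_, tgt_val)
  out
}

md1 <- data.frame(
  oak_id = 1:4,
  raw_source = "MD1",
  patient_number = 101:104,
  MDRAW = c("BABY ASPIRIN", "CORTISPORIN", NA, "DIPHENHYDRAMINE HCL"),
  stringsAsFactors = FALSE
)

res <- sketch_hardcode(md1, "MDRAW", "CMCAT", "GENERAL CONCOMITANT MEDICATIONS")
print(res)
```

The result holds the key variables plus the new CMCAT column, which stays NA for the record whose MDRAW is missing.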
Examples
md1 <-
  tibble::tribble(
    ~oak_id, ~raw_source, ~patient_number, ~MDRAW,
    1L,      "MD1",       101L,            "BABY ASPIRIN",
    2L,      "MD1",       102L,            "CORTISPORIN",
    3L,      "MD1",       103L,            NA_character_,
    4L,      "MD1",       104L,            "DIPHENHYDRAMINE HCL"
  )
# Derive a new variable `CMCAT` by hardcoding the value
# "GENERAL CONCOMITANT MEDICATIONS" wherever `MDRAW` is non-missing.
hardcode_no_ct(
  tgt_val = "GENERAL CONCOMITANT MEDICATIONS",
  raw_dat = md1,
  raw_var = "MDRAW",
  tgt_var = "CMCAT"
)
cm_inter <-
  tibble::tribble(
    ~oak_id, ~raw_source, ~patient_number, ~CMTRT,                ~CMINDC,
    1L,      "MD1",       101L,            "BABY ASPIRIN",        NA,
    2L,      "MD1",       102L,            "CORTISPORIN",         "NAUSEA",
    3L,      "MD1",       103L,            "ASPIRIN",             "ANEMIA",
    4L,      "MD1",       104L,            "DIPHENHYDRAMINE HCL", "NAUSEA",
    5L,      "MD1",       105L,            "PARACETAMOL",         "PYREXIA"
  )
# Derive a new variable `CMCAT` by hardcoding the value
# "GENERAL CONCOMITANT MEDICATIONS" wherever `MDRAW` is non-missing, with a
# prior join to the target dataset `cm_inter`.
hardcode_no_ct(
  tgt_dat = cm_inter,
  tgt_val = "GENERAL CONCOMITANT MEDICATIONS",
  raw_dat = md1,
  raw_var = "MDRAW",
  tgt_var = "CMCAT"
)
# Controlled terminology specification
(ct_spec <- read_ct_spec_example("ct-01-cm"))
# Hardcoding of `CMCAT` with the value `"GENERAL CONCOMITANT MEDICATIONS"`
# involving terminology recoding. `NA` values in `MDRAW` are preserved in
# `CMCAT`.
hardcode_ct(
  tgt_dat = cm_inter,
  tgt_var = "CMCAT",
  raw_dat = md1,
  raw_var = "MDRAW",
  tgt_val = "GENERAL CONCOMITANT MEDICATIONS",
  ct_spec = ct_spec,
  ct_clst = "C66729"
)
# Variables are derived in sequence from multiple input sources.
# For each target variable, only missing (`NA`) values are filled
# during each step—previously assigned (non-missing) values are retained.
cm_raw <-
  tibble::tibble(
    oak_id = 1:4,
    raw_source = "cm_raw",
    patient_number = 370 + oak_id,
    PATNUM = patient_number,
    IT.CMTRT = c("BABY ASPIRIN", "CORTISPORIN", NA, NA),
    IT.CMTRTOTH = c("Other Treatment - ", NA, "Other Treatment - Baby Aspirin", NA)
  )
cm_raw
# `CMCAT` is hardcoded first for records where `IT.CMTRT` is non-missing and
# then, only where `CMCAT` is still missing, for records where `IT.CMTRTOTH`
# is non-missing.
hardcode_no_ct(
  tgt_val = "General Concomitant Medications",
  raw_dat = cm_raw,
  raw_var = "IT.CMTRT",
  tgt_var = "CMCAT"
) |>
  hardcode_no_ct(
    tgt_val = "Other General Concomitant Medications",
    raw_dat = cm_raw,
    raw_var = "IT.CMTRTOTH",
    tgt_var = "CMCAT"
  )
# Here the order of the two hardcoding steps is reversed, which changes the
# result.
hardcode_no_ct(
  tgt_val = "Other General Concomitant Medications",
  raw_dat = cm_raw,
  raw_var = "IT.CMTRTOTH",
  tgt_var = "CMCAT"
) |>
  hardcode_no_ct(
    tgt_val = "General Concomitant Medications",
    raw_dat = cm_raw,
    raw_var = "IT.CMTRT",
    tgt_var = "CMCAT"
  )
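Why the order of the chained calls matters can be seen in a small base R model of the "fill only missing values" rule described in the comments above. This is a sketch under that stated assumption, not the package's code; fill_if_missing() is a hypothetical helper and the vectors simply copy the `IT.CMTRT` and `IT.CMTRTOTH` columns of `cm_raw`.

```r
# Hypothetical helper (not part of sdtm.oak): one hardcoding step.
# A value is written only where the target is still NA and the raw
# variable is non-missing; existing target values are kept.
fill_if_missing <- function(current, raw, value) {
  ifelse(is.na(current) & !is.na(raw), value, current)
}

cmtrt    <- c("BABY ASPIRIN", "CORTISPORIN", NA, NA)
cmtrtoth <- c("Other Treatment - ", NA, "Other Treatment - Baby Aspirin", NA)
start    <- rep(NA_character_, 4)

gen <- "General Concomitant Medications"
oth <- "Other General Concomitant Medications"

# IT.CMTRT first, then IT.CMTRTOTH.
first <- fill_if_missing(fill_if_missing(start, cmtrt, gen), cmtrtoth, oth)

# IT.CMTRTOTH first, then IT.CMTRT.
reversed <- fill_if_missing(fill_if_missing(start, cmtrtoth, oth), cmtrt, gen)

print(first)
print(reversed)
```

In this model the first record differs between the two orderings, because both raw variables are populated for it and whichever step runs first determines its value.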