prepare_thresholds {LTFGRS} | R Documentation |
Calculate (personalised) thresholds based on CIPs.
Description
This function prepares input for estimate_liability by calculating thresholds based on stratified cumulative incidence proportions (CIPs), with optional interpolation for ages that fall between CIP values. Given a tibble of families and family members and (stratified) CIPs, personalised thresholds are calculated for each individual present in .tbl. An individual may appear in multiple families, but only once within the same family.
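To make the link between a CIP and a threshold concrete, the sketch below applies the usual liability-threshold conversion in base R. It assumes a standard normal liability, so a stratum with cumulative incidence `cip` has its threshold at the upper-tail quantile. This illustrates the general idea only and is not necessarily the exact code that prepare_thresholds runs internally; the `cip` values are made up.

```r
# Illustrative only: standard liability-threshold conversion,
# not necessarily the internal implementation of prepare_thresholds().
cip <- c(0.1, 0.2, 0.3, 0.4)  # made-up cumulative incidence proportions

# A proportion `cip` of the stratum lies above the threshold on a
# standard normal liability scale, so take the upper-tail quantile.
thr <- qnorm(cip, lower.tail = FALSE)

# Higher cumulative incidence gives a lower threshold.
print(round(thr, 3))
```

A larger CIP (for example an older age or a higher-risk stratum) therefore maps to a lower threshold on the liability scale.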
Usage
prepare_thresholds(
.tbl,
CIP,
age_col,
CIP_merge_columns = c("sex", "birth_year", "age"),
CIP_cip_col = "cip",
Kpop = "useMax",
status_col = "status",
lower_equal_upper = FALSE,
personal_thr = FALSE,
fid_col = "fid",
personal_id_col = "pid",
interpolation = NULL,
bst.params = list(max_depth = 10, base_score = 0, nthread = 4, min_child_weight = 10),
min_CIP_value = 1e-05,
xgboost_itr = 30
)
Arguments
.tbl |
Tibble with family and personal id columns, as well as CIP_merge_columns and status. |
CIP |
Tibble with population-representative cumulative incidence proportions. CIP must contain the columns named in CIP_merge_columns and CIP_cip_col. |
age_col |
Name of column with age at the end of follow-up or age at diagnosis for cases. |
CIP_merge_columns |
The columns by which the CIPs are stratified, e.g. CIPs by birth_year and sex. |
CIP_cip_col |
Name of column with CIP values. |
Kpop |
Takes either "useMax" to use the maximum value in the CIP strata as population prevalence, or a tibble with population prevalence values based on other information. If a tibble is provided, it must contain columns from |
status_col |
Column that contains the status of each family member. Coded as 0 or FALSE (control) and 1 or TRUE (case). |
lower_equal_upper |
Should the upper and lower threshold be the same for cases? Can be used if CIPs are detailed, e.g. stratified by birth year and sex. |
personal_thr |
Should thresholds be based on stratified CIPs or population prevalence? |
fid_col |
Column that contains the family ID. |
personal_id_col |
Column that contains the personal ID. |
interpolation |
Type of interpolation, defaults to NULL. |
bst.params |
List of parameters to pass on to xgboost. See xgboost documentation for details. |
min_CIP_value |
Minimum CIP value to allow. Values that are too low may lead to numerical instability. |
xgboost_itr |
Number of iterations to run xgboost for. |
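The Kpop = "useMax" and min_CIP_value arguments can be pictured with a small base R sketch. It rests on two assumptions that the text above does not confirm: that a stratum is defined by sex and birth_year (the merge columns other than age), and that CIP values below min_CIP_value are raised to that minimum rather than dropped. The data are invented for illustration.

```r
# Illustrative only: assumed behaviour, not the package's internal code.
cip <- data.frame(
  sex        = c(1, 1, 1, 0, 0),
  birth_year = c(1975, 1975, 1975, 1981, 1981),
  age        = c(43, 45, 48, 40, 42),
  cip        = c(0.25, 0.28, 0.30, 1e-07, 0.20)
)
min_CIP_value <- 1e-05

# Assumption: very small CIP values are floored at min_CIP_value.
cip$cip <- pmax(cip$cip, min_CIP_value)

# "useMax"-style prevalence: the largest CIP within each
# sex / birth_year stratum (assumed stratum definition).
cip$K_pop <- ave(cip$cip, cip$sex, cip$birth_year, FUN = max)

print(cip)
```

Flooring keeps the corresponding thresholds finite and away from the far tail of the normal distribution, which is one plausible reading of the numerical-stability remark for min_CIP_value.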
Value
Tibble with (personalised) thresholds for each family member (lower and upper), the calculated cumulative incidence proportion for each individual (K_i), and the population prevalence within an individual's CIP stratum (K_pop; max value in stratum). The thresholds and other potentially relevant information can be added to the family graphs with familywise_attach_attributes.
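As a rough picture of how the lower and upper columns could relate to status and lower_equal_upper, the sketch below uses a hypothetical helper, sketch_bounds, which is not part of LTFGRS. It assumes the common liability-threshold convention: controls lie below their threshold, cases lie above it, and with lower_equal_upper = TRUE a case's liability is pinned at the threshold. The package's actual rules may differ.

```r
# Hypothetical helper for illustration only; not part of LTFGRS.
sketch_bounds <- function(status, thr, lower_equal_upper = FALSE) {
  is_case <- as.logical(status)
  # Controls: liability assumed to lie in (-Inf, thr).
  # Cases: liability assumed to lie in (thr, Inf), or fixed at thr
  # when lower_equal_upper is TRUE.
  lower <- ifelse(is_case, thr, -Inf)
  upper <- ifelse(is_case & !lower_equal_upper, Inf, thr)
  data.frame(status = status, lower = lower, upper = upper)
}

# Made-up CIPs converted to thresholds on a standard normal scale.
thr <- qnorm(c(0.1, 0.2, 0.3, 0.4), lower.tail = FALSE)

print(sketch_bounds(status = c(0, 0, 1, 1), thr = thr))
print(sketch_bounds(status = c(0, 0, 1, 1), thr = thr, lower_equal_upper = TRUE))
```

Pinning cases at the threshold only makes sense when the CIPs are detailed enough to place the age of onset precisely, which matches the guidance given for lower_equal_upper.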
Examples
# Toy family: offspring (o), mother (m), father (f), paternal grandfather (pgf)
tbl <- data.frame(
  fid = c(1, 1, 1, 1),
  pid = c(1, 2, 3, 4),
  role = c("o", "m", "f", "pgf"),
  sex = c(1, 0, 1, 1),
  status = c(0, 0, 1, 1),
  age = c(22, 42, 48, 78),
  birth_year = 2023 - c(22, 42, 48, 78),
  aoo = c(NA, NA, 43, 45)
)

# CIPs stratified by age, birth_year and sex (the default CIP_merge_columns)
cip <- data.frame(
  age = c(22, 42, 43, 45, 48, 78),
  birth_year = c(2001, 1981, 1975, 1945, 1975, 1945),
  sex = c(1, 0, 1, 1, 1, 1),
  cip = c(0.1, 0.2, 0.3, 0.3, 0.3, 0.4)
)

# Calculate thresholds without interpolation
prepare_thresholds(.tbl = tbl, CIP = cip, age_col = "age", interpolation = NA)