penetrance {penetrance} | R Documentation |
penetrance: A Package for Penetrance Estimation
Description
A comprehensive package for penetrance estimation in family-based studies. The package implements Bayesian methods based on the Metropolis-Hastings algorithm for estimating the age-specific penetrance of genetic variants. It supports both sex-specific and non-sex-specific analyses and provides visualization tools for examining MCMC results.
The penetrance() function implements the Independent Metropolis-Hastings algorithm for Bayesian penetrance estimation of cancer risk. It can run multiple chains in parallel and provides options for summarizing and visualizing the results.
Usage
penetrance(
  pedigree,
  twins = NULL,
  n_chains = 1,
  n_iter_per_chain = 10000,
  ncores = 6,
  max_age = 94,
  baseline_data = baseline_data_default,
  remove_proband = FALSE,
  age_imputation = FALSE,
  median_max = TRUE,
  BaselineNC = TRUE,
  var = c(0.1, 0.1, 2, 2, 5, 5, 5, 5),
  burn_in = 0,
  thinning_factor = 1,
  imp_interval = 100,
  distribution_data = distribution_data_default,
  prev = 1e-04,
  sample_size = NULL,
  ratio = NULL,
  prior_params = prior_params_default,
  risk_proportion = risk_proportion_default,
  summary_stats = TRUE,
  rejection_rates = TRUE,
  density_plots = TRUE,
  plot_trace = TRUE,
  penetrance_plot = TRUE,
  penetrance_plot_pdf = TRUE,
  plot_loglikelihood = TRUE,
  plot_acf = TRUE,
  probCI = 0.95,
  sex_specific = TRUE
)
Arguments
pedigree |
A list of data frames, where each data frame represents a single pedigree and contains the following columns:
|
twins |
A list specifying identical twins or triplets in the family. Each element of the list should be a vector containing the |
n_chains |
Integer, the number of chains for parallel computation. Default is 1. |
n_iter_per_chain |
Integer, the number of iterations for each chain. Default is 10000. |
ncores |
Integer, the number of cores for parallel computation. Default is 6. |
max_age |
Integer, the maximum age considered for analysis. Default is 94. |
baseline_data |
Data providing the absolute age-specific baseline risk (probability) of developing the cancer in the general population (e.g., from the SEER database).
All probability values must be between 0 and 1.
- If |
remove_proband |
Logical, indicating whether to remove probands from the analysis. Default is FALSE. |
age_imputation |
Logical, indicating whether to perform age imputation. Default is FALSE. |
median_max |
Logical, indicating whether to use the baseline median age or |
BaselineNC |
Logical, indicating that the non-carrier penetrance is assumed to be the baseline penetrance. Default is TRUE. |
var |
Numeric vector, variances for the proposal distribution in the Metropolis-Hastings algorithm. Default is c(0.1, 0.1, 2, 2, 5, 5, 5, 5). |
burn_in |
Numeric, the fraction of results to discard as burn-in (0 to 1). Default is 0 (no burn-in). |
thinning_factor |
Integer, the factor by which to thin the results. Default is 1 (no thinning). |
imp_interval |
Integer, the interval (in iterations) at which age imputation is performed when age_imputation = TRUE. Default is 100. |
distribution_data |
Data for generating prior distributions. |
prev |
Numeric, prevalence of the carrier status. Default is 0.0001. |
sample_size |
Optional numeric, sample size for distribution generation. |
ratio |
Optional numeric, ratio parameter for distribution generation. |
prior_params |
List, parameters for prior distributions. |
risk_proportion |
Numeric, proportion of risk for distribution generation. |
summary_stats |
Logical, indicating whether to include summary statistics in the output. Default is TRUE. |
rejection_rates |
Logical, indicating whether to include rejection rates in the output. Default is TRUE. |
density_plots |
Logical, indicating whether to include density plots in the output. Default is TRUE. |
plot_trace |
Logical, indicating whether to include trace plots in the output. Default is TRUE. |
penetrance_plot |
Logical, indicating whether to include penetrance plots in the output. Default is TRUE. |
penetrance_plot_pdf |
Logical, indicating whether to include PDF plots in the output. Default is TRUE. |
plot_loglikelihood |
Logical, indicating whether to include log-likelihood plots in the output. Default is TRUE. |
plot_acf |
Logical, indicating whether to include autocorrelation function (ACF) plots for posterior samples. Default is TRUE. |
probCI |
Numeric, probability level for credible intervals in penetrance plots. Must be between 0 and 1. Default is 0.95. |
sex_specific |
Logical, indicating whether to use sex-specific parameters in the analysis. Default is TRUE. |
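As a rough illustration of what burn_in (a fraction between 0 and 1) and thinning_factor (keep every k-th draw) mean for a chain of draws, the base-R sketch below applies both to a plain numeric vector. The helper apply_burn_thin is hypothetical and is not part of the package; the package's own post-processing may differ in details such as rounding.

```r
# Hypothetical helper for illustration only; not a function of the penetrance package.
apply_burn_thin <- function(draws, burn_in = 0, thinning_factor = 1) {
  stopifnot(burn_in >= 0, burn_in < 1, thinning_factor >= 1)
  n <- length(draws)
  n_burn <- floor(burn_in * n)        # number of initial draws to discard
  kept <- draws[seq_len(n) > n_burn]  # drop the burn-in portion
  # Keep every thinning_factor-th draw of what remains
  kept[seq(1, length(kept), by = thinning_factor)]
}

set.seed(1)
chain <- rnorm(10000)  # stand-in for one chain of MCMC draws
post <- apply_burn_thin(chain, burn_in = 0.1, thinning_factor = 5)
length(post)           # 1800 draws remain: (10000 - 1000) / 5
```

With the defaults (burn_in = 0, thinning_factor = 1) the vector is returned unchanged, which matches the "no burn-in" and "no thinning" wording above.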
Details
Key features:
- Bayesian estimation of penetrance using family-based data
- Support for sex-specific and non-sex-specific analyses
- Age imputation for missing data
- Visualization tools for MCMC diagnostics
- Integration with the clipp package for likelihood calculations
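The Description refers to an Independent Metropolis-Hastings sampler, in which each proposal is drawn without reference to the current state. The toy sketch below shows the general technique on a one-dimensional target. The Beta(3, 5) target, the uniform proposal, and the function independent_mh are all assumptions chosen for illustration; the package's actual likelihood, priors, and proposals are not reproduced here.

```r
# Toy independence sampler; target and proposal are illustrative assumptions,
# not the model used by the penetrance package.
independent_mh <- function(n_iter, log_target, r_prop, d_log_prop, init) {
  draws <- numeric(n_iter)
  current <- init
  accepted <- 0
  for (i in seq_len(n_iter)) {
    proposal <- r_prop()  # drawn independently of the current state
    # Log acceptance ratio: [pi(y) / q(y)] / [pi(x) / q(x)]
    log_ratio <- (log_target(proposal) - d_log_prop(proposal)) -
      (log_target(current) - d_log_prop(current))
    if (log(runif(1)) < log_ratio) {
      current <- proposal
      accepted <- accepted + 1
    }
    draws[i] <- current
  }
  list(draws = draws, rejection_rate = 1 - accepted / n_iter)
}

set.seed(42)
fit <- independent_mh(
  n_iter = 5000,
  log_target = function(x) dbeta(x, 3, 5, log = TRUE),
  r_prop = function() runif(1),
  d_log_prop = function(x) dunif(x, log = TRUE),
  init = 0.5
)
mean(fit$draws)      # should be close to the Beta(3, 5) mean, 3 / 8
fit$rejection_rate   # share of proposals that were rejected
```

A high rejection rate in a sampler of this kind usually means the proposal matches the target poorly, which is one reason to inspect rejection rates and trace plots together.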
Value
A list containing combined results from all chains, including optional statistics and plots.
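To make the role of probCI concrete, the sketch below computes an equal-tailed credible interval from a vector of posterior draws in base R. The helper credible_interval is hypothetical, the Beta-distributed draws are a stand-in for real posterior samples, and the equal-tailed construction is an assumption; the package may compute its intervals differently.

```r
# Hypothetical helper: equal-tailed credible interval from posterior draws.
credible_interval <- function(draws, probCI = 0.95) {
  stopifnot(probCI > 0, probCI < 1)
  alpha <- 1 - probCI
  quantile(draws, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(7)
draws <- rbeta(20000, 3, 5)  # stand-in for posterior samples of a penetrance value
ci <- credible_interval(draws, probCI = 0.95)
ci                           # lower and upper bounds of the 95% interval
```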
Author(s)
Maintainer: Nicolas Kubista <bmendel@jimmy.harvard.edu>
Authors:
BayesMendel Lab
Examples
# Create example baseline data (simplified for demonstration)
baseline_data_default <- data.frame(
  Age = 1:94,
  Female = rep(0.01, 94),
  Male = rep(0.01, 94)
)

# Create example distribution data
distribution_data_default <- data.frame(
  Age = 1:94,
  Risk = rep(0.01, 94)
)

# Create example prior parameters
prior_params_default <- list(
  shape = 2,
  scale = 50
)

# Create example risk proportion
risk_proportion_default <- 0.5

# Create a simple example pedigree
example_pedigree <- data.frame(
  PedigreeID = rep(1, 4),
  ID = 1:4,
  Sex = c(1, 0, 1, 0),          # 1 for male, 0 for female
  MotherID = c(NA, NA, 2, 2),
  FatherID = c(NA, NA, 1, 1),
  isProband = c(0, 0, 1, 0),
  CurAge = c(70, 68, 45, 42),
  isAff = c(0, 0, 1, 0),
  Age = c(NA, NA, 40, NA),
  Geno = c(NA, NA, 1, NA)
)

# Basic usage with minimal iterations
result <- penetrance(
  pedigree = list(example_pedigree),
  n_chains = 1,
  n_iter_per_chain = 10,        # Very small number for example
  ncores = 1,                   # Single core for example
  summary_stats = TRUE,
  plot_trace = FALSE,           # Disable plots for quick example
  density_plots = FALSE,
  penetrance_plot = FALSE,
  penetrance_plot_pdf = FALSE,
  plot_loglikelihood = FALSE,
  plot_acf = FALSE
)

# View basic results
head(result$summary_stats)
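Before passing a pedigree to the sampler, it can help to run a quick structural check. The base-R sketch below verifies that every non-missing MotherID and FatherID refers to an ID in the same data frame and that parents carry the sex coding used in the example above (1 for male, 0 for female). The function check_pedigree is hypothetical and not part of the package, and these checks are assumptions about what a sensible pedigree looks like rather than the package's own validation rules.

```r
# Hypothetical checker for illustration; not a function of the penetrance package.
check_pedigree <- function(ped) {
  problems <- character(0)
  known <- ped$ID

  # Every non-missing parent ID should appear in the ID column
  bad_mother <- !is.na(ped$MotherID) & !(ped$MotherID %in% known)
  bad_father <- !is.na(ped$FatherID) & !(ped$FatherID %in% known)
  if (any(bad_mother)) problems <- c(problems, "MotherID not found in ID")
  if (any(bad_father)) problems <- c(problems, "FatherID not found in ID")

  # Sex coding follows the example: 1 = male, 0 = female
  mothers <- ped$Sex[match(na.omit(ped$MotherID), ped$ID)]
  fathers <- ped$Sex[match(na.omit(ped$FatherID), ped$ID)]
  if (any(mothers != 0, na.rm = TRUE)) {
    problems <- c(problems, "a mother is not coded female")
  }
  if (any(fathers != 1, na.rm = TRUE)) {
    problems <- c(problems, "a father is not coded male")
  }
  problems
}

ped <- data.frame(
  ID = 1:4,
  Sex = c(1, 0, 1, 0),
  MotherID = c(NA, NA, 2, 2),
  FatherID = c(NA, NA, 1, 1)
)
check_pedigree(ped)  # an empty character vector means no problems were found
```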