plot_curves.bgmfit {bsitar}    R Documentation

Plot Growth Curves

Description

The plot_curves() function visualizes six different types of growth curves using the ggplot2 package. Additionally, it allows users to create customized plots from the data returned as a data.frame. For an alternative approach, the marginal_draws() function can be used, which not only estimates adjusted curves but also enables comparison across groups using the hypotheses argument.

Usage

## S3 method for class 'bgmfit'
plot_curves(
  model,
  opt = "dv",
  apv = FALSE,
  bands = NULL,
  conf = 0.95,
  resp = NULL,
  dpar = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  newdata = NULL,
  summary = FALSE,
  digits = 2,
  re_formula = NULL,
  numeric_cov_at = NULL,
  aux_variables = NULL,
  levels_id = NULL,
  avg_reffects = NULL,
  ipts = 10,
  deriv_model = TRUE,
  xrange = NULL,
  xrange_search = NULL,
  takeoff = FALSE,
  trough = FALSE,
  acgv = FALSE,
  acgv_velocity = 0.1,
  seed = 123,
  estimation_method = "fitted",
  allow_new_levels = FALSE,
  sample_new_levels = "uncertainty",
  incl_autocor = TRUE,
  robust = FALSE,
  transform = NULL,
  future = FALSE,
  future_session = "multisession",
  cores = NULL,
  trim = 0,
  layout = "single",
  linecolor = NULL,
  linecolor1 = NULL,
  linecolor2 = NULL,
  label.x = NULL,
  label.y = NULL,
  legendpos = NULL,
  linetype.apv = NULL,
  linewidth.main = NULL,
  linewidth.apv = NULL,
  linetype.groupby = NA,
  color.groupby = NA,
  band.alpha = NULL,
  show_age_takeoff = TRUE,
  show_age_peak = TRUE,
  show_age_cessation = TRUE,
  show_vel_takeoff = FALSE,
  show_vel_peak = FALSE,
  show_vel_cessation = FALSE,
  returndata = FALSE,
  returndata_add_parms = FALSE,
  parms_eval = FALSE,
  idata_method = NULL,
  parms_method = "getPeak",
  verbose = FALSE,
  fullframe = NULL,
  dummy_to_factor = NULL,
  expose_function = FALSE,
  usesavedfuns = NULL,
  clearenvfuns = NULL,
  funlist = NULL,
  envir = NULL,
  ...
)

plot_curves(model, ...)

Arguments

model

An object of class bgmfit.

opt

A character string containing one or more of the following plotting options:

  • 'd': Population average distance curve

  • 'v': Population average velocity curve

  • 'D': Individual-specific distance curves

  • 'V': Individual-specific velocity curves

  • 'u': Unadjusted individual-specific distance curves

  • 'a': Adjusted individual-specific distance curves (adjusted for random effects)

Note that 'd' and 'D' cannot be specified simultaneously, nor can 'v' and 'V'. Other combinations are allowed, e.g., 'dvau', 'Dvau', 'dVau', etc.

apv

A logical value (default FALSE) indicating whether to calculate and plot the age at peak growth velocity (APGV) when opt includes 'v' or 'V'.

bands

A character string containing one or more of the following options, or NULL (default), indicating if CI bands should be plotted around the curves:

  • 'd': Band around the distance curve

  • 'v': Band around the velocity curve

  • 'p': Band around the vertical line denoting the APGV parameter

The 'dvp' option will include CI bands for distance and velocity curves, and the APGV.

conf

A numeric value (default 0.95) specifying the confidence interval (CI) level for the bands. See growthparameters() for more details.

resp

A character string (default NULL) to specify the response variable when processing posterior draws for univariate_by and multivariate models. See bsitar() for details on univariate_by and multivariate models.

dpar

Optional name of a predicted distributional parameter. If specified, expected predictions of this parameter are returned.

ndraws

A positive integer indicating the number of posterior draws to use in estimation. If NULL (default), all draws are used.

draw_ids

An integer specifying the specific posterior draw(s) to use in estimation (default NULL).

newdata

An optional data frame for estimation. If NULL (default), newdata is retrieved from the model.

summary

A logical value indicating whether only the estimate should be computed (TRUE), or whether the estimate along with SE and CI should be returned (FALSE, default). Setting summary to FALSE will increase computation time. Note that summary = FALSE is required to obtain correct estimates when re_formula = NULL.

digits

An integer (default 2) to set the decimal places for rounding the results using the base::round() function.

re_formula

Option to indicate whether or not to include individual/group-level effects in the estimation. When NA, individual-level effects are excluded, and population average growth parameters are computed. When NULL (the default shown in Usage for plot_curves()), individual-level effects are included in the computation, and the resulting growth parameters are individual-specific. In both cases (NA or NULL), continuous and factor covariates are appropriately included in the estimation. Continuous covariates are set to their means by default (see numeric_cov_at for details), while factor covariates remain unaltered, allowing for the estimation of covariate-specific population average and individual-specific growth parameters.

numeric_cov_at

An optional (named list) argument to specify the value of continuous covariate(s). The default NULL option sets the continuous covariate(s) to their mean. Alternatively, a named list can be supplied to manually set these values. For example, numeric_cov_at = list(xx = 2) will set the continuous covariate variable 'xx' to 2. The argument numeric_cov_at is ignored when no continuous covariates are included in the model.

aux_variables

An optional argument to specify variables passed to the ipts argument, useful when fitting location-scale or measurement error models.

levels_id

An optional argument to specify the ids for the hierarchical model (default NULL). It is used only when the model is applied to data with three or more levels of hierarchy. For a two-level model, levels_id is automatically inferred from the model fit. For models with three or more levels, levels_id is inferred from the model fit under the assumption that hierarchy is specified from the lowest to the uppermost level, i.e., id followed by study, where id is nested within study. However, it is not guaranteed that levels_id is sorted correctly, so it is better to set it manually when fitting a model with three or more levels of hierarchy.

avg_reffects

An optional argument (default NULL) to calculate (marginal/average) curves and growth parameters, such as APGV and PGV. If specified, it must be a named list indicating the over (typically a level 1 predictor, such as age), feby (fixed effects, typically a factor variable), and reby (typically NULL, indicating that parameters are integrated over the random effects). For example, avg_reffects = list(feby = 'study', reby = NULL, over = 'age').

ipts

An integer to set the length of the predictor variable for generating a smooth velocity curve. If NULL, the original values are returned. If an integer (e.g., ipts = 10, default), the predictor is interpolated. Note that these interpolations do not alter the range of the predictor when calculating population averages and/or individual-specific growth curves.

deriv_model

A logical value specifying whether to estimate the velocity curve from the derivative function or by differentiating the distance curve. Set deriv_model = TRUE for functions that require the velocity curve, such as growthparameters() and plot_curves(). Set it to NULL for functions that use the distance curve (i.e., fitted values), such as loo_validation() and plot_ppc().

xrange

An integer to set the predictor range (e.g., age) when executing the interpolation via ipts. By default, NULL sets the individual-specific predictor range. Setting xrange = 1 applies the same range for individuals within the same higher grouping variable (e.g., study). Setting xrange = 2 applies an identical range across the entire sample. Alternatively, a numeric vector (e.g., xrange = c(6, 20)) can be provided to set the range within the specified values.

xrange_search

A vector of length two or a character string 'range' to set the range of the predictor variable (x) within which growth parameters are searched. This is useful when there is more than one peak and the user wants to summarize the peak within a specified range of the x variable. The default value is xrange_search = NULL.

takeoff

A logical value (default FALSE) indicating whether to calculate the age at takeoff velocity (ATGV) and the takeoff growth velocity (TGV) parameters.

trough

A logical value (default FALSE) indicating whether to calculate the age at cessation of growth velocity (ACGV) and the cessation of growth velocity (CGV) parameters.

acgv

A logical value (default FALSE) indicating whether to calculate the age at cessation of growth velocity from the velocity curve. If TRUE, the age at cessation of growth velocity (ACGV) and the cessation growth velocity (CGV) are calculated based on the percentage of the peak growth velocity, as defined by the acgv_velocity argument (see below). The acgv_velocity is typically set at 10 percent of the peak growth velocity. ACGV and CGV are calculated along with the uncertainty (SE and CI) around the ACGV and CGV parameters.

acgv_velocity

The proportion of the peak growth velocity to use when estimating the ACGV (see acgv). The default value is 0.10, i.e., 10 percent of the peak growth velocity.

seed

An integer (default 123) that is passed to the estimation method to ensure reproducibility.

estimation_method

A character string specifying the estimation method when calculating the velocity from the posterior draws. The 'fitted' method internally calls fitted_draws(), while the 'predict' method calls predict_draws(). See brms::fitted.brmsfit() and brms::predict.brmsfit() for details.

allow_new_levels

A flag indicating if new levels of group-level effects are allowed (defaults to FALSE). Only relevant if newdata is provided.

sample_new_levels

Indicates how to sample new levels for grouping factors specified in re_formula. This argument is only relevant if newdata is provided and allow_new_levels is set to TRUE. If "uncertainty" (default), each posterior sample for a new level is drawn from the posterior draws of a randomly chosen existing level. Each posterior sample for a new level may be drawn from a different existing level such that the resulting set of new posterior draws represents the variation across existing levels. If "gaussian", sample new levels from the (multivariate) normal distribution implied by the group-level standard deviations and correlations. This option may be useful for conducting Bayesian power analysis or predicting new levels in situations where relatively few levels were observed in the old data. If "old_levels", directly sample new levels from the existing levels, where a new level is assigned all of the posterior draws of the same (randomly chosen) existing level.

incl_autocor

A flag indicating if correlation structures originally specified via autocor should be included in the predictions. Defaults to TRUE.

robust

A logical value to specify the summary options. If FALSE (default), the mean is used as the measure of central tendency and the standard deviation as the measure of variability. If TRUE, the median and median absolute deviation (MAD) are applied instead. Ignored if summary is FALSE.

transform

A function applied to individual draws from the posterior distribution before computing summaries. The argument transform is based on the marginaleffects::predictions() function. This should not be confused with transform from brms::posterior_predict(), which is now deprecated.

future

A logical value (default FALSE) to specify whether or not to perform parallel computations. If set to TRUE, the future.apply::future_sapply() function is used to summarize the posterior draws in parallel.

future_session

A character string specifying the session type when future = TRUE. The 'multisession' (default) option sets the multisession environment, while the 'multicore' option sets up a multicore session. Note that 'multicore' is not supported on Windows systems. For more details, see future.apply::future_sapply().

cores

The number of cores to be used for parallel computations if future = TRUE. On non-Windows systems, this argument can be set globally via the mc.cores option. By default, NULL, the number of cores is automatically determined using future::availableCores(), and it will use the maximum number of cores available minus one (i.e., future::availableCores() - 1).

trim

A numeric value (default 0) indicating the number of long line segments to be excluded from the plot when the option 'u' or 'a' is selected. See sitar::plot.sitar for further details.

layout

A character string defining the plot layout. The default 'single' layout overlays distance and velocity curves on a single plot when opt includes combinations like 'dv', 'Dv', 'dV', or 'DV'. The alternative layout option 'facet' uses facet_wrap from ggplot2 to map and draw plots when opt includes two or more letters.

linecolor

The color of the lines when the layout is 'facet'. The default is NULL, which sets the line color to 'grey50'.

linecolor1

The color of the first line when the layout is 'single'. For example, in opt = 'dv', the distance line is controlled by linecolor1. The default NULL sets linecolor1 to 'orange2'.

linecolor2

The color of the second line when the layout is 'single'. For example, in opt = 'dv', the velocity line is controlled by linecolor2. The default NULL sets linecolor2 to 'green4'.

label.x

An optional character string to label the x-axis. If NULL (default), the x-axis label will be taken from the predictor (e.g., age).

label.y

An optional character string to label the y-axis. If NULL (default), the y-axis label will be taken from the plot type (e.g., distance, velocity). When layout = 'facet', the label is removed, and the same label is used as the title.

legendpos

A character string to specify the position of the legend. If NULL (default), the legend position is set to 'bottom' for distance and velocity curves in the 'single' layout. For individual-specific curves, the legend position is set to 'none' to suppress the legend.

linetype.apv

A character string to specify the type of the vertical line marking the APGV. Default NULL sets the linetype to dotted.

linewidth.main

A numeric value to specify the line width for distance and velocity curves. The default NULL sets the width to 0.35.

linewidth.apv

A numeric value to specify the width of the vertical line marking the APGV. The default NULL sets the width to 0.25.

linetype.groupby

A character string specifying the line type for distance and velocity curves when drawing plots for a model with factor covariates or individual-specific curves. The default is NA, which sets the line type to 'solid' and suppresses legends.

color.groupby

A character string specifying the line color for distance and velocity curves when drawing plots for a model with factor covariates or individual-specific curves. The default is NA, which suppresses legends.

band.alpha

A numeric value to specify the transparency of the CI bands around the curves. The default NULL sets the transparency to 0.4.

show_age_takeoff

A logical value (default TRUE) to indicate whether to display the ATGV line(s) on the plot.

show_age_peak

A logical value (default TRUE) to indicate whether to display the APGV line(s) on the plot.

show_age_cessation

A logical value (default TRUE) to indicate whether to display the ACGV line(s) on the plot.

show_vel_takeoff

A logical value (default FALSE) to indicate whether to display the TGV line(s) on the plot.

show_vel_peak

A logical value (default FALSE) to indicate whether to display the PGV line(s) on the plot.

show_vel_cessation

A logical value (default FALSE) to indicate whether to display the CGV line(s) on the plot.

returndata

A logical value (default FALSE) to indicate whether to plot the data or return it as a data.frame.

returndata_add_parms

A logical value (default FALSE) to specify whether to add growth parameters to the returned data.frame. Ignored when returndata = FALSE. Growth parameters are added when the opt argument includes 'v' or 'V' and apv = TRUE.

parms_eval

A logical value to specify whether or not to compute growth parameters on the fly. This is for internal use only and is mainly needed for compatibility across internal functions.

idata_method

A character string to indicate the interpolation method. The number of interpolation points is set by the ipts argument. Available options for idata_method are method 1 (specified as 'm1') and method 2 (specified as 'm2').

  • Method 1 ('m1') is adapted from the iapvbs package.

  • Method 2 ('m2') is based on the JMbayes package.

The 'm1' method works by internally constructing the data frame based on the model configuration, while the 'm2' method uses the exact data frame from the model fit, accessible via fit$data. If idata_method = NULL (default), method 'm2' is automatically selected. Note that method 'm1' may fail in certain cases, especially when the model includes covariates (particularly in univariate_by models). In such cases, it is recommended to use method 'm2'.

parms_method

A character string specifying the method used when evaluating parms_eval. The default method is getPeak, which uses the sitar::getPeak() function from the sitar package. Alternatively, findpeaks uses the findpeaks function from the pracma package. This parameter is for internal use and ensures compatibility across internal functions.

verbose

A logical argument (default FALSE) to specify whether to print information collected during the setup of the object(s).

fullframe

A logical value indicating whether to return a fullframe object in which newdata is bound to the summary estimates. Note that fullframe cannot be used with summary = FALSE, and it is only applicable when idata_method = 'm2'. A typical use case is when fitting a univariate_by model. This option is mainly for internal use.

dummy_to_factor

A named list (default NULL) to convert dummy variables into a factor variable. The list must include the following elements:

  • factor.dummy: A character vector of dummy variables to be converted to factors.

  • factor.name: The name for the newly created factor variable (default is 'factor.var' if NULL).

  • factor.level: A vector specifying the factor levels. If NULL, levels are taken from factor.dummy. If factor.level is provided, its length must match factor.dummy.

expose_function

A logical argument (default FALSE) to indicate whether Stan functions should be exposed. If TRUE, any Stan functions exposed during the model fit using expose_function = TRUE in the bsitar() function are saved and can be used in post-processing. By default, expose_function = FALSE in post-processing functions, except in optimize_model() where it is set to NULL. If NULL, the setting is inherited from the original model fit. It must be set to TRUE when adding fit criteria or bayes_R2 during model optimization.

usesavedfuns

A logical value (default NULL) indicating whether to use already exposed and saved Stan functions. This is typically set automatically based on the expose_function argument from the bsitar() call. Manual specification of usesavedfuns is rarely needed and is intended for internal testing, as improper use can lead to unreliable estimates.

clearenvfuns

A logical value indicating whether to clear the exposed Stan functions from the environment (TRUE) or not (FALSE). If NULL, clearenvfuns is set based on the value of usesavedfuns: TRUE if usesavedfuns = TRUE, or FALSE if usesavedfuns = FALSE.

funlist

A list (default NULL) specifying function names. This is rarely needed, as required functions are typically retrieved automatically. A use case for funlist is when sigma_formula, sigma_formula_gr, or sigma_formula_gr_str use an external function (e.g., poly(age)). The funlist should include function names defined in the globalenv(). For functions needing both distance and velocity curves (e.g., plot_curves(..., opt = 'dv')), funlist must include two functions: one for the distance curve and one for the velocity curve.

envir

The environment used for function evaluation. The default is NULL, which sets the environment to parent.frame(). Since most post-processing functions rely on brms, it is recommended to set envir = globalenv() or envir = .GlobalEnv, especially for derivatives like velocity curves.

...

Additional arguments passed to the brms::fitted.brmsfit() and brms::predict.brmsfit() functions.

Details

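The plot_curves() function is a generic tool for visualizing the following six curves, selected through the opt argument:

  • Population average distance curve ('d')

  • Population average velocity curve ('v')

  • Individual-specific distance curves ('D')

  • Individual-specific velocity curves ('V')

  • Unadjusted individual-specific distance curves ('u')

  • Adjusted individual-specific distance curves ('a')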

Internally, plot_curves() calls the growthparameters() function to estimate and summarize the distance and velocity curves, as well as to compute growth parameters such as the age at peak growth velocity (APGV). The function also calls fitted_draws() or predict_draws() to make inferences based on posterior draws. As a result, plot_curves() can plot either fitted or predicted curves. For more details, see fitted_draws() and predict_draws() to understand the difference between fitted and predicted values.

Value

A plot object (default) or a data.frame when returndata = TRUE.

Author(s)

Satpal Sandhu satpal.sandhu@bristol.ac.uk

See Also

growthparameters() fitted_draws() predict_draws()

Examples




# Fit Bayesian SITAR model 

# To avoid model estimation, which takes time, the Bayesian SITAR model is fit to 
# the 'berkeley_exdata' and saved as an example fit ('berkeley_exfit').
# See 'bsitar' function for details on 'berkeley_exdata' and 'berkeley_exfit'.

# Check and confirm whether the model fit object 'berkeley_exfit' exists
berkeley_exfit <- getNsObject(berkeley_exfit)

model <- berkeley_exfit

# Population average distance and velocity curves with default options
plot_curves(model, opt = 'dv')

# Individual-specific distance and velocity curves with default options
# Note that for individual-specific curves the legend is suppressed by 
# default (i.e., legendpos = 'none').

plot_curves(model, opt = 'DV')

# Population average distance and velocity curves with APGV

plot_curves(model, opt = 'dv', apv = TRUE)
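
# Same plot with customized line colors and x-axis label
# A minimal sketch: the color names and the label text used below are 
# arbitrary illustrative choices, not package defaults. In the 'single' 
# layout, linecolor1 controls the distance line and linecolor2 the 
# velocity line.

plot_curves(model, opt = 'dv', apv = TRUE,
            linecolor1 = 'blue', linecolor2 = 'red',
            label.x = 'Age')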

# Individual-specific distance and velocity curves with APGV

plot_curves(model, opt = 'DV', apv = TRUE)
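
# Individual-specific curves drawn over a common predictor range
# As described for the xrange argument, xrange = 2 applies an identical 
# range across the entire sample instead of the individual-specific range 
# (default NULL). This is a sketch only; the visible effect depends on how 
# much the observed age ranges differ between individuals.

plot_curves(model, opt = 'DV', xrange = 2)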

# Population average distance curve, velocity curve, and APGV with CI bands
# To construct CI bands, growth parameters are first calculated for each  
# posterior draw and then summarized across draws. Therefore, the summary 
# option must be set to FALSE

plot_curves(model, opt = 'dv', apv = TRUE, bands = 'dvp', summary = FALSE)
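
# Return the data behind the plot instead of the plot itself
# A minimal sketch: with returndata = TRUE the function returns a data.frame 
# that can be used to build a customized plot. The column names depend on 
# the model and on 'opt', so inspect the structure before plotting. The 
# object name 'curve_data' is an arbitrary choice.

curve_data <- plot_curves(model, opt = 'dv', returndata = TRUE)
str(curve_data)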

# Adjusted and unadjusted individual curves
# Note ipts = NULL (i.e., no interpolation of the predictor (i.e., age) to 
# plot a smooth curve). This is because it does not make sense to interpolate 
# data when estimating adjusted curves. Also, layout = 'facet' (and not the 
# default layout = 'single') is used for ease of visualizing the plotted 
# adjusted and unadjusted individual curves. However, these lines can be 
# superimposed on each other by setting layout = 'single'.
# For other plots shown above, layout can be set as 'single' or 'facet'

# Separate plots for adjusted and unadjusted curves (layout = 'facet')
plot_curves(model, opt = 'au', ipts = NULL, layout = 'facet')

# Superimposed adjusted and unadjusted curves (layout = 'single')
plot_curves(model, opt = 'au', ipts = NULL, layout = 'single')
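
# CI bands with the posterior draws summarized in parallel
# A sketch assuming the 'future' and 'future.apply' packages are installed. 
# cores = 2 is an arbitrary choice and should be adapted to the machine; 
# 'multisession' is used because 'multicore' is not supported on Windows.

plot_curves(model, opt = 'dv', apv = TRUE, bands = 'dvp', summary = FALSE,
            future = TRUE, future_session = 'multisession', cores = 2)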




[Package bsitar version 0.3.2 Index]