model |
An object of class bgmfit .
|
resp |
A character string (default NULL ) to specify the response
variable when processing posterior draws for univariate_by and
multivariate models. See bsitar() for details on
univariate_by and multivariate models.
|
dpar |
Optional name of a predicted distributional parameter.
If specified, expected predictions of this parameters are returned.
|
ndraws |
A positive integer indicating the number of posterior draws to
use in estimation. If NULL (default), all draws are used.
|
draw_ids |
An integer specifying the specific posterior draw(s) to use
in estimation (default NULL ).
|
newdata |
An optional data frame for estimation. If NULL
(default), newdata is retrieved from the model .
|
datagrid |
A grid of user-specified values to be used in the
newdata argument of various functions in the marginaleffects
package. This allows you to define the regions of the predictor space
where you want to evaluate the quantities of interest. See
marginaleffects::datagrid() for more details. By default, the
datagrid is set to NULL , meaning no custom grid is constructed.
To set a custom grid, the argument should either be a data frame created
using marginaleffects::datagrid() , or a named list, which is internally
used for constructing the grid. For convenience, you can also pass an empty
list datagrid = list() , in which case essential arguments like
model and newdata are inferred from the respective arguments
specified elsewhere. Additionally, the first-level predictor (such as age)
and any covariates included in the model (e.g., gender) are automatically
inferred from the model object.
|
re_formula |
Option to indicate whether or not to include
individual/group-level effects in the estimation. When NA (default),
individual-level effects are excluded, and population average growth
parameters are computed. When NULL , individual-level effects are
included in the computation, and the resulting growth parameters are
individual-specific. In both cases (NA or NULL ), continuous
and factor covariates are appropriately included in the estimation.
Continuous covariates are set to their means by default (see
numeric_cov_at for details), while factor covariates remain
unaltered, allowing for the estimation of covariate-specific population
average and individual-specific growth parameters.
|
allow_new_levels |
A flag indicating if new levels of group-level
effects are allowed (defaults to FALSE ). Only relevant if
newdata is provided.
|
sample_new_levels |
Indicates how to sample new levels for grouping
factors specified in re_formula . This argument is only relevant if
newdata is provided and allow_new_levels is set to
TRUE . If "uncertainty" (default), each posterior sample for a
new level is drawn from the posterior draws of a randomly chosen existing
level. Each posterior sample for a new level may be drawn from a different
existing level such that the resulting set of new posterior draws
represents the variation across existing levels. If "gaussian" ,
sample new levels from the (multivariate) normal distribution implied by the
group-level standard deviations and correlations. This options may be useful
for conducting Bayesian power analysis or predicting new levels in
situations where relatively few levels where observed in the old_data. If
"old_levels" , directly sample new levels from the existing levels,
where a new level is assigned all of the posterior draws of the same
(randomly chosen) existing level.
|
parameter |
A single character string or a character vector specifying
the growth parameter(s) to be estimated. Options include 'tgv'
(takeoff growth velocity), 'atgv' (age at takeoff growth velocity),
'pgv' (peak growth velocity), 'apgv' (age at peak growth
velocity), 'cgv' (cessation growth velocity), 'acgv' (age at
cessation growth velocity), and 'all' . If parameter = NULL
(default), age at peak growth velocity ('apgv' ) is estimated. When
parameter = 'all' , all six parameters are estimated. Note that the
'all' option cannot be used when the by argument is set to
TRUE .
|
xrange |
An integer to set the predictor range (e.g., age) when
executing the interpolation via ipts . By default, NULL sets
the individual-specific predictor range. Setting xrange = 1 applies
the same range for individuals within the same higher grouping variable
(e.g., study). Setting xrange = 2 applies an identical range across
the entire sample. Alternatively, a numeric vector (e.g., xrange =
c(6, 20) ) can be provided to set the range within the specified values.
|
acg_velocity |
A real number specifying the percentage of peak growth
velocity to be used as the cessation velocity when estimating the
cgv and acgv growth parameters. The acg_velocity
should be greater than 0 and less than 1 . The default value
of acg_velocity = 0.10 indicates that 10 percent of the peak growth
velocity will be used to calculate the cessation growth velocity and the
corresponding age at cessation velocity. For example, if the peak growth
velocity estimate is 10 mm/year , then the cessation growth velocity
will be 1 mm/year .
|
digits |
An integer (default 2 ) to set the decimal places for
rounding the results using the base::round() function.
|
numeric_cov_at |
An optional (named list) argument to specify the value
of continuous covariate(s). The default NULL option sets the
continuous covariate(s) to their mean. Alternatively, a named list can be
supplied to manually set these values. For example, numeric_cov_at =
list(xx = 2) will set the continuous covariate variable 'xx' to 2. The
argument numeric_cov_at is ignored when no continuous covariates are
included in the model.
|
aux_variables |
An optional argument to specify the variable(s) that can
be passed to the ipts argument (see below). This is useful when
fitting location-scale models and measurement error models. If
post-processing functions throw an error such as variable 'x' not
found in either 'data' or 'data2' , consider using aux_variables .
|
levels_id |
An optional argument to specify the ids for the
hierarchical model (default NULL ). It is used only when the model is
applied to data with three or more levels of hierarchy. For a two-level
model, levels_id is automatically inferred from the model fit. For
models with three or more levels, levels_id is inferred from the
model fit under the assumption that hierarchy is specified from the lowest
to the uppermost level, i.e., id followed by study , where
id is nested within study . However, it is not guaranteed that
levels_id is sorted correctly, so it is better to set it manually
when fitting a model with three or more levels of hierarchy.
|
avg_reffects |
An optional argument (default NULL ) to calculate
(marginal/average) curves and growth parameters, such as APGV and PGV. If
specified, it must be a named list indicating the over (typically a
level 1 predictor, such as age), feby (fixed effects, typically a
factor variable), and reby (typically NULL , indicating that
parameters are integrated over the random effects). For example,
avg_reffects = list(feby = 'study', reby = NULL, over = 'age') .
|
idata_method |
A character string to indicate the interpolation method.
The number of interpolation points is set by the ipts argument.
Available options for idata_method are method 1 (specified as
'm1' ) and method 2 (specified as 'm2' ).
-
Method 1 ('m1' ) is adapted from the iapvbs package
and is documented
here.
-
Method 2 ('m2' ) is based on the JMbayes package
and is documented
here.
The 'm1' method works by internally constructing the data frame
based on the model configuration, while the 'm2' method uses the
exact data frame from the model fit, accessible via fit$data . If
idata_method = NULL (default), method 'm2' is automatically
selected. Note that method 'm1' may fail in certain cases,
especially when the model includes covariates (particularly in
univariate_by models). In such cases, it is recommended to use
method 'm2' .
|
ipts |
An integer to set the length of the predictor variable for
generating a smooth velocity curve. If NULL , the original values are
returned. If an integer (e.g., ipts = 10 , default), the predictor is
interpolated. Note that these interpolations do not alter the range of the
predictor when calculating population averages and/or individual-specific
growth curves.
|
seed |
An integer (default 123 ) that is passed to the estimation
method to ensure reproducibility.
|
future |
A logical value (default FALSE ) to specify whether or
not to perform parallel computations. If set to TRUE , the
future.apply::future_sapply() function is used to summarize the posterior
draws in parallel.
|
future_session |
A character string specifying the session type when
future = TRUE . The 'multisession' (default) option sets the
multisession environment, while the 'multicore' option sets up a
multicore session. Note that 'multicore' is not supported on Windows
systems. For more details, see future.apply::future_sapply() .
|
future_splits |
A list (default NULL ) that can be an unnamed
numeric list, a logical value, or a numeric vector of length 1 or 2. It is
used to split the processing of posterior draws into smaller subsets for
parallel computation.
If passed as a list (e.g., future_splits = list(1:6, 7:10) ),
each sequence of
numbers is passed to the draw_ids argument.
If passed as a numeric vector (e.g., future_splits = c(10, 2) ),
the first element
specifies the number of draws (see draw_ids ) and the second element
indicates the number of splits. The splits are created using
parallel::splitIndices() .
If passed as a numeric vector of length 1, the first element is
internally set as the
number of draws (ndraws or draw_ids ) depending on which one
is not NULL .
If TRUE , a numeric vector for future_splits is created
based on the number
of draws (ndraws ) and the number of cores (cores ).
If FALSE , future_splits is ignored.
The use case for future_splits is to save memory and improve
performance, especially on Linux systems when future::plan()
is set to multicore . Note: on Windows systems, R processes may not
be freed automatically when using 'multisession' . In such cases, the
R processes can be interrupted using installr::kill_all_Rscript_s() .
|
future_method |
A character string (default 'future' ) to specify
the method for parallel computation. Options include:
|
future_re_expose |
A logical (default NULL ) to indicate whether
to re-expose Stan functions when future = TRUE . This is
especially relevant when future::plan() is set to 'multisession' ,
as already exposed C++ Stan functions cannot be passed across
multiple sessions.
When future_re_expose = NULL (the default), future_re_expose
is automatically set to TRUE for the 'multisession' plan.
It is advised to explicitly set future_re_expose = TRUE for speed
gains when using parallel processing with future = TRUE .
|
usedtplyr |
A logical (default FALSE ) indicating whether to use
the dtplyr package for summarizing the draws. This package uses
data.table as a back-end. It is useful when the data has a large
number of observations. For typical use cases, it does not make a
significant performance difference. The usedtplyr argument is
evaluated only when method = 'custom' .
|
usecollapse |
A logical (default FALSE ) to indicate whether to
use the collapse package for summarizing the draws.
|
cores |
The number of cores to be used for parallel computations if
future = TRUE . On non-Windows systems, this argument can be set
globally via the mc.cores option. By default, NULL , the
number of cores is automatically determined using
future::availableCores() , and it will use the maximum number of cores
available minus one (i.e., future::availableCores() - 1 ).
|
fullframe |
A logical value indicating whether to return a
fullframe object in which newdata is bound to the summary
estimates. Note that fullframe cannot be used with summary =
FALSE , and it is only applicable when idata_method = 'm2' . A
typical use case is when fitting a univariate_by model. This option
is mainly for internal use.
|
average |
A logical indicating whether to internally call the
marginaleffects::predictions() or marginaleffects::avg_predictions()
function. If FALSE (default), marginaleffects::predictions() is
called; otherwise, marginaleffects::avg_predictions() is used when
average = TRUE .
|
plot |
A logical specifying whether to plot predictions by calling
marginaleffects::plot_predictions() (TRUE ) or not (FALSE ).
If FALSE (default), marginaleffects::predictions() or
marginaleffects::avg_predictions() are called to compute predictions (see
average for details). Note that
marginaleffects::plot_predictions() allows either condition or
by arguments, but not both. Therefore, when the condition
argument is not NULL , the by argument is set to NULL .
This step is required because marginal_draws() automatically
assigns the by argument when the model includes a covariate.
|
showlegends |
A logical value to specify whether to show legends
(TRUE ) or not (FALSE ). If NULL (default), the value of
showlegends is internally set to TRUE if re_formula =
NA , and FALSE if re_formula = NULL .
|
variables |
A named list specifying the level 1 predictor, such as
age or time , used for estimating growth parameters in the
current use case. The variables list is set via the esp
argument (default value is 1e-6 ). If variables is
NULL , the relevant information is retrieved internally from the
model . Alternatively, users can define variables as a named
list, e.g., variables = list('x' = 1e-6) where 'x' is the
level 1 predictor. By default, variables = list('age' = 1e-6) in the
marginaleffects package, as velocity is usually computed by
differentiating the distance curve using the dydx approach. When
using this default, the argument deriv is automatically set to
0 and deriv_model to FALSE . If parameters are to be
estimated based on the model's first derivative, deriv must be set
to 1 and variables will be defined as variables =
list('age' = 0) . Note that if the default behavior is used (deriv =
0 and variables = list('x' = 1e-6) ), additional arguments cannot be
passed to variables . In contrast, when using an alternative approach
(deriv = 0 and variables = list('x' = 0) ), additional options
can be passed to the marginaleffects::comparisons() and
marginaleffects::avg_comparisons() functions.
|
condition |
Conditional predictions
Character vector (max length 4): Names of the predictors to display.
Named list (max length 4): List names correspond to predictors. List elements can be:
Numeric vector
Function which returns a numeric vector or a set of unique categorical values
Shortcut strings for common reference values: "minmax", "quartile", "threenum"
1: x-axis. 2: color/shape. 3: facet (wrap if no fourth variable, otherwise cols of grid). 4: facet (rows of grid).
Numeric variables in positions 2 and 3 are summarized by Tukey's five numbers ?stats::fivenum
|
deriv |
An integer to indicate whether to estimate the distance curve or
its derivative (i.e., velocity curve). The deriv = 0 (default) is
for the distance curve, whereas deriv = 1 is for the velocity curve.
|
deriv_model |
A logical value specifying whether to estimate the
velocity curve from the derivative function or by differentiating the
distance curve. Set deriv_model = TRUE for functions that require
the velocity curve, such as growthparameters() and
plot_curves() . Set it to NULL for functions that use the
distance curve (i.e., fitted values), such as loo_validation() and
plot_ppc() .
|
method |
A character string specifying the computation method: whether
to use the marginaleffects machinery at the post-draw stage, i.e.,
marginaleffects::comparisons() (method = 'pkg' ) or to use custom
functions for efficiency and speed (method = 'custom' , default).
Note that method = 'custom' is particularly useful when testing
hypotheses. Also, when method = 'custom' ,
marginaleffects::predictions() is used internally instead of
marginaleffects::comparisons() .
|
marginals |
A list , data.frame , or tibble returned
by the marginaleffects functions (default NULL ). This is only
evaluated when method = 'custom' . The marginals can be the
output from marginaleffects functions or posterior draws from
marginaleffects::posterior_draws() . The marginals argument is
primarily used for internal purposes.
|
pdrawso |
A character string (default FALSE ) to indicate whether
to return the original posterior draws for parameters. Options include:
-
'return' : returns the original posterior draws,
-
'add' : adds the original posterior draws to the outcome.
When pdrawso = TRUE , the default behavior is pdrawso =
'return' . Note that the posterior draws are returned before calling
marginaleffects::posterior_draws() .
|
pdrawsp |
A character string (default FALSE ) to indicate whether
to return the posterior draws for parameters. Options include:
-
'return' : returns the posterior draws for parameters,
-
'add' : adds the posterior draws to the outcome.
When pdrawsp = TRUE , the default behavior is pdrawsp =
'return' . The pdrawsp represent the parameter estimates for each of
the posterior samples, and the summary of these are the estimates returned.
|
pdrawsh |
A character string (default FALSE ) to indicate whether
to return the posterior draws for parameter contrasts. Options include:
The summary of posterior draws for parameters is the default returned
object. The pdrawsh represent the contrast estimates for each of the
posterior samples, and the summary of these are the contrast returned.
|
type |
string indicates the type (scale) of the predictions used to
compute contrasts or slopes. This can differ based on the model
type, but will typically be a string such as: "response", "link", "probs",
or "zero". When an unsupported string is entered, the model-specific list of
acceptable values is returned in an error message. When type is NULL , the
first entry in the error message is used by default.
|
by |
Aggregate unit-level estimates (aka, marginalize, average over). Valid inputs:
-
FALSE : return the original unit-level estimates.
-
TRUE : aggregate estimates for each term.
Character vector of column names in newdata or in the data frame produced by calling the function without the by argument.
Data frame with a by column of group labels, and merging columns shared by newdata or the data frame produced by calling the same function without the by argument.
See examples below.
For more complex aggregations, you can use the FUN argument of the hypotheses() function. See that function's documentation and the Hypothesis Test vignettes on the marginaleffects website.
|
conf_level |
numeric value between 0 and 1. Confidence level to use to build a confidence interval.
|
transform |
A function applied to individual draws from the posterior
distribution before computing summaries. The argument transform is
based on the marginaleffects::predictions() function. This should not be
confused with transform from brms::posterior_predict() , which is
now deprecated.
|
byfun |
A function such as mean() or sum() used to aggregate
estimates within the subgroups defined by the by argument. NULL uses the
mean() function. Must accept a numeric vector and return a single numeric
value. This is sometimes used to take the sum or mean of predicted
probabilities across outcome or predictor
levels. See examples section.
|
wts |
logical, string or numeric: weights to use when computing average predictions, contrasts or slopes. These weights only affect the averaging in avg_*() or with the by argument, and not unit-level estimates. See ?weighted.mean
string: column name of the weights variable in newdata . When supplying a column name to wts , it is recommended to supply the original data (including the weights variable) explicitly to newdata .
numeric: vector of length equal to the number of rows in the original data or in newdata (if supplied).
FALSE: Equal weights.
TRUE: Extract weights from the fitted object with insight::find_weights() and use them when taking weighted averages of estimates. Warning: newdata=datagrid() returns a single average weight, which is equivalent to using wts=FALSE
|
hypothesis |
specify a hypothesis test or custom contrast using a number , formula, string equation, vector, matrix, or function.
Number: The null hypothesis used in the computation of Z and p (before applying transform ).
String: Equation to specify linear or non-linear hypothesis tests. If the terms in coef(object) uniquely identify estimates, they can be used in the formula. Otherwise, use b1 , b2 , etc. to identify the position of each parameter. The b* wildcard can be used to test hypotheses on all estimates. If a named vector is used, the names are used as labels in the output. Examples:
-
hp = drat
-
hp + drat = 12
-
b1 + b2 + b3 = 0
-
b* / b1 = 1
Formula: lhs ~ rhs | group
-
lhs
-
rhs
-
pairwise and revpairwise : pairwise differences between estimates in each row.
-
reference : differences between the estimates in each row and the estimate in the first row.
-
sequential : difference between an estimate and the estimate in the next row.
-
meandev : difference between an estimate and the mean of all estimates.
'meanotherdev: difference between an estimate and the mean of all other estimates, excluding the current one.
-
poly : polynomial contrasts, as computed by the stats::contr.poly() function.
-
helmert : Helmert contrasts, as computed by the stats::contr.helmert() function. Contrast 2nd level to the first, 3rd to the average of the first two, and so on.
-
trt_vs_ctrl : difference between the mean of estimates (except the first) and the first estimate.
-
I(fun(x)) : custom function to manipulate the vector of estimates x . The function fun() can return multiple (potentially named) estimates.
-
group (optional)
Examples:
-
~ poly
-
~ sequential | groupid
-
~ reference
-
ratio ~ pairwise
-
difference ~ pairwise | groupid
-
~ I(x - mean(x)) | groupid
-
~ I(\(x) c(a = x[1], b = mean(x[2:3]))) | groupid
Matrix or Vector: Each column is a vector of weights. The the output is the dot product between these vectors of weights and the vector of estimates. The matrix can have column names to label the estimates.
Function:
Accepts an argument x : object produced by a marginaleffects function or a data frame with column rowid and estimate
Returns a data frame with columns term and estimate (mandatory) and rowid (optional).
The function can also accept optional input arguments: newdata , by , draws .
This function approach will not work for Bayesian models or with bootstrapping. In those cases, it is easy to use get_draws() to extract and manipulate the draws directly.
See the Examples section below and the vignette: https://marginaleffects.com/chapters/hypothesis.html
|
equivalence |
Numeric vector of length 2: bounds used for the two-one-sided test (TOST) of equivalence, and for the non-inferiority and non-superiority tests. See Details section below.
|
constrats_by |
A character vector (default NULL ) specifying the
variable(s) by which hypotheses (at the post-draw stage) should be tested.
Note that the variable(s) in constrats_by should be a subset of the
variables included in the 'by' argument.
|
constrats_at |
A character vector (default NULL ) specifying the
variable(s) at which hypotheses (at the post-draw stage) should be tested.
constrats_at is particularly useful when the number of rows in the
estimates is large because marginaleffects does not allow hypotheses
testing when the number of rows exceeds 25.
|
reformat |
A logical (default TRUE ) indicating whether to
reformat the output returned by marginaleffects as a data frame.
Column names are redefined as conf.low to Q2.5 and
conf.high to Q97.5 (assuming conf_int = 0.95 ).
Additionally, some columns (term , contrast , etc.) are dropped
from the data frame.
|
estimate_center |
A character string (default NULL ) specifying
how to center estimates: either 'mean' or 'median' . This
option sets the global options as follows:
options("marginaleffects_posterior_center" = "mean") or
options("marginaleffects_posterior_center" = "median") . These global
options are restored upon function exit using base::on.exit() .
|
estimate_interval |
A character string (default NULL ) to specify
the type of credible intervals: 'eti' for equal-tailed intervals or
'hdi' for highest density intervals. This option sets the global
options as follows: options("marginaleffects_posterior_interval" =
"eti") or options("marginaleffects_posterior_interval" = "hdi") ,
and is restored on exit using base::on.exit() .
|
dummy_to_factor |
A named list (default NULL ) to convert dummy
variables into a factor variable. The list must include the following
elements:
-
factor.dummy : A character vector of dummy variables to be
converted to factors.
-
factor.name : The name for the newly created factor variable
(default is 'factor.var' if NULL ).
-
factor.level : A vector specifying the factor levels.
If NULL , levels are taken from factor.dummy .
If factor.level is provided, its length must match
factor.dummy .
|
verbose |
A logical argument (default FALSE ) to specify whether
to print information collected during the setup of the object(s).
|
expose_function |
A logical argument (default FALSE ) to indicate
whether Stan functions should be exposed. If TRUE , any Stan
functions exposed during the model fit using expose_function = TRUE
in the bsitar() function are saved and can be used in post-processing. By
default, expose_function = FALSE in post-processing functions,
except in optimize_model() where it is set to NULL . If
NULL , the setting is inherited from the original model fit. It must
be set to TRUE when adding fit criteria or bayes_R2
during model optimization.
|
usesavedfuns |
A logical value (default NULL ) indicating whether
to use already exposed and saved Stan functions. This is typically set
automatically based on the expose_functions argument from the
bsitar() call. Manual specification of usesavedfuns is rarely
needed and is intended for internal testing, as improper use can lead to
unreliable estimates.
|
clearenvfuns |
A logical value indicating whether to clear the exposed
Stan functions from the environment (TRUE ) or not (FALSE ). If
NULL , clearenvfuns is set based on the value of
usesavedfuns : TRUE if usesavedfuns = TRUE , or
FALSE if usesavedfuns = FALSE .
|
funlist |
A list (default NULL ) specifying function names. This
is rarely needed, as required functions are typically retrieved
automatically. A use case for funlist is when sigma_formula ,
sigma_formula_gr , or sigma_formula_gr_str use an external
function (e.g., poly(age) ). The funlist should include
function names defined in the globalenv() . For functions needing
both distance and velocity curves (e.g., plot_curves(..., opt =
'dv') ), funlist must include two functions: one for the distance
curve and one for the velocity curve.
|
envir |
The environment used for function evaluation. The default is
NULL , which sets the environment to parent.frame() . Since
most post-processing functions rely on brms, it is recommended to set
envir = globalenv() or envir = .GlobalEnv , especially for
derivatives like velocity curves.
|
... |
Additional arguments passed to the brms::fitted.brmsfit()
function. Please see brms::fitted.brmsfit() for details on
various options available.
|