betareg.stan {phylopairs} | R Documentation |
betareg.stan
Description
Fits one of two beta regression models to a dataset in a Bayesian
framework using Stan
software via the rstan
package. Response variable must be
bounded between 0 and 1. Users can fit either (1) a basic beta regression model or a
(2) beta mixed-effects model in which there are covarying random intercepts in the
linear predictor. For the latter, users must supply a covariance matrix. In both
models, users can choose one of four link functions. Users can alter parameters for
model-parameter prior distributions and Bayesian sampling settings. See details.
Usage
betareg.stan(des, y, model="beta.reg", link="logit", covmat=NULL,
iter=2000, chains=4, coef.u=0, coef.sd=10, phi.shape=0.01, phi.rate=0.01,
physig2.u=-1, physig2.sd=1, cores=4, ...)
Arguments
des |
A vector of predictor variable observations OR, in the case of multiple predictors, a matrix in which each column is a vector of observations of a given predictor. |
y |
A vector of response variable observations. |
model |
One of "beta.reg" or "beta.mm" for a basic beta regression model or beta mixed model, respectively; defaults to "beta.reg". |
link |
The link function to be used. Default is "logit". Other possible choices are "probit", "cloglog" (complementary log-log), and "loglog". |
covmat |
Covariance matrix for model residuals (a lineage-pair covariance matrix if analyzing lineage-pair data or a phylogenetic vcv matrix if analyzing bounded species data). |
iter |
Number of iterations to run on each chain; defaults to 2000 (more are often necessary). |
chains |
Number of Markov chains to run; defaults to 4. |
coef.u |
Mean of the Gaussian prior for each preditor variable coefficient; defaults to 0. |
coef.sd |
SD of the Gaussian prior for each preditor variable coefficient; defaults to 10. |
phi.shape |
Shape parameter for gamma prior of beta distribution's precision parameter (phi); defaults to 0.01. |
phi.rate |
Rate parameter for gamma prior of beta distribution's precision parameter (phi); defaults to 0.01. |
physig2.u |
Mean of the lognormal prior for the scale of the residual covariance; defaults to -1. |
physig2.sd |
SD of the lognormal prior for the scale of the residual covariance; defaults to 1. |
cores |
Number of cores to be used; defaults to 4 (one chain per core for most laptops). |
... |
additional arguments passed to |
Details
The two models that can be chosen differ in terms of their linear predictor on the scale of the link function. These predictors are
Standard beta regression:
g(u) = XB
Beta mixed-effects regression:
g(u) = XB + u, u~N(0, physig2*C)
, where C is a lineage-pair covariance matrix (if analyzing lineage-pair data) or a phylogenetic vcv matrix (if analyzing species data) and physig2 is the scaling parameter for C.
NOTE: For model 2, the user must supply a covariance matrix. These are generated with the 'taxapair.vcv' function.
Prior Distributions for Model Parameters: The underlying stan models assume the following prior distributions for model parameters.
Regression coefficients: Gaussian prior (users can set prior mean and sd).
Precision parameter
phi
: gamma prior (users can set prior shape and rate of gamma).-
physig
: lognormal prior (users can set prior mean and sd).
Value
A list containing two elements: (1) the posterior distribution of model parameters, and (2) the log-likelihood of the posteriors for use in downstream analyses (e.g. the calculation of model fitting metrics like loo or waic). For interpreting model parameters, note that Coef[1]
is the intercept and Coef[2]
, Coef[3]
, ... , Coef[N]
are regression coefficient for the 1st-(N-1)th predictor variables.
References
Anderson, S. A. S., et al. In review. The comparative analysis of lineage-pair data.
Examples
## Example 1: Fit beta regression models with different link functions to independent data
# Load a dataset of independent response observations simulated with a logit link function
data(data5)
# Note: data were simulated with Coef[1]=1 (intercept), Coef[2]=0.8 (slope), phi=5
# Run the betareg function
result1 = betareg.stan(des=data5[,3], y=data5[,4], iter=1000, cores=2)
# Fit the model again but this time use a probit link function
result2 = betareg.stan(des=data5[,3], y=data5[,4], link="probit", iter=1000, cores=2)
# Compare posterior parameter estimates
result1[[1]]
result2[[1]]
## Example 2: Fit beta regression models to a dataset with simulated non-independence
# Load a dataset of non-independent response observations simulated with a logit link function
data(data7)
# Note: data were simulated with Coef[1]=1 (intercept), Coef[2]=0.8 (slope), phi=5
# Load the lineage-pair covariance matrix that arose from those simulations
data(sim.cov.pairs)
# Run the betareg function
result1 = betareg.stan(des=data7[,3], y=data7[,4], model="beta.mm", cov=sim.cov.pairs,
iter=1000, cores=2)
# Fit the model again but this time without the covariance matrix
result2 = betareg.stan(des=data7[,3], y=data7[,4], iter=1000, cores=2)
# Fit the model a third time with the cov. matrix but now with a probit link function
result3 = betareg.stan(des=data7[,3], y=data7[,4], model="beta.mm",
cov=sim.cov.pairs, link="probit", iter=1000, cores=2)
# Compare posterior parameter estimates
result1[[1]]
result2[[1]]
result3[[1]]
# Compare the fit of the three models via leave-on-out (loo) cross validation.
loo1 = suppressWarnings(loo::loo(result1[[2]]))
loo2 = suppressWarnings(loo::loo(result2[[2]]))
loo3 = suppressWarnings(loo::loo(result3[[2]]))
loo::loo_compare(loo1, loo2, loo3)