spGLMstack {spStack} | R Documentation |
Bayesian spatial generalized linear model using predictive stacking
Description
Fits Bayesian spatial generalized linear model on a collection of candidate models constructed based on some candidate values of some model parameters specified by the user and subsequently combines inference by stacking predictive densities. See Pan, Zhang, Bradley, and Banerjee (2024) for more details.
Usage
spGLMstack(
formula,
data = parent.frame(),
family,
coords,
cor.fn,
priors,
params.list,
n.samples,
loopd.controls,
parallel = FALSE,
solver = "ECOS",
verbose = TRUE,
...
)
Arguments
formula |
a symbolic description of the regression model to be fit. See example below. |
data |
an optional data frame containing the variables in the model.
If not found in |
family |
Specifies the distribution of the response as a member of the
exponential family. Supported options are |
coords |
an |
cor.fn |
a quoted keyword that specifies the correlation function used
to model the spatial dependence structure among the observations. Supported
covariance model key words are: |
priors |
(optional) a list with each tag corresponding to a parameter
name and containing prior details. Valid tags include |
params.list |
a list containing candidate values of spatial process
parameters for the |
n.samples |
number of posterior samples to be generated. |
loopd.controls |
a list with details on how leave-one-out predictive
densities (LOO-PD) are to be calculated. Valid tags include |
parallel |
logical. If |
solver |
(optional) Specifies the name of the solver that will be used
to obtain optimal stacking weights for each candidate model. Default is
|
verbose |
logical. If |
... |
currently no additional argument. |
Details
Instead of assigning a prior on the process parameters \phi
and \nu
, the boundary adjustment parameter \epsilon
, we consider
a set of candidate models based on some candidate values of these parameters
supplied by the user. Suppose the set of candidate models is
\mathcal{M} = \{M_1, \ldots, M_G\}
. Then for each
g = 1, \ldots, G
, we sample from the posterior distribution
p(\sigma^2, \beta, z \mid y, M_g)
under the model M_g
and find
leave-one-out predictive densities p(y_i \mid y_{-i}, M_g)
. Then we
solve the optimization problem
\begin{aligned}
\max_{w_1, \ldots, w_G}& \, \frac{1}{n} \sum_{i = 1}^n \log \sum_{g = 1}^G
w_g p(y_i \mid y_{-i}, M_g) \\
\text{subject to} & \quad w_g \geq 0, \sum_{g = 1}^G w_g = 1
\end{aligned}
to find the optimal stacking weights \hat{w}_1, \ldots, \hat{w}_G
.
Value
An object of class spGLMstack
, which is a list including the
following tags -
family
the distribution of the responses as indicated in the function call
samples
a list of length equal to total number of candidate models with each entry corresponding to a list of length 3, containing posterior samples of fixed effects (
beta
), spatial effects (z
) and fine-scale variation term (xi
) for that particular model.loopd
a list of length equal to total number of candidate models with each entry containing leave-one-out predictive densities under that particular model.
loopd.method
a list containing details of the algorithm used for calculation of leave-one-out predictive densities.
n.models
number of candidate models that are fit.
candidate.models
a matrix with
n_model
rows with each row containing details of the model parameters and its optimal weight.stacking.weights
a numeric vector of length equal to the number of candidate models storing the optimal stacking weights.
run.time
a
proc_time
object with runtime details.solver.status
solver status as returned by the optimization routine.
The return object might include additional data that is useful for subsequent prediction, model fit evaluation and other utilities.
Author(s)
Soumyakanti Pan span18@ucla.edu,
Sudipto Banerjee sudipto@ucla.edu
References
Pan S, Zhang L, Bradley JR, Banerjee S (2024). "Bayesian Inference for Spatial-temporal Non-Gaussian Data Using Predictive Stacking." doi:10.48550/arXiv.2406.04655.
Vehtari A, Simpson D, Gelman A, Yao Y, Gabry J (2024). "Pareto Smoothed Importance Sampling." Journal of Machine Learning Research, 25(72), 1-58. URL https://jmlr.org/papers/v25/19-556.html.
See Also
Examples
set.seed(1234)
data("simPoisson")
dat <- simPoisson[1:100,]
mod1 <- spGLMstack(y ~ x1, data = dat, family = "poisson",
coords = as.matrix(dat[, c("s1", "s2")]), cor.fn = "matern",
params.list = list(phi = c(3, 7, 10), nu = c(0.25, 0.5, 1.5),
boundary = c(0.5, 0.6)),
n.samples = 1000,
loopd.controls = list(method = "CV", CV.K = 10, nMC = 1000),
parallel = TRUE, solver = "ECOS", verbose = TRUE)
# print(mod1$solver.status)
# print(mod1$run.time)
post_samps <- stackedSampler(mod1)
post_beta <- post_samps$beta
print(t(apply(post_beta, 1, function(x) quantile(x, c(0.025, 0.5, 0.975)))))
post_z <- post_samps$z
post_z_summ <- t(apply(post_z, 1, function(x) quantile(x, c(0.025, 0.5, 0.975))))
z_combn <- data.frame(z = dat$z_true,
zL = post_z_summ[, 1],
zM = post_z_summ[, 2],
zU = post_z_summ[, 3])
library(ggplot2)
plot_z <- ggplot(data = z_combn, aes(x = z)) +
geom_errorbar(aes(ymin = zL, ymax = zU),
width = 0.05, alpha = 0.15,
color = "skyblue") +
geom_point(aes(y = zM), size = 0.25,
color = "darkblue", alpha = 0.5) +
geom_abline(slope = 1, intercept = 0,
color = "red", linetype = "solid") +
xlab("True z") + ylab("Posterior of z") +
theme_bw() +
theme(panel.background = element_blank(),
aspect.ratio = 1)