bhetGP {bhetGP}R Documentation

MCMC sampling for Heteroskedastic GP

Description

Conducts MCMC sampling of hyperparameters and latent noise process llam for a hetGP. Separate length scale parameters theta_lam and theta_y govern the correlation strength of the hidden layer and outer layer respectively. lam layer may have a non-zero nugget g which governs noise for the latent noise layer. tau2_y and tau2_lam control the amplitude of the mean and noise process respectively. In Matern covariance, v governs smoothness.

Usage

bhetGP(
  x = NULL,
  y = NULL,
  reps_list = NULL,
  nmcmc = 1000,
  sep = TRUE,
  inits = NULL,
  priors = NULL,
  reps = TRUE,
  cov = c("exp2", "matern", "ARD matern"),
  v = 2.5,
  stratergy = c("default", "flat"),
  vecchia = FALSE,
  m = min(25, length(y) - 1),
  ordering = NULL,
  verb = TRUE,
  omp_cores = 4
)

Arguments

x

vector or matrix of input locations

y

vector of response values

reps_list

list object from hetGP::find_reps

nmcmc

number of MCMC iterations

sep

logical indicating whether to fit isotropic GP (sep = FALSE) or seperable GP (sep = TRUE)

inits

set initial values for hyparameters: llam, theta_y, theta_lam, g, mean_y, mean_lam, scale_y, scale_lam. Additionally, set initial conditions for tuning:

  • theta_check: logical; if theta_check = TRUE, then ensures that theta_lam > theta_y i.e., decay of correlation for noise process is slower than mean process.

  • prof_ll_lam: logical; if prof_ll_lam = TRUE, infers tau2_lam i.e., scale parameter for latent noise process

  • noise: logical; if noise = TRUE, infers nugget g throught M-H for latent noise process.

priors

hyperparameters for priors and proposals (see details)

reps

logical; if reps = TRUE uses Woodbury inference adjusting for replication of design points and reps = FALSE does not use Woodbury inference

cov

covariance kernel, either Matern, ARD Matern or squared exponential ("exp2")

v

Matern smoothness parameter (only used if cov = "matern")

stratergy

choose initialization stratergy; "default" uses hetGP for vecchia = FALSE settings and sVecchia for vecchia = TRUE. See details.

vecchia

logical indicating whether to use Vecchia approximation

m

size of Vecchia conditioning sets (only used if vecchia = TRUE)

ordering

optional ordering for Vecchia approximation, must correspond to rows of x, defaults to random, is applied to x (only used if vecchia = TRUE)

verb

logical indicating whether to print progress

omp_cores

logical; if vecchia = TRUE, user may specify the number of cores to use for OpenMP parallelization. Uses min(4, limit) where limit is max openMP cores available on the machine.

Details

Maps inputs x to mean response y and noise levels llam. Conducts sampling of the latent noise process using Elliptical Slice sampling. Utilizes Metropolis Hastings sampling of the length scale and nugget parameters with proposals and priors controlled by priors. g for the noise process is set to a specific value, and by default, is not estimated. When vecchia = TRUE, all calculations leverage the Vecchia approximation with specified conditioning set size m. tau2_y is always inferred from likelihood; tau2_lam is inferred by default but may be pre-specified and fixed.

NOTE on OpenMP: The Vecchia implementation relies on OpenMP parallelization for efficient computation. This function will produce a warning message if the package was installed without OpenMP (this is the default for CRAN packages installed on Apple machines). To set up OpenMP parallelization, download the package source code and install using the gcc/g++ compiler.

Proposals for g and theta follow a uniform sliding window scheme, e.g.

theta_star <- runif(1, l * theta_t / u, u * theta_t / l),

with defaults l = 1 and u = 2 provided in priors. To adjust these, set priors = list(l = new_l, u = new_u). Priors on g, theta_y, and theta_lam follow Gamma distributions with shape parameters (alpha) and rate parameters (beta) controlled within the priors list object. Defaults are

tau2_y and tau2_lam are not sampled; rather directly inferred under conjugate Inverse Gamma prior with shape (alpha) and scale parameters (beta) within the priors list object

These priors are designed for x scaled to [0, 1] and y having mean mean_y. These may be adjusted using the priors input.

Initial values for theta_y, theta_lam, llam may be specified by the user. If no initial values are specified, stratergy will determine the initialization method. stratergy = "default" leverages mleHetGP for initial values of hyper-parameters if vecchia = FALSE and Scaled-Vecchia with Stochastic Kriging (Sk-Vec) hybrid approach if vecchia = TRUE.

For SK-Vec hybrid approach, scaled Vecchia code from https://github.com/katzfuss-group/scaledVecchia/blob/master/vecchia_scaled.R is used to fit two GPs using the Vecchia approximation. The first for (x, y) pairs, which result in estimated residual sums of squares based on predicted y values. Another GP on (x, s) to obtain latent noise estimates which are smoothed. A script is leveraged internally within this package that fits this method.

Optionally, choose stratergy = "flat" which which will start at uninformative initial values; llam = log(var(y) * 0.1) or specify initial values.

The output object of class bhetgp or bhetgp_vec is designed for use with trim, predict, and plot.

Value

a list of the S3 class bhetgp or bhetgp_vec with elements:

References

Binois, Mickael, Robert B. Gramacy, and Mike Ludkovski. "Practical heteroscedastic Gaussian process modeling for large simulation experiments." Journal of Computational and Graphical Statistics 27.4 (2018): 808-821.

Katzfuss, Matthias, Joseph Guinness, and Earl Lawrence. "Scaled Vecchia approximation for fast computer-model emulation." SIAM/ASA Journal on Uncertainty Quantification 10.2 (2022): 537-554.

Sauer, Annie Elizabeth. "Deep Gaussian process surrogates for computer experiments." (2023).

Examples


# 1D function with 1D noise 

# Truth
fx <- function(x){
result <- (6 * x - 2)^2* sin(12 * x - 4)
}

# Noise
rx <- function(x){
result <- (1.1 + sin(2 * pi * x))^2
return(result)
}

# Training data
r <- 10 # replicates
xn <- seq(0, 1, length = 25)
x <- rep(xn, r)

rn <- drop(rx(x))
noise <- as.numeric(t(mvtnorm::rmvnorm(r, sigma = diag(rn, length(xn)))))

f <- fx(x) 
y <- f + noise

# Testing data
xx <- seq(0, 1, length = 100)
yy <- fx(xx)

#--------------------------------------------------------------------------- 
# Example 1: Full model, no Vecchia 
#---------------------------------------------------------------------------

# Fitting a bhetGP model using all the data
fit <- bhetGP(x, y, nmcmc = 100, verb = FALSE)

# Trimming the object to remove burn in and thin samples
fit <- trim(fit, 50, 10)

# Predition using the bhetGP object (indepedent predictions)
fit <- predict(fit, xx, cores = 2) 

# Visualizing the mean predictive surface. 
# Can run plot(fit, trace = TRUE) to view trace plots
plot(fit) 


#---------------------------------------------------------------------------
# Example 2: Vecchia approximated model
#---------------------------------------------------------------------------

# Fitting a bhetGP model with vecchia approximation. Two cores for OpenMP
fit <- bhetGP(x, y, nmcmc = 100, vecchia = TRUE, m = 5, omp_cores = 2, verb = FALSE)

# Trimming the object to remove burn in and thin samples
fit <- trim(fit, 50, 10)

# Predition using the bhetGP_vec object with joint predictions (lite = FALSE)
# Two cores for OpenMP, default setting (omp_cores = 2). No SNOW
fit <- predict(fit, xx, lite = FALSE, vecchia = TRUE) 

# Visualizing the mean predictive surface
plot(fit)


#--------------------------------------------------------------------------- 
# Example 3: Vecchia inference, non-vecchia predictions
#---------------------------------------------------------------------------

# Fitting a bhetGP model with vecchia approximation. Two cores for OpenMP
fit <- bhetGP(x, y, nmcmc = 200, vecchia = TRUE, m = 5, omp_cores = 2)

# Trimming the object to remove burn in and thin samples
fit <- trim(fit, 100, 10)

# Predition using the bhetGP object with joint predictions (lite = FALSE)
# Two cores for OpenMP which is default setting (omp_cores = 2)
# Two cores for SNOW (cores = 2)
fit <- predict(fit, xx, vecchia = FALSE, cores = 2, lite = FALSE)

# Visualizing the mean predictive surface
plot(fit)



[Package bhetGP version 1.0.1 Index]