thames_mixtures {thamesmix} | R Documentation |
THAMES estimator of the reciprocal log marginal likelihood for mixture models
Description
This function computes the THAMES estimate of the reciprocal log marginal likelihood for mixture models using posterior samples and unnormalized log posterior values.
Usage
thames_mixtures(
logpost,
sims,
n_samples = NULL,
c_opt = NULL,
type = "simple",
seed = NULL,
lps = NULL,
lps_unif = NULL,
max_iters = Inf
)
Arguments
logpost |
function logpost(sims,G) to compute lps with input "sims" |
sims |
n_simul x G x (u+1) array of parameters sampled from the posterior, where n_simul is the number of simulations from the posterior, G is the number of components, u is the number of mixture component parameters (parameter u+1 is the mixture weight) |
n_samples |
integer, number of posterior samples |
c_opt |
radius of the ellipsoid used to compute the THAMES |
type |
THAMES variant ("simple", "permutations", or "standard") |
seed |
a seed |
lps |
values of the unnormalized log posterior density |
lps_unif |
values of the unnormalized log posterior density, evaluated on a uniform sample on the posterior ellipsoid |
max_iters |
maximum number of shrinkage iterations |
Value
Returns a named list with the following elements:
theta_hat, posterior mean
sigma_hat, posterior covariance matrix
log_det_sigma_hat, log-determinant of sigma_hat
logvolA, log-volume of the ellipsoid
log_zhat_inv, log-reciprocal-marginal likelihood
log_zhat_inv_L, lower bound
log_zhat_inv_U, upper bound
alpha, HPD-region correction
len_perms, number of permutations evaluated
log_cor, log-correction of the volume of the ellipsoid
etas, Monte-Carlo sample on the ellipsoid
graph, the overlap graph for G
se, standard_error
phi, ar(1) model parameter
c_opt, radius of the ellipsoid
d_par, dimension of the parameter
G, number of mixture components
scaling, list of fit of QDA (means, covariances)
co, the criterion of overlap
References
Martin Metodiev, Nicholas J. Irons, Marie Perrot-Dockès, Pierre Latouche, Adrian E. Raftery. "Easily Computed Marginal Likelihoods for Multivariate Mixture Models Using the THAMES Estimator." arXiv preprint arXiv:2504.21812.
Examples
y = c(9.172, 9.350, 9.483, 9.558, 9.775, 10.227, 10.406, 16.084, 16.170,
18.419, 18.552, 18.600, 18.927, 19.052, 19.070, 19.330, 19.343, 19.349,
19.440, 19.473, 19.529, 19.541, 19.547, 19.663, 19.846, 19.856, 19.863,
19.914, 19.918, 19.973, 19.989, 20.166, 20.175, 20.179, 20.196, 20.215,
20.221, 20.415, 20.629, 20.795, 20.821, 20.846, 20.875, 20.986, 21.137,
21.492, 21.701, 21.814, 21.921, 21.960, 22.185, 22.209, 22.242, 22.249,
22.314, 22.374, 22.495, 22.746, 22.747, 22.888, 22.914, 23.206, 23.241,
23.263, 23.484, 23.538, 23.542, 23.666, 23.706, 23.711, 24.129, 24.285,
24.289, 24.366, 24.717, 24.990, 25.633, 26.690, 26.995, 32.065, 32.789,
34.279)
R <- diff(range(y))
m <- mean(range(y))
# likelihood
loglik_gmm <- function(sims,G){
mus = sims[,,1]
sigma_squs = sims[,,2]
pis = sims[,,3]
log_single_y = Vectorize(function(x)
log(rowSums(sapply(1:G,
function(g) pis[,g]*dnorm(x,mus[,g],sqrt(sigma_squs[,g]))))
)
)
res = suppressWarnings(rowSums(log_single_y(y)))
return(rowSums(log_single_y(y)))
}
# prior
logprior_gmm_marginal <- function(sims,G) {
mus = sims[,,1]
sigma_squs = sims[,,2]
pis = sims[,,3]
l_mus <- rowSums(sapply(1:G, function(g) dnorm(mus[,g], mean = m, sd = R,
log = TRUE)))
l_pis <- LaplacesDemon::ddirichlet(1:G/G, rep(1,G),log=TRUE)
l_sigma_squs <- lgamma(2*G+0.2) - lgamma(0.2) +
0.2*log(10/R^2) - (2*G+0.2) * log(rowSums(sigma_squs^(-1))+10/R^2) -
3*rowSums(log(sigma_squs))
return(l_mus + l_pis + l_sigma_squs)
}
# unnormalized log-posterior density
logpost = function(sims){
G = dim(sims)[2]
mus = sims[,1:G,1]
# apply exp transform
sims[,1:G,2] = sims[,1:G,2]
sigma_squs = sims[,1:G,2]
pis = sims[,1:G,3]
# set to 0 outside of support
if(G>2){
mask = (((pis > 0) & (rowSums(pis[,1:(G-1)])<=1)) & (sigma_squs>0))
}else{
mask = (((pis > 0) & (pis[,1]<=1)) & (sigma_squs>0))
}
l_total = suppressWarnings(loglik_gmm(sims,G)+
logprior_gmm_marginal(sims,G))
l_total[exp(rowSums(log(mask)))==0] = -Inf
return(l_total)
}
# toy sample from the posterior
mus = rbind(c(17.67849, 21.46734),
c(17.67849, 21.46734),
c(16.98067, 21.11391),
c(20.58628, 21.22104),
c(17.38332, 21.37224),
c(16.43644, 21.19085),
c(19.49676, 21.28964),
c(17.82287, 21.22475),
c(18.06050, 21.36945),
c(18.70759, 21.60244),
c(15.93795, 21.04681),
c(16.23184, 20.96049))
sigmasqus = rbind(c(46.75089, 3.660171),
c(58.44208, 3.026577),
c(63.19334, 4.090872),
c(87.02758, 2.856063),
c(82.34268, 3.760550),
c(50.92386, 2.380784),
c(49.51412, 3.605798),
c(38.67681, 3.362407),
c(49.59170, 3.130254),
c(63.41569, 2.475669),
c(65.95225, 3.927501),
c(47.22989, 5.465702))
taus = rbind(c(0.2653882, 0.7346118),
c(0.2560075, 0.7439925),
c(0.2371868, 0.7628132),
c(0.2998265, 0.7001735),
c(0.3518301, 0.6481699),
c(0.2840316, 0.7159684),
c(0.2060193, 0.7939807),
c(0.2859257, 0.7140743),
c(0.2420695, 0.7579305),
c(0.2466622, 0.7533378),
c(0.2726186, 0.7273814),
c(0.2738916, 0.7261084))
sims = array(dim=c(12,2,3))
sims[,,1] = mus
sims[,,2] = sigmasqus
sims[,,3] = taus
# estimate of the log marginal likelihood
-thames_mixtures(logpost,sims)$log_zhat_inv