dga-package {dga} | R Documentation |
Performs Bayesian model averaging over all decomposable graphical models, each representing a different list dependence structure, with the goal of estimating the number of uncounted elements from the universe of elements that could have been recorded or “captured". The code here is an implementation of the model described in Madigan and York (1997).
Package: | dga |
Type: | Package |
Version: | 1.2 |
Date: | 2015-04-16 |
License: | GPL >=2 |
James Johndrow james.johndrow@gmail.com, Kristian Lum kl@hrdag.org, and Patrick Ball pball@hrdag.org
Maintainer: Kristian Lum kl@hrdag.org
Madigan, David, and Jeremy C. York. "Bayesian methods for estimation of the size of a closed population." Biometrika 84.1 (1997): 19-31.
###This simualated example goes through the whole process for 3 lists.### library(chron) # Simulate some data N <- 1000 #true population size num.lists <- 3 #number of lists #simulate 3 lists independently 'capturing' l1 <- rbinom(N, 1, .2) l2 <- rbinom(N, 1, .25) l3 <- rbinom(N, 1, .3) overlaps <- data.frame(l1, l2, l3) #simulate dates of recording dates <- paste(rep(2015, N), "-", sample(1:4, N, replace = TRUE), "-", sample(1:28, N, replace = TRUE)) dates <- chron(dates, format=c(dates="y-m-d")) #save true number in each stratum for comparison later truth <- table((months(dates)))[1:4] #remove dates of unrecorded elements dates <- dates[apply(overlaps, 1, sum)>0] #remove individuals not recorded on any list overlaps <- overlaps[apply(overlaps, 1, sum)>0,] #stratify by date strata <- make.strata(overlaps, dates = dates, date.defs = 'monthly') #check to make sure that all strata are OK check <- check.strata(strata) #look at strata, just to make sure par(mfrow=c(2,2), mar = rep(1,4)) for(i in 1:nrow(strata$overlap.counts)){ venn3(strata$overlap.counts[i,], main = rownames(strata$overlap.counts)[i], cex.main = 1) } #load the graphs to make the estimates data(graphs3) #select expansion factor defining the largest number of unrecorded elements. #this makes Nmissing <- 0:(sum(Y)*fac) fac <- 5 #set prior delta <- 1/2^num.lists #loop over strata to calculate posterior distributions of #the total population size for each stratum par(mfrow=c(2,2), mar = rep(1.75,4)) # if using Rstudio, make sure your plot window is pretty big here! for(i in 1:nrow(strata$overlap.counts)){ Nmissing <- 0:(sum(strata$overlap.counts[i,])*fac) Y <- array(strata$overlap.counts[i,], dim=rep(2, num.lists)) weights <- bma.cr(Y, Nmissing, delta, graphs3) plotPosteriorN(weights, Nmissing + sum(strata$overlap.counts[i,]), main = rownames(strata$overlap.counts)[i]) points(truth[i], .5*max(weights), col = 'red', pch = 16, cex = 2) }