otherNum {PopulateR}    R Documentation

Match people into existing households

Description

Creates a data frame of households in which each household contains the specified number of inhabitants. Two data frames are required. The 'existing' data frame contains the people already in households. The 'additions' data frame contains the people to be added to those households. Matching uses an age distribution, which ensures that the households have an age structure. Entering a larger standard deviation produces a less correlated age structure. The output data frame of matches contains only households of the required size.
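The effect of the standard deviation on age spread can be illustrated with a small base R sketch. This is not the package's internal code: the zero mean, the sample size, and all variable names are assumptions chosen only to show that a larger standard deviation yields more widely spread age differences.

```r
# Illustrative only: this is NOT the PopulateR implementation.
set.seed(1)
n <- 10000

# Hypothetical age differences between an existing household member
# and a person being added, under two candidate standard deviations.
# A mean of 0 is an assumption made for this sketch.
diff_small_sd <- rnorm(n, mean = 0, sd = 3)
diff_large_sd <- rnorm(n, mean = 0, sd = 10)

# Share of simulated differences that fall within five years
within5_small <- mean(abs(diff_small_sd) <= 5)
within5_large <- mean(abs(diff_large_sd) <= 5)

print(c(sd3 = within5_small, sd10 = within5_large))
```

With the smaller standard deviation, most simulated differences stay within a few years, so ages within a household would be more tightly related.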

Usage

otherNum(
  existing,
  exsid,
  exsage,
  HHNumVar = NULL,
  additions,
  addid,
  addage,
  numadd = NULL,
  sdused = NULL,
  userseed = NULL,
  attempts = 10,
  numiters = 10000,
  verbose = FALSE
)

Arguments

existing

A data frame containing the people already in households.

exsid

The variable containing the unique ID for each person, in the existing data frame.

exsage

The age variable, in the existing data frame.

HHNumVar

The household identifier variable. This must exist in only one of the two data frames.

additions

A data frame containing the people to be added to the existing households.

addid

The variable containing the unique ID for each person, in the additions data frame.

addage

The age variable, in the additions data frame.

numadd

The number of people to be added to the household.

sdused

The standard deviation of the normal distribution used for the distribution of ages in a household.

userseed

The user-defined seed for reproducibility. If left blank the normal set.seed() function will be used.

attempts

The number of times the function will randomly change two matches to improve the fit.

numiters

The maximum number of iterations used to construct the household data frame. This has a default value of 10000, and is the stopping rule if the algorithm does not converge.

verbose

Whether the number of iterations used, the critical chi-squared value, and the final chi-squared value are printed to the console. The information will be printed for each set of pairs. For example, if there are two people being added to each household, the information will be printed twice. The default is FALSE, so no information will be printed to the console.

Value

A list of three data frames. $Matched contains the data frame of households containing matched people. All households will be of the specified size. $Existing, if populated, contains the excess people in the existing data frame who could not be allocated additional people. $Additions, if populated, contains the excess people in the additions data frame who could not be allocated to an existing household.
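A minimal sketch of how the returned list might be inspected follows. The object below is a hand-built stand-in with the same three element names, not real output from otherNum; the column names and all values are invented for illustration.

```r
# Hand-built stand-in for the list returned by otherNum().
# All values are invented; only the element names come from the documentation.
result <- list(
  Matched = data.frame(
    ID = 1:6,
    Age = c(34, 36, 31, 52, 55, 49),
    HouseholdID = c(10, 10, 10, 11, 11, 11)
  ),
  Existing = data.frame(ID = 7, Age = 80, HouseholdID = 12),
  Additions = data.frame(ID = integer(0), Age = numeric(0))
)

# Count the people in each household of the matched data frame
household_sizes <- table(result$Matched$HouseholdID)

# Check that every household in $Matched has the same size
all_same_size <- length(unique(as.vector(household_sizes))) == 1

# Number of people left over in each of the unmatched data frames
leftover <- c(
  existing = nrow(result$Existing),
  additions = nrow(result$Additions)
)

print(household_sizes)
print(leftover)
```

An empty $Additions data frame, as in this stand-in, would indicate that everyone in the additions data frame was placed in a household.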

Examples


library("dplyr")

AdultsID <- IntoSchools %>%
  filter(Age > 20) %>%
  select(-c(SchoolStatus, SexCode))
set.seed(2)
NoHousehold <- Township %>%
  filter(Age > 20, Relationship == "NonPartnered", !(ID %in% c(AdultsID$ID))) %>%
  slice_sample(n = 1500)

# toy example with few iterations
OldHouseholds <- otherNum(AdultsID, exsid = "ID", exsage = "Age", HHNumVar = "HouseholdID",
                          additions = NoHousehold, addid = "ID", addage = "Age", numadd = 2,
                          sdused = 3, userseed = 4, attempts = 10, numiters = 80)
CompletedHouseholds <- OldHouseholds$Matched # matches are returned even if the critical p-value is not met
IncompleteHouseholds <- OldHouseholds$Existing # existing people with no-one available to match in
UnmatchedOthers <- OldHouseholds$Additions # empty: all people not in households were matched

[Package PopulateR version 1.13 Index]