other {PopulateR}    R Documentation

Match people into new households

Description

This function creates a data frame of households, each containing the specified number of inhabitants. One data frame, containing the people to match, is required. People are matched using an age distribution, which ensures that the households have an age structure. Entering a larger standard deviation produces a less correlated age structure. The output data frame of matches contains only households of the required size. If the number of rows in the people data frame is not divisible by the household size, the leftover people are output to a separate data frame.
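The split between complete households and leftover people is integer arithmetic. A minimal base R sketch, assuming only that the leftover people are those beyond the largest multiple of the household size (the variable names n_people, n_households and n_unmatched are illustrative and are not part of PopulateR):

```r
# Illustrative values: 2213 people, three-person households
n_people <- 2213
numppl   <- 3

n_households <- n_people %/% numppl   # complete households that can be formed
n_unmatched  <- n_people %% numppl    # people left over

n_households   # 737
n_unmatched    # 2
```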

Usage

other(
  people,
  pplid,
  pplage,
  numppl = NULL,
  sdused,
  HHStartNum,
  HHNumVar,
  userseed = NULL,
  ptostop = NULL,
  numiters = 1e+06,
  verbose = FALSE
)

Arguments

people

A data frame containing the people to be matched into households.

pplid

The variable containing the unique ID for each person.

pplage

The age variable.

numppl

The number of people in each household.

sdused

The standard deviation of the normal distribution used for the distribution of ages within a household.
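The effect of the standard deviation on the age structure can be illustrated with simulated data. The sketch below is not the matching algorithm used by other(); it assumes only that a second person's age differs from the first person's age by a normally distributed amount, and all object names are hypothetical:

```r
set.seed(1)  # for a reproducible illustration

n <- 1000
age_first <- sample(20:80, n, replace = TRUE)

# Second age = first age plus a normally distributed difference
age_small_sd <- age_first + rnorm(n, mean = 0, sd = 3)
age_large_sd <- age_first + rnorm(n, mean = 0, sd = 15)

# Correlation between the two ages under each standard deviation
cor_small <- cor(age_first, age_small_sd)
cor_large <- cor(age_first, age_large_sd)

cor_small > cor_large  # the larger standard deviation gives the weaker correlation
```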

HHStartNum

The starting value for HHNumVar. Must be numeric.

HHNumVar

The name for the household variable.

userseed

If specified, this will set the seed to the number provided. If not, the normal set.seed() function will be used.
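Fixing a seed makes random draws repeatable, which is why supplying userseed should give the same households on each run. A small base R illustration of that general behaviour (how other() applies the seed internally is not shown here and is an assumption):

```r
# Same seed, same random draws
set.seed(4)
draw_one <- sample(1:100, 5)

set.seed(4)
draw_two <- sample(1:100, 5)

identical(draw_one, draw_two)  # TRUE
```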

ptostop

The critical p-value stopping rule for the function. If this value is not set, a critical p-value of 0.01 is used.
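A critical p-value corresponds to a critical chi-squared value once the degrees of freedom are known. The sketch below assumes an upper-tail test and uses a hypothetical 10 degrees of freedom; the degrees of freedom that other() actually uses depend on the data and are not stated here:

```r
ptostop    <- 0.01
df_assumed <- 10   # hypothetical degrees of freedom, for illustration only

# Upper-tail critical value for the chosen p-value
critical_value <- qchisq(1 - ptostop, df = df_assumed)
critical_value

# A larger critical p-value gives a smaller critical chi-squared value
looser_value <- qchisq(1 - 0.05, df = df_assumed)
looser_value < critical_value  # TRUE
```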

numiters

The maximum number of iterations used to construct the output data frame ($Matched) containing the household inhabitants. The default value is 1000000, which acts as the stopping rule if the algorithm does not converge.

verbose

Whether the number of iterations used, the critical chi-squared value, and the final chi-squared value are printed to the console. The information will be printed for each set of pairs. For example, if there are three people in each household, the information will be printed twice. The default is FALSE, so no information will be printed to the console.

Value

A list of two data frames. $Matched contains the households of matched people; all households will be of the specified size. $Unmatched, if populated, contains the people that were not allocated to households. If the number of rows in the people data frame is divisible by the household size required, $Unmatched will be an empty data frame.
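The returned list can be checked with a few lines of base R. The sketch below uses a hand-built stand-in with the same two element names; the columns ID, Age and Household and all values are assumptions made for illustration, and the list is not output from other():

```r
# Hypothetical stand-in for the returned list; not produced by other()
result <- list(
  Matched = data.frame(
    ID        = 1:6,
    Age       = c(34, 36, 31, 58, 61, 55),
    Household = rep(1:2, each = 3)
  ),
  Unmatched = data.frame(ID = 7:8, Age = c(42, 19))
)

# Count the people in each household and confirm every household has 3
household_sizes <- table(result$Matched$Household)
all(household_sizes == 3)   # TRUE

# Number of people who were not allocated to a household
nrow(result$Unmatched)      # 2
```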

Examples

library(dplyr)

# toy example: creating three-person households with few iterations
NewHouseholds <- other(AdultsNoID, pplid = "ID", pplage = "Age", numppl = 3, sdused = 3,
                       HHStartNum = 1, HHNumVar = "Household", userseed = 4, ptostop = 0.05,
                       numiters = 500, verbose = TRUE)

PeopleInHouseholds <- NewHouseholds$Matched
PeopleNot <- NewHouseholds$Unmatched      # 2213 not divisible by 3

[Package PopulateR version 1.13 Index]