otherNum {PopulateR} | R Documentation |
Match people into existing households
Description
Creates a data frame of household inhabitants, in which each household contains the specified number of inhabitants. Two data frames are required. The 'existing' data frame contains the people already in households. The 'additions' data frame contains the people to be added to those households. Using an age distribution for the matching ensures that an age structure is present in the households. A less correlated age structure can be produced by entering a larger standard deviation. The output data frame of matches will only contain households of the required size.
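The effect of the standard deviation on the age structure can be illustrated with a small base R simulation. This is a sketch only and not the package's internal algorithm: it assumes that the added person's age is centred on the existing person's age, and `draw_added_age` is a hypothetical helper introduced for illustration.

```r
# Illustrative only: not PopulateR's matching algorithm.
set.seed(1)
existing_age <- sample(25:80, 500, replace = TRUE)

# Hypothetical helper: draw an added person's age from a normal
# distribution centred on the existing person's age (an assumption).
draw_added_age <- function(age, sd) {
  round(rnorm(length(age), mean = age, sd = sd))
}

tight <- draw_added_age(existing_age, sd = 3)   # small standard deviation
loose <- draw_added_age(existing_age, sd = 15)  # large standard deviation

# Correlation between the ages of household members under each setting
cor_tight <- cor(existing_age, tight)
cor_loose <- cor(existing_age, loose)
print(c(sd_3 = cor_tight, sd_15 = cor_loose))
```

Under these assumptions the larger standard deviation gives the weaker correlation between household members' ages, which is the behaviour the description attributes to a larger standard deviation.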
Usage
otherNum(
existing,
exsid,
exsage,
HHNumVar = NULL,
additions,
addid,
addage,
numadd = NULL,
sdused = NULL,
userseed = NULL,
attempts = 10,
numiters = 10000,
verbose = FALSE
)
Arguments
existing |
A data frame containing the people already in households. |
exsid |
The variable containing the unique ID for each person, in the existing data frame. |
exsage |
The age variable, in the existing data frame. |
HHNumVar |
The household identifier variable. This must exist in only one data frame. |
additions |
A data frame containing the people to be added to the existing households. |
addid |
The variable containing the unique ID for each person, in the additions data frame. |
addage |
The age variable, in the additions data frame. |
numadd |
The number of people to be added to the household. |
sdused |
The standard deviation of the normal distribution for the distribution of ages in a household. |
userseed |
The user-defined seed for reproducibility. If left blank the normal set.seed() function will be used. |
attempts |
The number of times the function will randomly change two matches to improve the fit. |
numiters |
The maximum number of iterations used to construct the household data frame. This has a default value of 10000, and is the stopping rule if the algorithm does not converge. |
verbose |
Whether the number of iterations used, the critical chi-squared value, and the final chi-squared value are printed to the console. The information will be printed for each set of pairs. For example, if there are two people being added to each household, the information will be printed twice. The default is FALSE, so no information will be printed to the console. |
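The roles of the iteration limit and the chi-squared values reported by verbose can be sketched in base R. This is a hedged illustration, not the package's implementation: the bin edges, the 0.99 quantile used for the critical value, the zero-mean normal distribution of age differences, and the accept-only-if-better swap rule are all assumptions, and `chisq_stat` is a hypothetical helper.

```r
# Illustrative only: a swap-based improvement loop with a stopping rule.
set.seed(4)
n <- 200
existing_age <- sample(25:80, n, replace = TRUE)
added_age <- sample(25:80, n, replace = TRUE)  # initial random pairing
sd_used <- 3
numiters <- 2000                               # maximum number of iterations

# Assumed bins for age differences, with expected counts from Normal(0, sd_used)
breaks <- c(-Inf, seq(-6, 6, by = 2), Inf)
expected <- diff(pnorm(breaks, mean = 0, sd = sd_used)) * n

# Hypothetical helper: chi-squared distance between observed and expected bins
chisq_stat <- function(added) {
  observed <- table(cut(existing_age - added, breaks = breaks))
  sum((as.numeric(observed) - expected)^2 / expected)
}

critical <- qchisq(0.99, df = length(expected) - 1)  # assumed critical value
start <- chisq_stat(added_age)
current <- start
iters <- 0

while (current > critical && iters < numiters) {
  iters <- iters + 1
  swap <- sample.int(n, 2)               # pick two matches at random
  proposal <- added_age
  proposal[swap] <- proposal[rev(swap)]  # exchange their added people
  candidate <- chisq_stat(proposal)
  if (candidate < current) {             # keep the swap only if the fit improves
    added_age <- proposal
    current <- candidate
  }
}

cat("iterations:", iters, "critical:", critical, "final:", current, "\n")
```

In this sketch the loop ends either when the statistic falls below the critical value or when the iteration limit is reached, so a match is still produced when the critical value is not met.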
Value
A list of three data frames. $Matched contains the data frame of households containing matched people. All households will be of the specified size. $Existing, if populated, contains the excess people in the existing data frame who could not be allocated additional people. $Additions, if populated, contains the excess people in the additions data frame who could not be allocated to an existing household.
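The returned list can be inspected with ordinary base R tools. The sketch below uses a mock list with the same three element names. The values are made up for illustration, and it assumes that each existing household starts with one person, so that adding two people gives households of three.

```r
# Mock result with the three documented element names; all values are invented.
result <- list(
  Matched = data.frame(
    ID = 1:6,
    Age = c(34, 36, 31, 52, 55, 49),
    HouseholdID = rep(c(10, 11), each = 3)
  ),
  Existing = data.frame(ID = 7, Age = 70, HouseholdID = 12),
  Additions = data.frame(ID = integer(0), Age = numeric(0))
)

# Check that every matched household has the required size (assumed to be 3 here)
household_sizes <- table(result$Matched$HouseholdID)
all_same_size <- all(household_sizes == 3)

# Count the people left over in each of the other two data frames
n_unmatched_existing <- nrow(result$Existing)
n_unmatched_additions <- nrow(result$Additions)
```

An empty $Existing or $Additions data frame indicates that nobody from that input was left over.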
Examples
library("dplyr")
AdultsID <- IntoSchools %>%
filter(Age > 20) %>%
select(-c(SchoolStatus, SexCode))
set.seed(2)
NoHousehold <- Township %>%
filter(Age > 20, Relationship == "NonPartnered", !(ID %in% c(AdultsID$ID))) %>%
slice_sample(n = 1500)
# toy example with few iterations
OldHouseholds <- otherNum(AdultsID, exsid = "ID", exsage = "Age", HHNumVar = "HouseholdID",
NoHousehold, addid = "ID", addage = "Age", numadd = 2, sdused = 3,
userseed = 4, attempts = 10, numiters = 80)
CompletedHouseholds <- OldHouseholds$Matched # will match even if critical p-value not met
IncompleteHouseholds <- OldHouseholds$Existing # no-one available to match in
UnmatchedOthers <- OldHouseholds$Additions # all people not in households were matched