pairmult {PopulateR} | R Documentation |
Create many-to-one pairs of people and place them into households
Description
Creates a data frame of many-to-one pairs, based on a distribution of age differences. Although designed to match multiple children to the same parent, the function can be used in any situation where a many-to-one match is required based on a range of age differences. For clarity and brevity, the terms "children" and "parents" are used throughout. Two data frames are required: the first contains the people representing the many (e.g. children); the second contains the people who will each be paired with multiple others (e.g. the parents of two or more children). The minimum and maximum ages of parents must be specified. This ensures that no parent was too young (e.g. 11 years) or too old (e.g. 70 years) at the time the child was born. The function tests for too-young and too-old parents throughout the matching, so pre-cleaning the parents data frame is not required. Both data frames must be restricted to only those people who will be paired.
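The age constraint described above can be illustrated with a minimal base R sketch. The helper valid_parent_age() is hypothetical and is not part of PopulateR; it assumes the parent's age at the child's birth is the difference between the two current ages, and that both bounds are inclusive.

```r
# Hypothetical helper, not a PopulateR function: TRUE when the parent's age
# at the child's birth lies within [minparage, maxparage].
# Assumption: both bounds are inclusive.
valid_parent_age <- function(parage, chlage, minparage, maxparage) {
  age_at_birth <- parage - chlage
  age_at_birth >= minparage & age_at_birth <= maxparage
}

# Parent aged 30 with a 5-year-old: 25 at the birth, inside the range
ok <- valid_parent_age(parage = 30, chlage = 5, minparage = 18, maxparage = 54)

# Parent aged 20 with a 9-year-old: 11 at the birth, too young
too_young <- valid_parent_age(parage = 20, chlage = 9, minparage = 18, maxparage = 54)

# Parent aged 80 with a 10-year-old: 70 at the birth, too old
too_old <- valid_parent_age(parage = 80, chlage = 10, minparage = 18, maxparage = 54)
```

Only the first pairing would be acceptable under these assumed bounds.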
Usage
pairmult(
  children,
  chlid,
  chlage,
  numchild = 2,
  twinprob = 0,
  parents,
  parid,
  parage,
  minparage = NULL,
  maxparage = NULL,
  HHStartNum = NULL,
  HHNumVar = NULL,
  userseed = NULL,
  maxdiff = 1000
)
Arguments
children |
The data frame containing the children to be paired with a parent/guardian. |
chlid |
The variable containing the unique ID for each person, in the children data frame. |
chlage |
The age variable, in the children data frame. |
numchild |
The number of children that are required in each household. |
twinprob |
The probability that a person is a twin. |
parents |
The data frame containing the potential parents. (This data frame must contain at least the same number of observations as the children data frame.) |
parid |
The variable containing the unique ID for each person, in the parents data frame. |
parage |
The age variable, in the parents data frame. |
minparage |
The youngest age at which a person becomes a parent. The default value is NULL, which will cause the function to stop. |
maxparage |
The oldest age at which a person becomes a parent. The default value is NULL, which will cause the function to stop. |
HHStartNum |
The starting value for HHNumVar. Must be numeric. |
HHNumVar |
The name for the household variable. |
userseed |
If specified, the seed is set to the number provided. If not, the function does not set a seed, and results depend on the current state of R's random number generator. |
maxdiff |
The maximum age difference between the children in a household. This is applied relative to the first child randomly selected for the household, so the overall age range within a household may be up to 2 * maxdiff. The default value (1000) effectively places no constraint on child age differences within the household. |
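The effect of maxdiff can be seen in a small base R sketch. It does not call pairmult(); the variable names are illustrative, and it assumes the limit is inclusive and applied as an absolute difference from the first selected child's age.

```r
# Illustrative values only; pairmult() is not called here.
maxdiff <- 3
first_child_age <- 10   # age of the first child drawn for the household
candidate_ages <- 0:19  # ages of the other children available for matching

# Assumption: a sibling is eligible when within maxdiff years of the first child
eligible <- candidate_ages[abs(candidate_ages - first_child_age) <= maxdiff]

# Widest possible gap between two siblings drawn from the eligible ages
widest_gap <- diff(range(eligible))
```

Here the eligible ages run from 7 to 13, so two siblings could be 6 years apart, which is 2 * maxdiff.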
Value
A list of three data frames. $Matched contains the data frame of child-parent matches. $Adults contains any unmatched observations from the parents data frame. $Children contains any unmatched observations from the children data frame. $Adults and/or $Children may be empty data frames.
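As a sketch of how the returned list might be inspected, the hypothetical helper below counts the rows in each of the three data frames. The stand-in list only mimics the element names given above; its columns and values are invented for illustration and do not come from pairmult().

```r
# Hypothetical helper, not a PopulateR function: row counts for the three
# data frames named in the Value section.
summarise_pairs <- function(result) {
  vapply(result[c("Matched", "Adults", "Children")], nrow, integer(1))
}

# Stand-in for a pairmult() result; columns and values are invented.
mock_result <- list(
  Matched  = data.frame(ID = 1:6, Household = rep(1:2, each = 3)),
  Adults   = data.frame(ID = integer(0)),  # no unmatched parents
  Children = data.frame(ID = 7L)           # one unmatched child
)

counts <- summarise_pairs(mock_result)
```

A zero count for Adults or Children corresponds to the empty data frame case mentioned above.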
Examples
library(dplyr)
library(PopulateR)  # provides pairmult(); Township is assumed to be its example data set

set.seed(1)
# candidate parents: partnered people older than 18
Parents <- Township %>%
  filter(Relationship == "Partnered", Age > 18) %>%
  slice_sample(n = 500)
# candidate children: non-partnered people younger than 20
Children <- Township %>%
  filter(Relationship == "NonPartnered", Age < 20) %>%
  slice_sample(n = 400)

# assign two children to each parent; every parent receives the same number of children
ChildMatched <- pairmult(Children, chlid = "ID", chlage = "Age", numchild = 2, twinprob = 0.03,
                         Parents, parid = "ID", parage = "Age", minparage = 18, maxparage = 54,
                         HHStartNum = 1, HHNumVar = "Household", userseed = 4, maxdiff = 3)
# extract the data frame of child-parent matches
MatchedFamilies <- ChildMatched$Matched