pairmultNum {PopulateR}  R Documentation

Create many-to-one pairs, when there are existing households

Description

Creates a data frame of many-to-one pairs, based on a distribution of age differences. Although designed to match multiple children to the same parent, the function can be used for any situation where a many-to-one match is required based on a range of age differences. For clarity and brevity, the terms "children" and "parents" will be used. Two data frames are required: one for children and one for potential parents. The data frame of potential parents must contain household identifiers. The minimum and maximum ages of parents must be specified. This ensures that there are no parents who were too young (e.g. 11 years) or too old (e.g. 70 years) at the time the child was born. The presence of too-young and too-old parents is tested throughout this function, so pre-cleaning the parents data frame is not required. Both data frames must be restricted to only those people who will be paired.
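The age constraint described above can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: the helper name parent_age_ok is hypothetical, and it assumes that a parent's age at the child's birth is simply the parent's current age minus the child's current age.

```r
# Hypothetical helper, not part of PopulateR: checks whether a candidate
# parent would have been within the permitted age range at the child's birth.
parent_age_ok <- function(parage, chlage, minparage, maxparage) {
  age_at_birth <- parage - chlage  # assumed definition of the age difference
  age_at_birth >= minparage & age_at_birth <= maxparage
}

# A 30-year-old paired with a 5-year-old was 25 at the birth: acceptable.
parent_age_ok(parage = 30, chlage = 5, minparage = 18, maxparage = 54)

# A 25-year-old paired with a 14-year-old was 11 at the birth: rejected.
parent_age_ok(parage = 25, chlage = 14, minparage = 18, maxparage = 54)
```

Because the comparison is vectorised, the same helper can be applied to whole columns of candidate parent and child ages at once.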

Usage

pairmultNum(
  children,
  chlid,
  chlage,
  numchild = 2,
  twinprob = 0,
  parents,
  parid,
  parage,
  minparage = NULL,
  maxparage = NULL,
  HHNumVar = NULL,
  userseed = NULL,
  maxdiff = 1000
)

Arguments

children

The data frame containing the children to be paired with a parent/guardian.

chlid

The variable containing the unique ID for each person, in the children data frame.

chlage

The age variable, in the children data frame.

numchild

The number of children that are required in each household.

twinprob

The probability that a person is a twin.

parents

The data frame containing the potential parents. (This data frame must contain at least the same number of observations as the children data frame.)

parid

The variable containing the unique ID for each person, in the parents data frame.

parage

The age variable, in the parents data frame.

minparage

The youngest age at which a person becomes a parent. The default value is NULL, which causes the function to stop, so a value must be supplied.

maxparage

The oldest age at which a person becomes a parent. The default value is NULL, which causes the function to stop, so a value must be supplied.

HHNumVar

The name of the household identifier variable in the parents data frame.

userseed

If specified, this will set the seed to the number provided. If not, the normal set.seed() function will be used.

maxdiff

The maximum age difference between the children in a household. This is applied relative to the first child randomly selected for the household, so the overall age difference between children may be up to 2 * maxdiff. The default value of 1000 effectively places no constraint on child age differences in the household.
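To see why the overall spread can reach 2 * maxdiff, consider the following base R sketch. It assumes, for illustration only, that each additional child must be within maxdiff years of the first child selected; the ages used are made up and the code is not taken from the package.

```r
maxdiff <- 3
first_child_age <- 10

# Assumed rule: siblings may be at most maxdiff years younger or older
# than the first child selected for the household.
allowed_ages <- seq(first_child_age - maxdiff, first_child_age + maxdiff)

# The youngest and oldest permitted siblings differ by 2 * maxdiff.
max_spread <- diff(range(allowed_ages))
max_spread
```

Under this assumption, a household whose first selected child is aged 10 could also contain children aged 7 and 13, who are 6 years apart even though maxdiff is 3.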

Value

A list of three data frames. $Matched contains the data frame of child-parent matches. $Adults contains any unmatched observations from the parents data frame. $Children contains any unmatched observations from the children data frame. $Adults and/or $Children may be empty data frames.

Examples


library(dplyr)

set.seed(1)
Parents <- Township %>%
  filter(Relationship == "Partnered", Age > 18) %>%
  slice_sample(n = 500) %>%
  mutate(Household = row_number())
Children <- Township %>%
  filter(Relationship == "NonPartnered", Age < 20) %>%
  slice_sample(n = 400)

# Example: assign two children to each parent.
# The same number of children is assigned to every parent.

ChildMatched <- pairmultNum(children = Children, chlid = "ID", chlage = "Age",
                            numchild = 2, twinprob = 0.03,
                            parents = Parents, parid = "ID", parage = "Age",
                            minparage = 18, maxparage = 54,
                            HHNumVar = "Household", userseed = 4, maxdiff = 3)
MatchedFamilies <- ChildMatched$Matched
UnmatchedChildren <- ChildMatched$Children
UnmatchedAdults <- ChildMatched$Adults

[Package PopulateR version 1.13 Index]