org_fams {wpeR} | R Documentation |
Organize animals into families and expand pedigree data
Description
Takes pedigree data from the get_colony() or get_ped() function and groups animals into families.
It also expands the pedigree data by adding, for each individual, the family it was born in and the
family in which it is a reproductive animal.
Usage
org_fams(ped, sampledata, output = "both")
Arguments
ped |
Data frame. |
sampledata |
Data frame. Metadata for all genetic samples that belong
to the individuals included in pedigree reconstruction analysis.
This data frame should adhere to the formatting and naming conventions
outlined in the |
output |
Character string. Determines the format of the output. Options are: "ped" (returns an extended pedigree data frame), "fams" (returns a table of all families present in the pedigree), or "both" (default; returns a list with two data frames, "ped" and "fams"). |
Details
The output of org_fams() introduces two concepts that are important
within this package: family and half-sib group. A family is defined as a
group of animals in which at least one parent and at least one offspring
are known. A half-sib group is a group of half-siblings related either
maternally or paternally. In the function output, DadHSgroup groups
paternal half-siblings and MomHSgroup groups maternal half-siblings.
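As a rough illustration of the half-sib concept, the sketch below flags families that share a father or a mother. It uses an invented toy table (the identifiers M1, F1, and so on, and the sharesFather and sharesMother columns are hypothetical), and it is not how org_fams() assigns DadHSgroup or MomHSgroup internally.

```r
# Toy family table; all identifiers are invented for illustration only.
fams <- data.frame(
  FamID  = 1:4,
  father = c("M1", "M1", "M2", "M3"),
  mother = c("F1", "F2", "F3", "F3")
)

# Parents that appear in more than one family.
shared_fathers <- names(which(table(fams$father) > 1))
shared_mothers <- names(which(table(fams$mother) > 1))

# Offspring of families that share a father are paternal half-siblings;
# offspring of families that share a mother are maternal half-siblings.
fams$sharesFather <- fams$father %in% shared_fathers
fams$sharesMother <- fams$mother %in% shared_mothers

fams
```

In this toy table, families 1 and 2 are linked through father M1, and families 3 and 4 through mother F3.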
The fams output data frame contains the famStart and famEnd columns, which
estimate a time window for each family based solely on the sample collection
dates provided in sampledata. famStart marks the date of the earliest sample
collected from any offspring belonging to that family. famEnd marks the date
of the latest sample collected from either the mother or the father of that
family. It is important to recognize that this method relies only on
observation (sampling) times. Consequently, famEnd (last parental sample date)
can precede famStart (first offspring sample date), which produces a
biologically impossible sequence and a negative calculated family timespan.
Users should interpret the interval between famStart and famEnd with this in mind.
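One way to spot such cases is to compute the interval directly and flag negative values. The following is a minimal sketch on an invented toy table that mimics the famStart and famEnd columns; the dates and the span_days and negative_span columns are assumptions made for illustration and are not produced by org_fams().

```r
# Toy stand-in for the `fams` data frame; dates are invented for illustration.
fams <- data.frame(
  FamID    = c(1, 2, 3),
  famStart = as.Date(c("2018-03-10", "2019-06-01", "2020-01-15")),
  famEnd   = as.Date(c("2020-11-02", "2019-02-20", "2020-01-15"))
)

# Family timespan in days; negative when the last parental sample
# predates the first offspring sample.
fams$span_days <- as.numeric(difftime(fams$famEnd, fams$famStart, units = "days"))
fams$negative_span <- fams$span_days < 0

# Families whose sampling-based window is negative.
fams[fams$negative_span, c("FamID", "famStart", "famEnd", "span_days")]
```

Rows flagged this way are a consequence of when samples were collected, so they are better treated as a prompt to check the sampling record than as errors in the pedigree.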
Value
Depending on the output parameter, the function returns either a data frame
(ped or fams) or a list containing both data frames (ped and fams).
- ped data frame. An extended version of the pedigree data from get_colony()/get_ped(). In addition to common pedigree information (individual, mother, father, sex, family), ped includes columns for:
  - parents: Identifier codes of both parents separated with _.
  - FamID: Numeric identifier for the family to which the individual belongs (see fams below).
  - FirstSeen: Date of first sample of individual.
  - LastSeen: Date of last sample of individual.
  - IsDead: Logical value (TRUE/FALSE) that identifies if the individual is dead.
  - DadHSgroup: Identifier of paternal half-sib group (see Details).
  - MomHSgroup: Identifier of maternal half-sib group (see Details).
  - hsGroup: Numeric value indicating if the individual is part of a half-sib group (see Details).
- fams data frame. Includes information on the families that individuals in the pedigree belong to. The families are described by:
  - parents: Identifier codes of both parents separated with _.
  - father: Identifier code of the father.
  - mother: Identifier code of the mother.
  - FamID: Numeric identifier for the family.
  - famStart: Date when the first sample of one of the offspring from this family was collected (see Details).
  - famEnd: Date when the last sample of mother or father of this family was collected (see Details).
  - FamDead: Logical value (TRUE/FALSE) indicating if the family no longer exists.
  - DadHSgroup: Identifier connecting families that share the same father.
  - MomHSgroup: Identifier connecting families that share the same mother.
  - hsGroup: Numeric value connecting families that share one of the parents.
Examples
# Prepare the data for usage with org_fams() function.
# Get animal timespan data using the anim_timespan() function.
animal_ts <- anim_timespan(
  wolf_samples$AnimalRef,
  wolf_samples$Date,
  wolf_samples$SType,
  dead = c("Tissue")
)
# Add animal timespan to the sampledata
sampledata <- merge(wolf_samples, animal_ts, by.x = "AnimalRef", by.y = "ID", all.x = TRUE)
# Define the path to the pedigree data file.
path <- paste0(system.file("extdata", package = "wpeR"), "/wpeR_samplePed")
# Retrieve the pedigree data using the get_colony() function.
ped_colony <- get_colony(path, sampledata, rm_obsolete_parents = TRUE, out = "FamAgg")
# Organize families and expand pedigree data using the org_fams() function.
org_fams(
  ped = ped_colony,
  sampledata = sampledata
)