rune {wizaRdry} | R Documentation
Parse composite data frame into component data frames by variable prefix
Description
This function takes a data frame containing multiple measures and separates it into individual data frames for each measure detected in the data. It identifies the appropriate identifier column (e.g., participantId, workerId) and splits the data based on column name prefixes.
Usage
rune(df, lower = TRUE)
Arguments
df | A data frame containing multiple prefixed measures.
lower | Logical. If TRUE (the default), prefixes are converted to lower case.
Details
The function performs the following steps:
1. Identifies which identifier column to use (participantId, workerId, PROLIFIC_PID, or src_subject_id)
2. Determines survey prefixes by analyzing column names
3. Creates a separate data frame for each survey prefix found
4. Assigns each data frame to the global environment under a name matching its survey prefix
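The steps above can be sketched roughly as follows. This is an illustration only, not the package's implementation: split_by_prefix() is a hypothetical helper, the order in which identifier columns are tried is assumed, and so is the rule that a prefix is the text before the first underscore and must be shared by at least two columns. The sketch returns a named list instead of assigning into the global environment.

```r
# Illustrative sketch only; split_by_prefix() is a hypothetical helper,
# not part of wizaRdry, and rune() may differ in its details.
split_by_prefix <- function(df, lower = TRUE) {
  # Step 1: pick the first candidate identifier column present in df
  # (the priority order here is an assumption)
  id_candidates <- c("participantId", "workerId", "PROLIFIC_PID", "src_subject_id")
  id_col <- intersect(id_candidates, names(df))[1]
  if (is.na(id_col)) stop("No identifier column found")

  # Step 2: assume a prefix is the text before the first underscore and
  # keep only prefixes shared by at least two columns
  candidates <- setdiff(grep("_", names(df), value = TRUE), id_candidates)
  prefixes <- sub("_.*$", "", candidates)
  shared <- names(which(table(prefixes) >= 2))

  # Step 3: build one data frame per prefix, keeping the identifier column
  out <- lapply(shared, function(p) {
    df[, c(id_col, candidates[prefixes == p]), drop = FALSE]
  })
  names(out) <- if (lower) tolower(shared) else shared
  out
}
```

Returning a list keeps the sketch free of side effects; rune() itself is documented as assigning the results to the global environment.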
Value
Creates multiple dataframes in the global environment, one for each survey detected in the data. Each dataframe is named after its survey prefix.
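Because results are created by name in an environment, an existing object with the same name as a survey prefix would be replaced. The mechanism can be sketched with base R's list2env(). The names pieces and target are hypothetical, and a fresh environment stands in for the global environment so that nothing in the workspace is touched. Whether rune() uses list2env(), assign(), or something else is not stated here.

```r
# Sketch: publish a named list of data frames into an environment.
# `pieces` and `target` are hypothetical names used for illustration.
pieces <- list(
  foo = data.frame(src_subject_id = c("SUB001", "SUB002"), foo_1 = c(1, 3)),
  bar = data.frame(src_subject_id = c("SUB001", "SUB002"), bar_1 = c(2, 4))
)

# A fresh environment stands in for the global environment here, so the
# sketch does not overwrite anything in the caller's workspace.
target <- new.env()
list2env(pieces, envir = target)

# List what was created and confirm one object is bound in `target` itself
ls(target)
exists("foo", envir = target, inherits = FALSE)
```

Checking with exists() before calling a function that assigns into the global environment is one way to avoid clobbering an object you still need.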
Examples
# Parse a data frame containing multiple surveys
combined_df <- data.frame(
  record_id = c("REC001", "REC002", "REC003", "REC004"),
  src_subject_id = c("SUB001", "SUB002", "SUB003", "SUB004"),
  subjectkey = c("KEY001", "KEY002", "KEY003", "KEY004"),
  site = c("Yale", "NU", "Yale", "NU"),
  phenotype = c("A", "B", "A", "C"),
  visit = c(1, 2, 2, 1),
  state = c("complete", "completed baseline", "in progress", NA),
  status = c(NA, NA, NA, "complete"),
  lost_to_followup = c(FALSE, FALSE, TRUE, NA),
  interview_date = c("2023-01-15", "2023/02/20", NA, "2023-03-10"),
  foo_1 = c(1, 3, 5, 7),
  foo_2 = c("a", "b", "c", "d"),
  bar_1 = c(2, 4, 6, 8),
  bar_2 = c("w", "x", "y", "z")
)
rune(combined_df)
# After running, access individual survey dataframes directly:
head(foo) # Access the foo dataframe
head(bar) # Access the bar dataframe