process_antigenic_data {topolow} | R Documentation |
Process Raw Antigenic Assay Data
Description
Processes raw antigenic assay data from CSV files into standardized long and matrix formats. Handles both titer data (which needs conversion to distances) and direct distance measurements like IC50. Preserves threshold indicators (<, >) and handles repeated measurements by averaging.
Usage
process_antigenic_data(
  file_path,
  antigen_col,
  serum_col,
  value_col,
  is_titer = TRUE,
  metadata_cols = NULL,
  id_prefix = FALSE,
  base = NULL,
  scale_factor = 10
)
Arguments
file_path |
Character. Path to CSV file containing raw data. |
antigen_col |
Character. Name of column containing virus/antigen identifiers. |
serum_col |
Character. Name of column containing serum/antibody identifiers. |
value_col |
Character. Name of column containing measurements (titers or distances). |
is_titer |
Logical. Whether values are titers (TRUE) or distances like IC50 (FALSE). |
metadata_cols |
Character vector. Names of additional columns to preserve. |
id_prefix |
Logical. Whether to prefix IDs with V/ and S/ (default: FALSE). |
base |
Numeric. Base for the logarithm transformation. If NULL (the default), 2 is used for titers and e for IC50. |
scale_factor |
Numeric. Scale factor for titers (default: 10). |
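The way the base argument falls back to a default can be sketched as follows. This is a minimal illustration of the documented behavior, not the package's code; resolve_base() is a hypothetical helper that does not exist in topolow.

```r
# Hypothetical helper (not part of topolow): pick the logarithm base the way
# the argument description states, i.e. 2 for titers and e for IC50 when
# base is NULL, otherwise whatever the caller supplied.
resolve_base <- function(is_titer = TRUE, base = NULL) {
  if (!is.null(base)) {
    return(base)
  }
  if (is_titer) 2 else exp(1)
}

base_titer  <- resolve_base(is_titer = TRUE)             # falls back to 2
base_ic50   <- resolve_base(is_titer = FALSE)            # falls back to e
base_custom <- resolve_base(is_titer = TRUE, base = 10)  # explicit value wins
```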
Details
The function handles these key steps:
Reads and validates input data
Transforms values to log scale
Converts titers to distances if needed
Averages repeated measurements
Creates standardized long format
Creates distance matrix
Preserves metadata and threshold indicators
Preserves virusYear and serumYear columns if present
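The transformation steps above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: it assumes titers are divided by scale_factor before the logarithm is taken, and that a distance is the difference between a serum's highest log titer and the observed log titer (a common convention in antigenic cartography). The data and column names are made up.

```r
# Hypothetical raw measurements; "<10" and ">1280" carry threshold indicators.
raw <- data.frame(
  antigen = c("A1", "A1", "A2", "A1", "A2"),
  serum   = c("S1", "S1", "S1", "S2", "S2"),
  titer   = c("40", "160", "<10", "320", ">1280"),
  stringsAsFactors = FALSE
)

# Split each value into an optional threshold indicator and a number.
raw$threshold <- ifelse(grepl("^[<>]", raw$titer), substr(raw$titer, 1, 1), "")
raw$numeric   <- as.numeric(sub("^[<>]", "", raw$titer))

# Log-transform after dividing by the scale factor (assumed convention).
scale_factor <- 10
base <- 2
raw$log_titer <- log(raw$numeric / scale_factor, base = base)

# Average repeated antigen-serum pairs on the log scale.
avg <- aggregate(log_titer ~ antigen + serum, data = raw, FUN = mean)

# Convert to distances (assumed convention): per serum, the maximum log titer
# minus the observed log titer, so the best-matching antigen has distance 0.
avg$distance <- ave(avg$log_titer, avg$serum, FUN = max) - avg$log_titer
```

The sketch parses the threshold indicators but does not carry them through the averaging step. How the package combines thresholded and exact repeats is not described here, so check that against the returned data.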
Input requirements and constraints:
CSV file must contain required columns
Column names in the file must match the names passed to antigen_col, serum_col, value_col, and metadata_cols
Values can include threshold indicators (< or >)
Metadata columns must exist if specified
Year columns, if present, must be named "virusYear" and "serumYear"
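These requirements can be checked before the function is called. The sketch below is one way to do that in base R; check_antigenic_csv() is a hypothetical helper, not part of topolow, and the temporary file exists only to make the example self-contained.

```r
# Hypothetical helper (not part of topolow): verify that a CSV file exists and
# contains every column that will be named in the call.
check_antigenic_csv <- function(file_path, antigen_col, serum_col, value_col,
                                metadata_cols = NULL) {
  if (!file.exists(file_path)) {
    stop("File not found: ", file_path)
  }
  # Read a single row; only the header names are needed.
  header <- names(read.csv(file_path, nrows = 1, check.names = FALSE,
                           stringsAsFactors = FALSE))
  required <- c(antigen_col, serum_col, value_col, metadata_cols)
  missing_cols <- setdiff(required, header)
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Usage with a small temporary file (made-up data).
tmp <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    virusStrain = c("A1", "A2"),
    serumStrain = c("S1", "S1"),
    titer       = c("40", "<10"),
    virusYear   = c(2001, 2003)
  ),
  tmp,
  row.names = FALSE
)

ok <- check_antigenic_csv(tmp, "virusStrain", "serumStrain", "titer")

# Asking for a metadata column that is absent raises an error.
bad <- tryCatch(
  check_antigenic_csv(tmp, "virusStrain", "serumStrain", "titer",
                      metadata_cols = "cluster"),
  error = function(e) conditionMessage(e)
)
```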
Value
A list containing two elements:
long |
A data frame holding the processed measurements in standardized long format. |
matrix |
A numeric distance matrix built from the processed measurements. |
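How a long table relates to a distance matrix can be sketched in base R. The sketch assumes a square, symmetric layout in which antigens and sera are all points, kept apart by V/ and S/ prefixes. The column names and the layout are assumptions for illustration, so inspect the returned object for the layout the package actually uses.

```r
# Hypothetical long-format table; names and values are illustrative only.
long <- data.frame(
  antigen  = c("V/A1", "V/A2", "V/A1", "V/A2"),
  serum    = c("S/A1", "S/A1", "S/A2", "S/A2"),
  distance = c(0, 3, 2, 0),
  stringsAsFactors = FALSE
)

# Every antigen and every serum becomes one row and one column.
points <- c(unique(long$antigen), unique(long$serum))
mat <- matrix(NA_real_, nrow = length(points), ncol = length(points),
              dimnames = list(points, points))

# Fill both triangles so the matrix is symmetric; unmeasured pairs stay NA.
idx <- cbind(long$antigen, long$serum)
mat[idx] <- long$distance
mat[idx[, 2:1]] <- long$distance
diag(mat) <- 0
```

Without the prefixes, a virus and the serum raised against it would share the name "A1" and collapse into a single point, which is the situation id_prefix is meant to avoid.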
Examples
# Locate the example data file included in the package
file_path <- system.file("extdata", "example_titer_data.csv", package = "topolow")
# Check if the file exists before running the example
if (file.exists(file_path)) {
  # Process the example titer data
  results <- process_antigenic_data(
    file_path,
    antigen_col = "virusStrain",
    serum_col = "serumStrain",
    value_col = "titer",
    is_titer = TRUE,
    metadata_cols = c("cluster", "color")
  )
  # View the long format data
  print(results$long)
  # View the distance matrix
  print(results$matrix)
}