process_antigenic_data {topolow}    R Documentation

Process Raw Antigenic Assay Data

Description

Processes raw antigenic assay data from CSV files into standardized long and matrix formats. Handles both titer data (which needs conversion to distances) and direct distance measurements like IC50. Preserves threshold indicators (<, >) and handles repeated measurements by averaging.

Usage

process_antigenic_data(
  file_path,
  antigen_col,
  serum_col,
  value_col,
  is_titer = TRUE,
  metadata_cols = NULL,
  id_prefix = FALSE,
  base = NULL,
  scale_factor = 10
)

Arguments

file_path

Character. Path to CSV file containing raw data.

antigen_col

Character. Name of column containing virus/antigen identifiers.

serum_col

Character. Name of column containing serum/antibody identifiers.

value_col

Character. Name of column containing measurements (titers or distances).

is_titer

Logical. Whether values are titers (TRUE) or distances like IC50 (FALSE).

metadata_cols

Character vector. Names of additional columns to preserve.

id_prefix

Logical. Whether to prefix IDs with V/ and S/ (default: FALSE).

base

Numeric. Base for the logarithm transformation. If NULL (the default), base 2 is used for titers and e for IC50.

scale_factor

Numeric. Scale factor for titers (default: 10).

Details

The function handles these key steps:

  1. Reads and validates input data

  2. Transforms values to log scale

  3. Converts titers to distances if needed

  4. Averages repeated measurements

  5. Creates standardized long format

  6. Creates distance matrix

  7. Preserves metadata and threshold indicators

  8. Preserves virusYear and serumYear columns if present
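Steps 2 to 4 can be illustrated with a minimal base R sketch. It is not the package's implementation. The data frame, its column names, and the conversion rule (distance = largest log titer minus the log titer of the pair) are assumptions made for illustration; only scale_factor = 10 and base = 2 come from the documented defaults. This page does not state how threshold indicators are combined across repeated measurements, so the sketch keeps them on the raw rows only.

```r
# Hypothetical raw rows: one repeated pair and two thresholded titers.
raw <- data.frame(
  antigen = c("A1", "A1", "A2", "A2"),
  serum   = c("S1", "S1", "S1", "S2"),
  titer   = c("320", "640", "<10", ">1280"),
  stringsAsFactors = FALSE
)

scale_factor <- 10  # documented default
base <- 2           # documented default base for titers

# Separate the threshold indicator (<, >) from the numeric part.
raw$threshold <- ifelse(grepl("^[<>]", raw$titer), substr(raw$titer, 1, 1), "")
raw$value <- as.numeric(sub("^[<>]", "", raw$titer))

# Step 2: log-transform the scaled titer.
raw$log_value <- log(raw$value / scale_factor, base = base)

# Step 3 (assumed rule): distance = largest log titer minus this log titer.
raw$distance <- max(raw$log_value) - raw$log_value

# Step 4: average repeated antigen-serum pairs.
averaged <- aggregate(distance ~ antigen + serum, data = raw, FUN = mean)
print(averaged)
```

Under this assumed rule, the pair with the highest titer gets distance 0, and the two A1/S1 measurements collapse into a single averaged row.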

Input requirements and constraints:

Value

A list containing two elements:

long

A data.frame in long format with standardized columns, including the original identifiers, processed values, and calculated distances. Any specified metadata is also included.

matrix

A numeric, symmetric matrix of the processed distances, in which antigens and sera appear on both the rows and the columns.
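The relation between the two elements can be sketched in base R. This is a rough illustration under stated assumptions, not the package's code: the long-format column names are hypothetical, the V/ and S/ prefixes are applied as if id_prefix = TRUE, unmeasured pairs are left as NA, and the zero diagonal is an assumption.

```r
# Hypothetical averaged long-format rows (column names are illustrative).
long <- data.frame(
  antigen  = c("A1", "A2", "A2"),
  serum    = c("S1", "S1", "S2"),
  distance = c(1.5, 7, 0),
  stringsAsFactors = FALSE
)

# Prefix identifiers so antigens and sera cannot collide in the shared index.
row_id <- paste0("V/", long$antigen)
col_id <- paste0("S/", long$serum)
ids <- unique(c(row_id, col_id))

# Square matrix over all antigens and sera; unmeasured pairs stay NA.
dist_mat <- matrix(NA_real_, nrow = length(ids), ncol = length(ids),
                   dimnames = list(ids, ids))

# Fill each measured pair and mirror it so the matrix is symmetric.
dist_mat[cbind(row_id, col_id)] <- long$distance
dist_mat[cbind(col_id, row_id)] <- long$distance

# Assumption: every point is at distance 0 from itself.
diag(dist_mat) <- 0

print(dist_mat)
```

Because antigens and sera share one index, antigen-antigen and serum-serum cells have no direct measurement and remain NA in this sketch.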

Examples

# Locate the example data file included in the package
file_path <- system.file("extdata", "example_titer_data.csv", package = "topolow")

# Check if the file exists before running the example
if (file.exists(file_path)) {
  # Process the example titer data
  results <- process_antigenic_data(
    file_path,
    antigen_col = "virusStrain",
    serum_col = "serumStrain", 
    value_col = "titer",
    is_titer = TRUE,
    metadata_cols = c("cluster", "color")
  )

  # View the long format data
  print(results$long)
  # View the distance matrix
  print(results$matrix)
}

[Package topolow version 1.0.0 Index]