prep {CNVreg}    R Documentation

Prepare Data for Analysis

Description

Required preprocessing of analysis data. The function converts an individual's CNV events within a genomic region (from one chromosome) to a CNV profile curve, further processes the curve into CNV fragments, and filters out rare fragments. In addition, the adjacency relationship between CNV fragments is analyzed and weight matrices are generated. The resulting 'WTsmth.data' object is provided as input to the regression analysis.
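
To make the fragment idea above concrete, here is a minimal base-R sketch. It is an illustration only and not CNVreg's implementation: the toy `events` table, the pooling of event boundaries into breakpoints, the carrier-frequency definition, and the `rare_out` variable are all assumptions made for this example.

```r
# Hypothetical toy events on one chromosome (not the package's data)
events <- data.frame(
  ID  = c("s1", "s2", "s2"),
  BP1 = c(100L, 150L, 400L),
  BP2 = c(300L, 250L, 500L),
  stringsAsFactors = FALSE
)

# Pool every event boundary; consecutive boundaries delimit candidate fragments
breaks <- sort(unique(c(events$BP1, events$BP2)))
frags <- data.frame(start = head(breaks, -1), end = tail(breaks, -1))

# For each fragment, count the samples with an event covering it entirely
frags$n_carriers <- vapply(seq_len(nrow(frags)), function(i) {
  covers <- events$BP1 <= frags$start[i] & events$BP2 >= frags$end[i]
  length(unique(events$ID[covers]))
}, integer(1))

# Assumed definition of event rate: carriers divided by number of samples
frags$freq <- frags$n_carriers / length(unique(events$ID))

# Drop fragments whose rate falls below the (assumed) threshold
rare_out <- 0.05
kept <- frags[frags$freq >= rare_out, ]
```

In this toy input the gap between positions 300 and 400 has no carriers, so it is the only fragment dropped. The real function additionally builds the adjacency weight matrices, which this sketch does not attempt.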

Usage

prep(CNV, Y, Z = NULL, rare.out = 0.05)

Arguments

CNV

A data.frame in PLINK format. Specifically, must contain columns:

  • "ID": character, unique identifier for each sample

  • "CHR": integer, allowed range 1-22. NOTE: only one CHR value can be present, i.e., this function processes one chromosome at a time.

  • "BP1": integer, CNV event starting position

  • "BP2": integer, CNV event ending position. Each record must have BP1 <= BP2, so each CNV spans at least 1 bp (or other unit length).

  • "TYPE": integer; allowed values are 0, 1, 3, 4, and larger (2 is not allowed).
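
The column requirements above can be checked before calling prep(). The following is a minimal sketch in base R; `check_cnv()` is a hypothetical helper written for this example (it is not part of CNVreg), and the values in `cnv` are made up.

```r
# Toy CNV table with hypothetical values, following the column list above
cnv <- data.frame(
  ID   = c("s1", "s2", "s2"),
  CHR  = c(1L, 1L, 1L),
  BP1  = c(100L, 150L, 400L),
  BP2  = c(300L, 250L, 500L),
  TYPE = c(1L, 3L, 4L),
  stringsAsFactors = FALSE
)

# check_cnv() is a hypothetical helper, not a CNVreg function
check_cnv <- function(cnv) {
  required <- c("ID", "CHR", "BP1", "BP2", "TYPE")
  stopifnot(
    all(required %in% names(cnv)),          # all required columns present
    length(unique(cnv$CHR)) == 1L,          # one chromosome at a time
    all(cnv$CHR %in% 1:22),                 # allowed chromosome range
    all(cnv$BP1 <= cnv$BP2),                # start must not exceed end
    all(cnv$TYPE >= 0L & cnv$TYPE != 2L)    # 0, 1, 3, 4, ... but never 2
  )
  invisible(TRUE)
}

check_cnv(cnv)
```

A table that violates any rule, for example one with a TYPE of 2, makes `check_cnv()` stop with an error naming the failed condition.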

Y

A data.frame with exactly 2 columns: "ID" and the outcome. For a binary outcome, values must be 0 (control) or 1 (case). For a continuous outcome, values must be real numbers. Y$ID must contain all unique CNV$ID values. Y and Z must have the same IDs.

Z

A data.frame; defaults to NULL. If provided, it must include column "ID". All other columns are covariates, which can be continuous, binary, or categorical variables. At a minimum, Z must contain all unique CNV$ID values.
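
The ID requirements on Y and Z can be verified with a few set operations. This is a minimal sketch with made-up data; the column names `outcome`, `age`, and `sex`, and the `cnv_ids` vector, are hypothetical and chosen only for illustration.

```r
# Hypothetical sample IDs that appear in a CNV table
cnv_ids <- c("s1", "s2")

# Outcome table: "ID" plus one outcome column (binary, coded 0/1 here)
Y <- data.frame(ID = c("s1", "s2", "s3"),
                outcome = c(0, 1, 0),
                stringsAsFactors = FALSE)

# Covariate table: "ID" plus any number of covariates
Z <- data.frame(ID = c("s1", "s2", "s3"),
                age = c(41, 57, 63),
                sex = factor(c("F", "M", "F")),
                stringsAsFactors = FALSE)

# Y has two columns, both tables cover every CNV ID, and Y and Z share IDs
ids_ok <- ncol(Y) == 2L &&
  all(cnv_ids %in% Y$ID) &&
  all(cnv_ids %in% Z$ID) &&
  setequal(Y$ID, Z$ID)
```

Note that Y and Z may contain samples with no CNV events (here "s3"); the stated requirement is only that every CNV$ID appears in both tables.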

rare.out

A scalar numeric in the range [0, 0.5); fragments whose event rate falls below this value are filtered out of the data. Defaults to 0.05.

Value

An S3 object of class "WTsmth.data" extending a list object containing

Examples

# Note: a very small example data set is used here to expedite the examples.

# load toy dataset
data("CNVCOVY")

## Continuous outcome Y_QT
frag_data <- prep(CNV = CNV, Y = Y_QT, Z = Cov, rare.out = 0.05)


[Package CNVreg version 1.0 Index]