correct_gc_bias {RCNA} | R Documentation |
correct_gc_bias: Estimate and correct GC bias in coverage
Description
This generic function estimates and corrects GC-content-based coverage bias.
This function optionally estimates the GC bias using a sliding window approach, then corrects coverage based on a GC-content factor file. The factor file is either generated by that estimation step or provided by the user. The function creates a GC factor file and a corrected coverage file, both of which are placed in the output directory under '/gc'.
Usage
correct_gc_bias(obj, ...)
## Default S3 method:
correct_gc_bias(
obj = NULL,
df = NULL,
sample.names = NULL,
ano.file,
out.dir = NULL,
ncpus = 1,
file.raw.coverage = NULL,
file.corrected.coverage = NULL,
file.gc.factor = NULL,
win.size = 75,
gc.step = 0.01,
estimate_gc = TRUE,
verbose = FALSE,
...
)
## S3 method for class 'RCNA_object'
correct_gc_bias(obj, verbose = FALSE, ...)
Arguments
obj |
An object of class RCNA_object. When supplied, parameters are pulled from the object instead, specifically from its 'gcParams' slot. |
... |
Additional arguments (unused) |
df |
Path to the config file, or a 'data.frame' object containing the valid parameters. Valid column names are 'file.raw.coverage', 'file.gc.factor', 'file.corrected.coverage', and 'sample.names'. Additional columns will be ignored. |
sample.names |
Character vector of sample names. Alternatively can be specified in 'df'. |
ano.file |
Location of the annotation file. This file must be in CSV format and contain the following information (with column headers as specified): "feature,chromosome,start,end". |
out.dir |
Output directory for results. A subdirectory for results will be created under this + '/gc/'. |
ncpus |
Integer number of CPUs to use. Specifying more than one allows this function to be parallelized by feature. |
file.raw.coverage |
Character vector listing the raw input coverage files. Must be the same length as 'sample.names'. Alternatively can be specified in 'df'. |
file.corrected.coverage |
Character vector listing the corrected coverage files to be written. If not specified, new names will be generated based on the raw coverage files. |
file.gc.factor |
Character vector listing the GC factor files used to correct coverage. If 'estimate_gc=FALSE' then this must be provided. Otherwise it is ignored. |
win.size |
Size in base pairs of the sliding window used to estimate and correct the GC bias. |
gc.step |
Bin size for GC bias in the GC factor file. If the GC factor file is provided then the file must have corresponding bin sizes. |
estimate_gc |
Logical determining if GC content estimation should be performed. If set to 'FALSE' then a factor file must be provided via 'file.gc.factor' or in 'df'. |
verbose |
If set to TRUE, displays more detail. |
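To illustrate how 'win.size' and 'gc.step' interact, the following base R sketch computes the GC fraction in sliding windows, assigns each window to a GC bin, and derives one correction factor per bin. It is a minimal sketch under stated assumptions and not the RCNA implementation: the helper 'gc_windows', the simulated sequence, the simulated coverage, and the "bin mean over overall mean" factor are all hypothetical choices made for illustration.

```r
# Illustrative only: NOT the RCNA implementation.
# gc_windows() is a hypothetical helper that computes the GC fraction in
# every window of win.size bases and assigns each window to a GC bin of
# width gc.step.
gc_windows <- function(dna, win.size = 75, gc.step = 0.01) {
  bases <- strsplit(toupper(dna), "")[[1]]
  n <- length(bases)
  if (n < win.size) stop("sequence is shorter than win.size")
  is_gc <- bases %in% c("G", "C")
  starts <- seq_len(n - win.size + 1)
  # Cumulative sums give the GC count of each window without re-scanning it
  cs <- c(0, cumsum(is_gc))
  gc_frac <- (cs[starts + win.size] - cs[starts]) / win.size
  bin_id <- as.integer(round(gc_frac / gc.step))
  data.frame(start = starts, gc = gc_frac,
             bin_id = bin_id, gc_bin = bin_id * gc.step)
}

# Simulated sequence and coverage (assumptions, not package data)
set.seed(1)
dna <- paste(sample(c("A", "C", "G", "T"), 500, replace = TRUE), collapse = "")
win <- gc_windows(dna, win.size = 75, gc.step = 0.01)
win$coverage <- 100 * (1 + (win$gc - mean(win$gc)))  # coverage rises with GC

# Assumed factor definition: mean coverage in a bin relative to the overall mean
gc_factor <- tapply(win$coverage, win$bin_id, mean) / mean(win$coverage)

# Dividing by the bin's factor removes the GC-dependent trend
win$corrected <- win$coverage / gc_factor[as.character(win$bin_id)]
```

Because the factors are indexed by GC bin, a factor file produced with one 'gc.step' cannot be reused with another, which is why a user-provided factor file must have corresponding bin sizes.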
Details
This function can be run stand-alone or as part of run_RCNA.
The 'df' argument corresponds to the 'gcParams' slot of an RCNA_object. Valid column names are 'sample.names', 'file.raw.coverage', 'file.corrected.coverage', and 'file.gc.factor'. The 'file.gc.factor' column is not required if 'estimate_gc=TRUE'. Additional columns will be ignored.
For more parameter information, see correct_gc_bias.default.
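As a sketch of the 'df' layout, the following builds a 'data.frame' with the four valid column names. The sample names, file paths, and file name suffixes are hypothetical placeholders, and the commented-out call shows only one plausible way to pass the table, using arguments from the Usage section.

```r
# Hypothetical sample names and file paths, for illustration only
samples <- c("sampleA", "sampleB")

gc_df <- data.frame(
  sample.names = samples,
  file.raw.coverage = file.path("coverage", paste0(samples, ".txt.gz")),
  file.gc.factor = file.path("output", "gc",
                             paste0(samples, ".gc.factor.txt")),
  file.corrected.coverage = file.path("output", "gc",
                                      paste0(samples, ".corrected.txt.gz")),
  stringsAsFactors = FALSE
)

# Check that every documented column is present before handing the table over
valid_cols <- c("sample.names", "file.raw.coverage",
                "file.corrected.coverage", "file.gc.factor")
stopifnot(all(valid_cols %in% names(gc_df)))

# With RCNA installed and the files above in place, the table could be
# passed along these lines (not run here; "annotations.csv" is a placeholder):
# correct_gc_bias(df = gc_df, ano.file = "annotations.csv",
#                 out.dir = "output", estimate_gc = FALSE)
```

The 'file.gc.factor' column is included because the commented call sets 'estimate_gc = FALSE'; with 'estimate_gc = TRUE' it could be left out.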
Value
A RCNA_analysis class object that describes the input parameters and output files generated by this step of the workflow.
See Also
RCNA_object, RCNA_analysis, run_RCNA
Examples
## Run GC-bias estimation and correction on example object
# See ?example_obj for more information on the example object
example_obj@ano.file <- system.file("examples", "annotations-example.csv",
package = "RCNA")
raw.cov <- system.file("examples", "coverage",
paste0(example_obj@sample.names, ".txt.gz"), package = "RCNA")
example_obj@gcParams$file.raw.coverage <- raw.cov
example_obj
# Create output directory
dir.create(file.path("output", "gc"), recursive = TRUE)
# Estimate and correct GC bias, append results
correct_gc_analysisObj <- correct_gc_bias(example_obj)
example_obj@commands <- c(example_obj@commands, correct_gc_analysisObj)