trainset9_rdp {phylotypr} | R Documentation |
RDP training set v9
Description
The sequence and taxonomy data for the 10,049 sequences found in the
Ribosomal Database Project's trainset9_032012 training set for use with the
naive Bayesian classifier as implemented in the {phylotypr}
R package.
Originally released by the RDP in September 2012. The rdp version contains
the same sequences as provided by the official RDP version (9,665 bacterial
and 384 archaeal). The pds version contains extra eukaryotic sequences
including 119 chloroplasts and mitochondria (10,168 total sequences). See the
mothur reference file page in "Source" for more information. Be sure to see
the mothur GitHub project where you can find the phylotyprrefdata package
(https://github.com/mothur/phylotyprrefdata) for access to other taxonomic
reference data.
Usage
trainset9_rdp
trainset9_pds
Format
A data frame with 3 columns. Each row represents a different sequence:
- id
Sequence accession identifier
- sequence
DNA sequence string
- taxonomy
Taxonomic string with each level separated with a ;
An object of class data.frame with 10169 rows and 3 columns.
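
The column layout described above can be explored with base R alone. The following is a minimal sketch, not taken from the package: the data frame toy is a hypothetical stand-in with made-up ids, sequences, and taxonomy strings, and the trailing ; on each taxonomy string is an assumption about the format. With the package loaded, trainset9_rdp or trainset9_pds could presumably be used in place of toy.

```r
# Hypothetical stand-in for trainset9_rdp: same three columns, made-up rows.
# The trailing ";" on each taxonomy string is an assumption.
toy <- data.frame(
  id = c("seq_a", "seq_b", "seq_c"),
  sequence = c("ACGTACGT", "GGCATTAC", "TTGACCGA"),
  taxonomy = c(
    "Bacteria;Firmicutes;Bacilli;",
    "Bacteria;Proteobacteria;Gammaproteobacteria;",
    "Archaea;Euryarchaeota;Methanobacteria;"
  ),
  stringsAsFactors = FALSE
)

# Split each taxonomy string into its levels at the ";" separator.
# strsplit() drops the empty piece after a trailing separator.
levels_list <- strsplit(toy$taxonomy, ";", fixed = TRUE)

# Take the first level (the domain) of every sequence and tabulate it.
domain <- vapply(levels_list, function(x) x[1], character(1))
domain_counts <- table(domain)
print(domain_counts)
```

Tabulating the first level this way is one quick check on the bacterial and archaeal sequence counts given in the Description.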
Source
- RDP sourceforge page