
RDP Classifier 1.0 released

The RDP Classifier is a naïve Bayesian classifier that rapidly and accurately provides taxonomic assignments from domain to genus, with confidence estimates for each assignment. More information can be found at http://rdp.cme.msu.edu/.

Environmental gene library analysis has become a mainstream tool in microbial ecology owing to rapid improvements in high-throughput sequencing technology. The small-subunit rRNA gene is often targeted for library construction because of its use as a phylogenetic marker and in bacterial identification. However, tools for simple and rapid identification of large numbers of sequences have been lacking. Here we present a naïve Bayesian classifier (RDP Classifier) that can rapidly and accurately classify bacterial 16S rRNA sequences into the new higher-order taxonomy proposed by Bergey’s Trust. It provides taxonomic assignments from domain to genus, with confidence estimates for each assignment. The RDP Classifier is suitable both for analysis of single rRNA sequences and for analysis of libraries of thousands of sequences. The Classifier is fast, does not require sequence alignment, and works well with partial sequences.
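For readers unfamiliar with the approach, the following is a minimal Python sketch of alignment-free naïve Bayesian classification over short sequence "words", with a bootstrap-style confidence estimate. It is not the RDP Classifier's code. The word size, the smoothing formula, the equal-prior assumption, the bootstrap subset size, and all function names are illustrative assumptions that do not come from this announcement.

```python
import math
import random
from collections import defaultdict

K = 4  # illustrative word size (an assumption, not the released setting)


def kmers(seq, k=K):
    """Return the set of overlapping k-letter words in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(labeled):
    """Count, per genus, how many training sequences contain each word.

    labeled: iterable of (genus, sequence) pairs.
    """
    word_counts = defaultdict(lambda: defaultdict(int))
    n_seqs = defaultdict(int)
    for genus, seq in labeled:
        n_seqs[genus] += 1
        for word in kmers(seq):
            word_counts[genus][word] += 1
    return word_counts, n_seqs


def log_score(words, genus, model):
    """Sum of log word probabilities for one genus (equal priors assumed)."""
    word_counts, n_seqs = model
    total = n_seqs[genus]
    # Simple additive smoothing so unseen words do not give log(0).
    return sum(
        math.log((word_counts[genus].get(w, 0) + 0.5) / (total + 1))
        for w in words
    )


def classify(seq, model, trials=100, seed=0):
    """Return (best genus, fraction of bootstrap trials agreeing with it)."""
    words = sorted(kmers(seq))
    if not words:
        raise ValueError("sequence is shorter than the word size")
    genera = sorted(model[1])
    best = max(genera, key=lambda g: log_score(words, g, model))
    rng = random.Random(seed)
    subset = max(1, len(words) // 8)  # illustrative subset size
    hits = 0
    for _ in range(trials):
        sample = [rng.choice(words) for _ in range(subset)]
        if max(genera, key=lambda g: log_score(sample, g, model)) == best:
            hits += 1
    return best, hits / trials
```

Because the score depends only on which words occur in the query, no alignment is needed, and a partial sequence simply contributes fewer words.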

The Classifier and related software were written in Java (API v1.4.1) and have been tested on the Solaris (2.8), Linux (2.4.23) and Macintosh (OS 10.3) operating systems using Java virtual machines from Sun and Apple.

The compiled RDP Classifier library includes the data from Bergey's bacterial taxonomy outline release 5.0 as the default trained data. To run the Classifier with the default settings, simply provide a file with one or more bacterial 16S rRNA sequences in FASTA, GenBank or EMBL format. The classification results will be written to an output file.
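Before submitting a large library, it can help to confirm that the input file parses as expected. The sketch below is a small stand-alone FASTA reader in Python. It is not part of the RDP Classifier package, and the function name `read_fasta` is hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

Passing an open file object, for example `list(read_fasta(open(path)))` with `path` pointing at your own file, returns every record, so the record count and sequence lengths can be checked before classification.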

The RDP Classifier is not limited to the bacterial taxonomy proposed by the Bergey's editors. It worked equally well when trained on the NCBI taxonomy. Two files are needed to train the Classifier: a taxonomy file describing the relationships between the taxa, and a file containing trusted sequences, each labeled with its taxonomic assignments from domain to genus level.
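The two kinds of training information can be pictured with a small Python sketch: a parent map standing in for the taxonomy, and a helper that expands a genus label into its full domain-to-genus lineage. This in-memory layout and the name `lineage` are hypothetical illustrations and do not reflect the actual file formats expected by the package.

```python
# Hypothetical in-memory taxonomy: taxon -> (rank, parent taxon).
TAXONOMY = {
    "Bacteria": ("domain", None),
    "Proteobacteria": ("phylum", "Bacteria"),
    "Gammaproteobacteria": ("class", "Proteobacteria"),
    "Enterobacteriales": ("order", "Gammaproteobacteria"),
    "Enterobacteriaceae": ("family", "Enterobacteriales"),
    "Escherichia": ("genus", "Enterobacteriaceae"),
}


def lineage(taxon, taxonomy):
    """Return [(rank, name), ...] from the root down to the given taxon."""
    path = []
    current = taxon
    while current is not None:
        if len(path) > len(taxonomy):
            raise ValueError("cycle detected in taxonomy")
        rank, parent = taxonomy[current]  # KeyError if the taxon is unknown
        path.append((rank, current))
        current = parent
    path.reverse()
    return path
```

A trusted sequence labeled only with its genus could then be checked against the taxonomy by confirming that `lineage(genus, TAXONOMY)` starts at a domain and ends at that genus.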

To continue improving the RDP Classifier, we need your help in testing it. Please download the package and try it out. Contact rdpstaff@msu.edu or (517) 432-4998 if you have any questions or suggestions.

This research was supported by the Office of Science (BER), U.S. Department of Energy, Grant No. DE-FG02-99ER62848 and the National Science Foundation, Grant No. DBI-0328255.

Posted by Qiong Wang 2006-11-30
