Hadoop-BAM is a Java library for manipulating files in common bioinformatics formats using the Hadoop MapReduce framework together with the Picard SAM JDK. It also includes command line tools similar to SAMtools.
The file formats currently supported are BAM, SAM, FASTQ, FASTA, and QSEQ.
For a longer high-level description of Hadoop-BAM, refer to the article "Hadoop-BAM: directly manipulating next generation sequencing data in the cloud" in Bioinformatics Volume 28 Issue 6 pp. 876-877, available online at: http://dx.doi.org/10.1093/bioinformatics/bts054
Note that the library component of Hadoop-BAM is aimed mainly at developers with experience in using Hadoop. The command line tools should be understandable to all users, but they are limited in scope.
See the SeqPig project for a higher-level interface to the file formats supported by Hadoop-BAM: http://seqpig.sourceforge.net
If you are looking for Hadoop-based read alignment tools, consider Seal: http://biodoop-seal.sourceforge.net