

Aalto Cloud Software Program

Hadoop-BAM is a Java library for manipulating files in common bioinformatics formats using the Hadoop MapReduce framework together with the Picard SAM JDK. It also provides command line tools in the vein of SAMtools.

The file formats currently supported are BAM, SAM, FASTQ, FASTA, QSEQ, BCF, and VCF.

For a longer high-level description of Hadoop-BAM, refer to the article "Hadoop-BAM: directly manipulating next generation sequencing data in the cloud" in Bioinformatics Volume 28 Issue 6 pp. 876-877, also available online at:

Note that the library part of Hadoop-BAM is primarily intended for developers with experience in using Hadoop. The command line tools of Hadoop-BAM should be understandable to all users, but they are limited in scope. See the SeqPig project for a more versatile and higher-level interface to the file formats supported by Hadoop-BAM:

Project Admins:


  • Pratap Mutadak - 2013-10-22

    Hi, could anybody please tell me the download location of the Hadoop-BAM installation PDF guide?

    • Andre Schumacher

      Hi, what PDF guide are you referring to? There is a file README.txt that should explain the installation.

