From: Artem T. <lom...@gm...> - 2012-08-09 10:00:46
Dear all,

I'm glad to announce a new library for working with data in the BAM/SAM formats, named sambamba (Swahili for 'parallel'). It started as a GSoC project in May of this year and is now quite stable, and a set of tools has been developed on top of it. The library is written in the D programming language and is easy to build upon. Your contributions are welcome!

I've tried to summarize how my tools differ from samtools, and have done some speed comparisons as well:
https://github.com/lomereiter/sambamba/wiki/Comparison-with-samtools

As of now, the toolkit provides alternatives to the view, index, merge, sort, and flagstat commands. All except 'view' work only with BAM; not much attention has been paid to its text counterpart, SAM.

Features:
* all tools do BGZF compression and decompression in parallel, giving a significant speedup on multicore machines
* flexible filtering with expressions like "mapping_quality > 40 and [NM] == 0"
* optional progress bar (implemented for BAM only)
* option to output reads in JSON (for scripting purposes)

Debian packages are available at https://github.com/lomereiter/sambamba/downloads

I'd appreciate your feedback and in-the-wild testing of these tools :)
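
For readers wondering why BGZF lends itself to parallel decompression: a BGZF file is a series of independent gzip members, and each member records its own total size in a "BC" extra subfield, so block boundaries can be found without inflating anything. The sketch below illustrates that idea in Python; it is not sambamba's D implementation. The helper names (make_block, split_blocks, inflate_block, parallel_decompress) are hypothetical, and the code assumes the "BC" subfield is the only extra field in each header, which the format does not strictly require.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor

# Fixed 18-byte BGZF header, assuming "BC" is the only extra subfield:
# ID1, ID2, CM, FLG, MTIME, XFL, OS, XLEN, SI1, SI2, SLEN, BSIZE
_HEADER = struct.Struct("<4BI2BH2BHH")
_MAGIC = (31, 139, 8, 4)


def make_block(payload: bytes) -> bytes:
    """Pack one payload (small enough for a 64 KiB block) as a BGZF block."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw deflate stream
    cdata = comp.compress(payload) + comp.flush()
    bsize = _HEADER.size + len(cdata) + 8 - 1       # total block size minus 1
    header = _HEADER.pack(31, 139, 8, 4, 0, 0, 255, 6, 66, 67, 2, bsize)
    trailer = struct.pack("<II", zlib.crc32(payload), len(payload))
    return header + cdata + trailer


def split_blocks(buf: bytes):
    """Yield each block by reading BSIZE from its header; nothing is inflated."""
    offset = 0
    while offset < len(buf):
        fields = _HEADER.unpack_from(buf, offset)
        if fields[:4] != _MAGIC:
            raise ValueError("not a BGZF block at offset %d" % offset)
        size = fields[-1] + 1
        yield buf[offset:offset + size]
        offset += size


def inflate_block(block: bytes) -> bytes:
    """Inflate one block and check it against its CRC32 and ISIZE trailer."""
    crc, isize = struct.unpack("<II", block[-8:])
    data = zlib.decompress(block[_HEADER.size:-8], -15)
    if len(data) != isize or zlib.crc32(data) != crc:
        raise ValueError("corrupt BGZF block")
    return data


def parallel_decompress(buf: bytes, workers: int = 4) -> bytes:
    """Inflate all blocks on a thread pool; map() keeps the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(inflate_block, split_blocks(buf)))


chunks = [("read%d\tACGT\n" % i).encode() * 100 for i in range(8)]
stream = b"".join(make_block(c) for c in chunks)
restored = parallel_decompress(stream)
```

Because every block is self-contained, the only sequential work is the cheap scan over the headers; the inflate calls are independent of one another and can be handed to as many workers as there are cores.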