SPA: a short peptide assembler for metagenomic data

Download sfaspa-0.2.1-build64.tar.gz
License: BSD. Platform: Linux.


The metagenomic paradigm offers the opportunity to study protein families, and therefore the metabolic and functional potential, of the constituent microbes in a community. Nucleotide assembly-based strategies are of limited help here, since metagenomic assemblies are typically very fragmented and leave a large fraction of reads unassembled. We present a method for reconstructing complete protein sequences directly from next-generation sequencing (NGS) metagenomic data. Our framework is based on a novel Short Peptide Assembler (SPA) that assembles protein sequences from their constituent peptide fragments identified on short reads. We also present a new implementation of SPA based on a suffix array (SFA-SPA), which runs significantly faster than SPA.
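To give a feel for why a suffix array helps when working with many short peptide fragments, here is a minimal Python sketch. It is not the SPA or SFA-SPA implementation. The peptide strings, the `$` separator, and the function names (`build_suffix_array`, `find_occurrences`, `fragment_of`) are all assumptions made for illustration. The sketch indexes a few toy fragments and looks up which fragments contain a given k-mer, the kind of query an overlap-based assembler would issue repeatedly.

```python
from bisect import bisect_left, bisect_right

def build_suffix_array(text):
    """Return suffix start positions of `text` in lexicographic order.

    Naive construction for illustration only; real tools use faster builds.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_occurrences(text, sa, pattern):
    """Return sorted start positions at which `pattern` occurs in `text`."""
    k = len(pattern)
    # Truncating sorted suffixes to length k keeps them in sorted order,
    # so all matches form one contiguous run that bisect can locate.
    prefixes = [text[i:i + k] for i in sa]
    idx = bisect_left(prefixes, pattern)
    hits = []
    while idx < len(prefixes) and prefixes[idx] == pattern:
        hits.append(sa[idx])
        idx += 1
    return sorted(hits)

def fragment_of(position, starts):
    """Map a position in the concatenated text back to its fragment index."""
    return bisect_right(starts, position) - 1

# Hypothetical peptide fragments, as if translated from short reads.
peptides = ["MKTAYIAK", "YIAKQRQI", "QRQISFVK"]
text = "$".join(peptides)  # '$' is assumed not to occur in any peptide

# Start offset of each fragment inside the concatenated text.
starts = []
offset = 0
for p in peptides:
    starts.append(offset)
    offset += len(p) + 1  # +1 for the separator

sa = build_suffix_array(text)
hits = find_occurrences(text, sa, "YIAK")
fragments_with_kmer = sorted({fragment_of(h, starts) for h in hits})
print(fragments_with_kmer)
```

In this toy data the k-mer `YIAK` ends the first fragment and begins the second, which is the kind of shared substring that lets neighbouring fragments be chained into a longer protein sequence. How SFA-SPA actually uses its suffix array is not described on this page.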

Youngik Yang, Cuncong Zhong, and Shibu Yooseph*
J. Craig Venter Institute, San Diego, CA
* Corresponding author

SPA is available as a binary for 64-bit Linux.
SFA-SPA is available as both a binary and source code for 64-bit Linux.




Additional Project Details

Programming Language

Perl, C++

