SPA: a short peptide assembler for metagenomic data
The metagenomic paradigm offers the opportunity to study the protein families, and therefore the metabolic and functional potential, of the constituent microbes in a community. A nucleotide assembly-based strategy is of limited use for this purpose, since metagenomic assemblies are typically very fragmented and leave a large fraction of reads unassembled. We present a method for reconstructing complete protein sequences directly from next-generation sequencing (NGS) metagenomic data. Our framework is based on a novel Short Peptide...