ABBA: Assembly Boosted By Amino acid sequences
ABBA (Assembly Boosted By Amino acid sequences) is a comparative gene assembler that uses amino acid sequences from predicted proteins to help build a better assembly. See the journal paper cited below.
NOTE: ABBA performs the protein-guided assembly but does not find the reference proteins itself. You need to identify the proteins that run off the ends of contigs separately, then pass those proteins to ABBA to fill in the gaps.
There are two ways to find the reference proteins:
- Do a draft annotation of the genome using an annotation pipeline. ABBA will not annotate your assembly.
- Align the draft assembly contigs to a close relative and find where the contig ends intersect protein-coding regions.
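The second approach can be sketched in a few lines of Python. This is only an illustration, not part of ABBA or AMOS: it assumes you have already extracted, from your aligner and from the relative's annotation, the reference coordinates of each aligned contig and of each protein-coding gene. The names `Interval` and `genes_spanning_contig_ends` are hypothetical, and coordinates are assumed to be 0-based and half-open.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A named region on the reference genome (0-based, half-open)."""
    name: str
    start: int
    end: int


def genes_spanning_contig_ends(
    contigs: List[Interval], genes: List[Interval]
) -> List[Tuple[str, str, str]]:
    """Return (gene, contig, which_end) for each gene that straddles a contig boundary.

    A gene straddles a boundary when the boundary position falls strictly
    inside the gene, i.e. the gene runs off that end of the contig.
    """
    hits = []
    for contig in contigs:
        for gene in genes:
            # Gene begins before the contig's aligned region and runs into it.
            if gene.start < contig.start < gene.end:
                hits.append((gene.name, contig.name, "start"))
            # Gene begins inside the contig's aligned region and runs past it.
            if gene.start < contig.end < gene.end:
                hits.append((gene.name, contig.name, "end"))
    return hits


# Made-up coordinates, for illustration only.
contigs = [Interval("contig1", 0, 1000), Interval("contig2", 1150, 3000)]
genes = [
    Interval("geneA", 100, 400),    # fully inside contig1
    Interval("geneB", 900, 1200),   # spans the gap between contig1 and contig2
    Interval("geneC", 2000, 2500),  # fully inside contig2
]

candidates = genes_spanning_contig_ends(contigs, genes)
for gene_name, contig_name, which_end in candidates:
    print(gene_name, contig_name, which_end)
```

Genes reported this way are candidates whose protein sequences could be passed to ABBA. The nested loop is quadratic, which is fine for a small bacterial genome; for larger inputs an interval index would be a better fit.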
ABBA is built on top of the AMOS framework but has its own distribution. The AMOS framework is included in the ABBA tarball and will be installed along with ABBA if you don't already have AMOS installed. The tarball is available here: ftp://ftp.cbcb.umd.edu/pub/data/dsommer/abba.tgz
Salzberg SL, Sommer DD, Puiu D, Lee VT (2008). PLoS Computational Biology 4(9): e1000186. doi:10.1371/journal.pcbi.1000186