[Inchworm-users] Max Kmer size
From: Greg C. <gco...@gm...> - 2011-02-25 17:23:25
Hi Brian,

Thanks for the awesome resource! I've been using the previous version of Inchworm (inchworm_01-20-2011<http://sourceforge.net/projects/inchworm/files/OLD_VERSIONS/inchworm_01-20-2011.tgz/download>) for de novo assembly of transcriptomic data from a non-model organism with no reference genome available. So far my success has been great, both in terms of transcript length and maximum memory requirements (which crippled my Velvet/Oases assembly).

My strategy so far has been to run Inchworm on the raw data with a range of Kmer sizes, from 25 to 37, concatenate those outputs, and reassemble using the '--reassembleIworm' option. This brings me to my two questions so far:

1. I had to modify the source to allow a max Kmer size >31. Is there a particular reason for this limit?

2. I doubt the strategy I'm using is "the best" one. However, from one lane of a flow cell (~5 Gb of raw Illumina data: 2x76 bp * 34E06 reads) I was able to generate ~120 Mb of consensus sequence representing >500,000 contigs/transcripts (above a 100 bp threshold). Should I be doing anything differently to maximize Inchworm's potential to assemble transcripts? So far the lengths are pretty good, with >8,000 transcripts longer than 1,000 bp, and a few in the 10,000 bp range.

Also, I'm hoping to submit this data for publication in the near future. Is there an ETA for a publication that I can cite?

Aloha!

Gregory T. Concepcion, PhD
Cell Biology and Molecular Genetics
2107 Biosciences Research Building
University of Maryland
College Park, MD 20742
w:301.405.8300
c:301.828.8210