Thread: [Inchworm-users] Max Kmer size


Hi Brian,

Thanks for the awesome resource!

I've been using the previous version of inchworm
(inchworm_01-20-2011<http://sourceforge.net/projects/inchworm/files/OLD_VERSIONS/inchworm_01-20-2011.tgz/download>)
for de novo assembly of transcriptomic data from a non-model organism with
no reference genome available. So far my success has been great, both in
terms of transcript length and maximum memory requirements (which crippled
my Velvet/Oases assembly).

My strategy so far has been to run inchworm on the raw data with a range of
Kmer sizes (25 to 37), concatenate those outputs, and reassemble them using
the '--reassembleIworm' option.
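
The concatenation step above could be sketched roughly as follows. This is a
minimal illustration, not part of inchworm: the function name pool_contigs,
the file names, and the "k<size>_" header prefix are all hypothetical. It
assumes each run wrote a plain FASTA file, and it tags every header with its
Kmer size so that identical contig IDs from different runs stay distinct in
the pooled file.

```python
def pool_contigs(fasta_by_k, pooled_path):
    """Concatenate per-Kmer-size FASTA files into one file.

    fasta_by_k maps a Kmer size (int) to the FASTA path produced by that run.
    Returns the number of sequence records written.
    """
    n_records = 0
    with open(pooled_path, "w") as out:
        # Process runs in ascending Kmer size so the output order is stable.
        for k, fasta_path in sorted(fasta_by_k.items()):
            with open(fasta_path) as handle:
                for line in handle:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # drop blank lines
                    if line.startswith(">"):
                        n_records += 1
                        # Prefix the ID with the Kmer size (hypothetical scheme).
                        name, _, rest = line[1:].partition(" ")
                        line = f">k{k}_{name}" + (f" {rest}" if rest else "")
                    out.write(line + "\n")
    return n_records

# Example call (hypothetical file names):
# pool_contigs({25: "iworm_k25.fasta", 37: "iworm_k37.fasta"}, "iworm_pooled.fasta")
```

The pooled file would then be the input to the reassembly run.
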
This brings me to my two questions so far:
1. I had to modify the source to allow a max Kmer size >31. Is there a
particular reason for this limit?
2. I doubt the strategy I'm using is "the best" one; however, from one lane
of a flow cell (~5 Gb of raw Illumina data, 2x76bp * 34E06 reads) I was able
to generate ~120 Mb of consensus sequence representing >500,000
contigs/transcripts (above a 100bp threshold). Should I be doing anything
differently to maximize Inchworm's potential to assemble transcripts? So far
the lengths are pretty good, with >8,000 transcripts longer than 1,000bps,
and a few in the 10,000bp range.
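
As a quick sanity check on the figures above, the arithmetic can be worked
through as below. This is only a sketch: it assumes "34E06 reads" means 34
million read pairs (the reading under which 2x76bp comes out at ~5 Gb), and
it treats the approximate totals quoted above as exact, so the mean contig
length is a rough upper estimate.

```python
read_pairs = 34e6        # assumed: 34E06 counts read pairs, not single reads
read_length = 76         # bp per read, two reads per pair
raw_bases = 2 * read_length * read_pairs

consensus_bases = 120e6  # ~120 Mb of consensus sequence
contig_count = 500_000   # ">500,000" contigs, so the true mean is a bit lower
mean_contig_length = consensus_bases / contig_count

print(f"raw data: {raw_bases / 1e9:.2f} Gb")
print(f"mean contig length: about {mean_contig_length:.0f} bp or less")
```
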

Also, I'm hoping to submit this data for publication in the near future. Is
there an ETA for a publication that I can cite?

Aloha!

Gregory T. Concepcion, PhD
Cell Biology and Molecular Genetics
2107 Biosciences Research Building
University of Maryland
College Park, MD 20742

w:301.405.8300
c:301.828.8210
