From: Walenz, B. <wa...@nb...> - 2015-03-23 20:32:11
Congratulations! You're the first to try (or announce trying) more than 2 billion reads. We've run assemblies with up to 2 billion reads, but never more.

I forget why we didn't allow up to 4 billion reads. Aside from silly printing errors like this, there are probably places where -1 is used to indicate an invalid ID, or where exactly 31 bits were available to store an ID.

I think I can spend a little time running a mock assembly with ~4 billion reads. This should catch most of the obvious issues.

The fix for this issue is simple. The easiest for you would be to change that perl function to always return 3902080154 (the number of reads you actually have). You can try continuing with that hack, or wait until I can run the mock assembly. The only risk is lost compute; every step of the assembler can be undone.

b

________________________________
From: Langhorst, Brad [Lan...@ne...]
Sent: Sunday, March 22, 2015 5:09 PM
To: wgs...@li...
Subject: [wgs-assembler-users] last fragment id < 0 ... runCA can't match the output

Hi:

My frag store has finally been built, but runCA can’t continue because it can’t figure out the number of frags in the store.

Is it normal for the endIID to be < 0?

    gatekeeper -lastfragiid deer.gkpStore
    Last frag in store is iid = -392887142

runCA can’t handle that. In getNumberOfFragsInStore, this regex fails to match due to the “-”:

    $numFrags = $1 if (m/^Last frag in store is iid = (\d+)$/);

I could fix that, but it doesn’t seem very reasonable to report a negative value from a function that is supposed to be counting the number of frags in a store.

Should I fix the getNumberOfFragsInStore function to just parse this file and return the “active” column? Or is this an indication of some deeper problem? Where should I look next?

Thanks,
Brad

Here is the .info file:

    libIID  bgnIID      endIID      active      deleted  mated       totLen        clrLen        libName
    0       1           -392887142  3902080154  0        3902080154  589214103254  589214103254  GLOBAL
    0       0           0           0           0        0           0             0             LegacyUnmatedReads
    1       1           1321836050  1321836050  0        1321836050  199597243550  199597243550  run15
    2       1321836051  2564548948  1242712898  0        1242712898  187649647598  187649647598  run16
    3       2564548949  3902080154  1337531206  0        1337531206  201967212106  201967212106  run17

--
Brad Langhorst, Ph.D.
Applications and Product Development Scientist
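
For readers following the thread, here is a minimal sketch of the arithmetic behind the negative IID and of the two workarounds discussed above. It is written in Python purely for illustration (runCA itself is Perl), and everything in it is an assumption rather than the assembler's actual code: the helper names are hypothetical, the 32-bit signed wraparound is inferred only from the fact that 3902080154 - 2^32 equals the reported -392887142, and the .info layout is assumed to be whitespace-delimited exactly as pasted in the message.

```python
import re

UINT32_MODULUS = 1 << 32  # 4294967296


def as_signed_32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed two's-complement int."""
    value &= UINT32_MODULUS - 1
    return value - UINT32_MODULUS if value >= (1 << 31) else value


def as_unsigned_32(value: int) -> int:
    """Undo the wraparound: map a signed 32-bit value back to 0..2^32-1."""
    return value & (UINT32_MODULUS - 1)


# Same pattern as the quoted runCA regex, but tolerating a leading minus sign.
LAST_IID = re.compile(r"^Last frag in store is iid = (-?\d+)$")


def parse_last_frag_iid(line: str) -> int:
    """Hypothetical helper: parse the gatekeeper line and unwrap a negative IID."""
    match = LAST_IID.match(line.strip())
    if match is None:
        raise ValueError(f"unexpected gatekeeper output: {line!r}")
    return as_unsigned_32(int(match.group(1)))


def active_reads_from_info(info_text: str) -> int:
    """Hypothetical helper: return the 'active' count from the GLOBAL row."""
    rows = [ln.split() for ln in info_text.strip().splitlines() if ln.strip()]
    header, body = rows[0], rows[1:]
    active_col = header.index("active")
    name_col = header.index("libName")
    for row in body:
        if row[name_col] == "GLOBAL":
            return int(row[active_col])
    raise ValueError("no GLOBAL row found in .info text")


# The .info contents as posted in the message above.
INFO_TEXT = """
libIID bgnIID endIID active deleted mated totLen clrLen libName
0 1 -392887142 3902080154 0 3902080154 589214103254 589214103254 GLOBAL
0 0 0 0 0 0 0 0 LegacyUnmatedReads
1 1 1321836050 1321836050 0 1321836050 199597243550 199597243550 run15
2 1321836051 2564548948 1242712898 0 1242712898 187649647598 187649647598 run16
3 2564548949 3902080154 1337531206 0 1337531206 201967212106 201967212106 run17
"""

if __name__ == "__main__":
    print(as_signed_32(3902080154))
    print(parse_last_frag_iid("Last frag in store is iid = -392887142"))
    print(active_reads_from_info(INFO_TEXT))
```

Both routes recover the same count from the posted data, which suggests the store contents are intact and only the printed value wrapped. As noted in the reply, this says nothing about other places in the assembler that may reserve -1 or only 31 bits for an ID.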