From: Waldbieser, G. <Geo...@AR...> - 2013-08-09 20:09:03
Hi,

CA v7.0 (build assembled 3.3M pacBioToCA-corrected long reads into 146,537 unitigs (~1Gb) in 24 hrs using 62cpu and 512GB RAM.

$> /home/software/wgs1/Linux-amd64/bin/runCA -version
CA version CVS TIP ($Id: AS_GKP_main.c,v 1.105 2012-09-13 17:41:13 skoren Exp $).
CA version CVS TIP ($Id: AS_CGB_unitigger.c,v 1.45 2011-09-06 02:15:18 mkotelbajcvi Exp $).
CA version CVS TIP ($Id: BuildUnitigs.cc,v 1.88 2012-01-15 23:49:34 brianwalenz Exp $).
Using up to 64 OpenMP threads.
CA version CVS TIP ($Id: AS_CGW_main.c,v 1.116 2012-11-15 05:04:54 brianwalenz Exp $).
CA version CVS TIP ($Id: terminator.C,v 1.17 2012-09-10 08:58:11 brianwalenz Exp $).

I then added Illumina mate pairs that had been corrected and deduplicated, with chimeric reads removed.

Library                     Pairs
Illumina MP, 3kb insert     4,455,475
Illumina MP, 8kb insert     2,834,858
Illumina MP, 36kb insert    1,222,830

When these were added, CA spent a few days in splitUnitigs before failing.

-------------------
$> tail -60 splitUnitigs.out.FAILED
Creating new unitig 8643669 with 28 fragments
unitig 2478 interval 0 0,920 good
unitig 2478 interval 1 920,1123 bad
unitig 2478 interval 2 1124,1428 good
Fixing contains.
prev 1143,1230 -- 16110937 1220,1143 (no overlap to new 1193,1413)
prev 1143,1230 -- 11823198 1220,1143 (no overlap to new 1193,1413)
prev 1143,1230 -- 14595211 1217,1143 (no overlap to new 1193,1413)
prev 1143,1230 -- 13147242 1213,1143 (no overlap to new 1193,1413)
prev 1143,1230 -- 22697521 1230,1145 (no overlap to new 1193,1413)
prev 1143,1230 -- 4687598 1145,1220 (no overlap to new 1193,1413)
Creating new unitig 8643670 with 24 fragments
Creating new unitig 8643671 with 8 fragments
Creating new unitig 8643672 with 1 fragments
unitig 2494 interval 0 0,16451 good
unitig 2494 interval 1 16451,16502 bad
unitig 2494 interval 2 16503,48319 good
Creating new unitig 8643673 with 265 fragments
Creating new unitig 8643674 with 2 fragments
Creating new unitig 8643675 with 497 fragments
unitig 2504 interval 0 0,18635 good
unitig 2504 interval 1 18635,18769 bad
unitig 2504 interval 2 18770,26279 good
Creating new unitig 8643676 with 295 fragments
Creating new unitig 8643677 with 2 fragments
splitUnitigs: MultiAlignUnitig.C:469: int unitigConsensus::computePositionFromParent(bool): Assertion `cnspos[tiid].bgn < cnspos[tiid].end' failed.
Failed with 'Aborted'

Backtrace (mangled):

/home/software/wgs1/Linux-amd64/bin/splitUnitigs(_Z17AS_UTL_catchCrashiP7siginfoPv+0x23)[0x410a13]
/lib64/libpthread.so.0(+0xfd00)[0x7f55fe5bdd00]
/lib64/libc.so.6(gsignal+0x35)[0x7f55fe252d95]
/lib64/libc.so.6(abort+0x17b)[0x7f55fe2542ab]
/lib64/libc.so.6(+0x2d8fe)[0x7f55fe24b8fe]
/lib64/libc.so.6(+0x2d9a2)[0x7f55fe24b9a2]
/home/software/wgs1/Linux-amd64/bin/splitUnitigs[0x42bf9f]
/home/software/wgs1/Linux-amd64/bin/splitUnitigs(_Z16MultiAlignUnitigP11MultiAlignTP7gkStoreP11CNS_OptionsPi+0xf0)[0x42fa10]
/home/software/wgs1/Linux-amd64/bin/splitUnitigs(main+0x28bd)[0x40c78d]
/lib64/libc.so.6(__libc_start_main+0xed)[0x7f55fe23f23d]
/home/software/wgs1/Linux-amd64/bin/splitUnitigs[0x40cbf9]

Backtrace (demangled):

[0] /home/software/wgs1/Linux-amd64/bin/splitUnitigs::AS_UTL_catchCrash(int, siginfo*, void*) + 0x23 [0x410a13]
[1] /lib64/libpthread.so.0::(null) + 0xfd00 [0x7f55fe5bdd00]
[2] /lib64/libc.so.6::(null) + 0x35 [0x7f55fe252d95]
[3] /lib64/libc.so.6::(null) + 0x17b [0x7f55fe2542ab]
[4] /lib64/libc.so.6::(null) + 0x2d8fe [0x7f55fe24b8fe]
[5] /lib64/libc.so.6::(null) + 0x2d9a2 [0x7f55fe24b9a2]
[6] /home/software/wgs1/Linux-amd64/bin/splitUnitigs() [0x42bf9f]
[7] /home/software/wgs1/Linux-amd64/bin/splitUnitigs::MultiAlignUnitig(MultiAlignT*, gkStore*, CNS_Options*, int*) + 0xf0 [0x42fa10]
[8] /home/software/wgs1/Linux-amd64/bin/splitUnitigs::(null) + 0x28bd [0x40c78d]
[9] /lib64/libc.so.6::(null) + 0xed [0x7f55fe23f23d]
[10] /home/software/wgs1/Linux-amd64/bin/splitUnitigs() [0x40cbf9]

GDB:
------------------

After seeing that someone else had a splitUnitigs problem, I installed build 4371 and restarted. So far it has run splitUnitigs for 24 hrs and is currently working on unitig 35551 out of 9254397.
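To put that progress figure in perspective, here is a rough back-of-the-envelope sketch. It assumes (and this is only an assumption) that the unitig number reported is a sequential progress counter and that every unitig takes about the same time on average; in practice unitig sizes vary widely, so treat the result as an order-of-magnitude guide. The function name estimate_remaining_days is made up for this illustration.

```python
def estimate_remaining_days(done, total, elapsed_hours):
    """Linearly extrapolate the remaining run time, in days.

    Assumes a constant average processing rate per unitig, which is a
    simplification: real unitigs differ greatly in size and cost.
    """
    if done <= 0 or elapsed_hours <= 0:
        raise ValueError("need positive progress and elapsed time")
    rate_per_hour = done / elapsed_hours          # unitigs processed per hour
    remaining_hours = (total - done) / rate_per_hour
    return remaining_hours / 24.0


# Figures taken from the message: unitig 35551 of 9254397 after 24 hrs.
days = estimate_remaining_days(35551, 9254397, 24.0)
print(f"projected remaining time: about {days:.0f} days")
```

Under those assumptions the projection is on the order of 260 days, which suggests the step will not finish in a practical time frame at the current rate.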

The Illumina jump read parameters are:

forceBOGunitigger=0
isNotRandom=0
doNotTrustHomopolymerRuns=0
doTrim_initialNone=0
doTrim_initialMerBased=0
doTrim_initialFlowBased=0
doTrim_initialQualityBased=0
doRemoveDuplicateReads=0
doTrim_finalLargestCovered=1
doTrim_finalEvidenceBased=0
doTrim_finalBestEdge=0
doRemoveSpurReads=1
doRemoveChimericReads=1
doConsensusCorrection=1
forceShortReadFormat=1
constantInsertSize=0
fastqQualityValues=sanger
fastqOrientation=innie

The corrPacBio parameters are:

forceBOGunitigger=0
isNotRandom=0
doNotTrustHomopolymerRuns=0
doTrim_initialNone=0
doTrim_initialMerBased=0
doTrim_initialFlowBased=0
doTrim_initialQualityBased=0
doRemoveDuplicateReads=0
doTrim_finalLargestCovered=0
doTrim_finalEvidenceBased=1
doRemoveSpurReads=1
doRemoveChimericReads=1
doConsensusCorrection=1
forceShortReadFormat=0
constantInsertSize=0
fastqQualityValues=sanger
fastqOrientation=innie

I have declared bogart as the unitigger. I also set "doFragmentCorrection=0" and "doOverlapBasedTrimming = 0" because the Illumina data had already been cleaned, and I assumed the Illumina-based correction of the PacBio reads already served as error correction. Is this leading to false joins that the unitigger is identifying and having to correct by splitting?

Thanks for any input.

Geoff
___________________________________
Geoffrey C. Waldbieser
Research Molecular Biologist
Warmwater Aquaculture Research Unit
Agricultural Research Service
United States Department of Agriculture
Stoneville, MS 38776
(662) 686-3593

This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.
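
For anyone comparing the two library parameter sets quoted in the message, the differences are easy to miss by eye. The sketch below is only an illustration: it parses the two key=value lists exactly as quoted and reports the keys whose values differ or that appear in only one list. The helper names parse_params and diff_params are invented for this example and are not part of Celera Assembler.

```python
def parse_params(text):
    """Turn a whitespace-separated list of key=value tokens into a dict."""
    return dict(token.split("=", 1) for token in text.split())


def diff_params(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys that differ.

    A key missing from one side is reported with None for that side.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Parameter lists copied from the message.
illumina = parse_params(
    "forceBOGunitigger=0 isNotRandom=0 doNotTrustHomopolymerRuns=0 "
    "doTrim_initialNone=0 doTrim_initialMerBased=0 doTrim_initialFlowBased=0 "
    "doTrim_initialQualityBased=0 doRemoveDuplicateReads=0 "
    "doTrim_finalLargestCovered=1 doTrim_finalEvidenceBased=0 "
    "doTrim_finalBestEdge=0 doRemoveSpurReads=1 doRemoveChimericReads=1 "
    "doConsensusCorrection=1 forceShortReadFormat=1 constantInsertSize=0 "
    "fastqQualityValues=sanger fastqOrientation=innie"
)
pacbio = parse_params(
    "forceBOGunitigger=0 isNotRandom=0 doNotTrustHomopolymerRuns=0 "
    "doTrim_initialNone=0 doTrim_initialMerBased=0 doTrim_initialFlowBased=0 "
    "doTrim_initialQualityBased=0 doRemoveDuplicateReads=0 "
    "doTrim_finalLargestCovered=0 doTrim_finalEvidenceBased=1 "
    "doRemoveSpurReads=1 doRemoveChimericReads=1 doConsensusCorrection=1 "
    "forceShortReadFormat=0 constantInsertSize=0 fastqQualityValues=sanger "
    "fastqOrientation=innie"
)

for key, (ill, pb) in diff_params(illumina, pacbio).items():
    print(f"{key}: Illumina={ill} corrPacBio={pb}")
```

The two libraries differ only in the final trimming mode (doTrim_finalLargestCovered vs. doTrim_finalEvidenceBased), in forceShortReadFormat, and in doTrim_finalBestEdge, which is listed for the Illumina library only.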