From: Hornung, B. <bas...@wu...> - 2013-08-29 06:54:09
Hi Brian,

thanks again. Somehow the output is beginning to make less sense now. I've recompiled the assembler with the debug option (I don't see any line numbers mentioned anywhere, though), and suddenly I get:

- an assembly without problems with the PacBio data
- a failure in gatekeeper (while using runCA...???) if I try to run the mate pair data
- a failure at read deduplication if I run it with the PacBio and mate pair data

I use the .frg files that I produced with the stable version, in case that makes a difference.

For the gatekeeper error:

bastian@SSB13002:~/Tools/wgs-svn/Linux-amd64/bin$ ./runCA -d test_dir -p test_prefix /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/mate_pair.frg
----------------------------------------START Thu Aug 29 08:32:23 2013
/home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper -o /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test_dir/test_prefix.gkpStore.BUILDING -T -F /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/mate_pair.frg > /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test_dir/test_prefix.gkpStore.err 2>&1
----------------------------------------END Thu Aug 29 08:32:55 2013 (32 seconds)
ERROR: Failed with signal ABRT (6)
================================================================================

runCA failed.

----------------------------------------
Stack trace:

 at ./runCA line 1418
 main::caFailure('gatekeeper failed', '/home/bastian/Tools/wgs-svn/Linux-amd64/bin/test_dir/test_pre...') called at ./runCA line 1741
 main::preoverlap('/home/bastian/Tools/wgs-7.0/Linux-amd64/bin/mate_pair.frg') called at ./runCA line 6116

----------------------------------------
Last few lines of the relevant log file (/home/bastian/Tools/wgs-svn/Linux-amd64/bin/test_dir/test_prefix.gkpStore.err):

[0] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper::AS_UTL_catchCrash(int, siginfo*, void*) + 0x34 [0x435c9e]
[1] /lib/x86_64-linux-gnu/libpthread.so.0::(null) + 0xfcb0 [0x7fc5763c2cb0]
[2] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x35 [0x7fc57602a425]
[3] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x17b [0x7fc57602db8b]
[4] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2f0ee [0x7fc5760230ee]
[5] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2f192 [0x7fc576023192]
[6] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper::gkStore::gkStore_computeRanges(unsigned int, unsigned int, long&, long&, long&, long&, long&, long&, long&, long&, long&) + 0x101 [0x4649dd]
[7] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper::gkStream::reset(unsigned int, unsigned int) + 0x244 [0x45fb24]
[8] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper::gkStream::gkStream(gkStore*, unsigned int, unsigned int, unsigned int) + 0xc4 [0x45f850]
[9] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper::gkStoreStats::init(gkStore*) + 0x5a [0x45d844]
[10] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper::gkStoreStats::gkStoreStats(gkStore*) + 0x23 [0x45d7e7]
[11] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper::AS_GKP_summarizeErrors(char*) + 0x79 [0x42aaa2]
[12] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper::(null) + 0x1bd3 [0x40fed4]
[13] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0xed [0x7fc57601576d]
[14] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper() [0x40a9b9]

GDB:

Aborted (core dumped)

----------------------------------------
Failure message:

gatekeeper failed

The related gkpStore.err and gatekeeper log are attached; the related errorLog is empty.

Permissions (no .gkpStore folder was created):

bastian@SSB13002:~/Tools/wgs-svn/Linux-amd64/bin$ ls -l test_dir/
total 16
-rw-rw-r-- 1 bastian bastian 3391 Aug 29 08:40 gatekeeper_failure_test_prefix.gkpStore.err
drwxr-xr-x 2 bastian bastian 4096 Aug 29 08:32 runCA-logs
drwxrwxr-x 2 bastian bastian 4096 Aug 29 08:32 test_prefix.gkpStore.BUILDING
-rw-rw-r-- 1 bastian bastian    0 Aug 29 08:32 test_prefix.gkpStore.BUILDING.errorLog
-rw-rw-r-- 1 bastian bastian    0 Aug 29 08:32 test_prefix.gkpStore.BUILDING.fastqUIDmap
-rw-rw-r-- 1 bastian bastian 3416 Aug 29 08:40 test_prefix.gkpStore.err~

The related data for the error when I run both datasets combined:

The command line output is attached. The related gkpStore.err and deduplicate.err are attached; the related errorLog is empty.

Permissions:

bastian@SSB13002:~/Tools/wgs-svn/Linux-amd64/bin/test_dir_combined$ ls -l *gkpStore
total 9800
-rw-rw-r-- 1 bastian bastian   14800 Aug 29 08:44 clr-NORMAL-01-CLR
-rw-rw-r-- 1 bastian bastian      20 Aug 29 08:44 f2p
-rw-rw-r-- 1 bastian bastian  177632 Aug 29 08:44 fnm
-rw-rw-r-- 1 bastian bastian      80 Aug 29 08:44 fpk
-rw-rw-r-- 1 bastian bastian      80 Aug 29 08:44 fsb
-rw-rw-r-- 1 bastian bastian     144 Aug 29 08:44 inf
-rw-rw-r-- 1 bastian bastian     512 Aug 29 08:44 lib
-rw-rw-r-- 1 bastian bastian      80 Aug 29 08:44 plc
-rw-rw-r-- 1 bastian bastian 7723868 Aug 29 08:44 qnm
-rw-rw-r-- 1 bastian bastian      80 Aug 29 08:44 qpk
-rw-rw-r-- 1 bastian bastian      80 Aug 29 08:44 qsb
-rw-rw-r-- 1 bastian bastian 1949980 Aug 29 08:44 snm
-rw-rw-r-- 1 bastian bastian      80 Aug 29 08:44 ssb
-rw-rw-r-- 1 bastian bastian  118452 Aug 29 08:44 u2i
-rw-rw-r-- 1 bastian bastian     115 Aug 29 08:44 uid

This is getting a bit strange now, and I'm not sure I'm being really helpful at the moment. I still hope to get some more ideas. Thanks already.

Bastian
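The crash reports in this thread all use the same frame layout (`[N] path::symbol + offset [address]`), so the interesting frame can be picked out mechanically. The following is a minimal Python sketch, not part of the assembler: the regular expression and the helper name `first_named_frame` are assumptions based only on the frames quoted in this thread. It returns the innermost frame that belongs to the crashing binary and has a resolved symbol, skipping the `AS_UTL_catchCrash` handler and the libc/libpthread frames.

```python
import re

# Assumed frame layout, taken from the traces quoted in this thread:
#   [6] /path/to/bin/gatekeeper::gkStore::someFunction(args) + 0x101 [0x4649dd]
FRAME = re.compile(
    r"\[(\d+)\]\s+(\S+?)::(.+?)\s+\+\s+0x[0-9a-fA-F]+\s+\[0x[0-9a-fA-F]+\]"
)

def first_named_frame(trace, binary_name):
    """Return (frame index, symbol) of the innermost frame inside
    `binary_name` that has a resolved symbol, or None if there is none."""
    for match in FRAME.finditer(trace):
        index = int(match.group(1))
        path = match.group(2)
        symbol = match.group(3)
        # Skip frames from shared libraries such as libc or libpthread.
        if not path.endswith("/" + binary_name):
            continue
        # Skip unresolved symbols and the crash handler itself.
        if symbol.startswith("(null)") or symbol.startswith("AS_UTL_catchCrash"):
            continue
        return index, symbol
    return None

# Abbreviated sample built from three frames of the gatekeeper trace above.
sample = (
    "[0] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper::"
    "AS_UTL_catchCrash(int, siginfo*, void*) + 0x34 [0x435c9e]\n"
    "[2] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x35 [0x7fc57602a425]\n"
    "[7] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/gatekeeper::"
    "gkStream::reset(unsigned int, unsigned int) + 0x244 [0x45fb24]\n"
)
print(first_named_frame(sample, "gatekeeper"))
```

Frames without a `::symbol` part, such as `deduplicate() [0x40ea54]`, do not match the pattern and are ignored, so for the deduplicate trace the sketch would report the `gkStore::gkStore_delFragment` frame rather than the unnamed frame just above it.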
From: Decker, J. E. (MU-Student) <je...@ma...> - 2013-08-29 02:44:23
Diego,

First I would run this:

gatekeeper -dumpfrg PROJECT.gkpStore | grep -v 'No source' > PROJECT.frg

In your case, it looks like SE-MT8 is your PROJECT prefix.

Then run

cavalidate PROJECT

with your PROJECT prefix. This will create a bank that you can analyze with FRCurve and Hawkeye.

If I am off base, hopefully one of the other list serve members can straighten me out. :-)

Thanks,
Jared

Jared Decker
Assistant Professor, Beef Genetics Extension and Computational Genomics
Division of Animal Sciences
University of Missouri
S132B ASRC
920 East Campus Dr.
Columbia, MO 65211
Phone 573-882-2504
http://www.linkedin.com/in/jarededecker

From: diego [mailto:die...@gm...]
Sent: Wednesday, August 28, 2013 5:30 PM
To: wgs...@li...
Subject: [wgs-assembler-users] problem with celera 7.0 and amos 3.1.0
From: diego <die...@gm...> - 2013-08-28 22:30:01
Hi,

I'm trying to visualize my Celera assembly with Hawkeye, but I get an error when I use the toAmos script to parse my .asm file. I tried two scripts, "toAmos" and "toAmos_new", but I get similar errors.

Error with toAmos:

$ toAmos -f ../C28.frg -a SE-MT8.asm -o - | bank-transact -m - -b example.bnk -c
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133168.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133175.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133182.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133189.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133196.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133203.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133210.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133217.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133224.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133231.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133238.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133245.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133252.
Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133259.
..

I checked line 1274 of the toAmos script, and it contains the following statement:

    $seq_range{$iid} = $clrstr;

The error arises from the previous line,

    my $iid = $seqids{$acc};

where $seqids{$acc} is undefined. I noticed that $seqids is filled in the sub "parseFrgFile", in the following code:

    if ($type eq "FRG") {
        my $id = getCAId($$fields{acc});
        my $iid = $minSeqId++;
        my $nm = $$fields{src};
        my @lines = split('\n', $nm);
        $nm = $lines[0]; # join('', @lines);
        if ($byaccession || !defined $nm || $nm =~ /^\s*$/) {
            $seqnames{$iid} = $id;
        } else {
            $seqnames{$iid} = $nm;
            $seqids{$nm} = $iid;
        }
        $seqids{$id} = $iid;

but there isn't any FRG string in the .asm file.

Error with toAmos_new:

$ toAmos_new ../data/trimmomatic_outputs/Vpkx_unmated.frg -a SE-MP-3.4.asm -b Vpkx.bank
Error fragments 110000762732 are not defined
Error fragments 110000596019 are not defined
Error fragments 120000810529 are not defined
Error fragments 200001924304 are not defined
Error fragments 200001469500 are not defined
Error fragments 200001648709 are not defined
Error fragments 110000674424 are not defined
Error fragments 110001085229 are not defined
Error fragments 200001936657 are not defined
Error fragments 110001088561 are not defined
Error fragments 120001030615 are not defined
Error fragments 120000286346 are not defined
....

I believe this is similar to the previous error and is also due to the absence of the FRG string in the .asm output.

Despite this, I can get the FASTA file with the contigs and scaffolds of my assembly.

First, I used the sff_extract script to get FASTQ files, then I converted these files to Celera inputs with fastqToCA, and finally I ran Celera.

Could someone help me with this, please? I need to transform my Celera outputs into an AMOS bank to analyze them with Hawkeye.

Thanks in advance!

PS: sorry for my English.

Diego Díaz.
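One quick check that follows from the analysis above is whether the .frg file handed to toAmos contains any FRG records at all, since the quoted `parseFrgFile` code only fills `%seqids` when it sees one. The following is a minimal Python sketch under a stated assumption: that every record in a Celera message file opens with a line of the form `{TYPE` (three capital letters), which is what the `$type eq "FRG"` test suggests. The function name `count_record_types` and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumption: each record starts with a line such as "{FRG" or "{LIB".
RECORD_START = re.compile(r"^\{([A-Z]{3})\s*$")

def count_record_types(lines):
    """Count how many records of each three-letter type appear in `lines`."""
    counts = Counter()
    for line in lines:
        match = RECORD_START.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical file content with a library record but no fragment records.
sample = ["{VER\n", "ver:2\n", "}\n", "{LIB\n", "act:A\n", "acc:lib1\n", "}\n"]
counts = count_record_types(sample)
print("FRG records:", counts.get("FRG", 0))
```

To run it on a real file, pass an open file handle, for example `count_record_types(open("C28.frg"))`. If the FRG count is zero, a lookup table built only from FRG records would stay empty, which would be consistent with the "uninitialized value" and "not defined" messages above; this is a plausible reading, not a confirmed diagnosis.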
From: Walenz, B. <bw...@jc...> - 2013-08-28 14:45:01
|
It seems to have made it further. It no longer complains about no overlaps, but seems to be failing an assert trying to delete a read. I've never seen that before. Was this a full restart with the new code? Can you post a bit more of the err log, and 'ls -l *gkpStore' (to check for permissions and extraneous files). The most information will be to recompile with debug (cd wgs-svn/src ; rm -rf ../Linux-amd64 ; gmake BUILDDEBUG=1) and rerun. This will populate the crash report with line numbers. b On 8/28/13 7:50 AM, "Hornung, Bastian" <bas...@wu...> wrote: > Sorry, should probably go over the mailing list. > > Thanks for the further help Brian. > Forgot to "make install" (should've read the readme). > It doesn't go further though, just more details in the command line: > > ----------------------------------------START Wed Aug 28 13:43:07 2013 > /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate \ > -gkp /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/test.gkpStore \ > -ovs > /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.obtStore \ > -ovs > /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.dupStore \ > -report > /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.deduplicat > e.log \ > -summary > /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.deduplicat > e.summary \ >> /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.deduplica >> te.err 2>&1 > ----------------------------------------END Wed Aug 28 13:43:09 2013 (2 > seconds) > ERROR: Failed with signal ABRT (6) > ============================================================================== > == > > runCA failed. 
> > ---------------------------------------- > Stack trace: > > at ./runCA line 1418 > main::caFailure('failed to deduplicate the reads', > '/home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptri...') called at > ./runCA line 3911 > main::overlapTrim() called at ./runCA line 6118 > > ---------------------------------------- > Last few lines of the relevant log file > (/home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.deduplica > te.err): > > > Backtrace (demangled): > > [0] > /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate::AS_UTL_catchCrash(int > , siginfo*, void*) + 0x24 [0x408824] > [1] /lib/x86_64-linux-gnu/libpthread.so.0::(null) + 0xfcb0 [0x7fe93d576cb0] > [2] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x35 [0x7fe93d1de425] > [3] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x17b [0x7fe93d1e1b8b] > [4] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2f0ee [0x7fe93d1d70ee] > [5] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2f192 [0x7fe93d1d7192] > [6] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate() [0x40ea54] > [7] > /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate::gkStore::gkStore_delF > ragment(unsigned int, bool) + 0xd8 [0x40f678] > [8] > /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate::deleteFragments(gkSto > re*, fragT*) + 0x49 [0x406de9] > [9] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate::(null) + 0xa1c > [0x40553c] > [10] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0xed [0x7fe93d1c976d] > [11] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate() [0x4055a9] > > GDB: > > > Aborted (core dumped) > > ---------------------------------------- > Failure message: > > failed to deduplicate the reads > > > Seems that it has nothing to do with the PacBio data, but the mate pair data > causes the crash. > You mentioned that this could mean that there could just not be any duplicated > reads...er...that's good for the library, isn't it? 
> > Thanks for any further thoughts, > > Bastian > > ________________________________________ > From: Walenz, Brian [bw...@jc...] > Sent: Wednesday, August 28, 2013 12:22 PM > To: Hornung, Bastian > Subject: Re: [wgs-assembler-users] Problem with the overlapper, > "AS_OVS_openBinaryOverlapFile()-- Failed to open..." > > I'm guessing this indicates two problems: > > 1) you don't have kmer installed. > > 2) AS_MER_meryl.cc (an obsolete version that compiles if kmer isn't > installed) is broken. > > http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Check_ou > t_and_Compile > > > On 8/28/13 3:25 AM, "Hornung, Bastian" <bas...@wu...> wrote: > >> Hi Brian, >> >> t runs with both the suggested options, thanks. >> >> I've also tried the latest unstable version, and it doesn't compile (...and >> my >> C++ skills are underwhelming, can't fix it myself...). >> ############################## AS_MER ############################## >> make[1]: *** No rule to make target `AS_global.h', needed by >> `AS_MER_meryl.o'. >> Stop. >> make: *** [objs] Error 1 >> >> I guess I'll now have to do some quality checking... >> >> thanks for the help, >> >> Bastian >> >> ________________________________________ >> From: Walenz, Brian [bw...@jc...] >> Sent: Tuesday, August 27, 2013 5:57 PM >> To: Hornung, Bastian; wgs...@li... >> Subject: RE: [wgs-assembler-users] Problem with the overlapper, >> "AS_OVS_openBinaryOverlapFile()-- Failed to open..." >> >> Hi- >> >> It is complaining that there are no overlaps in the 'dupStore'. This could >> mean store creation failed, but it probably just means there are no duplicate >> reads. >> >> Can you upgrade to the latest 'unstable' version in subversion and rerun? If >> it still fails, I should be able to fix easily. I think it might be fixed >> already. >> >> Alternatively, you can disable all trimming (doOBT=0) or just dedupe >> (doDeDuplication=0). Odd that it worked with just the pacbio! 
>> >> b >> >> >> ________________________________________ >> From: Hornung, Bastian [bas...@wu...] >> Sent: Tuesday, August 27, 2013 7:16 AM >> To: wgs...@li... >> Subject: [wgs-assembler-users] Problem with the overlapper, >> "AS_OVS_openBinaryOverlapFile()-- Failed to open..." >> >> Hi @all, >> >> sorry if that's the wrong way, but I have a problem with the wgs assembler, >> and not really an idea if there's a bug or if I'm just dumb and doing >> something wrong (first time I use it). >> >> I'm trying to run an assembly with PacBio short reads and illumina mate pair >> reads, and during the assembly runCA aborts, due to a failure in the >> overlapper. >> I've converted my data to 2 .frg files, then I've run gatekeeper with both, >> which worked without errors (well...sort of, had to compile it myself, due to >> https://sourceforge.net/p/wgs-assembler/bugs/253/ ). >> Then I run runCA, which then crashes because it can't deduplicate the reads. >> >> Parts of the command line output: >> >> ----------------------------------------START Tue Aug 27 12:58:34 2013 >> /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/deduplicate \ >> -gkp >> /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/PacBio_an>> d >> _mate_pair.gkpStore \ >> -ovs >> /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlap>> t >> rim/PacBio_and_mate_pair.obtStore \ >> -ovs >> /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlap>> t >> rim/PacBio_and_mate_pair.dupStore \ >> -report >> /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlap>> t >> rim/PacBio_and_mate_pair.deduplicate.log \ >> -summary >> /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlap>> t >> rim/PacBio_and_mate_pair.deduplicate.summary \ >>> /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overla>>> p >>> trim/PacBio_and_mate_pair.deduplicate.err 2>&1 >> ----------------------------------------END Tue Aug 
27 12:58:34 2013 (0 >> seconds) >> ERROR: Failed with signal HUP (1) >> =============================================================================>> = >> == >> >> runCA failed. >> >> ---------------------------------------- >> Stack trace: >> >> at ./runCA line 1237 >> main::caFailure('failed to deduplicate the reads', >> '/home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkSto...') called at >> ./runCA line 3739 >> main::overlapTrim() called at ./runCA line 5876 >> >> ---------------------------------------- >> Last few lines of the relevant log file >> (/home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overla>> p >> trim/PacBio_and_mate_pair.deduplicate.err): >> >> AS_OVS_openBinaryOverlapFile()-- Failed to open >> '/home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overla>> p >> trim/PacBio_and_mate_pair.dupStore/0001' for reading: No such file or >> directory >> >> ---------------------------------------- >> Failure message: >> >> failed to deduplicate the reads >> >> >> I have absolutely no clue what could be happening here. >> If I run it with only the PacBio data, then it works, so the installation >> itself is okay. >> >> Any advice? >> >> Best regards, >> >> Bastian |
From: Hornung, B. <bas...@wu...> - 2013-08-28 11:51:07
|
Sorry, should probably go over the mailing list. Thanks for the further help Brian. Forgot to "make install" (should've read the readme). It doesn't go further though, just more details in the command line: ----------------------------------------START Wed Aug 28 13:43:07 2013 /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate \ -gkp /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/test.gkpStore \ -ovs /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.obtStore \ -ovs /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.dupStore \ -report /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.deduplicate.log \ -summary /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.deduplicate.summary \ > /home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.deduplicate.err 2>&1 ----------------------------------------END Wed Aug 28 13:43:09 2013 (2 seconds) ERROR: Failed with signal ABRT (6) ================================================================================ runCA failed. 
---------------------------------------- Stack trace: at ./runCA line 1418 main::caFailure('failed to deduplicate the reads', '/home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptri...') called at ./runCA line 3911 main::overlapTrim() called at ./runCA line 6118 ---------------------------------------- Last few lines of the relevant log file (/home/bastian/Tools/wgs-svn/Linux-amd64/bin/test/0-overlaptrim/test.deduplicate.err): Backtrace (demangled): [0] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate::AS_UTL_catchCrash(int, siginfo*, void*) + 0x24 [0x408824] [1] /lib/x86_64-linux-gnu/libpthread.so.0::(null) + 0xfcb0 [0x7fe93d576cb0] [2] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x35 [0x7fe93d1de425] [3] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x17b [0x7fe93d1e1b8b] [4] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2f0ee [0x7fe93d1d70ee] [5] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2f192 [0x7fe93d1d7192] [6] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate() [0x40ea54] [7] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate::gkStore::gkStore_delFragment(unsigned int, bool) + 0xd8 [0x40f678] [8] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate::deleteFragments(gkStore*, fragT*) + 0x49 [0x406de9] [9] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate::(null) + 0xa1c [0x40553c] [10] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0xed [0x7fe93d1c976d] [11] /home/bastian/Tools/wgs-svn/Linux-amd64/bin/deduplicate() [0x4055a9] GDB: Aborted (core dumped) ---------------------------------------- Failure message: failed to deduplicate the reads Seems that it has nothing to do with the PacBio data, but the mate pair data causes the crash. You mentioned that this could mean that there could just not be any duplicated reads...er...that's good for the library, isn't it? Thanks for any further thoughts, Bastian ________________________________________ From: Walenz, Brian [bw...@jc...] 
Sent: Wednesday, August 28, 2013 12:22 PM To: Hornung, Bastian Subject: Re: [wgs-assembler-users] Problem with the overlapper, "AS_OVS_openBinaryOverlapFile()-- Failed to open..." I'm guessing this indicates two problems: 1) you don't have kmer installed. 2) AS_MER_meryl.cc (an obsolete version that compiles if kmer isn't installed) is broken. http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Check_out_and_Compile On 8/28/13 3:25 AM, "Hornung, Bastian" <bas...@wu...> wrote: > Hi Brian, > > It runs with both the suggested options, thanks. > > I've also tried the latest unstable version, and it doesn't compile (...and my > C++ skills are underwhelming, can't fix it myself...). > ############################## AS_MER ############################## > make[1]: *** No rule to make target `AS_global.h', needed by `AS_MER_meryl.o'. > Stop. > make: *** [objs] Error 1 > > I guess I'll now have to do some quality checking... > > thanks for the help, > > Bastian |
From: Walenz, B. <bw...@jc...> - 2013-08-27 15:58:08
|
Hi- It is complaining that there are no overlaps in the 'dupStore'. This could mean store creation failed, but it probably just means there are no duplicate reads. Can you upgrade to the latest 'unstable' version in subversion and rerun? If it still fails, I should be able to fix easily. I think it might be fixed already. Alternatively, you can disable all trimming (doOBT=0) or just dedupe (doDeDuplication=0). Odd that it worked with just the pacbio! b |
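The dedupe switch named in the reply above can be handed to runCA through a spec file. A minimal sketch, assuming the usual runCA spec-file layout of one option=value per line; the file name nodedup.spec and the directory, prefix and .frg names in the comment are placeholders, not values confirmed anywhere in this thread:

```shell
# Write a small spec file that turns off only the deduplication step.
# (Per the reply above, doOBT=0 would instead skip all trimming.)
cat > nodedup.spec <<'EOF'
doDeDuplication=0
EOF

# Rerun with the spec file (not executed here; all names are placeholders):
#   runCA -d test_dir -p test_prefix -s nodedup.spec mate_pair.frg
```

Keeping the option in a spec file rather than on the command line leaves a record of how the rerun differed from the failing run.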
From: Hornung, B. <bas...@wu...> - 2013-08-27 11:32:01
|
Hi @all, sorry if that's the wrong way, but I have a problem with the wgs assembler, and not really an idea if there's a bug or if I'm just dumb and doing something wrong (first time I use it). I'm trying to run an assembly with PacBio short reads and illumina mate pair reads, and during the assembly runCA aborts, due to a failure in the overlapper. I've converted my data to 2 .frg files, then I've run gatekeeper with both, which worked without errors (well...sort of, had to compile it myself, due to https://sourceforge.net/p/wgs-assembler/bugs/253/ ). Then I run runCA, which then crashes because it can't deduplicate the reads. Parts of the command line output: ----------------------------------------START Tue Aug 27 12:58:34 2013 /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/deduplicate \ -gkp /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/PacBio_and_mate_pair.gkpStore \ -ovs /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlaptrim/PacBio_and_mate_pair.obtStore \ -ovs /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlaptrim/PacBio_and_mate_pair.dupStore \ -report /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlaptrim/PacBio_and_mate_pair.deduplicate.log \ -summary /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlaptrim/PacBio_and_mate_pair.deduplicate.summary \ > /home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlaptrim/PacBio_and_mate_pair.deduplicate.err 2>&1 ----------------------------------------END Tue Aug 27 12:58:34 2013 (0 seconds) ERROR: Failed with signal HUP (1) ================================================================================ runCA failed. 
---------------------------------------- Stack trace: at ./runCA line 1237 main::caFailure('failed to deduplicate the reads', '/home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkSto...') called at ./runCA line 3739 main::overlapTrim() called at ./runCA line 5876 ---------------------------------------- Last few lines of the relevant log file (/home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlaptrim/PacBio_and_mate_pair.deduplicate.err): AS_OVS_openBinaryOverlapFile()-- Failed to open '/home/bastian/Tools/wgs-7.0/Linux-amd64/bin/test2.gpkStore/assembly/0-overlaptrim/PacBio_and_mate_pair.dupStore/0001' for reading: No such file or directory ---------------------------------------- Failure message: failed to deduplicate the reads I have absolutely no clue what could be happening here. If I run it with only the PacBio data, then it works, so the installation itself is okay. Any advice? Best regards, Bastian |
From: Walenz, B. <bw...@jc...> - 2013-08-16 03:15:30
|
Hi, Geoff, Heiner- Lots of questions here, I'll try to (finally) answer them. The unitig splitter is looking for a pattern of low sequence coverage, no good mate coverage and bad mate coverage. Bad meaning misoriented, or mate read not present where it should be. The intent was to find bad joins caused by chimeric reads. With long reads present, repeats internal to the read can have either zero short read coverage (because it's a common repeat excluded from overlaps) OR can have the wrong short read placed there (because it's in a common repeat with multiple equally good placements). I can certainly see, with low long read coverage, lots and lots of incorrect splitting being done. The splitter works by marking a region in the unitig as being the chimeric junction, then moving all reads that touch that region to a new 'bad' unitig. Heiner, this is likely why you end up with long singleton reads - nothing else touched the region. You can disable splitting by creating an empty file '5-consensus-split/splitUnitigs.out'. I also just added a 'doUnitigSplitting' option to revision r4387. 'doFragmentCorrection' is a misnomer. It actually changes the error rate on overlaps, but doesn't change the reads at all. It seems to help in general, but can be quite expensive to run. We turn it off on big assemblies because of this. Possibly, increasing the various unitig error rates slightly will compensate. The degen issue is much harder, and we probably should write more on it. There are two parameters to directly influence this (astatLowBound=1 and astatHighBound=5). We've set the low bound to -20 for a very repetitive genome in the past. Logs for this are in 5-consensus-coverage-stat - scatter plotting rho (basically length) against covStat might help pick a reasonable threshold. https://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=RunCA#Scaffolder If the genome isn't huge, you can also run the 'toggler'. 
This will run an assembly to completion, then analyze unitigs and force a change to their repeat/unique label, and rerun ALL of scaffolding. https://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=RunCA#Unitig_Repeat.2FUnique_Toggling You could also, with some effort, combine all these steps. Analyze the logs in 5-consensus-coverage-stat, decide which unitigs need to be promoted to unique (which unitigs need to have a low covStat modified to a high covStat) and set the appropriate flag in the tigStore (by dumping the layout, modifying, and loading it back). b |
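The two workarounds from Brian's reply (the empty marker file and the covStat bounds) could be set up roughly as follows. This is only a sketch under assumptions: ASM_DIR stands for whatever directory was passed to runCA with -d, degen.spec is a made-up file name, and -20 is simply the value mentioned in the reply for one very repetitive genome, not a recommendation.

```shell
# ASM_DIR: the runCA assembly directory (placeholder default "asm").
ASM_DIR=${ASM_DIR:-asm}

# Create the empty marker file; per the reply, its presence disables
# unitig splitting. Create it only if absent so an existing log is kept.
mkdir -p "$ASM_DIR/5-consensus-split"
[ -e "$ASM_DIR/5-consensus-split/splitUnitigs.out" ] || \
    : > "$ASM_DIR/5-consensus-split/splitUnitigs.out"

# Spec-file fragment lowering the lower covStat bound (default quoted as 1)
# while leaving the upper bound at its quoted default of 5.
cat > degen.spec <<'EOF'
astatLowBound=-20
astatHighBound=5
EOF
```

Before settling on a bound it is probably worth looking at the covStat values in 5-consensus-coverage-stat first, as suggested in the reply.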
From: kuhl <ku...@mo...> - 2013-08-10 10:28:10
|
Hi, I am running into similar problems when combining a very long read assembly (65000 bp reads which are assembled contigs from a de Bruijn graph assembler) with short read paired end and matepair data. The assembly gets messed up by unitig splitting; some of the long reads even end up as singletons. Other contigs end up as degenerates although they are very large (>100 kbp) and as far as I can tell they are unique genomic sequences (not repeats or duplications). Is there a workaround for these problems, or should one not mix very long reads and very short read data? Best wishes, Heiner On Fri, 9 Aug 2013 20:08:50 +0000, "Waldbieser, Geoff" <Geo...@AR...> wrote: > Hi, > CA v7.0 (build assembled 3.3M pacBioToCA-corrected long reads into 146,537 > unitigs (~1Gb) in 24 hrs using 62cpu and 512GB RAM. > > $> /home/software/wgs1/Linux-amd64/bin/runCA -version > CA version CVS TIP ($Id: AS_GKP_main.c,v 1.105 2012-09-13 17:41:13 skoren > Exp $). > CA version CVS TIP ($Id: AS_CGB_unitigger.c,v 1.45 2011-09-06 02:15:18 > mkotelbajcvi Exp $). > CA version CVS TIP ($Id: BuildUnitigs.cc,v 1.88 2012-01-15 23:49:34 > brianwalenz Exp $). > Using up to 64 OpenMP threads. > CA version CVS TIP ($Id: AS_CGW_main.c,v 1.116 2012-11-15 05:04:54 > brianwalenz Exp $). > CA version CVS TIP ($Id: terminator.C,v 1.17 2012-09-10 08:58:11 > brianwalenz Exp $). > > I then added Illumina mate pairs that had been corrected, deduplicated, > and chimeric reads removed. > Library > Pairs > Illumina MP, 3kb insert 4,455,475 > Illumina MP, 8kb insert 2,834,858 > Illumina MP, 36kb insert 1,222,830 > > When these were added, CA spent a few days in splitUnitigs before failing. > > ------------------- > $> tail -60 splitUnitigs.out.FAILED > Creating new unitig 8643669 with 28 fragments > unitig 2478 interval 0 0,920 good > unitig 2478 interval 1 920,1123 bad > unitig 2478 interval 2 1124,1428 good > Fixing contains. 
> prev 1143,1230 -- 16110937 1220,1143 (no overlap to new 1193,1413) > prev 1143,1230 -- 11823198 1220,1143 (no overlap to new 1193,1413) > prev 1143,1230 -- 14595211 1217,1143 (no overlap to new 1193,1413) > prev 1143,1230 -- 13147242 1213,1143 (no overlap to new 1193,1413) > prev 1143,1230 -- 22697521 1230,1145 (no overlap to new 1193,1413) prev > 1143,1230 -- 4687598 1145,1220 (no overlap to new 1193,1413) > Creating new unitig 8643670 with 24 fragments > Creating new unitig 8643671 with 8 fragments > Creating new unitig 8643672 with 1 fragments > unitig 2494 interval 0 0,16451 good > unitig 2494 interval 1 16451,16502 bad > unitig 2494 interval 2 16503,48319 good > Creating new unitig 8643673 with 265 fragments > Creating new unitig 8643674 with 2 fragments > Creating new unitig 8643675 with 497 fragments > unitig 2504 interval 0 0,18635 good > unitig 2504 interval 1 18635,18769 bad > unitig 2504 interval 2 18770,26279 good > Creating new unitig 8643676 with 295 fragments > Creating new unitig 8643677 with 2 fragments > splitUnitigs: MultiAlignUnitig.C:469: int > unitigConsensus::computePositionFromParent(bool): Assertion > `cnspos[tiid].bgn < cnspos[tiid].end' failed. 
> > Failed with 'Aborted' > > Backtrace (mangled): > > /home/software/wgs1/Linux-amd64/bin/splitUnitigs(_Z17AS_UTL_catchCrashiP7siginfoPv+0x23)[0x410a13] > /lib64/libpthread.so.0(+0xfd00)[0x7f55fe5bdd00] > /lib64/libc.so.6(gsignal+0x35)[0x7f55fe252d95] > /lib64/libc.so.6(abort+0x17b)[0x7f55fe2542ab] > /lib64/libc.so.6(+0x2d8fe)[0x7f55fe24b8fe] > /lib64/libc.so.6(+0x2d9a2)[0x7f55fe24b9a2] > /home/software/wgs1/Linux-amd64/bin/splitUnitigs[0x42bf9f] > /home/software/wgs1/Linux-amd64/bin/splitUnitigs(_Z16MultiAlignUnitigP11MultiAlignTP7gkStoreP11CNS_OptionsPi+0xf0)[0x42fa10] > /home/software/wgs1/Linux-amd64/bin/splitUnitigs(main+0x28bd)[0x40c78d] > /lib64/libc.so.6(__libc_start_main+0xed)[0x7f55fe23f23d] > /home/software/wgs1/Linux-amd64/bin/splitUnitigs[0x40cbf9] > > Backtrace (demangled): > > [0] > /home/software/wgs1/Linux-amd64/bin/splitUnitigs::AS_UTL_catchCrash(int, > siginfo*, void*) + 0x23 [0x410a13] > [1] /lib64/libpthread.so.0::(null) + 0xfd00 [0x7f55fe5bdd00] > [2] /lib64/libc.so.6::(null) + 0x35 [0x7f55fe252d95] > [3] /lib64/libc.so.6::(null) + 0x17b [0x7f55fe2542ab] > [4] /lib64/libc.so.6::(null) + 0x2d8fe [0x7f55fe24b8fe] > [5] /lib64/libc.so.6::(null) + 0x2d9a2 [0x7f55fe24b9a2] > [6] /home/software/wgs1/Linux-amd64/bin/splitUnitigs() [0x42bf9f] > [7] > /home/software/wgs1/Linux-amd64/bin/splitUnitigs::MultiAlignUnitig(MultiAlignT*, > gkStore*, CNS_Options*, int*) + 0xf0 [0x42fa10] > [8] /home/software/wgs1/Linux-amd64/bin/splitUnitigs::(null) + 0x28bd > [0x40c78d] > [9] /lib64/libc.so.6::(null) + 0xed [0x7f55fe23f23d] > [10] /home/software/wgs1/Linux-amd64/bin/splitUnitigs() [0x40cbf9] > > GDB: > ------------------ > > After seeing that someone else had a splitUnitig problem, I installed > build 4371 and restarted. So far it has run splitUnitigs for 24 hrs and it > is currently working on unitig 35551 out of 9254397. 
> > The Illumina jump read parameters are: > forceBOGunitigger=0 > isNotRandom=0 > doNotTrustHomopolymerRuns=0 > doTrim_initialNone=0 > doTrim_initialMerBased=0 > doTrim_initialFlowBased=0 > doTrim_initialQualityBased=0 > doRemoveDuplicateReads=0 > doTrim_finalLargestCovered=1 > doTrim_finalEvidenceBased=0 > doTrim_finalBestEdge=0 > doRemoveSpurReads=1 > doRemoveChimericReads=1 > doConsensusCorrection=1 > forceShortReadFormat=1 > constantInsertSize=0 > fastqQualityValues=sanger > fastqOrientation=innie > > The corrPacBio parameters are: > forceBOGunitigger=0 > isNotRandom=0 > doNotTrustHomopolymerRuns=0 > doTrim_initialNone=0 > doTrim_initialMerBased=0 > doTrim_initialFlowBased=0 > doTrim_initialQualityBased=0 > doRemoveDuplicateReads=0 > doTrim_finalLargestCovered=0 > doTrim_finalEvidenceBased=1 > doRemoveSpurReads=1 > doRemoveChimericReads=1 > doConsensusCorrection=1 > forceShortReadFormat=0 > constantInsertSize=0 > fastqQualityValues=sanger > fastqOrientation=innie > > > I have declared bogart as the unitigger. I also set > "doFragmentCorrection=0" and "doOverlapBasedTrimming = 0" because the > Illumina data had already been cleaned and I assumed the Illumina > correction of the PacBio reads was an error correction. Is this leading to > false joins that the unitigger is identifying and having to correct by > splitting? > > Thanks for any input. > Geoff > ___________________________________ > Geoffrey C. Waldbieser > Research Molecular Biologist > Warmwater Aquaculture Research Unit > Agricultural Research Service > United States Department of Agriculture > Stoneville, MS 38776 > (662) 686-3593 > > > > > > This electronic message contains information generated by the USDA solely > for the intended recipients. Any unauthorized interception of this message > or the use or disclosure of the information it contains may violate the law > and subject the violator to civil or criminal penalties. 
If you believe you > have received this message in error, please notify the sender and delete > the email immediately.

--
---------------------------------------------------------------
Dr. Heiner Kuhl                 MPI Molecular Genetics
Tel: + 49 + 30 / 8413 1776      Next Generation Sequencing
Ihnestrasse 73                  email: ku...@mo...
D-14195 Berlin                  http://www.molgen.mpg.de/SeqCore
--------------------------------------------------------------- |
From: Waldbieser, G. <Geo...@AR...> - 2013-08-09 20:09:03
|
Hi,

CA v7.0 (build assembled 3.3M pacBioToCA-corrected long reads into 146,537 unitigs (~1Gb) in 24 hrs using 62 CPUs and 512GB RAM.

$> /home/software/wgs1/Linux-amd64/bin/runCA -version
CA version CVS TIP ($Id: AS_GKP_main.c,v 1.105 2012-09-13 17:41:13 skoren Exp $).
CA version CVS TIP ($Id: AS_CGB_unitigger.c,v 1.45 2011-09-06 02:15:18 mkotelbajcvi Exp $).
CA version CVS TIP ($Id: BuildUnitigs.cc,v 1.88 2012-01-15 23:49:34 brianwalenz Exp $).
Using up to 64 OpenMP threads.
CA version CVS TIP ($Id: AS_CGW_main.c,v 1.116 2012-11-15 05:04:54 brianwalenz Exp $).
CA version CVS TIP ($Id: terminator.C,v 1.17 2012-09-10 08:58:11 brianwalenz Exp $).

I then added Illumina mate pairs that had been corrected and deduplicated, with chimeric reads removed.

Library                     Pairs
Illumina MP, 3kb insert     4,455,475
Illumina MP, 8kb insert     2,834,858
Illumina MP, 36kb insert    1,222,830

When these were added, CA spent a few days in splitUnitigs before failing.

-------------------
$> tail -60 splitUnitigs.out.FAILED
Creating new unitig 8643669 with 28 fragments
unitig 2478 interval 0 0,920 good
unitig 2478 interval 1 920,1123 bad
unitig 2478 interval 2 1124,1428 good
Fixing contains.
prev 1143,1230 -- 16110937 1220,1143 (no overlap to new 1193,1413)
prev 1143,1230 -- 11823198 1220,1143 (no overlap to new 1193,1413)
prev 1143,1230 -- 14595211 1217,1143 (no overlap to new 1193,1413)
prev 1143,1230 -- 13147242 1213,1143 (no overlap to new 1193,1413)
prev 1143,1230 -- 22697521 1230,1145 (no overlap to new 1193,1413)
prev 1143,1230 -- 4687598 1145,1220 (no overlap to new 1193,1413)
Creating new unitig 8643670 with 24 fragments
Creating new unitig 8643671 with 8 fragments
Creating new unitig 8643672 with 1 fragments
unitig 2494 interval 0 0,16451 good
unitig 2494 interval 1 16451,16502 bad
unitig 2494 interval 2 16503,48319 good
Creating new unitig 8643673 with 265 fragments
Creating new unitig 8643674 with 2 fragments
Creating new unitig 8643675 with 497 fragments
unitig 2504 interval 0 0,18635 good
unitig 2504 interval 1 18635,18769 bad
unitig 2504 interval 2 18770,26279 good
Creating new unitig 8643676 with 295 fragments
Creating new unitig 8643677 with 2 fragments
splitUnitigs: MultiAlignUnitig.C:469: int unitigConsensus::computePositionFromParent(bool): Assertion `cnspos[tiid].bgn < cnspos[tiid].end' failed.

Failed with 'Aborted'

Backtrace (mangled):

/home/software/wgs1/Linux-amd64/bin/splitUnitigs(_Z17AS_UTL_catchCrashiP7siginfoPv+0x23)[0x410a13]
/lib64/libpthread.so.0(+0xfd00)[0x7f55fe5bdd00]
/lib64/libc.so.6(gsignal+0x35)[0x7f55fe252d95]
/lib64/libc.so.6(abort+0x17b)[0x7f55fe2542ab]
/lib64/libc.so.6(+0x2d8fe)[0x7f55fe24b8fe]
/lib64/libc.so.6(+0x2d9a2)[0x7f55fe24b9a2]
/home/software/wgs1/Linux-amd64/bin/splitUnitigs[0x42bf9f]
/home/software/wgs1/Linux-amd64/bin/splitUnitigs(_Z16MultiAlignUnitigP11MultiAlignTP7gkStoreP11CNS_OptionsPi+0xf0)[0x42fa10]
/home/software/wgs1/Linux-amd64/bin/splitUnitigs(main+0x28bd)[0x40c78d]
/lib64/libc.so.6(__libc_start_main+0xed)[0x7f55fe23f23d]
/home/software/wgs1/Linux-amd64/bin/splitUnitigs[0x40cbf9]

Backtrace (demangled):

[0] /home/software/wgs1/Linux-amd64/bin/splitUnitigs::AS_UTL_catchCrash(int, siginfo*, void*) + 0x23 [0x410a13]
[1] /lib64/libpthread.so.0::(null) + 0xfd00 [0x7f55fe5bdd00]
[2] /lib64/libc.so.6::(null) + 0x35 [0x7f55fe252d95]
[3] /lib64/libc.so.6::(null) + 0x17b [0x7f55fe2542ab]
[4] /lib64/libc.so.6::(null) + 0x2d8fe [0x7f55fe24b8fe]
[5] /lib64/libc.so.6::(null) + 0x2d9a2 [0x7f55fe24b9a2]
[6] /home/software/wgs1/Linux-amd64/bin/splitUnitigs() [0x42bf9f]
[7] /home/software/wgs1/Linux-amd64/bin/splitUnitigs::MultiAlignUnitig(MultiAlignT*, gkStore*, CNS_Options*, int*) + 0xf0 [0x42fa10]
[8] /home/software/wgs1/Linux-amd64/bin/splitUnitigs::(null) + 0x28bd [0x40c78d]
[9] /lib64/libc.so.6::(null) + 0xed [0x7f55fe23f23d]
[10] /home/software/wgs1/Linux-amd64/bin/splitUnitigs() [0x40cbf9]

GDB:
------------------

After seeing that someone else had a splitUnitigs problem, I installed build 4371 and restarted. So far it has run splitUnitigs for 24 hrs and it is currently working on unitig 35551 out of 9254397.

The Illumina jump read parameters are:
forceBOGunitigger=0
isNotRandom=0
doNotTrustHomopolymerRuns=0
doTrim_initialNone=0
doTrim_initialMerBased=0
doTrim_initialFlowBased=0
doTrim_initialQualityBased=0
doRemoveDuplicateReads=0
doTrim_finalLargestCovered=1
doTrim_finalEvidenceBased=0
doTrim_finalBestEdge=0
doRemoveSpurReads=1
doRemoveChimericReads=1
doConsensusCorrection=1
forceShortReadFormat=1
constantInsertSize=0
fastqQualityValues=sanger
fastqOrientation=innie

The corrPacBio parameters are:
forceBOGunitigger=0
isNotRandom=0
doNotTrustHomopolymerRuns=0
doTrim_initialNone=0
doTrim_initialMerBased=0
doTrim_initialFlowBased=0
doTrim_initialQualityBased=0
doRemoveDuplicateReads=0
doTrim_finalLargestCovered=0
doTrim_finalEvidenceBased=1
doRemoveSpurReads=1
doRemoveChimericReads=1
doConsensusCorrection=1
forceShortReadFormat=0
constantInsertSize=0
fastqQualityValues=sanger
fastqOrientation=innie

I have declared bogart as the unitigger. I also set "doFragmentCorrection=0" and "doOverlapBasedTrimming = 0" because the Illumina data had already been cleaned and I assumed the Illumina correction of the PacBio reads was an error correction. Is this leading to false joins that the unitigger is identifying and having to correct by splitting?

Thanks for any input.
Geoff
___________________________________
Geoffrey C. Waldbieser
Research Molecular Biologist
Warmwater Aquaculture Research Unit
Agricultural Research Service
United States Department of Agriculture
Stoneville, MS 38776
(662) 686-3593

This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately. |
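The two per-library parameter lists in Geoff's message differ in only a few flags, and a small diff makes that easier to see. This is a minimal Python sketch, assuming the flags are copied verbatim from the message; `parse_flags` and `diff_flags` are hypothetical helper names and are not part of Celera Assembler.

```python
def parse_flags(text):
    """Parse whitespace-separated key=value tokens into a dict."""
    flags = {}
    for token in text.split():
        key, _, value = token.partition("=")
        flags[key] = value
    return flags


def diff_flags(a, b):
    """Return {key: (a_value, b_value)} for keys that differ.

    A value of None means the key is absent from that list.
    """
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Flags copied from the message (Illumina jump library).
ILLUMINA_JUMP = parse_flags("""
forceBOGunitigger=0 isNotRandom=0 doNotTrustHomopolymerRuns=0
doTrim_initialNone=0 doTrim_initialMerBased=0 doTrim_initialFlowBased=0
doTrim_initialQualityBased=0 doRemoveDuplicateReads=0
doTrim_finalLargestCovered=1 doTrim_finalEvidenceBased=0
doTrim_finalBestEdge=0 doRemoveSpurReads=1 doRemoveChimericReads=1
doConsensusCorrection=1 forceShortReadFormat=1 constantInsertSize=0
fastqQualityValues=sanger fastqOrientation=innie
""")

# Flags copied from the message (corrected PacBio library).
CORR_PACBIO = parse_flags("""
forceBOGunitigger=0 isNotRandom=0 doNotTrustHomopolymerRuns=0
doTrim_initialNone=0 doTrim_initialMerBased=0 doTrim_initialFlowBased=0
doTrim_initialQualityBased=0 doRemoveDuplicateReads=0
doTrim_finalLargestCovered=0 doTrim_finalEvidenceBased=1
doRemoveSpurReads=1 doRemoveChimericReads=1
doConsensusCorrection=1 forceShortReadFormat=0 constantInsertSize=0
fastqQualityValues=sanger fastqOrientation=innie
""")

if __name__ == "__main__":
    for key, (jump, pacbio) in diff_flags(ILLUMINA_JUMP, CORR_PACBIO).items():
        print(f"{key}: jump={jump} corrPacBio={pacbio}")
```

The libraries differ in the final trimming mode (largest-covered versus evidence-based), in forceShortReadFormat, and in doTrim_finalBestEdge, which is listed only for the Illumina library.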
From: Walenz, B. <bw...@jc...> - 2013-08-06 22:53:55
|
Hi, Ole-

Quite amusingly, I hit the same assert on a bacterial test assembly this afternoon. No rhyme or reason here; we’ve run a fish or two, and I know of a few plants that have run too.

Looks like I slacked off and didn’t implement an array resize. I’ll claim that I was thinking of replacing this array with an STL vector, but likely I was just lazy and then forgot to do it.

It’s fixed in r4380.

b

On 8/6/13 4:32 AM, "Ole Kristian Tørresen" <o.k...@ib...> wrote:

Hi,

I'm getting an assertion failure while running recent code from the SVN. I saw this with both the July 31st and August 5th snapshots:

Working on unitig 258682 (0 unitigs and 2840 fragments)
utgcns: MultiAlignUnitig.C:99: int MANode2Array(MANode*, char***, int32***, int32): Assertion `0' failed.

Failed with 'Aborted'

Backtrace (mangled):

/projects/cees/bin/celera/wgs-August5_2013_pacbio/Linux-amd64/bin/utgcns(_Z17AS_UTL_catchCrashiP7siginfoPv+0x2a)[0x40dcfa]
/lib64/libpthread.so.0[0x3fefc0f500]
/lib64/libc.so.6(gsignal+0x35)[0x3fef0328a5]
/lib64/libc.so.6(abort+0x175)[0x3fef034085]
/lib64/libc.so.6[0x3fef02ba1e]
/lib64/libc.so.6(__assert_perror_fail+0x0)[0x3fef02bae0]
/projects/cees/bin/celera/wgs-August5_2013_pacbio/Linux-amd64/bin/utgcns(_ZN15unitigConsensus17generateConsensusEv+0x4fc)[0x42ddbc]
/projects/cees/bin/celera/wgs-August5_2013_pacbio/Linux-amd64/bin/utgcns(_Z16MultiAlignUnitigP11MultiAlignTP7gkStoreP11CNS_OptionsPi+0xa89)[0x42fa19]
/projects/cees/bin/celera/wgs-August5_2013_pacbio/Linux-amd64/bin/utgcns(main+0xa11)[0x409e51]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3fef01ecdd]
/projects/cees/bin/celera/wgs-August5_2013_pacbio/Linux-amd64/bin/utgcns[0x409089]

Backtrace (demangled):

[0] /projects/cees/bin/celera/wgs-August5_2013_pacbio/Linux-amd64/bin/utgcns::AS_UTL_catchCrash(int, siginfo*, void*) + 0x2a [0x40dcfa]
[1] /lib64/libpthread.so.0() [0x3fefc0f500]
[2] /lib64/libc.so.6::(null) + 0x35 [0x3fef0328a5]
[3] /lib64/libc.so.6::(null) + 0x175 [0x3fef034085]
[4] /lib64/libc.so.6() [0x3fef02ba1e]
[5] /lib64/libc.so.6::(null) + 0 [0x3fef02bae0]
[6] /projects/cees/bin/celera/wgs-August5_2013_pacbio/Linux-amd64/bin/utgcns::unitigConsensus::generateConsensus() + 0x4fc [0x42ddbc]
[7] /projects/cees/bin/celera/wgs-August5_2013_pacbio/Linux-amd64/bin/utgcns::MultiAlignUnitig(MultiAlignT*, gkStore*, CNS_Options*, int*) + 0xa89 [0x42fa19]
[8] /projects/cees/bin/celera/wgs-August5_2013_pacbio/Linux-amd64/bin/utgcns::(null) + 0xa11 [0x409e51]
[9] /lib64/libc.so.6::(null) + 0xfd [0x3fef01ecdd]
[10] /projects/cees/bin/celera/wgs-August5_2013_pacbio/Linux-amd64/bin/utgcns() [0x409089]

GDB:

If I understand this code correctly (http://sourceforge.net/p/wgs-assembler/svn/HEAD/tree/trunk/src/AS_CNS/MultiAlignUnitig.C#l98), if ir + 1 == depth you'll always hit the assertion. I guess I can run this with older code, from before June 17th...

I've attached the layout of the unitig, if that's helpful. The assembly is a combination of Illumina, 454 and error-corrected PacBio reads.

Thank you.

Ole |
From: Walenz, B. <bw...@jc...> - 2013-08-01 13:04:49
|
Hi- If only to save myself from responding to later bug reports and/or email... I renamed a ‘few’ files in wgs-assembler this morning to get rid of the ugly symlink hack during compilation. If you try to ‘svn update’ on a tree that you’ve compiled, those symlinks will greatly confuse svn. I recommend a clean checkout instead (or remove the symlinks first). Cheers! b -- Brian Walenz Senior Software Engineer J. Craig Venter Institute |
From: Francois S. <fra...@ir...> - 2013-07-24 07:22:55
|
Hi guys,

OK, it was a problem in the names of the libraries. Everything is running fine now.

Thanks for all,

Francois

--
--------------------------------------------------------
Francois Sabot, PhD

Be realistic. Demand the Impossible.
http://bioinfo.mpl.ird.fr/
http://www.mpl.ird.fr/rice
-----------------------------------------
UMR DIversity, Adaptation & DEvelopment
Centre IRD
911, Av Agropolis BP 64501
34394 Montpellier Cedex 5
France
Phone: +33 4 67 41 64 18
----------------------------------------- |
From: Serge K. <se...@um...> - 2013-07-23 14:10:56
|
Hi,

I think the library name specified to pacBioToCA is the issue. The library name given is IlluminaTog5681 and, based on the Illumina FRG file name, I am guessing the Illumina library has the same name. The correction pipeline uses library attributes to figure out which library needs to be corrected and which libraries hold the short-read data. If the two have the same library name, the pipeline gets confused, because the same library then appears to be both short-read and PacBio data. Double-check that the library name in the Illumina_Tog5681_clean.frg file is not the same as the library name specified to pacBioToCA.

Sergey |
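The name check Sergey suggests can be scripted rather than done by eye. This is a minimal Python sketch, assuming the FRG file is a version 2 file in which each library is a `{LIB ... }` record whose `acc:` line carries the library name; that layout, the helper names `frg_library_names` and `name_collides`, and the sample record are assumptions for illustration, so verify them against your own file before relying on the result.

```python
def frg_library_names(lines):
    """Return the acc: value of every {LIB ...} record, in file order.

    `lines` is any iterable of text lines, for example an open file handle,
    so large FRG files can be streamed instead of read into memory.
    """
    names = []
    in_lib = False
    for raw in lines:
        line = raw.strip()
        if line == "{LIB":
            in_lib = True
        elif line == "}":
            in_lib = False
        elif in_lib and line.startswith("acc:"):
            names.append(line[len("acc:"):])
    return names


def name_collides(pacbio_library, lines):
    """True if the name passed to pacBioToCA -l is already used in the FRG file."""
    return pacbio_library in frg_library_names(lines)


# Made-up sample record; the real contents of Illumina_Tog5681_clean.frg
# are not shown in the thread.
SAMPLE_FRG = """{VER
ver:2
}
{LIB
act:A
acc:IlluminaTog5681
ori:I
mea:300.000
std:30.000
}
"""

if __name__ == "__main__":
    print(frg_library_names(SAMPLE_FRG.splitlines()))
    print(name_collides("IlluminaTog5681", SAMPLE_FRG.splitlines()))
```

For a real run, pass an open file instead of the sample, for example `with open(path) as fh: name_collides(name, fh)`, and pick a `-l` name that the check reports as unused.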
From: Francois S. <fra...@ir...> - 2013-07-23 11:43:25
|
Hi,

The frg file /data/projects/assembling-glab/tempIlluminaTog5681/IlluminaTog5681.frg was created during the run.

The frg file /data/projects/assembling-glab/Illumina_Tog5681_clean.frg was created using fastqToCA from an Illumina file, as recommended on the wiki.

The command line was:

/home/sabotf/sources/wgs/Linux-amd64/bin/pacBioToCA -s /data/projects/assembling-glab/pacbio_SGE_PACBIODEV.spec -l IlluminaTog5681 -partitions 13 -t 6 -fastq /data/projects/assembling-glab/PacBio_Tog5681/filtered_subreads.fastq /data/projects/assembling-glab/Illumina_Tog5681_clean.frg

Francois

--
--------------------------------------------------------
Francois Sabot, PhD

Be realistic. Demand the Impossible.
http://bioinfo.mpl.ird.fr/
http://www.mpl.ird.fr/rice
-----------------------------------------
UMR DIversity, Adaptation & DEvelopment
Centre IRD
911, Av Agropolis BP 64501
34394 Montpellier Cedex 5
France
Phone: +33 4 67 41 64 18
----------------------------------------- |
From: Ole K. T. <ol...@st...> - 2013-07-23 11:33:06
|
Hi Francois.

Which frg file contains the PacBio sequences? You point to two frg files, /data/projects/assembling-glab/Illumina_Tog5681clean.frg and /data/projects/assembling-glab//tempIlluminaTog5681/IlluminaTog5681.frg, and it seems that /data/projects/assembling-glab/Illumina_Tog5681clean.frg is the one containing the PacBio reads. You just need to add them in the opposite order; the PacBio reads need to be the last loaded.

Ole |
From: Francois S. <fra...@ir...> - 2013-07-23 08:27:29
|
Dear all,

We are using Illumina data to correct PacBio runs, as everyone does these days...

After successfully creating the frg file for the Illumina data, we launched pacBioToCA, and it started to run.

However, after a while, gatekeeper stopped with the following error:

"/home/sabotf/sources/wgs/Linux-amd64/bin/gatekeeper -o /data/projects/assembling-glab/tempIlluminaTog5681/asm.gkpStore.BUILDING -F /data/projects/assembling-glab/Illumina_Tog5681clean.frg /data/projects/assembling-glab//tempIlluminaTog5681/IlluminaTog5681.frg > /data/projects/assembling-glab/tempIlluminaTog5681/asm.gkpStore.err 2>&1
----------------------------------------END Tue Jul 23 09:56:39 2013 (3084 seconds)
numFrags = 291073268
Stop requested after 'initialstorebuilding'.
----------------------------------------END Tue Jul 23 09:56:45 2013 (3090 seconds)
Error: The PacBio library 0 must be the last library loaded but it preceedes 1. Please double-check your input files and try again. at /home/sabotf/sources/wgs/Linux-amd64/bin/pacBioToCA line 1084."

Any idea?
The PacBio input file comes directly from the provider; we did not make any change to it.

Cheers

Francois

--
--------------------------------------------------------
Francois Sabot, PhD

Be realistic. Demand the Impossible.
http://bioinfo.mpl.ird.fr/
http://www.mpl.ird.fr/rice
-----------------------------------------
UMR DIversity, Adaptation & DEvelopment
Centre IRD
911, Av Agropolis BP 64501
34394 Montpellier Cedex 5
France
Phone: +33 4 67 41 64 18
----------------------------------------- |
From: Hajime O. <hoh...@ni...> - 2013-07-01 08:25:16
|
Hi,

I found that CA7's "De Novo Classification" is quite useful for preprocessing Illumina mate-pair reads. I would like to use it as a stand-alone preprocessing program.

I have two Illumina libraries (one MP and one PE). I added these lines to my spec file and ran runCA:

stopAfter = classifyMates
dncMPlibraries = MP2400
dncBBlibraries = PE300

runCA finished successfully, and I would now like to use the DNC result to divide my MP reads into three categories. However, I could not find any simple text files containing that result. I'm afraid the information is in binary files...

Is there a workaround that would get me to my goal? Or is DNC not available as a stand-alone tool?

Many thanks in advance,
--Hajime |
From: Walenz, B. <bw...@jc...> - 2013-06-20 18:34:07
|
There have been infinite loops here in the past, but they are somewhat rare, and I think were before the 6.1 release.

If you want to intervene, restarting at the checkpoint after the merge (there are two merge steps) will work. If this is the last iteration of cgw (7-4-CGW typically) then care must be taken to not skip too much, otherwise pieces of the algorithm are skipped. I’d just let it run here. The CGW in MaSuRCA was modified to be less aggressive at scaffolding, and is generally quick.

The error itself can cause rare (again) issues with gap sizes, but is otherwise harmless. We’ve seen a few assemblies with enormous gap sizes, far far larger than the mates would support. Nearly every assembly hits this problem.

b
|
From: Ben E. <el...@gm...> - 2013-06-20 15:42:11
|
Hi,

I'm running celera 6.1 inside MaSuRCA and I appear to have a never ending CGW condition which keeps giving this error:

WARNING: variance difference is negative -- probably trouble with variances after interleaving
WARNING: variance difference is negative -- probably trouble with variances after interleaving
WARNING: variance difference is negative -- probably trouble with variances after interleaving
WARNING: variance difference is negative -- probably trouble with variances after interleaving
SOMEBODY IS SCREWING UP SCAFFOLDING -- RecomputeOffsetsInScaffold has a singularity -- assert skipped!
ReomputeOffsetsInScaffold failed (1) for scaffold 898025 in MergeScaffolds
MergeScaffoldsAggressive()-- iter 312 -- continue because we merged 3 scaffolds.
minWeightThreshold = 6.9
maxAllowedIterations = 16
* CleanupScaffolds through scaffold 90000
* CleanupScaffolds through scaffold 120000
* CleanupScaffolds through scaffold 150000
* CleanupScaffolds through scaffold 160000
* CleanupScaffolds through scaffold 170000

I've tried stopping it and starting at a later checkpoint but it still seems to be stuck. Will this eventually finish or should I intervene?

Cheers,

Ben
|
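When deciding whether a cgw run like this is still making progress, it can help to pull the iteration counter and merge count out of the log rather than watching it scroll. The sketch below assumes the log lines look exactly like the excerpt in this message. The function name is hypothetical, and the thread does not say how cgw uses `maxAllowedIterations`, so the value is only reported, not interpreted.

```python
import re

# Patterns copied from the log excerpt quoted in this thread.
ITER_RE = re.compile(
    r"MergeScaffoldsAggressive\(\)-- iter (\d+) -- "
    r"continue because we merged (\d+) scaffolds"
)
MAX_RE = re.compile(r"maxAllowedIterations = (\d+)")


def summarize_merge_progress(log_text):
    """Report the last merge iteration seen in a cgw log excerpt.

    Returns a dict with the last iteration number, how many scaffolds
    that iteration merged, and the last printed maxAllowedIterations
    value (None for anything not found).
    """
    iterations = [(int(m.group(1)), int(m.group(2)))
                  for m in ITER_RE.finditer(log_text)]
    limits = [int(m.group(1)) for m in MAX_RE.finditer(log_text)]
    last_iter, last_merged = iterations[-1] if iterations else (None, None)
    return {
        "last_iter": last_iter,
        "last_merged": last_merged,
        "max_allowed": limits[-1] if limits else None,
    }


if __name__ == "__main__":
    sample = (
        "MergeScaffoldsAggressive()-- iter 312 -- continue because we "
        "merged 3 scaffolds.\n"
        "minWeightThreshold = 6.9\n"
        "maxAllowedIterations = 16\n"
    )
    print(summarize_merge_progress(sample))
```

Running it periodically on the tail of the cgw output shows whether the iteration number keeps climbing and whether scaffolds are still being merged, which is the information needed to choose between waiting and restarting from a checkpoint.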
From: Ben E. <el...@gm...> - 2013-06-05 14:08:39
|
Hi Brian,

I was concerned about the amount of trouble I was having but after a couple more similar problems it has finished successfully.

Many thanks for all your help,

Ben
|
From: Walenz, B. <bw...@jc...> - 2013-06-04 15:35:51
|
Hi, Ben-

Yes, an easy one. I don’t know why you’re having such a difficult time. Keep plugging away at it, I guess.

Unfortunately, you’re hitting a lot of problems that have already been fixed in the latest CA code.

b
|
From: Ben E. <el...@gm...> - 2013-06-04 09:28:27
|
Ok, apologies, this one was easy, I just commented out the assertion on line 1419 and all seems well again.

Cheers,

Ben
|
From: Ben E. <el...@gm...> - 2013-06-02 18:59:39
|
Hi Brian,

Thanks again for your help. I got a little further this time. Now I have an error again during the cgw stage but this time the error from 7-2-CGW/cgw.out is:

* FOEXS: SUSPICIOUS Overlap found! Looked for (1184641,594183,O)[12,1073] found (594183,1184641,I) 20
WARNING: InsertChunkOverlap()-- Chunk overlap already exists.
NEW 594183,1184641,I - min/max 17/23 0/0 erate 0.250000 flags 10000 overlap 20 hang 0,0 qual 0 offset 0,7276224
OLD 594183,1184641,I - min/max 12/1073 12/1073 erate 0.250000 flags 10001 overlap 20 hang 310,300 qual 0 offset 0,7276224
WARNING: CreateChunkOverlapFromEdge()-- Chunk overlap already exists. Keeping old overlap.
NEW 594183,1184641,I - min/max 17/23 0/0 erate 0.250000 flags 10000 overlap 20 hang 0,0 qual 0 offset 0,7276224
OLD 594183,1184641,I - min/max 12/1073 12/1073 erate 0.250000 flags 10001 overlap 20 hang 310,300 qual 0 offset 0,7276224
* Switched right-left, orientation went from O to I
* CreateAContigInScaffold() failed.
ContigContainment failed.
cgw: LeastSquaresGaps_CGW.C:1419: RecomputeOffsetsStatus RecomputeOffsetsInScaffold(ScaffoldGraphT*, CIScaffoldT*, int, int, int): Assertion `0' failed.

Any ideas?

Cheers,

Ben

On 30 May 2013 12:40, Walenz, Brian <bw...@jc...> wrote:

> Hi, Ben-
>
> Somehow, various flags on reads are inconsistent. Unless you’ve got a LOT of time invested in this cgw run, I’d recommend deleting the 7* directories and starting scaffolding again. Is this near the start of the cgw process? Can you get a stack trace? If not, what does the *timing file contain? Were there any oddities in the scaffold run so far?
>
> It’s failing because it wants to compute the insert size of a specific mate pair, but the two reads are in different contigs. We can get around the current failure by explicitly ignoring this pair. Right before the assert that fails, add:
>
> if (frag->contigID != mate->contigID)
>   continue;
>
> I.e., Doctor, it hurts when I do this. Then don’t do that!
>
> b
>
> On 5/28/13 10:06 AM, "Ben Elsworth" <el...@gm...> wrote:
>
> Hi Brian,
>
> That didn't seem to help. I tried editing an earlier section of code in a similar way by adding this:
>
> if (mateContig == NULL){
>   continue;
> }
>
> below this:
>
> mateContig = GetGraphNode( ScaffoldGraph->ContigGraph, mate->contigID);
>
> But that led to another error:
>
> cgw: GraphCGW_T.C:3232: void ComputeMatePairStatisticsRestricted(int, int32, char*): Assertion `frag->contigID == mate->contigID' failed.
>
> Any idea what's going on?
>
> Cheers,
>
> Ben
>
> On 28 May 2013 11:00, Walenz, Brian <bw...@jc...> wrote:
>
> Hi Ben-
>
> I think this is harmless, and can be patched around.
>
> > To patch it up, and maybe avoid the crash here, add the following at line
> > 3206 in AS_CGW/GraphCGW_T.c
> >
> > if (extremeContig == NULL)
> >   continue;
> >
> > Line 3206 is just after "extremeContig = GetGraphNode(...)" and before the
> > call to GetContigPositionInScaffold().
>
> Line numbers above are relative to the latest code base. In 6.1, unless masurca fiddled with GraphCGW_T.c, you want line 3296, in between these two lines:
>
> extremeContig = GetGraphNode( ScaffoldGraph->ContigGraph, scaff->info.Scaffold.BEndCI);
> GetContigPositionInScaffold ( extremeContig, &contigLeftEnd, &contigRightEnd, &contigScaffoldOrientation);
>
> The one other assembly that failed here (it was recent, too) finished successfully after the patch.
>
> b
>
> On 5/27/13 7:43 AM, "Ben Elsworth" <el...@gm...> wrote:
>
> Hi,
>
> I'm running v6.1 within MaSuRCA and keep getting this error during the cgw step:
>
> cgw: GraphCGW_T.C:3275: void ComputeMatePairStatisticsRestricted(int, int32, char*): Assertion `(mateContig) != __null' failed.
>
> It occurs after a lot of warnings about negative variance. I've tried following the advice here - http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Scaffolder_failure but keep getting the error.
>
> Any ideas?
>
> Cheers,
>
> Ben
|
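The patches discussed in this message all follow one pattern: skip a mate pair when its contig is missing or when the two reads sit in different contigs, instead of asserting. The sketch below is a toy model of that guard in a mate-pair statistics loop. It is not the cgw code: the dictionary fields are hypothetical stand-ins for `frag->contigID` and `mate->contigID`, and the insert size is simplified to the distance between two positions.

```python
from statistics import mean, pstdev


def insert_size_stats(pairs):
    """Compute mean and standard deviation of insert sizes, skipping
    mate pairs that cannot be measured.

    Each pair is a dict with the hypothetical keys 'frag_contig',
    'mate_contig', 'frag_pos' and 'mate_pos'.
    Returns (mean, stddev, skipped_count); mean and stddev are None
    when no pair is usable.
    """
    sizes = []
    skipped = 0
    for pair in pairs:
        # Guard 1: a read with no contig (the "== NULL" checks).
        if pair["frag_contig"] is None or pair["mate_contig"] is None:
            skipped += 1
            continue
        # Guard 2: reads in different contigs (the contigID check).
        if pair["frag_contig"] != pair["mate_contig"]:
            skipped += 1
            continue
        sizes.append(abs(pair["mate_pos"] - pair["frag_pos"]))
    if not sizes:
        return None, None, skipped
    return mean(sizes), pstdev(sizes), skipped


if __name__ == "__main__":
    example = [
        {"frag_contig": 1, "mate_contig": 1, "frag_pos": 100, "mate_pos": 2500},
        {"frag_contig": 1, "mate_contig": 2, "frag_pos": 50, "mate_pos": 900},
        {"frag_contig": None, "mate_contig": 3, "frag_pos": 0, "mate_pos": 10},
        {"frag_contig": 3, "mate_contig": 3, "frag_pos": 0, "mate_pos": 2600},
    ]
    print(insert_size_stats(example))
```

Counting the skipped pairs is a deliberate choice in this sketch: a guard that silently drops data can hide the kind of inconsistent read flags mentioned above, so a large skip count is worth a look even when the run completes.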
From: Walenz, B. <bw...@jc...> - 2013-05-31 19:52:29
|
[catching up on email]

The partitioned store allows multiple processes to write at the same time. In the store, v001 (for example) has three files per consensus process. Each set of files is a mini-store, and so each consensus process can modify it without needing locking or other fancy database operations.

When we get to scaffolding, there is only one process, so the store transparently merges all those mini-stores into one monolithic store.

The catch here is that the two formats can co-exist. If some process works with the monolithic store, the consensus processes (operating on mini-stores) don’t ever see the change.

I’m at a loss to explain why v5 didn’t work and v2 did. The store is easy to break like this if one isn’t careful with the editing operations. It’s usually also easy to un-break: remove the files that contain the monolithic store or, as an extreme, remove the whole version and recompute it.

b

On 5/23/13 10:34 AM, "Ben Elsworth" <el...@gm...> wrote:

Hi Brian,

Thanks for the reply. I followed your advice, although I added the final unitig to version 2, not 5, as that didn't seem to work, and that has got things working. I'm not entirely sure what the difference is between the partitioned and non-partitioned formats though, and why is it working with v2 and not v5?

Regards,

Ben

On 22 May 2013 18:51, Walenz, Brian <bw...@jc...> wrote:

Hi, Ben-

I’ve seen something similar to this before. As I recall, the store is getting confused between the partitioned (-up 3) and non-partitioned formats.

Try adding the final unitig to unpartitioned version 5:

tigStore -g *gkpStore -t *tigStore 2 -up 3 -d layout -u 8049 > unitig8049.withcns
tigStore -g *gkpStore -t *tigStore 5 -R unitig8049.withcns

The first retrieves the output of utgcns (check that it has consensus sequence at the top, and that there is exactly one UTG line at the end) and the second adds this to version 5 (the input to cgw).

I think that what happened is that you added unitig8049 without consensus to the unpartitioned store (the middle command, without ‘-up 3’), and that is masking the unitig that is stored in partition 3. You can hopefully test this hypothesis by removing ‘-up 3’ from the tigStore retrieval above; it should then report a unitig without consensus.

b

[Ben, sorry for the duplicate, forgot to send to the list]

On 5/22/13 11:28 AM, "Ben Elsworth" <el...@gm...> wrote:

Hi,

I am having an issue with the consensus step as part of MaSuRCA. One unitig fails and I have tried a number of methods to correct it, including rearranging, splitting and even removing it. All of these methods fix the alignment issue and produce a successful outcome when I test it:

utgcns -g genome.gkpStore -t genome.tigStore 1 3 -T unitig8049

Based on this I insert the updated tig and compute the new consensus sequence:

tigStore -g genome.gkpStore -t genome.tigStore/ 1 -up 3 -R unitig8049
utgcns -g genome.gkpStore -t genome.tigStore 1 3 -u 8049

However, when I rerun runCA I get the following error from the cgw step:

Reading unitigs.
...processed 100000 unitigs.
ERROR: Unitig 8049 has no placement; probably not run through consensus.
cgw: Input_CGW.C:117: int ProcessInput(int, int, char**): Assertion `1 == GetNumIntUnitigPoss(uma->u_list)' failed.

How can I make sure the new unitig arrangement is found?

Cheers,

Ben
|
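The masking hypothesis in this thread can be pictured with a toy model: per-partition mini-stores plus one monolithic store, where an unpartitioned read prefers the monolithic copy. The class below is purely illustrative. Its name, its methods and the precedence rule are assumptions chosen to mirror the explanation above, and they say nothing about how tigStore lays out its files.

```python
class ToyTigStore:
    """Toy model of one store version holding partitioned mini-stores
    and a monolithic store side by side (all names hypothetical)."""

    def __init__(self):
        self.partitions = {}  # partition id -> {tig id -> record}
        self.monolithic = {}  # tig id -> record

    def write_partitioned(self, partition, tig_id, record):
        self.partitions.setdefault(partition, {})[tig_id] = record

    def write_monolithic(self, tig_id, record):
        self.monolithic[tig_id] = record

    def read_partitioned(self, partition, tig_id):
        # A consensus-style reader only ever looks at its own partition.
        return self.partitions.get(partition, {}).get(tig_id)

    def read_unpartitioned(self, tig_id):
        # Assumed rule: a monolithic entry hides any partitioned copy.
        if tig_id in self.monolithic:
            return self.monolithic[tig_id]
        for records in self.partitions.values():
            if tig_id in records:
                return records[tig_id]
        return None

    def remove_monolithic(self, tig_id):
        # Stand-in for deleting the files of the monolithic store.
        self.monolithic.pop(tig_id, None)


if __name__ == "__main__":
    store = ToyTigStore()
    store.write_partitioned(3, 8049, {"consensus": "ACGT"})
    store.write_monolithic(8049, {"consensus": None})
    print(store.read_unpartitioned(8049))  # the entry without consensus
    store.remove_monolithic(8049)
    print(store.read_unpartitioned(8049))  # the partition 3 copy again
```

In this model the partitioned reader and the unpartitioned reader disagree about unitig 8049 until the monolithic entry is removed, which matches both the symptom described (cgw seeing a unitig without consensus) and the suggested remedy of removing the monolithic files.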