From: A. B. C. <ber...@gm...> - 2015-12-12 18:43:59
Dear Sergey,

Thank you for your suggestion. I tried twice to use the falcon_sense program from canu inside the PBcR script, and got the same error in both attempts (error message copied below). It seems that the output of the new falcon_sense (from canu) is somehow incompatible with the PBcR script.

Please let me know if you have any suggestions on how to proceed; if not, I will wait for the canu release.

Yours,
Bernardo

********* Finished correcting 7200013631 bp (using 15743312583 bp).
********* Assembling corrected sequences.
Assembling with average 52 (min frag 26) and using ovl is 40
----------------------------------------START Fri Dec 11 19:16:41 2015
ln -sf dros1nf.frg dros1nf.longest25.frg
----------------------------------------END Fri Dec 11 19:16:41 2015 (0 seconds)
----------------------------------------START Fri Dec 11 19:16:41 2015
ln -sf dros1nf.fastq dros1nf.longest25.fastq
----------------------------------------END Fri Dec 11 19:16:42 2015 (1 seconds)
----------------------------------------START Fri Dec 11 19:16:42 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA -s /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0 unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1 cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025 utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000 doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26 ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22 cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm" dros1nf.longest25.frg
----------------------------------------START Fri Dec 11 19:16:42 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/gatekeeper -o /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.BUILDING -T -F /draft1/bernardo1/drosophila/dros1nf.longest25.frg > /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err 2>&1
----------------------------------------END Fri Dec 11 19:18:32 2015 (110 seconds)
ERROR: Failed with signal HUP (1)
================================================================================

runCA failed.

----------------------------------------
Stack trace:

 at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1628.
        main::caFailure("gatekeeper failed", "/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err") called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1957
        main::preoverlap("/draft1/bernardo1/drosophila/dros1nf.longest25.frg") called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 6250

----------------------------------------
Last few lines of the relevant log file (/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err):

Starting file '/draft1/bernardo1/drosophila/dros1nf.longest25.frg'.

Processing SINGLE-ENDED SANGER QV encoding reads from:
      '/draft1/bernardo1/drosophila//dros1nf.fastq'

GKP finished with 68766632 alerts or errors:
68766632 # ILL Error: not a sequence start line.

ERROR: library IID 1 'dros1nf' has 51263.29% errors or warnings.

----------------------------------------
Failure message:

gatekeeper failed

A. Bernardo Carvalho
Departamento de Genética
Universidade Federal do Rio de Janeiro

On 4 December 2015 at 20:32, Serge Koren <ser...@gm...> wrote:
> Hi,
>
> The issue is that PBDAGCON relies on BLASR libraries to do alignments in
> our implementation. For whatever reason, BLASR performance on D.
> melanogaster is extremely poor. Thus, PBDAGCON is very slow and I wouldn’t
> recommend running PBDAGCON on this genome unless you can run all the
> partitions in parallel on a grid environment.
>
> Also, we have a new version of the assembler, canu, which has an updated
> falcon_sense version which may work better for your assembly.
> You can get the falcon_sense Linux binary here:
>
> http://github.com/marbl/canu/blob/master/src/falcon_sense/falcon_sense.Linux-amd64.bin?raw=true
> <https://github.com/marbl/canu>
> and just try replacing the version in CA 8.3 to see if it improves the Y
> assembly.
>
> Sergey
>
> On Dec 1, 2015, at 8:31 AM, A. Bernardo Carvalho <ber...@gm...> wrote:
>
> Hi,
> I noticed that while the Drosophila melanogaster MHAP assembly is very
> good in general, it has many gaps in single-copy Y-linked genes. I guess
> that this is caused by low coverage: the DNA came from males, and was
> assembled at 25x, which leaves the Y genes at 12.5x (theoretically).
> Furthermore, it seems that Y-linked reads are being lost during the first
> correction step (done by falcon-sense; I checked the uncorrected and the
> corrected reads).
>
> I am trying to fix these problems by increasing the coverage of the
> corrected reads used in the "post-correction" steps (by adding
> assembleCoverage=40 in the spec file, instead of the default 25x), and
> by forcing the use of pbdagcon instead of falcon-sense (by adding pbcns=1
> in the spec file). The assembly with 40x and falcon-sense worked fine,
> but when I tried 40x with pbdagcon, the run seemed to be abnormally slow.
> Specifically, the machine I used is a Dell with 24 processors / 144 GB RAM,
> and after 9 days running it was still processing the first two partitions
> of runPartition.sh
>
> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 1
> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 2
>
> I checked the runPartition.sh script, and it seems to use only 8 threads
> (instead of 24):
>
> cat /home3/users/bernardo/drosophila//tempdros10/runPartition.sh
>
> $bin/outputLayout \
> -L \
> -e 0.35 -M 1500 \
> -i /home3/users/bernardo/drosophila//tempdros10/asm \
> -o /home3/users/bernardo/drosophila//tempdros10/asm \
> -p $jobid \
> -l 500 \
> \
> -P \
> -G /home3/users/bernardo/drosophila//tempdros10/asm.gkpStore \
> 2> /home3/users/bernardo/drosophila//tempdros10/$jobid.lay.err |
> $bin/convertToPBCNS -consensus pbdagcon -path
> /home3/users/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/ -output
> /home3/users/bernardo/drosophila//tempdros10/$jobid.fasta -prefix
> /home3/users/bernardo/drosophila//tempdros10/$jobid.tmp -length 500
> -coverage 4 -threads 8 >
> /home3/users/bernardo/drosophila//tempdros10/$jobid.err 2>&1 && touch
>
> In this particular run I did not specify cnsConcurrency or
> consensusConcurrency in the spec file (so PBcR chose the values; I
> only set threads=20), but in another run I added
> cnsConcurrency=20
> consensusConcurrency=20
> to the spec file, and again in 10 days it processed only 3 of the 200
> partitions.
>
> I previously tried the ecoli 30x and the yeast data, and both worked fine
> with pbdagcon (although slower than falcon-sense). Is there some limitation
> on using pbdagcon with higher-coverage data? Is the -threads 8 option of
> the convertToPBCNS program correct?
>
> Thanks,
> Bernardo
>
> A. Bernardo Carvalho
> Departamento de Genética
> Universidade Federal do Rio de Janeiro
>
> _______________________________________________
> wgs-assembler-users mailing list
> wgs...@li...
> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
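
A possible first diagnostic for the gatekeeper failure reported at the top of this thread (68766632 "not a sequence start line" errors while reading dros1nf.fastq): check whether the corrected-read file is really made of four-line FASTQ records. The thread does not confirm the cause, so this is only a minimal sketch in Python. It assumes that gatekeeper expects strict four-line records (header starting with "@", sequence, separator starting with "+", quality string of the same length), and check_fastq is a hypothetical helper, not part of wgs-assembler or canu.

```python
from typing import Iterable, List, Tuple


def check_fastq(lines: Iterable[str], max_report: int = 5) -> Tuple[int, List[str]]:
    """Scan a stream as strict four-line FASTQ records (an assumption about
    what gatekeeper accepts) and return (records_seen, problems_found).
    Scanning stops once max_report problems have been collected."""
    problems: List[str] = []
    records = 0
    buf: List[str] = []
    line_no = 0
    for line_no, raw in enumerate(lines, start=1):
        buf.append(raw.rstrip("\r\n"))
        if len(buf) < 4:
            continue
        header, seq, plus, qual = buf
        start = line_no - 3  # line number of this record's header
        buf = []
        records += 1
        if not header.startswith("@"):
            problems.append(f"line {start}: header does not start with '@'")
        elif not plus.startswith("+"):
            problems.append(f"line {start + 2}: separator does not start with '+'")
        elif len(seq) != len(qual):
            problems.append(f"line {start}: sequence/quality length mismatch")
        if len(problems) >= max_report:
            break
    if buf and len(problems) < max_report:
        problems.append(f"line {line_no}: file ends with an incomplete record")
    return records, problems


# Small in-memory demonstration: one valid record, then FASTA-like input.
print(check_fastq(["@read1", "ACGT", "+", "IIII"]))
print(check_fastq([">read1", "ACGT", ">read2", "GGCC"]))
```

To run it on a real file, pass an open file handle, e.g. check_fastq(open("dros1nf.fastq")). If the first reported problem is a header that does not start with "@", the file may be FASTA or otherwise not in the layout assumed here, which would be consistent with (but does not prove) an output-format mismatch between the canu falcon_sense binary and the PBcR script.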