From: A. B. C. <ber...@gm...> - 2015-12-14 20:25:32
Hi Serge,

Thank you for your suggestion. I followed it, but got stopped by another error (below; probably at the unitigger). Please let me know if you have any other suggestion.

Best,
Bernardo

I issued the following commands:

cd /draft1/bernardo1/drosophila
rm dros1nf.fastq
rm dros1nf.frg
rm -fr dros1nf
java -jar /home/bernardo/programs/convertFastaAndQualToFastq.jar dros1nf.fasta > dros1nf.fastq
fastqToCA -libraryname dros1nf -technology pacbio-corrected -type sanger -reads dros1nf.fastq > dros1nf.frg
runCA -s /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf \
  ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0 \
  unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1 \
  cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025 \
  utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000 \
  doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26 \
  ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22 \
  cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm" \
  dros1nf.frg > dros1nf.out 2>&1

OUTPUT:
...
----------------------------------------START Mon Dec 14 09:51:20 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/bogart -O /draft1/bernardo1/drosophila/dros1nf/asm.ovlStore -G /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore -T /draft1/bernardo1/drosophila/dros1nf/asm.tigStore -B 4189 -eg 0.025 -Eg 0 -em 0.025 -Em 0 -RS -NS -CS -o /draft1/bernardo1/drosophila/dros1nf/4-unitigger/asm > /draft1/bernardo1/drosophila/dros1nf/4-unitigger/unitigger.err 2>&1
----------------------------------------END Mon Dec 14 09:52:38 2015 (78 seconds)
----------------------------------------START Mon Dec 14 09:52:38 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/gatekeeper -P /draft1/bernardo1/drosophila/dros1nf/4-unitigger/asm.partitioning /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore > /draft1/bernardo1/drosophila/dros1nf/5-consensus/asm.partitioned.err 2>&1
----------------------------------------END Mon Dec 14 09:53:02 2015 (24 seconds)
----------------------------------------START CONCURRENT Mon Dec 14 09:53:02 2015
/draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 1 > /dev/null 2>&1
/draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 2 > /dev/null 2>&1
...
/draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 67 > /dev/null 2>&1
/draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 68 > /dev/null 2>&1
/draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 69 > /dev/null 2>&1
----------------------------------------END CONCURRENT Mon Dec 14 17:52:36 2015 (28774 seconds)
----------------------------------------START Mon Dec 14 17:52:36 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/tigStore -g /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore -t /draft1/bernardo1/drosophila/dros1nf/asm.tigStore 2 -N -R /draft1/bernardo1/drosophila/dros1nf/5-consensus/asm.fixes > asm.fixes.err 2>&1
----------------------------------------END Mon Dec 14 17:52:36 2015 (0 seconds)
----------------------------------------START Mon Dec 14 17:52:36 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/tigStore \
  -g /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore \
  -t /draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/asm.tigStore 3 \
  -d matepair -U \
  > /draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/estimates.out 2>&1
----------------------------------------END Mon Dec 14 17:52:36 2015 (0 seconds)
ERROR: Failed with signal HUP (1)
================================================================================

runCA failed.

----------------------------------------
Stack trace:

 at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin//runCA line 1628.
	main::caFailure("Insert size estimation failed", "/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes"...) called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin//runCA line 4814
	main::postUnitiggerConsensus() called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin//runCA line 6259

----------------------------------------
Last few lines of the relevant log file (/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/estimates.out):

MultiAlignStore::MultiAlignStore()-- ERROR, didn't find any unitigs or contigs in the store.
MultiAlignStore::MultiAlignStore()-- asked for store '/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/asm.tigStore', correct?
MultiAlignStore::MultiAlignStore()-- asked for version '3', correct?
MultiAlignStore::MultiAlignStore()-- asked for partition unitig=0 contig=0, correct?
MultiAlignStore::MultiAlignStore()-- asked for writable=0 inplace=0 append=0, correct?

----------------------------------------
Failure message:

Insert size estimation failed

A. Bernardo Carvalho
Departamento de Genética
Universidade Federal do Rio de Janeiro

On 12 December 2015 at 17:46, Serge Koren <ser...@gm...> wrote:

> Ah yes, it outputs multi-line fasta which the previous version did not and
> the code is assuming it would output one line for each so it’s generating
> an invalid fastq file. If you take the dros1nf.fasta file, it should be
> valid. Convert it to a fastq with a fixed QV value, make a frg file, and
> re-run the last failed command.
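
The conversion described in the quoted paragraph above (multi-line FASTA in, FASTQ with a fixed quality value out) could be sketched roughly as follows. This is only an illustration under stated assumptions: `read_fasta` and `fasta_to_fastq` are hypothetical helper names, not part of wgs-assembler or PBcR, and the quality character "I" (Phred 40 in Sanger/Phred+33 encoding) is an arbitrary choice, since the thread does not say which fixed QV to use.

```python
def read_fasta(handle):
    """Yield (name, sequence) pairs, joining sequence lines that wrap."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_fastq(fasta_handle, fastq_handle, quality_char="I"):
    """Write one four-line FASTQ record per FASTA record; return the count.

    quality_char is an assumed fixed quality: "I" is Phred 40 in
    Sanger (Phred+33) encoding.
    """
    count = 0
    for name, seq in read_fasta(fasta_handle):
        fastq_handle.write(f"@{name}\n{seq}\n+\n{quality_char * len(seq)}\n")
        count += 1
    return count
```

With two open text files, `fasta_to_fastq(open("dros1nf.fasta"), open("dros1nf.fastq", "w"))` would write every read as exactly four lines, which is the layout the "-type sanger" input in this thread is expected to have.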
>
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/fastqToCA -libraryname
> dros1nf -technology pacbio-corrected -type sanger -reads dros1nf.fastq >
> dros1nf.frg
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA -s
> /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf
> ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0
> unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1
> cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025
> utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000
> doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26
> ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22
> cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm"
> dros1nf.frg
>
> Sergey
>
> On Dec 12, 2015, at 1:43 PM, A. Bernardo Carvalho <ber...@gm...>
> wrote:
>
> Dear Sergey,
> Thank you for your suggestion. I tried two times to use the falcon_sense
> program from canu inside the PBcR script, and got the same error in both
> attempts (error message copied below). It seems that the output of the new
> falcon_sense (from canu) is somehow incompatible with the PBcR script.
> Please let me know if you have any suggestions on how to proceed; if none,
> I will wait for the canu release.
>
> Yours,
> Bernardo
>
> ********* Finished correcting 7200013631 bp (using 15743312583 bp).
> ********* Assembling corrected sequences.
> Assembling with average 52 (min frag 26) and using ovl is 40
> ----------------------------------------START Fri Dec 11 19:16:41 2015
> ln -sf dros1nf.frg dros1nf.longest25.frg
> ----------------------------------------END Fri Dec 11 19:16:41 2015 (0 seconds)
> ----------------------------------------START Fri Dec 11 19:16:41 2015
> ln -sf dros1nf.fastq dros1nf.longest25.fastq
> ----------------------------------------END Fri Dec 11 19:16:42 2015 (1 seconds)
> ----------------------------------------START Fri Dec 11 19:16:42 2015
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA -s
> /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf
> ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0
> unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1
> cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025
> utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000
> doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26
> ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22
> cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm"
> dros1nf.longest25.frg
> ----------------------------------------START Fri Dec 11 19:16:42 2015
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/gatekeeper -o
> /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.BUILDING -T -F
> /draft1/bernardo1/drosophila/dros1nf.longest25.frg >
> /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err 2>&1
> ----------------------------------------END Fri Dec 11 19:18:32 2015 (110 seconds)
> ERROR: Failed with signal HUP (1)
>
> ================================================================================
>
> runCA failed.
>
> ----------------------------------------
> Stack trace:
>
> at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1628.
> main::caFailure("gatekeeper failed",
> "/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err") called at
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1957
>
> main::preoverlap("/draft1/bernardo1/drosophila/dros1nf.longest25.frg")
> called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 6250
>
> ----------------------------------------
> Last few lines of the relevant log file
> (/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err):
>
> Starting file '/draft1/bernardo1/drosophila/dros1nf.longest25.frg'.
>
> Processing SINGLE-ENDED SANGER QV encoding reads from:
> '/draft1/bernardo1/drosophila//dros1nf.fastq'
>
> GKP finished with 68766632 alerts or errors:
> 68766632 # ILL Error: not a sequence start line.
>
> ERROR: library IID 1 'dros1nf' has 51263.29% errors or warnings.
>
> ----------------------------------------
> Failure message:
>
> gatekeeper failed
>
> A. Bernardo Carvalho
> Departamento de Genética
> Universidade Federal do Rio de Janeiro
>
> On 4 December 2015 at 20:32, Serge Koren <ser...@gm...> wrote:
>
>> Hi,
>>
>> The issue is that PBDAGCON relies on BLASR libraries to do alignments in
>> our implementation. For whatever reason, BLASR performance on D.
>> melanogaster is extremely poor. Thus, PBDAGCON is very slow and I wouldn’t
>> recommend running PBDAGCON on this genome unless you can run all the
>> partitions in parallel on a grid environment.
>>
>> Also, we have a new version of the assembler, canu, which has an updated
>> falcon_sense version which may work better for your assembly. You can get
>> the falcon_sense Linux binary here:
>>
>> http://github.com/marbl/canu/blob/master/src/falcon_sense/falcon_sense.Linux-amd64.bin?raw=true
>> <https://github.com/marbl/canu>
>> and just try replacing the version in CA 8.3 to see if it improves the Y
>> assembly.
>>
>> Sergey
>>
>> On Dec 1, 2015, at 8:31 AM, A. Bernardo Carvalho <ber...@gm...>
>> wrote:
>>
>> Hi,
>> I noticed that while the Drosophila melanogaster MHAP assembly is very
>> good in general, it has many gaps in single-copy Y-linked genes. I guess
>> that this is caused by low coverage: the DNA came from males, and was
>> assembled at 25x, which leaves the Y genes at 12.5x (theoretically).
>> Furthermore, it seems that Y-linked reads are being lost during the first
>> correction step (done by falcon-sense; I checked the uncorrected and the
>> corrected reads).
>>
>> I am trying to fix these problems by increasing the coverage of the
>> corrected reads used in the "post-correction" steps (by adding
>> assembleCoverage=40 in the spec file, instead of the default 25x), and
>> by forcing the use of pbdagcon instead of falcon-sense (by adding pbcns=1
>> in the spec file). The assembly with 40x and falcon-sense worked fine,
>> but when I tried 40x with pbdagcon, the run seemed to be abnormally slow.
>> Specifically, the machine I used is a Dell with 24 processors / 144 GB RAM,
>> and after 9 days running it was still processing the first two partitions
>> of runPartition.sh
>>
>> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 1
>> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 2
>>
>> I checked the runPartition.sh script, and it seems to use only 8 threads
>> (instead of 24):
>>
>> cat /home3/users/bernardo/drosophila//tempdros10/runPartition.sh
>>
>> $bin/outputLayout \
>> -L \
>> -e 0.35 -M 1500 \
>> -i /home3/users/bernardo/drosophila//tempdros10/asm \
>> -o /home3/users/bernardo/drosophila//tempdros10/asm \
>> -p $jobid \
>> -l 500 \
>> \
>> -P \
>> -G /home3/users/bernardo/drosophila//tempdros10/asm.gkpStore \
>> 2> /home3/users/bernardo/drosophila//tempdros10/$jobid.lay.err |
>> $bin/convertToPBCNS -consensus pbdagcon -path
>> /home3/users/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/ -output
>> /home3/users/bernardo/drosophila//tempdros10/$jobid.fasta -prefix
>> /home3/users/bernardo/drosophila//tempdros10/$jobid.tmp -length 500
>> -coverage 4 -threads 8 >
>> /home3/users/bernardo/drosophila//tempdros10/$jobid.err 2>&1 && touch
>>
>> In this particular run I did not specify cnsConcurrency or
>> consensusConcurrency in the spec file (so PBcR chose the values; I
>> only set threads=20), but in another run I added cnsConcurrency=20
>> consensusConcurrency=20
>> to the spec file, and again in 10 days it processed only 3 of the 200
>> partitions.
>>
>> I previously tried the ecoli 30x and the yeast data, and both worked fine
>> with pbdagcon (although slower than falcon-sense). Is there some
>> limitation on using pbdagcon with higher-coverage data? Is the -threads 8
>> option of the convertToPBCNS program correct?
>>
>> Thanks,
>> Bernardo
>>
>> A. Bernardo Carvalho
>>
>> Departamento de Genética
>> Universidade Federal do Rio de Janeiro
>>
>> _______________________________________________
>> wgs-assembler-users mailing list
>> wgs...@li...
>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
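
The gatekeeper failure quoted earlier in this thread ("not a sequence start line", reported tens of millions of times) is the kind of problem a quick structural check on the FASTQ file can reveal before fastqToCA and runCA are run. Below is a minimal sketch, assuming a strict four-lines-per-record FASTQ with no blank lines. `check_fastq` is a hypothetical helper, not a wgs-assembler tool, and its checks are not necessarily the ones gatekeeper itself performs.

```python
from itertools import zip_longest


def check_fastq(handle, max_examples=5):
    """Scan a four-line-per-record FASTQ stream.

    Returns (records, bad_records, examples), where examples holds up to
    max_examples tuples of (line number of the record start, reason).
    """
    records = bad = 0
    examples = []
    lines = (line.rstrip("\r\n") for line in handle)
    # Repeating the same generator four times groups the input into
    # consecutive blocks of four lines; a short final block is padded
    # with None.
    for index, (header, seq, plus, qual) in enumerate(zip_longest(*[lines] * 4)):
        records += 1
        if not header.startswith("@"):
            reason = "not a sequence start line"
        elif plus is None or not plus.startswith("+"):
            reason = "missing '+' separator line"
        elif qual is None or len(seq) != len(qual):
            reason = "sequence and quality lengths differ"
        else:
            continue
        bad += 1
        if len(examples) < max_examples:
            examples.append((index * 4 + 1, reason))
    return records, bad, examples
```

A read whose sequence wraps onto a second line shifts every following block of four, so nearly every later "record" starts on the wrong line. That would be consistent with an error count far larger than the number of reads, as in the gatekeeper log above.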