From: Corey W. <cor...@gm...> - 2016-09-29 16:02:36
|
It appears that I have a problematic scaffold killing my assembly. Reading the documentation suggests removing the assertion in the source then re-compiling and re-running. I was wondering instead if it were possible to somehow skip this scaffold or otherwise prevent it from crashing?

As an aside, how does WGS do with reads that were sequenced with the intention of overlapping sometimes called dovetailing? Should that be precomputed with FLASH or is WGS smart enough to handle them in their paired form?

CreateAContigInScaffold()-- new contig 29939425 in scaffold 4725
WARNING: NEGATIVE VARIANCE in creating contig 29939425. Set to 6.191102e+03 based on extremal orig contig, but contig 5687395 (non-extremal) has min variance 5.129192e+03
RecomputeOffsetsInScaffold() returned RECOMPUTE_MERGE_CONTIG on scaffold 4725; scaffold modified, keep trying
CleanupAScaffold() Processing total multialign of 10483
CleanupAScaffold() Error: Undoing jiggle fpr 29439904 failed multi-align, backing out changes
CreateAContigInScaffold()-- new contig 29939426 in scaffold 4725
CreateAContigInScaffold()-- new contig 29939427 in scaffold 4725
ScaffoldSanity()-- contig 29439904 in scaffold 4725 -- negative gap variance -30.072749 on positive gap size 590.491808
ScaffoldSanity()-- scaffold 4725 has 1 problems.
ScaffoldSanity()-- Contig 745958 at 0 +- 0 to 10360 +- 269 ctg len 10360 gap to next 42 +- 715
ScaffoldSanity()-- Contig 8923661 at 10402 +- 984 to 10584 +- 989 ctg len 182 gap to next 6 +- 1112
ScaffoldSanity()-- Contig 29432576 at 10590 +- 2101 to 29067 +- 2582 ctg len 18477 gap to next 26 +- 512
ScaffoldSanity()-- Contig 5500353 at 29094 +- 3093 to 29678 +- 3109 ctg len 584 gap to next -20 +- -557
ScaffoldSanity()-- Contig 29439904 at 29658 +- 2552 to 39557 +- 2809 ctg len 9899 gap to next 57 +- 1509
ScaffoldSanity()-- Contig 29939425 at 39614 +- 4318 to 50470 +- 4600 ctg len 10856 gap to next 137 +- 989
ScaffoldSanity()-- Contig 547439 at 50607 +- 5589 to 50693 +- 5591 ctg len 86 gap to next 79 +- 1287
ScaffoldSanity()-- Contig 29939426 at 50773 +- 6878 to 64864 +- 7244 ctg len 14091 gap to next -1 +- 205
ScaffoldSanity()-- Contig 261117 at 64863 +- 7449 to 67408 +- 7515 ctg len 2545 gap to next 26 +- 292
ScaffoldSanity()-- Contig 928325 at 67433 +- 7808 to 76417 +- 8041 ctg len 8984 gap to next -18 +- 1160
ScaffoldSanity()-- Contig 29385733 at 76400 +- 9201 to 83279 +- 9380 ctg len 6879 gap to next -13 +- 124
ScaffoldSanity()-- Contig 849822 at 83266 +- 9504 to 92248 +- 9738 ctg len 8982 gap to next 3 +- 292
ScaffoldSanity()-- Contig 5428042 at 92251 +- 10030 to 92807 +- 10044 ctg len 556 gap to next 121 +- 454
ScaffoldSanity()-- Contig 29427826 at 92927 +- 10498 to 110977 +- 10967 ctg len 18050 gap to next -6 +- 1434
ScaffoldSanity()-- Contig 18560999 at 110972 +- 12401 to 111064 +- 12404 ctg len 92 gap to next 13 +- 1663
ScaffoldSanity()-- Contig 29685140 at 111076 +- 14067 to 116805 +- 14216 ctg len 5729 gap to next 166 +- 1291
ScaffoldSanity()-- Contig 29939427 at 116971 +- 15507 to 128743 +- 15813 ctg len 11772 gap to next -12 +- 967
ScaffoldSanity()-- Contig 29630932 at 128732 +- 16780 to 136671 +- 16986 ctg len 7939
cgw: CIScaffoldT_CGW.C:1143: void ScaffoldSanity(ScaffoldGraphT*, CIScaffoldT*): Assertion `hasProblems == 0' failed.

Failed with 'Aborted'

Backtrace (mangled):

/loginhome/cwischmeyer/cwischmeyer-home-ubuntu/source/wgs-8.3rc2/Linux-amd64/bin/cgw(_Z17AS_UTL_catchCrashiP9siginfo_tPv+0x2a)[0x41b56a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f3b66b99330]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f3b667fac37]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f3b667fe028]
/lib/x86_64-linux-gnu/libc.so.6(+0x2fbf6)[0x7f3b667f3bf6]
/lib/x86_64-linux-gnu/libc.so.6(+0x2fca2)[0x7f3b667f3ca2]
/loginhome/cwischmeyer/cwischmeyer-home-ubuntu/source/wgs-8.3rc2/Linux-amd64/bin/cgw(_Z14ScaffoldSanityP14ScaffoldGraphTP9NodeCGW_T+0x8f9)[0x446fc9]
/loginhome/cwischmeyer/cwischmeyer-home-ubuntu/source/wgs-8.3rc2/Linux-amd64/bin/cgw(main+0x181c)[0x419c9c]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f3b667e5f45]
/loginhome/cwischmeyer/cwischmeyer-home-ubuntu/source/wgs-8.3rc2/Linux-amd64/bin/cgw[0x41a5cd]
|
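For anyone triaging a similar crash, the per-contig table printed by ScaffoldSanity() can be scanned mechanically. The sketch below is not part of wgs-assembler; it assumes the log line layout shown in this message, and the names `CONTIG_RE` and `negative_spread_gaps` are made up for illustration. It lists every contig whose "gap to next" entry carries a negative value after the `+-`.

```python
import re

# Assumed layout, taken from the log in this message:
# ScaffoldSanity()-- Contig <id> at <a> +- <x> to <b> +- <y> ctg len <n> [gap to next <g> +- <s>]
CONTIG_RE = re.compile(
    r"ScaffoldSanity\(\)-- Contig (\d+) at (-?\d+) \+- (-?\d+) to (-?\d+) \+- (-?\d+) "
    r"ctg len (\d+)(?: gap to next (-?\d+) \+- (-?\d+))?"
)

def negative_spread_gaps(log_text):
    """Return (contig_id, gap, spread) for rows whose 'gap to next' spread is negative."""
    hits = []
    for m in CONTIG_RE.finditer(log_text):
        if m.group(7) is None:
            continue  # the last contig in a scaffold has no following gap
        gap, spread = int(m.group(7)), int(m.group(8))
        if spread < 0:
            hits.append((int(m.group(1)), gap, spread))
    return hits

# Three rows copied from the log above.
sample = (
    "ScaffoldSanity()-- Contig 29432576 at 10590 +- 2101 to 29067 +- 2582 ctg len 18477 gap to next 26 +- 512\n"
    "ScaffoldSanity()-- Contig 5500353 at 29094 +- 3093 to 29678 +- 3109 ctg len 584 gap to next -20 +- -557\n"
    "ScaffoldSanity()-- Contig 29630932 at 128732 +- 16780 to 136671 +- 16986 ctg len 7939\n"
)
print(negative_spread_gaps(sample))  # [(5500353, -20, -557)]
```

Run over the full table above, this flags only the row for contig 5500353 (gap to next -20 +- -557). That is the contig immediately before 29439904, the one named in the "negative gap variance" message.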
|
From: Ryo K. <koy...@oi...> - 2016-07-19 05:39:53
|
Dear all.

I have been running CA 8.3 with a hybrid dataset, error-corrected PacBio and 300bp paired illumina and got an assertion failed error from markRepeatUnique.C as follows:

Command Line options:
singleReadMaxCoverage 1.000000 lowCoverage 2 coverage 1.000000 fraction minReads 2 tooLong 4294967295 tooShort 1000
Loading fragment data.
Loading fragment information 10000000 out of 347729991
Loading fragment information 20000000 out of 347729991
Loading fragment information 30000000 out of 347729991
Loading fragment information 40000000 out of 347729991
Loading fragment information 50000000 out of 347729991
Loading fragment information 60000000 out of 347729991
Loading fragment information 70000000 out of 347729991
Loading fragment information 80000000 out of 347729991
Loading fragment information 90000000 out of 347729991
Loading fragment information 100000000 out of 347729991
Loading fragment information 110000000 out of 347729991
Loading fragment information 120000000 out of 347729991
Loading fragment information 130000000 out of 347729991
Loading fragment information 140000000 out of 347729991
Loading fragment information 150000000 out of 347729991
Loading fragment information 160000000 out of 347729991
Loading fragment information 170000000 out of 347729991
Loading fragment information 180000000 out of 347729991
Loading fragment information 190000000 out of 347729991
Loading fragment information 200000000 out of 347729991
Loading fragment information 210000000 out of 347729991
Loading fragment information 220000000 out of 347729991
Loading fragment information 230000000 out of 347729991
Loading fragment information 240000000 out of 347729991
Loading fragment information 250000000 out of 347729991
Loading fragment information 260000000 out of 347729991
Loading fragment information 270000000 out of 347729991
Loading fragment information 280000000 out of 347729991
Loading fragment information 290000000 out of 347729991
Loading fragment information 300000000 out of 347729991
Loading fragment information 310000000 out of 347729991
Loading fragment information 320000000 out of 347729991
Loading fragment information 330000000 out of 347729991
Loading fragment information 340000000 out of 347729991
Generating statistics.
markRepeatUnique: markRepeatUnique.C:352: int main(int, char**): Assertion `ma->data.num_frags == GetNumIntMultiPoss(ma->f_list)' failed.

Failed with 'Aborted'

Backtrace (mangled):

/work/.apps/unit/sqc_team/celera/wgs-8.3rc2/Linux-amd64/bin/markRepeatUnique(_Z17AS_UTL_catchCrashiP7siginfoPv+0x2a)[0x40a49a]
/lib64/libpthread.so.0[0x37dae0f710]
/lib64/libc.so.6(gsignal+0x35)[0x37da632925]
/lib64/libc.so.6(abort+0x175)[0x37da634105]
/lib64/libc.so.6[0x37da62ba4e]
/lib64/libc.so.6(__assert_perror_fail+0x0)[0x37da62bb10]
/work/.apps/unit/sqc_team/celera/wgs-8.3rc2/Linux-amd64/bin/markRepeatUnique(main+0x24cc)[0x40818c]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x37da61ed1d]
/work/.apps/unit/sqc_team/celera/wgs-8.3rc2/Linux-amd64/bin/markRepeatUnique[0x4083b9]

Backtrace (demangled):

[0] /work/.apps/unit/sqc_team/celera/wgs-8.3rc2/Linux-amd64/bin/markRepeatUnique::AS_UTL_catchCrash(int, siginfo*, void*) + 0x2a [0x40a49a]
[1] /lib64/libpthread.so.0() [0x37dae0f710]
[2] /lib64/libc.so.6::(null) + 0x35 [0x37da632925]
[3] /lib64/libc.so.6::(null) + 0x175 [0x37da634105]
[4] /lib64/libc.so.6() [0x37da62ba4e]
[5] /lib64/libc.so.6::(null) + 0 [0x37da62bb10]
[6] /work/.apps/unit/sqc_team/celera/wgs-8.3rc2/Linux-amd64/bin/markRepeatUnique::(null) + 0x24cc [0x40818c]
[7] /lib64/libc.so.6::(null) + 0xfd [0x37da61ed1d]
[8] /work/.apps/unit/sqc_team/celera/wgs-8.3rc2/Linux-amd64/bin/markRepeatUnique() [0x4083b9]

GDB:

Any comments or workarounds are very much appreciated. Thank you in advance.

Best regards,
Ryo Koyanagi
koy...@oi...
Okinawa Institute of Science and Technology, Japan
|
|
From: Brian W. <th...@gm...> - 2016-01-28 16:17:48
|
'CABOG' introduced support for a more sensitive overlap algorithm that was insensitive to homopolymer errors, the 'mer' overlapper. This did not scale much past microbial assemblies, and is NOT recommended for use. Aside from that, there was no explicit support for correcting homopolymer errors.

Using sffToCA to convert from .sff to CA's .frg format will enable all the 454 specific algorithms. Most of these deal with mate pairs.

b

On Thu, Jan 28, 2016 at 2:43 AM, Tamim Kabir <tam...@gm...> wrote:
> Dear Sir/Madam,
>
> Good day. We are assembling 454 (single + pair) end data for a plant
> genome by using CABOG. But, we are very much worried about homopolymer
> error of 454 data. How do we remove homopolymer error by using CABOG?
> What's parameter used I use for this purposes.
>
> Thanks in advance
> Shah Md Tamim Kabir
> Biotechnologist
> Dhaka, Bangladesh
>
> _______________________________________________
> wgs-assembler-users mailing list
> wgs...@li...
> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
|
|
From: Tamim K. <tam...@gm...> - 2016-01-28 07:43:57
|
Dear Sir/Madam,

Good day. We are assembling 454 (single + paired-end) data for a plant genome using CABOG, but we are very worried about the homopolymer errors in 454 data. How do we remove homopolymer errors using CABOG? Which parameter should I use for this purpose?

Thanks in advance,
Shah Md Tamim Kabir
Biotechnologist
Dhaka, Bangladesh
|
|
From: Serge K. <ser...@gm...> - 2016-01-09 19:21:01
|
It should be fine to run off grid, when the SGE_TASK_ID is not set it will use command-line input instead. It sounds like the overlap jobs cannot run on the system, maybe due to a missing or old JVM. The *.err files in 1-overlapper should have more information.

That said, I would suggest you switch to using Canu (canu.readthedocs.org) instead of CA for PacBio projects.

> On Jan 8, 2016, at 4:06 PM, Loke Kok Keong <kk...@uk...> wrote:
>
> Hi,
>
> My assembly with CA 8.3 is always stopped at the overlapstore building step. I checked with the asm.ovlStore.err, it just show the help screen of a command fail to look for its input file.
>
> I check my 1-overlapper folder, the .sh file is not running as there is no SGE_TASK_ID but my run is in local which I have set all rungrid=0.
>
> Can anyone advise what is wrong with my settings?
>
> Regards,
> KK
>
> --
> Loke Kok Keong
> PhD Candidate of Computational Systems Biology
> Centre for Bioinformatics Research (CBR)
> Institute of Systems Biology (INBIOSIS)
> Universiti Kebangsaan Malaysia (UKM)
> 43600 Bangi
> Selangor
> Malaysia
> HP: 0102266792
|
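The fallback described in this reply (use SGE_TASK_ID when the grid sets it, otherwise take the job index from the command line) follows a common pattern for array-job scripts. The sketch below illustrates that pattern only; it is not the code runCA generates. The function name `resolve_job_index` is hypothetical, and treating the literal value "undefined" as unset is an assumption about how Grid Engine behaves for non-array jobs.

```python
def resolve_job_index(argv, environ):
    """Return the job index from SGE_TASK_ID if set, else from the first CLI argument."""
    task_id = environ.get("SGE_TASK_ID")
    # Assumption: outside an array job the variable is absent, empty, or "undefined".
    if task_id and task_id != "undefined":
        return int(task_id)
    if len(argv) > 1:
        return int(argv[1])
    raise SystemExit("no job index: set SGE_TASK_ID or pass the index as the first argument")

# Local run: the index comes from the command line.
print(resolve_job_index(["overlap.sh", "3"], {}))                    # 3
# Grid run: the scheduler's task id wins.
print(resolve_job_index(["overlap.sh"], {"SGE_TASK_ID": "7"}))       # 7
```

With logic of this kind, a missing SGE_TASK_ID is not an error in itself when the script is given an index by hand, which is why the *.err files are the better place to look for the real failure.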
|
From: Loke K. K. <kk...@uk...> - 2016-01-09 00:35:51
|
Hi,
My assembly with CA 8.3 always stops at the overlapstore
building step. I checked asm.ovlStore.err; it just shows the help
screen of a command that failed to find its input file.
I checked my 1-overlapper folder: the .sh file is not running because there
is no SGE_TASK_ID, but my run is local and I have set all rungrid=0.
Can anyone advise what is wrong with my settings?
Regards,
KK
--
Loke Kok Keong
PhD Candidate of Computational Systems Biology
Centre for Bioinformatics Research (CBR)
Institute of Systems Biology (INBIOSIS)
Universiti Kebangsaan Malaysia (UKM)
43600 Bangi
Selangor
Malaysia
HP: 0102266792
|
|
From: Ray C. <rc...@ag...> - 2016-01-04 12:22:51
|
Dear All,
I am executing CA from Masurca, it seems to get stuck at the stage
"7-0-CGW" for 12 days. When I checked cgw.out, the last line says:
* AddScaffoldInferredEdges scaffolds = 0
I tried killing and restarting, then it got stuck at the same stage:
====> Reading ~/CA/7-0-CGW/genome.ckp.3 at Tue Dec 22 13:23:07 2015
* ScaffoldGraph Memory Report:
* numElements AllocatedElements AllocatedSize
VA 448 bytes; elements: 7 active; 7 allocated; 64 bytes; 'Dists'
VA 6373749984 bytes; elements: 66393229 active; 66393229 allocated; 96 bytes; 'CIFrags'
VA 3691841920 bytes; elements: 23074012 active; 23074012 allocated; 160 bytes; 'CIs '
VA 2730912384 bytes; elements: 28447004 active; 28447004 allocated; 96 bytes; 'CIEdges'
VA 3691841920 bytes; elements: 23074012 active; 23074012 allocated; 160 bytes; 'Contigs '
VA 2730912384 bytes; elements: 28447004 active; 28447004 allocated; 96 bytes; 'ConEdges'
VA 0 bytes; elements: 0 active; 0 allocated; 160 bytes; 'Scaffolds'
VA 0 bytes; elements: 0 active; 0 allocated; 96 bytes; 'SEdges'
* TotalMemorySize = 19219259040
Beginning CHECKPOINT_AFTER_CLEANING_SCAFFOLDS
* AddScaffoldInferredEdges scaffolds = 0
Does the log look strange? Has anyone seen this before?
Thanks!
Ray Cui
Max-Planck-Institut für Biologie des Alterns / Max Planck Institute for
Biology of Ageing
Wissenschaftlicher MA / Postdoctoral researcher
Office: Joseph-Stelzmann 9b, D-50931 Köln / Cologne
Postal address: Postfach 41 06 23, D-50866 Köln / Cologne
Tel.:+49 (0)221 496
Mobile: +49 0221 37970 496
rc...@ag...
www.age.mpg.de
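One thing that can be checked mechanically in a report like the one in this message is whether it is internally consistent: each VA row's byte count should equal allocated elements times element size, and the rows should sum to TotalMemorySize. The sketch below does that under the assumption that the report has the layout shown here; `VA_ROW`, `TOTAL` and `check_report` are illustrative names, not part of cgw. It says nothing about why the run stalls.

```python
import re

# One "VA" row; \s* tolerates a line wrap before the element size.
VA_ROW = re.compile(
    r"VA\s+(\d+) bytes; elements: (\d+) active; (\d+) allocated;\s*(\d+) bytes; '([^']*)'"
)
TOTAL = re.compile(r"TotalMemorySize = (\d+)")

def check_report(text):
    """Return (rows_consistent, sum_matches_total) for a ScaffoldGraph memory report."""
    rows = [tuple(int(g) for g in m.groups()[:4]) for m in VA_ROW.finditer(text)]
    rows_ok = all(size == allocated * elem for size, _active, allocated, elem in rows)
    total = int(TOTAL.search(text).group(1))
    return rows_ok, sum(r[0] for r in rows) == total

# The report as posted in this message.
report = """\
VA 448 bytes; elements: 7 active; 7 allocated;
64 bytes; 'Dists'
VA 6373749984 bytes; elements: 66393229 active; 66393229 allocated;
96 bytes; 'CIFrags'
VA 3691841920 bytes; elements: 23074012 active; 23074012 allocated;
160 bytes; 'CIs '
VA 2730912384 bytes; elements: 28447004 active; 28447004 allocated;
96 bytes; 'CIEdges'
VA 3691841920 bytes; elements: 23074012 active; 23074012 allocated;
160 bytes; 'Contigs '
VA 2730912384 bytes; elements: 28447004 active; 28447004 allocated;
96 bytes; 'ConEdges'
VA 0 bytes; elements: 0 active; 0 allocated;
160 bytes; 'Scaffolds'
VA 0 bytes; elements: 0 active; 0 allocated;
96 bytes; 'SEdges'
* TotalMemorySize = 19219259040
"""
print(check_report(report))  # (True, True)
```

Both checks pass for this report, so the numbers themselves add up; the zero-sized 'Scaffolds' and 'SEdges' rows match the "scaffolds = 0" line in the log.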
|
|
From: A. B. C. <ber...@gm...> - 2015-12-23 11:23:12
|
Hi Brian,
Thank you. It solved the problem in
cd kmer && make install && cd ..
but we got an analogous error in the next step:
cd src && make && cd ..
g++ -o
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/buildRefContigs
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/obj/buildRefContigs.o
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/lib/libCA.a
-L/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/lib
-L/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/lib
-L/home/bernardo/programs/wgs-8.3rc2/kmer/Linux-amd64/lib -L/usr/lib64
-L/usr/X11R6/lib64 -D_GLIBCXX_PARALLEL -fopenmp -pthread -lm -Wl,-O1
-rdynamic
* Make Target is all
############################## AS_RUN ##############################
############################## AS_UTL ##############################
############################## AS_UID ##############################
############################## AS_MSG ##############################
############################## AS_PER ##############################
############################## AS_GKP ##############################
############################## AS_OBT ##############################
############################## AS_MER ##############################
############################## AS_OVL ##############################
############################## AS_OVM ##############################
############################## AS_OVS ##############################
############################## AS_ALN ##############################
############################## AS_CGB ##############################
############################## AS_BOG ##############################
############################## AS_BAT ##############################
############################## AS_PBR ##############################
make[1]: execvp: ln: Too many levels of symbolic links
Makefile:160: recipe for target 'PBcR' failed
make[1]: *** [PBcR] Error 127
make[1]: *** Waiting for unfinished jobs....
make[1]: *** wait: No child processes. Stop.
Makefile:35: recipe for target 'all' failed
make: *** [all] Error 1
A. Bernardo Carvalho
Departamento de Genética
Universidade Federal do Rio de Janeiro
On 23 December 2015 at 04:14, Brian Walenz <th...@gm...> wrote:
> Hi-
>
> Edit kmer/Make.include to remove (or comment out) the two lines with
> 'atac-driver' and 'seatac' in them. These aren't used by the assembler,
> and this will make 'make' (ha, ha) not build them.
>
> b
>
>
>
> On Tue, Dec 22, 2015 at 12:11 PM, gabriel goldstein <
> gnr...@gm...> wrote:
>
>> Hello,
>>
>> My name is Gabriel Goldstein and I'm trying to install the assembler in a
>> Ubuntu 15.04 system, with gcc 4.9.2 and GNU Make 4.0.
>> However, whenever I go into wgs-8.3rc2/kmer/ and use the command make
>> install I get the following error:
>>
>> ln -f
>> /home/bernardo/programs/wgs-8.3rc2/kmer/atac-driver/alignOverlap/overlap-process.C
>> /home/bernardo/programs/wgs-8.3rc2/kmer/atac-driver/alignOverlap/overlap-process1.C
>> make: execvp: ln: Too many levels of symbolic links
>> atac-driver/alignOverlap/Make.include:29: recipe for target
>> 'atac-driver/alignOverlap/overlap-process1.C' failed
>> make: *** [atac-driver/alignOverlap/overlap-process1.C] Error 12
>>
>> I've tried using an older version of GNU Make (3.8), but the same problem
>> remains. Are there any known issues related to the new version of gcc?
>> This error message really tells me nothing; is it common?
>>
>> Thanks in advance for the attention,
>>
>> Gabriel Goldstein
>>
>>
|
|
From: Brian W. <th...@gm...> - 2015-12-23 06:14:20
|
Hi-

Edit kmer/Make.include to remove (or comment out) the two lines with 'atac-driver' and 'seatac' in them. These aren't used by the assembler, and this will make 'make' (ha, ha) not build them.

b

On Tue, Dec 22, 2015 at 12:11 PM, gabriel goldstein <gnr...@gm...> wrote:
> Hello,
>
> My name is Gabriel Goldstein and I'm trying to install the assembler in a
> Ubuntu 15.04 system, with gcc 4.9.2 and GNU Make 4.0.
> However, whenever I go into wgs-8.3rc2/kmer/ and use the command make
> install I get the following error:
>
> ln -f
> /home/bernardo/programs/wgs-8.3rc2/kmer/atac-driver/alignOverlap/overlap-process.C
> /home/bernardo/programs/wgs-8.3rc2/kmer/atac-driver/alignOverlap/overlap-process1.C
> make: execvp: ln: Too many levels of symbolic links
> atac-driver/alignOverlap/Make.include:29: recipe for target
> 'atac-driver/alignOverlap/overlap-process1.C' failed
> make: *** [atac-driver/alignOverlap/overlap-process1.C] Error 12
>
> I've tried using an older version of GNU Make (3.8), but the same problem
> remains. Are there any known issues related to the new version of gcc?
> This error message really tells me nothing; is it common?
>
> Thanks in advance for the attention,
>
> Gabriel Goldstein
|
|
From: Serge K. <ser...@gm...> - 2015-12-22 20:19:46
|
Hmm, I haven’t seen that error before. It sounds like one of the previous steps didn’t advance the unitig version (the versions advance as the unitigs are built: the first version is layouts, version 2 is consensus, etc.). I think you should be able to get your unitigs without trying to re-run the rest of the pipeline. If you run:
tigStore -g dros1nf/asm.gkpStore -t dros1nf/asm.tigStore 2 -U -d consensus -nreads 2 1000000 > asm.fasta
it should dump all the unitigs which have consensus called.
Sergey
> On Dec 14, 2015, at 3:24 PM, A. Bernardo Carvalho <ber...@gm...> wrote:
>
> Hi Serge,
> Thank you for your suggestion. I followed it, but got stopped by another error (below; probably at the unitigger). Please let me know if you have any other suggestion.
> best,
> Bernardo
>
> I issued the following commands:
>
> cd /draft1/bernardo1/drosophila
> rm dros1nf.fastq
> rm dros1nf.frg
> rm -fr dros1nf
> java -jar /home/bernardo/programs/convertFastaAndQualToFastq.jar dros1nf.fasta > dros1nf.fastq
> fastqToCA -libraryname dros1nf -technology pacbio-corrected -type sanger -reads dros1nf.fastq > dros1nf.frg
> runCA -s /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0 unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1 cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025 utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000 doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26 ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22 cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm" dros1nf.frg > dros1nf.out 2>&1
>
>
>
> OUTPUT:
> ...
>
> ----------------------------------------START Mon Dec 14 09:51:20 2015
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/bogart -O /draft1/bernardo1/drosophila/dros1nf/asm.ovlStore -G /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore -T /draft1/bernardo1/drosophila/dros1nf/asm.tigStore -B 4189 -eg 0.025 -Eg 0 -em 0.025 -Em 0 -RS -NS -CS -o /draft1/bernardo1/drosophila/dros1nf/4-unitigger/asm > /draft1/bernardo1/drosophila/dros1nf/4-unitigger/unitigger.err 2>&1
> ----------------------------------------END Mon Dec 14 09:52:38 2015 (78 seconds)
> ----------------------------------------START Mon Dec 14 09:52:38 2015
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/gatekeeper -P /draft1/bernardo1/drosophila/dros1nf/4-unitigger/asm.partitioning /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore > /draft1/bernardo1/drosophila/dros1nf/5-consensus/asm.partitioned.err 2>&1
> ----------------------------------------END Mon Dec 14 09:53:02 2015 (24 seconds)
> ----------------------------------------START CONCURRENT Mon Dec 14 09:53:02 2015
> /draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 1 > /dev/null 2>&1
> /draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 2 > /dev/null 2>&1
> ...
> /draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 67 > /dev/null 2>&1
> /draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 68 > /dev/null 2>&1
> /draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 69 > /dev/null 2>&1
> ----------------------------------------END CONCURRENT Mon Dec 14 17:52:36 2015 (28774 seconds)
> ----------------------------------------START Mon Dec 14 17:52:36 2015
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/tigStore -g /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore -t /draft1/bernardo1/drosophila/dros1nf/asm.tigStore 2 -N -R /draft1/bernardo1/drosophila/dros1nf/5-consensus/asm.fixes > asm.fixes.err 2>&1
> ----------------------------------------END Mon Dec 14 17:52:36 2015 (0 seconds)
> ----------------------------------------START Mon Dec 14 17:52:36 2015
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/tigStore \
> -g /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore \
> -t /draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/asm.tigStore 3 \
> -d matepair -U \
> > /draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/estimates.out 2>&1
> ----------------------------------------END Mon Dec 14 17:52:36 2015 (0 seconds)
> ERROR: Failed with signal HUP (1)
> ================================================================================
>
> runCA failed.
>
> ----------------------------------------
> Stack trace:
>
> at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin//runCA line 1628.
> main::caFailure("Insert size estimation failed", "/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes"...) called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin//runCA line 4814
> main::postUnitiggerConsensus() called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin//runCA line 6259
>
> ----------------------------------------
> Last few lines of the relevant log file (/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/estimates.out):
>
> MultiAlignStore::MultiAlignStore()-- ERROR, didn't find any unitigs or contigs in the store.
> MultiAlignStore::MultiAlignStore()-- asked for store '/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/asm.tigStore', correct?
> MultiAlignStore::MultiAlignStore()-- asked for version '3', correct?
> MultiAlignStore::MultiAlignStore()-- asked for partition unitig=0 contig=0, correct?
> MultiAlignStore::MultiAlignStore()-- asked for writable=0 inplace=0 append=0, correct?
>
> ----------------------------------------
> Failure message:
>
> Insert size estimation failed
>
>
>
> A. Bernardo Carvalho
>
> Departamento de Genética
> Universidade Federal do Rio de Janeiro
>
> On 12 December 2015 at 17:46, Serge Koren <ser...@gm...> wrote:
> Ah yes, it outputs multi-line fasta, which the previous version did not, and the code assumes one line per sequence, so it’s generating an invalid fastq file. If you take the dros1nf.fasta file, it should be valid. Convert it to a fastq with a fixed QV value, make a frg file, and re-run the last failed command.
>
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/fastqToCA -libraryname dros1nf -technology pacbio-corrected -type sanger -reads dros1nf.fastq > dros1nf.frg
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA -s /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0 unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1 cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025 utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000 doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26 ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22 cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm" dros1nf.frg
>
> Sergey
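The conversion step suggested here (multi-line FASTA to FASTQ with one fixed quality value) can be sketched in a few lines. This is an illustration, not the converter used in the thread: the later message uses convertFastaAndQualToFastq.jar, whose behaviour is not shown. The function names and the choice of `I` as the quality character are assumptions; `I` encodes Phred 40 under the Sanger offset of 33, which matches the `-type sanger` flag passed to fastqToCA.

```python
def _record(name, chunks, qv_char):
    """Format one FASTQ record with a constant quality string as long as the sequence."""
    seq = "".join(chunks)
    return f"@{name}\n{seq}\n+\n{qv_char * len(seq)}\n"

def fasta_to_fastq(lines, qv_char="I"):
    """Yield 4-line FASTQ records from FASTA lines, joining wrapped sequence lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield _record(name, chunks, qv_char)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)  # sequence wrapped over several lines is concatenated
    if name is not None:
        yield _record(name, chunks, qv_char)

# Made-up two-read example; read1 is wrapped across two lines.
example = [">read1\n", "ACGT\n", "AC\n", ">read2\n", "GG\n"]
print("".join(fasta_to_fastq(example)), end="")
```

Each sequence ends up on a single line, so every record is exactly four lines long, the layout whose absence triggered the "not a sequence start line" errors quoted further down.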
>
>> On Dec 12, 2015, at 1:43 PM, A. Bernardo Carvalho <ber...@gm...> wrote:
>>
>> Dear Sergey,
>> Thank you for your suggestion. I tried two times to use the falcon_sense program from canu inside the PBcR script, and got the same error in both attempts (error message copied below). It seems that the output of the new falcon_sense (from canu) is somehow incompatible with the PBcR script. Please let me know if you have any suggestion on how to proceed; if none, I will wait for the canu release.
>>
>> Yours,
>> Bernardo
>>
>>
>>
>>
>>
>> ********* Finished correcting 7200013631 bp (using 15743312583 bp).
>> ********* Assembling corrected sequences.
>> Assembling with average 52 (min frag 26) and using ovl is 40
>> ----------------------------------------START Fri Dec 11 19:16:41 2015
>> ln -sf dros1nf.frg dros1nf.longest25.frg
>> ----------------------------------------END Fri Dec 11 19:16:41 2015 (0 seconds)
>> ----------------------------------------START Fri Dec 11 19:16:41 2015
>> ln -sf dros1nf.fastq dros1nf.longest25.fastq
>> ----------------------------------------END Fri Dec 11 19:16:42 2015 (1 seconds)
>> ----------------------------------------START Fri Dec 11 19:16:42 2015
>> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA -s /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0 unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1 cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025 utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000 doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26 ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22 cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm" dros1nf.longest25.frg
>> ----------------------------------------START Fri Dec 11 19:16:42 2015
>> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/gatekeeper -o /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.BUILDING -T -F /draft1/bernardo1/drosophila/dros1nf.longest25.frg > /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err 2>&1
>> ----------------------------------------END Fri Dec 11 19:18:32 2015 (110 seconds)
>> ERROR: Failed with signal HUP (1)
>> ================================================================================
>>
>> runCA failed.
>>
>> ----------------------------------------
>> Stack trace:
>>
>> at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1628.
>> main::caFailure("gatekeeper failed", "/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err") called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1957
>> main::preoverlap("/draft1/bernardo1/drosophila/dros1nf.longest25.frg") called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 6250
>>
>> ----------------------------------------
>> Last few lines of the relevant log file (/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err):
>>
>>
>> Starting file '/draft1/bernardo1/drosophila/dros1nf.longest25.frg'.
>>
>> Processing SINGLE-ENDED SANGER QV encoding reads from:
>> '/draft1/bernardo1/drosophila//dros1nf.fastq'
>>
>>
>> GKP finished with 68766632 alerts or errors:
>> 68766632 # ILL Error: not a sequence start line.
>>
>> ERROR: library IID 1 'dros1nf' has 51263.29% errors or warnings.
>>
>> ----------------------------------------
>> Failure message:
>>
>> gatekeeper failed
>>
>>
>> A. Bernardo Carvalho
>>
>> Departamento de Genética
>> Universidade Federal do Rio de Janeiro
>>
>> On 4 December 2015 at 20:32, Serge Koren <ser...@gm...> wrote:
>> Hi,
>>
>> The issue is that PBDAGCON relies on BLASR libraries to do alignments in our implementation. For whatever reason, BLASR performance on D. melanogaster is extremely poor. Thus, PBDAGCON is very slow and I wouldn’t recommend running PBDAGCON on this genome unless you can run all the partitions in parallel on a grid environment.
>>
>> Also, we have a new version of the assembler, canu, which has an updated falcon_sense version which may work better for your assembly. You get the falcon_sense Linux binary here:
>> http://github.com/marbl/canu/blob/master/src/falcon_sense/falcon_sense.Linux-amd64.bin?raw=true <https://github.com/marbl/canu>
>> and just try replacing the version in CA 8.3 to see if it improves the Y assembly.
>>
>> Sergey
>>
>>> On Dec 1, 2015, at 8:31 AM, A. Bernardo Carvalho <ber...@gm... <mailto:ber...@gm...>> wrote:
>>>
>>> Hi,
>>> I noticed that while the Drosophila melanogaster MHAP assembly is very good in general, it has many gaps in single-copy Y-linked genes. I guess that this is caused by low coverage: the DNA came from males, and was assembled at 25x, which leaves the Y genes at 12.5x (theoretically). Furthermore, it seems that Y-linked reads are being lost during the first correction step (done by falcon-sense; I checked the uncorrected and the corrected reads).
>>>
>>> I am trying to fix these problems by increasing the coverage of the corrected reads used in the "post-correction" steps (by adding assembleCoverage=40 to the spec file, instead of the default 25x), and by forcing the use of pbdagcon instead of falcon-sense (by adding pbcns=1 to the spec file). The assembly with 40x and falcon-sense worked fine, but when I tried 40x with pbdagcon, the run seemed abnormally slow. Specifically, the machine I used is a Dell with 24 processors and 144 GB of RAM, and after 9 days it was still processing the first two partitions of runPartition.sh:
>>>
>>> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 1
>>> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 2
>>>
>>> I checked the runPartition.sh script, and it seems to use only 8 threads (instead of 24):
>>>
>>> cat /home3/users/bernardo/drosophila//tempdros10/runPartition.sh
>>>
>>> $bin/outputLayout \
>>> -L \
>>> -e 0.35 -M 1500 \
>>> -i /home3/users/bernardo/drosophila//tempdros10/asm \
>>> -o /home3/users/bernardo/drosophila//tempdros10/asm \
>>> -p $jobid \
>>> -l 500 \
>>> \
>>> -P \
>>> -G /home3/users/bernardo/drosophila//tempdros10/asm.gkpStore \
>>> 2> /home3/users/bernardo/drosophila//tempdros10/$jobid.lay.err | $bin/convertToPBCNS -consensus pbdagcon -path /home3/users/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/ -output /home3/users/bernardo/drosophila//tempdros10/$jobid.fasta -prefix /home3/users/bernardo/drosophila//tempdros10/$jobid.tmp -length 500 -coverage 4 -threads 8 > /home3/users/bernardo/drosophila//tempdros10/$jobid.err 2>&1 && touch
>>>
>>> In this particular run I did not specify cnsConcurrency or consensusConcurrency in the spec file (so PBcR chose the values; I only set threads=20), but in another run I added
>>> cnsConcurrency=20
>>> consensusConcurrency=20
>>> to the spec file, and again, in 10 days it processed only 3 of the 200 partitions.
>>>
>>> I previously tried the ecoli 30x and the yeast data, and both worked fine with pbdagcon (although slower than falcon-sense). Is there some limitation on using pbdagcon with higher-coverage data? Is the -threads 8 option of the convertToPBCNS program correct?
>>>
>>> Thanks,
>>> Bernardo
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> A. Bernardo Carvalho
>>>
>>> Departamento de Genética
>>> Universidade Federal do Rio de Janeiro
>>> ------------------------------------------------------------------------------
>>> Go from Idea to Many App Stores Faster with Intel(R) XDK
>>> Give your users amazing mobile app experiences with Intel(R) XDK.
>>> Use one codebase in this all-in-one HTML5 development environment.
>>> Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
>>> http://pubads.g.doubleclick.net/gampad/clk?id=254741911&iu=/4140
>>> _______________________________________________
>>> wgs-assembler-users mailing list
>>> wgs...@li... <mailto:wgs...@li...>
>>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users <https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users>
>>
>>
>
>
|
|
From: gabriel g. <gnr...@gm...> - 2015-12-22 17:11:55
|
Hello,
My name is Gabriel Goldstein and I'm trying to install the assembler on an Ubuntu 15.04 system, with gcc 4.9.2 and GNU Make 4.0. However, whenever I go into wgs-8.3rc2/kmer/ and run make install, I get the following error:
ln -f /home/bernardo/programs/wgs-8.3rc2/kmer/atac-driver/alignOverlap/overlap-process.C /home/bernardo/programs/wgs-8.3rc2/kmer/atac-driver/alignOverlap/overlap-process1.C
make: execvp: ln: Too many levels of symbolic links
atac-driver/alignOverlap/Make.include:29: recipe for target 'atac-driver/alignOverlap/overlap-process1.C' failed
make: *** [atac-driver/alignOverlap/overlap-process1.C] Error 12
I've tried using an older version of GNU Make (3.8), but the same problem remains. Are there any known issues with the new version of gcc? The error message tells me very little; is it common?
Thanks in advance for your attention,
Gabriel Goldstein
|
|
From: A. B. C. <ber...@gm...> - 2015-12-14 20:25:32
|
Hi Serge,
Thank you for your suggestion. I followed it, but got stopped by another
error (below; probably at the unitigger). Please let me know if you have
any other suggestions.
best,
Bernardo
I issued the following commands:
cd /draft1/bernardo1/drosophila
rm dros1nf.fastq
rm dros1nf.frg
rm -fr dros1nf
java -jar /home/bernardo/programs/convertFastaAndQualToFastq.jar
dros1nf.fasta > dros1nf.fastq
fastqToCA -libraryname dros1nf -technology pacbio-corrected -type sanger
-reads dros1nf.fastq > dros1nf.frg
runCA -s /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d
dros1nf ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0
scriptOnGrid=0 unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025
cgwErrorRate=0.1 cnsErrorRate=0.1 utgGraphErrorLimit=0
utgGraphErrorRate=0.025 utgMergeErrorLimit=0 utgMergeErrorRate=0.025
frgCorrBatchSize=100000 doOverlapBasedTrimming=1 obtErrorRate=0.03
obtErrorLimit=4.5 frgMinLen=26 ovlMinLen=40 "batOptions=-RS -NS -CS"
consensus=pbutgcns merSize=22 cnsMaxCoverage=1 cnsReuseUnitigs=1
gridEnginePropagateHold="pBcR_asm" dros1nf.frg > dros1nf.out 2>&1
OUTPUT:
...
----------------------------------------START Mon Dec 14 09:51:20 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/bogart -O
/draft1/bernardo1/drosophila/dros1nf/asm.ovlStore -G
/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore -T
/draft1/bernardo1/drosophila/dros1nf/asm.tigStore -B 4189 -eg 0.025 -Eg
0 -em 0.025 -Em 0 -RS -NS -CS -o
/draft1/bernardo1/drosophila/dros1nf/4-unitigger/asm >
/draft1/bernardo1/drosophila/dros1nf/4-unitigger/unitigger.err 2>&1
----------------------------------------END Mon Dec 14 09:52:38 2015 (78
seconds)
----------------------------------------START Mon Dec 14 09:52:38 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/gatekeeper -P
/draft1/bernardo1/drosophila/dros1nf/4-unitigger/asm.partitioning
/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore >
/draft1/bernardo1/drosophila/dros1nf/5-consensus/asm.partitioned.err 2>&1
----------------------------------------END Mon Dec 14 09:53:02 2015 (24
seconds)
----------------------------------------START CONCURRENT Mon Dec 14
09:53:02 2015
/draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 1 > /dev/null
2>&1
/draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 2 > /dev/null
2>&1
...
/draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 67 >
/dev/null 2>&1
/draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 68 >
/dev/null 2>&1
/draft1/bernardo1/drosophila/dros1nf/5-consensus/consensus.sh 69 >
/dev/null 2>&1
----------------------------------------END CONCURRENT Mon Dec 14 17:52:36
2015 (28774 seconds)
----------------------------------------START Mon Dec 14 17:52:36 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/tigStore -g
/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore -t
/draft1/bernardo1/drosophila/dros1nf/asm.tigStore 2 -N -R
/draft1/bernardo1/drosophila/dros1nf/5-consensus/asm.fixes > asm.fixes.err
2>&1
----------------------------------------END Mon Dec 14 17:52:36 2015 (0
seconds)
----------------------------------------START Mon Dec 14 17:52:36 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/tigStore \
-g /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore \
-t
/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/asm.tigStore
3 \
-d matepair -U \
>
/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/estimates.out
2>&1
----------------------------------------END Mon Dec 14 17:52:36 2015 (0
seconds)
ERROR: Failed with signal HUP (1)
================================================================================
runCA failed.
----------------------------------------
Stack trace:
at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin//runCA line 1628.
main::caFailure("Insert size estimation failed",
"/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes"...) called
at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin//runCA line 4814
main::postUnitiggerConsensus() called at
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin//runCA line 6259
----------------------------------------
Last few lines of the relevant log file
(/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/estimates.out):
MultiAlignStore::MultiAlignStore()-- ERROR, didn't find any unitigs or
contigs in the store.
MultiAlignStore::MultiAlignStore()-- asked for store
'/draft1/bernardo1/drosophila/dros1nf/5-consensus-insert-sizes/asm.tigStore',
correct?
MultiAlignStore::MultiAlignStore()-- asked for version '3', correct?
MultiAlignStore::MultiAlignStore()-- asked for partition unitig=0
contig=0, correct?
MultiAlignStore::MultiAlignStore()-- asked for writable=0 inplace=0
append=0, correct?
----------------------------------------
Failure message:
Insert size estimation failed
A. Bernardo Carvalho
Departamento de Genética
Universidade Federal do Rio de Janeiro
On 12 December 2015 at 17:46, Serge Koren <ser...@gm...> wrote:
> Ah yes, the new falcon_sense outputs multi-line fasta, which the previous
> version did not. The code assumes one line per sequence, so it is generating
> an invalid fastq file. The dros1nf.fasta file itself should be valid:
> convert it to a fastq with a fixed QV value, make a frg file, and
> re-run the last failed command.
>
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/fastqToCA -libraryname
> dros1nf -technology pacbio-corrected -type sanger -reads dros1nf.fastq >
> dros1nf.frg
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA -s
> /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf
> ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0
> unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1
> cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025
> utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000
> doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26
> ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22
> cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm"
> dros1nf.frg
>
> Sergey
>
> On Dec 12, 2015, at 1:43 PM, A. Bernardo Carvalho <ber...@gm...>
> wrote:
>
> Dear Sergey,
> Thank you for your suggestion. I tried twice to use the falcon_sense
> program from canu inside the PBcR script, and got the same error in both
> attempts (error message copied below). It seems that the output of the new
> falcon_sense (from canu) is somehow incompatible with the PBcR script.
> Please let me know if you have any suggestions on how to proceed; if not,
> I will wait for the canu release.
>
> Yours,
> Bernardo
>
>
>
>
>
> ********* Finished correcting 7200013631 bp (using 15743312583 bp).
> ********* Assembling corrected sequences.
> Assembling with average 52 (min frag 26) and using ovl is 40
> ----------------------------------------START Fri Dec 11 19:16:41 2015
> ln -sf dros1nf.frg dros1nf.longest25.frg
> ----------------------------------------END Fri Dec 11 19:16:41 2015 (0
> seconds)
> ----------------------------------------START Fri Dec 11 19:16:41 2015
> ln -sf dros1nf.fastq dros1nf.longest25.fastq
> ----------------------------------------END Fri Dec 11 19:16:42 2015 (1
> seconds)
> ----------------------------------------START Fri Dec 11 19:16:42 2015
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA -s
> /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf
> ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0
> unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1
> cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025
> utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000
> doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26
> ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22
> cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm"
> dros1nf.longest25.frg
> ----------------------------------------START Fri Dec 11 19:16:42 2015
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/gatekeeper -o
> /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.BUILDING -T -F
> /draft1/bernardo1/drosophila/dros1nf.longest25.frg >
> /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err 2>&1
> ----------------------------------------END Fri Dec 11 19:18:32 2015 (110
> seconds)
> ERROR: Failed with signal HUP (1)
>
> ================================================================================
>
> runCA failed.
>
> ----------------------------------------
> Stack trace:
>
> at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1628.
> main::caFailure("gatekeeper failed",
> "/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err") called at
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1957
>
> main::preoverlap("/draft1/bernardo1/drosophila/dros1nf.longest25.frg")
> called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 6250
>
> ----------------------------------------
> Last few lines of the relevant log file
> (/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err):
>
>
> Starting file '/draft1/bernardo1/drosophila/dros1nf.longest25.frg'.
>
> Processing SINGLE-ENDED SANGER QV encoding reads from:
> '/draft1/bernardo1/drosophila//dros1nf.fastq'
>
>
> GKP finished with 68766632 alerts or errors:
> 68766632 # ILL Error: not a sequence start line.
>
> ERROR: library IID 1 'dros1nf' has 51263.29% errors or warnings.
>
> ----------------------------------------
> Failure message:
>
> gatekeeper failed
>
>
> A. Bernardo Carvalho
>
> Departamento de Genética
> Universidade Federal do Rio de Janeiro
>
> On 4 December 2015 at 20:32, Serge Koren <ser...@gm...> wrote:
>
>> Hi,
>>
>> The issue is that PBDAGCON relies on BLASR libraries to do alignments in
>> our implementation. For whatever reason, BLASR performance on D.
>> melanogaster is extremely poor. Thus, PBDAGCON is very slow and I wouldn’t
>> recommend running PBDAGCON on this genome unless you can run all the
>> partitions in parallel on a grid environment.
>>
>> Also, we have a new version of the assembler, canu, which has an updated
>> falcon_sense version which may work better for your assembly. You get the
>> falcon_sense Linux binary here:
>>
>> http://github.com/marbl/canu/blob/master/src/falcon_sense/falcon_sense.Linux-amd64.bin?raw=true
>> <https://github.com/marbl/canu>
>> and just try replacing the version in CA 8.3 to see if it improves the Y
>> assembly.
>>
>> Sergey
>>
>> On Dec 1, 2015, at 8:31 AM, A. Bernardo Carvalho <ber...@gm...>
>> wrote:
>>
>> Hi,
>> I noticed that while the Drosophila melanogaster MHAP assembly is very
>> good in general, it has many gaps in single-copy Y-linked genes. I guess
>> that this is caused by low coverage: the DNA came from males, and was
>> assembled at 25x, which leaves the Y genes at 12.5x (theoretically).
>> Furthermore, it seems that Y-linked reads are being lost during the first
>> correction step (done by falcon-sense; I checked the uncorrected and the
>> corrected reads).
>>
>> I am trying to fix these problems by increasing the coverage of the
>> corrected reads used in the "post-correction" steps (by adding
>> assembleCoverage=40 to the spec file, instead of the default 25x), and
>> by forcing the use of pbdagcon instead of falcon-sense (by adding pbcns=1
>> to the spec file). The assembly with 40x and falcon-sense worked fine,
>> but when I tried 40x with pbdagcon, the run seemed abnormally slow.
>> Specifically, the machine I used is a Dell with 24 processors and 144 GB
>> of RAM, and after 9 days it was still processing the first two partitions
>> of runPartition.sh:
>>
>> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 1
>> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 2
>>
>> I checked the runPartition.sh script, and it seems to use only 8 threads
>> (instead of 24):
>>
>> cat /home3/users/bernardo/drosophila//tempdros10/runPartition.sh
>>
>> $bin/outputLayout \
>> -L \
>> -e 0.35 -M 1500 \
>> -i /home3/users/bernardo/drosophila//tempdros10/asm \
>> -o /home3/users/bernardo/drosophila//tempdros10/asm \
>> -p $jobid \
>> -l 500 \
>> \
>> -P \
>> -G /home3/users/bernardo/drosophila//tempdros10/asm.gkpStore \
>> 2> /home3/users/bernardo/drosophila//tempdros10/$jobid.lay.err |
>> $bin/convertToPBCNS -consensus pbdagcon -path
>> /home3/users/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/ -output
>> /home3/users/bernardo/drosophila//tempdros10/$jobid.fasta -prefix
>> /home3/users/bernardo/drosophila//tempdros10/$jobid.tmp -length 500
>> -coverage 4 -threads 8 >
>> /home3/users/bernardo/drosophila//tempdros10/$jobid.err 2>&1 && touch
>>
>> In this particular run I did not specify cnsConcurrency or
>> consensusConcurrency in the spec file (so PBcR chose the values; I
>> only set threads=20), but in another run I added
>> cnsConcurrency=20
>> consensusConcurrency=20
>> to the spec file, and again, in 10 days it processed only 3 of the 200
>> partitions.
>>
>> I previously tried the ecoli 30x and the yeast data, and both worked fine
>> with pbdagcon (although slower than falcon-sense). Is there some
>> limitation on using pbdagcon with higher-coverage data? Is the -threads 8
>> option of the convertToPBCNS program correct?
>>
>> Thanks,
>> Bernardo
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> A. Bernardo Carvalho
>>
>> Departamento de Genética
>> Universidade Federal do Rio de Janeiro
>>
>>
>>
>>
>
>
|
|
From: Serge K. <ser...@gm...> - 2015-12-12 19:43:26
|
Ah yes, the new falcon_sense outputs multi-line fasta, which the previous version did not. The code assumes one line per sequence, so it is generating an invalid fastq file. The dros1nf.fasta file itself should be valid: convert it to a fastq with a fixed QV value, make a frg file, and re-run the last failed command.
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/fastqToCA -libraryname dros1nf -technology pacbio-corrected -type sanger -reads dros1nf.fastq > dros1nf.frg
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA -s /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0 unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1 cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025 utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000 doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26 ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22 cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm" dros1nf.frg
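For the "convert it to a fastq with a fixed QV value" step, the following is a minimal sketch in Python rather than any tool shipped with the assembler. It joins wrapped FASTA sequence lines and writes one four-line FASTQ record per read. The function names and the default QV of 20 are assumptions chosen for illustration; the thread does not say which QV to use.

```python
import sys


def read_fasta(handle):
    """Yield (name, sequence) pairs, joining wrapped (multi-line) sequences."""
    name, chunks = None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line and name is not None:
            chunks.append(line.strip())
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_fastq(fasta_handle, fastq_handle, qv=20):
    """Write strict four-line FASTQ records with a constant quality value.

    qv=20 is an arbitrary placeholder (assumption), encoded as Sanger Phred+33.
    Returns the number of records written; empty sequences are skipped.
    """
    qual_char = chr(qv + 33)
    count = 0
    for name, seq in read_fasta(fasta_handle):
        if not seq:
            continue
        fastq_handle.write(f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n")
        count += 1
    return count


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python fasta_to_fastq.py dros1nf.fasta dros1nf.fastq
    with open(sys.argv[1]) as fin, open(sys.argv[2], "w") as fout:
        written = fasta_to_fastq(fin, fout)
    print(f"wrote {written} records", file=sys.stderr)
```

Because every sequence is written on a single line, the resulting file should have exactly four lines per read, which is the layout the failing step appears to expect.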
Sergey
> On Dec 12, 2015, at 1:43 PM, A. Bernardo Carvalho <ber...@gm...> wrote:
>
> Dear Sergey,
> Thank you for your suggestion. I tried twice to use the falcon_sense program from canu inside the PBcR script, and got the same error in both attempts (error message copied below). It seems that the output of the new falcon_sense (from canu) is somehow incompatible with the PBcR script. Please let me know if you have any suggestions on how to proceed; if not, I will wait for the canu release.
>
> Yours,
> Bernardo
>
>
>
>
>
> ********* Finished correcting 7200013631 bp (using 15743312583 bp).
> ********* Assembling corrected sequences.
> Assembling with average 52 (min frag 26) and using ovl is 40
> ----------------------------------------START Fri Dec 11 19:16:41 2015
> ln -sf dros1nf.frg dros1nf.longest25.frg
> ----------------------------------------END Fri Dec 11 19:16:41 2015 (0 seconds)
> ----------------------------------------START Fri Dec 11 19:16:41 2015
> ln -sf dros1nf.fastq dros1nf.longest25.fastq
> ----------------------------------------END Fri Dec 11 19:16:42 2015 (1 seconds)
> ----------------------------------------START Fri Dec 11 19:16:42 2015
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA -s /draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0 unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1 cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025 utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000 doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26 ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22 cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm" dros1nf.longest25.frg
> ----------------------------------------START Fri Dec 11 19:16:42 2015
> /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/gatekeeper -o /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.BUILDING -T -F /draft1/bernardo1/drosophila/dros1nf.longest25.frg > /draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err 2>&1
> ----------------------------------------END Fri Dec 11 19:18:32 2015 (110 seconds)
> ERROR: Failed with signal HUP (1)
> ================================================================================
>
> runCA failed.
>
> ----------------------------------------
> Stack trace:
>
> at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1628.
> main::caFailure("gatekeeper failed", "/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err") called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1957
> main::preoverlap("/draft1/bernardo1/drosophila/dros1nf.longest25.frg") called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 6250
>
> ----------------------------------------
> Last few lines of the relevant log file (/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err):
>
>
> Starting file '/draft1/bernardo1/drosophila/dros1nf.longest25.frg'.
>
> Processing SINGLE-ENDED SANGER QV encoding reads from:
> '/draft1/bernardo1/drosophila//dros1nf.fastq'
>
>
> GKP finished with 68766632 alerts or errors:
> 68766632 # ILL Error: not a sequence start line.
>
> ERROR: library IID 1 'dros1nf' has 51263.29% errors or warnings.
>
> ----------------------------------------
> Failure message:
>
> gatekeeper failed
>
>
> A. Bernardo Carvalho
>
> Departamento de Genética
> Universidade Federal do Rio de Janeiro
>
> On 4 December 2015 at 20:32, Serge Koren <ser...@gm... <mailto:ser...@gm...>> wrote:
> Hi,
>
> The issue is that PBDAGCON relies on BLASR libraries to do alignments in our implementation. For whatever reason, BLASR performance on D. melanogaster is extremely poor. Thus, PBDAGCON is very slow and I wouldn’t recommend running PBDAGCON on this genome unless you can run all the partitions in parallel on a grid environment.
>
> Also, we have a new version of the assembler, canu, which has an updated falcon_sense version which may work better for your assembly. You get the falcon_sense Linux binary here:
> http://github.com/marbl/canu/blob/master/src/falcon_sense/falcon_sense.Linux-amd64.bin?raw=true <https://github.com/marbl/canu>
> and just try replacing the version in CA 8.3 to see if it improves the Y assembly.
>
> Sergey
>
>> On Dec 1, 2015, at 8:31 AM, A. Bernardo Carvalho <ber...@gm... <mailto:ber...@gm...>> wrote:
>>
>> Hi,
>> I noticed that while the Drosophila melanogaster MHAP assembly is very good in general, it has many gaps in single-copy Y-linked genes. I guess that this is caused by low coverage: the DNA came from males, and was assembled at 25x, which leaves the Y genes at 12.5x (theoretically). Furthermore, it seems that Y-linked reads are being lost during the first correction step (done by falcon-sense; I checked the uncorrected and the corrected reads).
>>
>> I am trying to fix these problems by increasing the coverage of the corrected reads used in the "post-correction" steps (by adding assembleCoverage=40 to the spec file, instead of the default 25x), and by forcing the use of pbdagcon instead of falcon-sense (by adding pbcns=1 to the spec file). The assembly with 40x and falcon-sense worked fine, but when I tried 40x with pbdagcon, the run seemed abnormally slow. Specifically, the machine I used is a Dell with 24 processors and 144 GB of RAM, and after 9 days it was still processing the first two partitions of runPartition.sh:
>>
>> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 1
>> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 2
>>
>> I checked the runPartition.sh script, and it seems to use only 8 threads (instead of 24):
>>
>> cat /home3/users/bernardo/drosophila//tempdros10/runPartition.sh
>>
>> $bin/outputLayout \
>> -L \
>> -e 0.35 -M 1500 \
>> -i /home3/users/bernardo/drosophila//tempdros10/asm \
>> -o /home3/users/bernardo/drosophila//tempdros10/asm \
>> -p $jobid \
>> -l 500 \
>> \
>> -P \
>> -G /home3/users/bernardo/drosophila//tempdros10/asm.gkpStore \
>> 2> /home3/users/bernardo/drosophila//tempdros10/$jobid.lay.err | $bin/convertToPBCNS -consensus pbdagcon -path /home3/users/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/ -output /home3/users/bernardo/drosophila//tempdros10/$jobid.fasta -prefix /home3/users/bernardo/drosophila//tempdros10/$jobid.tmp -length 500 -coverage 4 -threads 8 > /home3/users/bernardo/drosophila//tempdros10/$jobid.err 2>&1 && touch
>>
>> In this particular run I did not specify cnsConcurrency or consensusConcurrency in the spec file (so PBcR chose the values; I only set threads=20), but in another run I added
>> cnsConcurrency=20
>> consensusConcurrency=20
>> to the spec file, and again, in 10 days it processed only 3 of the 200 partitions.
>>
>> I previously tried the ecoli 30x and the yeast data, and both worked fine with pbdagcon (although slower than falcon-sense). Is there some limitation on using pbdagcon with higher-coverage data? Is the -threads 8 option of the convertToPBCNS program correct?
>>
>> Thanks,
>> Bernardo
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> A. Bernardo Carvalho
>>
>> Departamento de Genética
>> Universidade Federal do Rio de Janeiro
>
>
|
|
From: A. B. C. <ber...@gm...> - 2015-12-12 18:43:59
|
Dear Sergey,
Thank you for your suggestion. I tried twice to use the falcon_sense
program from canu inside the PBcR script, and got the same error in both
attempts (error message copied below). It seems that the output of the new
falcon_sense (from canu) is somehow incompatible with the PBcR script.
Please let me know if you have any suggestions on how to proceed; if not,
I will wait for the canu release.
Yours,
Bernardo
********* Finished correcting 7200013631 bp (using 15743312583 bp).
********* Assembling corrected sequences.
Assembling with average 52 (min frag 26) and using ovl is 40
----------------------------------------START Fri Dec 11 19:16:41 2015
ln -sf dros1nf.frg dros1nf.longest25.frg
----------------------------------------END Fri Dec 11 19:16:41 2015 (0
seconds)
----------------------------------------START Fri Dec 11 19:16:41 2015
ln -sf dros1nf.fastq dros1nf.longest25.fastq
----------------------------------------END Fri Dec 11 19:16:42 2015 (1
seconds)
----------------------------------------START Fri Dec 11 19:16:42 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA -s
/draft1/bernardo1/drosophila//tempdros1nf/dros1nf.spec -p asm -d dros1nf
ovlRefBlockLength=100000000000 ovlRefBlockSize=0 useGrid=0 scriptOnGrid=0
unitigger=bogart ovlErrorRate=0.03 utgErrorRate=0.025 cgwErrorRate=0.1
cnsErrorRate=0.1 utgGraphErrorLimit=0 utgGraphErrorRate=0.025
utgMergeErrorLimit=0 utgMergeErrorRate=0.025 frgCorrBatchSize=100000
doOverlapBasedTrimming=1 obtErrorRate=0.03 obtErrorLimit=4.5 frgMinLen=26
ovlMinLen=40 "batOptions=-RS -NS -CS" consensus=pbutgcns merSize=22
cnsMaxCoverage=1 cnsReuseUnitigs=1 gridEnginePropagateHold="pBcR_asm"
dros1nf.longest25.frg
----------------------------------------START Fri Dec 11 19:16:42 2015
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/gatekeeper -o
/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.BUILDING -T -F
/draft1/bernardo1/drosophila/dros1nf.longest25.frg >
/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err 2>&1
----------------------------------------END Fri Dec 11 19:18:32 2015 (110
seconds)
ERROR: Failed with signal HUP (1)
================================================================================
runCA failed.
----------------------------------------
Stack trace:
at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1628.
main::caFailure("gatekeeper failed",
"/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err") called at
/home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 1957
main::preoverlap("/draft1/bernardo1/drosophila/dros1nf.longest25.frg")
called at /home/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/runCA line 6250
----------------------------------------
Last few lines of the relevant log file
(/draft1/bernardo1/drosophila/dros1nf/asm.gkpStore.err):
Starting file '/draft1/bernardo1/drosophila/dros1nf.longest25.frg'.
Processing SINGLE-ENDED SANGER QV encoding reads from:
'/draft1/bernardo1/drosophila//dros1nf.fastq'
GKP finished with 68766632 alerts or errors:
68766632 # ILL Error: not a sequence start line.
ERROR: library IID 1 'dros1nf' has 51263.29% errors or warnings.
----------------------------------------
Failure message:
gatekeeper failed
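One way to confirm that a fastq file is structurally broken before handing it to gatekeeper is to check that it follows a strict four-lines-per-record layout. The sketch below is an independent diagnostic in Python, not gatekeeper's own validation logic. The function name check_fastq and the max_report parameter are hypothetical, and it assumes the pipeline expects unwrapped records.

```python
import itertools


def check_fastq(handle, max_report=5):
    """Check a FASTQ stream for strict four-line records.

    Returns (good_records, problems), where problems is a list of
    (line_number, message) tuples. Stops after max_report problems.
    """
    records, problems = 0, []
    line_no = 0
    while True:
        block = [line.rstrip("\r\n") for line in itertools.islice(handle, 4)]
        if not block:
            break
        first = line_no + 1
        line_no += len(block)
        if len(block) < 4:
            problems.append((first, "truncated record"))
            break
        header, seq, plus, qual = block
        if not header.startswith("@"):
            problems.append((first, "header does not start with '@'"))
        elif not plus.startswith("+"):
            problems.append((first + 2, "separator does not start with '+'"))
        elif len(seq) != len(qual):
            problems.append((first + 1, "sequence and quality lengths differ"))
        else:
            records += 1
        if len(problems) >= max_report:
            break
    return records, problems


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:
        # Hypothetical usage: python check_fastq.py dros1nf.fastq
        with open(sys.argv[1]) as fin:
            good, bad = check_fastq(fin)
        print(f"{good} well-formed records before stopping")
        for number, message in bad:
            print(f"line {number}: {message}")
```

A file with wrapped sequence lines fails at the first record, where the line expected to be the '+' separator is a continuation of the sequence.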
A. Bernardo Carvalho
Departamento de Genética
Universidade Federal do Rio de Janeiro
On 4 December 2015 at 20:32, Serge Koren <ser...@gm...> wrote:
> Hi,
>
> The issue is that PBDAGCON relies on BLASR libraries to do alignments in
> our implementation. For whatever reason, BLASR performance on D.
> melanogaster is extremely poor. Thus, PBDAGCON is very slow and I wouldn’t
> recommend running PBDAGCON on this genome unless you can run all the
> partitions in parallel on a grid environment.
>
> Also, we have a new version of the assembler, canu, which has an updated
> falcon_sense version which may work better for your assembly. You get the
> falcon_sense Linux binary here:
>
> http://github.com/marbl/canu/blob/master/src/falcon_sense/falcon_sense.Linux-amd64.bin?raw=true
> <https://github.com/marbl/canu>
> and just try replacing the version in CA 8.3 to see if it improves the Y
> assembly.
>
> Sergey
>
> On Dec 1, 2015, at 8:31 AM, A. Bernardo Carvalho <ber...@gm...>
> wrote:
>
> Hi,
> I noticed that while the Drosophila melanogaster MHAP assembly is very
> good in general, it has many gaps in single-copy Y-linked genes. I guess
> that this is caused by low coverage: the DNA came from males, and was
> assembled at 25x, which leaves the Y genes at 12.5x (theoretically).
> Furthermore, it seems that Y-linked reads are being lost during the first
> correction step (done by falcon-sense; I checked the uncorrected and the
> corrected reads).
>
> I am trying to fix these problems by increasing the coverage of the
> corrected reads used in the "post-correction" steps (by adding
> assembleCoverage=40 in the spec file ; instead of the default 25x) , and
> by forcing the use of pbdagcon instead of falcon-sense (by adding pbcns=1
> in the spec file). The assembly with 40x and falcon-sense worked fine ,
> but when I tried 40x with pbdagcon , the run seems to be abnormally slow.
> Specifically, the machine I used is a Dell with 24 processors / 144 Gb RAM,
> and after 9 days running it was still processing the first two partitions
> of runPartition.sh
>
> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 1
> # /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 2
>
> I checked the runPartition.sh script, and it seems to use only 8 threads
> (instead of 24):
>
> cat /home3/users/bernardo/drosophila//tempdros10/runPartition.sh
>
> $bin/outputLayout \
> -L \
> -e 0.35 -M 1500 \
> -i /home3/users/bernardo/drosophila//tempdros10/asm \
> -o /home3/users/bernardo/drosophila//tempdros10/asm \
> -p $jobid \
> -l 500 \
> \
> -P \
> -G /home3/users/bernardo/drosophila//tempdros10/asm.gkpStore \
> 2> /home3/users/bernardo/drosophila//tempdros10/$jobid.lay.err |
> $bin/convertToPBCNS -consensus pbdagcon -path
> /home3/users/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/ -output
> /home3/users/bernardo/drosophila//tempdros10/$jobid.fasta -prefix
> /home3/users/bernardo/drosophila//tempdros10/$jobid.tmp -length 500
> -coverage 4 -threads 8 >
> /home3/users/bernardo/drosophila//tempdros10/$jobid.err 2>&1 && touch
>
> In this particular run I have not specified cnsConcurrency or
> consensusConcurrency in the spec file (so the PBcR choose the values; I
> only set threads=20 ), but in another run I added cnsConcurrency=20
> consensusConcurrency=20
> to the spec file, and again in 10 days it processed only 3 of the 200
> partitions.
>
> I tried before the ecoli 30x and the yeast data, and both worked fine with
> pbdagcon (although slower than falcon-sense). Are there some limitation to
> use pbdagcon with higher coverage data? Is the -threads 8 option of the
> convertToPBCNS program correct?
>
> Thanks,
> Bernardo
>
> A. Bernardo Carvalho
>
> Departamento de Genética
> Universidade Federal do Rio de Janeiro
|
|
From: Serge K. <ser...@gm...> - 2015-12-04 22:30:17
|
Hi,
The issue is that PBDAGCON relies on the BLASR libraries to do alignments in our implementation. For whatever reason, BLASR performance on D. melanogaster is extremely poor. As a result, PBDAGCON is very slow, and I wouldn’t recommend running it on this genome unless you can run all the partitions in parallel on a grid environment.
Also, we have a new version of the assembler, canu, with an updated falcon_sense that may work better for your assembly. You can get the falcon_sense Linux binary here:
http://github.com/marbl/canu/blob/master/src/falcon_sense/falcon_sense.Linux-amd64.bin?raw=true
<https://github.com/marbl/canu>
and just try replacing the version in CA 8.3 to see if it improves the Y assembly.
Sergey
|
|
From: A. B. C. <ber...@gm...> - 2015-12-01 13:32:49
|
Hi,
I noticed that while the Drosophila melanogaster MHAP assembly is very
good in general, it has many gaps in single-copy Y-linked genes. I guess
that this is caused by low coverage: the DNA came from males, and was
assembled at 25x, which leaves the Y genes at 12.5x (theoretically).
Furthermore, it seems that Y-linked reads are being lost during the first
correction step (done by falcon-sense; I checked the uncorrected and the
corrected reads).
I am trying to fix these problems by increasing the coverage of the
corrected reads used in the "post-correction" steps (by adding
assembleCoverage=40 to the spec file, instead of the default 25x), and
by forcing the use of pbdagcon instead of falcon-sense (by adding pbcns=1
to the spec file). The assembly with 40x and falcon-sense worked fine,
but when I tried 40x with pbdagcon, the run seemed to be abnormally slow.
Specifically, the machine I used is a Dell with 24 processors / 144 Gb RAM,
and after 9 days running it was still processing the first two partitions
of runPartition.sh
# /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 1
# /home3/users/bernardo/drosophila//tempdros10/runPartition.sh 2
I checked the runPartition.sh script, and it seems to use only 8 threads
(instead of 24):
cat /home3/users/bernardo/drosophila//tempdros10/runPartition.sh
$bin/outputLayout \
-L \
-e 0.35 -M 1500 \
-i /home3/users/bernardo/drosophila//tempdros10/asm \
-o /home3/users/bernardo/drosophila//tempdros10/asm \
-p $jobid \
-l 500 \
\
-P \
-G /home3/users/bernardo/drosophila//tempdros10/asm.gkpStore \
2> /home3/users/bernardo/drosophila//tempdros10/$jobid.lay.err |
$bin/convertToPBCNS -consensus pbdagcon -path
/home3/users/bernardo/programs/wgs-8.3rc2/Linux-amd64/bin/ -output
/home3/users/bernardo/drosophila//tempdros10/$jobid.fasta -prefix
/home3/users/bernardo/drosophila//tempdros10/$jobid.tmp -length 500
-coverage 4 -threads 8 >
/home3/users/bernardo/drosophila//tempdros10/$jobid.err 2>&1 && touch
In this particular run I did not specify cnsConcurrency or
consensusConcurrency in the spec file (so PBcR chose the values; I
only set threads=20), but in another run I added
cnsConcurrency=20 consensusConcurrency=20
to the spec file, and again in 10 days it processed only 3 of the 200
partitions.
I previously tried the E. coli 30x and the yeast data, and both worked fine
with pbdagcon (although slower than falcon-sense). Is there some limitation
on using pbdagcon with higher-coverage data? Is the -threads 8 option of the
convertToPBCNS program correct?
Thanks,
Bernardo
A. Bernardo Carvalho
Departamento de Genética
Universidade Federal do Rio de Janeiro
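The two numbers in this message can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope illustration: the function names are made up here, the coverage calculation assumes the Y chromosome is present in one copy per male diploid genome, and the time estimate assumes the partitions keep running at the observed rate with roughly equal cost each.

```python
def single_copy_coverage(total_coverage, copies=1, ploidy=2):
    """Expected coverage of a region present in `copies` copies per diploid genome."""
    return total_coverage * copies / ploidy

def days_to_finish(partitions_done, partitions_total, elapsed_days):
    """Linear extrapolation of run time from the partitions finished so far."""
    return elapsed_days * partitions_total / partitions_done

print(single_copy_coverage(25))   # Y-linked coverage at assembleCoverage=25
print(single_copy_coverage(40))   # Y-linked coverage at assembleCoverage=40
print(round(days_to_finish(3, 200, 10)))  # days for all 200 partitions at 3 per 10 days
```

Under these assumptions the 40x run would give Y-linked genes about 20x, and the pbdagcon run would need on the order of hundreds of days on a single machine.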
|
|
From: Corey W. <cor...@gm...> - 2015-11-10 18:01:06
|
Hello,
I have two questions relating to one assembly I am working on.
- Should mate pair libraries be trimmed? Link to tech note on Illumina Mate Pairs (pdf): <http://www.illumina.com/documents/products/technotes/technote_nextera_matepair_data_processing.pdf>
  At the moment I trim my Illumina Mate Pair reads using skewer <https://github.com/relipmoc/skewer>. Since I can tell WGS the insert size and orientation, I imagine the overlapper would work just fine with untrimmed reads.
- In the same vein: does WGS understand overlapped paired-end reads (aka sloptigs)? For example, if we sequence a 500bp insert with 300 cycles we theoretically get a 100bp overlap.
Thanks,
-Corey Wischmeyer
|
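The overlap figure in the second question follows from the read and insert lengths. A minimal sketch, assuming "300 cycles" means 300 bases read from each end and ignoring adapters and quality trimming (`expected_overlap` is a name chosen here for illustration):

```python
def expected_overlap(insert_size, read_length):
    """Bases covered by both mates of a pair, capped at the insert size."""
    overlap = 2 * read_length - insert_size
    return min(insert_size, max(0, overlap))

print(expected_overlap(500, 300))  # the case described in the message
print(expected_overlap(500, 100))  # short reads on the same insert do not overlap
```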
|
From: Serge K. <ser...@gm...> - 2015-09-15 21:07:56
|
Hi,
1. Most likely this indicates that PBDAGCON or BLASR could not be found, which is causing the issue with the E. coli dataset. Can you send the full output from that run along with the contents of the tempK12/runPartition.sh script?
2. As for the number of contigs in yeast, the numbers in the paper are after filtering out contigs with fewer than 50 reads. Without filtering you should have about 36 contigs. You can see the unfiltered CA 8.3 results here:
http://wgs-assembler.sourceforge.net/wiki/index.php/Version_8.3_Release_Notes
As long as your max and N50 sizes are similar to those reported, your assembly is running correctly.
Serge
|
|
From: A. B. C. <ber...@gm...> - 2015-09-15 02:38:04
|
Dear all,
I installed CA 8.3rc2 on my server (Dell PE2900 running CentOS 6.6, with 8 cores and 64 GB RAM). As I have not used CA before, I ran two sets of test data from the PBcR web pages (or from the MHAP paper). In both cases my assembly was considerably more fragmented than the reported ones. The datasets are:
1) 30x coverage E. coli ( http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Assembling_an_E._coli ). It should have assembled as a single contig, and I always got 11 contigs. My command line and spec file follow:
PBcR -pbCNS -length 500 -l K12_attempt11 -s /home/tools/CA_tests/yeast/yeast2.spec -fastq /home/tools/CA_tests/ecoli_83/selfSampleData/pacbio_filtered.fastq genomeSize=4650000 > run11.out 2>&1 &
spec file:
useGrid = 0
scriptOnGrid = 0
assemble = 1
javaPath=/usr/bin/
ovlThreads = 12
threads = 12
ovlConcurrency = 1
cnsConcurrency = 12
merylThreads = 12
I also removed the last 5 lines of the spec file (allowing PBcR to choose nearly everything), but again got 11 contigs. However, when I ran the program with the full E. coli data set (downloaded from the AWS snapshot), I got a single contig.
2) The yeast data set reported in the MHAP paper (Berlin et al. 2015), downloaded from http://gembox.cbcb.umd.edu/mhap/raw/yeast_filtered.fastq.gz
The MHAP paper reports that the assembly resulted in 21 contigs, whereas I always get around 30. The command line follows (the spec file is the same one used for E. coli):
PBcR -length 500 -l yeast2 -s yeast2.spec -fastq /home/tools/CA_tests/yeast/yeast_filtered.fastq genomeSize=12100000 > yeast2.out 2>&1 &
I also tried to force the use of PBDAGCON instead of falcon_sense by adding the line "pbcns=1" to the spec file and removing -pbCNS from the command line. The assembly was much slower and, to my surprise, more fragmented: 39 contigs.
Are these results normal, or do they indicate some problem in my installation of the Celera Assembler?
Yours,
Bernardo
A. Bernardo Carvalho
Departamento de Genética
Universidade Federal do Rio de Janeiro
|
|
From: Brian W. <th...@gm...> - 2015-08-27 17:31:14
|
That sounds like at least one of the jobs in 3-overlapcorrection failed to
finish properly, though I've never seen that happen.
What is in (ls -l) the 3-overlapcorrection directory?
If there are still *.err or *.out or some other kind of logging files, scan
those for errors (they'll probably be at the end of the file).
The process here is to recompute overlaps after making base changes in the
reads. The expectation is that each old overlap (numOverlapsTotal) will
generate a new error rate (iNum), which can be copied into the store. For
some reason, there were fewer new error rates than overlaps. The fix is to
rerun - maybe manually - the last step ("ovlcorr") that recomputes overlaps.
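The suggestion to scan the end of each log for errors can be sketched in a few lines. This is only an illustration under assumptions: `tail_errors` is a hypothetical helper, the `*.err` / `*.out` patterns come from the advice above, and the keyword match on "error" or "fail" is a guess at what a failed job would print. The demonstration uses a temporary directory with the error line quoted in this thread rather than a real 3-overlapcorrection directory.

```python
import tempfile
from pathlib import Path

def tail_errors(directory, patterns=("*.err", "*.out"), tail_lines=20):
    """Return {log path: suspicious lines} from the last `tail_lines` of each log."""
    hits = {}
    for pattern in patterns:
        for path in sorted(Path(directory).glob(pattern)):
            lines = path.read_text(errors="replace").splitlines()[-tail_lines:]
            bad = [ln for ln in lines if "error" in ln.lower() or "fail" in ln.lower()]
            if bad:
                hits[str(path)] = bad
    return hits

# Demonstration on made-up log files; only the ERROR line is taken from the thread.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "0001.err").write_text(
        "starting\nERROR: iNum 291211181897 != orig->ovs.numOverlapsTotal 418401494276\n"
    )
    Path(tmp, "0002.err").write_text("starting\nfinished\n")
    found = tail_errors(tmp)

for name, lines in found.items():
    print(name, lines)
```

For a real run, the directory argument would be the assembly's 3-overlapcorrection directory.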
On Thu, Aug 27, 2015 at 1:18 PM, Alex Brandt <aj...@gm...> wrote:
> Hi Celera Community,
>
> I'm using celera 8.3 and getting an error message I can't seem to make
> heads or tails of:
>
> ----------------------------------------START Thu Aug 27 08:13:38 2015
>
> /global/common/genepool/jgi/assemblers/celera/8.3/Linux-amd64/bin/overlapStore
> -u
> /global/homes/a/ajbrandt/bscratch/h/celera_attempt/h_parallel/hp.ovlStore
> /global/homes/a/ajbrandt/bscratch/h/celera_attempt/h_parallel/3-overlapcorrection/hp.erates
> >
> /global/homes/a/ajbrandt/bscratch/h/celera_attempt/h_parallel/3-overlapcorrection/overlapStore-update-erates.err
> 2>&1
>
> ----------------------------------------END Thu Aug 27 08:14:17 2015 (39
> seconds)
>
> ERROR: Failed with signal HUP (1)
>
>
> ================================================================================
>
>
> runCA failed.
>
>
> ----------------------------------------
>
> Stack trace:
>
>
> at /usr/common/jgi/assemblers/celera/8.3/Linux-amd64/bin/runCA line 1649.
>
> main::caFailure('failed to apply the overlap corrections',
> '/global/homes/a/ajbrandt/bscratch/h/celera_attempt/h...') called at
> /usr/common/jgi/assemblers/celera/8.3/Linux-amd64/bin/runCA line 4569
>
> main::overlapCorrection() called at
> /usr/common/jgi/assemblers/celera/8.3/Linux-amd64/bin/runCA line 6557
>
>
> ----------------------------------------
>
> Last few lines of the relevant log file
> (/global/homes/a/ajbrandt/bscratch/h/celera_attempt/h_parallel/3-overlapcorrection/overlapStore-update-erates.err):
>
>
> ERROR: iNum 291211181897 != orig->ovs.numOverlapsTotal 418401494276
>
>
> I get that there must be some discrepancy between iNum and the number of
> total overlaps, but I have no idea how to fix the problem (what steps to
> rerun, etc).
>
>
> Thanks
>
> Alex
|
|
From: Alex B. <aj...@gm...> - 2015-08-27 17:18:09
|
Hi Celera Community,
I'm using celera 8.3 and getting an error message I can't seem to make
heads or tails of:
----------------------------------------START Thu Aug 27 08:13:38 2015
/global/common/genepool/jgi/assemblers/celera/8.3/Linux-amd64/bin/overlapStore
-u
/global/homes/a/ajbrandt/bscratch/h/celera_attempt/h_parallel/hp.ovlStore
/global/homes/a/ajbrandt/bscratch/h/celera_attempt/h_parallel/3-overlapcorrection/hp.erates
>
/global/homes/a/ajbrandt/bscratch/h/celera_attempt/h_parallel/3-overlapcorrection/overlapStore-update-erates.err
2>&1
----------------------------------------END Thu Aug 27 08:14:17 2015 (39
seconds)
ERROR: Failed with signal HUP (1)
================================================================================
runCA failed.
----------------------------------------
Stack trace:
at /usr/common/jgi/assemblers/celera/8.3/Linux-amd64/bin/runCA line 1649.
main::caFailure('failed to apply the overlap corrections',
'/global/homes/a/ajbrandt/bscratch/h/celera_attempt/h...') called at
/usr/common/jgi/assemblers/celera/8.3/Linux-amd64/bin/runCA line 4569
main::overlapCorrection() called at
/usr/common/jgi/assemblers/celera/8.3/Linux-amd64/bin/runCA line 6557
----------------------------------------
Last few lines of the relevant log file
(/global/homes/a/ajbrandt/bscratch/h/celera_attempt/h_parallel/3-overlapcorrection/overlapStore-update-erates.err):
ERROR: iNum 291211181897 != orig->ovs.numOverlapsTotal 418401494276
I get that there must be some discrepancy between iNum and the number of
total overlaps, but I have no idea how to fix the problem (what steps to
rerun, etc).
Thanks
Alex
|
|
From: Brian W. <th...@gm...> - 2015-07-24 14:42:28
|
Yes, you can safely remove the 3- directory and restart.
You've got FAR too many jobs created here. Increase frgCorrBatchSize and ovlCorrBatchSize.
The 'frgCorr' step will use 13 bytes of memory per base in the batch PLUS any overlaps loaded. Assuming 250bp reads, try frgCorrBatchSize=50000000 (50 million). That should use about 160 GB of memory for data, leaving lots of space for overlaps. Also set frgCorrConcurrency=1 and frgCorrThreads=64. This will run one job at a time, using 64 compute threads.
The ovlCorr step isn't as demanding on memory, so let's try ovlCorrBatchSize=10000000 (10 million), ovlCorrConcurrency=8 and ovlCorrThreads=8.
b
On Thu, Jul 23, 2015 at 11:50 AM, Christian Dreischer <chr...@gm...> wrote:
> Good hint. After 8.2beta didn't work, I switched to 8.3rc2. Running runCA again with 8.2beta continued the assembly with "cat-corrects". As I mentioned before, frgcorr.sh seems to have run for only a fraction of the fragments (judging from the message "Created 36656 overlap jobs. Last batch '037', last job '036656'" in the log). Is it safe to delete the 3-overlapper directory and to start runCA again?
> Here's my config file:
>
> overlapper = ovl
> unitigger = bogart
> utgBubblePopping = 1
> merSize = 14
> merylMemory = 128000
> merylThreads = 16
> ovlStoreMemory = 8192
> # grid info
> useGrid = 0
> scriptOnGrid = 0
> frgCorrOnGrid = 0
> ovlCorrOnGrid = 0
> #ovlMemory=8GB --hashload 0.7
> ovlHashBits = 25
> ovlThreads = 6
> ovlHashBlockLength = 20000000
> ovlRefBlockSize = 5000000
> # for mer overlapper
> merCompression = 1
> merOverlapperSeedBatchSize = 500000
> merOverlapperExtendBatchSize = 250000
> frgCorrThreads = 20
> frgCorrBatchSize = 500000
> ovlCorrBatchSize = 100000
> # non-Grid settings, if you set useGrid to 0 above these will be used
> merylMemory = 128000
> merylThreads = 12
> ovlStoreMemory = 8192
> ovlConcurrency = 8
> merOverlapperThreads = 6
> merOverlapperSeedConcurrency = 2
> merOverlapperExtendConcurrency = 2
> frgCorrConcurrency = 8
> ovlCorrConcurrency = 16
> cnsConcurrency = 16
> doToggle=0
> toggleNumInstances = 0
> toggleUnitigLength = 2000
> doOverlapBasedTrimming = 1
> doExtendClearRanges = 2
>
> I'm running the assembly on an Ubuntu 14.04.2 LTS, 64-core (AMD Opteron) server with 512GB memory.
>
> Chris
|
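The memory estimate in this reply can be reproduced directly from the figures it gives: 13 bytes per base, 250bp reads, and the batch size. The sketch below is only that arithmetic; the function name is made up for illustration, GB is taken as 10^9 bytes, and memory for loaded overlaps is not included because the reply gives no figure for it.

```python
def frgcorr_data_gb(reads_per_batch, read_length_bp, bytes_per_base=13):
    """Approximate memory for the read data of one frgCorr batch, in GB (1e9 bytes)."""
    return reads_per_batch * read_length_bp * bytes_per_base / 1e9

suggested = frgcorr_data_gb(50_000_000, 250)  # frgCorrBatchSize suggested in the reply
original = frgcorr_data_gb(500_000, 250)      # frgCorrBatchSize from the posted config
print(f"suggested batch: {suggested:.1f} GB, original batch: {original:.3f} GB")
```

The posted batch size uses under 2 GB per job by this estimate, which is consistent with the remark that far too many jobs were created for a 512GB machine.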
|
From: Brian W. <th...@gm...> - 2015-07-23 15:21:54
|
Did you do a code update between starting the assembly and now? If you have the source code, change AS_READ_MAX_NORMAL_LEN_BITS in file AS_global.H from 18 to 16.
The ovlStore is likely OK. The issue is hopefully configuration of the frgCorr (and ovlCorr) stages. These are both I/O intense and big memory, and finding a tradeoff is sometimes hard. Post your config, please (and describe the hardware you're running on).
Failing that, you can turn this step off: doFragmentCorrection=0 (IIRC). I'd suggest a slight increase in unitigger (bogart, the bat* parameters) error rates to adjust for the uncorrected overlaps.
b
On Thu, Jul 23, 2015 at 10:29 AM, Christian Dreischer <chr...@gm...> wrote:
> Hi,
>
> I ran wgs-8.2beta until the assembler went idle on one of the overlap correction steps (frgcorr.sh). Apparently one of the early fragments didn't finish, as frgcorr.sh for this fragment had been running for 15h and the log file contained only the first few rows, up to "### Using 20 pthreads". The assembler stopped the correction only a few fragments after the idle one.
> I killed the idle process and executed the frgcorr.sh command for this fragment manually. After that, I ran runCA again with the original command and immediately got the failure message:
>
> "gatekeeper failed to add fragments"
>
> As this didn't seem to work, I renamed the folder 3-overlapcorrection and ran runCA again, leading to the same error message.
> I thought that starting with the step before the error correction could work and ran:
>
> /software/wgs-8.2beta/Linux-amd64/bin/overlapStoreBuild -o /cabog/CA/genome.ovlStore.BUILDING -g /cabog/CA/genome.gkpStore -M 8192 -L /cabog/CA/genome.ovlStore.list > /cabog/CA/genome.ovlStore.err 2>&1
>
> The log file genome.ovlStore.err contains the following:
>
> gkStore_open()-- ERROR! Incorrect element sizes; code and store are incompatible.
> gkLibrary: store 216 code 216 bytes
> gkPackedFragment: store 24 code 24 bytes
> gkNormalFragment: store 48 code 48 bytes
> gkStrobeFragment: store 48 code 48 bytes
> AS_READ_MAX_NORMAL_LEN_BITS: store 16 code 18
>
> Is it possible to restart the assembly at this point?
> What steps do I have to take to "rescue" the assembly results up to this point (>20 days of calculation time)?
>
> Thanks
> Chris
|
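The gkStore_open() message lists each element size as seen by the store and by the code, so the incompatible entry can be picked out mechanically. A minimal sketch, assuming the "name: store N code M" line layout shown in the quoted log (`size_mismatches` is a hypothetical helper, not a wgs-assembler tool):

```python
import re

# Matches lines such as "gkLibrary: store 216 code 216 bytes".
SIZE_LINE = re.compile(r"(\w+):\s+store\s+(\d+)\s+code\s+(\d+)")

def size_mismatches(log_text):
    """Return (name, store_value, code_value) for every entry where the two differ."""
    return [
        (name, int(store), int(code))
        for name, store, code in SIZE_LINE.findall(log_text)
        if int(store) != int(code)
    ]

log = """gkStore_open()-- ERROR! Incorrect element sizes; code and store are incompatible.
gkLibrary: store 216 code 216 bytes
gkPackedFragment: store 24 code 24 bytes
gkNormalFragment: store 48 code 48 bytes
gkStrobeFragment: store 48 code 48 bytes
AS_READ_MAX_NORMAL_LEN_BITS: store 16 code 18
"""
mismatched = size_mismatches(log)
print(mismatched)
```

On the quoted log this isolates AS_READ_MAX_NORMAL_LEN_BITS, the constant the reply suggests changing from 18 to 16 before rebuilding.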