From: Manjari D. <man...@gm...> - 2014-12-08 06:01:29
|
Hi, I am running Celera on 700000 reads, each longer than 200 bp, and the genome size is approximately 1.3 Gb. Celera has generated 438064 overlap jobs. Is there any way to speed up the assembly? I would like results in 3 or 4 days. |
From: Ole K. T. <o.k...@ib...> - 2014-11-28 13:02:16
|
Hi,

what settings did you use for fastqToCA? (Alternatively sffToCA, which is most commonly used for 454 reads.) fastqToCA has Illumina as its default technology (the -technology option), which only accepts reads shorter than 160 bp. Use ‘-technology 454’ or ‘-technology illumina-long’ if you have longer reads.

Ole

On 28 Nov 2014, at 13:29, Manjari Deshmukh <man...@gm...> wrote:
> Hi
> I am trying to run celera on 454 FLX
> it is giving error as
>
> GKP finished with 20417312 alerts or errors:
> 20417312 # ILL Error: seq longer than longer than gkpShortReadLength bases, truncating.
>
> ERROR: library IID 1 'Celera_10x' has 100.00% errors or warnings.
>
> what does it mean?
>
> Manjari |
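Before converting reads, it can help to see how many of them would run into the short-read limit. The following is only a minimal Python sketch and not part of Celera Assembler: it counts the records in plain four-line FASTQ text whose sequence is longer than 160 bp, the limit quoted in the reply above. The function names and the constant `SHORT_READ_LIMIT` are made up for illustration, and whether a read of exactly 160 bp passes is an assumption.

```python
from typing import Iterable, Iterator, Tuple

# Assumption: 160 bp is the limit quoted in the reply above for the default
# Illumina technology; it is not read from the assembler itself.
SHORT_READ_LIMIT = 160


def fastq_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the sequence length of each record in four-line FASTQ text."""
    for index, line in enumerate(lines):
        if index % 4 == 1:  # the second line of every record is the sequence
            yield len(line.strip())


def summarize(lines: Iterable[str], limit: int = SHORT_READ_LIMIT) -> Tuple[int, int]:
    """Return (total_reads, reads_longer_than_limit)."""
    total = too_long = 0
    for length in fastq_lengths(lines):
        total += 1
        if length > limit:  # boundary handling is an assumption
            too_long += 1
    return total, too_long


# Tiny in-memory example: one 100 bp read and one 200 bp read.
demo = ["@r1", "A" * 100, "+", "I" * 100,
        "@r2", "C" * 200, "+", "I" * 200]
total, too_long = summarize(demo)
print(f"{too_long} of {total} reads exceed {SHORT_READ_LIMIT} bp")
```

For a real file, pass an open file handle instead of the list, for example `summarize(open("reads.fastq"))`, where `reads.fastq` is a placeholder name. If nearly all reads exceed the limit, as in the error report above, the library was most likely converted with the wrong -technology setting.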
From: Manjari D. <man...@gm...> - 2014-11-28 12:29:53
|
Hi,

I am trying to run Celera on 454 FLX data, and it gives this error:

GKP finished with 20417312 alerts or errors:
20417312 # ILL Error: seq longer than longer than gkpShortReadLength bases, truncating.

ERROR: library IID 1 'Celera_10x' has 100.00% errors or warnings.

What does it mean?

Manjari |
From: Manjari D. <man...@gm...> - 2014-11-24 09:30:39
|
Hi, We tried to run Celera on a single-end 454 FLX dataset for a 1.4 Gb genome assembly, on a machine with 250 GB RAM, a 32-core processor and 8 TB of disk, using mostly default parameters. It ran for 10 days and then ran out of memory, so we couldn't get any useful results from it. I have attached a spreadsheet with the distribution of sequence lengths for this dataset. Thanks and regards, Manjari |
From: Manjari D. <man...@gm...> - 2014-11-24 08:06:36
|
Hi, I would like to know the minimum read length of single-end 454 FLX data that can be used with Celera Assembler. Our read lengths range from 40 to 1700 bp. Thanks and regards, Manjari |
From: Akshaya R. <ar...@bu...> - 2014-10-20 17:46:53
|
Ah beautiful! I did eventually get amos installed, but these are much better options. Thank you so much, Akshaya -- Akshaya Ramesh PhD candidate Kepler Lab Laboratory of Computational Immunology Boston University School of Medicine 72 E Concord Street, Room 504D Boston, MA 02118 |
From: Brian W. <th...@gm...> - 2014-10-11 19:53:26
|
Hi-

The .asm is unwieldy and terrible to parse. Use the 'posmap' files instead; specifically, the frgctg and frgscf files list the position of each fragment in a contig/scaffold.

http://wgs-assembler.sourceforge.net/wiki/index.php/POSMAP

Another option:

https://sourceforge.net/p/wgs-assembler/mailman/message/31494576/

b |
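As a rough illustration of using the frgctg file for the coverage question, here is a minimal Python sketch. It assumes each row has five whitespace-separated columns (fragment, contig, begin, end, orientation), which matches the sample rows quoted elsewhere in this thread but is not checked against the POSMAP documentation. The function names are invented for the example, and the contig length is approximated by the largest read coordinate.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Placement = Tuple[str, int, int, str]  # (fragment, begin, end, orientation)


def parse_frgctg(lines: Iterable[str]) -> Dict[str, List[Placement]]:
    """Group 'fragment contig begin end orientation' rows by contig."""
    by_contig: Dict[str, List[Placement]] = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed rows
        frag, contig, begin, end, ori = fields[:5]
        by_contig[contig].append((frag, int(begin), int(end), ori))
    return dict(by_contig)


def mean_coverage(reads: List[Placement]) -> float:
    """Placed bases divided by contig span.

    Assumption: the contig length is approximated by the largest read
    coordinate; use the real contig length if you have it.
    """
    span = max(max(begin, end) for _, begin, end, _ in reads)
    bases = sum(abs(end - begin) for _, begin, end, _ in reads)
    return bases / span if span else 0.0


# Made-up rows in the assumed five-column layout.
rows = [
    "read1 ctgA 0 100 f",
    "read2 ctgA 50 150 r",
    "read3 ctgB 0 10 f",
]
for contig, reads in sorted(parse_frgctg(rows).items()):
    names = [frag for frag, _, _, _ in reads]
    print(contig, len(reads), round(mean_coverage(reads), 2), names)
```

The per-contig list of fragment names answers "which reads formed this contig", and the ratio gives an average depth. The same approach should work for the frgscf file if its columns follow the same layout, which is also an assumption here.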
From: Akshaya R. <ar...@bu...> - 2014-10-06 15:14:14
|
Dear All,

I would like to get information on coverage and on the reads that were used to form a contig/scaffold. Is it right to assume that the .asm file contains this information? If it does, what package do you use to view these files?

I have been working on installing amos3.1, which has a utility called hawkeye that can be used to view .asm files. However, I am unable to configure amos so that it recognizes CA files. I have sent an e-mail to amos-help and posted on seqanswers (http://seqanswers.com/forums/showthread.php?t=47221), with no luck.

Do any of you have any suggestions, or have you run into similar problems?

I really appreciate your help.
Best,
Akshaya
--
Akshaya Ramesh
PhD candidate
Kepler Lab
Laboratory of Computational Immunology
Boston University School of Medicine
72 E Concord Street, Room 504D
Boston, MA 02118 |
From: Brian W. <th...@gm...> - 2014-09-26 20:28:53
|
Hi, Ivan-

I finally had a chance to look at this. I see no problems. I computed the distance between the posmap placement and a placement found by mapping with 'bwa mem'. For most reads, the two placements are within 100 bp of each other.

I suspect you used the fastqUIDmap from the 'trimming' run, and not from the 'assembly' run. The two read sets are different; trimming deletes many reads. In particular, the second read id (#24884) you list in the posmap file doesn't exist in my assembly.

b |
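The comparison described in this message can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the script that was actually used: posmap rows are taken to be "read contig begin end orientation" with 0-based coordinates, SAM POS is the 1-based leftmost mapped base, and read names are assumed to be identical in both inputs (BLASR appends an extra coordinate suffix to names, which would have to be stripped first). All function names are invented for the example.

```python
from typing import Dict, Iterable, Tuple

Start = Tuple[str, int]  # (contig, 0-based leftmost position)


def posmap_starts(lines: Iterable[str]) -> Dict[str, Start]:
    """Read 'read contig begin end orientation' rows (assumed layout)."""
    starts: Dict[str, Start] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue
        read, contig = fields[0], fields[1]
        begin, end = int(fields[2]), int(fields[3])
        starts[read] = (contig, min(begin, end))
    return starts


def sam_starts(lines: Iterable[str]) -> Dict[str, Start]:
    """Read primary alignments from SAM text; POS is 1-based leftmost."""
    starts: Dict[str, Start] = {}
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.split()
        if len(fields) < 4:
            continue
        read, flag, contig, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4:    # read is unmapped
            continue
        if flag & 0x900:  # secondary or supplementary alignment
            continue
        starts[read] = (contig, pos - 1)  # convert to 0-based
    return starts


def placement_distances(posmap: Dict[str, Start], sam: Dict[str, Start]) -> Dict[str, int]:
    """Absolute start difference for reads placed on the same contig."""
    distances: Dict[str, int] = {}
    for read, (contig, begin) in posmap.items():
        if read in sam and sam[read][0] == contig:
            distances[read] = abs(sam[read][1] - begin)
    return distances


# Made-up records; r3 is unmapped and therefore skipped.
posmap_rows = ["r1 ctg1 0 100 f", "r2 ctg1 500 700 r", "r3 ctg1 10 20 f"]
sam_rows = [
    "@SQ\tSN:ctg1\tLN:1000",
    "r1\t0\tctg1\t6\t254",
    "r2\t16\tctg1\t1501\t254",
    "r3\t4\t*\t0\t0",
]
print(placement_distances(posmap_starts(posmap_rows), sam_starts(sam_rows)))
```

Soft-clipped alignments shift POS away from the true read start, so small distances are expected even for correct placements. Distances in the millions for most reads, on the other hand, point to a name mix-up such as the wrong fastqUIDmap rather than to bad posmap coordinates.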
From: Walenz, B. <wa...@nb...> - 2014-09-23 14:41:18
|
You might still be compiling with the older compiler. Try adding the new compiler to the start of your path, or set the environment variables CC and CXX to point to the new version.

Your glibc package might be ancient; try updating it.

This isn’t a problem specific to the assembler. Searching for ‘GLIBCXX_3.4.10 not found’ gives lots of other suggestions. For example: https://bbs.archlinux.org/viewtopic.php?pid=1065388

b |
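One way to check which library is actually too old is to list the GLIBCXX version tags embedded in a given libstdc++.so.6; these tags come from libstdc++, the C++ runtime shipped with GCC. The following minimal Python sketch does roughly what `strings libstdc++.so.6 | grep GLIBCXX` does. The function names are invented for illustration, and scanning raw bytes is a heuristic, not a real ELF symbol-version parser.

```python
import re
from typing import List, Tuple

# Matches tags such as GLIBCXX_3.4.10 but not GLIBCXX_FORCE_NEW.
_VERSION_RE = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)*)")


def glibcxx_versions(data: bytes) -> List[Tuple[int, ...]]:
    """Return the sorted, de-duplicated GLIBCXX versions found in data."""
    found = {
        tuple(int(part) for part in match.group(1).split(b"."))
        for match in _VERSION_RE.finditer(data)
    }
    return sorted(found)


def provides(data: bytes, wanted: str) -> bool:
    """True if the library bytes contain the tag GLIBCXX_<wanted>."""
    target = tuple(int(part) for part in wanted.split("."))
    return target in glibcxx_versions(data)


# Made-up bytes standing in for the contents of an old library.
old_library = (
    b"\x00GLIBCXX_3.4\x00GLIBCXX_3.4.8\x00"
    b"GLIBCXX_3.4.9\x00GLIBCXX_FORCE_NEW\x00"
)
print(glibcxx_versions(old_library))
print(provides(old_library, "3.4.10"))
```

On a real system, read the file named in the error message, for example `open("/usr/lib64/libstdc++.so.6", "rb").read()`, and compare it with the libstdc++.so.6 from the newly installed gcc. If only the new copy provides 3.4.10, the binary is still picking up the system library at run time, which fits the advice above about compiler and path settings.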
From: <wuk...@16...> - 2014-09-23 13:07:15
|
Hi,

I compiled it from source like this:

bzip2 -dc wgs-8.1.tar.bz2 | tar -xf -
cd wgs-8.1
cd kmer && make install && cd ..
cd samtools && make && cd ..
cd src && make && cd ..
cd ..

The old gcc version is 4.1.2, and I installed a newer one, gcc 4.5.1, under my own account.

Best,
Kai Wu |
From: Brian W. <th...@gm...> - 2014-09-23 12:58:53
|
Hi- Did you compile this yourself, or is it the pre-compiled version from sourceforge? Try compiling yourself. How old is 'too old'? b |
From: <wuk...@16...> - 2014-09-21 06:14:37
|
Dear colleagues,

When I run the command

runCA -p ipagpj029hmc001 -d ipagpj029hmc001_raw useGrid=1 scriptOnGrid=1 doOBT=1 unitigger=bogart /home/kwu/workdir/my_projects/ipag_pj029/data/CA_data/ipagpj029hmc001_1.1.frg /home/kwu/workdir/my_projects/ipag_pj029/data/CA_data/ipagpj029hmc001_2.1.frg

after a while, the error message in ipagpj029hmc001.gkpStore.err is:

/share/work/lhuang/my_apps/wgs-8.1/Linux-amd64/bin/gatekeeper: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.10' not found (required by /share/work/lhuang/my_apps/wgs-8.1/Linux-amd64/bin/gatekeeper)

I don't know why this happens. I know my gcc version is too old, so I installed a new version of gcc under my own account and set the environment variables:

export LD_LIBRARY_PATH=/home/kwu/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/home/kwu/lib64:$LD_LIBRARY_PATH

But it still can't seem to find it.

Best,
Kai Wu |
From: Brian W. <th...@gm...> - 2014-09-16 13:44:41
|
The posmap positions are derived from the unitig/contig multialignments, and I doubt they're incorrect. Too much other stuff would be broken too.

There are some big repeats in this genome, if I remember, one at the start of the contig. Since most reads are in the same contig, can you compute the distance between posmap-position and blasr-position? I don't have (yet) this assembly to analyze. |
From: Ivan S. <iva...@gm...> - 2014-09-14 06:58:43
|
Hi Brian!

Thank you for your reply, and I apologize for my slow response. It's nice to hear that I'm not the only one with this problem :)

I would be happy to share an example. Here are the first 5 lines of the asm.posmap.frgctg file, where I have replaced the read IDs with their actual names (the mapping was taken from asm.gkpStore.fastqUIDmap):

m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/9515/0_4256 ctg7180000000002 0 8435 f
m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/24884/2806_11942 ctg7180000000002 1495 8147 f
m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/24752/0_1244 ctg7180000000002 1617 12822 f
m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/14271/1648_4730 ctg7180000000002 1699 8847 r
m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/10369/11194_16593 ctg7180000000002 1760 8558 r

The last two numbers of each read's name roughly give its length (I think they are subreads, so read 2 should be 9136 bases long).

Here is where BLASR placed them (I copy only the first few fields of the SAM entries, up to the CIGAR string):

m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/9515/0_4256/0_4256 0 ctg7180000000002 4174066 254
m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/24884/2806_11942/0_9136 16 ctg7180000000002 1510215 254
m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/24752/0_1244/0_1244 16 ctg7180000000002 881151 254
m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/14271/1648_4730/0_3082 16 ctg7180000000002 4413614 254
m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/10369/11194_16593/0_5399 0 ctg7180000000002 1891829 254

The contig assignment agrees, but that is hard to get wrong: there are only two contigs, one the size of the genome (E. coli, ctg7180000000002) and the other the size of two reads (ctg7180000000003). I checked manually, and none of the listed reads were clipped (according to their CIGAR strings).

The assembly is described here, together with the E. coli datasets (PacBio reads extracted into FASTQ files) and with instructions on how to run the assembly:
http://wgs-assembler.sourceforge.net/wiki/index.php/Escherichia_coli_K12_MG1655,_using_uncorrected_PacBio_reads,_with_CA8.1

It runs for about half an hour and produces a complete assembly of E. coli.

Do you have any ideas what's going on with these files?

Thank you and best regards!
Ivan |
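The comparison described in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Celera Assembler or BLASR work internally: the posmap columns are taken to be read, contig, begin, end, orientation (as in the excerpt above), posmap coordinates are assumed to be 0-based, and the aligner is assumed to have appended one extra `/start_end` component to each query name. The helper names (`subread_length`, `parse_posmap`, `parse_sam`, `start_offset`) are hypothetical.

```python
from typing import NamedTuple


class PosmapEntry(NamedTuple):
    read: str
    contig: str
    begin: int
    end: int
    orientation: str  # 'f' or 'r', as in the fifth posmap column


class SamHit(NamedTuple):
    read: str
    contig: str
    pos: int        # SAM POS: 1-based leftmost mapped position
    reverse: bool   # True when FLAG bit 0x10 (reverse strand) is set


def subread_length(name: str) -> int:
    """Length implied by the trailing 'start_end' part of a subread name."""
    start, end = name.rsplit("/", 1)[1].split("_")
    return int(end) - int(start)


def parse_posmap(line: str) -> PosmapEntry:
    read, contig, begin, end, orientation = line.split()
    return PosmapEntry(read, contig, int(begin), int(end), orientation)


def parse_sam(line: str) -> SamHit:
    qname, flag, rname, pos = line.split()[:4]
    # Assumption: the aligner appended one extra '/start_end' component to
    # the query name (as in the SAM excerpt); strip it to recover the name.
    read = qname.rsplit("/", 1)[0]
    return SamHit(read, rname, int(pos), bool(int(flag) & 0x10))


def start_offset(p: PosmapEntry, s: SamHit) -> int:
    """Aligned start minus posmap start, both treated as 0-based (assumption)."""
    return (s.pos - 1) - min(p.begin, p.end)


READ = "m130404_014004_sidney_c100506902550000001823076808221337_s1_p0/24884/2806_11942"
posmap = parse_posmap(f"{READ} ctg7180000000002 1495 8147 f")
sam = parse_sam(f"{READ}/0_9136 16 ctg7180000000002 1510215 254")

print(subread_length(READ))        # length implied by the read name
print(posmap.end - posmap.begin)   # span claimed by posmap
print(start_offset(posmap, sam))   # distance between the two start positions
```

Printing the three numbers side by side for every read makes it easy to see whether the two sources differ by a constant shift, by trimming, or by something unrelated.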
From: Brian W. <th...@gm...> - 2014-09-11 12:17:54
|
When evaluating the read trimming used in the uncorrected assemblies, we had _great_ trouble comparing results from mappings (blasr, nucmer, blast, whatever) against what CA was doing. BLASR was probably the worst offender here, usually failing to map portions of the read that we thought were good. I think you're seeing the same effect.

Are the placements to different contigs, or are they mostly overlapping but with different end points? Can you share a small example? I'll try the same experiment here.

Mapping trimmed reads might get closer to what posmap claims, but aside from a sanity check, there might be little value in it. Kind of like validating with only "good" mate pairs, you won't see any mistakes.

b |
From: Ivan S. <iva...@gm...> - 2014-09-11 06:08:12
|
Hi everyone!

I have trouble interpreting the POSMAP data of an assembly. In short: when I compare the read positions given in the asm.posmap.frgctg file with the positions I obtain after aligning the reads to the assembly in asm.ctg.fasta, I can see no relation between the two. For alignment, I used both BLASR and BWA-MEM.

Description of what I am doing in more detail:

Following this tutorial (http://wgs-assembler.sourceforge.net/wiki/index.php/Escherichia_coli_K12_MG1655,_using_uncorrected_PacBio_reads,_with_CA8.1) I assembled the E. coli genome from a set of PacBio reads, and the results were exactly as described.

After that, I parsed the asm.posmap.frgctg file to obtain the list of reads that were actually used in the assembly. I extracted their original headers from the asm.gkpStore.fastqUIDmap file, and filtered the initial set of reads, so the resulting set contains only those reads listed in the asm.posmap.frgctg file.

After that, I used both BLASR with default parameters, and BWA-MEM with PacBio parameters, to align those reads to the contig file asm.ctg.fasta. I then compared the positions of the obtained alignments to the positions reported in asm.posmap.frgctg, and I see no correspondence.

Can anyone provide any insight into this? Am I missing something? Or maybe the POSMAP files weren't updated with the rest of Celera?

Thank you for your help!

Best regards,
Ivan Sovic. |
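The filtering step described in this message could look roughly like the sketch below. It is a hedged illustration, not the poster's actual script: it assumes plain 4-line FASTQ records (no wrapped sequence lines) and takes the ID-to-name mapping as a ready-made dictionary, since the exact column layout of asm.gkpStore.fastqUIDmap is not shown in the thread. The function names and the toy IDs are hypothetical.

```python
import io
from typing import Dict, Iterator, Set, TextIO, Tuple


def posmap_read_ids(posmap: TextIO) -> Set[str]:
    """Collect the first column (read ID) of every non-empty posmap line."""
    return {line.split()[0] for line in posmap if line.strip()}


def filter_fastq(fastq: TextIO, wanted: Set[str]) -> Iterator[Tuple[str, ...]]:
    """Yield 4-line FASTQ records whose name is in `wanted`.

    The name is the header without '@', up to the first whitespace.
    """
    while True:
        record = [fastq.readline() for _ in range(4)]
        if not record[0].strip():
            break  # end of file (or a trailing blank line)
        name = record[0][1:].split()[0]
        if name in wanted:
            yield tuple(line.rstrip("\n") for line in record)


# Toy data; the IDs and names are made up for the example.
id_to_name: Dict[str, str] = {"1001": "movie/1/0_4", "1002": "movie/2/0_4"}
posmap_text = "1002 ctg1 0 4 f\n"
fastq_text = "@movie/1/0_4\nACGT\n+\nIIII\n@movie/2/0_4\nTTTT\n+\nIIII\n"

ids = posmap_read_ids(io.StringIO(posmap_text))
wanted_names = {id_to_name[i] for i in ids}
kept = list(filter_fastq(io.StringIO(fastq_text), wanted_names))
print(kept)
```

Keeping the ID-to-name mapping as an explicit dictionary makes it easy to spot-check a handful of entries by hand before trusting the filtered read set.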
From: Johan W. <ka...@gm...> - 2014-09-10 08:26:53
|
Alright, thanks for your help!

Best regards,
Johan Wikander |
From: Serge K. <ser...@gm...> - 2014-09-08 16:48:42
|
The mer overlapper can be used to assemble the corrected PacBio data. It just cannot be specified for the correction step of the pipeline. The pipeline will assemble the corrected data with default parameters. You can then run any assembly parameters you prefer on the generated frg file after correction.

Sergey |
From: Johan W. <ka...@gm...> - 2014-09-06 05:15:22
|
Looks like that was the problem. I reran the pipeline with ovl as the overlapper and it finished. Thanks for your help!

It would be great if you added that feature to PBcR; we have had some really good assemblies with mer as the overlapper! Is it possible to use any of the output from a successful PBcR run and do an assembly with mer?

Best regards,
Johan Wikander |
From: Serge K. <ser...@gm...> - 2014-09-05 19:59:49
|
The 1-overlapper directory should not contain either seeds or olaps, as those are not generated by PBcR (they only get generated when you're assembling already corrected data using the mer overlapper, which is not the default and not commonly used). Looking through your output, it seems that your spec file sets:

overlapper = mer

This option should not be set for PBcR, and I think it is the cause of the problem. I would suggest removing that option from your spec file, removing the temporary directory, and trying again. I'll update the code to make this option invalid for PBcR.

Sergey |
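The suggested spec-file change can be sketched as a small Python helper. This is an assumption-laden illustration rather than part of PBcR: it treats the spec file as plain `key = value` lines (as in the spec quoted in this thread), leaves every other line untouched, and uses a hypothetical function name `drop_option`.

```python
def drop_option(spec_text: str, option: str) -> str:
    """Return spec_text without any 'option = value' line for `option`."""
    kept = []
    for line in spec_text.splitlines():
        key, sep, _value = line.partition("=")
        if sep and key.strip() == option:
            continue  # drop this setting
        kept.append(line)
    return "\n".join(kept) + "\n"


spec = """overlapper=mer
merOverlapperThreads=20
unitigger=bogart
../bug3-454.frg
"""
print(drop_option(spec, "overlapper"))
```

Matching on the exact key (rather than a substring) keeps similarly named settings such as merOverlapperThreads in place.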
From: Johan W. <ka...@gm...> - 2014-09-05 08:52:10
|
Hi!

Thanks for your reply! The only folders I have in 1-overlapper/ are olaps/ and seeds/, both empty. The 1.out, 2.out etc. are all located in 1-overlapper/.

Output files from partition 1: http://pastebin.com/FeJj1kF6
Output files from partition 79 (last partition): http://pastebin.com/Hqhdi2vF

Best regards,
Johan Wikander |
From: Serge K. <ser...@gm...> - 2014-09-04 16:05:29
|
Hi,

When you use hybrid correction, MHAP and java are not used (these are only used for self-correction). In the hybrid correction case, BLASR is still used for the alignments. The likeliest source of your error is something going wrong with the BLASR run on your system.

Do you have any outputs in the 1-overlapper/001/* directory? Can you send all the output from one overlap partition (for example 1.out, 1.hash.err, etc.)? That should help diagnose what went wrong.

Sergey |
From: Johan W. <ka...@gm...> - 2014-09-04 12:17:19
|
Hi!

For the last couple of days I have struggled to complete a correction and hybrid assembly with PacBio Long Reads, Illumina and 454 data using PBcR.

The overlapper runs for approx. 30 hrs and finishes all the overlap jobs, but afterwards the pipeline suddenly crashes. At first I thought I had the wrong version of java, but after trying different versions I assume that the problem is resource exhaustion. I have tried to tweak the settings to make the assembly less demanding, but it still crashes.

Any help or suggestions are greatly appreciated!

The system is an Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz with 128GB of RAM.

I run it from the command line using the following options:

PBcR -pbCNS -length 500 -partitions 200 -l bug3-PBcR-Ill454PB9 -s pacbio.spec -fastq ../seq/pb_051_3-filtered_subreads.fastq genomeSize=7900000

The spec-file:

useGrid=0
scriptOnGrid=0

ovlMemory = 96
ovlStoreMemory= 96000
merylMemory = 96000

frgMinLen=65
ovlMinLen=55
overlapper=mer
merOverlapperThreads=20
batThreads=20
gkpFixInsertSizes=1
ovlHashBits=29
ovlHashBlockLength=4000000
unitigger=bogart
doChimeraDetection=normal
bogBadMateDepth=5
createACE=1
cleanup=aggressive
doToggle=1

../bug3-ShortIllumina_30X.frg
../bug3-LongIllumina_20X.frg
../bug3-454.frg

The last 10 lines of *.err: http://pastebin.com/Mi4Hhi6Y
The output from PBcR: http://pastebin.com/dH2UPqLr

Best regards,
Johan Wikander |