From: Brian W. <th...@gm...> - 2015-01-19 15:29:54
|
I've never seen large overlap jobs perform better than small jobs. Target an 8gb job with ~4 CPUs each. My default configuration is:

ovlHashBits = 22
ovlHashBlockLength = 200000000
ovlRefBlockSize = 18000000
ovlThreads = 6

The two 'hash' sizes control how big the job is. The 'ref block size' controls how many reads are processed by each job, i.e., how long the job runs.

b
|
From: Ludovic M. <lud...@un...> - 2015-01-19 10:48:05
|
Hi,

Not the best expert, but to me, virtual_free allows the job to swap, which you should try to avoid, and I think h_vmem is the hard limit, so the job would be killed whenever the line is crossed.

From http://gridengine.eu/grid-engine-internals:
"hard limitation: All processes of the job combined are limited from the Linux kernel that they are able to use only the requested amount of memory. Further malloc() calls will fail."

Whether h_vmem is hard by default in GE has to be checked again, but I'd rather use mem_free instead.

Best,
ludovic
|
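In spec-file terms, that suggestion would look roughly like the following sketch (assuming mem_free is configured as a consumable resource on the cluster; the 100G figure is simply carried over from the job requests in the original config):

    sgeOverlap   = -l mem_free=100G
    sgeConsensus = -l mem_free=100G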
From: Miguel G. <mi...@uj...> - 2015-01-19 01:50:23
|
Dear all,

I am having some troubles to config wgs 8.2 assembler with SGE options. I always get a malloc memory error and I am not sure why. I am working with 3 paired fastq files (6 files in total) with 100b length reads (15 million reads in each fastq file). My config file:

useGrid = 1
scriptOnGrid = 1

sge = -A assembly
sgeMerTrim = -l h_vmem=150G -l virtual_free=150G
sgeScript = -l h_vmem=50G -l virtual_free=50G
sgeOverlap = -l h_vmem=100G -l virtual_free=100G
sgeMerOverlapSeed = -l h_vmem=100G -l virtual_free=100G
sgeMerOverlapExtend = -l h_vmem=100G -l virtual_free=100G
sgeConsensus = -l h_vmem=100G -l virtual_free=100G
sgeFragmentCorrection = -l h_vmem=100G -l virtual_free=100G
sgeOverlapCorrection = -l h_vmem=100G -l virtual_free=100G

overlapper = ovl      #Best for illumina
unitigger = bogart    #Best for illumina

#For 50GB...
ovlHashBits = 28
ovlHashBlockLength = 480000000
#100Gb for overlap
ovlStoreMemory=102400

ovlThreads = 2
ovlRefBlockSize = 7630000
frgCorrBatchSize = 1000000
frgCorrThreads = 8

The error that I have now is:

------------------------------------------------------------------------------
bucketizing /reads/a6/0-overlaptrim-overlap/001/000278.ovb.gz
bucketizing /reads/a6/0-overlaptrim-overlap/001/000276.ovb.gz
bucketizing /reads/a6/0-overlaptrim-overlap/001/000275.ovb.gz
bucketizing /reads/a6/0-overlaptrim-overlap/001/000280.ovb.gz
bucketizing DONE!
overlaps skipped:
1211882406 OBT - low quality
0 DUP - non-duplicate overlap
0 DUP - different library
0 DUP - dedup not requested
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Failed with 'Aborted'

Backtrace (mangled):

/miquel/wgs-8.2/Linux-amd64/bin/overlapStoreBuild(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x40a697]
/lib64/libpthread.so.0[0x3ff1c0f710]
/lib64/libc.so.6(gsignal+0x35)[0x3ff1432925]
/lib64/libc.so.6(abort+0x175)[0x3ff1434105]
....
----------------------------------------------------------------------------------

Some idea for the best config?

Thank you,

Miguel
|
From: Brian W. <th...@gm...> - 2015-01-15 23:35:16
|
I can't argue with the option bloat in CA. There are a lot of options that should be removed or shouldn't have been exposed in the first place.

This is the first time I've seen merTrim be a bottleneck. I suspect it might be spending lots of time building data structures. I'll admit that runCA support for this part is weak; on large assemblies, I run the trimming by hand. The merTrim binary has a '-enablecache' option that will build, dump, and reuse the data structures between jobs. There isn't runCA support for it though.

Ah! If that is your bottleneck, then we are moving the wrong way by making jobs smaller. We want to be generating one job with 48 threads enabled. So, build data structures once, then let 48 threads process all the reads in the same job. I was thinking that you're not getting multiple threads for some reason.

I'm also none too pleased with sourceforge performance. They killed off support for mediawiki, forcing everyone to either rewrite pages for their inferior wiki (no tables in the markup!) or install individual mediawiki instances. It's free, so I can't really complain too much.
|
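A spec-file sketch of that "one big job" idea, using the mbt* parameters discussed in this thread (the batch size is an assumed value, chosen only to be larger than the total read count so everything lands in a single job):

    mbtThreads     = 48
    mbtConcurrency = 1
    mbtBatchSize   = 200000000   # assumed: anything larger than the total number of reads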
From: mathog <ma...@ca...> - 2015-01-15 18:00:08
|
On 15-Jan-2015 09:02, Brian Walenz wrote:
> The option you're looking for is mbtThreads, with a default of 4.
>
> Also look into option mbtBatchSize, which sets how many reads to process
> per job. The default is 1 million, and you've already got at least 48
> jobs, so this is probably not an issue.

(snip)

> So, in summary, I don't know why you're not getting multiple CPUs on
> these. You can work around the problem by dropping the batch size to make
> jobs with about 8gb memory (smaller than 512/48), then run 48 jobs in
> parallel.

So many options, so little time. I don't suppose anybody has put together a script that asks for the relevant system and data information and then emits a SPEC file to run at something approximating optimal speed on the equipment at hand? The input would be something like (no doubt I'm leaving out key information):

    primary node:
      RAM=, CPU=, DISK=   #fill in the max to use, actual could be more
    cluster: Y            # N if none
      type=older N=10, RAM=, CPU=, DISK=
      type=newer N=20, RAM=, CPU=, DISK=
      queue_system=SGE
    FRG types: 2          #at least 1
      Illumina N=3, totalreads=
      Sanger N=2, totalreads=

As it is now, there are a lot of parameters to fiddle with

    runCA -options | wc
    184    <- !!!!

which probably all make perfect sense to people experienced with this software but which are fairly mysterious when first encountered.

In any case, I did try modifying the -t parameter on 0-mertrim/mertrim.sh while the jobs were running, and the new settings "took" as each new job started. The run times were:

    -t    ~minutes
     4    22
    16    14
    40    12-13

So there isn't much to be gained by pushing that parameter up.

> You can increase the number of jobs running at once with mbtConcurrency.

Kind of my point about the script, I overlooked that one. I did use merylThreads, but didn't realize that trim and count used different parameters. Concurrency x Threads, that is simultaneous jobs x cpus/job? There are 7 of the former parameters and 6 of the latter. Presumably if I spent a couple of hours reading all the documentation (which for some reason has been loading really, really slowly from sourceforge) I could make a guess at what would probably work best. The hypothetical script I alluded to would be a lot more convenient!

Thanks,

David Mathog
ma...@ca...
Manager, Sequence Analysis Facility, Biology Division, Caltech
|
From: Brian W. <th...@gm...> - 2015-01-15 17:11:32
|
That's definitely NOT the 8.2 version. Yours has:

    -dumpfasta[seq|qlt]   dump fragment sequence or quality, as fasta format

8.2 has:

    -dumpfasta <prefix>   dump fragment sequence and quality into <p.fasta> and <p.fasta.qual>

What does 'gatekeeper --version' report? (note, that's two dashes)

You should also verify that you're using the version you think you're using: 'which gatekeeper'. There might be an older version installed in your path. I've been stung by this a couple times.

b

On Tue, Jan 13, 2015 at 12:30 AM, Arpita Ghosh - Xcelris <arp...@xc...> wrote:
> Dear Brian
>
> Thank you for your prompt response.
> I want to run the following with the version 8.2; what should I use instead of "-dumpfastaseq"?
>
>     gatekeeper -dumpfastaseq genome.gkpStore > gatekeeper.fasta
>
> I tried gatekeeper -dumpfasta genome.gkpStore > gatekeeper.fasta with version 8.2 but it was not running. Can you please help.
>
> The gatekeeper version is not mentioned but the help is as follows:
>
> gatekeeper -h
> usage1: gatekeeper -o gkpStore [append/create options] <input.frg> <input.frg> ...
> usage2: gatekeeper -P partitionfile gkpStore
> usage3: gatekeeper [id-selection] [options] [format] gkpStore
> ----------------------------------------------------------------------
> The first usage will append to or create a GateKeeper store:
>   -a                  append to existing store
>   -o <gkpStore>       append to or create gkpStore
>
>   -T                  do not check minimum length (for OBT)
>   -F                  fix invalid insert size estimates
>
>   -E <error.frg>      write errors to this file
>
>   -v <vector-info>    load vector clear ranges into each read.
>                       MUST be done on an existing, complete store.
>                       example: -a -v vectorfile -o that.gkpStore
>                       format: 'UID vec-clr-begin vec-clr-end'
>
> ----------------------------------------------------------------------
> The second usage will partition an existing store, allowing
> the entire store partition to be loaded into memory.
>   -P <partitionfile>  a list of (partition fragiid)
>
> ----------------------------------------------------------------------
> The third usage will dump the contents of a GateKeeper store.
> There are THREE components to a dump, what to dump, options, and format.
> The first two are optional, the last is mandatory. Examples:
>
>   Dump metainfo for the first 100 fragments
>     gatekeeper -b 1 -e 100 -tabular -dumpfragments my.gkpStore > first100.tsv
>
>   Dump a random 25% of the reads in the first library
>     gatekeeper -randomsubset 1 0.25 -dumpfrg my.gkpStore > random25.frg
>
>   Dump fasta sequence for the UIDs in 'uidFile'
>     gatekeeper -uid uidFile -dumpfastaseq -dumpfrg my.gkpStore > file.fasta
>
> -----------------------------------
> [selection of what objects to dump]
> -----------------------------------
>   -b <begin-iid>           dump starting at this library or read
>   -e <ending-iid>          dump stopping after this iid
>   -uid <uid-file>          dump only objects listed in 'uid-file'
>   -iid <iid-file>          dump only objects listed in 'iid-file'
>   -randommated <lib> <n>   pick n mates (2n frags) at random from library lib
>   -randomsubset <lib> <f>  dump a random fraction f of library lib
>   -randomlength <lib> <l>  dump a random fraction of library lib, fraction picked
>                            so that the untrimmed length is close to l
>
> ---------
> [options]
> ---------
>   -tabular                   dump info, libraries or fragments in a tabular
>                              format (for -dumpinfo, -dumplibraries,
>                              and -dumpfragments, ignores -withsequence and -clear)
>   -isfeatureset <libID> <X>  sets exit value to 0 if feature X is set in
>                              library libID, 1 otherwise.
>                              If libID == 0, check all libraries.
>   -nouid                     dump info without including the read UID (for
>                              -dumpinfo, -dumplibraries, -dumpfragments)
>
> ----------------
> [format of dump]
> ----------------
>   -dumpinfo               print information on the store
>     -lastfragiid          just print the last IID in the store
>   -dumplibraries          dump all library records
>   -dumpfragments          dump fragment info, no sequence
>     -withsequence         ...and include sequence
>     -clear <clr>          ...in clear range <clr>, default=LATEST
>   -dumpfasta[seq|qlt]     dump fragment sequence or quality, as fasta format
>     -allreads             ...all reads, regardless of deletion status (deleted are lowercase)
>     -allbases             ...all bases (lowercase for non-clear)
>     -decoded              ...quality as integers ('20 21 19')
>     -clear <clr>          ...in clear range <clr>, default=LATEST
>   -dumpfrg                extract LIB, FRG and LKG messages
>     -allreads             ...all reads, regardless of deletion status
>     -donotfixmates        ...only extract the fragments given, do not add in
>                           missing mated reads
>     -clear <clr>          ...use clear range <clr>, default=LATEST
>     -format2              ...extract using frg format version 2
>   -dumpnewbler <prefix>   extract LIB, FRG and LKG messages, write in a
>                           format appropriate for Newbler. This will create
>                           files 'prefix.fna' and 'prefix.fna.qual'. Options
>                           -donotfixmates and -clear also apply.
>   -dumpfastq <prefix>     extract LIB, FRG and LKG messages, write in FastQ
>                           format. Currently this works only on a store with
>                           one library as all the mated reads are dumped into
>                           a single file. This will create files
>                           'prefix.paired.fastq', 'prefix.1.fastq',
>                           'prefix.2.fastq' and 'prefix.unmated.fastq' for
>                           unmated reads. Options -donotfixmates and -clear also apply.
>
> The Gatekeeper ensures that data entering the assembly system meets
> the data specification (see GateKeeper design document). It is also
> used for examining and partitioning the assembler data store.
>
> Each input message is checked for semantic consistency as described in
> the defining document for that stage. Messages containing a UID are
> converted to a UID,IID pair -- the assembler modules require
> consecutive IID beginning at 1 for efficient indexing of internal and
> disk-based data structures; gatekeeper performs this task. Finally,
> each message is inserted into the assembly data store.
>
> The GateKeeper succeeds if it consumes its entire input with less than
> a specified number of errors (the -e option). Upon successful exit,
> the store reflects all of the records that were successfully read.
> Unsuccessful records are reported to stderr, along with a brief
> explanation of the problem.
>
> If unsuccessful, the store is partially updated.
>
> Resource Requirements
>
> The key gatekeeper data structures are in-memory copies of its store.
> This store should scale linearly with the number of fragments.
>
> No formal benchmarking of the gatekeeper has been performed to date.
> However, each LKG message requires four random disk accesses -- two to
> read the linked fragment records, and two to write the updated
> fragment records. This can cause problems when gatekeeper is run over
> low-performance or heavily used NFS mount points.
>
> 3. General Pre Conditions
>
> a) Each input UID must be unique. A new message with a duplicate UID
> will be rejected as an error.
>
> b) Any object referred to in a message must be defined:
> def-before-ref. Likewise, before an object can be deleted, all
> references to it must be removed: unref-before-undef.
>
> c) The input specification is defined elsewhere.
>
> Regards,
> Arpita Ghosh
|
From: Brian W. <th...@gm...> - 2015-01-15 17:03:02
|
The option you're looking for is mbtThreads, with a default of 4.

Also look into option mbtBatchSize, which sets how many reads to process per job. The default is 1 million, and you've already got at least 48 jobs, so this is probably not an issue.

You can increase the number of jobs running at once with mbtConcurrency. You should be able to run 20 with the current job size. Dropping the batch size should decrease the memory used per job, and so you can then run more jobs.

On the current jobs, are the WORKING files non-zero size? If so, then the compute should be in the multi-threaded stage, and it should be using 4 CPUs. Check the mertrim.sh (or similar) script in the 0-mertrim directory to verify that it has "-t 4". Adding "-v" will make it report the number of reads processed during the compute, but it won't tell you the number of threads. Both of these are to check that the job is done with the data structure building -- after two days, it definitely should be.

So, in summary, I don't know why you're not getting multiple CPUs on these. You can work around the problem by dropping the batch size to make jobs with about 8gb memory (smaller than 512/48), then run 48 jobs in parallel.

b
|
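A spec-file sketch of the workaround in the last paragraph (the batch size is an assumed value and would need tuning until each job fits in roughly 8 GB):

    mbtThreads     = 4
    mbtBatchSize   = 250000     # assumed; reduced from the 1,000,000 default
    mbtConcurrency = 48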
From: mathog <ma...@ca...> - 2015-01-14 17:33:14
|
It looks like I set the wgs spec file parameters wrong, because both mer and mertrim run single threaded (most of the time) on a machine with 48 cpus and 530G of RAM. The data is 4 sets of illumina data. The data was loaded into gatekeeper OK, and that was single threaded too, but I can see why that might be. However, after that it began doing these very, very slowly:

/home/wgs_project/do_illumina_wgs/./0-mertrim/mertrim.sh 48 \
  > /home/wgs_project/do_illumina_wgs/./0-mertrim/..0048.err 2>&1

Most of the times when I have checked this either mer or mertrim is running at 99% cpu. On some occasions mertrim has gone a bit higher:

14097 wgsuser 39 19 18.7g 16g 1304 S 401.7 3.3 49:43.93 merTrim

That is still poor use of this machine, leaving it mostly idle. Here is the spec file, minus all comments:

utgErrorRate=0.03
utgErrorLimit=2.5
ovlErrorRate=0.06
cnsErrorRate=0.10
cgwErrorRate=0.10
merSize = 22
overlapper=ovl
unitigger = bog
utgBubblePopping = 1
merylMemory = 128000
merylThreads = 25
ovlHashBits=25
ovlHashBlockLength=180000000
ovlThreads = 2
ovlConcurrency = 20
ovlRefBlockSize = 32000000
ovlStoreMemory = 8192 # Mbp
frgCorrThreads = 10
frgCorrConcurrency = 3
ovlCorrBatchSize = 1000000
ovlCorrConcurrency = 25
cnsConcurrency = 16
useGrid = 0
scriptOnGrid = 0
s_300_qseq.frg
s_1000_qseq.frg
s_3000_qseq.frg
s_5000_qseq.frg

What needs to be tweaked to get mer, mertrim to use more of the machine's resources? This spec file was based on a couple found here and there on the web, and I'm sure that many of the memory/threads parameters are not optimal.

Thank you,

David Mathog
ma...@ca...
Manager, Sequence Analysis Facility, Biology Division, Caltech
|
From: carlos v. <cha...@gm...> - 2015-01-12 12:50:16
|
Thank you very much for your reply. Actually you are right, I had run out of disk space so the file was incomplete. I ran that step again after I had released some more space and the program finished successfully. Thanks!

On Fri, Dec 12, 2014 at 2:38 AM, Brian Walenz <th...@gm...> wrote:
> That's a new one! I'm guessing that the file (of overlaps, ".ovb.gz"
> probably) it is reading was truncated. Delete either the overlap files in
> 1-overlapper or just the whole 1-overlapper directory and restart.
>
> Better yet, save the existing ovb's in a subdirectory, restart, and
> compare. Hopefully one of the older ones will be smaller. You might need
> to ungzip them first. If it still fails, and the files are the same, then
> we know there's an actual problem somewhere.
>
> b
>
> On Thu, Dec 11, 2014 at 1:37 PM, carlos vargas <cha...@gm...> wrote:
>> Thanks a lot! I installed blasr as you suggested and the mapping was
>> performed and the .ovb files were obtained, however now I have a new
>> problem, is it a problem of memory? The error message is the following:
>>
>> overlapStoreBuild: AS_OVS_overlapFile.C:185: int
>> AS_OVS_readOverlap(BinaryOverlapFile*, OVSoverlap*): Assertion
>> `bof->bufferPos <= bof->bufferLen' failed.
>>
>> Failed with 'Aborted'
>>
>> Backtrace (mangled):
>>
>> /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x40a697]
>> /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f7860d6fcb0]
>> /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7f78609d70d5]
>> /lib/x86_64-linux-gnu/libc.so.6(abort+0x17b)[0x7f78609da83b]
>> /lib/x86_64-linux-gnu/libc.so.6(+0x2ed9e)[0x7f78609cfd9e]
>> /lib/x86_64-linux-gnu/libc.so.6(+0x2ee42)[0x7f78609cfe42]
>> /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild[0x411dbd]
>> /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild(main+0x7e6)[0x405b56]
>> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7f78609c276d]
>> /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild[0x405059]
>>
>> Backtrace (demangled):
>>
>> [0] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild::AS_UTL_catchCrash(int, siginfo*, void*) + 0x27 [0x40a697]
>> [1] /lib/x86_64-linux-gnu/libpthread.so.0::(null) + 0xfcb0 [0x7f7860d6fcb0]
>> [2] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x35 [0x7f78609d70d5]
>> [3] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x17b [0x7f78609da83b]
>> [4] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2ed9e [0x7f78609cfd9e]
>> [5] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2ee42 [0x7f78609cfe42]
>> [6] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild() [0x411dbd]
>> [7] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild::(null) + 0x7e6 [0x405b56]
>> [8] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0xed [0x7f78609c276d]
>> [9] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild() [0x405059]
>>
>> GDB:
>>
>> Aborted (core dumped)
|
From: Brian W. <th...@gm...> - 2015-01-10 14:56:28
|
Hi-

Sorry, this got lost in the holidays.

Look into masurca (http://www.genome.umd.edu/masurca.html). It's an offshoot of CA which handles Illumina PE/MP much better.

To run with CA, I'd suggest:

1) Use all reads from the start.
2) Mark the TruSeq as a non-random library -- I don't trust it to be random, and if it isn't, repeats will be all messed up.
3) You'll need to use 'dnc' to clean up the mate pair reads. The TruSeq should help here.
4) unitigger=bogart
5) Stopping after unitigs (stopAfter=unitigger) and evaluating unitig sizes (tigStore -g *gkpStore -t *tigStore 1 -U -d sizes) is generally helpful.
6) The 'fragment error correction' module is somewhat helpful, but very expensive. Disable it (doFragmentCorrection=0).

http://wgs-assembler.sourceforge.net/wiki/index.php/Pair_classification_within_Illumina_mate_pair_data
http://wgs-assembler.sourceforge.net/wiki/index.php/RunCA#De-novo_Classification

b
|
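Collected into spec-file form, suggestions 4-6 above would look roughly like this (a sketch; only the options named in that message):

    unitigger            = bogart
    doFragmentCorrection = 0
    stopAfter            = unitigger

followed by the unitig-size check once that stage completes:

    tigStore -g *gkpStore -t *tigStore 1 -U -d sizes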
From: Brian W. <th...@gm...> - 2015-01-10 14:37:14
|
Hi-

"-dumpfastaseq"? What version are you using? The usage for the latest versions (all of the 8.x series) is "gatekeeper -dumpfasta <out-prefix> *.gkpStore". Do any of the other dumps work?

b
|
From: Arpita G. - X. <arp...@xc...> - 2015-01-10 11:41:54
|
Dear User,

I am using the latest Celera assembler version but I am facing a problem in one step, as mentioned below.

After runCA I am running gatekeeper -dumpfastaseq genome.gkpStore > gatekeeper.fasta and getting a segmentation fault.

I am using a server with 64 cores, 1TB RAM and an AMD 6378 2.4GHz processor. The data input is MiSeq PE 90 Gb, NextSeq MP 100 Gb and 454 SG and PE of 2 Gb.

Can you suggest what might be the reason for the segmentation fault and how I can get over it?

--
Regards,
Arpita Ghosh
Scientist
Xcelris Labs Limited
|
From: Doyle, J. R M <jm...@pu...> - 2015-01-02 23:56:54
|
Hi! I’m working on an assembly of a non-model avian genome. We currently have Illumina paired-end reads that I’d like to use for contig-building, as well as both a mate-pair library and low coverage Illumina TruSeq synthetic reads (formerly Moleculo) that I’d like to use for scaffolding. Is this possible (and advisable) with Celera? |
From: Rui-Peng W. <rui...@gm...> - 2014-12-20 08:53:40
|
So the CNS stage has finished and the chunks have been processed with success, however PBcR does not go to the next stage. Currently, there is another PBcR job still in the hold queue:

(smrtanalysis-2.3.0) [rpw@temp]$ qstat
job-ID  prior    name        user  state  submit/start at      queue  slots  ja-task-ID
-----------------------------------------------------------------------------------------
246890  0.00000  pBcR_asm_P  rpw   hqw    12/17/2014 04:03:23

From what I know this held job calls the following script: runPBcR.sge.out.05.sh

But I do not know what to do next: 1) should I unhold this job with qrls and see what happens, with the risk that I have to rerun the whole pipeline again, or 2) is there a way to continue manually without unholding this job? I would prefer option 2 but I do not know how to do it. Also, I do not want to rerun from the start again because there is no time.

After the CNS stage I got the trimmed fastq and fasta files per chunk. I do not know how far I am with the self-correcting step. Any help or suggestion is very appreciated.
|
From: Rui-Peng W. <rui...@gm...> - 2014-12-18 08:12:38
|
Thanks for the suggestion but it did not help. I restarted runPartition.sh by specifying the failed chunk (i.e. #26). Although it did not complain about the perl interpreter, the final output.fasta and output.qual returned empty. I do see another stderr from 26.lay.err:

Parsing arguments
Opening stores
Loading library information
Streaming fragments
openLayFile()-- Failed to open '/data/../tempLib/asm.26.olaps' for reading: No such file or directory
Couldn't open '/data/../tempLib/asm.26.olaps' for read
No such file or directory

Is there a way to reproduce these asm*.olaps files again, so I can restart runPartition.sh on the failed chunks with success? At the moment I have only 2 failed chunks out of 140, but still 60 chunks are being processed.
|
From: Serge K. <ser...@gm...> - 2014-12-17 23:02:53
|
This is most likely due to a missing or different perl on some of your cluster nodes. You can try logging into the nodes on which the chunks are failing to see if you can run/access the specified perl path. If a consistent set of nodes is failing, you could also restrict the runPartition.sh script to only run on a subset of your machines where it works. If it only happens for a small subset of chunks, you can re-run the failed ones manually using the runPartition.sh script. Just specify the missing chunk on the command line:

  cd temp<library name>
  sh runPartition.sh <failed chunk #>

Sergey

> On Dec 17, 2014, at 6:25 AM, Rui-Peng Wang <rui...@gm...> wrote:
>
> Hello,
>
> I am running PBcR using MHAP. It is currently in the PBcR CNS stage.
> Currently 53 chunks out of 200 are being processed. Three chunks are
> finished, however two of them returned this message in stderr:
>
> /opt/sge/default/pool/Node028/job_scripts/246889:
> /data/rp/tools/wgs-8.2/Linux-amd64/bin/convertToPBCNS:
> /opt/smrtanalysis/miscdeps/basesys/usr/bin/perl: bad interpreter:
> Permission denied
>
> The strange thing is one chunk took 4 hours and it finished successfully
> with a generated trim.fasta file. The failed chunks did not generate
> trim.fasta but there is a broken symlink.
>
> We have tested our SGE system with a dummy perl script (a modified copy
> of convertToPBCNS) and all our nodes did not complain about the perl
> interpreter.
>
> Can someone help me deal with this, and can I continue with the CNS
> stage process?
|
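A small sketch putting these two commands into a loop when several chunks need re-running; the directory name and chunk numbers are examples only and must be replaced with the real temp<library name> directory and the chunks that actually failed:

    cd tempLib                          # replace with your temp<library name> directory
    for chunk in 26 99; do              # replace with the failed chunk numbers
        sh runPartition.sh "$chunk" > "$chunk".rerun.err 2>&1
    done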
From: Rui-Peng W. <rui...@gm...> - 2014-12-17 11:25:11
|
Hello,

I am running PBcR using MHAP. It is currently in the PBcR CNS stage. Currently 53 chunks out of 200 are being processed. Three chunks are finished, however two of them returned this message in stderr:

  /opt/sge/default/pool/Node028/job_scripts/246889: /data/rp/tools/wgs-8.2/Linux-amd64/bin/convertToPBCNS: /opt/smrtanalysis/miscdeps/basesys/usr/bin/perl: bad interpreter: Permission denied

The strange thing is one chunk took 4 hours and it finished successfully with a generated trim.fasta file. The failed chunks did not generate trim.fasta but there is a broken symlink.

We have tested our SGE system with a dummy perl script (a modified copy of convertToPBCNS) and all our nodes did not complain about the perl interpreter.

Can someone help me deal with this, and can I continue with the CNS stage process?
|
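One way to check the interpreter directly on each execution host, assuming qrsh is available and the hostname resource can be used to target a node on this SGE setup; the host names below are placeholders:

    PERL=/opt/smrtanalysis/miscdeps/basesys/usr/bin/perl
    for node in Node028 Node029; do     # placeholder execution host names
        echo "== $node =="
        qrsh -l hostname="$node" "$PERL -v"
    done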
From: Manjari D. <man...@gm...> - 2014-12-17 05:50:52
|
I am trying to run Celera and it has given me the following error:

runCA failed.

----------------------------------------
Stack trace:

 at /nfshome/murali/Assembelers/wgs-8.2beta/Linux-amd64/bin/runCA line 1568, <J> line 65.
 main::caFailure("failed to create the overlap store", "/nfshome/murali/Assembly_results/Celera_results/Celera_grt120"...) called at /nfshome/murali/Assembelers/wgs-8.2beta/Linux-amd64/bin/runCA line 3912
 main::createOverlapStore() called at /nfshome/murali/Assembelers/wgs-8.2beta/Linux-amd64/bin/runCA line 6475

----------------------------------------
Last few lines of the relevant log file (/nfshome/murali/Assembly_results/Celera_results/Celera_grt1200/workDirCelera_grt1200/Peanut_454_grt1200Celera.ovlStore.err):

/nfshome/murali/Assembelers/wgs-8.2beta/Linux-amd64/bin/overlapStoreBuild: unknown option '-i'.
usage: /nfshome/murali/Assembelers/wgs-8.2beta/Linux-amd64/bin/overlapStoreBuild -o asm.ovlStore -g asm.gkpStore [opts] [-L fileList | *.ovb.gz]
  -o asm.ovlStore    path to store to create
  -g asm.gkpStore    path to gkpStore for this assembly
  -F f               use up to 'f' files for store creation
  -M m               use up to 'm' MB memory for store creation
  -plc t             type of filtering for PLC fragments -- NOT SUPPORTED
  -obt               filter overlaps for OBT
  -dup               filter overlaps for OBT/dedupe
  -e e               filter overlaps above e fraction error
  -L fileList        read input filenames from 'flieList'

----------------------------------------
Failure message:

failed to create the overlap store

What should I do?
|
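The "unknown option '-i'" message suggests the runCA script and the overlapStoreBuild binary may not come from the same release, so it is worth checking that both live in the same wgs-8.2beta bin directory. If the store then still has to be built by hand, a hedged sketch of a manual invocation using only the options shown in the usage above; the 1-overlapper location, the store and gkpStore names, and the memory cap are guesses taken from the log file name, not verified settings:

    cd /nfshome/murali/Assembly_results/Celera_results/Celera_grt1200/workDirCelera_grt1200

    # collect the per-job overlap outputs; the 1-overlapper location is assumed
    ls 1-overlapper/*/*.ovb.gz > ovb.files.list

    # store/gkpStore names are guessed from the log file name above
    /nfshome/murali/Assembelers/wgs-8.2beta/Linux-amd64/bin/overlapStoreBuild \
        -o Peanut_454_grt1200Celera.ovlStore \
        -g Peanut_454_grt1200Celera.gkpStore \
        -M 8192 \
        -L ovb.files.list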
From: Brian W. <th...@gm...> - 2014-12-12 01:39:00
|
That's a new one! I'm guessing that the file (of overlaps, ".ovb.gz" probably) it is reading was truncated. Delete either the overlap files in 1-overlapper or just the whole 1-overlapper directory and restart. Better yet, save the existing ovb's in a subdirectory, restart, and compare. Hopefully one of the older ones will be smaller. You might need to ungzip them first. If it still fails, and the files are the same, then we know there's an actual problem somewhere. b On Thu, Dec 11, 2014 at 1:37 PM, carlos vargas <cha...@gm...> wrote: > Thanks a lot! I installed blasr as you suggested and the mapping was > performed and the .ovb files were obtained, however now I have a new > problem, is it a problem of memory? The error message is the following: > > > overlapStoreBuild: AS_OVS_overlapFile.C:185: int > AS_OVS_readOverlap(BinaryOverlapFile*, OVSoverlap*): Assertion > `bof->bufferPos <= bof->bufferLen' failed. > > Failed with 'Aborted' > > Backtrace (mangled): > > > /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x40a697] > /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f7860d6fcb0] > /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7f78609d70d5] > /lib/x86_64-linux-gnu/libc.so.6(abort+0x17b)[0x7f78609da83b] > /lib/x86_64-linux-gnu/libc.so.6(+0x2ed9e)[0x7f78609cfd9e] > /lib/x86_64-linux-gnu/libc.so.6(+0x2ee42)[0x7f78609cfe42] > /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild[0x411dbd] > > /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild(main+0x7e6)[0x405b56] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7f78609c276d] > /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild[0x405059] > > Backtrace (demangled): > > [0] > /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild::AS_UTL_catchCrash(int, > siginfo*, void*) + 0x27 [0x40a697] > [1] /lib/x86_64-linux-gnu/libpthread.so.0::(null) + 0xfcb0 > [0x7f7860d6fcb0] > [2] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x35 [0x7f78609d70d5] > [3] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x17b [0x7f78609da83b] > [4] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2ed9e [0x7f78609cfd9e] > [5] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2ee42 [0x7f78609cfe42] > [6] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild() [0x411dbd] > [7] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild::(null) + > 0x7e6 [0x405b56] > [8] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0xed [0x7f78609c276d] > [9] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild() [0x405059] > > GDB: > > > Aborted (core dumped) > > |
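A short sketch of the "save, restart, compare" suggestion above; it assumes the overlap outputs live under 1-overlapper/<partition>/ as in the other threads here, that runCA recreates the directory on restart, and that gzip and zcat are available. Paths and file names are placeholders:

    cd <assembly work directory>                        # placeholder
    gzip -t 1-overlapper/*/*.ovb.gz                     # a truncated file fails the integrity test
    mv 1-overlapper 1-overlapper.old                    # keep the old outputs for comparison
    # re-run the same runCA command so the overlap jobs are recomputed, then
    # compare one old job output against its regenerated counterpart:
    zcat 1-overlapper.old/001/000278.ovb.gz > old.ovb   # example file name from the other thread
    zcat 1-overlapper/001/000278.ovb.gz     > new.ovb
    ls -l old.ovb new.ovb                               # a truncated original will be smaller
    cmp old.ovb new.ovb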
From: carlos v. <cha...@gm...> - 2014-12-11 18:37:27
|
Thanks a lot! I installed blasr as you suggested and the mapping was performed and the .ovb files were obtained, however now I have a new problem, is it a problem of memory? The error message is the following: overlapStoreBuild: AS_OVS_overlapFile.C:185: int AS_OVS_readOverlap(BinaryOverlapFile*, OVSoverlap*): Assertion `bof->bufferPos <= bof->bufferLen' failed. Failed with 'Aborted' Backtrace (mangled): /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x40a697] /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f7860d6fcb0] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7f78609d70d5] /lib/x86_64-linux-gnu/libc.so.6(abort+0x17b)[0x7f78609da83b] /lib/x86_64-linux-gnu/libc.so.6(+0x2ed9e)[0x7f78609cfd9e] /lib/x86_64-linux-gnu/libc.so.6(+0x2ee42)[0x7f78609cfe42] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild[0x411dbd] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild(main+0x7e6)[0x405b56] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7f78609c276d] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild[0x405059] Backtrace (demangled): [0] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild::AS_UTL_catchCrash(int, siginfo*, void*) + 0x27 [0x40a697] [1] /lib/x86_64-linux-gnu/libpthread.so.0::(null) + 0xfcb0 [0x7f7860d6fcb0] [2] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x35 [0x7f78609d70d5] [3] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x17b [0x7f78609da83b] [4] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2ed9e [0x7f78609cfd9e] [5] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2ee42 [0x7f78609cfe42] [6] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild() [0x411dbd] [7] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild::(null) + 0x7e6 [0x405b56] [8] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0xed [0x7f78609c276d] [9] /home/cvargas/wgs-8.2/Linux-amd64/bin/overlapStoreBuild() [0x405059] GDB: Aborted (core dumped) On Mon, Dec 8, 2014 at 9:36 PM, Serge Koren <ser...@gm...> wrote: > Hi, > > This error is caused by a missing component of BLASR (sawriter) not being > available in your path. It looks like you’re using BLASR which is packaged > with the SSPACE scaffolder so I’m not sure if they include the full BLASR > package. I would suggest downloading/compiling BLASR directly from GitHub > and adding it to your path to ensure the full package is available. > > Sergey > > > On Dec 8, 2014, at 3:22 PM, carlos vargas <cha...@gm...> wrote: > > > > Hello, > > > > I have been trying to clean some PacBio reads using Illumina reads with > PBcR. However it has been failing during the overlapper. It builds several > files using the Illumina reads, however when it fails it gives the > following message: > > > > ----------------------------------------END CONCURRENT Mon Dec 8 > 20:52:43 2014 (3695 seconds) > > ERROR: Overlap prep job > Illumina/PE/CA//temppacbio/1-overlapper/long_reads_part 1 FAILED. > > ERROR: Overlap prep job > Illumina/PE/CA//temppacbio/1-overlapper/long_reads_part 2 FAILED. > > > > 2 overlap partitioning jobs failed. > > > > In 1-overlapper there is the following message in the 24.hash.err: > > > > /usr/bin/time: cannot run > /home/local_installs/software/SSPACE-LongRead/sawriter: No such file or > directory > > Command exited with non-zero status 127 > > 0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata > 2112maxresident)k > > 0inputs+8outputs (0major+83minor)pagefaults 0swaps > > > > However it seems it is not a problem as it has it performs every step > until reaching that step. 
Any idea as to what might be the problem? > > > > Thanks in advance, > > Carlos > > |
From: Brian W. <th...@gm...> - 2014-12-11 16:30:08
|
Hi, Geoff-

Yes. Make a new run directory, symlink the 0-6 directories, and copy the smaller files in gkpStore. The first y.pestis example shows how to copy gkpStore for this.

http://wgs-assembler.sourceforge.net/wiki/index.php/Yersinia_pestis_KIM_D27,_using_454_8_Kbp_mated_reads,_with_CA8.2

On the gkpStore copy, edit to remove the mate ID from each read in that library. Something like:

  gatekeeper -dumpfragments -randommated <lib> 1.0 -tabular *gkpStore > mated-reads-in-lib

to get a list of the reads in the library, then edit that list of reads to look like:

  frg iid <iid> mateiid 0

... then

  gatekeeper -edit <file> *gkpStore

The bottom of http://wgs-assembler.sourceforge.net/wiki/index.php/GkpStore has the docs for this.

Restarting runCA on this new directory should pick up with CGW. Results won't be identical to a fresh start, as the insert size on the other libraries is already set. You could edit those back to the original. The impact will be small, I think.

b

On Thu, Dec 11, 2014 at 10:47 AM, Waldbieser, Geoff <Geo...@ar...> wrote:

> Hi,
>
> I have completed a large assembly using a combination of PE reads and
> jump reads, and would like to compare this with an assembly that does not
> utilize the longest jump library. However, this is a large assembly with a
> long runtime. Can one utilize the data from prior stages and rerun cgw and
> the downstream processes but remove the particular library from the
> scaffolding process?
>
> Geoff
>
> Geoff Waldbieser
> USDA, ARS, Warmwater Aquaculture Research Unit
> 141 Experiment Station Road
> Stoneville, Mississippi 38776
> Ofc. 662-686-3593
> Fax. 662-686-3567
>
> This electronic message contains information generated by the USDA solely
> for the intended recipients. Any unauthorized interception of this message
> or the use or disclosure of the information it contains may violate the law
> and subject the violator to civil or criminal penalties. If you believe you
> have received this message in error, please notify the sender and delete
> the email immediately.
|
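A sketch stitching these steps together for one library. The run directory names, the gkpStore name, and the library IID (here 3) are illustrative assumptions; the awk line assumes the first tabular column is the read IID and skips a header line; and the cp copies the whole store for simplicity, whereas the wiki example shows how to symlink the large sequence files instead:

    mkdir run-no-jump && cd run-no-jump
    ln -s ../run-full/[0-6]-* .                 # symlink the 0-6 stage directories (placeholder names)
    cp -r ../run-full/asm.gkpStore .            # copy, don't symlink, the gatekeeper store

    # list mated reads in the jump library (assumed library IID 3), then clear their mate IDs
    gatekeeper -dumpfragments -randommated 3 1.0 -tabular asm.gkpStore > mated-reads-in-lib
    awk 'NR > 1 { print "frg iid", $1, "mateiid 0" }' mated-reads-in-lib > unmate.edits
    gatekeeper -edit unmate.edits asm.gkpStore

    # restarting runCA with the original spec in this directory should then pick up at CGW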
From: Waldbieser, G. <Geo...@AR...> - 2014-12-11 16:02:36
|
Hi,

I have completed a large assembly using a combination of PE reads and jump reads, and would like to compare this with an assembly that does not utilize the longest jump library. However, this is a large assembly with a long runtime. Can one utilize the data from prior stages and rerun cgw and the downstream processes but remove the particular library from the scaffolding process?

Geoff

Geoff Waldbieser
USDA, ARS, Warmwater Aquaculture Research Unit
141 Experiment Station Road
Stoneville, Mississippi 38776
Ofc. 662-686-3593
Fax. 662-686-3567

This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.
|
From: Serge K. <ser...@gm...> - 2014-12-08 20:37:04
|
Hi,

This error is caused by a component of BLASR (sawriter) not being available in your path. It looks like you’re using the BLASR which is packaged with the SSPACE scaffolder, so I’m not sure if they include the full BLASR package. I would suggest downloading/compiling BLASR directly from GitHub and adding it to your path to ensure the full package is available.

Sergey

> On Dec 8, 2014, at 3:22 PM, carlos vargas <cha...@gm...> wrote:
>
> Hello,
>
> I have been trying to clean some PacBio reads using Illumina reads with PBcR. However, it has been failing during the overlapper. It builds several files using the Illumina reads, however when it fails it gives the following message:
>
> ----------------------------------------END CONCURRENT Mon Dec 8 20:52:43 2014 (3695 seconds)
> ERROR: Overlap prep job Illumina/PE/CA//temppacbio/1-overlapper/long_reads_part 1 FAILED.
> ERROR: Overlap prep job Illumina/PE/CA//temppacbio/1-overlapper/long_reads_part 2 FAILED.
>
> 2 overlap partitioning jobs failed.
>
> In 1-overlapper there is the following message in the 24.hash.err:
>
> /usr/bin/time: cannot run /home/local_installs/software/SSPACE-LongRead/sawriter: No such file or directory
> Command exited with non-zero status 127
> 0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 2112maxresident)k
> 0inputs+8outputs (0major+83minor)pagefaults 0swaps
>
> However, it seems it is not a problem elsewhere, as it performs every step until reaching that one. Any idea as to what might be the problem?
>
> Thanks in advance,
> Carlos
|
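A small sketch of the suggested fix, assuming BLASR has already been built and installed somewhere under $HOME/blasr (the install location is a placeholder, not a standard path):

    # put the BLASR binaries on the PATH used by the assembly jobs
    export PATH="$HOME/blasr/bin:$PATH"
    command -v sawriter || echo "sawriter is still not on the PATH"
    command -v blasr    || echo "blasr is still not on the PATH"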
From: carlos v. <cha...@gm...> - 2014-12-08 20:22:55
|
Hello,

I have been trying to clean some PacBio reads using Illumina reads with PBcR. However, it has been failing during the overlapper. It builds several files using the Illumina reads, however when it fails it gives the following message:

----------------------------------------END CONCURRENT Mon Dec 8 20:52:43 2014 (3695 seconds)
ERROR: Overlap prep job Illumina/PE/CA//temppacbio/1-overlapper/long_reads_part 1 FAILED.
ERROR: Overlap prep job Illumina/PE/CA//temppacbio/1-overlapper/long_reads_part 2 FAILED.

2 overlap partitioning jobs failed.

In 1-overlapper there is the following message in the 24.hash.err:

/usr/bin/time: cannot run /home/local_installs/software/SSPACE-LongRead/sawriter: No such file or directory
Command exited with non-zero status 127
0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 2112maxresident)k
0inputs+8outputs (0major+83minor)pagefaults 0swaps

However, it seems it is not a problem elsewhere, as it performs every step until reaching that one. Any idea as to what might be the problem?

Thanks in advance,
Carlos
|
From: Brian W. <th...@gm...> - 2014-12-08 16:18:00
|
[resending to the list too, oops]

How many reads? 700k * 200 = 140 Mbp. For starters, try

ovlHashBits = 22
ovlHashBlockLength = 200000000
ovlRefBlockSize = 18000000
ovlThreads = 4

This will generate jobs that need about 6gb memory each. Increasing ovlRefBlockSize will decrease the number of jobs but increase the run time of each. See http://wgs-assembler.sourceforge.net/wiki/index.php/RunCA#OVL_Overlapper for details.

3 or 4 days probably won't happen; 2-3 weeks is more reasonable, with a lot dependent on the mate size and genome complexity (or lack thereof = repeats).

On Mon, Dec 8, 2014 at 1:01 AM, Manjari Deshmukh <man...@gm...> wrote:

> Hi,
>
> I am running Celera with 700000 reads, each more than 200 bp in length, and
> a genome size of approx 1.3 Gb. Celera has generated 438064 overlap jobs.
>
> I want to know if there is any method to increase the assembly speed. I
> want results in 3 or 4 days.
|
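Collected into a small spec file, the settings suggested above would look like the sketch below; the work directory, prefix, spec file, and fragment file names are placeholders, and the rest of the spec (grid settings, libraries, etc.) would stay as in the existing run:

    # write the overlap settings suggested above into a small spec file
    {
        echo "ovlHashBits        = 22"
        echo "ovlHashBlockLength = 200000000"
        echo "ovlRefBlockSize    = 18000000"
        echo "ovlThreads         = 4"
    } > overlap-speed.spec

    # placeholder work directory, prefix, and fragment file names
    runCA -d asm-run -p asm -s overlap-speed.spec reads.frg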