From: Walenz, B. <wa...@nb...> - 2015-01-22 22:58:00
Unfortunately there isn't anything as nice as 'run X jobs for stage Y'.

For overlaps, the hash parameters generally set how much memory each job will use, and the ref parameters control how long each job will run and, indirectly, how many jobs are generated. There is, of course, some dependence of the number of jobs on the hash parameters. I tend to find a set of hash parameters that works well on my data type (Illumina, 454, etc.) and my hardware, then leave them alone and use the ref size to tune the number of jobs.

A trick I use here is "useGrid=1 scriptOnGrid=0" to get up to the overlap stage. This will set up the overlap compute but not launch it. You can then see how many jobs there are, and maybe run a few to see how much memory they use. To reconfigure, remove the overlap.sh script and rerun runCA. All it will do is recompute the job partitioning, make a new script, and tell you to run it.

For consensus, cnsMinFrags and cnsPartitions control the number of jobs. It will try to make cnsPartitions jobs, unless that would put fewer than cnsMinFrags fragments in a job, in which case there will be fewer jobs. Going above roughly cnsPartitions = 200 can, IIRC, cause the 'too many open files' error. Unlike overlaps, the partitioning here is set at the output of unitigger, and to change it, unitigs must be recomputed.

b

From: Miguel Grau [mailto:mi...@uj...]
Sent: Wednesday, January 21, 2015 8:37 PM
To: Brian Walenz
Cc: wgs...@li...
Subject: Re: [wgs-assembler-users] Configuration SGE

Hi Brian,

Is there a way to select the number of trim jobs when running in SGE mode? I mean, if I select SGE mode, for some steps (utg, for example) the main job is split into several small jobs (130 in my case). Is there a parameter to request, say, 1000 small trim jobs? Or is the only way to play with the ovlHashBits and ovlHashBlockLength parameters?

Thanks for your help,
Miquel

On 2015/01/20 12:00, Brian Walenz wrote:

Definitely better with smaller values! NOTE!
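The "useGrid=1 scriptOnGrid=0" trick and the consensus knobs described above might look like this in a spec file. This is only a sketch: the parameter names come from the thread, but the values shown are placeholders, not recommendations.

```
# Illustrative runCA spec fragment -- values are examples only.
useGrid       = 1      # configure compute jobs for SGE...
scriptOnGrid  = 0      # ...but stop instead of submitting them

# Consensus partitioning (fixed when unitigger writes its output;
# changing these requires recomputing unitigs).
cnsPartitions = 128    # target job count; >~200 risks 'too many open files'
cnsMinFrags   = 75000  # make fewer jobs rather than drop below this many
                       # fragments per job
```

With scriptOnGrid=0, runCA stops once the overlap jobs are set up; you can trial-run one by hand (sh overlap.sh 1), then delete overlap.sh and rerun runCA to repartition.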
You need to keep ovlThreads and the SGE "-pe thread" values the same. As you have it now, you've told the overlapper to run with 6 threads, but only requested 2 from SGE. The value itself is totally up to you, whatever works at your site.

If possible, run a job or two by hand before submitting to the grid (sh overlap.sh <jobNum>). This will report stats on the hash table usage, and let you see memory usage and run time. If not possible, check the log files in the overlap directory as it runs. You want to check that the hash table isn't mostly empty (a load less than 50%). If it is, increase the hash block length or decrease the bits. The other side (too full) isn't really a problem - it'll just do multiple passes to compute all the overlaps.

b

On Mon, Jan 19, 2015 at 9:17 PM, Miguel Grau <mi...@uj...<mailto:mi...@uj...>> wrote:

@Ludovic: virtual_free and h_vmem are mandatory on our cluster. Thanks for the answer.

@Brian: I increased these values because my batch of fastq files is around 40 GB, so I thought I had to use the following (per the ovlHashBits table here<http://wgs-assembler.sourceforge.net/wiki/index.php/RunCA>, if I want to use 2 threads on SGE):

ovlHashBits = 27
ovlHashBlockLength = 260000000
ovlRefBlockSize = 7630000
ovlThreads = 2
sge = -pe thread 2 -l h_vmem=50G

Instead of this, would it work better if I decrease the ovlHashBits and ovlHashBlockLength values and increase the ovlRefBlockSize and ovlThreads values?

ovlHashBits = 25
ovlHashBlockLength = 240000000
ovlRefBlockSize = 18000000
ovlThreads = 6
sge = -pe thread 2 -l h_vmem=50G

Thanks for your help,
Miquel

On 2015/01/20 00:29, Brian Walenz wrote:

I've never seen large overlap jobs perform better than small jobs. Target an 8 GB job with ~4 CPUs each. My default configuration is:

ovlHashBits = 22
ovlHashBlockLength = 200000000
ovlRefBlockSize = 18000000
ovlThreads = 6

The two 'hash' sizes control how big the job is.
The 'ref block size' controls how many reads are processed by each job, i.e., how long the job runs.

b

On Mon, Jan 19, 2015 at 5:10 AM, Ludovic Mallet <lud...@un...<mailto:lud...@un...>> wrote:

Hi,

Not the best expert, but to me, virtual_free allows the job to swap, which you should try to avoid, and I think h_vmem is the hard limit, so the job would be killed whenever the line is crossed. From http://gridengine.eu/grid-engine-internals:

"hard limitation: All processes of the job combined are limited from the Linux kernel that they are able to use only the requested amount of memory. Further malloc() calls will fail."

Whether h_vmem is hard by default in GE has to be checked again, but I'd rather use mem_free instead.

Best,
Ludovic

On 19/01/15 02:22, Miguel Grau wrote:
> Dear all,
>
> I am having some trouble configuring the wgs 8.2 assembler with SGE
> options. I always get a malloc memory error and I am not sure why. I am
> working with 3 paired fastq files (6 files in total) with 100 bp reads
> (15 million reads in each fastq file). My config file:
>
> useGrid = 1
> scriptOnGrid = 1
>
> sge = -A assembly
> sgeMerTrim = -l h_vmem=150G -l virtual_free=150G
> sgeScript = -l h_vmem=50G -l virtual_free=50G
> sgeOverlap = -l h_vmem=100G -l virtual_free=100G
> sgeMerOverlapSeed = -l h_vmem=100G -l virtual_free=100G
> sgeMerOverlapExtend = -l h_vmem=100G -l virtual_free=100G
> sgeConsensus = -l h_vmem=100G -l virtual_free=100G
> sgeFragmentCorrection = -l h_vmem=100G -l virtual_free=100G
> sgeOverlapCorrection = -l h_vmem=100G -l virtual_free=100G
>
> overlapper = ovl  # Best for Illumina
> unitigger = bogart  # Best for Illumina
>
> # For 50GB...
> ovlHashBits = 28
> ovlHashBlockLength = 480000000
> # 100GB for overlap
> ovlStoreMemory = 102400
>
> ovlThreads = 2
> ovlRefBlockSize = 7630000
> frgCorrBatchSize = 1000000
> frgCorrThreads = 8
>
> The error that I have now is:
>
> ------------------------------------------------------------------------------
> bucketizing /reads/a6/0-overlaptrim-overlap/001/000278.ovb.gz
> bucketizing /reads/a6/0-overlaptrim-overlap/001/000276.ovb.gz
> bucketizing /reads/a6/0-overlaptrim-overlap/001/000275.ovb.gz
> bucketizing /reads/a6/0-overlaptrim-overlap/001/000280.ovb.gz
> bucketizing DONE!
> overlaps skipped:
> 1211882406 OBT - low quality
> 0 DUP - non-duplicate overlap
> 0 DUP - different library
> 0 DUP - dedup not requested
> terminate called after throwing an instance of 'std::bad_alloc'
> what(): std::bad_alloc
>
> Failed with 'Aborted'
>
> Backtrace (mangled):
>
> /miquel/wgs-8.2/Linux-amd64/bin/overlapStoreBuild(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x40a697]
> /lib64/libpthread.so.0[0x3ff1c0f710]
> /lib64/libc.so.6(gsignal+0x35)[0x3ff1432925]
> /lib64/libc.so.6(abort+0x175)[0x3ff1434105]
> ....
> ----------------------------------------------------------------------------------
>
> Any idea for the best config?
>
> Thank you,
>
> Miguel
>
> ------------------------------------------------------------------------------
> New Year. New Location. New Benefits. New Data Center in Ashburn, VA.
> GigeNET is offering a free month of service with a new server in Ashburn.
> Choose from 2 high performing configs, both with 100TB of bandwidth.
> Higher redundancy. Lower latency. Increased capacity. Completely compliant.
> http://p.sf.net/sfu/gigenet
> _______________________________________________
> wgs-assembler-users mailing list
> wgs...@li...<mailto:wgs...@li...>
> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
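The partitioning described in this thread (hash blocks times ref blocks) can be sanity-checked with a quick estimate. This is an assumed model based on Brian's description, one job per hash-block/ref-block pair, not runCA's actual partitioning code; the read counts are taken from Miguel's numbers.

```python
import math

def estimate_overlap_jobs(total_bases, total_reads,
                          ovl_hash_block_length, ovl_ref_block_size):
    """Rough job-count estimate: one job per (hash block, ref block) pair.

    Assumed model only: the hash block length sets how many bases go into
    each hash table, the ref block size sets how many reads stream against
    each table, and every combination becomes one grid job.
    """
    hash_blocks = math.ceil(total_bases / ovl_hash_block_length)
    ref_blocks = math.ceil(total_reads / ovl_ref_block_size)
    return hash_blocks * ref_blocks

# Miguel's data: 6 fastq files x 15 M reads x 100 bp = 90 M reads, 9 Gbp.
# With his second parameter set (ovlHashBlockLength = 240 M,
# ovlRefBlockSize = 18 M):
jobs = estimate_overlap_jobs(9_000_000_000, 90_000_000,
                             240_000_000, 18_000_000)
print(jobs)  # ceil(9e9 / 240e6) * ceil(90e6 / 18e6) = 38 * 5 = 190
```

Under this model, shrinking ovlRefBlockSize raises the ref-block count and hence the job count, which matches Brian's advice to leave the hash parameters alone and tune job counts via the ref size.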