From: Santiago R. <san...@gm...> - 2014-06-17 22:45:30
Hi Brian,

When using 1024, it said the OS wasn't able to handle it and recommended
using 1008. With 1008, CA failed with "Failed to open output file... Too
many open files". I'm now trying with fewer parts, but I don't think that
will solve the problem. Do you have any more ideas?

Thanks again in advance.

Regards,
Santiago

On Sun, Jun 15, 2014 at 10:10 PM, Santiago Revale <san...@gm...> wrote:

> Hi Brian,
>
> Thanks for your reply. Regarding your suggestions:
>
> 1) The PBcR process generates OVB files without zipping them; just to
> be sure, I tried to unzip some of them in case the extension was
> missing.
>
> 2) I re-launched the process with the suggested parameters, but using
> 512 instead of 1024; the result was exactly the same: the same error in
> the same step. Again, 511 out of 512 files were 2.3 Gb each while the
> last file was 1.2 Tb. Do you know why this happens?
>
> I'm trying one last time with 1024 instead.
>
> Thanks again for your reply. I'm open to more suggestions.
>
> Regards,
> Santiago
>
>
> On Fri, Jun 13, 2014 at 4:25 PM, Brian Walenz <th...@gm...> wrote:
>
>> Hi-
>>
>> This is a flaw in gzip: it doesn't report the uncompressed size
>> correctly for files larger than 2 Gb. I'm not intimately familiar with
>> this pipeline, so I don't know exactly how to implement the fixes
>> below.
>>
>> Fix with either:
>>
>> 1) gzip -d the *gz files before building the overlap store. The 'find'
>> command in the log indicates the pipeline will pick up the
>> uncompressed files. You might need to remove the 'asm.ovlStore.list'
>> file before restarting (this has the list of inputs to
>> overlapStoreBuild).
>>
>> 2) Set ovlStoreMemory to (exactly) "0 -f 1024". This will tell it to
>> use 0 MB of memory and instead use 1024 files regardless of size. 512
>> files will also work, and is a little safer (not near some Linux
>> 'number of open files' limits).
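[As an aside on the 'number of open files' limit mentioned above: the
per-process limit can be checked and raised from the shell before
re-running the pipeline. A minimal sketch; whether raising it is enough
depends on your system's hard ceiling:]

```shell
# Show the current soft limit on open file descriptors for this shell:
ulimit -n

# Show the hard ceiling the soft limit may be raised to without root:
ulimit -Hn

# Raise the soft limit up to the hard ceiling for this shell and its
# children before re-running the pipeline (a fixed value such as
# `ulimit -n 2048` also works, as long as it stays at or below the
# ceiling reported above):
ulimit -n "$(ulimit -Hn)" || true
```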
>>
>> 3) Build the overlap store by hand (with either the uncompressed
>> input, or the -f option instead of -M), outside the script, and then
>> restart the script. The script will notice there is an overlap store
>> already present and skip the build. The command is in the log file --
>> make sure the final store is called 'asm.ovlStore', not
>> 'asm.ovlStore.BUILDING'.
>>
>> Option 1 should work, but option 2 is the easiest to try. I wouldn't
>> try option 3 until Sergey speaks up.
>>
>> b
>>
>>
>> On Fri, Jun 13, 2014 at 12:33 PM, Santiago Revale <
>> san...@gm...> wrote:
>>
>>> Dear CA community,
>>>
>>> I'm running the correction of some PacBio reads with high-identity
>>> Illumina reads, on a high-memory server, for a 750 Mbp genome. I
>>> considered the known issues addressed on the website when starting
>>> the correction.
>>>
>>> When executing the pipeline, I reached the overlapStoreBuild step
>>> with 48 ovb files of 26 Gb each (totaling 1.2 Tb). The ovl files had
>>> already been deleted by the script. The error happened while
>>> executing overlapStoreBuild:
>>>
>>> ...
>>> bucketizing DONE!
>>> overlaps skipped:
>>> 0 OBT - low quality
>>> 0 DUP - non-duplicate overlap
>>> 0 DUP - different library
>>> 0 DUP - dedup not requested
>>> terminate called after throwing an instance of 'std::bad_alloc'
>>> what(): std::bad_alloc
>>>
>>> Failed with 'Aborted'
>>> ...
>>>
>>> I ran this step twice: the first time with ovlStoreMemory set to
>>> 8192 Mb, the second with it set to 160000 (160 Gb). The "Overlap
>>> store failure" FAQ mentions "Out of disk space" (not my case) and
>>> "Corrupt gzip files / too many fragments" as possible causes. I
>>> don't have gzip files and I have only 15 fragments. Also, the
>>> bucketizing step finishes OK.
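[On the "Corrupt gzip files" cause: the core of fix 1 above is simply
decompressing the inputs in place, so overlapStoreBuild reads true file
sizes from the filesystem instead of trusting gzip's stored size field,
which is 32 bits and wraps for large files. A small self-contained
demonstration of that step on a throwaway file; in a real run you would
apply `gzip -d` to the *.ovb.gz inputs in the work directory and then
delete asm.ovlStore.list before restarting:]

```shell
# Create a throwaway stand-in for an overlap input file:
workdir=$(mktemp -d)
printf 'overlap records\n' > "$workdir/example.ovb"
gzip "$workdir/example.ovb"       # example.ovb -> example.ovb.gz

# Decompress in place, as fix 1 prescribes for the real inputs:
gzip -d "$workdir"/*.ovb.gz       # example.ovb.gz -> example.ovb

cat "$workdir/example.ovb"
rm -rf "$workdir"
```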
>>>
>>> Also, something odd I've noticed (at least odd to me) is that 14 of
>>> the 15 temp files (tmp.sort.XXX) in the asm.ovlStore.BUILDING folder
>>> are 79 Gb each, while the last one is 1.2 Tb.
>>>
>>> Could anybody tell me what could be causing this error and how to
>>> solve it?
>>>
>>> I'm attaching the asm.ovlStore.err and pacBioToCA log files for
>>> complete descriptions of the error and the executed commands.
>>>
>>> Thank you very much in advance.
>>>
>>> Regards,
>>> Santiago
>>>
>>> _______________________________________________
>>> wgs-assembler-users mailing list
>>> wgs...@li...
>>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
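[On the oversized tmp.sort file described in the thread: listing the
buckets sorted by size makes the outlier obvious, which is a quick way
to confirm a single mis-sized partition rather than a global failure.
A sketch, assuming the asm.ovlStore.BUILDING path and tmp.sort.* naming
from the messages above:]

```shell
# List the sort buckets in the partially built store, largest first;
# one bucket dwarfing the rest (1.2 Tb vs 79 Gb here) points at a
# single mis-sized partition:
ls -lhS asm.ovlStore.BUILDING/tmp.sort.* 2>/dev/null | head
```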