From: Langhorst, B. <Lan...@ne...> - 2015-05-09 01:34:37
Hi:

I'm trying to build unitigs for a ~3 Gbase organism. I'm about 1/6 of the way through the 0-overlaptrim-overlap step and I'm already using about 10 TB of disk, so I had to pause the job since that's all I have... I can't get 60 TB of disk.

I have ~1.5B stitched Illumina reads of ~50-280 bp (I used PEAR to stitch them, since wgs could not handle the R1+R2 fragments due to an integer overflow). I figure that's about 120X coverage, or 60X per allele. Too much? Should I toss half of the shorter reads? How much would that help with disk space?

I'm using the ovl overlapper... will the mer overlapper use less disk? Is there another setting I can change to use less disk?

Brad

spec file:

merylMemory = 128000
merylThreads = 16

mbtThreads = 8

ovlStoreMemory=8192
#ovlStoreMemory=10000

useGrid = 1
scriptOnGrid = 0

ovlHashBits=25
# with this setting I observe about 2 runs at full capacity and one at 20 -
# can't use a smaller number because the job count is too large for SGE
#ovlHashBlockLength=1800000000
ovlHashBlockLength=2880000000  # should be about 400%
ovlThreads=8

#ovlOverlapper=mer
#merOverlapperThreads=8
#merOverlapperSeedBatchSize=1000000

frgCorrBatchSize = 10000000
frgCorrThreads = 8

unitigger=bogart
batThreads=8

stopAfter=unitigger

sge=-p -100 -pe smp 8
frgCorrOnGrid=1
ovlCorrOnGrid=1

--
Brad Langhorst, Ph.D.
Applications and Product Development Scientist
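
[Editor's note: a rough back-of-envelope check of the coverage and disk figures quoted above. The 240 bp mean stitched read length is an assumption chosen to be consistent with the ~120X figure, not a value stated in the post; the linear disk projection is likewise only a sketch.]

    # Back-of-envelope sketch of the numbers in the message above.
    # Assumed values (not from the post): mean stitched read length of 240 bp,
    # and linear growth of overlap disk use with job progress.

    genome_size_bp   = 3e9      # ~3 Gbase organism
    n_reads          = 1.5e9    # ~1.5B stitched reads
    mean_read_len_bp = 240      # assumed mean of the 50-280 bp stitched reads

    coverage = n_reads * mean_read_len_bp / genome_size_bp
    print(f"estimated coverage: ~{coverage:.0f}X (~{coverage / 2:.0f}X per allele)")

    disk_used_tb  = 10          # TB used when the job was paused
    fraction_done = 1 / 6       # ~1/6 of the 0-overlaptrim-overlap jobs finished
    projected_tb  = disk_used_tb / fraction_done
    print(f"projected overlap disk at completion: ~{projected_tb:.0f} TB")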