From: Audrey N. <aud...@ho...> - 2012-10-22 18:28:59
Hi wgs-users,

I’m trying to generate a hybrid assembly from Illumina and 454 reads for a genome with an estimated size of 520 Mb. The Illumina data are 108 bp paired-end reads; the 454 data are shotgun reads and 6 kb paired-end reads. We already ran runCA 7.0 on a subset of these same sequences (including the whole Illumina dataset) and obtained a complete assembly with an N50 consistent with the partial dataset used.

The current “big” assembly uses:

454 shotgun, short and long reads: 5 000 000 000 bases
454 PE: 500 000 000 bases
Illumina PE: 10 000 000 000 bases

giving a total of 16 603 061 677 bases in 81 460 599 reads.

I understand that ovlHashBits, ovlHashBlockLength and ovlRefBlockSize are critical settings; they are set as shown in the specfile below. This runCA started on 4 October on a Silicon Graphics UV 100 with 1 TB RAM and 64 CPUs, and it is still running (on 22 October). A total of 14 overlap jobs were created. Eight jobs finished in almost 2 days, 2 more completed on 14 and 17 October, and the remaining 4 are still running.

Do you think this is an acceptable run time? This is only the 0-overlaptrim step, and I know there is still much to do. What parameters would you suggest for such inputs? Are there any other important settings to consider to optimize the process? I have already tried changing some parameter settings, but I hesitate to abort this run without knowing exactly what to change to improve the run time.

You will find my specfile below, followed by the output I have obtained to date.

Thanks,
Audrey Nisole
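As a quick sanity check on the figures quoted above, here is a minimal Python sketch that derives the approximate coverage and mean read length from the stated totals. The function and constant names are made up for illustration. The per-library figures sum to less than the reported total, presumably because they are rounded in the message; that is an assumption, not something the message states.

```python
# Figures quoted in the message; all names here are illustrative only.
GENOME_SIZE_BP = 520_000_000     # estimated genome size (520 Mb)
TOTAL_BASES = 16_603_061_677     # total bases reported for the run
TOTAL_READS = 81_460_599         # total reads reported for the run

# Per-library figures as quoted (assumed to be rounded values).
LIBRARY_BASES = {
    "454 shotgun": 5_000_000_000,
    "454 PE": 500_000_000,
    "Illumina PE": 10_000_000_000,
}


def coverage(total_bases: int, genome_size: int) -> float:
    """Average depth of coverage: total bases divided by genome size."""
    return total_bases / genome_size


def mean_read_length(total_bases: int, total_reads: int) -> float:
    """Average read length in bases."""
    return total_bases / total_reads


if __name__ == "__main__":
    print(f"coverage         : {coverage(TOTAL_BASES, GENOME_SIZE_BP):.1f}x")
    print(f"mean read length : {mean_read_length(TOTAL_BASES, TOTAL_READS):.1f} bp")
    print(f"sum of libraries : {sum(LIBRARY_BASES.values()):,} bases (as quoted)")
```

With these inputs the dataset works out to a little under 32x coverage of the estimated genome.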
#____________________________________________________________
#20121004
# Spec file
# Sequences from 454 Titanium technology and Illumina
#_____________________________________________________________

# ERROR Rates
utgErrorRate = 0.03
utgErrorLimit = 2.5
ovlErrorRate = 0.06
cnsErrorRate = 0.06
cgwErrorRate = 0.10

# Minimum Fragment Length and Minimum Overlap Length
frgMinLen = 64
ovlMinLen = 40

# OVERLAPPER
overlapper = ovl
obtOverlapper = ovl
ovlOverlapper = ovl
ovlStoreMemory = 100000
saveOverlaps = 1
merSize = 22

# OVL Overlapper
ovlThreads = 6
ovlConcurrency = 9
ovlHashBits = 28
ovlHashBlockLength = 1200000000
ovlRefBlockSize = 600000000

# MERYL calculates K-mer seeds
merylMemory = 200000
merylThreads = 24

# ERROR CORRECTION applied to overlaps
frgCorrBatchSize = 200000
frgCorrThreads = 6
frgCorrConcurrency = 9

# UNITIGGER
unitigger = bog
#utgGenomeSize = 520

# SCAFFOLDER
computeInsertSize = 0

# CONSENSUS
cnsConcurrency = 2

# Terminator
closureOverlaps = 0
closurePlacement = 2
createACE = 0

#----------------------------------------------------
# FRG files
#----------------------------------------------------
#454 Shotguns, 22 sff files
/project/…/frg/454shotgun.frg
#Paired-ends, 4 sff files + 2 fastq files
/project/…/frg/454Pairend.frg
/project/…/sequences/frg/S1.frg

# I got this output:

runCA -d Budworm_wgs_20121004 -p budworm_20121004 -s specfile_Budworm_20121004

----------------------------------------START Thu Oct 4 11:13:39 2012
/prg/wgs/7.0/Linux-amd64/bin/gatekeeper -o /…/wgs-assembly/Budworm_wgs_20121004/budworm_20121004.gkpStore.BUILDING -T -F /project/…/frg/Budworm_shotgun.frg /project/…/frg/Budworm_Pairend.frg /project/…/frg/S1.frg > /…/wgs-assembly/Budworm_wgs_20121004/budworm_20121004.gkpStore.err 2>&1
----------------------------------------END Thu Oct 4 11:35:38 2012 (1319 seconds)
numFrags = 81460599
----------------------------------------START Thu Oct 4 11:35:40 2012
/prg/wgs/7.0/Linux-amd64/bin/meryl -B -C -v -m 22 -memory 200000 -threads 24 -c 0 -L 2 -s /…/wgs-assembly/Budworm_wgs_20121004/budworm_20121004.gkpStore:chain -o /…/wgs-assembly/Budworm_wgs_20121004/0-mercounts/budworm_20121004-C-ms22-cm0 > /…/wgs-assembly/Budworm_wgs_20121004/0-mercounts/meryl.err 2>&1
----------------------------------------END Thu Oct 4 13:10:59 2012 (5719 seconds)
----------------------------------------START Thu Oct 4 13:10:59 2012
/prg/wgs/7.0/Linux-amd64/bin/estimate-mer-threshold -g /…/wgs-assembly/Budworm_wgs_20121004/budworm_20121004.gkpStore:chain -m /…/wgs-assembly/Budworm_wgs_20121004/0-mercounts/budworm_20121004-C-ms22-cm0 > /.../wgs-assembly/Budworm_wgs_20121004/0-mercounts/budworm_20121004-C-ms22-cm0.estMerThresh.out 2> /.../wgs-assembly/Budworm_wgs_20121004/0-mercounts/budworm_20121004-C-ms22-cm0.estMerThresh.err
----------------------------------------END Thu Oct 4 13:10:59 2012 (0 seconds)
----------------------------------------START Thu Oct 4 13:10:59 2012
/prg/wgs/7.0/Linux-amd64/bin/meryl -Dt -n 91 -s /.../wgs-assembly/Budworm_wgs_20121004/0-mercounts/budworm_20121004-C-ms22-cm0 > /.../wgs-assembly/Budworm_wgs_20121004/0-mercounts/budworm_20121004.nmers.ovl.fasta 2> /.../wgs-assembly/Budworm_wgs_20121004/0-mercounts/budworm_20121004.nmers.ovl.fasta.err
----------------------------------------END Thu Oct 4 13:12:04 2012 (65 seconds)
----------------------------------------START Thu Oct 4 13:12:04 2012
/prg/wgs/7.0/Linux-amd64/bin/meryl -Dt -n 91 -s /.../wgs-assembly/Budworm_wgs_20121004/0-mercounts/budworm_20121004-C-ms22-cm0 > /.../wgs-assembly/Budworm_wgs_20121004/0-mercounts/budworm_20121004.nmers.obt.fasta 2> /.../wgs-assembly/Budworm_wgs_20121004/0-mercounts/budworm_20121004.nmers.obt.fasta.err
----------------------------------------END Thu Oct 4 13:13:10 2012 (66 seconds)
Reset OBT mer threshold from auto to 91.
Reset OVL mer threshold from auto to 91.
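For anyone who wants to time the stages from a log like this, here is a small Python sketch that recomputes the elapsed seconds from a pair of START/END timestamps. The helper name `elapsed_seconds` is hypothetical, and the timestamp format is assumed from the lines shown in this log (English day and month abbreviations).

```python
from datetime import datetime

# Timestamp layout as it appears in the log, e.g. "Thu Oct 4 11:13:39 2012".
# Assumes an English (C) locale for the day and month abbreviations.
LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def elapsed_seconds(start: str, end: str) -> int:
    """Return whole seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return int((t1 - t0).total_seconds())


if __name__ == "__main__":
    # gatekeeper stage, timestamps copied from the log
    print(elapsed_seconds("Thu Oct 4 11:13:39 2012", "Thu Oct 4 11:35:38 2012"))
    # first meryl stage, timestamps copied from the log
    print(elapsed_seconds("Thu Oct 4 11:35:40 2012", "Thu Oct 4 13:10:59 2012"))
```

The two values agree with the "(1319 seconds)" and "(5719 seconds)" figures that runCA prints for those stages.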
----------------------------------------START CONCURRENT Thu Oct 4 13:13:10 2012
/.../wgs-assembly/Budworm_wgs_20121004/0-mertrim/mertrim.sh 1 > /.../wgs-assembly/Budworm_wgs_20121004/0-mertrim/budworm_20121004.0001.err 2>&1
(…)
HASH 13788239- 16725048 REFR 1- 81460599 STRINGS 2936810 BASES 1200000016
HASH 16725049- 27637181 REFR 1- 81460599 STRINGS 10912133 BASES 1200000017
HASH 27637182- 38807450 REFR 1- 81460599 STRINGS 11170269 BASES 1200000061
HASH 38807451- 50000808 REFR 1- 81460599 STRINGS 11193358 BASES 1200000028
HASH 50000809- 61105478 REFR 1- 81460599 STRINGS 11104670 BASES 1200000077
HASH 61105479- 72196104 REFR 1- 81460599 STRINGS 11090626 BASES 1200000008
HASH 72196105- 81460599 REFR 1- 81460599 STRINGS 9264495 BASES 1003058249
----------------------------------------END Fri Oct 5 17:59:40 2012 (49 seconds)
Created 14 overlap jobs. Last batch '001', last job '000014'.
----------------------------------------START CONCURRENT Fri Oct 5 17:59:40 2012
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 1 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000001.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 2 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000002.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 3 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000003.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 4 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000004.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 5 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000005.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 6 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000006.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 7 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000007.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 8 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000008.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 9 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000009.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 10 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000010.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 11 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000011.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 12 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000012.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 13 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000013.out 2>&1
/.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/overlap.sh 14 > /.../wgs-assembly/Budworm_wgs_20121004/0-overlaptrim-overlap/000014.out 2>&1
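The job count in the log can be reproduced with some simple arithmetic. The sketch below is a rough model only, not runCA's actual partitioning code. It assumes that each HASH block holds about ovlHashBlockLength bases (which matches the BASES column above) and that the whole read set fits in a single REFR block (as the "REFR 1- 81460599" ranges suggest). It further assumes that up to ovlConcurrency jobs run at once, each using ovlThreads threads. All function names are hypothetical.

```python
# Values taken from the message and the specfile.
TOTAL_BASES = 16_603_061_677
OVL_HASH_BLOCK_LENGTH = 1_200_000_000
OVL_THREADS = 6
OVL_CONCURRENCY = 9
CPUS = 64


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def estimated_overlap_jobs(total_bases: int, hash_block_length: int) -> int:
    """Approximate job count, assuming one REFR block covers every read."""
    return ceil_div(total_bases, hash_block_length)


def job_waves(jobs: int, concurrency: int) -> int:
    """Number of batches needed if at most `concurrency` jobs run at once."""
    return ceil_div(jobs, concurrency)


def peak_threads(jobs: int, concurrency: int, threads_per_job: int) -> int:
    """Threads in use while the largest batch of jobs is running."""
    return min(jobs, concurrency) * threads_per_job


if __name__ == "__main__":
    jobs = estimated_overlap_jobs(TOTAL_BASES, OVL_HASH_BLOCK_LENGTH)
    print("estimated overlap jobs:", jobs)
    print("job waves             :", job_waves(jobs, OVL_CONCURRENCY))
    print("peak threads          :", peak_threads(jobs, OVL_CONCURRENCY, OVL_THREADS),
          "of", CPUS, "CPUs")
```

Under these assumptions the estimate matches the "Created 14 overlap jobs" line. The same model suggests the jobs would run in two waves (9 jobs, then 5), with at most 54 of the 64 CPUs busy.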