[Denovoassembler-devel] Ray on Titan
Ray -- Parallel genome assemblies for parallel DNA sequencing
From: Sébastien B. <seb...@ul...> - 2013-10-28 18:23:22
Hi Jacques,

Good news! It worked (but did not finish as expected).

Compute job: HiSeq-2500-NA12878-demo-2x150-8
- I used 8 ranks instead of 16 per node and it worked.

Reads were loaded, the graph was built and the extensions generated in under 4 hours.
This was with 2504 ranks. The time was capped to 12 hours because the job was not large enough.

1.1 billion input sequences, paired
The graph had 5.6 billion vertices.

The merging did not complete, however. There is a ticket open on Ray's GitHub page for improving the merging code.
https://github.com/sebhtml/ray/issues/82

Compute job: HiSeq-2500-NA12878-demo-2x150-9
- I will use 8 ranks per node, and 626 nodes instead of 313.

If you are wondering where I am pulling these numbers from (313), they come from Titan's scheduling policy:
https://www.olcf.ornl.gov/kb_articles/titan-scheduling-policy/

So that's 5008 ranks. This will run for at most 12 hours.

Accounting: 626*30*12 = 225360
(626 nodes, 30 core-hours are charged for every node-hour, duration: 12 hours)

titan> cat HiSeq-2500-NA12878-demo-2x150-9.sh
#PBS -N HiSeq-2500-NA12878-demo-2x150-9
#PBS -l walltime=00:12:00:00
#PBS -l nodes=626
#PBS -A LSC005
#PBS -l gres=widow1

cd $PBS_O_WORKDIR

# 626 * 8 = 5008
aprun -n 5008 \
./software/lsc005/Ray/616d2a26cc1e39f59325a0e632af46262edaa12c-1/Ray \
-k 31 \
-detect-sequence-files HiSeq-2500-NA12878-demo-2x150 \
-o HiSeq-2500-NA12878-demo-2x150-9 \

titan> qsub HiSeq-2500-NA12878-demo-2x150-9.sh
1769459

see https://github.com/sebhtml/ray/issues/197
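
For anyone who wants to double-check the rank counts and the accounting figure quoted in this message, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration and are not part of Ray or of any Titan tooling; the only inputs are the numbers stated above (8 ranks per node, 313 or 626 nodes, 30 core-hours charged per node-hour, 12 hours of walltime).

```python
# Sketch only: reproduces the arithmetic quoted in the message.
# total_ranks and charged_core_hours are hypothetical helper names.

def total_ranks(nodes: int, ranks_per_node: int) -> int:
    """Number of MPI ranks to pass to `aprun -n`."""
    return nodes * ranks_per_node


def charged_core_hours(nodes: int, walltime_hours: int,
                       core_hours_per_node_hour: int = 30) -> int:
    """Core-hours charged, using the 30 core-hours per node-hour
    rate stated in the message."""
    return nodes * core_hours_per_node_hour * walltime_hours


if __name__ == "__main__":
    # Job ...-8: 313 nodes at 8 ranks per node
    print("ranks (313 nodes):", total_ranks(313, 8))
    # Job ...-9: 626 nodes at 8 ranks per node
    print("ranks (626 nodes):", total_ranks(626, 8))
    # Worst-case charge for job ...-9 if it uses the full 12 hours
    print("core-hours (626 nodes, 12 h):", charged_core_hours(626, 12))
```

The charge computed this way is an upper bound for the job, since it assumes the full 12-hour walltime is consumed.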