Re: [Denovoassembler-devel] [CCS #177295] MPICH on titan uses a lot of memory (?)
From: Sébastien B. <seb...@ul...> - 2013-10-21 21:18:27
On 21/10/13 05:06 PM, Fernanda Foertter via RT wrote:
>
> Thanks Sebastien,
>
> My gut tells me you're running out of memory per core. Hugepage is busting and
> the max size is 2GB.

Ray does not use hugepages. As I understand it, they won't be used with
standard malloc or C++'s new. You need to mmap with MAP_HUGETLB to use them,
which I don't do in Ray. So AFAIK, hugepages are not involved here, at least
not in Ray (MPICH2 seems to use plenty of them, though).

Yes, I also think that this is the case. Is there a symmetric hard limit on
the memory per core/rank (32 GiB / 16 = 2 GiB)?

But, I mean, Cached is at 22 GiB, and it is possibly 100% reclaimable via VM
pressure. That's 1.3 GiB per core.

> MPIU_nem_gni_get_hugepages(): large page stats: free 0 nr 211 nr_overcommit
> 16154 resv 0 surplus 211
>

Sure. But that seems to be related to the meminfo Cached entry, which should
get purged, since free pagecache, dentries, and inodes are not required at
this point by Ray (all the data is on the heap). Linux usually purges these
with vfs_cache_pressure.

> The network is just the one to complain about it, but not necessarily the
> cause.
>
> Have you tried lowering the number of MPI processes to 8/node?

I have 2 jobs in the queue:

- one with Ray -debug ...
- one with -N 2504 instead of -N 5008 (using 313 nodes).

They will probably start in 1 or 2 weeks since our allocation for LSC005 is
super small (250 k core-hours for 2013, and 750 k for next year up to the
summer).

>
> FF
>
> On Mon Oct 21 16:47:01 2013, seb...@ul... wrote:
>> On 21/10/13 04:38 PM, Fernanda Foertter via RT wrote:
>>>
>>> Hi Sebastien,
>>
>> Hello FF,
>>
>>>
>>> Could you send me the list of modules you have loaded for the job to
>>> run? ( cmd: module list )
>>
>> To compile Ray (C++ / MPI), I used
>>
>> module purge
>> module load PrgEnv-intel/4.1.40
>> module load cray-mpich2/5.6.3
>> module load git/1.8.2.1
>>
>> My build script is
>> https://github.com/sebhtml/Ray-on-Cray/blob/master/titan.ccs.ornl.gov/Build-on-Titan.sh
>>
>> I am not loading any module in my job script.
>> It seems that aprun is already in my PATH. Anyway, I don't know if aprun is
>> forked from bash (which implies loading all the default modules).
>>
>> And my executable is statically linked:
>>
>> titan> file ./software/lsc005/Ray/616d2a26cc1e39f59325a0e632af46262edaa12c-1/Ray
>> ./software/lsc005/Ray/616d2a26cc1e39f59325a0e632af46262edaa12c-1/Ray:
>> ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), for
>> GNU/Linux 2.6.4, statically linked, not stripped
>>
>> I personally found that the sheer amount of modules that are loaded by
>> default is very high. Hence the module purge that I did.
>>
>> Default modules loaded:
>>
>> titan> module list
>> Currently Loaded Modulefiles:
>>   1) modules/3.2.6.6                       13) csa/3.0.0-1_2.0401.37452.4.50.gem        25) xe-sysroot/4.1.40
>>   2) craype-network-gemini                 14) dvs/1.8.6_0.9.0-1.0401.1401.1.120        26) atp/1.6.1
>>   3) xt-asyncpe/5.17                       15) rca/1.0.0-2.0401.38656.2.2.gem           27) PrgEnv-pgi/4.1.40
>>   4) pgi/12.10.0                           16) audit/1.0.0-1.0401.37969.2.32.gem        28) cray-mpich2/5.6.3
>>   5) xt-libsci/12.0.00                     17) ccm/2.2.0-1.0401.37254.2.142             29) xtpe-interlagos
>>   6) udreg/2.3.2-1.0401.5929.3.3.gem       18) configuration/1.0-1.0401.35391.1.2.gem   30) eswrap/1.0.15
>>   7) ugni/4.0-1.0401.5928.9.5.gem          19) hosts/1.0-1.0401.35364.1.115.gem         31) torque/4.2.5-snap.201308291703
>>   8) pmi/4.0.1-1.0000.9421.73.3.gem        20) lbcd/2.1-1.0401.35360.1.2.gem            32) moab/7.1.3
>>   9) dmapp/3.2.1-1.0401.5983.4.5.gem       21) nodehealth/5.0-1.0401.38460.12.18.gem    33) lustredu/1.2
>>  10) gni-headers/2.1-1.0401.5675.4.4.gem   22) pdsh/2.26-1.0401.37449.1.1.gem           34) DefApps
>>  11) xpmem/0.1-2.0401.36790.4.3.gem        23) shared-root/1.0-1.0401.37253.3.50.gem    35) altd/1.0
>>  12) job/1.5.5-0.1_2.0401.35380.1.10.gem   24) switch/1.0-1.0401.36779.2.72.gem
>>
>> And these are loaded elsewhere, not in my bashrc:
>>
>> titan> echo $SHELL
>> /bin/bash
>> titan> cat ~/.bashrc
>>
>> PS1="titan> "
>>
>> Thank you for your time and concern.
>>
>>>
>>> FF
>>>
>>> On Mon Oct 21 15:06:51 2013, seb...@ul... wrote:
>>>> On 21/10/13 02:48 PM, Fernanda Foertter via RT wrote:
>>>>>
>>>>> Ah, thank you for clarifying. I'm not even sure what would happen if
>>>>> those two are called. Probably a singularity in the universe... or something.
>>>>
>>>> Haha.
>>>>
>>>> I suppose it would work, but each nested mpiexec would probably be
>>>> spawning within its confined compute node.
>>>>
>>>>>
>>>>> I have some folks looking into this. When you get the debug run
>>>>> complete, please let me know what happens.
>>>>
>>>> I will.
>>>>
>>>> The -debug option basically collects all RayPlatform events for every
>>>> period of 1 second (1000 ms). It also reports the value of various
>>>> memory indicators (VmData, VmStack, VmRSS, and so on).
>>>>
>>>>>
>>>>> FF
>>>>>
>>>>
>>>
>>
>
> --
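
For anyone following along: below is a minimal sketch of what explicitly
requesting hugepages looks like on Linux. This is not code from Ray; as noted
above, Ray only uses standard malloc and C++'s new, which never draw from the
hugepage pool. The 2 MiB page size is an assumed default.

// Minimal sketch, not Ray code: explicit hugepage allocation on Linux.
// malloc()/new never use the hugepage pool; mmap() with MAP_HUGETLB is needed.
#include <sys/mman.h>
#include <cstdio>
#include <cstring>

int main() {
    const size_t length = 2UL * 1024 * 1024; // assume the default 2 MiB hugepage size

    void* buffer = mmap(NULL, length,
                        PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
                        -1, 0);

    if (buffer == MAP_FAILED) {
        // Fails with ENOMEM when the hugepage pool is empty,
        // as in the "free 0 nr 211" stats quoted above.
        perror("mmap(MAP_HUGETLB)");
        return 1;
    }

    memset(buffer, 0, length); // touch the memory so it is actually backed
    munmap(buffer, length);
    return 0;
}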
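
And on the -debug option mentioned above: per-process indicators such as
VmData, VmStack, and VmRSS come from /proc/self/status on Linux. Here is a
rough sketch of reading them (only an illustration of the general approach,
not RayPlatform's actual implementation).

// Rough sketch, not RayPlatform code: reading memory indicators
// (VmData, VmStack, VmRSS) from /proc/self/status on Linux.
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream status("/proc/self/status");
    std::string line;

    // Lines of interest look like "VmRSS:      123456 kB".
    while (std::getline(status, line)) {
        if (line.compare(0, 7, "VmData:") == 0 ||
            line.compare(0, 8, "VmStack:") == 0 ||
            line.compare(0, 6, "VmRSS:") == 0) {
            std::cout << line << std::endl;
        }
    }
    return 0;
}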