From: Yury S. <Yu...@se...> - 2010-04-21 11:26:40
Hi, Brad! Thank you for the previous help. I have now managed to run the
RandomAccess program on 2 nodes. In fact, I ran the program directly, without
a queue system. Now my questions concern the performance of the RA example.
Please see:

1. Running the example locally gives:

> bash-3.2$ ./ra -nl 1
> WARNING: Using GASNet's udp-conduit, which exists for portability
> convenience.
> WARNING: Support was detected for native GASNet conduits: ibv
> WARNING: You should *really* use the high-performance native GASNet
> conduit
> WARNING: if communication performance is at all important in this
> program run.
> Number of Locales = 1
> Tasks per locale = 16
> Problem size = 1048576 (2**20)
> Bytes per array = 8388608
> Total memory required (GB) = 0.0078125
> Number of updates = 4194304
>
> Number of errors is: 5
>
> Validation: SUCCESS
> Execution time = 0.443489
> Performance (GUPS) = 0.00945752

2. Running the example on the 2 nodes using the udp-conduit (in fact, over
Gigabit Ethernet) gives:

> bash-3.2$ ./ra -nl 2
> WARNING: Using GASNet's udp-conduit, which exists for portability
> convenience.
> WARNING: Support was detected for native GASNet conduits: ibv
> WARNING: You should *really* use the high-performance native GASNet
> conduit
> WARNING: if communication performance is at all important in this
> program run.
> Number of Locales = 2
> Tasks per locale = 16
> Problem size = 1048576 (2**20)
> Bytes per array = 8388608
> Total memory required (GB) = 0.0078125
> Number of updates = 4194304
>
> Number of errors is: 1
>
> Validation: SUCCESS
> Execution time = 81.5801
> Performance (GUPS) = 5.14133e-05

So performance is more than 150 times worse.

3. Further, I rebuilt the runtime to use the ibv-conduit and ran again on 2
nodes.
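As a quick sanity check on the figures quoted above: GUPS appears to be simply
the number of updates divided by the execution time, in billions of updates per
second. A minimal shell sketch that recomputes the reported values (numbers
taken from the runs above; this is just arithmetic, not part of the benchmark
itself):

```shell
#!/bin/sh
# Recompute GUPS = updates / time / 1e9 from the reported RA figures.

# Single-locale run: 4194304 updates in 0.443489 s
awk 'BEGIN { printf "local GUPS:  %g\n", 4194304 / 0.443489 / 1e9 }'

# Two-locale udp-conduit run: 4194304 updates in 81.5801 s
awk 'BEGIN { printf "2-node GUPS: %g\n", 4194304 / 81.5801 / 1e9 }'

# Slowdown factor between the two runs (roughly 180x)
awk 'BEGIN { printf "slowdown:    %.0fx\n", 81.5801 / 0.443489 }'
```

Both recomputed values agree with the program's own "Performance (GUPS)"
lines, which confirms the slowdown is real and not a reporting artifact.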
In fact, the results vary from run to run due to the random nature of the
problem, but there is no acceleration at all from using InfiniBand:

> bash-3.2$ /common/mvapich/bin/mpirun_rsh -hostfile hosts -np 2
> /usr/bin/env GASNET_VERBOSEENV=1 GASNET_IB_BOOTSTRAP_MPI=1
> GASNET_IB_CONDUIT=IBV GASNET_IB_SPAWNER=MPI GASNET_SPAWNFN=S
> /home1/serdyuk/chapel-1.1/chapel/examples/hpcc/./ra_real -v -nl 2
> ENV parameter: GASNET_IB_SPAWNER = MPI
> ENV parameter: GASNET_FREEZE = NO (default)
> ENV parameter: GASNET_DISABLE_ARGDECODE = NO (default)
> ENV parameter: GASNET_BACKTRACE = NO (default)
> ENV parameter: GASNET_BACKTRACE_TYPE = GDB,EXECINFO (default)
> ENV parameter: GASNET_FREEZE_ON_ERROR = NO (default)
> ENV parameter: GASNET_FREEZE_SIGNAL = *not set* (default)
> ENV parameter: GASNET_BACKTRACE_SIGNAL = *not set* (default)
> ENV parameter: GASNET_IBV_PORTS = *empty* (default)
> ENV parameter: GASNET_QP_TIMEOUT = 18 (default)
> ENV parameter: GASNET_QP_RETRY_COUNT = 7 (default)
> ENV parameter: GASNET_NETWORKDEPTH_PP = 64 (default)
> ENV parameter: GASNET_NETWORKDEPTH_TOTAL = 0 (default)
> ENV parameter: GASNET_AM_CREDITS_PP = 32 (default)
> ENV parameter: GASNET_AM_CREDITS_TOTAL = 1024 (default)
> ENV parameter: GASNET_AM_CREDITS_SLACK = 1 (default)
> ENV parameter: GASNET_BBUF_COUNT = 1024 (default)
> ENV parameter: GASNET_NUM_QPS = 0 (default)
> ENV parameter: GASNET_INLINESEND_LIMIT = -1 B (default)
> ENV parameter: GASNET_NONBULKPUT_BOUNCE_LIMIT = 64 KB (default)
> ENV parameter: GASNET_PACKEDLONG_LIMIT = 4016 B (default)
> ENV parameter: GASNET_AMRDMA_MAX_PEERS = 32 (default)
> ENV parameter: GASNET_AMRDMA_LIMIT = 4084 B (default)
> ENV parameter: GASNET_AMRDMA_DEPTH = 16 (default)
> ENV parameter: GASNET_AMRDMA_CYCLE = 1024 (default)
> ENV parameter: GASNET_PUTINMOVE_LIMIT = 3 KB (default)
> ENV parameter: GASNET_RCV_THREAD = NO (default)
> ENV parameter: GASNET_USE_FIREHOSE = YES (default)
> ENV parameter: GASNET_EXITTIMEOUT_MAX = 360 (default)
> ENV parameter: GASNET_EXITTIMEOUT_MIN = 2 (default)
> ENV parameter: GASNET_EXITTIMEOUT_FACTOR = 0.25 (default)
> ENV parameter: GASNET_EXITTIMEOUT = 2.5 (default)
> ENV parameter: GASNET_PHYSMEM_MAX = 0 (default)
> ENV parameter: GASNET_COLL_MIN_SCRATCH_SIZE = 1 KB (default)
> ENV parameter: GASNET_COLL_SCRATCH_SIZE = 2 MB (default)
> ENV parameter: GASNET_FIREHOSE_VERBOSE = NO (default)
> ENV parameter: GASNET_FIREHOSE_M = 3 GB (default)
> ENV parameter: GASNET_FIREHOSE_MAXVICTIM_M = 1023 MB (default)
> ENV parameter: GASNET_FIREHOSE_R = 24574 (default)
> ENV parameter: GASNET_FIREHOSE_MAXVICTIM_R = 8191 (default)
> ENV parameter: GASNET_FIREHOSE_MAXREGION_SIZE = 128 KB (default)
> ENV parameter: GASNET_DISABLE_MUNMAP = NO (default)
> ENV parameter: GASNET_BARRIER = AMDISSEM (default)
> ENV parameter: GASNET_VIS_AMPIPE = NO (default)
> ENV parameter: GASNET_VIS_MAXCHUNK = 4008 B (default)
> ENV parameter: GASNET_VIS_REMOTECONTIG = NO (default)
> executing on locale 0 of 2 locale(s): tesla1.eth0.mvs50k.jscc.ru
> executing on locale 1 of 2 locale(s): tesla2.eth0.mvs50k.jscc.ru
> Number of Locales = 2
> Tasks per locale = 16
> Problem size = 1048576 (2**20)
> Bytes per array = 8388608
> Total memory required (GB) = 0.0078125
> Number of updates = 4194304
>
> Number of errors is: 0
>
> Validation: SUCCESS
> Execution time = 97.1364
> Performance (GUPS) = 4.31795e-05

Moreover, although the number of tasks per locale is 16, I nevertheless see
the following processor workload:

> top - 15:02:11 up 34 days, 21:04, 1 user, load average: 10.53, 7.80, 5.83
> Tasks: 277 total, 2 running, 275 sleeping, 0 stopped, 0 zombie
> Cpu0  : 63.0%us, 37.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu1  :  0.3%us,  0.0%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu2  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu3  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu4  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu5  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu6  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu7  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu8  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu9  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu10 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu11 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu12 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu13 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu14 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu15 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem:  49449700k total,  836884k used, 48612816k free,  164680k buffers
> Swap:  2096472k total,       0k used,  2096472k free,  475464k cached
>
>   PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
> 31652 serdyuk  25  0 14.6g  17m 2712 R 99.9  0.0  1:02.80 ra_real

So the question is: why doesn't using InfiniBand increase the performance of
the test, and, correspondingly, how do I get a performance increase on
multiple nodes? Thanks. Yury.

> Hi Yury --
>
> One thing that would be helpful to me would be to know what sort of
> mpirun command you use when executing traditional MPI jobs on this
> system, and what sort of output you expect to get from the job
> queueing/launching software.
>
>> 1. First, following the documentation, I added the MPIRUN_CMD variable,
>> which reflects our launching environment:
>>
>>> bash-3.2$ export MPIRUN_CMD='mpirun -host cuda.jscc.ru -np %N -maxtime
>>> 10 -machinefile hosts %C'
>
> One thing I'm wondering about is the role of the '-machinefile hosts'
> flag. Does the queueing system create this hosts file for you?
> If not, that seems like it could be problematic, in that it should contain
> the names of the hosts returned by the queueing process.
>
>> 2. Now, the job starts successfully through our queue system:
>>
>>> bash-3.2$ ./ra -v -nl 2
>>> /home1/serdyuk/chapel-1.1/chapel/runtime/src/launch/gasnetrun_ibv/../../../../third-party/gasnet/install/linux64-gnu/seg-everything/nodbg/bin/gasnetrun_ibv
>>> -n 2 ./ra_real '-v' '-nl' '2'
>>> Trying cuda.jscc.ru 83.149.218.223
>>> Count of cpu is 16
>
> ^^^ is this "Count of cpu" message correct given that you're trying to
> run a 2-locale (node) job?
>
>>> Task "env.1" queued successfully
>>>
>>> Running task "env.1" on following nodes:
>>> tesla1 tesla2
>>> Task "env.1" started successfully, pid of manager is 9286
>>
>> But then our queue system gives some weird error.
>
> What does the error look like?
>
>> 3. So I started the job directly, with the -v switch for gasnetrun_ibv:
>>
>>> bash-3.2$
>>> /home1/serdyuk/chapel-1.1/chapel/runtime/src/launch/gasnetrun_ibv/../../../../third-party/gasnet/install/linux64-gnu/seg-everything/nodbg/bin/gasnetrun_ibv
>>> -v -n 2 ./ra_real '-v' '-nl' '2'
>>> gasnetrun: located executable
>>> '/home1/serdyuk/chapel-1.1/chapel/examples/hpcc/./ra_real'
>>> gasnetrun: forwarding to mpi-based spawner
>>> gasnetrun: identified MPI spawner as: unknown program (using generic
>>> MPI spawner)
>>> gasnetrun: located executable
>>> '/home1/serdyuk/chapel-1.1/chapel/examples/hpcc/./ra_real'
>>> envargs: /usr/bin/env GASNET_VERBOSEENV=1 GASNET_IB_BOOTSTRAP_MPI=1
>>> GASNET_IB_CONDUIT=IBV GASNET_IB_SPAWNER=MPI GASNET_SPAWNFN=S
>>> gasnetrun: running: mpirun -host cuda.jscc.ru -np 2 -maxtime 10
>>> -machinefile hosts /usr/bin/env GASNET_VERBOSEENV=1
>>> GASNET_IB_BOOTSTRAP_MPI=1 GASNET_IB_CONDUIT=IBV GASNET_IB_SPAWNER=MPI
>>> GASNET_SPAWNFN=S
>>> /home1/serdyuk/chapel-1.1/chapel/examples/hpcc/./ra_real -v -nl 2
>>> Trying cuda.jscc.ru 83.149.218.223
>>> Count of cpu is 16
>>> Task "env.1" queued successfully
>>>
>>> Running task "env.1" on following nodes:
>>> tesla1 tesla2
>>> Task "env.1" started successfully, pid of manager is 11693
>>
>> As I see, the main command is:
>>
>>> mpirun -host cuda.jscc.ru -np 2 -maxtime 10 -machinefile hosts
>>> /usr/bin/env GASNET_VERBOSEENV=1 GASNET_IB_BOOTSTRAP_MPI=1
>>> GASNET_IB_CONDUIT=IBV GASNET_IB_SPAWNER=MPI GASNET_SPAWNFN=S
>>> /home1/serdyuk/chapel-1.1/chapel/examples/hpcc/./ra_real -v -nl 2
>>
>> Is this a correct command at all?
>
> This seems as though it may well be correct. Are you balking
> at everything between the /usr/bin/env and the invocation of ra_real?
>
> For example, using mpich on my system I can do:
>
>   mpirun -np 1 ./foo
>
> or:
>
>   mpirun -np 1 /usr/bin/env ./foo
>
> or:
>
>   mpirun -np 1 /usr/bin/env FOO_VAR=foo ./foo
>
> As I understand it, the /usr/bin/env command is intended to pass
> environment variables to the executable being run (in this case,
> ra_real). I think the following note from the mpi-conduit/README
> pertains to this:
>
>> Known problems:
>> ---------------
>>
>> * Proper operation of GASNet and its client often depends on
>>   environment variables passed to all the processes. Unfortunately,
>>   there is no uniform way to achieve this across different
>>   implementations of mpirun. The gasnetrun_mpi script tries many
>>   different things and is known to work correctly for LAM/MPI and for
>>   MPICH and most of its derivatives (including MPICH-NT, MVICH and
>>   MVAPICH). For assistance with any particular MPI implementation,
>>   search the Berkeley UPC Bugzilla server and open a new bug
>>   (registration required) if you cannot find information on your MPI.
>
> Are you using a standard mpirun, or is it some sort of mpirun wrapper
> script developed specifically for your system? If the latter, I could
> imagine that you may need to come up with some other way to pass
> environment variables through to the child processes...
>
> -Brad
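The /usr/bin/env mechanism Brad describes can be checked locally, without
mpirun at all: env sets the given variables in the environment of the command
it executes, so the launched program sees them. A minimal sketch (FOO_VAR and
the `sh -c` child are placeholders standing in for a GASNET_* variable and
ra_real, respectively):

```shell
#!/bin/sh
# `/usr/bin/env VAR=value cmd` runs cmd with VAR set in its environment.
# This is the same pattern gasnetrun_mpi uses to forward GASNET_* settings
# through mpirun to each spawned process.

# Stand-in for the real executable: a shell that echoes what it received.
/usr/bin/env FOO_VAR=foo sh -c 'echo "child sees FOO_VAR=$FOO_VAR"'

# With a cooperating mpirun, the same pattern would be (not run here):
#   mpirun -np 2 /usr/bin/env GASNET_VERBOSEENV=1 ./ra_real -nl 2
```

If a site-specific mpirun wrapper strips or rewrites the command line, the
env-prefixed variables may never reach the child processes, which is exactly
the failure mode the mpi-conduit README warns about.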