Re: [Apbs-users] lack of speedup for apbs on AIX 5L using MPI
From: Robert K. <ro...@uc...> - 2004-03-31 20:27:05
Hi Lawrence,

Just to add to Nathan's informative email: how did you modify the
actin-dimer input files to study this scaling? Changing the pdime value
alone won't give you any speedup, since every processor will still be
solving a problem of the same grid size (as defined by dime). Increasing
pdime while keeping the other parameters constant lowers the effective
grid resolution of the system, but you won't observe any speedup; you may
even see a (very) slight increase in wall-clock time due to the added
communication.

To study strong scaling, one has to increase pdime and decrease dime
simultaneously, so that the total grid size of the system stays roughly
constant (see the input-file sketch at the end of this message). So I
guess that's where APBS differs from most other MPI/parallel programs,
where simply throwing more processors at the system gets you the answer
faster. But, as Nathan mentioned in his email, APBS was designed to
address a different class of computational problems.

regards,
robert

On Wed, Mar 31, 2004 at 11:07:50AM -0600, Lawrence Hannon wrote:
> I'm an IBMer working with Celera on code optimization. They've asked me
> to take a look at apbs/MPI. They have not been big MPI users up to now.
>
> I built the system with the following:
>
>   CC=mpcc_r
>   F77=mpxlf_r
>   CFLAGS="-O3 -qstrict -qarch=pwr3 -qtune=pwr3 -qcache=auto -qmaxmem"
>   FFLAGS="-qfixed=132 -O3 -qstrict -qarch=pwr3 -qtune=pwr3 -qcache=auto
>           -qmaxmem"
>   LDFLAGS="-bmaxdata:0x80000000 -bmaxstack:0x10000000 -L/usr/local/lib
>            -lmass -lessl"
>
> For maloc:
>
>   configure --prefix <install directory> --enable_mpi --enable_blas=no
>   gmake install
>
> For apbs-0.2.6:
>
>   configure --prefix <install directory> --with_blas="-L/usr/lib -lblas"
>   gmake install
>
> It seems to have hooked into IBM's MPI (POE), because it complains if
> MP_PROCS or other MPI-related environment variables are set incorrectly.
> I ran it using the apbs-PARALLEL.in input file in examples/actin-dimer
> (modified to run with 1, 2, 4, and 8 CPUs). I'm not seeing any speedup
> when I add CPUs; as a matter of fact, it seems to run in the same amount
> of time or longer as I add CPUs. When I profile the code, I see most of
> the time is spent in "ivdwAccExclus", and the time spent in this routine
> is the same or more for each thread even when it's run on multiple CPUs.
> I must be doing something wrong.
>
> If I look at which MPI routines are being used, I see MPI_Comm_size,
> MPI_Comm_rank, and MPI_Allreduce all being used only once. When I look
> through the source, it's very hard to determine exactly how parallelism
> is entering the problem. Can anyone help?
>
> Thanks,
> Lawrence Hannon
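P.S. Here is a concrete (made-up) illustration of the strong-scaling point
above, as a pair of input-file fragments. The numbers are mine, not taken
from the shipped actin-dimer files. Keep in mind that in a parallel run
dime is the grid of each processor's subdomain, that every dime value must
satisfy the multigrid constraint n = c*2^(l+1) + 1 (hence 97 rather than
96), and that the ofrac overlap means the reduction is not an exact factor
of two per axis:

    # 1 processor: a single subdomain covering the whole problem
    elec
        mg-para
        pdime 1 1 1
        ofrac 0.1
        dime 193 193 193
        ...               # grid lengths, bcfl, PBE keywords, etc. unchanged
    end

    # 8 processors: 2 x 2 x 2 decomposition; per-processor grid roughly
    # halved along each axis, so the global resolution stays about the same
    elec
        mg-para
        pdime 2 2 2
        ofrac 0.1
        dime 97 97 97
        ...
    end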
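P.P.S. On your question about how the parallelism enters: I won't walk
through the APBS source here, but the three calls you found are consistent
with an embarrassingly parallel pattern like the one sketched below. This
is my own minimal illustration in plain MPI C, not APBS code, and
solve_local_subdomain is a hypothetical placeholder. Each process picks
its piece of the decomposition, solves it independently, and a single
MPI_Allreduce combines the per-process results at the end. That structure
would also be consistent with your profile: per-process setup work such as
ivdwAccExclus takes about the same time on every process.

/*
 * Minimal sketch (not APBS source) of the pattern implied by seeing
 * MPI_Comm_size, MPI_Comm_rank, and MPI_Allreduce each called once:
 * every rank independently solves its own subdomain, and the only
 * communication is one reduction of the per-rank energies.
 */
#include <stdio.h>
#include <mpi.h>

/* Hypothetical stand-in for one rank's local solve. In APBS the rank
   would select its piece of the pdime decomposition here; this
   placeholder just returns a dummy per-rank energy. */
static double solve_local_subdomain(int rank, int nproc)
{
    return 1.0 / (double)nproc + 0.0 * rank;
}

int main(int argc, char **argv)
{
    int rank, nproc;
    double e_local, e_total;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* called once */
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);  /* called once */

    e_local = solve_local_subdomain(rank, nproc);

    /* Single collective: sum the per-rank energies on every rank. */
    MPI_Allreduce(&e_local, &e_total, 1, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);

    if (rank == 0)
        printf("total energy (dummy units): %g\n", e_total);

    MPI_Finalize();
    return 0;
}

On your system this should compile with mpcc_r and run under POE, matching
the toolchain you already built APBS with.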