|
From: Indi T. <iht...@ho...> - 2009-11-27 10:26:20
|
Hi, I am trying to debug a large iterative solver, compiled using Intel Fortran 10 and Open MPI 1.2.6, that runs on a SLES10-based PC cluster, using valgrind 3.5.0. To suppress the Open MPI error messages I've passed the mpirun argument "--mca btl tcp,self"; otherwise the log file is simply filled with Open MPI messages, which may amount to 20Mb. Here, I am not trying to debug Open MPI.

The problem appears at random after several iterations, which suggests a memory-related problem. I expected Memcheck to report an error when the offending line is executed for the first time. However, Memcheck does not report any problem prior to the point of failure, i.e. the part of the code that fails has been passed by Memcheck as problem-free in previous iterations. At the point of failure, typically the following error is logged by Memcheck:

Invalid write of size 8
   at 0x511BF6B: _int_malloc (in /opt/openmpi-1.2.6/intel/lib/libopen-pal.so.0.0.0)
   by 0x511B710: malloc (in /opt/openmpi-1.2.6/intel/lib/libopen-pal.so.0.0.0)
   by 0x907161A: ompi_coll_tuned_allreduce_intra_recursivedoubling (in /opt/openmpi-1.2.6/intel/lib/mca_coll_tuned.so)
   by 0x906FFED: ompi_coll_tuned_allreduce_intra_dec_fixed (in /opt/openmpi-1.2.6/intel/lib/mca_coll_tuned.so)
   by 0x4DFFEF7: PMPI_Allreduce (in /opt/openmpi-1.2.6/intel/lib/libmpi.so.0.0.0)
   by 0x4C9C612: PMPI_ALLREDUCE (in /opt/openmpi-1.2.6/intel/lib/libmpi_f77.so.0.0.0)
   by 0x6E006E: my_mpireduce_call (my_routine.F)
   . . .
 Adress 0x10 is not stack'd, malloc'd or (recently) free'd

Since Memcheck does not detect any problem earlier, when the point of failure has been executed before, does that mean the program is fairly sound? I could not see any bug at the program line that calls the MPI reduce. Does that error message suggest the dynamic memory allocation within the Open MPI allreduce operation is at fault? If this is the case, could I capture the problem earlier by removing the MPI suppression?
Regards |
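[Editorial sketch of the setup described above: each rank launched under valgrind with the shared-memory transport disabled. The binary name, rank count and log-file naming are illustrative, not from the original post.]

```shell
# Run each MPI rank under memcheck. "sm" is deliberately absent from the
# btl list, so ranks communicate over tcp/loopback only, avoiding the
# shared-memory false positives that flooded the log. valgrind expands
# %p in --log-file to the PID, giving one log per rank.
mpirun -np 4 --mca btl tcp,self \
    valgrind --log-file=vg.%p.log ./solver
```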
|
From: Ashley P. <as...@pi...> - 2009-11-27 12:39:43
|
On Fri, 2009-11-27 at 10:26 +0000, Indi Tristanto wrote:
> I am trying to debug a large iterative solver that has been compiled
> using intel fortran 10 and open mpi 1.2.6. that is run in a SLES10
> based PC cluster using valgrind 3.5.0. To supress the openmpi error
> messages I've put mpi argument "--mca btl tcp, self", otherwise the
> log file is simply filled by open mpi messages, which may amount to
> 20Mb. Here, I am not trying to debug the open mpi.

Open MPI 1.2.6 is fairly old now; there have been some major improvements in the 1.3 series.

> The problem appears at random in after several iterations which
> suggest a memory related problem. I expect Memcheck to report an error
> when the offending lines is executed for the first time.

Like threaded applications, parallel applications can suffer from race conditions that only occur periodically or when a certain set of timing conditions exists.

> Does that error message suggest the dynamic memory allocation within
> open-mpi allreduce operation is at fault? If this is the case, could I
> capture the problem earlier by removing the mpi suppression ?

I thought you said the program was clean up until that point? One thing you should know is that Open MPI has at times in the past replaced the libc malloc with its own version; if it has done this then memcheck will be able to do considerably fewer checks, as it will probably not know about these allocations. I'd recommend downloading the latest Open MPI release (I believe 1.3.3 is still the latest; 1.3.4 is due real soon now, probably when folks get back after Thanksgiving) and compiling it without the malloc hooks.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk |
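[Editorial note: building Open MPI "without the malloc hooks", as suggested above, is done at configure time. A hedged sketch — the configure flag is from the Open MPI build documentation as I understand it, and the install prefix is illustrative:]

```shell
# Build Open MPI without its ptmalloc2 memory manager, so applications
# linked against it keep the libc malloc, which valgrind intercepts and
# can therefore fully check.
./configure --prefix=/opt/openmpi-1.3.3/intel \
            --without-memory-manager
make all install
```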
|
From: tom f. <tf...@al...> - 2009-11-30 02:34:39
|
Indi Tristanto <iht...@ho...> writes:
> I am trying to debug a large iterative solver that has been compiled
> using intel fortran 10 and open mpi 1.2.6.
>
> Invalid write of size 8
>    at 0x511BF6B: _int_malloc (in /opt/openmpi-1.2.6/intel/lib/libopen-pal.so.0.0.0)
>    by 0x511B710: malloc (in /opt/openmpi-1.2.6/intel/lib/libopen-pal.so.0.0.0)
>    by 0x907161A: ompi_coll_tuned_allreduce_intra_recursivedoubling (in /opt/openmpi-1.2.6/intel/lib/mca_coll_tuned.so)
>    by 0x906FFED: ompi_coll_tuned_allreduce_intra_dec_fixed (in /opt/openmpi-1.2.6/intel/lib/mca_coll_tuned.so)
>    by 0x4DFFEF7: PMPI_Allreduce (in /opt/openmpi-1.2.6/intel/lib/libmpi.so.0.0.0)
>    by 0x4C9C612: PMPI_ALLREDUCE (in /opt/openmpi-1.2.6/intel/lib/libmpi_f77.so.0.0.0)
>    by 0x6E006E: my_mpireduce_call (my_routine.F)

This looks (very) similar to an issue I brought up with the OpenMPI folks earlier this year. See ticket 1942:

https://svn.open-mpi.org/trac/ompi/ticket/1942

As a 5-second summary: the ticket is closed, there is a FAQ for this, and at least the trunk versions of OpenMPI have a valgrind suppression file that you'll want to use.

-tom |
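[Editorial sketch of using the suppression file mentioned above. In Open MPI builds that ship it, the file is typically installed under the share/openmpi directory of the install prefix; the prefix and binary name here are illustrative:]

```shell
# Point valgrind at Open MPI's shipped suppression file so the known,
# expected warnings from inside the MPI library are silenced, leaving
# only errors from the application itself.
mpirun -np 2 \
    valgrind --suppressions=/opt/openmpi/share/openmpi/openmpi-valgrind.supp \
             --log-file=vg.%p.log ./solver
```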
|
From: Ashley P. <as...@pi...> - 2009-11-30 12:07:39
|
On Sun, 2009-11-29 at 19:34 -0700, tom fogal wrote:
> Indi Tristanto <iht...@ho...> writes:
> > I am trying to debug a large iterative solver that has been compiled
> > using intel fortran 10 and open mpi 1.2.6.
>
> This looks (very) familiar to an issue I brought up with the OpenMPI
> folks earlier this year. See ticket 1942:
>
> https://svn.open-mpi.org/trac/ompi/ticket/1942

The errors in that ticket are about uninitialised reads, which do happen and are semi-expected with socket programming. The error in this email is about a crash (segfault) in the Open MPI library; I doubt the two are related.

> As a 5 second summary, the ticket is closed, there is a FAQ for this,
> and at least the trunk versions of OpenMPI have a valgrind suppression
> file that you'll want to use.

I wasn't aware that OpenMPI now ships with a suppression file. If it does, I'd suggest using it; it'll help separate the real errors from the noise.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk |
|
From: Indi T. <iht...@ho...> - 2009-12-01 22:49:59
|
Ashley, Tom,

Thank you for your replies.

I added "--mca btl tcp,self" as an mpirun argument following Julian's suggestion in Tom's posting. Indeed, this managed to suppress the Open MPI errors to the extent that the valgrind log file is actually reduced from 20Mb to only a few kb. However, I found the discussion through a Google search, so I'd appreciate it if you could point me to the title of the FAQ so that I can follow the complete discussion.

As for my problem, the program did crash, presumably with a segfault. I say 'presumably' since valgrind simply hangs the whole parallel computation after printing out the last error message. Valgrind never prints any error until the program crashes with the messages I quoted in my original posting. Since it happens in the middle of an iterative process, the line where the program crashes has been passed during the previous iterations without problem. As far as I can see, the program is trying to call mpi_reduce when it crashes. Looking at valgrind's behaviour, I have the impression that the problem lies in open-mpi 1.2.6 rather than in my program. (Valgrind did not report anything when I checked the same program in a sequential environment, as well as under an MPICH environment with GNU compilation.) However, I was expecting valgrind to report the problem earlier, i.e. when mpi_allreduce is called by the same program line at the very first iteration.

What I am wondering at the moment is whether the problem is hidden within the ompi-suppressed errors. Also, have I been naive in expecting this sort of error to be reported cleanly? My problem is that I have more than one call to mpi_allreduce, and the failure seems to happen randomly at any of the calls, as well as at any iteration count. So my valgrind error log changes from run to run; what I quoted in my posting is the "typical" error that I have seen.

By the way, thank you for the suggestion on upgrading Open MPI.
Regards
Indi

> Subject: Re: [Valgrind-users] memcheck behaviour in random failure of an open mpi based code.
> From: as...@pi...
> To: tf...@al...
> CC: iht...@ho...
> Date: Mon, 30 Nov 2009 18:50:00 +0000
>
> On Mon, 2009-11-30 at 10:20 -0700, tom fogal wrote:
> > Ashley Pittman <as...@pi...> writes:
> > > On Sun, 2009-11-29 at 19:34 -0700, tom fogal wrote:
> > > > Indi Tristanto <iht...@ho...> writes:
> > > > > I am trying to debug a large iterative solver that has been
> > > > > compiled using intel fortran 10 and open mpi 1.2.6.
> > > >
> > > > This looks (very) familiar to an issue I brought up with the OpenMPI
> > > > folks earlier this year. See ticket 1942:
> > > >
> > > > https://svn.open-mpi.org/trac/ompi/ticket/1942
> > >
> > > The error in that ticket is about uninitialised reads which do happen
> > > and are semi-expected with socket programming.
> >
> > No, it is not. It is about valgrinding OpenMPI programs. It links to
> > a thread which originally started with uninitialized reads, but if you
> > follow the thread you'll note that the discussion became much wider
> > than the original posting.
> >
> > > The error in this email is about a crash (segfault) in the open mpi
> > > library, I doubt the two are related.
> >
> > At no point in Indi's email did he mention that the application
> > segfaulted or crashed.
>
> I was thinking about the "Adress 0x10 is not stack'd, malloc'd or
> (recently) free'd" message. Actually, there's a spelling mistake in that
> error message; I assume this is from transmission somewhere rather than
> in the actual valgrind output.
>
> Given that OpenMPI has its own malloc implementation, it's likely that
> allocations aren't being intercepted, buffer over-runs aren't being
> caught, and quite possibly the error is being caused by an invalid
> write that valgrind isn't catching.
>
> Ashley,
>
> --
> Ashley Pittman, Bath, UK.
> Padb - A parallel job inspection tool for cluster computing
> http://padb.pittman.org.uk |
|
From: Ashley P. <as...@pi...> - 2009-12-02 12:42:19
|
On Tue, 2009-12-01 at 22:49 +0000, Indi Tristanto wrote:
> Ashley, Tom
>
> Thank you for your reply.
>
> I add --mca btl tcp, self as mpirun argument following Julian
> suggestion on Tom's posting. Indeed this managed to supress open mpi
> errors to the extend that the valgrind log file is actually reduced
> from 20Mb to only a few kb. However I found the discussion through a
> google search, hence I'd appreciate if you could point me to the
> tittle of the FAQ so that I can follow the complete discussion.

The "--mca btl tcp,self" option tells OpenMPI to communicate only via tcp and loopback. Of note here is that "sm", or shared memory, is missing; it's the shared-memory FIFOs that confuse valgrind and cause the 20Mb of errors you'd have seen. These would most likely all be false positives.

> Valgrind never print out any error until the program crash with
> messages I've wrote in my original posting. Since it happens in the
> middle of an iterative process, the line where the program crashes has
> been passed during the previous iterations without problem. As far as
> I can see the program is trying to call mpi_reduce when it crashed.
> Looking at valgrind behaviour, I have impression that the problem lies
> on open-mpi 1.2.6. rather than on my program. (Valgrind did not report
> anything when I checked the same program in sequential environment as
> well as under mpich environment with gnu compilation) However, I was
> expecting valgrind to report the problem earlier, i.e. when
> mpi_allreduce is called by the same program line at the very first
> iteration.

If you look at your stack trace more closely, it claims to be in malloc, which is itself in libopen-pal. Normally malloc is in libc and valgrind intercepts it as such; OpenMPI by default replaces the libc malloc with its own version in libopen-pal, which valgrind won't have intercepted.
What this means is that you aren't getting the full value of memcheck, as buffer over-runs and under-runs aren't being caught (valgrind doesn't know what has been malloc'd and what hasn't). Internally, mpi_reduce is calling malloc, which is then crashing, most likely because its private data structures have been over-written by a buffer over-run. If you re-link your program without libopen-pal and re-run, you will be using the libc malloc and valgrind will be able to do a lot more checking, hopefully including catching the buffer over-run which went on to cause the problem you are seeing.

> Also have I been naive in expecting this sort of error to be reported
> cleanly?

No, that's not naive.

> My problem is that I have more than one call to mpi_allreduce and the
> failure seems to happen randomly at any of the calls as well as the
> number of iteration. So my valgrind error log changes from each run,
> thus what I quoted on my posting is the "typical" error that I have
> seen.

I hope the above explains this: there will be many calls to malloc throughout the code, and once you've over-written the allocator's meta-data it's essentially pot luck which one goes on to crash.

> By the way thank you for the suggestion on upgrading the open mpi.

I believe the latest OpenMPI doesn't replace malloc by default; I'm not entirely sure about this, however. Certainly you can configure it not to.

Ashley.

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk |
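[Editorial sketch: one quick way to check whether a given Open MPI build replaces malloc, as discussed in the message above, is to look at the dynamic symbols that libopen-pal exports. The library path is illustrative:]

```shell
# If this prints a line whose symbol type is 'T' (defined in the text
# section), libopen-pal carries its own malloc definition and will
# shadow the libc malloc in programs linked against it.
nm -D /opt/openmpi-1.2.6/intel/lib/libopen-pal.so.0.0.0 | grep -w malloc
```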