From: Subhankar G. <sg...@ca...> - 2007-03-12 14:27:50
Ashley,

Thanks for your reply.

> Are you using mpd (the mpich2 default) as your job launcher or some
> other program? If you are using mpd is it installed as yourself or
> root, I know it's possible to do either. Also, can you run the program
> normally and have a look at the process list when it's running, what
> is the ppid of the child process(c) and can you trace this back to
> the parent(p) or does it trace back to the mpd ring?

Yes, I'm using mpd as the job launcher. Probably for this reason, the
ppid of process 'c' is mpd, i.e. it traces back to the mpd ring rather
than to the parent process 'p'.

BTW, is there any other free tool I can use to detect memory errors in
child process 'c'?

Regards,
Subhankar

On Fri, 09 Mar 2007 13:29:33 +0000, Ashley Pittman wrote
> On Fri, 2007-03-09 at 18:45 +0530, Subhankar Ghosh wrote:
> > Dear All,
> >
> > I'm trying to detect memory error in a child process(c) in my MPICH2
> > application. This child process is being generated my calling
> > MPI::Comm_spawn() from parent(p). The command line I'm using:
> >
> > >mpiexec -l -n 1 valgrind --trace-children=yes --log=file=vgnd.log p <cmdline
> > option to p>
> >
> > But no logfile is created for child process 'c' by valgrind.
> >
> > Am I missing anything?
>
> Sabhankar,
>
> It's a interesting problem, what you have done at least looks right but
> there are a number of caveats and the devil is in the detail with MPI
> implementations. Comm_spawn() will probably launch mpiexec to start
> the new processes and as mpiexec is likely a suid program valgrind won't
> trace it :(
>
> Are you using mpd (the mpich2 default) as your job launcher or some
> other program? If you are using mpd is it installed as yourself or
> root, I know it's possible to do either. Also, can you run the program
> normally and have a look at the process list when it's running, what
> is the ppid of the child process(c) and can you trace this back to
> the parent(p) or does it trace back to the mpd ring?
>
> I suspect that getting this to work might require modifications to
> MPICH2 itself although they should be fairly minor, I expect I can knock
> up a patch in a hour or two if I'm right about the problem although I'm
> very busy currently.
>
> Ashley,
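
The ppid check discussed above can be scripted. The following is only a
rough sketch, assuming a Linux system where /proc/<pid>/stat is readable;
the function names (parse_stat, read_stat, parent_chain) are made up for
illustration and are not part of MPICH2 or valgrind. It walks from a given
pid up through its ancestors, so one can see whether 'c' leads back to 'p'
or to mpd.

```python
def parse_stat(text):
    """Return (command name, parent pid) from the text of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split on the last ')' rather than on whitespace.
    head, tail = text.rsplit(")", 1)
    name = head.split("(", 1)[1]
    fields = tail.split()  # fields[0] is the state, fields[1] is the ppid
    return name, int(fields[1])


def read_stat(pid):
    """Read the raw stat line for one process (assumes Linux /proc)."""
    with open("/proc/%d/stat" % pid) as f:
        return f.read()


def parent_chain(pid, reader=read_stat):
    """Walk from pid up through its ancestors: [(pid, name), ...]."""
    chain = []
    while pid > 0:  # the topmost process reports a ppid of 0
        name, ppid = parse_stat(reader(pid))
        chain.append((pid, name))
        pid = ppid
    return chain
```

Calling parent_chain() with the pid of 'c' while the job is running and
printing the result gives the ancestry one entry per process; if mpd shows
up and 'p' does not, 'c' was started by the mpd ring rather than forked
from the parent that valgrind is tracing.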