From: Ashley P. <as...@qu...> - 2005-06-13 17:31:31
On Sat, 2005-05-28 at 13:17 -0700, Jeremy Fitzhardinge wrote:
> Ashley Pittman wrote:
>
> >It's somewhat complicated...
> >
> >
> Er, yep.
>
> >the parent thread calls elan3_detach (an ioctl) and the device driver
> >sets some state and wakes up the kernel thread sitting in the lwp ioctl.
> >This thread then returns done and the lwp exits. Other than that the
> >lwp only returns to user-space to take signals.
> >
> >
> So what makes it return done? What triggers that event?
Either the elan3_detach ioctl or the close of the fd at program exit
causes a bit to be set and the extra thread then wakes up, notices the
bit, returns to user-space where the thread exits. The code in question
looks like this:
    if (--ctxt->LwpCount != 0)          /* Still other LWPs running */
    {
        spin_unlock_irqrestore (&dev->IntrLock, flags);
        return;
    }

    kcondvar_wakeupall (&ctxt->LwpWait, &dev->IntrLock); /* Wakeup anyone waiting on LwpCount */
I'm not really a kernel programmer though; I can go over it again, or forward
this on to someone with a better understanding of the code, if you need more
detail.
I'd be surprised if many programs actually call elan3_detach() though; there
are no hooks for it in MPI_Finalize, so it probably never gets called.
> >That's the theory anyway, it's complicated by the fact that we have
> >kernel patches (not just modules) to provide "ptrack" functionality,
> >basically the job starts in a container and when the job finishes all
> >processes (and sys-v stuff) created in that container also get
> >destroyed.
> >
> >
> Is this some extra kernel state which Valgrind needs to understand to do
> a correct emulation?
Possibly but hopefully not.
> How are these containers created? In this case,
> would the program running under valgrind create a new container which is
> expected to mop up all the threads when the main thread exits? How is a
> "job" defined?
These containers are created by the rms kernel module (the kernel module is
open-source; RMS the application is not, although there is open-source
software which uses the kernel module). Typically, to run a "job" over say
four cpus you would type "prun -n4 mping", which would start four programs,
each of which would be expected to call elan_init(). Each program in this
job would have a "vp" or virtual process number from 0 to N-1 (in MPI terms
this is called "rank"). Each of these four processes is kept inside its own
container, and the rms kernel module keeps track of any child processes
and/or sys-v objects created and ensures that they are all torn down
properly at program exit.
There is another way of running programs outside of this mechanism
though, it's kind of messy and we don't recommend using it for anything
other than fine-grained bug-hunting but it does work so I suspect the
above may be a red herring.
> In the 2.6 NPTL thread model, exit_group() terminates all threads in the
> thread group atomically, so there's no waiting around for things to
> terminate (or dependence on termination order). Is this running in a
> 2.4 thread model, or a 2.6 one? It sounds like the container machinery
> has an atomic group termination property similar to exit_group().
It does sound similar, it works across child programs though, not just
thread groups. Probably not relevant to this bug however.
Going back to the original questions: the thread should implicitly be woken
and then die when the parent thread terminates, hence the deadlock if the
parent thread isn't exiting. How does V behave with regard to any other
blocking syscall being in progress at program exit?
Ashley,