From: Ashley P. <as...@qu...> - 2005-05-26 11:36:53
On Wed, 2005-05-25 at 11:36 -0700, Jeremy Fitzhardinge wrote:
> Julian Seward wrote:
>
> >When a program using this driver starts up, it creates a child
> >thread using clone. No problem. The child hangs around and
> >basically doesn't do anything much (purpose is unclear, but that
> >doesn't matter). It calls a custom ioctl which communicates with
> >the Elan3 kernel module. The ioctl doesn't return until (I assume)
> >the parent thread tells the kernel module that it is done with the
> >card. The ioctl returns and the child exits.
> >
> >Hence the child waits for the parent to exit, then exits itself.
> >
> >
> How does the parent thread tell the kernel it is done with the module?
> By closing all the file descriptors? It must be some implicit
> mechanism, because if it did an explicit ioctl() or something, we would
> do the same.

It's somewhat complicated... the parent thread calls elan3_detach (an
ioctl) and the device driver sets some state and wakes up the kernel
thread sitting in the lwp ioctl. This thread then returns done and the
lwp exits. Other than that, the lwp only returns to user-space to take
signals.

That's the theory anyway; it's complicated by the fact that we have
kernel patches (not just modules) to provide "ptrack" functionality.
Basically, the job starts in a container, and when the job finishes all
processes (and sys-v stuff) created in that container also get
destroyed. If you are using rms/pdsh/slurm to start jobs then you will
be using the ptrack code (it's done by the open source "rms" kernel
module); if you are just running your programs by hand then you won't
have the ptrack stuff.

The lwp's purpose is to make syscalls on behalf of the NIC: the C code
on the NIC sets up a descriptor and generates an interrupt, which the
lower half forwards on to the lwp kernel thread. This thread then makes
syscalls back into the kernel from the top half as the appropriate user
with suitable permissions.

Ashley,
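
For anyone who wants to play with the detach handshake described above without the hardware, here is a rough user-space analogy in Python. It is only a sketch: `FakeElanDevice`, `lwp_ioctl`, `detach` and `run_job` are hypothetical names made up for illustration, and nothing here is the real Elan3 driver interface. A helper thread blocks in a stand-in for the lwp ioctl until the parent calls the stand-in for elan3_detach, at which point it returns "done" and exits.

```python
import threading


class FakeElanDevice:
    """Hypothetical stand-in for the driver state; not the real Elan3 API."""

    def __init__(self):
        # Plays the role of the "some state" the driver sets on detach.
        self._detached = threading.Event()

    def lwp_ioctl(self):
        # Blocks like the helper thread sitting in the lwp ioctl.
        self._detached.wait()
        return "done"

    def detach(self):
        # Analogue of elan3_detach: set state and wake the sleeping thread.
        self._detached.set()


def run_job():
    dev = FakeElanDevice()
    result = []

    # The helper is created at startup and parks itself in the "ioctl".
    helper = threading.Thread(target=lambda: result.append(dev.lwp_ioctl()))
    helper.start()

    # ... the parent would do its real work here ...
    alive_before = helper.is_alive()

    # The parent says it is done with the card; the helper wakes and exits.
    dev.detach()
    helper.join(timeout=5)
    return alive_before, helper.is_alive(), result


if __name__ == "__main__":
    print(run_job())
```

The point of the sketch is that the helper's exit is driven by an explicit call from the parent rather than by the parent closing file descriptors. It does not model the ptrack container behaviour or the interrupt-forwarding path at all.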