From: Mike H. <mho...@gr...> - 2005-02-22 16:33:26
Brian Paul wrote:
> Joel Welling wrote:
>
>> Hi folks;
>>   I'm encountering a problem managing the crserver processes for
>> configs in which I use the teac network mechanism; I think it will
>> also be seen in things like GM.  When I spawn the crserver I keep
>> track of its PID, and when I want it to terminate I send a TERM
>> signal to that PID.  Fair enough; the process that got that signal
>> terminates, running through the 'teardown' routine in the crserverlib
>> code.
>>   The problem is, by this time the crserver has spawned 2 child
>> processes.  I believe they are actually other threads, but I'm not
>> sure- I know they are not being created with crSpawn().  One of those
>> threads is very, very busy doing a wait loop, waiting for messages on
>> my high-bandwidth network.  That process, and the third process which
>> is its child, do *not* die when their parent (the originally spawned
>> process) gets the TERM signal.
>>   Can anyone confirm for me that these are threads, and tell me which
>> bit of code is likely to be spawning them?  Are their PIDs or thread
>> IDs getting saved anywhere?  I can modify the teardown procedure to
>> kill them if I have their names, but I can't simply kill the whole
>> process group- that kills innocent bystanders.
>
> I don't know how the crserver would be spawning any threads.  The
> crserver itself isn't even thread-safe.
>
> Are you sure the GM library isn't creating the threads?
>
> -Brian

I also haven't seen the threading behavior, but I have seen the networks
hang.  The issue is that faking a connection-based protocol on
connectionless networks (Quadrics/Myrinet/IB) has had shutdown problems
for as long as I can remember.  Sometimes things work correctly if the
application exits cleanly, so that signals to terminate get passed
around, but killing one of the applications or servers can cause the
others to hang.

I'm not sure what the best solution is.  Maybe we need to start trapping
all of the signals on the nodes to make sure the connections get shut
down (there's a sketch of that at the end of this message).  Another,
potentially better, option is to rely on the mothership's connection
brokering: if the mothership loses its connection to anyone in the
group, it sends a notification to the others that things should
terminate.  The layers other than tcp/sdp/udp would need to time out on
waitrecv and test the mothership connection (also sketched below).  This
might also fix the deadlock-detect-and-hang problem that happens
occasionally.

In general, this points to the network-layer rewrite we have talked
about off and on for the past two years.  There is some cruftiness in
writing high-speed layers.  Instead of making everything look like
TCP/IP, we might want to find a more general abstraction and map the
layers to that, something like a point-to-point layer (i.e. "send this
message to these systems", without enforcing connection semantics; see
the last sketch below).  For example, there is an MPI layer branched in
CVS that is getting painful to get working in general cases, mainly
because of its reliance on special process-creation semantics and the
enforcement of connection-based networking semantics.

The reason it's hard to get a network rewrite done is the sheer time
commitment involved, and not everyone has access to all of the different
network systems.  That said, many interconnects now support socket
interfaces and *DAPL (*=u/k/s), which might make for more simplified
porting efforts.

-Mike
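
P.S. Here are the sketches I mentioned, plus one for Joel's teardown
question.  None of this is real Chromium code; every name below is
invented for illustration.

For Joel: one guess is that the "child processes" are LinuxThreads
threads created inside the GM library, since LinuxThreads gives every
thread its own PID.  Either way, if the code that creates them records
their PIDs, teardown can kill exactly those and spare the rest of the
process group.  RecordChild/KillRecordedChildren are hypothetical:

    #include <signal.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    #define MAX_CHILDREN 8

    static pid_t childPids[MAX_CHILDREN];
    static int numChildren = 0;

    /* Call wherever the network layer forks (or LinuxThreads spawns)
     * a helper, so teardown knows whom to signal. */
    static void RecordChild(pid_t pid)
    {
        if (numChildren < MAX_CHILDREN)
            childPids[numChildren++] = pid;
    }

    /* Call from the crserverlib teardown routine: TERM each recorded
     * child and try to reap it, leaving innocent bystanders in the
     * process group alone.  (waitpid fails harmlessly for the
     * grandchild, which isn't our direct child.) */
    static void KillRecordedChildren(void)
    {
        int i;
        for (i = 0; i < numChildren; i++) {
            kill(childPids[i], SIGTERM);
            waitpid(childPids[i], NULL, 0);
        }
        numChildren = 0;
    }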
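
On trapping signals: the handler itself should only set a flag; the
receive loop checks the flag and does the actual connection teardown,
since almost nothing is safe to call from signal context.
TeardownConnections() is a stand-in for whatever closes the fabric
endpoints:

    #include <signal.h>

    static volatile sig_atomic_t shuttingDown = 0;

    static void HandleFatalSignal(int sig)
    {
        (void) sig;
        shuttingDown = 1;      /* picked up by the wait loop */
    }

    static void InstallShutdownHandlers(void)
    {
        struct sigaction sa;
        sa.sa_handler = HandleFatalSignal;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGTERM, &sa, NULL);
        sigaction(SIGINT,  &sa, NULL);
        sigaction(SIGHUP,  &sa, NULL);
    }

    /* The network wait loop then becomes:
     *     while (!shuttingDown) { ... wait for a message ... }
     *     TeardownConnections();
     */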
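
On the mothership idea: waitrecv stops blocking forever and instead
waits on the fabric with a timeout, checking the mothership socket
between timeouts.  FabricWaitRecv() stands in for the real teac/GM
receive call, and the single-byte 'T' notification is just a guess at a
wire format:

    #include <sys/select.h>
    #include <sys/socket.h>
    #include <sys/time.h>

    /* Stand-in for the layer's blocking receive, modified to take a
     * timeout; returns nonzero if a message arrived. */
    extern int FabricWaitRecv(int timeoutMs);

    /* Returns 1 if we should shut down: the mothership connection
     * dropped, or it sent a terminate notification. */
    static int CheckMothership(int mothershipSock)
    {
        fd_set rfds;
        struct timeval tv = { 0, 0 };   /* poll, don't block */
        char msg;
        int n;

        FD_ZERO(&rfds);
        FD_SET(mothershipSock, &rfds);
        if (select(mothershipSock + 1, &rfds, NULL, NULL, &tv) <= 0)
            return 0;                   /* quiet socket: keep going */

        n = recv(mothershipSock, &msg, 1, 0);
        if (n <= 0)
            return 1;                   /* EOF/error: mothership gone */
        return msg == 'T';              /* hypothetical terminate op */
    }

    static void ReceiveLoop(int mothershipSock)
    {
        while (!CheckMothership(mothershipSock)) {
            if (FabricWaitRecv(500 /* ms */)) {
                /* ... dispatch the message ... */
            }
        }
        /* tear down fabric state and exit */
    }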
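
And on the point-to-point abstraction: roughly, a vtable that only
promises "deliver this message to these nodes" and "wait for a message
from anyone", with no per-peer connect/accept for fabrics that don't
have any.  All of these names are made up:

    typedef struct CRP2PInterface {
        /* One-shot init/fini instead of per-peer connect/accept. */
        int  (*init)(int myNode, int numNodes);
        void (*fini)(void);

        /* Send one message to each node in nodes[]; no streams, no
         * ordering beyond what the fabric itself guarantees. */
        int (*send)(const int *nodes, int numNodes,
                    const void *buf, unsigned int len);

        /* Wait up to timeoutMs for a message from any node; returns
         * the sender's node id, or -1 on timeout. */
        int (*recv)(void *buf, unsigned int maxLen, int timeoutMs);
    } CRP2PInterface;

The tcp/sdp/udp, GM, teac, and MPI layers would each fill in one of
these, and the connection-oriented behavior the rest of the code expects
could be emulated once above this interface instead of reinvented inside
every layer.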