Re: [OSR-devel] killall5 - proposed patch

Gordan Bobic wrote:
> Gordan Bobic wrote:
>> Marc Grimme wrote:
>>
>>>> And just to confirm, I used the same binary on another machine 
>>>> (standalone, no OSR or clustering), and it works exactly as expected 
>>>> (prints out what processes it is killing). That means that whatever 
>>>> causes killall5 to go away and never return is specific to glfs+OSR 
>>>> (since killall5 works fine on my gfs+OSR clusters). I'm not sure where
>>>> to even begin debugging this, though, so any ideas would be welcome.
>>>
>>> You might want to try to start it with strace. I recall something that
>>> under some environments the browsing through /proc which is done by
>>> killall5 freezes. And I think this is done before killing. Somehow what
>>> does not work is a stat call on some /proc files within /proc/<pid>. I
>>> don't recall exactly but I have something like this in mind.
>>>
>>> If you have found the pid that causes the problem perhaps we get some
>>> new ideas on how to handle this behaviour.
>> OK, I have straced killall5, and the last few things it does are to stat
>> /proc/version (twice, it seems) and set up the SIGTERM, SIGSTOP and
>> SIGKILL signals. This appears to correspond to lines 682-692 in killall5.c:
>>
>> mount_proc();
>> ...
>> signal(SIGTERM, SIG_IGN);
>> signal(SIGSTOP, SIG_IGN);
>> signal(SIGKILL, SIG_IGN);
>>
>> The last thing strace reports is:
>>
>> kill(4294967295,SIGSTOP
>>
>> (note - no closing bracket)
>>
>> which seems to correspond to line 695:
>>
>> if (TEST == 0) kill(-1, SIGSTOP);
>>
>> Reading what "man 2 kill" says:
>> POSIX.1-2001 requires that kill(-1,sig) send sig to all processes that 
>> the current process may send signals to, except possibly for some 
>> implementation-defined system processes.
>>
>> I have a suspicion that this may well be the cause of the problems. 
>> killall5 doesn't iterate through all the processes to kill! According to 
>> this, sending "kill(-1, <signal>)" sends the signal to all the processes 
>> that we have permissions to terminate without explicitly specifying the 
>> processes to terminate! Since killall5 is running as root at this point, 
>> this means all processes, with the possible exception of "some 
>> implementation-defined system processes". Right now my bet would be on 
>> this killing glusterfsd (which is in fact running in userspace, and thus 
>> is extremely unlikely to be exempt).
>>
>> This brings up another issue - it sounds like the -x option may be 
>> ineffective, too, even on the normal GFS related processes. If the 
>> signals get sent to all processes, then this would include the
>> processes specified by -x, regardless. This leads me to suspect that 
>> unless these processes are explicitly excluded in the kernel 
>> implementation, they are not spared the killing at this stage. Looking 
>> at the ps output - fenced, groupd, aisexec and ccsd, for example, don't 
>> show up in square brackets, which implies they aren't running in kernel 
>> space (although that isn't really definitive, only indicative, AFAIK). 
>> So, this may be affected by the bug, too - but this may not be obvious 
>> because once they die, the node will get fenced by the other nodes, 
>> which will end up doing something similar. Or maybe these processes 
>> simply catch and ignore the signals if they are being used (e.g. if gfs 
>> is mounted), or something like that. Anyway, that is just a hypothesis at
>> this point, but it's probably worth checking if you have a suitable test
>> environment available (I don't have a non-production gfs cluster handy at
>> the moment).
>>
>> Anyway, I'm going to comment out line 695 and see how that goes. In 
>> theory, this seems superfluous anyway, since the iteration through /proc 
>> for processes to kill should catch everything anyway, and in fact, it is 
>> this iteration that -x relies on for its functionality! Otherwise
>> kill(-1) will just blow everything away and preempt anything -x might do 
>> in the first place!
>>
>> Am I missing something obvious here? Is there a flaw in my analysis?
> 
> Sorry, small amendment - line 695 only sends SIGSTOP. Since it resumes
> the processes afterwards, this may not affect all processes, e.g. those 
> required by gfs. But if it sends a stop to glusterfsd, it's almost 
> certain that rootfs will in fact block, so it is definitely an issue for 
> that. Since SIGSTOP cannot be caught or ignored by the process itself, 
> killall5 will have to be explicitly modified to do this differently, 
> e.g. using a double-pass through /proc, specifically without including 
> glusterfsd in the list of processes to signal.
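
To back up the SIGSTOP point, here is a minimal sketch, assuming
Linux/POSIX signal semantics. The helper name cannot_ignore is made up
for illustration and is not part of killall5.c. It tries to set a signal
to SIG_IGN and reports whether the kernel refused. If I have this right,
the refusal also means the signal(SIGSTOP, SIG_IGN) and
signal(SIGKILL, SIG_IGN) calls quoted above do nothing useful.

```c
#include <errno.h>
#include <signal.h>

/* Returns 1 if the kernel refuses to let this process ignore `sig`
 * (signal() fails with EINVAL), 0 if ignoring was accepted, and -1 on
 * any other error. When the call succeeds, the previous disposition is
 * restored so the caller's signal setup is left unchanged. */
int cannot_ignore(int sig)
{
    void (*old)(int);

    errno = 0;
    old = signal(sig, SIG_IGN);
    if (old == SIG_ERR)
        return errno == EINVAL ? 1 : -1;

    signal(sig, old);   /* put the previous disposition back */
    return 0;
}
```

On a Linux box I would expect cannot_ignore(SIGSTOP) and
cannot_ignore(SIGKILL) to report a refusal, and cannot_ignore(SIGTERM)
to succeed. So glusterfsd has no way of protecting itself here; the
sender has to leave it alone.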

Attached is a proposed patch that tries to work around this specific
issue. It seems to work: the machine no longer locks up on killall5,
which is a good sign and a definite improvement. :^)

Please review.
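
For anyone reading without the attachment: the following is NOT the
attached patch, only a minimal sketch of the double-pass idea mentioned
above. The function names (read_comm, collect_targets, signal_targets)
are made up for illustration, and recognising glusterfsd by the command
name in /proc/<pid>/stat is an assumption on my part. Pass 1 builds the
list of PIDs while skipping the excluded name, and pass 2 signals each
PID individually instead of using kill(-1, sig).

```c
#include <ctype.h>
#include <dirent.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the command name of `pid` (the text between the parentheses in
 * /proc/<pid>/stat) into buf. Returns 0 on success, -1 if the process
 * has gone away or the file cannot be read or parsed. */
static int read_comm(pid_t pid, char *buf, size_t len)
{
    char path[64], line[512];
    char *lp, *rp;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof(line), fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    lp = strchr(line, '(');
    rp = strrchr(line, ')');   /* last ')' so names containing ')' work */
    if (lp == NULL || rp == NULL || rp < lp)
        return -1;
    *rp = '\0';
    snprintf(buf, len, "%s", lp + 1);
    return 0;
}

/* Pass 1: collect every PID found in /proc except ourselves, PID 1 and
 * any process whose command name equals `skip` (hypothetical exclusion
 * rule). Stores at most `max` PIDs in `out`; returns the number stored. */
size_t collect_targets(const char *skip, pid_t *out, size_t max)
{
    DIR *dir = opendir("/proc");
    struct dirent *de;
    char comm[64];
    size_t n = 0;

    if (dir == NULL)
        return 0;
    while (n < max && (de = readdir(dir)) != NULL) {
        char *end;
        long pid;

        if (!isdigit((unsigned char)de->d_name[0]))
            continue;
        pid = strtol(de->d_name, &end, 10);
        if (*end != '\0' || pid <= 1 || pid == (long)getpid())
            continue;
        if (read_comm((pid_t)pid, comm, sizeof(comm)) != 0)
            continue;           /* process exited while we were looking */
        if (skip != NULL && strcmp(comm, skip) == 0)
            continue;           /* excluded, e.g. "glusterfsd" */
        out[n++] = (pid_t)pid;
    }
    closedir(dir);
    return n;
}

/* Pass 2: send `sig` to each collected PID individually.
 * Returns how many kill() calls succeeded. */
size_t signal_targets(const pid_t *pids, size_t n, int sig)
{
    size_t ok = 0, i;

    for (i = 0; i < n; i++)
        if (kill(pids[i], sig) == 0)
            ok++;
    return ok;
}
```

Intended use would be something like collect_targets("glusterfsd", pids,
max), then signal_targets(pids, n, SIGSTOP) before the kill loop and
signal_targets(pids, n, SIGCONT) after it. Passing 0 as the signal only
checks that the processes exist, which is handy for a dry run.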

The next problem is that the md devices get stopped shortly afterwards,
just after the "INIT: no more processes left in this run-level" message.
I now have to figure out what does that, since they must remain running
until the shutdown sequence reaches the OSR initroot... But that's
something for a separate thread.

Gordan