From: Bodo S. <bst...@fu...> - 2005-07-15 15:36:53
I tried to adapt fix-stub_segv-stack.patch to my s390. Unfortunately, sigreturn on s390 needs the address of the entire signal stackframe in r15. What is given to the handler as a parameter is a pointer to the sigcontext only, and the stackframe contains some space for the handler's register saving _before_ the sigcontext.

So I thought about avoiding the sigreturn altogether. Working on this, I found a few small points:

- there is a bug (typo) in wait_stub_done()

- stopping stub_segv_handler with a "breakpoint", without calling sigreturn, leaves SIGSEGV blocked afterwards. At the next SIGSEGV, I see wait_stub_done() calling panic, just as with Rob's problem. And I see that the child is gone! I understand that the host unblocks SIGSEGV and sets the handler to SIG_DFL. But I don't understand why the child is already gone after the waitpid(), without having been resumed. I guess this is the reason Rob was not able to debug the problem.
  Thus I added SA_NOMASK to the flags for the handler.

- I changed the additional mask for the handler to be empty. The only exception is x86_64, which currently must use sigreturn and therefore still masks SIGUSR1 while the handler runs. In skas, userspace shouldn't receive SIGIO or SIGWINCH (I hope I'm right here?), and SIGVTALRM is already handled by wait_stub_done.
  Then I changed i386's stub_segv_handler to stop by executing "int3" immediately after saving faultinfo.
  This new method saves some syscalls on i386 and s390 and simplifies s390.

All three patches are attached. They are tested on i386 and s390; for me they work fine. I hope the patches don't break x86_64. Unfortunately I can't test that.
If there were a solution for the RCX problem on x86_64, I would prefer to use "int3" on this subarch as well, making the nasty ARCH_STUB_SEGV_MASK_SIGNAL macro obsolete.

Bodo
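For readers following the SA_NOMASK point above, here is a minimal sketch of the masking behaviour being discussed. It is not the UML stub code: it uses raise() instead of a real fault, the handler returns normally instead of stopping at "int3", and probe_handler and segv_blocked_with_flags are hypothetical names chosen for illustration. SA_NODEFER is the standard name for the flag; Linux documents SA_NOMASK as an obsolete synonym for it.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Written by the handler: 1 if SIGSEGV was blocked while it ran, 0 if not. */
static volatile sig_atomic_t segv_blocked_in_handler = -1;

/* Hypothetical probe handler: only inspects the current signal mask. */
static void probe_handler(int sig)
{
    sigset_t cur;

    (void)sig;
    /* With a NULL new set, sigprocmask() just reports the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) == 0)
        segv_blocked_in_handler = sigismember(&cur, SIGSEGV);
}

/*
 * Hypothetical helper: install probe_handler for SIGSEGV with the given
 * sa_flags and an empty additional mask, deliver one SIGSEGV with raise(),
 * restore the previous disposition, and report what the handler observed.
 * Returns 1 (blocked), 0 (not blocked) or -1 on error.
 */
int segv_blocked_with_flags(int flags)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = probe_handler;
    sigemptyset(&sa.sa_mask);      /* empty additional mask */
    sa.sa_flags = flags;

    segv_blocked_in_handler = -1;
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    raise(SIGSEGV);                /* handler runs before raise() returns */

    sigaction(SIGSEGV, &old, NULL);
    return segv_blocked_in_handler;
}
```

Calling segv_blocked_with_flags(0) should report that SIGSEGV is blocked inside the handler, while segv_blocked_with_flags(SA_NODEFER) should report that it is not. A handler that never reaches sigreturn leaves the first kind of mask in place, which is the situation described above.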
From: Jeff D. <jd...@ad...> - 2005-07-15 20:53:43
On Fri, Jul 15, 2005 at 05:36:34PM +0200, Bodo Stroesser wrote:
> All three patches are attached. They are tested on i386 and s390;
> for me they work fine. I hope the patches don't break x86_64.
> Unfortunately I can't test that.
> If there were a solution for the RCX problem on x86_64, I would
> prefer to use "int3" on this subarch as well, making the nasty
> ARCH_STUB_SEGV_MASK_SIGNAL macro obsolete.

Thanks, I've got these saved away. I might not get to merge them until after KS/OLS though.

Jeff
From: Jeff D. <jd...@ad...> - 2005-07-31 12:50:08
> Thus I added SA_NOMASK to the flags for the handler.

This patch causes sleep to hang sometimes for me. A sleep during shutdown reproducibly hangs, as does the sleep before a reboot after an fsck found errors.

I had to track it down by bisection. I'm somewhat mystified by this. The only situation where this could cause a change in behavior is one that shouldn't happen, as far as I can see.

But it's definitely the cause. Pop the patch, the hangs go away; push it, they come back. I did this a couple of times to be sure.

I'll debug this and figure out what's happening.

Jeff
From: Bodo S. <bst...@fu...> - 2005-08-05 11:44:06
Jeff Dike wrote:
>> Thus I added SA_NOMASK to the flags for the handler.
>
> This patch causes sleep to hang sometimes for me. A sleep during
> shutdown reproducibly hangs, as does the sleep before a reboot after an
> fsck found errors.
>
> I had to track it down by bisection. I'm somewhat mystified by this.
> The only situation where this could cause a change in behavior is one
> that shouldn't happen, as far as I can see.
>
> But it's definitely the cause. Pop the patch, the hangs go away; push
> it, they come back. I did this a couple of times to be sure.
>
> I'll debug this and figure out what's happening.
>
> Jeff

Have you been able to find out what is happening? On which subarch did you see the problem? If I understand correctly, a sleep() done by a UML userspace task hangs?

Bodo
From: Jeff D. <jd...@ad...> - 2005-08-12 17:15:28
From our discussion yesterday, ...

On Fri, Jul 15, 2005 at 05:36:34PM +0200, Bodo Stroesser wrote:
> So I thought about avoiding the sigreturn altogether. Working on
> this, I found a few small points:

I kept this patch

> - there is a bug (typo) in wait_stub_done()

but dropped these two, plus my dont-save-fpregs patch. Correct?

> - stopping stub_segv_handler with a "breakpoint", without
>   calling sigreturn, leaves SIGSEGV blocked afterwards.
>   At the next SIGSEGV, I see wait_stub_done() calling panic,
>   just as with Rob's problem. And I see that the child is gone!
>   I understand that the host unblocks SIGSEGV and sets the
>   handler to SIG_DFL. But I don't understand why the child
>   is already gone after the waitpid(), without having been resumed.
>   I guess this is the reason Rob was not able to debug
>   the problem.
>   Thus I added SA_NOMASK to the flags for the handler.
>
> - I changed the additional mask for the handler to be empty.
>   The only exception is x86_64, which currently must use sigreturn
>   and therefore still masks SIGUSR1 while the handler runs.
>   In skas, userspace shouldn't receive SIGIO or SIGWINCH (I hope
>   I'm right here?), and SIGVTALRM is already handled by wait_stub_done.
>   Then I changed i386's stub_segv_handler to stop by executing "int3"
>   immediately after saving faultinfo.
>   This new method saves some syscalls on i386 and s390 and
>   simplifies s390.

Jeff
From: Bodo S. <bst...@fu...> - 2005-09-13 13:17:56
Jeff Dike wrote:
> From our discussion yesterday, ...
>
> On Fri, Jul 15, 2005 at 05:36:34PM +0200, Bodo Stroesser wrote:
>
>> So I thought about avoiding the sigreturn altogether. Working on
>> this, I found a few small points:
>
> I kept this patch
>
>> - there is a bug (typo) in wait_stub_done()
>
> but dropped these two, plus my dont-save-fpregs patch. Correct?
>
>> - stopping stub_segv_handler with a "breakpoint", without
>>   calling sigreturn, leaves SIGSEGV blocked afterwards.
>>   At the next SIGSEGV, I see wait_stub_done() calling panic,
>>   just as with Rob's problem. And I see that the child is gone!
>>   I understand that the host unblocks SIGSEGV and sets the
>>   handler to SIG_DFL. But I don't understand why the child
>>   is already gone after the waitpid(), without having been resumed.
>>   I guess this is the reason Rob was not able to debug
>>   the problem.
>>   Thus I added SA_NOMASK to the flags for the handler.
>>
>> - I changed the additional mask for the handler to be empty.
>>   The only exception is x86_64, which currently must use sigreturn
>>   and therefore still masks SIGUSR1 while the handler runs.
>>   In skas, userspace shouldn't receive SIGIO or SIGWINCH (I hope
>>   I'm right here?), and SIGVTALRM is already handled by wait_stub_done.
>>   Then I changed i386's stub_segv_handler to stop by executing "int3"
>>   immediately after saving faultinfo.
>>   This new method saves some syscalls on i386 and s390 and
>>   simplifies s390.
>
> Jeff

Sorry for the late reply: you are right.

Bodo