From: Thomas M. <th...@m3...> - 2015-11-20 15:22:03
On 20.11.2015 at 3:08 PM, Anton Ivanov <ant...@ko...> wrote:
>
> On 20/11/15 13:48, st...@ni... wrote:
> > On 2015-11-20 13:50, Anton Ivanov wrote:
> >> On 20/11/15 12:26, st...@ni... wrote:
> >>>>> 4. While I can propose a brutal patch for signal.c which sets
> >>>>> guards against reentrancy which works fine, I suggest we actually
> >>>>> get to the bottom of this. Why does the code in unblock_signals()
> >>>>> not guard correctly against that?
> >>>> Thanks for hunting this issue.
> >>>> I fear I'll have to grab my speleologist's hat to figure out why
> >>>> UML works this way.
> >>>> Cc'ing Al, do you have an idea?
> >>> In the few stack-traces that I have seen posted here, I could see
> >>> multiple calls to unlocking of signals (with a signal occurring
> >>> directly after). That probably should not happen. Do we count the
> >>> number of times we try to block/unblock signals and only actually
> >>> perform the action when the counter reaches/leaves 0?
> >>>
> >>> if this series of calls happens:
> >>> block()
> >>> foo()
> >>> block()
> >>> bar()
> >>> unblock() <- this should be a no-op
> >>> foobar()
> >>> unblock() <- first here the signals should be unblocked again
> >> Block/unblock are not counting the number of enable/disable at
> >> present. It is either on or off.
> >>
> >> Any unblock will immediately re-trigger all pending interrupts.
> >>
> >> Some of the errata patches I have out of investigating this do
> >> exactly that - change:
> >>
> >> block to flags = set_signals(0); bar(); set_signals(flags);
> >>
> >> This, if nested, should be a NOP.
> >>
> >> However, even after fixing all of them (and their corresponding
> >> kernel side counterparts), I still get reentrancy, so there is
> >> something else at play too.
> > Please, share a stack-trace if possible.
> >
> > As a side-note:
> > The small issue I can see with the code example above is: what if
> > flags changes during bar()?
>
> I see it too, but I have not figured out how to deal with it.
>
> > And code inside bar can do
> > set_signals() magic.
>
> Correct, which is to some extent our issue.
>
> >
> > I am not a linux kernel ABI expert.
> >
> > To me, it seems safer to have an ABI that tracks each signal's
> > blocked mask individually, and to have a ref-counted
> > block-all/unblock-all call. This would be like how you normally
> > program on a CPU. You have an interrupt controller that you set up
> > (masks), and a master interrupt enable/disable flag.
>
> That is what signal.c is trying to simulate - you have a mask for ALRM
> (or VTALRM with the older timers) and SIGIO and a global on/off.
>
> What that fails to emulate, however, is that an IRQ is usually blocked
> until it is fully serviced. This, depending on IRQ controller design, may
> block all IRQs, all lower priority IRQs or none.
>
> The current code in uml tries to block all while processing an IRQ, but
> for some reason fails.
>
> I will submit a patch to put some duct tape over this for the time being,
> but we should understand what the root cause is.

I hope the change in arch/um/kernel/skas/MMU.c isn't the cause of all
this trouble! I wanted to disable interrupt processing, and with it the
forwarding of timer interrupts to the user space process, while that
process is in the critical section of forking itself and no signal
handler is installed yet.

>
> A.
>
> >
> > Stian
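
For reference, the ref-counted block/unblock idea discussed in the quoted
text could be sketched roughly like this. It is only a toy model under
stated assumptions, not the code in signal.c: the names block_all(),
unblock_all(), deliver_signal() and run_handler() are made up for
illustration, and signal delivery is simulated with a plain function call
instead of a real signal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names come from signal.c. */
static int block_depth;      /* number of outstanding block_all() calls */
static bool signal_pending;  /* a signal arrived while blocked */
static int handled_count;    /* how many times the handler body has run */

/* Stand-in for the real interrupt handler body. */
static void run_handler(void)
{
    handled_count++;
}

/* Each call nests one level deeper; signals stay blocked while depth > 0. */
static void block_all(void)
{
    block_depth++;
}

/* Only the outermost unblock replays pending signals; inner ones are no-ops. */
static void unblock_all(void)
{
    assert(block_depth > 0);
    if (--block_depth > 0)
        return;

    while (signal_pending) {
        signal_pending = false;
        block_depth++;   /* keep signals blocked until fully serviced */
        run_handler();
        block_depth--;
    }
}

/* Simulated arrival of a signal: defer it if blocked, else service it now. */
static void deliver_signal(void)
{
    if (block_depth > 0) {
        signal_pending = true;
        return;
    }
    block_all();         /* the handler itself runs with signals blocked */
    run_handler();
    unblock_all();
}
```

With this model, the block()/block()/unblock()/unblock() sequence from the
quoted example only runs the handler on the second unblock_all(). Note that
the pending state is a single flag, so several arrivals while blocked
collapse into one handler run, much like standard (non-realtime) POSIX
signals.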