Menu

#5066 Don't panic on triggerPipe overrun

obsolete: 8.6b2
pending-fixed
7
2012-07-11
2012-07-09
No

As detected by Vikki on thread

https://groups.google.com/group/comp.lang.tcl/tree/browse_frm/thread/b08745216b7b4932/cca074d98284b134?hl=en&rnum=1&_done=%2Fgroup%2Fcomp.lang.tcl%2Fbrowse_frm%2Fthread%2Fb08745216b7b4932%2Fd0d28843efbd42a7%3Fhl%3Den%26#doc_cca074d98284b134

when a write on the notifier's triggerPipe returns EAGAIN (because the stream of notifier configuration updates is too fast for the thread to drain the pipe), we now panic, starting from this commit:

File unix/tclUnixNotfy.c
2009-12-16 23:25:59 - part of checkin [e52afe7fa4] on branch trunk - Fix gcc warning: ignoring return value of ‘write’, declared with attribute warn_unused_result CONSTify functions TclpGetUserHome and TclSetPreInitScript (TIP #27) (user: nijtmans ) [annotate]

This too heavy handed. As explained in the above thread, in this situation the notifier will be constantly woken up, so losing one of these "wake-ups" is not an issue (one wake up can serve several updates).

Note that this only applies to the \0 signal (wake-up).
For the 'q' signal (for notifier finalization), things are trickier:
- blocking mode is still out of the question since we're holding the notifierMutex. In this situation it would lead to a deadlock.
- losing the 'q' is problematic since we'd never end the finalization
So I would advise a switch to a lockless scheme for the 'q' (which is okay since it is a monotonic flag).

Assigning to Jan as author of the offending commit ;-)

Discussion

  • Jan Nijtmans

    Jan Nijtmans - 2012-07-10

    Should be fixed now in core-8-5-branch. trunk still to be done.

    The real problem (the notifyer losing wake-ups) is still
    not fixed, but this should at least be enough to make
    the described bug disappear in the upcoming 8.5.12.

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-07-10
    • status: open --> open-fixed
     
  • Alexandre Ferrieux

    No, the notifier is not losing wake-ups when the pipe is full !

    The purpose of the \0 send over the triggerPipe is to kick the notifier out of the current select(), recompute the select masks according to the new state of all fileevents in all threads, and back to blocking select() with the new state.
    Hence, a single \0 can serve many updates. When the pipe is full of 4096 wake-ups, the 4097th is a waste anyway.

    As mentioned earlier, only the 'q' is problematic. My suggestion for this is to replace the 'q' with simply setting a global flag pleaseNotifierDie, and sending a regular \0 over the pipe, with the same waking effect. The flag, checked in the loop at the same place where the 'q' is sought, would break out of the loop and terminate the notifier. In the context of the original poster, this would even yield a faster exit since this would bypass the 4096 enqueued (useless) updates.

    Comments ?

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-07-11

    >Comments ?
    I think your suggestions are good. But the title of this issue
    is about panicing when it shouldn't, well it doesn't any
    more now in the described situation. ;-)

    What I would like very much is a test-case, showing
    the problem. I know the current nofitier has
    problems, but it's difficult to tackle that
    without a test-case. Since 8.6 is
    multi-threaded by default, I only
    would make such further changes
    in 8.6, and possibly backport
    only afther they proved theyr
    effect. By any means: feel
    free to do that! Apparently
    you experimented more with
    the notifier than I did.

    Thanks!
    Jan Nijtmans

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-07-11
    • milestone: --> obsolete: 8.6b2
    • assigned_to: nijtmans --> ferrieux
     
  • Jan Nijtmans

    Jan Nijtmans - 2012-07-11

    Forgot to mention: merged to trunk now as well.

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-07-11
    • status: open-fixed --> pending-fixed
     
  • Reinhard Max

    Reinhard Max - 2012-07-11

    FWIW, the pipe(7) manpage on Linux says that from version 2.6.11 the capacity of a pipe is 64kBytes and before it was identical to the page size, e.g. 4096 on i386.