As detected by Vikki on thread
when a write on the notifier's triggerPipe returns EAGAIN (because the stream of notifier configuration updates is too fast for the thread to drain the pipe), we now panic, starting from this commit:
File unix/tclUnixNotfy.c
2009-12-16 23:25:59 - part of checkin [e52afe7fa4] on branch trunk - Fix gcc warning: ignoring return value of ‘write’, declared with attribute warn_unused_result CONSTify functions TclpGetUserHome and TclSetPreInitScript (TIP #27) (user: nijtmans ) [annotate]
This too heavy handed. As explained in the above thread, in this situation the notifier will be constantly woken up, so losing one of these "wake-ups" is not an issue (one wake up can serve several updates).
Note that this only applies to the \0 signal (wake-up).
For the 'q' signal (for notifier finalization), things are trickier:
- blocking mode is still out of the question since we're holding the notifierMutex. In this situation it would lead to a deadlock.
- losing the 'q' is problematic since we'd never end the finalization
So I would advise a switch to a lockless scheme for the 'q' (which is okay since it is a monotonic flag).
Assigning to Jan as author of the offending commit ;-)
Should be fixed now in core-8-5-branch. trunk still to be done.
The real problem (the notifyer losing wake-ups) is still
not fixed, but this should at least be enough to make
the described bug disappear in the upcoming 8.5.12.
No, the notifier is not losing wake-ups when the pipe is full !
The purpose of the \0 send over the triggerPipe is to kick the notifier out of the current select(), recompute the select masks according to the new state of all fileevents in all threads, and back to blocking select() with the new state.
Hence, a single \0 can serve many updates. When the pipe is full of 4096 wake-ups, the 4097th is a waste anyway.
As mentioned earlier, only the 'q' is problematic. My suggestion for this is to replace the 'q' with simply setting a global flag pleaseNotifierDie, and sending a regular \0 over the pipe, with the same waking effect. The flag, checked in the loop at the same place where the 'q' is sought, would break out of the loop and terminate the notifier. In the context of the original poster, this would even yield a faster exit since this would bypass the 4096 enqueued (useless) updates.
Comments ?
>Comments ?
I think your suggestions are good. But the title of this issue
is about panicing when it shouldn't, well it doesn't any
more now in the described situation. ;-)
What I would like very much is a test-case, showing
the problem. I know the current nofitier has
problems, but it's difficult to tackle that
without a test-case. Since 8.6 is
multi-threaded by default, I only
would make such further changes
in 8.6, and possibly backport
only afther they proved theyr
effect. By any means: feel
free to do that! Apparently
you experimented more with
the notifier than I did.
Thanks!
Jan Nijtmans
Forgot to mention: merged to trunk now as well.
FWIW, the pipe(7) manpage on Linux says that from version 2.6.11 the capacity of a pipe is 64kBytes and before it was identical to the page size, e.g. 4096 on i386.