From: SourceForge.net <no...@so...> - 2010-11-02 17:01:19
Bugs item #3028676, was opened at 2010-07-12 15:12
Message generated for change (Comment added) made by das
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=112997&aid=3028676&group_id=12997

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: 70. Event Loop
Group: development: 8.6b1.1
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Lars Hellström (lars_h)
Assigned to: Daniel A. Steffen (das)
Summary: Cocoa event loop won't let external events through

Initial Comment:
I've got a program where I've chopped up a long running computation into smaller pieces, using the after 0 [after idle {next_iteration}] idiom for scheduling the next chunk of work while allowing the GUI to stay responsive; I believe this is the recommended way to do this.

Unfortunately my program, which has worked well on my old PPC macs, does not seem to work on my new Intel mac: although the GUI is being updated in response to changes in e.g. label -textvariable values, I'm no longer getting any response to pressing buttons when the computation is running. Since this includes buttons I'd press to control the program, it makes it pretty useless!

Instructions to see the problem:

1. Start wish from a terminal.
2. [source] the attached beachball-bug.tcl.
   -> This will create a window "Control panel" with three buttons "Halt", "Pause", and "Run", and two labels (each displaying the number 0).
3. Press "Run".
   -> The left label will start to count upwards, slowing down as it goes. The right label will mostly say 1 but sometimes 0.
4. Press "Pause".
   -> What should happen is that the counting stops and the right label changes to 2; an [after info] command (that may be typed at the prompt in the terminal) should confirm nothing is happening.
   -> However, what rather happens is that the "Pause" button becomes highlighted and *stays highlighted*, while the counting continues unaffected. The mouse pointer will soon turn into the spinning beach ball above wish windows, confirming that wish is not processing any events. At this stage, you may type one (1) command at the terminal and have wish evaluate it; if you make this command "after cancel [after info]" then you can still stop the counting, but otherwise there's no way of stopping it short of killing the wish process.

I see this problem for:
* Tk 8.6b1.2 that I've compiled myself from sources checked out from CVS.
* Tk 8.5.7 from /System/Library/Frameworks/Tk.framework/..., that I suppose Apple compiled.

I don't see this problem for:
* Tk 8.4.19 from /System/Library/Frameworks/Tk.framework/..., that I again suppose Apple compiled.
* Tk 8.5.6 that I compiled on my old mac; this is a PPC executable, but it runs (I suppose) using Rosetta.

The problematic wishes all seem to be 64-bit executables (tcl_platform(wordSize) is 8) whereas the working ones are 32-bit (tcl_platform(wordSize) is 4). I suppose this could mean the faulty/working distinction is rather one of Cocoa/Carbon, but I don't know if there is some other way to check if the 8.4.19 is Carbon or Cocoa.

----------------------------------------------------------------------

>Comment By: Daniel A. Steffen (das)
Date: 2010-11-02 10:01

Message:
Lars, thanks for looking into this!

The TkCocoa notifier is a hybrid of an embedding and an embedded notifier, see tkMacOSXNotify.c for details. The outermost event loop is driven by tcl via Tcl_WaitForEvent() and has a tk event source that processes NSEvents (i.e. embedding). That processing may however involve nested CFRunLoops (possibly running in different runloop modes), e.g. during menu tracking/window resizing/when a modal dialog is presented etc. At such time, the relationship is reversed (i.e. embedded): the runloop is driven externally and our runloop observer processes tcl events via Tcl_ServiceAll().

Note that as far as tk goes, the while loop at issue here is only ever used in the second case (i.e. a nested runloop invocation). For typical tcl use as in the testsuite, that while loop should never be used at all (the notifier must have been switched to embedded-enabled mode via Tcl_ServiceModeHook() for it to be used). OTOH this is the normal mode of operation when tcl is embedded in a Cocoa app that drives its own event loop that is unaware of tcl.

As far as getting stuck in this while loop, Tcl_ServiceAll() is supposed to return true as long as it has processed an event, so this might be due to new tcl events continuously arriving. From the backtrace you posted, it looks like an after event called from TclServiceIdle() is involved that ends up calling a nested blocking [after]?

Do note that running the tcl testsuite is insufficient to guard against regressions in this area, as it does not cover the notifier in embedded mode. The tk testsuite for TkCocoa as well as TkX11, along with manual testing of the tk widget demo, is required (in particular, test that the animated demos correctly continue animating during menu tracking/window resizing, invocation of a menu item via menu or key binding, and while a modal dialog & a modal sheet & a file open dialog/sheet are displayed; all of these involve slightly different uses of nested runloops that have caused problems in the past).

----------------------------------------------------------------------

Comment By: Lars Hellström (lars_h)
Date: 2010-11-02 04:06

Message:
Continued working on this. I request comments from someone who understands how a Tcl notifier is normally supposed to work.

If I understand the Tcl Notifier manpage correctly, a custom notifier embedded into an external event loop (which tclMacOSXNotify.c appears to be) is supposed to call Tcl_ServiceAll to make sure Tcl events are serviced, *but only once* on each invocation. This makes sense, since Tcl_ServiceAll already contains a loop over the event queue. However, when I try this (reduce the while loop to just a Tcl_ServiceAll call), parts of the Tcl test suite seem to become very slow, and http11.test gets stuck entirely.

Reading up on CFRunLoop*, I think I understand why. UpdateWaitingListAndServiceEvents is set up as an observer callback, and the kCFRunLoopBeforeWaiting case in which it calls Tcl_ServiceAll is "I'm about to go to sleep now", so this is the CF equivalent of an [after idle] callback. The effect of servicing Tcl events here is that they only get serviced after some other event has occurred, and the CF runloop is preparing to go idle again. Hence a process must receive some external event between each invocation of UpdateWaitingListAndServiceEvents, and if it never does (as might be the case for some thread of http11.test) then it is stuck.

That doesn't mean the while loop is the right thing to do, however; I rather suspect it could be covering up a second bug -- namely that Tcl events are only serviced at idle time -- but I haven't studied the code in enough detail to say that for sure. There could be other situations than this kCFRunLoopBeforeWaiting callback in which Tcl events are serviced, and in that case it might be OK.
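
As an illustration of the concern above, here is a minimal toy model in C. It is not the actual notifier code: ToyQueue, toy_service_all, drain_in_loop, service_once and the max_spins cap are all hypothetical names invented for this sketch. It assumes each serviced event queues exactly one follow-up (roughly what a self-rescheduling [after] chain does) and contrasts a loop of the form while (Tcl_ServiceAll() && waitTime == 0) {} with a single service call per invocation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an event queue: only counts events (assumption, not Tcl's real queue). */
typedef struct {
    int pending;       /* events waiting to be serviced */
    int serviced;      /* total events serviced so far */
    bool reschedules;  /* true if every handler queues one follow-up event */
} ToyQueue;

/* Hypothetical analogue of Tcl_ServiceAll(): service everything queued at
 * entry and report whether any event was processed. */
static bool toy_service_all(ToyQueue *q) {
    int batch = q->pending;
    q->pending = 0;
    for (int i = 0; i < batch; i++) {
        q->serviced++;
        if (q->reschedules) {
            q->pending++;  /* handler schedules its own successor */
        }
    }
    return batch > 0;
}

/* Same shape as the loop quoted in the report, plus an artificial cap
 * (max_spins) so the sketch terminates. Returns the number of passes. */
static int drain_in_loop(ToyQueue *q, int wait_time, int max_spins) {
    assert(max_spins > 0);
    int spins = 0;
    while (spins < max_spins && toy_service_all(q) && wait_time == 0) {
        spins++;
    }
    return spins;
}

/* Single-call variant: one pass, then hand control back to the caller.
 * A true result means "work was done, ask to be called again". */
static bool service_once(ToyQueue *q, int wait_time) {
    return toy_service_all(q) && wait_time == 0;
}
```

Under these assumptions, drain_in_loop on a self-rescheduling queue stops only at the artificial cap, whereas service_once returns after one pass with work still pending, which is the point at which control would go back to the external event loop.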

Also, upon randomly browsing documentation, it has occurred to me that a minimal fix to the problem at hand could be to set up a version 0 CFRunLoopSource, and replace the while loop

    while ( Tcl_ServiceAll() && tsdPtr->waitTime == 0) {}

with something like

    if ( Tcl_ServiceAll() && tsdPtr->waitTime == 0) {
        CFRunLoopSourceSignal( tsdPtr->wantCallbackRLSource );
    }

to make sure the runloop thinks "oh, something happened", and go idle again a bit later. But I *really* want a second pair of eyes to take a look at the problem as a whole before attempting that.

----------------------------------------------------------------------

Comment By: Lars Hellström (lars_h)
Date: 2010-10-28 13:00

Message:
Have finally gotten around to running this in a debugger. When Wish is in its unresponsive state, it appears to be sitting in the while loop at line 1417 of tclMacOSXNotify.c:

    while (Tcl_ServiceAll() && tsdPtr->waitTime == 0) {}

The full call stack is typically something like

#0 0x7fff823fb2fa in mach_msg_trap
#1 0x7fff823fb96d in mach_msg
#2 0x7fff802493c2 in __CFRunLoopRun
#3 0x7fff8024884f in CFRunLoopRunSpecific
#4 0x10038c799 in Tcl_Sleep at tclMacOSXNotify.c:1545
#5 0x100350d73 in AfterDelay at tclTimer.c:1042
#6 0x100350446 in Tcl_AfterObjCmd at tclTimer.c:843
#7 0x100249c3c in NRRunObjProc at tclBasic.c:4371
#8 0x1002499d1 in TclNRRunCallbacks at tclBasic.c:4318
#9 0x10024c24b in TclEvalObjEx at tclBasic.c:5893
#10 0x10024c1ef in Tcl_EvalObjEx at tclBasic.c:5874
#11 0x100351019 in AfterProc at tclTimer.c:1173
#12 0x1003501a6 in TclServiceIdle at tclTimer.c:745
#13 0x100326bad in Tcl_ServiceAll at tclNotify.c:1087
#14 0x10038c4bb in UpdateWaitingListAndServiceEvents at tclMacOSXNotify.c:1417
#15 0x7fff8026d077 in __CFRunLoopDoObservers
#16 0x7fff802490cf in __CFRunLoopRun
#17 0x7fff8024884f in CFRunLoopRunSpecific
#18 0x7fff8356691a in RunCurrentEventLoopInMode
#19 0x7fff8356671f in ReceiveNextEventCommon
#20 0x7fff835665d8 in BlockUntilNextEventMatchingListInMode
#21 0x7fff844ec29e in _DPSNextEvent
#22 0x7fff844ebbed in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:]
#23 0x1001309f8 in -[TKApplication(TKNotify) nextEventMatchingMask:untilDate:inMode:dequeue:] at tkMacOSXNotify.c:63
#24 0x7fff846cc7c1 in -[NSCell trackMouse:inRect:ofView:untilMouseUp:]
#25 0x7fff846fd536 in -[NSButtonCell trackMouse:inRect:ofView:untilMouseUp:]
#26 0x7fff846cb4b5 in -[NSControl mouseDown:]
#27 0x7fff845e5763 in -[NSWindow sendEvent:]
#28 0x7fff8451aee2 in -[NSApplication sendEvent:]
#29 0x100130b4e in -[TKApplication(TKNotify) sendEvent:] at tkMacOSXNotify.c:85
#30 0x10013112c in TkMacOSXEventsCheckProc at tkMacOSXNotify.c:307
#31 0x100326a28 in Tcl_DoOneEvent at tclNotify.c:964
#32 0x100026a30 in Tk_MainLoop at tkEvent.c:2134
#33 0x10003add1 in Tk_MainEx at tkMain.c:363
#34 0x10000545a in main at tkAppInit.c:70

The AfterDelay and stuff is because beachball-bug.tcl contains an [after 5] (instead of some actual processing) to slow down the counter. The while loop mentioned above is part of #14 (UpdateWaitingListAndServiceEvents), whereas the nearest function from tkMacOSXNotify is #30 (TkMacOSXEventsCheckProc). I suspect this confirms Daniel's suspicions below.

----------------------------------------------------------------------

Comment By: Lars Hellström (lars_h)
Date: 2010-07-16 00:41

Message:
A few additional observations:

1. The following script is sufficient to show how wish goes spinning beachball when one tries to press "Stop":

    pack [label .l -textvariable ::count] -side top
    pack [button .b -text "Start" -command {step}] -side left
    pack [button .b2 -text "Stop" -command {after cancel [after info]}] -side right
    set ::count 0
    proc step {} {
        incr ::count
        after 0 {after idle step}
    }

   The symptoms are a bit different however: in this case, the counting stops (at least visibly).

2. If the [after 0] is changed to [after 1] then everything works as expected.

(I still haven't tried debugging it with gdb, but it'd probably be a good exercise.)

----------------------------------------------------------------------

Comment By: Lars Hellström (lars_h)
Date: 2010-07-12 23:28

Message:
Confirming all the working wishes are Carbon (and the buggy ones Cocoa). Is there a way of forcing a build with Carbon? OTOH, I suppose that would mean the executable becomes 32-bit, and wouldn't be able to [load] extensions built 64-bit. Sigh.

A cvs update of the tcl and tk sources changes nothing that seems related to events, save a few test files. Haven't tried building yet, but I'm probably seeing this for a version later than the changes you mentioned.

----------------------------------------------------------------------

Comment By: Daniel A. Steffen (das)
Date: 2010-07-12 16:11

Message:
The tk/macosx/README tells you how to differentiate between Carbon and Cocoa.

However this is most likely an issue with the tcl notifier, and so not a tk bug strictly speaking (although it will only manifest in tk cocoa, which uses the embedded mode of the tcl notifier, unlike tk carbon).

Can you confirm that this happens with the latest tcl from CVS HEAD? There were a bunch of fixes dealing with recursive event loop invocations not that long ago.

In any case, to debug this, tclMacOSXNotify.c is what would need to be looked at/stepped through. Running sample/taking a stackshot or attaching with gdb when things are spinning would also be useful. I don't anticipate being able to look into this myself anytime soon however.