From: Jonathan B. <jbr...@ea...> - 2006-11-04 20:53:04
|
On Sat, 2006-11-04 at 09:33 -0500, Arthur wrote: > Hi folks - > > Bruce is tied up, I know, trying to get his book to the publisher and > will not get to looking to what I am proposing re: numpy compatibility > until he gets that behind him. > > In the meantime --- > > A side effect of getting myself involved in that process was getting a > more serious look at Vpython4.0 beta, which I had been avoiding because > of the known bugs. And one look at the lighting_and_texture.py does get > one excited. > > The mouse.getclick() issue - which I first saw manifested in running the > dipole.py demo - seems the only deal killer bug. > > So I have taken a shot at it. > > After gaining what - given my C++ (lack of) expertise - is only a vague > understanding of what is going on in the relevant code, I took a shot - > just changing line 185 of the mouseobject.cpp from: > > shared_ptr<event> ret = events.py_pop(); > > to > > shared_ptr<event> ret = events.pop(); > > which has the effect - in the context of the code - of short-circuiting > the running of code that involves the direct manipulation of the Python > gil (global interpreter lock) when processing the return value from a > mouseclick. > > And - lo and behold - from the best I can presently determine, the > mouse.getclick() crash issues go away. Very interesting. The function of py_pop() vs pop() is related to shutdown. Keep the following rule in mind: The Python interpreter itself is not thread-safe. For any C/C++ code to invoke any python function, including Py_INCREF() and kin, that code must hold the Global Interpreter Lock. In native/plain/pure python multithreaded code, both threads run because each one periodically releases and re-acquires the lock. VPython's rendering thread directly accesses Python objects very rarely, and therefore tries hard to avoid holding the GIL. The rendering of any array-like object (curve, convex, faces) must access the elements of the Numeric array, and shutdown code must tell the interpreter to force any infinite loops to exit. In the case of array access, no Python functions are called because the numeric/numarray wrappers for those particular access functions dove below the published API and accessed the pointers directly. Scenario: - user program calls mouse.get_click() when no events are queued up. This eventually dispatches to events.pop(), which blocks indefinitely until user action through the GUI generates a click event. - Instead of generating a click event, the user is done with the program and clicks on the window-close button. - The window management code closes out the windows and exits the background/worker/renderer thread. As the last action of the outgoing thread, (see src/python/wrap_display_kernel.cpp), force_py_exit() is called. This function adds a "pending call" to the interpreter to call std::exit() (which is also called by sys.exit()). This is necessary to force an infinite loop in the user's code (a very common idiom) to terminate. But - in order to add the pending call, the shutdown code must make a call into the interpreter. Therefore, it must acquire the GIL. Since the interpreter did not release the lock when it called events.pop(), a deadlock occurred. The interpreter thread is waiting for a user event that will never arrive, and the gui thread is waiting for the interpreter to release a lock that it never will. My solution to this problem was to explicitly release the GIL when popping events off any of the event queues (which is what py_pop() does), and periodically wake up to call any pending events (see include/util/atomic_queue.hpp, src/core/util/atomic_queue.cpp). Apparently, the code that makes this happen has a bug in it. Your options are to either fix that code or come up with a new shutdown strategy and implement it. > Problem is that I suspect Jonathan had his reasons for doing his Python > gil thing, and I cannot determine what unwanted side effects this change > might have, e.g. on platforms other than windows. I think the "failure to exit" problem will persist on Windows, although IDLE may attempt to force program termination in a way that I'm not expecting. > Jonathan is probably the only person in a position to comment here. > > You there, Jonathan? Yep. Hope this information helps. -Jonathan |