Re: [java-gnome-hackers] Ugly deadlock with Dialog.run()
From: Andrew C. <an...@op...> - 2010-08-11 23:44:23
On Wed, 2010-08-11 at 15:02 +0200, Guillaume Mazoyer wrote:
>
> Exception in thread "main" java.lang.IllegalArgumentException: You asked
> for ordinal 0 which is not known for the requested Constant type
> org.gnome.gtk.ResponseType
>

Yeah, oops. I've [now] seen that a few times too.

On Wed, 2010-08-11 at 16:42 +0200, Vreixo Formoso wrote:

> > Attached is a bundle, so if you want to merge & try it, go ahead :)
>
> I'm getting the same error as Guillaume.

Ok

> The problem is the stop
> condition in the while.

The usual way to do that is

    while (Gtk.eventsPending()) {
        Gtk.mainIterationDo(false);
    }

where "true" makes it blocking. I've never tried it with true. But
anyway.

> You also need a this.present() so the Dialog is actually shown.

Yeah, I realized that when I was out running yesterday afternoon and
thinking about this... :)

> Ok once I fixed this I tested it, but it doesn't work either.

Oh well

> The reason
> is the same as in the original problem.

That's what I figured.

> When native
> gtk_main_iteration_do() is called, we hold a 2-level nested lock: the
> one of the signal handler, plus the one in GtkMain.mainIterationDo().
>
> I really think the easier approach is...

Ok, before we go back to that, one more thought:

The reason I roughed up that patch was to give me a starting point to
suggest something else. I don't know if it helps, but:

What would happen if we fired off a tiny little worker thread inside our
run()? I mention this because I used to do that a lot in my 2.x days;
[what are now called] Button.Clicked handlers almost always started
their own worker thread to do other things. So I'm wondering if it'd
help here...

Inside run()

1. fire off a worker thread,
2. have that worker thread be what calls

maybe in some combination with Thread's join()...?

Anyway.

> to replace the synchronize blocks
> with another kind of Lock, as I have described in my previous mail.

Yeah, that's what we had from January as the likely necessary change.
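For anyone following along without the January thread to hand, here is
a rough sketch of what "another kind of Lock" could buy us. It is plain
Java and an assumption about the shape of the idea, not the change
Vreixo described and not java-gnome code: LOCK, runUnlocked() and
demo() are hypothetical names, and the latch stands in for the blocking
native call. The point is only that an explicit lock reports its hold
count, so a 2-level nested hold can be dropped completely around a
blocking call and restored afterwards, which a synchronized block
cannot do.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class FullReleaseSketch {
    // Hypothetical stand-in for the single Gdk$Lock; not java-gnome API.
    static final ReentrantLock LOCK = new ReentrantLock();

    // Drop every nested hold this thread has, run the blocking call,
    // then take the same number of holds back before returning.
    static void runUnlocked(Runnable blocking) {
        final int holds = LOCK.getHoldCount();
        for (int i = 0; i < holds; i++) {
            LOCK.unlock();
        }
        try {
            blocking.run();
        } finally {
            for (int i = 0; i < holds; i++) {
                LOCK.lock();
            }
        }
    }

    // Returns true if a worker thread got the lock while the caller,
    // holding it twice, was parked inside the blocking call.
    static boolean demo() {
        final CountDownLatch workerDone = new CountDownLatch(1);
        final boolean[] workerGotLock = { false };

        Thread worker = new Thread(() -> {
            LOCK.lock();                // plays the role of ProgressBar.pulse()
            try {
                workerGotLock[0] = true;
            } finally {
                LOCK.unlock();
            }
            workerDone.countDown();
        });

        LOCK.lock();                    // level 1: the main loop's hold
        try {
            LOCK.lock();                // level 2: the call made in the handler
            try {
                worker.start();
                runUnlocked(() -> {
                    try {
                        workerDone.await();   // stands in for the native call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } finally {
                LOCK.unlock();
            }
        } finally {
            LOCK.unlock();
        }

        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return workerGotLock[0] && LOCK.getHoldCount() == 0;
    }

    public static void main(String[] args) {
        System.out.println("worker got the lock: " + demo());
    }
}
```

Whether GTK is actually in a state where another thread may safely come
in at that moment is a separate question, and this sketch says nothing
about it.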
There are about 57 manually written synchronized () blocks in our code.
So catching them all will take a bit of work. More pertinently, that
would be a LOT of effort to essentially make *1* method work. I don't
want to say "we don't have coverage of Dialog's run()" but...

++

I wanted to have another look at the deadlock. Here it is:

"Thread-0" prio=10 tid=0x0000000001643800 nid=0x7416 waiting for monitor entry [0x00007ff241449000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.gnome.gtk.GtkProgressBar.pulse(GtkProgressBar.java:70)
	- waiting to lock <0x00007ff282b63bf0> (a org.gnome.gdk.Gdk$Lock)
	at org.gnome.gtk.ProgressBar.pulse(ProgressBar.java:137)
	at DialogDeadlock$Worker.run(DialogDeadlock.java:124)
	at java.lang.Thread.run(Thread.java:636)

"main" prio=10 tid=0x000000000146f000 nid=0x740b runnable [0x00007ff29d044000]
   java.lang.Thread.State: RUNNABLE
	at org.gnome.gtk.GtkDialog.gtk_dialog_run(Native Method)
	at org.gnome.gtk.GtkDialog.run(GtkDialog.java:228)
	- locked <0x00007ff282b63bf0> (a org.gnome.gdk.Gdk$Lock)
	at org.gnome.gtk.Dialog.run(Dialog.java:244)
	at DialogDeadlock$1.onClicked(DialogDeadlock.java:65)
	at org.gnome.gtk.GtkButton.receiveClicked(GtkButton.java:392)
	at org.gnome.gtk.GtkMain.gtk_main(Native Method)
	- locked <0x00007ff282b63bf0> (a org.gnome.gdk.Gdk$Lock)
	at org.gnome.gtk.GtkMain.main(GtkMain.java:82)
	at org.gnome.gtk.Gtk.main(Gtk.java:119)
	at DialogDeadlock.main(DialogDeadlock.java:91)

Hm.

[actually I added the second "locked" line, because I know it's there;
Java's thread dump doesn't show it because it was taken with
MonitorEnter() not synchronized()]

Now that I look at this, I'm not sure that manually letting the entire
lock go is the right thing.

Back to basics:

The whole point of GTK's thread awareness is the requirement that only 1
thread be accessing GTK at a time. Fine.
What that means is "one thread can make a call (that has a cascading
effect, emits signals, etc etc), but there's only one processing chain
until that cascade is done". That way, between cascades, GTK is in a
safe state.

The complication is the main loop. gtk_main() actually releases the lock
when it's idle. They don't really tell you that, but this mimics the
behaviour of Object's wait() which also relinquishes its lock, then
later tries to reacquire it.

So my thought is that that is the only known safe point: when the main
loop thinks it is ok to relinquish the lock, then something else can
grab it and do GTK calls.

Now you'll remember that we have access to the gdk_threads_enter() and
gdk_threads_leave() functions. We override them. But maybe, just maybe,
that means that if we *also* do our own main loop iteration, then we can
control the safe point ourselves, rather than it being magic. The point
is not "do I have the lock" but "is GTK at a safe point where another
thread can take over?"

Interesting.

++

This is all just speculation. But if you go and fire off a nested main
loop (which is what gtk_dialog_run() does, which is what calling
gtk_main() a second time does) and you are *inside* a signal handler
already, then ... what?

Hm.

AfC
Sydney
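
P.S. On the Object.wait() comparison: a small plain-Java sketch of the
behaviour being referred to, offered as an illustration only. It does
not touch GTK or java-gnome, and LOCK, workerRan and demo() are
hypothetical names. It shows that wait() gives up the monitor even when
the waiting thread has entered it twice, so a second thread can get in
while the first is parked, and that the first thread holds the monitor
again when wait() returns.

```java
class WaitReleasesNestedMonitor {
    // Hypothetical stand-in for Gdk$Lock; not java-gnome API.
    static final Object LOCK = new Object();
    static boolean workerRan = false;       // guarded by LOCK

    // Returns true if the worker entered the monitor while the caller
    // was waiting with a 2-level nested hold, and the caller held the
    // monitor again once wait() returned.
    static boolean demo() {
        synchronized (LOCK) {
            workerRan = false;
        }

        Thread worker = new Thread(() -> {
            synchronized (LOCK) {           // only enterable once the waiter lets go
                workerRan = true;
                LOCK.notifyAll();
            }
        });

        boolean sawWorker;
        boolean heldAfterWait;
        synchronized (LOCK) {               // level 1, think gtk_main()
            synchronized (LOCK) {           // level 2, think a call made in a handler
                worker.start();
                try {
                    while (!workerRan) {
                        LOCK.wait();        // relinquishes both levels while parked
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                sawWorker = workerRan;
                heldAfterWait = Thread.holdsLock(LOCK);
            }
        }

        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sawWorker && heldAfterWait;
    }

    public static void main(String[] args) {
        System.out.println("worker ran during wait(): " + demo());
    }
}
```

The analogy only goes so far: here the waiting thread chooses the moment
it lets go, which is the "safe point" question in miniature.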