Re: [java-gnome-hackers] Ugly deadlock with Dialog.run()
From: Andrew C. <an...@op...> - 2010-08-11 23:44:23
On Wed, 2010-08-11 at 15:02 +0200, Guillaume Mazoyer wrote:
>
> Exception in thread "main" java.lang.IllegalArgumentException: You asked
> for ordinal 0 which is not known for the requested Constant type
> org.gnome.gtk.ResponseType
>
Yeah, oops. I've [now] seen that a few times too.
On Wed, 2010-08-11 at 16:42 +0200, Vreixo Formoso wrote:
> > Attached is a bundle, so if you want to merge & try it, go ahead :)
>
> I'm getting the same error as Guillaume.
Ok
> The problem is the stop
> condition in the while.
The usual way to do that is
    while (Gtk.eventsPending()) {
        Gtk.mainIterationDo(false);
    }
where passing "true" instead of "false" makes the call blocking. I've
never tried it with true. But anyway.
> You also need a this.present() so the Dialog is actually shown.
Yeah, I realized that when I was out running yesterday afternoon and
thinking about this... :)
> Ok once I fixed this I tested it, but it doesn't work either.
Oh well
> The reason
> is the same as in the original problem.
That's what I figured.
> When native
> gtk_main_iteration_do() is called, we hold a 2-level nested lock: the
> one of the signal handler, plus the one in GtkMain.mainIterationDo().
>
> I really think the easier approach is...
Ok, before we go back to that, one more thought:
The reason I roughed up that patch was to give me a starting point to
suggest something else. I don't know if it helps, but:
What would happen if we fired off a tiny little worker thread inside our
run()?
I mention this because I used to do that a lot in my 2.x days — [what
are now called] Button.Clicked handlers almost always started their own
worker thread to do other things.
So I'm wondering if it'd help here... Inside run()
1. fire off a worker thread,
2. have that worker thread be what calls
maybe in some combination with Thread's join()...?
Anyway.
> to replace the synchronize blocks
> with another kind of Lock, as I have described in my previous mail.
Yeah, that's what we had from January as the likely necessary change.
There are about 57 manually written synchronized () blocks in our code.
So catching them all will take a bit of work.
More pertinently, that would be a LOT of effort to essentially make *1*
method work. I don't want to say "we don't have coverage of Dialog's
run()" but...
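
To make the trade-off concrete, here is a minimal sketch of what "another
kind of Lock" could buy us. It assumes the attraction is that
java.util.concurrent.locks.ReentrantLock exposes its hold count, so a
thread can drop every nested hold and take them back later, which a
synchronized block cannot do. FullReleaseSketch, releaseAll() and
reacquire() are hypothetical names for illustration, not java-gnome code,
and this is not necessarily the design described in the earlier mail.

```java
import java.util.concurrent.locks.ReentrantLock;

final class FullReleaseSketch {
    // Hypothetical stand-in for the one big GDK lock; not java-gnome code.
    static final ReentrantLock LOCK = new ReentrantLock();

    /** Drop every nested hold the current thread has; return the count. */
    static int releaseAll() {
        int holds = LOCK.getHoldCount();
        for (int i = 0; i < holds; i++) {
            LOCK.unlock();
        }
        return holds;
    }

    /** Take the lock back the same number of times. */
    static void reacquire(int holds) {
        for (int i = 0; i < holds; i++) {
            LOCK.lock();
        }
    }

    /** Returns true if a worker got the lock while we had let go of it. */
    static boolean demo() {
        final boolean[] workerGotLock = { false };
        Thread worker = new Thread(() -> {
            LOCK.lock();
            try {
                workerGotLock[0] = true;
            } finally {
                LOCK.unlock();
            }
        });

        LOCK.lock(); // outer hold, as if taken around the main loop
        LOCK.lock(); // inner hold, as if taken by the nested call
        int holds = releaseAll();
        try {
            worker.start();
            worker.join(); // the worker can finish only because we let go
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            reacquire(holds);
        }
        LOCK.unlock();
        LOCK.unlock();
        return workerGotLock[0];
    }

    public static void main(String[] args) {
        System.out.println("worker got the lock: " + demo());
    }
}
```

The sketch only shows that the mechanics are possible with an explicit
lock; it says nothing about whether letting go at that moment is safe.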
++
I wanted to have another look at the deadlock. Here it is:
"Thread-0" prio=10 tid=0x0000000001643800 nid=0x7416 waiting for monitor entry [0x00007ff241449000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.gnome.gtk.GtkProgressBar.pulse(GtkProgressBar.java:70)
        - waiting to lock <0x00007ff282b63bf0> (a org.gnome.gdk.Gdk$Lock)
        at org.gnome.gtk.ProgressBar.pulse(ProgressBar.java:137)
        at DialogDeadlock$Worker.run(DialogDeadlock.java:124)
        at java.lang.Thread.run(Thread.java:636)

"main" prio=10 tid=0x000000000146f000 nid=0x740b runnable [0x00007ff29d044000]
   java.lang.Thread.State: RUNNABLE
        at org.gnome.gtk.GtkDialog.gtk_dialog_run(Native Method)
        at org.gnome.gtk.GtkDialog.run(GtkDialog.java:228)
        - locked <0x00007ff282b63bf0> (a org.gnome.gdk.Gdk$Lock)
        at org.gnome.gtk.Dialog.run(Dialog.java:244)
        at DialogDeadlock$1.onClicked(DialogDeadlock.java:65)
        at org.gnome.gtk.GtkButton.receiveClicked(GtkButton.java:392)
        at org.gnome.gtk.GtkMain.gtk_main(Native Method)
        - locked <0x00007ff282b63bf0> (a org.gnome.gdk.Gdk$Lock)
        at org.gnome.gtk.GtkMain.main(GtkMain.java:82)
        at org.gnome.gtk.Gtk.main(Gtk.java:119)
        at DialogDeadlock.main(DialogDeadlock.java:91)
Hm.
[Actually, I added the second "locked" line because I know it's there;
Java's thread dump doesn't show it because that lock was taken with
MonitorEnter(), not a synchronized block.]
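
The shape of that dump can be reproduced without GTK at all. The following
is a plain-Java model, not the DialogDeadlock test case: LOCK stands in
for the Gdk$Lock instance, the two nested synchronized blocks stand in for
the holds taken around gtk_main() and by Dialog.run(), and the polling loop
stands in for the nested native main loop that never returns. All names
here are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

final class NestedLockModel {
    // Stand-in for the Gdk$Lock instance seen in the dump.
    static final Object LOCK = new Object();

    /** Returns the worker's state as seen while main holds LOCK twice. */
    static Thread.State observeWorker() {
        final CountDownLatch started = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            started.countDown();
            synchronized (LOCK) {
                // where the ProgressBar pulse() call would happen
            }
        }, "Thread-0");

        Thread.State seen = null;
        try {
            synchronized (LOCK) {       // as if held around gtk_main()
                synchronized (LOCK) {   // as if taken again by Dialog.run()
                    worker.start();
                    started.await();
                    // Give the worker up to five seconds to hit the monitor.
                    long deadline = System.nanoTime() + 5_000_000_000L;
                    while (worker.getState() != Thread.State.BLOCKED
                            && System.nanoTime() < deadline) {
                        Thread.sleep(1); // sleeping does not release LOCK
                    }
                    seen = worker.getState();
                }
            }
            worker.join(); // only now can the worker get in and finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("worker state: " + observeWorker());
    }
}
```

In the model the main thread eventually leaves both blocks, so the worker
gets through; in the real case the nested loop keeps running with the lock
held, and the worker stays blocked.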
Now that I look at this, I'm not sure that manually letting the entire
lock go is the right thing.
Back to basics:
The whole point of GTK's thread awareness is the requirement that only 1
thread be accessing GTK at a time. Fine. What that means is "one thread
can make a call (that has a cascading effect, emits signals, etc etc),
but there's only one processing chain until that cascade is done".
That way, between cascades, GTK is in a safe state.
The complication is the main loop.
gtk_main() actually releases the lock when it's idle. They don't really
tell you that, but this mimics the behaviour of Object's wait() which
also relinquishes its lock, then later tries to reacquire it.
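
For reference, the Object.wait() half of that comparison is easy to check
in plain Java. The sketch below (hypothetical class and field names)
holds a monitor twice, nested, and then waits on it; the worker can only
set its flag if wait() really gave up the monitor, nested holds included.
It says nothing about what gtk_main() itself does.

```java
final class WaitReleasesLock {
    static final Object LOCK = new Object();
    static boolean workerRan; // only touched with LOCK held

    /** Returns true if the worker entered LOCK while we were in wait(). */
    static boolean demo() {
        Thread worker = new Thread(() -> {
            synchronized (LOCK) {
                workerRan = true;
                LOCK.notifyAll();
            }
        });

        synchronized (LOCK) {
            synchronized (LOCK) { // nested hold on the same monitor
                workerRan = false;
                worker.start();
                long deadline = System.currentTimeMillis() + 5000;
                try {
                    while (!workerRan) {
                        long left = deadline - System.currentTimeMillis();
                        if (left <= 0) {
                            break; // give up after five seconds
                        }
                        LOCK.wait(left); // releases the monitor while idle
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return workerRan;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("worker ran during wait(): " + demo());
    }
}
```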
So my thought is that that is the only known safe point: when the main
loop thinks it is ok to relinquish the lock, then something else can
grab it and do GTK calls.
Now you'll remember that we have access to the gdk_threads_enter() and
gdk_threads_leave() functions. We override them.
But maybe, just maybe, that means that if we *also* do our own main loop
iteration, then we can control the safe point ourselves, rather than it
being magic.
The point is not "do I have the lock?" but "is GTK at a safe point where
another thread can take over?"
Interesting.
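
A toy model of that idea, with everything in it hypothetical: a loop that
holds the lock while it drains pending events, and lets go only once it
is idle. The event names, the SafePointLoop class and the use of
ReentrantLock are assumptions for the sake of illustration; this is not
how GtkMain is written, and whether our own iteration could really do
this around gtk_main_iteration_do() is exactly the open question.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

final class SafePointLoop {
    // Hypothetical stand-in for the GDK lock.
    static final ReentrantLock LOCK = new ReentrantLock();

    /** Returns the order in which work happened. */
    static List<String> run() {
        final List<String> log = new ArrayList<>(); // touched only with LOCK held
        Queue<String> events =
                new ArrayDeque<>(Arrays.asList("expose", "clicked"));
        Thread worker = new Thread(() -> {
            LOCK.lock(); // blocks until the loop reaches its safe point
            try {
                log.add("worker");
            } finally {
                LOCK.unlock();
            }
        });

        LOCK.lock();
        try {
            worker.start();
            while (!events.isEmpty()) {
                // One cascade at a time, lock held throughout.
                log.add("handled " + events.remove());
            }
            // Idle: this is the safe point, so let someone else in.
            LOCK.unlock();
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                LOCK.lock(); // take the lock back before carrying on
            }
            log.add("resumed");
        } finally {
            LOCK.unlock();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The worker is started early on purpose: it cannot log anything until the
loop has finished its cascades and chosen to release the lock.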
++
This is all just speculation.
But if you go and fire off a nested main loop (which is what
gtk_dialog_run() does, which is what calling gtk_main() a second time
does) and you are *inside* a signal handler already, then ... what?
Hm.
AfC
Sydney