Thread: [java-gnome-hackers] Duplicating assertion `GDK_IS_WINDOW (window)' failed crash
From: Andrew C. <an...@op...> - 2010-12-20 13:05:51
I still don't have any insight into what's going on with
https://bugzilla.gnome.org/show_bug.cgi?id=637132#c2
but I have come up with three useful diagnostics:

1) manually print a message when an object gets finalized

We have a fair bit of printf infrastructure for debugging memory
management; in this case I think all you really need is a message when a
GObject Proxy is getting finalized:

=== modified file 'src/bindings/org/gnome/glib/Object.java'
--- src/bindings/org/gnome/glib/Object.java  2010-01-06 05:24:40 +0000
+++ src/bindings/org/gnome/glib/Object.java  2010-12-20 10:59:56 +0000
@@ -111,7 +111,7 @@
      * continue to exist quite happily.</i>
      */
     protected final void release() {
-        if (Debug.MEMORY_MANAGEMENT) {
+        if (true) {
             System.err.println("Object.release()\t\t" + this.toString());
             System.err.flush();
         }

2) ensure WARNINGs in finalizer thread aren't missed out on

Since exceptions thrown in finalizers are suppressed, we're missing out
on the errors occurring during GC. It is almost tempting to call
ExceptionDescribe() here, although if we did that you'd see two stack
traces for any normal (heh) CRITICAL.

=== modified file 'src/jni/bindings_java_util.c'
--- src/jni/bindings_java_util.c  2010-01-06 07:07:08 +0000
+++ src/jni/bindings_java_util.c  2010-12-20 11:25:56 +0000
@@ -396,6 +396,9 @@
 
     msg = g_strdup_printf("%s-%s\n%s", log_domain, level, message);
 
+    g_printerr("DANGER %s\n", msg);
+    fflush(stderr);
+
     bindings_java_throwByName(env, "org/gnome/glib/FatalError", msg);
 
     g_free(msg);

3) do crazy amounts of GC

I wrote up a thread to do nothing but invoke the garbage collector ...
frequently. :)

=== modified file 'src/quill/ui/UserInterface.java'
--- src/quill/ui/UserInterface.java  2010-11-05 13:05:45 +0000
+++ src/quill/ui/UserInterface.java  2010-12-20 12:59:19 +0000
@@ -68,7 +68,25 @@
     }
 
     private void setupWindows() {
+        final Thread t;
+
         primaries = new HashSet<PrimaryWindow>(3);
+
+        t = new Thread() {
+            public void run() {
+                while (true) {
+                    try {
+                        Thread.sleep(1000);
+                    } catch (InterruptedException e) {
+                        // TODO Auto-generated catch block
+                        e.printStackTrace();
+                    }
+                    System.gc();
+                }
+            };
+        };
+        t.setDaemon(true);
+        t.start();
     }

hahah...

Anyway, the result is that instead of it taking a whole lot of
operations and 10s of minutes of typing, with Quill I can trigger the
crash *immediately*. Just press Ins and try to insert a segment.

AfC
Sydney
From: Andrew C. <an...@op...> - 2010-12-20 14:02:16
On Tue, 2010-12-21 at 00:05 +1100, Andrew Cowie wrote:
> I still don't have any insight into what's going on with

Ok, I've made some progress. Attached to that bug is a short test
program which demonstrates the problem.

The insight from speeding up the GC was that it was pressing Ins that
triggered it. Weird; what's a popup menu got to do with disappearing
TextViews? And then I realized that there aren't many places in my
program, Quill, which actually create Proxy objects for GdkWindows. The
call which does that is Widget's getWindow(), and the context menu code
in Quill is the only place that happens.

This afternoon I spent a couple hours trying to write a multithreaded
program which would demonstrate the crash, but unsuccessfully. Tonight,
with Guillaume's help, I narrowed it down to a simple program. The catch
is you need something which creates an org.gnome.gdk.Window proxy for
something other than the top level Window; many GtkWidgets actually draw
on that top level GtkWindow's GdkWindow [the whole 'client-side-windows'
branch that was merged to GTK about 2.18]. So, if you create a
[org.gnome.gdk] Window that can be finalized separately from the top
level, the program will crash.

See https://bugzilla.gnome.org/show_bug.cgi?id=637132#c6
or http://paste.pocoo.org/show/307378/

So. The question is: what the hell is wrong with our GdkWindow proxies?

AfC

P.S. Interesting different WARNINGs if you do this:

    assert (underlying != parent.getWindow());
From: Andrew C. <an...@op...> - 2010-12-21 08:14:39
On Tue, 2010-12-21 at 01:02 +1100, Andrew Cowie wrote:
> On Tue, 2010-12-21 at 00:05 +1100, Andrew Cowie wrote:
> > I still don't have any insight into what's going on with
>
> Ok, I've made some progress.

Problem located (I think): Our implementation for Widget's getWindow()
did not take a Ref.

++

Turn on debugging:

    Debug.MEMORY_MANAGEMENT = true

and

    #define DEBUG_MEMORY_MANAGEMENT TRUE

Our GObject memory management code assumes (and requires) that when we
go to make a ToggleRef, we have one (and only one) normal Ref that we
own. The addToggleRef() code drops that normal Ref.

Meanwhile, our override for getWindow() just returned the address.
Comparing to generated code, I realized our Override doesn't have a call
to bindings_java_memory_cleanup() [which I must admit I never much liked
the name of].

Adding a g_object_ref() there appears to fix the problem.

Replacing the Override with some .defs to get normal generated code for
gtk_widget_get_window() [now that it exists; it didn't a few years ago]
also appears to work [since it generates the call to
bindings_java_memory_cleanup()].

++

Hm. So while that will be a "fix", I'd like to rethink some of the code
in src/jni/bindings_java_memory.c. If we can consolidate
bindings_java_memory_cleanup() into bindings_java_memory_ref() I think
we'll be on the right track.

What worries me is that this is a general class of bug that we could be
exposed to all over the place whenever we've done an override. So the
problem needs to be engineered out.

If anyone working on this could confirm that the GdkWindows are still
getting finalized without crashing that'd be outstanding. The trick is
*is the GObject still being finalized*? It's easy to "add a
g_object_ref()" call anywhere, but if that means the Ref count never
drops to zero then you'll never get objects free()d and that wouldn't do
us any good, would it? :)

AfC
Sydney
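The reference-count arithmetic described in this message can be followed with a toy model. This is a simulation in plain Java, not java-gnome code: FakeGObject, createProxy(), finalizeProxy(), and runLifecycle() are hypothetical names invented for illustration, and the ref()/unref() pair inside createProxy() only stands in for "add a ToggleRef, then drop the normal Ref" as the message describes it.

```java
public class ToggleRefSketch {

    /** Hypothetical stand-in for a reference-counted GObject such as a GdkWindow. */
    static final class FakeGObject {
        int refCount = 1; // the one reference held by the owning widget
        boolean destroyed = false;

        void ref() {
            refCount++;
        }

        void unref() {
            refCount--;
            if (refCount == 0) {
                destroyed = true; // last reference gone: the object is freed
            }
        }
    }

    /** Models Proxy creation as described in the message (an assumption, not the real code). */
    static void createProxy(FakeGObject obj) {
        obj.ref();   // stands in for adding the ToggleRef
        obj.unref(); // drops the normal Ref the caller is assumed to own
    }

    /** Models Proxy finalization: the ToggleRef goes away with the Proxy. */
    static void finalizeProxy(FakeGObject obj) {
        obj.unref();
    }

    /** Runs one create/finalize cycle, optionally taking a Ref first. */
    static FakeGObject runLifecycle(boolean takeRefFirst) {
        FakeGObject window = new FakeGObject();
        if (takeRefFirst) {
            window.ref(); // what an added g_object_ref() would contribute
        }
        createProxy(window);
        finalizeProxy(window);
        return window;
    }

    public static void main(String[] args) {
        FakeGObject borrowed = runLifecycle(false);
        FakeGObject owned = runLifecycle(true);
        System.out.println("no Ref taken: destroyed=" + borrowed.destroyed);
        System.out.println("Ref taken: destroyed=" + owned.destroyed
                + ", refCount=" + owned.refCount);
        owned.unref(); // the owning widget finally lets go
        System.out.println("after owner unref: destroyed=" + owned.destroyed);
    }
}
```

In the model, skipping the Ref means finalizing the Proxy frees an object its owner still relies on. Taking the Ref first brings the count back to the owner's single reference, so the object is still freed once the owner lets go. Whether real GdkWindows behave the same way is exactly what the message asks someone to confirm.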
From: Andrew C. <an...@op...> - 2010-12-21 16:39:18
On Tue, 2010-12-21 at 19:14 +1100, Andrew Cowie wrote:
> On Tue, 2010-12-21 at 01:02 +1100, Andrew Cowie wrote:
> > On Tue, 2010-12-21 at 00:05 +1100, Andrew Cowie wrote:
> > > I still don't have any insight into what's going on with
> >
> > Ok, I've made some progress.
>
> Problem located

Fix merged to 'mainline' revno 776:

    Fix coding mistake leading to crashes from:

        Gdk-CRITICAL "assertion `GDK_IS_WINDOW (window)' failed"

    The leading indicator of trouble was:

        Gdk-WARNING "losing last reference to undestroyed window"

    but we were not seeing that since it was arising during the Java
    VM's Finalizer thread, and the VM is specified to swallow and
    discard Exceptions which occur in finalizers. Damn.

    Traced the problem through to Widget's getWindow(). We had written
    an Override to implement this method since at the time there was no
    accessor function. Our manually written code was erroneously not
    calling bindings_java_memory_cleanup() [which is a terrible name]
    which has the effect of taking a Ref so that
    bindings_java_memory_ref() can discard it after creating the
    ToggleRef.

    Though we could have corrected the manually written code, a strong
    gtk_widget_get_window() now exists, so replace the override with
    data in GtkWidget.defs and call generated code instead.

++

We really need to do something about this Exceptions being swallowed in
Finalizers thing.

And, bindings_java_memory_cleanup() needs to GO AWAY. Pissing me off. I
haven't been down in these layers of java-gnome for > 3 years. Now that
we know what we're doing a little better I'm thinking of merging that
function and bindings_java_memory_ref(). But that can wait.

AfC
Sydney
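The swallowed-exception behaviour mentioned above is easy to see in isolation. The following is a minimal sketch in plain Java with no java-gnome involved; the class names SwallowedFinalizerSketch and Noisy are made up for illustration. Note that finalize() and System.runFinalization() are deprecated in recent JDKs (expect compiler warnings), and that the VM does not promise to run the finalizer at all, so the sketch reports whether it ran rather than assuming it did.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SwallowedFinalizerSketch {

    /** Set if the finalizer actually ran; the VM does not promise that it will. */
    static final AtomicBoolean finalizerRan = new AtomicBoolean(false);

    /** Hypothetical object whose finalizer always throws. */
    static final class Noisy {
        @Override
        protected void finalize() {
            finalizerRan.set(true);
            // The VM ignores this exception; nothing is reported anywhere.
            throw new IllegalStateException("thrown inside finalize()");
        }
    }

    /** Returns true if the calling thread saw no exception from the finalizer. */
    static boolean callerSawNothing() {
        try {
            new Noisy();              // unreachable as soon as it is created
            System.gc();              // a request, not a guarantee
            System.runFinalization(); // likewise
            return true;
        } catch (RuntimeException e) {
            return false;             // would mean the exception escaped to us
        }
    }

    public static void main(String[] args) {
        boolean quiet = callerSawNothing();
        System.out.println("caller saw nothing: " + quiet);
        System.out.println("finalizer ran: " + finalizerRan.get());
    }
}
```

Even when the finalizer does run and throw, the calling thread carries on with no stack trace, which is why printing the message directly to stderr (as in the g_printerr() diagnostic earlier in the thread) was needed to notice the WARNING.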