[java-gnome-hackers] Symbol issues with GtkBuilder
From: Andrew C. <an...@op...> - 2011-04-08 03:46:45
I had a shot at implementing GtkBuilder support in java-gnome. The public API was easy and was done in a flash. But it doesn't work.

    Exception in thread "main" java.text.ParseException: Invalid object type `GtkLabel'
        at org.gnome.gtk.Builder.addFromFile(Builder.java:107)
        at Designer.setupUserInterface(Designer.java:46)
        at Designer.main(Designer.java:80)

Invalid object type? huh?

++

I have copied in the logs of a long conversation I had on #gtk+ today with Tristan van Berkom, Benjamin Otte, and Johan Dahlin. I'd ask you to read through it.

The end result appears to be that because java-gnome reaches GTK via a shared library and not by virtue of being an executable itself, GtkBuilder won't work as is. There are two options:

1) implement a vFunc (by C side subclassing GtkBuilder, I think) to do ... something

2) manually call dlopen() with RTLD_GLOBAL which does ... something

The fact that I was able to hack (2) to work is not interesting. What's the cost of it? I assume switching it to global would mean an insane growth in the runtime size of the symbol table. Is that bad? I assume so. How do we measure it? No idea. Also, if we did (2) then we'd have to manually list every system library name in C code. That's fragile, to say the least.

On the other hand, it's not like our symbol table isn't huge already care of JNIEXPORT [which is the reason to investigate JNI's RegisterNatives, but that's another story]. But this wouldn't be read-only sharable data space in a .so, right? We want to shrink our memory footprint, not grow it!

Johan said that the vFunc was just for this sort of occasion. Ok, so presumably (1) is the better thing to do, but can anyone figure out from this conversation what, exactly, we're supposed to do there?

My branch is at 'hackers/andrew/builder'. It's a 4.0 branch. With the patch below it "works", but we can't use that patch as is.

Slightly edited IRC log follows.
As you'll read, on quite a number of occasions I point out that I really don't know enough about linking to be able to properly assess what is (or isn't) going on.

Thanks to Tristan, Benjamin, and Johan for their help getting us this far! Now, my friendly little java-gnome hackers, I need yours...

AfC
Sydney

++

AfC: Hm. I wonder why GtkBuilder would emit "Invalid object type `GtkLabel'" when parsing a .ui file?
Company: AfC: not yet called gtk_init() or so?
Company: AfC: the gtk_label_get_type() function must have been called once for the type to be registered
AfC: Company: no, gtk_init() etc called already
AfC: Company: er
AfC: what?
AfC: you're kidding.
Company: no, i'm not
Company: but i'm pretty sure there is some function that does that
AfC: You mean someone has to manually call gtk_label_new() and gtk_foobar_new() and gtk_everything_else_new() before they can use GtkBuilder?
Company: AfC: no
tristan: AfC, you you doing this in C ?
Company: AfC: i mean someone has to call gtk_label_get_type()
tristan: AfC, there is a function that creates "gtk_label_get_type" from "GtkLabel"
AfC: tristan: no, I'm finally getting around to porting our libglade coverage in java-gnome to GtkBuilder instead. It was easy enough, except that I've hit this
tristan: it's used along with g_module_symbol() to initialize types
Company: ahhh
Company: more magical than i'd have thought
tristan: AfC, I suspect... not sure... but suspect that the binding has problems getting the type from name
AfC: bloody hell. Do you have any idea how much we have *not* exposed the _get_type() functions? They're meaningless in a language binding.
tristan: I think bindings are supposed to override that part of the builder
tristan: I mean... do you want builder to create a raw GtkLabel ? or do you want it to create some java-gnome wrapper object ?
tristan: I suspect that in gtkmm they have some derived code that returns a wrapper
***tristan checks with builder how thats done for bindings...
I dont recall
Company: i'd expect you let the builder create the label
tristan: AfC, GtkBuilderClass.get_type_from_name()
Company: and then wrap it
AfC: tristan: the proxy object (to use Owen's terminology) is already fully engineered, that works fine [and worked fine with libglade].
AfC: tristan: we're stuck at add_from_file
tristan: Company, you would expect that to be done where then ?
Company: tristan: in get_object()
AfC: Company: that's what I'd expect, and that's what we're doing
tristan: AfC, I dont know how it worked fine in the past and how it worked with libglade.
AfC: Company: we're not even at get_object() yet. Given any pointer we can ... call g_type_name() on ... we can construct our proxy, no problem
Company: AfC: as tristan says, that's not enough
AfC: Company: but I can't understand why gtk_builder_add_from_file() can't hack it when glade_xml_parse() managed it
AfC: Well.
tristan: different languages seem to have totally different binding sets
tristan: for instance, I'm not sure that when you create a derived widget with gtkmm, that it even registers a GType for the new class
tristan: when you create an object with python... it makes a GType
tristan: so they are bidirectional
tristan: in a sense
Company: AfC: glade called all gtk get_type() functions afair
Company: AfC: was always tricky when gtk added new widgets and libglade hadn't been updated yet
tristan: Company, would it be enough to do it in get_object() ?
tristan: if I go and fetch a window or box... and then itterate it's children
AfC: tristan: no we don't do anything like that. Never needed to. As far as C is concerned it's just a GtkWhatever. If the developer has created a Java side subclass that's their business. Things Just Work™. But that's not the problem here
tristan: do I get widgets as children or wrappers ?
Company: tristan: you get wrappers
AfC: Company: ... called all gtk_get_type_ functions. Oh.
Yes, well, that'd do it
Company: AfC: so why does gtkbuilder not find the existing C function gtk_label_get_type() when trying to g_module_symbol() it?
tristan: AfC, they *kindof* work then... if you cannot get the derived class details in GType terms in C... then ... it's impossible for me to make a useful java-gnome plugin for Glade
Company: AfC: in java-gnome i mean
AfC: Company: hm?
tristan: because I cannot read any properties or signals that the java-gnome object may have added
AfC: Company: Code is like this:
AfC: Gtk.init()
AfC: builder = new Builder();
AfC: builder.addFromFile("simple.ui");
AfC: crash
AfC: (well, GError → Exception → terminate)
Company: AfC: yes, because java-gnome is doing bad things
Company: AfC: because g_module_symbol(null_module, "gtk_label_get_type") fails
Company: AfC: and that's java-gnome's fault somehow
AfC: Company: sorry, what is that?
Company: AfC: that is C
AfC: Company: look I know you guys don't give a shit about language bindings, but maybe we can leave off the "fault" stuff and assume that the project means well, and has done what was asked of it over the years?
AfC: So if there's a new initialization pattern that needs to be supported, fine
Company: AfC: there isn't
Company: AfC: there's just a new way to look up class names
Company: AfC: and it's done by inspecting the running code for an exported C function
AfC: Company: so you're saying we instantiated the GtkBuilder object somehow wrong?
tristan: AfC, I do give a shit, and I also wish I could introspect more from language-binding created classes... I dont see exactly what could be going wrong
AfC: tristan: no, me neither;
tristan: if your code is running with libgtk+, then "gtk_label_get_type" MUST be there
tristan: no reason for the C code calling g_module_symbol to fail.
AfC: tristan: what I wrote above is == to this C code, right?
Company: AfC: no, i'm saying that a java-gnome application looks different to C code than a C application
AfC: gtk_init()
AfC: gtk_builder_new()
AfC: gtk_builder_add_from_file()
tristan: AfC, yes that should really definitely work
AfC: well, that's the sequence of C calls we're making. :(
Company: AfC: no, deep down it's not
Company: AfC: are you linking your java app with gcc?
Company: AfC: is it doing the equivalent of gcc -shared -lgtk-3 ?
AfC: Company: This is GTK 2, but yes
AfC: LINK=/usr/bin/gcc-4.4 -shared + pkg-config --libs
tristan: right, I suppose there could be low-level trickery, that some obscure calls to ldd/nm might reveal
Company: AfC: so if you run ldd on the generated binary, it will list libgtk just like when running it on /usr/bin/gtk-demo ?
AfC: Company: there's no binary. There's a shared library loaded at runtime, but I will double check
Company: AfC: aha!
tristan: i.e. what is the actual symbol name... but if it's linking to a real libgtk+ lib... I still dont see why there would be no "T gtk_label_get_type" in there
Company: AfC: shared libraries loaded at runtime may be loaded with hidden symbols
Company: AfC: so that g_module_symbol() would not find the symbol
AfC: Company: um, ok, that's interesting
AfC: notes that the rest of the GNOME stack works fine, and has for 9 years
Company: yes
tristan: if it obfuscates the symbol name or such, you can reverse the process by implementing GtkBuilderClass.get_type_from_name()
Company: the rest of the gnome stack doesn't have to resolve C symbols at runtime
AfC: if [what] obfuscates?
AfC: This is very interesting
tristan: whatever obscure linking method you might use
Company: tristan: it's dlopen()ing libgtk with RTLD_LOCAL
AfC: Um, well, it's doing whatever pkg-config tells it to do
tristan: Company, that means you can find nothing in the global namespace ?
Company: tristan: actually, it's not dlopen()ing libgtk but libjava-gnome-plugin.so
Company: tristan: yes
AfC: yes
AfC: (where it is the Java VM process, yes)
tristan: right, so "whoever" is dlopening it.. has access to those symbols, there is a handle somewhere
tristan: that handle needs to be used in GtkBuilder->get_type_from_name()
tristan: to get the actual type
AfC: tristan, Company: thanks for your help. I'll be honest and say I don't really understand what needs to be done;
AfC: tristan, Company: sounds like its not a matter of a different linker flag but I certainly don't have access to the dlopen call being made by the VM [nor would any language binding, I imagine]
AfC: tristan, Company: unless it's $something I can do before calling [say] gtk_init()
Company: AfC: i'd suspect python or other languages have the same problem
Company: JS, whatever
tristan: AfC, short answer/explanation is that a language binding needs to know how to get the actual GType from a type name before the type is actually registered
Company: unless they don't use RTLD_LOCAL
Company: AfC: and as tristan says, you can overwrite the lookup function
AfC: [I don't know that the JVM is, fwiw]
tristan: python works find with GtkBuilder, they create real GObjects afaik also
AfC: tristan: isn't that circular? g_type_from_name() only works after a type has been registered, right?
AfC: Hey, if someone can point us at the call that eg Python is making before they call gtk_init() that magically fixes the problem, then I'll get it added right away.
tristan: AfC, no it's not circular, I said a binding needs to make that association *before* the type is registered
AfC: tristan: ah
AfC: tristan: well as it happens we do have such a list
tristan: it's simple: GTK+ provides standard naming convention, and all the _get_type() stubs are there
tristan: just have to mash up the string a bit and use the _get_type() to create it
AfC: tristan: so you're saying we have to invoke every _get_type() possible before attempting to use GtkBuilder.
tristan: GTK+ already does the string mashing part, you can copy that code pretty easily
tristan: AfC, no... you have to call _get_type() as a result of GtkBuilder->get_type_from_name()
tristan: in your language binding's wrapper of GtkBuilder... you override the ->get_type_from_name() vfunc
tristan: and make it do something language binding custom
AfC: this is where I'm lost. There's nothing custom
AfC: we just have a GtkBuilder*
tristan: AfC, that was an ugly issue with libglade... you actually had to register each type to use them
tristan: so a.) libglade needed updates for new types... and b.) applications needed to provide modules to load their own custom types
tristan: AfC, you use GtkBuilder directly from java ??
tristan: or you have something in between, right ?
tristan: the in between should use CustomBuilder instead of GtkBuilder directly
tristan: with ->get_type_from_name() overridden
AfC: We're verging off-topic for #gtk+ and I don't want to further clutter up the channel when important bug fixing is going on.
AfC: tristan: I was hoping to have GtkBuilder coverage in our 4.1 release to replace the removed libglade, but it sounds like we'll have to dig a bit harder.
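
[Editor's sketch] The "string mashing" tristan describes, turning "GtkLabel" into "gtk_label_get_type", can be approximated in plain C as below. This is a simplified illustration of the naming convention, not GTK's own code: `mangle_type_name` is a hypothetical helper name, and it only treats ASCII capitals as word boundaries.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Turn a GObject type name such as "GtkLabel" into the conventional name
 * of its registration function, "gtk_label_get_type".
 *
 * Hypothetical helper for illustration only. An underscore is inserted
 * before a capital that follows a non-capital ("kL" in GtkLabel) and
 * before the third capital in a row ("M" in GtkUIManager).
 *
 * Returns 0 on success, -1 if `out` is too small for the result.
 */
int mangle_type_name(const char *name, char *out, size_t out_size)
{
    static const char suffix[] = "_get_type";
    size_t len = 0;
    size_t i;

    for (i = 0; name[i] != '\0'; i++) {
        int cur_upper = isupper((unsigned char) name[i]);
        int prev_upper = i > 0 && isupper((unsigned char) name[i - 1]);
        int prev2_upper = i > 1 && isupper((unsigned char) name[i - 2]);
        int boundary = cur_upper && i > 0 &&
                       (!prev_upper || (i > 2 && prev_upper && prev2_upper));

        if (boundary) {
            if (len + 1 >= out_size)
                return -1;
            out[len++] = '_';
        }
        if (len + 1 >= out_size)
            return -1;
        out[len++] = (char) tolower((unsigned char) name[i]);
    }

    /* sizeof suffix counts the terminating NUL, which is copied too. */
    if (len + sizeof suffix > out_size)
        return -1;
    memcpy(out + len, suffix, sizeof suffix);
    return 0;
}
```

The resulting string is what would then be handed to a runtime symbol lookup such as g_module_symbol(), which is the step that fails for java-gnome in this thread.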
AfC: appreciate your help
tristan: I'll be around, sorry to walk out on you
AfC: tristan: nah, that's cool
(12:19:43) jdahlin: AfC: libglade has a huge table which maps widget name to GType, we wanted to avoid that when rewriting libglade into GtkBuilder
AfC: jdahlin: I see
jdahlin: and as Company pointed out, that caused problems when a new widget was added in gtk+ but not in that table
AfC: jdahlin: sure, I grok that
jdahlin: so dlsym(NULL, "gtk_widget_get_type") needs to work, not sure why that isn't working for java-gnome
AfC: jdahlin: yeah. I don't really understand why a shared linking to GTK is different than an executable linking to GTK [with respect to however this works in a normal C GTK program] (which is Company's hypothesis)
jdahlin: AfC: executable linking is done at compile time, while dlopen/dlsym is runtime
jdahlin: AfC: GtkBuilder creates a bit of problem for C programs as well, it uses dlsym to find out callbacks specific in the .ui file
jdahlin: so for C programs you need to link with -rdynamic
jdahlin: is java-gnome using jna or jni?
AfC: jdahlin: JNI
tristan|afk: jdahlin, I think he means that... java-gnome being a shared lib is linking to GTK+... I *think*... that the java-gnome lib is dlopened by whatever is running this
AfC: tristan|afk: that's correct
jdahlin: AfC: the main executable is the jvm (/usr/bin/java or whatever), it loads in libjava-gnome.so via dlopen
AfC: So if there's something I have to add to the linker invocation to build the .so, no problem
jdahlin: for GtkBuilder to work you have to make all *_get_type symbols accessible in the main namespace for jvm
jdahlin: that's usually accomplished by passing in RTLD_GLOBAL to dlopen
tristan|afk: if I'm guessing Company's hypothesis right... maybe what is opening java-gnome is doing so without making the symbols global
tristan|afk: so when that code actually ends up running dlsym() with NULL... it finds nothing
jdahlin: how is libjava-gnome.so linked against libgtk.so?
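
[Editor's sketch] The lookup jdahlin and tristan are describing, a search of the process-wide symbol namespace rather than of one particular library, might look roughly like this in C. `lookup_global_symbol` is a hypothetical name, and whether GtkBuilder's internal lookup matches this exactly is an assumption; the point is only that such a search sees the main program, its start-up dependencies, and anything loaded with RTLD_GLOBAL, but not libraries loaded with RTLD_LOCAL.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Look up `symbol` through the handle for the main program.
 *
 * dlopen(NULL, ...) returns a handle whose search scope is the executable,
 * the libraries it was linked against, and any library later loaded with
 * RTLD_GLOBAL. Libraries loaded with RTLD_LOCAL are not searched.
 *
 * Hypothetical helper for illustration. Returns NULL when the symbol is
 * not visible in that scope.
 */
void *lookup_global_symbol(const char *symbol)
{
    void *self = dlopen(NULL, RTLD_LAZY);
    void *address;

    if (self == NULL)
        return NULL;

    address = dlsym(self, symbol);
    dlclose(self);
    return address;
}
```

If Company's hypothesis is right, calling lookup_global_symbol("gtk_label_get_type") inside the JVM process would return NULL until libgtk is made globally visible, even though direct calls into GTK from the JNI library work. On older glibc versions this needs linking with -ldl.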
tristan|afk: jdahlin, or... it can be done by overriding ->get_type_from_name() inside java-gnome... I think, no ?
AfC: If so, that's going to be hard to beat, because I have zero control over that. The only workaround would be to cause the VM process to load a *separate* library first, and call dlopen() on libgtk-x11-2.0.so.0
jdahlin: tristan|afk: that would work as well
AfC: jdahlin: well, let's see
jdahlin: AfC: do you know how the VM loads JNI modules?
AfC: jdahlin: no
AfC: jdahlin: more to the point, I have zero influence over it.
AfC: I mean, it's gotta be dlopen() and dlsym(), but with what flags? No idea
AfC: jdahlin: LINK=/usr/bin/gcc-4.4 -g -shared -Wall -fPIC
AfC: and
AfC: jdahlin: $LINK -o tmp/libgtkjni-4.0.20-dev.so $objects `pkg-config --libs $modules`
AfC: [sic]
jdahlin: okay, you could fix it up yourself, either by overriding get_type_from_name as tristan mentioned or just doing something like dlopen("libgtk.so", RTLD_GLOBAL) in JNI's module initialization function
AfC: jdahlin: I can certainly do that.
AfC: That's what suggested to Company
AfC: though I wasn't sure if that would do anything if invoked from a .so that is already linked against libgtk-x11-2.0.so.0
jdahlin: it's a bit ugly using RTLD_GLOBAL though, you're polluting the namespaces
jdahlin: it'll hopefully do the right thing
jdahlin: the same needs to be done for all shared libraries with *_get_type functions that java-gnome wants to support, it's not just gtk
jdahlin: but gtk+ is arguable the most important one
AfC: As for "a bit ugly", that's what I'm blown away by all this.
I've tried to follow this conversation, but fundamentally I just don't understand why it works for a C program [without "polluting"] and not a .so
AfC: anyway, trying my best
jdahlin: AfC: because of a number of reasons, a C program is linked at compile time and all the functions its using are known
jdahlin: eg, it knows that gtk_label_get_type() is going to be used, but not gtk_button_get_type() etc
jdahlin: you can check which functions are linked in by running ldd on the executable
jdahlin: now, dynamic languages uses dlopen() which can specific, make all functions available or none at all
jdahlin: the JNI authors decided the latter should be the default for some reason, and it doesn't seem (by 2 minutes googling) that it's possible to change that
AfC: [ldd + executable = functions? What option is that?]
jdahlin: or if it's objdump/nm, haven't looked at it for a while
jdahlin: objdump -T executable seems to do it here on my system
AfC: ah
AfC: Well
AfC: I added this before the call to gtk_init():

=== modified file 'src/bindings/org/gnome/gtk/GtkMain.c'
--- src/bindings/org/gnome/gtk/GtkMain.c 2010-02-14 20:14:38 +0000
+++ src/bindings/org/gnome/gtk/GtkMain.c 2011-04-08 02:51:05 +0000
@@ -31,6 +31,7 @@
  * wish to do so, delete this exception statement from your version.
  */
 
+#include <dlfcn.h>
 #include <jni.h>
 #include <glib.h>
 #include <gdk/gdk.h>
@@ -56,6 +57,7 @@
     jobjectArray _args
 )
 {
+    void* handle;
     int argc;
     char** argv;
     gint i;
@@ -90,6 +92,12 @@
     argv[0] = "";
     argc++;
 
+    handle = dlopen("libgtk-x11-2.0.so", RTLD_NOW | RTLD_GLOBAL);
+    if (handle == NULL) {
+        bindings_java_throw(env, "dlopen() failed: %s", dlerror());
+        return;
+    }
+
     // call function
     gtk_init(&argc, &argv);

AfC: and now GtkBuilder works
jdahlin: great
AfC: yeah. But is it great? :)
jdahlin: great that it's working
AfC: Certainly lends strength to Company's hypothesis, that's for sure
jdahlin: is it inconvenient for you to override the GtkBuilder vfunc get_type_from_name?
AfC: jdahlin: "no", but I'm not sure what I'd be overriding it with?
jdahlin: AfC: implementing it!
jdahlin: you need to subclass it and override it
AfC: "This is mainly used when implementing the GtkBuildable interface on a type."
jdahlin: sure, but we're outside of "mainly" right now
jdahlin: I added it specifically for these kind of situations
jdahlin: s/added/made it overridable/
jdahlin: gtkmm uses it as well if iirc
AfC: Oh? Ok. I should go look at their code, then
jdahlin: not really
jdahlin: all gtkmm classes are subclasses, and they implement it to map GtkLabel to their overridden GType
jdahlin: http://git.gnome.org/browse/gtkmm/tree/gtk/src/builder.ccg
AfC: jdahlin: yeah, I'm reading that right now
AfC: All they do is call g_type_from_name() ?
jdahlin: gtkmm is considerably different from a dynamic binding, so you shouldn't worry too much
AfC: um generally we're mostly the same; Java is a static language too
jdahlin: no, it's not really static
AfC: [except for this business of getting to GTK through a .so rather than a linked executable]
jdahlin: in the sense that language bindings are loaded via dynamically
AfC: Sure. I means static as in static language API, not a dynamic language API
AfC: So I assume that GtkBuilder just calls g_type_from_name() itself internally.
jdahlin: not really no
jdahlin: but it works fine for gtkmm which calls all get_type functions when the library loads
AfC: Ah
AfC: No, we definitely don't do that. :)
AfC: We could, I guess. Sounds expensive.
jdahlin: maybe it doesn't, but their impl. requires you to call the _get_type function of a class before using it in gtkbuilder
AfC: which is why g_type_from_name() works
AfC: right
AfC: hm
AfC: Embedding system library names in the source code to pass to dlopen() is going to be fragile.
AfC: I wondered if -rdynamic to the linker would have the same effect, but no
jdahlin: kind of fragile yes, but the alternatives are wors
jdahlin: e
AfC: jdahlin: that seems pretty nuts of GTKmm; if I do get_object() on something I think is be a GtkButton and it turns out to be a GtkCheckButton (say) it'll crash
AfC: [actually, it won't get past add_from_file() which is my problem right now]
jdahlin: dunno about the details, you need to check with the gtkmm authors if you really want to know
AfC: jdahlin: yeah, I'll talk to Murray next time I see him
***jdahlin -> zzz
AfC: jdahlin: do you have an opinion about the "expense" of dlopen(... , | RTLD_GLOBAL)?
AfC: jdahlin: ah, g'night. Thanks for your help
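
[Editor's sketch] One possible reading of option (1), as far as the conversation spells it out: the overridden get_type_from_name() would not search for symbols at runtime at all, but consult a list the binding already has (AfC mentions java-gnome has one) and call the matching _get_type() function directly. The plain-C outline below shows only that lookup. Everything in it is an assumption for illustration: `TypeId` stands in for GType, the `fake_*_get_type` stubs stand in for the real gtk_label_get_type() and friends, and `resolve_type_from_name` is a hypothetical name. In a real binding the function would be installed as the get_type_from_name member of a GtkBuilder subclass's class structure.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned long TypeId;            /* stand-in for GType */
typedef TypeId (*GetTypeFunc)(void);

/* Stand-ins for gtk_label_get_type() and gtk_button_get_type(). */
static TypeId fake_label_get_type(void)  { return 101; }
static TypeId fake_button_get_type(void) { return 102; }

struct TypeEntry {
    const char *name;       /* type name as it appears in the .ui file */
    GetTypeFunc get_type;   /* function that registers and returns the type */
};

/*
 * In a binding this table would presumably be generated. Because the
 * entries are ordinary references to functions the .so is linked against,
 * they are resolved when the library is loaded and do not depend on
 * RTLD_GLOBAL.
 */
static const struct TypeEntry type_table[] = {
    { "GtkLabel",  fake_label_get_type },
    { "GtkButton", fake_button_get_type },
};

/* Returns the type for `type_name`, or 0 (standing in for G_TYPE_INVALID). */
TypeId resolve_type_from_name(const char *type_name)
{
    size_t i;

    for (i = 0; i < sizeof type_table / sizeof type_table[0]; i++) {
        if (strcmp(type_table[i].name, type_name) == 0)
            return type_table[i].get_type();
    }
    return 0;
}
```

The trade-off is the one jdahlin raises about libglade: a table like this has to be kept in step with the widgets GTK actually ships, whereas the dlopen() approach embeds library names instead.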