From: Geoffrey F. <fu...@ga...> - 2001-10-11 19:20:14
|
I'm having some problems with the source after a recent update.

1) The dynamic drivers don't work inside Java anymore. I traced this down to a recent change in drivers.in which removed libplplot from the driver linking step. The reason this was there was explained in dyndrv.in, commit for version 1.3. Basically the issue is that there are multiple ways dlopen can be used (this panoply of RTLD_* flags), and the way we do it in libplplot when opening the drivers is more lenient than how the JVM does it when loading libplplot. So, the drivers need to be linked against libplplot. Maybe they don't have to be linked against all the other libs, if libplplot is now free of them, but the drivers still make some calls into the core of libplplot, so they have to be linked against at least that. I'll take care of this (again).

2) The Tk driver doesn't work other than in the tmp directory anymore. I can't for example, get the /installed/ plrender to use the Tk driver anymore. For example:

    plplot/tmp> x01c -dev plmeta -o x1.plm
    Plplot library version: 5.0.4
    Opened x1.plm
    plplot/tmp> ~/j2/bin/plrender x1.plm -dev tk
    TCL command " invalid command name "
    Program aborted

Here I ran x01c in plplot/tmp, produced a .plm file, but then ran the installed plrender from $prefix/bin, and it won't run the tk driver. But I can run the tk driver from x01c directly (if LD_LIBRARY_PATH includes both . and $prefix/lib). But the local plrender won't correctly invoke the tk driver even then, even though x01c will.

Does anyone else see this with a current checked out tree? I'm guessing this comes from Joao's recent changes to cf stuff, but I'm not sure. I know this stuff worked /before/ I updated... But I have some other uncommitted changes right now, so I'm wondering if the tk driver works in plrender for other developers at this point in time.

--
Geoffrey Furnish fu...@ga...
From: Geoffrey F. <fu...@ga...> - 2001-10-11 20:34:50
|
Geoffrey Furnish writes:
> I'm having some problems with the source after a recent update.
>
> 1) The dynamic drivers don't work inside Java anymore. I traced this
> down to a recent change in drivers.in which removed libplplot from the
> driver linking step. The reason this was there was explained in
> dyndrv.in, commit for version 1.3. Basically the issue is that there
> are multiple ways dlopen can be used (this panoply of RTLD_* flags),
> and the way we do it in libplplot when opening the drivers is more
> lenient than how the JVM does it when loading libplplot. So, the
> drivers need to be linked against libplplot. Maybe they don't have to
> be linked against all the other libs, if libplplot is now free of
> them, but the drivers still make some calls into the core of
> libplplot, so they have to be linked against at least that. I'll take
> care of this (again).

BTW, just in case there is any skepticism about this, here is how you can see for sure that every driver depends on libplplot:

    plplot/tmp> nm drivers/pbm.drv | grep " U "
             U ___brk_addr@@GLIBC_2.0
             U __curbrk@@GLIBC_2.0
             U __environ@@GLIBC_2.0
             U atexit@@GLIBC_2.0
             U fclose@@GLIBC_2.1
             U fprintf@@GLIBC_2.0
             U fwrite@@GLIBC_2.0
             U pdf_finit
             U plFamInit
             U plOpenFile
             U plP_setphy
             U plP_setpxl

That's just one example, but there is some set of external PLplot API symbols showing up in every one I've checked. So libplplot.so really can't be removed from the list of dependencies for drivers.

BTW, it would be /possible/ to break this dependency, if we introduced one of these cute structure dereferencing tricks like they do in the Java Native Interface (and I think someone was saying the TEA stubs thing is similar, though I haven't investigated that myself yet).

Anyway, my point is just that there does exist a known technique for defeating this linker-visible symbol coupling, if it is sufficiently disturbing to justify the effort. Right now, I'm not sufficiently disturbed by the coupling to do this work, I'm just gonna augment the driver production rules so they are fully linked. But if in the future anyone /really really really/ wants this broken, then we should do it the right way, with one of these structure pointer tricks.

--
Geoffrey Furnish fu...@ga...
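For readers unfamiliar with the "structure pointer trick" mentioned in this message, here is a minimal C sketch of the general idea. It is an illustration only, not PLplot code: the names `CoreApi`, `core_get_api`, `core_files_opened`, and `example_driver_init` are hypothetical, and the stub bodies merely stand in for real core routines. The driver reaches the core only through a table of function pointers handed to it at init time, so the driver object would contain no undefined `pl*` symbols for the dynamic linker to resolve.

```c
#include <stdio.h>

/* Hypothetical table of core entry points; not an actual PLplot type. */
typedef struct CoreApi {
    int   version;                                /* lets a driver reject a mismatched core */
    int  (*open_file)(const char *name);          /* stand-in for a routine like plOpenFile */
    void (*set_pixels)(double xpmm, double ypmm); /* stand-in for a routine like plP_setpxl */
} CoreApi;

/* ---- core side: the real functions, plus one table exposing them ---- */
static int files_opened = 0;

static int core_open_file(const char *name)
{
    printf("core: opening %s\n", name);
    files_opened++;
    return 0;
}

static void core_set_pixels(double xpmm, double ypmm)
{
    printf("core: %g x %g pixels/mm\n", xpmm, ypmm);
}

const CoreApi *core_get_api(void)
{
    static const CoreApi api = { 1, core_open_file, core_set_pixels };
    return &api;
}

int core_files_opened(void)
{
    return files_opened;
}

/* ---- driver side: would live in the separately built driver module ---- */
static const CoreApi *core; /* set once by example_driver_init */

int example_driver_init(const CoreApi *api)
{
    if (api == NULL || api->version != 1)
        return -1;               /* refuse an absent or incompatible table */
    core = api;
    core->set_pixels(32.0, 32.0); /* every core call goes through the table */
    return core->open_file("example.out");
}
```

In a real split, the core would presumably `dlopen` the driver, look up the single init entry point with `dlsym`, and pass it the table. That lookup is by name on the returned handle, so it does not depend on how the core library itself was loaded.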
From: <jca...@in...> - 2001-10-11 23:37:18
|
On Thursday 11 October 2001 21:34, Geoffrey Furnish wrote:
| Geoffrey Furnish writes:
| > I'm having some problems with the source after a recent update.
| >
| > 1) The dynamic drivers don't work inside Java anymore. I traced
| > this down to a recent change in drivers.in which removed
| > libplplot from the driver linking step. The reason this was
| > there was explained in dyndrv.in, commit for version 1.3.
| > Basically the issue is that there are multiple ways dlopen can
| > be used (this panoply of RTLD_* flags), and the way we do it in
| > libplplot when opening the drivers is more lenient than how the
| > JVM does it when loading libplplot. So, the drivers need to be
| > linked against libplplot. Maybe they don't have to be linked
| > against all the other libs, if libplplot is now free of them,
| > but the drivers still make some calls into the core of
| > libplplot, so they have to be linked against at least that.
| > I'll take care of this (again).

OK, if you can't, you can't. But I can. I configured/installed in /usr/local/test, then, as another user, compiled x01c.c:

    > gcc x01c.c -o x01c -L/usr/local/test/lib -I/usr/local/test/include -lplplot -Wl,-rpath -Wl,/usr/local/test/lib

then,

    > ./x01c -dev ntk
    Plplot library version: 5.0.4

    Cannot open library file: drivers/drivers.db
    lib dir="/usr/local/test/lib/plplot5.0.4/data"
    Device not loaded!
    tag=ntk, drvidx=12
    Trying to load ntk.drv on ./drivers/ntk.drv
    Trying to load at /usr/local/test/lib/drivers/ntk.drv

and it works. Also for tk (that is static, not dyndrv).

My understanding of dlopen() is that the libraries that the opening program was linked with are available to the dlopened code, so there is no need to link it again with the same libraries. This works for other programs also:

    > ldd /usr/X11R6/lib/modules/xie.so
    libm.so.6 => /lib/libm.so.6 (0x4008b000)
    libc.so.6 => /lib/libc.so.6 (0x400a8000)
    /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000)

| BTW, just in case there is any skepticism about this, here is how
| you can see for sure that every driver depends on libplplot:
|
| plplot/tmp> nm drivers/pbm.drv | grep " U "
|          U ___brk_addr@@GLIBC_2.0
|          U __curbrk@@GLIBC_2.0
|          U __environ@@GLIBC_2.0
|          U atexit@@GLIBC_2.0
|          U fclose@@GLIBC_2.1
|          U fprintf@@GLIBC_2.0
|          U fwrite@@GLIBC_2.0
|          U pdf_finit
|          U plFamInit
|          U plOpenFile
|          U plP_setphy
|          U plP_setpxl

Those are resolved at *load* time, and as the "calling" program has "loaded" the libraries (libplplot, etc), everything is OK. This is how I understand it, and it works for me.

The plrender problem is also not completely reproducible for me:

    > ./x01c -dev plmeta -o po.plm
    Plplot library version: 5.0.4

    Cannot open library file: drivers/drivers.db
    lib dir="/usr/local/test/lib/plplot5.0.4/data"
    Device not loaded!
    tag=plm, drvidx=0
    Trying to load plmeta.drv on ./drivers/plmeta.drv
    Trying to load at /usr/local/test/lib/drivers/plmeta.drv
    Opened po.plm

    > /usr/local/test/bin/plrender po.plm -dev xwin

    Cannot open library file: drivers/drivers.db
    lib dir="/usr/local/test/lib/plplot5.0.4/data"

That's OK. Also for ntk and gnome, but there seems to be a problem, they quickly render the page and quit after that.

There is really a problem with the tk driver, as you said:

    > /usr/local/test/bin/plrender po.plm -dev tk

    Cannot open library file: drivers/drivers.db
    lib dir="/usr/local/test/lib/plplot5.0.4/data"
    TCL command "/usr/local/plplot/lib/drivers/drivers.db" failed:
    invalid command name
    "/usr/local/plplot/lib/drivers/drivers.db"
    Program aborted

But, given that the others run OK, I am inclined to think it is another problem.
tk is static, doesn't work,
xwin is static, works,
ntk, gnome are dynamic, work.
ps, psc, etc are dynamic, also work OK.

Thus, the tk problem is another problem. Or maybe not?

Joao

|
| That's just one example, but there is some set of external PLplot
| API symbols showing up in every one I've checked. So libplplot.so
| really can't be removed from the list of dependencies for drivers.
|
| BTW, it would be /possible/ to break this dependency, if we
| introduced one of these cute structure dereferencing tricks like
| they do in the Java Native Interface (and I think someone was
| saying the TEA stubs thing is similar, though I haven't
| investigated that myself yet).
|
| Anyway, my point is just that there does exist a known technique
| for defeating this linker-visible symbol coupling, if it is
| sufficiently disturbing to justify the effort. Right now, I'm not
| sufficiently disturbed by the coupling to do this work, I'm just
| gonna augment the driver production rules so they are fully linked.
| But if in the future anyone /really really really/ wants this
| broken, then we should do it the right way, with one of these
| structure pointer tricks.
From: Geoffrey F. <fu...@ga...> - 2001-10-12 15:56:46
|
João Cardoso writes:
> OK, if you can't, you can't. But I can. I configured/installed in
> /usr/local/test, then, as another user, compiled x01c.c:
>
> > gcc x01c.c -o x01c -L/usr/local/test/lib -I/usr/local/test/include
> -lplplot -Wl,-rpath -Wl,/usr/local/test/lib
>
> then,
>
> > ./x01c -dev ntk
> Plplot library version: 5.0.4
>
> Cannot open library file: drivers/drivers.db
> lib dir="/usr/local/test/lib/plplot5.0.4/data"
> Device not loaded!
> tag=ntk, drvidx=12
> Trying to load ntk.drv on ./drivers/ntk.drv
> Trying to load at /usr/local/test/lib/drivers/ntk.drv
>
> and it works. Also for tk (that is static, not dyndrv).

Right, that's because the way /we/ do dlopen in libplplot allows the resolution of symbols in the loaded module against the symbols in libplplot. So, for an app linked to libplplot, the use of a dynamic driver is easy. But that's not the only usage model...

> My understanding of dlopen() is that the libraries that the opening
> program was linked with are available to the dlopened code, so there
> is no need to link it again with the same libraries. This works for
> other programs also:
>
> > ldd /usr/X11R6/lib/modules/xie.so
> libm.so.6 => /lib/libm.so.6 (0x4008b000)
> libc.so.6 => /lib/libc.so.6 (0x400a8000)
> /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000)

Your statement is true some of the time. It depends on the flags used in all the dlopen calls. The way /we/ do it makes the above statement true. The way the JVM does it, as one notable counterexample, does not allow this.

> | BTW, just in case there is any skepticism about this, here is how
> | you can see for sure that every driver depends on libplplot:
> |
> | plplot/tmp> nm drivers/pbm.drv | grep " U "
> |          U ___brk_addr@@GLIBC_2.0
> |          U __curbrk@@GLIBC_2.0
> |          U __environ@@GLIBC_2.0
> |          U atexit@@GLIBC_2.0
> |          U fclose@@GLIBC_2.1
> |          U fprintf@@GLIBC_2.0
> |          U fwrite@@GLIBC_2.0
> |          U pdf_finit
> |          U plFamInit
> |          U plOpenFile
> |          U plP_setphy
> |          U plP_setpxl
>
> Those are resolved at *load* time, and as the "calling" program has
> "loaded" the libraries (libplplot, etc), everything is OK.
> This is how I understand it, and it works for me.

Correct. This is the way /we/ do it. When /we/ load the driver into libplplot.so, we allow it to do the run-time symbol resolution against symbols already in our address space. But not all users of dlopen are so permissive.

In my opinion, we need to do one of two things:

1) bind the symbols so that non-permissive dlopen contexts will still support use of PLplot with dynamic drivers, or
2) break the symbol resolution requirement.

Last night I was thinking more about this, after my mail yesterday, and I am starting to feel some resolve for just doing 2) now. I have done the 1) thing in my makefiles (cf stuff), and am prepared to commit that now, just to relieve the immediate pressure. But I foresee this thorn sticking us in the rump a few more times before we die of old age, so I am starting to think that introducing a structure dereferencing thing to break this symbol resolution requirement may be the best alternative, even if I didn't really want to be driven into this so early in the game. Thinking, thinking, thinking...

> The plrender problem is also not completely reproducible for me:
>
> > ./x01c -dev plmeta -o po.plm
> Plplot library version: 5.0.4
>
> Cannot open library file: drivers/drivers.db
> lib dir="/usr/local/test/lib/plplot5.0.4/data"
> Device not loaded!
> tag=plm, drvidx=0
> Trying to load plmeta.drv on ./drivers/plmeta.drv
> Trying to load at /usr/local/test/lib/drivers/plmeta.drv
> Opened po.plm
>
> > /usr/local/test/bin/plrender po.plm -dev xwin
>
> Cannot open library file: drivers/drivers.db
> lib dir="/usr/local/test/lib/plplot5.0.4/data"
>
> That's OK. Also for ntk and gnome, but there seems to be a problem,
> they quickly render the page and quit after that.

Umm, haven't noticed this. Hmm. I just ran x07c, directed to a plmeta file, then ran my installed $prefix/bin/plrender on the resulting multi-page plmeta file, and it renders fine with xwin, pausing between each page as usual, etc. But when I try to render with the tk driver, it fails, even though if I run x07c -dev tk, it works fine. So there is some problem with the Tk driver when it is installed, or with plrender, or something. Not sure which.

> There is really a problem with the tk driver, as you said:
>
> > /usr/local/test/bin/plrender po.plm -dev tk
>
> Cannot open library file: drivers/drivers.db
> lib dir="/usr/local/test/lib/plplot5.0.4/data"
> TCL command "/usr/local/plplot/lib/drivers/drivers.db" failed:
> invalid command name
> "/usr/local/plplot/lib/drivers/drivers.db"
> Program aborted
>
> But, given that the others run OK, I am inclined to think it is another problem.
> tk is static, doesn't work,
> xwin is static, works,
> ntk, gnome are dynamic, work.
> ps, psc, etc are dynamic, also work OK.
>
> Thus, the tk problem is another problem. Or maybe not?

Yes, I believe the problem with the Tk driver is likely unrelated to the issue with the semantics of linking the dynamic drivers. BTW, I've fixed the driver linking thing in my local copy, but the Tk driver from plrender is still nonfunctional. So I believe this is a separate problem, which I think I have not induced through my own blunders in my own working copy.

--
Geoffrey Furnish fu...@ga...
From: <jca...@in...> - 2001-10-12 02:12:11
|
The mailing list "reply" is again broken. Just leave it that way, we will get used to it this way.

...
| OK, if you can't, you can't. But I can. I configured/installed in
| /usr/local/test, then, as another user, compiled x01c.c:

I forgot to say, in a directory other than the plplot tmp dir

| > gcc x01c.c -o x01c -L/usr/local/test/lib
| > -I/usr/local/test/include
...
| But, given that the others run OK, I am inclined to think it is
| another problem. tk is static, doesn't work,
| xwin is static, works,
| ntk, gnome are dynamic, work.
| ps, psc, etc are dynamic, also work OK.

As a matter of fact the pstex driver is not working after install, but I was already expecting that, as you can see in the relevant comments in dyndrv.in.

| Thus, the tk problem is another problem. Or maybe not?

By the way, Geoffrey, you are not working in linux, are you? The dlopen() man page says:

    External references in the library [jc: the dlopened program]
    are resolved using the libraries in that library's dependency
    list [jc: I (we?) don't want that] and any other
    libraries previously opened with the RTLD_GLOBAL flag

    [jc: so I guess that in linux the loader opens the libraries the
    executable was linked against, with RTLD_GLOBAL, and
    as such their symbols are available to the dlopened program].

    If the executable was linked with the flag "-rdynamic", then
    the global symbols in the executable will also be used to
    resolve references in a dynamically loaded library.

Joao
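The lookup rule quoted from the man page can be probed from inside a running process. The following is a small C sketch, not something from the PLplot tree; the helper name `symbol_in_global_scope` is made up for illustration. It relies on the documented behavior that `dlopen(NULL, ...)` returns a handle for the main program, and that `dlsym` on that handle searches the main program, the shared objects loaded at startup, and anything later opened with RTLD_GLOBAL. On older glibc versions this needs `-ldl` at link time.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `name` is visible in the process-wide
 * (global) symbol scope, 0 if it is not, and -1 if the handle for the
 * main program could not be obtained. */
int symbol_in_global_scope(const char *name)
{
    void *self = dlopen(NULL, RTLD_LAZY); /* handle for the main program */
    int found;

    if (self == NULL)
        return -1;
    found = (dlsym(self, name) != NULL);  /* NULL here means "not found" for function symbols */
    dlclose(self);
    return found;
}
```

Called with a name such as `plOpenFile` just before a driver is loaded, a check like this would be expected to report 1 in a C program linked against libplplot, and 0 when the library was brought in by a host that did not request RTLD_GLOBAL.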
From: Geoffrey F. <fu...@ga...> - 2001-10-12 16:08:11
|
João Cardoso writes:
>
> The mailing list "reply" is again broken. Just leave it that way, we
> will get used to it this way.

The option is gone, I literally can't put it back. Grrrr. If anybody comes up with an elisp "list-reply" function that works with VM, please let me know... :-).

> | Thus, the tk problem is another problem. Or maybe not?
>
> By the way, Geoffrey, you are not working in linux, are you?

Yes, I am on Linux. I haven't used anything but Linux since joining Lightspeed. (Although today I'm gonna have to actually log onto a Slowaris machine to run some EDA tool that's only provided by the vendor on Slowaris, but that's just a one time thing. I live on Linux).

> The dlopen() man page says:
>
>     External references in the library [jc: the dlopened program]
>     are resolved using the libraries in that library's dependency
>     list [jc: I (we?) don't want that] and any other
>     libraries previously opened with the RTLD_GLOBAL flag
>
>     [jc: so I guess that in linux the loader opens the libraries the
>     executable was linked against, with RTLD_GLOBAL, and
>     as such their symbols are available to the dlopened program].
>
>     If the executable was linked with the flag "-rdynamic", then
>     the global symbols in the executable will also be used to
>     resolve references in a dynamically loaded library.

Right. I believe you understand correctly everything that relates to how our C programs, like the PLplot C demos, our own programs, etc, all work when linked against libplplot and then using PLplot's dynamic drivers.

Here's what you haven't yet grasped.

When running a /Java/ program, the PLplot binding is accomplished by asking the JVM to "load" libplplot. Evidently it does this by dlopen, but not using RTLD_GLOBAL. Then, when libplplot dlopen's driver/xyz.drv, the driver doesn't have access to libplplot's symbols because libplplot itself wasn't loaded with RTLD_GLOBAL.

I'm not defending the JVM behavior, I'm just reporting it.

To live with it, we need to either

1) link drivers against libplplot, or
2) break the symbol resolution requirement.

I'll /probably/ check in 1) soon. I will /possibly/ do 2) sooner than sometime in the indefinite future. How's that for a vague statement of my intentions? :-).

--
Geoffrey Furnish fu...@ga...
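The difference between the two loading styles described in this message can be sketched in C as follows. This is a simplified illustration under stated assumptions, not the JVM's or PLplot's actual code: the function names `host_dlopen_flags` and `load_core_then_driver` are hypothetical, and here the host opens both objects itself, whereas in PLplot the core library opens the driver. The symbol scope seen by the driver is the same in either arrangement. On older glibc versions this needs `-ldl` at link time.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Flags a host might use for the core library: RTLD_GLOBAL exports its
 * symbols to objects loaded later; RTLD_LOCAL (what the JVM evidently
 * does, per the message above) keeps them private. */
int host_dlopen_flags(int export_symbols)
{
    return RTLD_NOW | (export_symbols ? RTLD_GLOBAL : RTLD_LOCAL);
}

/* Load the core library, then a driver that has undefined references
 * into it. Returns 0 on success, -1 if the core fails to load, -2 if
 * the driver fails to load. On success the handles are deliberately
 * left open for the life of the process. */
int load_core_then_driver(const char *core_path, const char *driver_path,
                          int export_symbols)
{
    void *core = dlopen(core_path, host_dlopen_flags(export_symbols));
    void *drv;

    if (core == NULL) {
        fprintf(stderr, "core: %s\n", dlerror());
        return -1;
    }
    /* RTLD_NOW forces every undefined symbol in the driver to be
     * resolved here, so a scope problem shows up immediately. */
    drv = dlopen(driver_path, RTLD_NOW);
    if (drv == NULL) {
        fprintf(stderr, "driver: %s\n", dlerror());
        dlclose(core);
        return -2;
    }
    return 0;
}
```

With `export_symbols` set to 0 and a driver that is not itself linked against the core, the second `dlopen` would be expected to fail with an undefined-symbol error. Linking the driver against the core (option 1) puts the core in the driver's own dependency list, which is why it works regardless of the host's flags.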
From: Joao C. <jca...@in...> - 2001-10-12 17:25:41
|
On Friday 12 October 2001 17:08, Geoffrey Furnish wrote:
| João Cardoso writes:
...
| Here's what you haven't yet grasped.
|
| When running a /Java/ program, the PLplot binding is accomplished by
| asking the JVM to "load" libplplot. Evidently it does this by dlopen,
| but not using RTLD_GLOBAL. Then, when libplplot dlopen's
| driver/xyz.drv, the driver doesn't have access to libplplot's symbols
| because libplplot itself wasn't loaded with RTLD_GLOBAL.

Yes, I wasn't aware of that.

| To live with it, we need to either
|
| 1) link drivers against libplplot, or
| 2) break the symbol resolution requirement.
|
| I'll /probably/ check in 1) soon.

OK.

| I will /possibly/ do 2) sooner than
| sometime in the indefinite future. How's that for a vague statement
| of my intentions? :-).

Fine, I got it. :)

Joao
From: Joao C. <jca...@in...> - 2001-10-25 19:23:57
|
On Friday 12 October 2001 17:08, Geoffrey Furnish wrote:
| João Cardoso writes:
...
| > The dlopen() man page says:
| >
| >     External references in the library [jc: the dlopened program]
| >     are resolved using the libraries in that library's dependency
| >     list [jc: I (we?) don't want that] and any other
| >     libraries previously opened with the RTLD_GLOBAL flag
| >
| >     [jc: so I guess that in linux the loader opens the libraries the
| >     executable was linked against, with RTLD_GLOBAL, and
| >     as such their symbols are available to the dlopened program].
| >
| >     If the executable was linked with the flag "-rdynamic", then
| >     the global symbols in the executable will also be used to
| >     resolve references in a dynamically loaded library.
|
| Right. I believe you understand correctly, everything that relates to
| understanding how our C programs, like the PLplot C demos, our own
| programs, etc, all work when linked against libplplot, and then using
| PLplot's dynamic drivers.
|
| Here's what you haven't yet grasped.
|
| When running a /Java/ program, the PLplot binding is accomplished by
| asking the JVM to "load" libplplot. Evidently it does this by dlopen,
| but not using RTLD_GLOBAL. Then, when libplplot dlopen's
| driver/xyz.drv, the driver doesn't have access to libplplot's symbols
| because libplplot itself wasn't loaded with RTLD_GLOBAL.
|
| I'm not defending the JVM behavior, I'm just reporting it.
|
| To live with it, we need to either
|
| 1) link drivers against libplplot, or
| 2) break the symbol resolution requirement.
|
| I'll /probably/ check in 1) soon.

Hi, it looks like you have not yet committed that change. Have you found a work-around for this issue?

Joao

| I will /possibly/ do 2) sooner than
| sometime in the indefinite future. How's that for a vague statement
| of my intentions? :-).
From: Geoffrey F. <fu...@ga...> - 2001-10-25 19:33:44
|
Joao Cardoso writes:
> | I'm not defending the JVM behavior, I'm just reporting it.
> |
> | To live with it, we need to either
> |
> | 1) link drivers against libplplot, or
> | 2) break the symbol resolution requirement.
> |
> | I'll /probably/ check in 1) soon.
>
> Hi, it looks like you have not yet committed that change. Have you found a
> work-around for this issue?

No, I implemented 1) in my working set, but haven't checked it in yet. I'll try to do that today or tomorrow.

Just FYI, there is a good chance I'll be incommunicado next week, due to circumstances beyond my control. So if I don't answer email for a while, don't be surprised.

--
Geoffrey Furnish fu...@ga...