From: Alan W. I. <ir...@be...> - 2011-09-06 08:18:42
Hi Andrew:

On 2011-09-05 22:58+0100 Andrew Ross wrote:

> Alan,
> [...]The fact that this occurs in the dynamic loader handling code makes
> me suspect a linkage bug somewhere along the line. I tried a minimal
> build inside the Debian testing chroot (no fancy packaging stuff)
> and lo and behold it all works fine. So there must be something
> different about what the Debian tools do to the library.
>
> My first thought was the rpath since Debian doesn't use any rpath
> information for libraries, but this seems to make no difference.
>
> (Incidentally I discovered that -DUSE_RPATH=OFF does not disable
> all rpath information. You need to use the CMake option
> -DCMAKE_SKIP_RPATH=ON to do this).

-DCMAKE_SKIP_RPATH=ON is going to cause substantial problems for any
build-tree test so I don't advise using it. As far as I know, Debian
and other distros don't care about setting rpath in the build tree; it
is only rpath in the install tree that concerns Debian and other distro
packaging.

In the old days, cmake by default used rpath in the build tree and did
not use it in the install tree. -DUSE_RPATH=OFF kept this default,
while -DUSE_RPATH=ON means we specifically set some install-tree rpath
information for each library, as in

  if(USE_RPATH)
    set_target_properties(
      plplot${LIB_TAG}
      PROPERTIES
      SOVERSION ${plplot_SOVERSION}
      VERSION ${plplot_VERSION}
      INSTALL_RPATH "${LIB_INSTALL_RPATH}"
      INSTALL_NAME_DIR "${LIB_DIR}"
      )
  else(USE_RPATH)
    set_target_properties(
      plplot${LIB_TAG}
      PROPERTIES
      SOVERSION ${plplot_SOVERSION}
      VERSION ${plplot_VERSION}
      INSTALL_NAME_DIR "${LIB_DIR}"
      )
  endif(USE_RPATH)

Are you saying that cmake now by default uses INSTALL_RPATH in the
install tree (set to some default value) OR have we screwed up stanzas
like the above for some of our libraries/shared objects?

>
> My minimal testing build just used qtwidget and epsqt devices. I
> noticed the only difference in "ldd qt.so" between the Debian build
> and the test build was the absence of libQtSvg in the latter.
> Tried again, but including the svgqt device. Suddenly the crashes start
> again. Looks like either something included in plplot with the svgqt
> driver or simply having libQtSvg.so linked in causes the problem.
>
> Since the problem seemed to be related to dynamic loading I also
> tried building without dynamic drivers, but with the svgqt driver.
> This also fixed the problem. Similarly just calling plend1 rather
> than plend at the end of the example prevented the error.

Yes, plend does a lot more than plend1. It calls plend1 for each
stream and then goes on to call lt_dlexit() which I take it, in turn,
executes exit handlers for each library.

>
> A quick google search shows up the following thread. It seems
> that this may be exactly the problem I am encountering.
>
> http://lists.gnu.org/archive/html/libtool/2009-12/msg00115.html
>
> So it looks like somewhere qt is setting an exit handler which
> is called when the program calls exit(), but the qt libraries
> have already been unloaded at that point so the handler no
> longer exists.

I am sure you knew this, but I had to look it up so I will state it
for the rest following this; the exit handlers get automatically
invoked not only by calls to exit() but also by a normal return from a
main programme. So in sum all our examples inevitably invoke exit
handlers for all linked libraries, but then so apparently does plend's
call to lt_dlexit().

This works for me both for Debian's Qt-4.6.3 and also for the
pango/cairo libraries used for cairo for Debian stable. But this
causes trouble for you and Orion for Qt-4.7.3. I also recall a couple
of discussions here and in plplot-general about problems with calling
plend for pango/cairo as well. The problems could be caused by bugs in
how certain specific versions of libraries use atexit or could be
caused by them simply starting to use atexit at some version. In any
case, it appears calling lt_dlexit is an accident waiting to happen.
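
To make the ordering problem described above concrete, here is a toy
model in C. It is only a sketch, not PLplot, libltdl, or Qt code: every
name in it (struct module, register_exit_handler, unload_module,
count_dangling_handlers) is hypothetical, and it assumes that a handler
registered by a library stays registered after that library has been
unloaded, which is the situation Andrew's linked thread describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in PLplot, libltdl, or Qt. */

#define MAX_HANDLERS 8

/* Stand-in for a dynamically loaded library (for example one pulled in
   by a dynamic driver). */
struct module {
    const char *name;
    bool loaded;
};

/* Each registered exit handler remembers which module holds its code. */
static struct module *handler_owner[MAX_HANDLERS];
static size_t n_handlers = 0;

/* Models a library registering an exit handler while it is loaded. */
bool register_exit_handler(struct module *owner)
{
    assert(owner != NULL);
    if (n_handlers == MAX_HANDLERS)
        return false;
    handler_owner[n_handlers++] = owner;
    return true;
}

/* Models the library being unloaded before the process exits.
   Assumption of this model: the handler it registered is NOT removed. */
void unload_module(struct module *m)
{
    assert(m != NULL);
    m->loaded = false;
}

/* Counts handlers whose code would be gone by the time the process
   runs its exit handlers. */
size_t count_dangling_handlers(void)
{
    size_t dangling = 0;
    for (size_t i = 0; i < n_handlers; i++)
        if (!handler_owner[i]->loaded)
            dangling++;
    return dangling;
}
```

In this model, registering a handler for a loaded module leaves zero
dangling handlers; calling unload_module() on that module before exit
leaves one, which corresponds to the crash at program exit. A module
that is never unloaded never produces a dangling handler.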

I guess we could work around all potential problems by marking _every_
driver as resident, but if that is identical to not calling lt_dlexit
at all, maybe the simple fix is to comment that call out (and put in
code so that lt_dlinit() is called only once).

What do you think of that last simple idea?

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and
Astronomy, University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________