From: Rafael L. <lab...@ps...> - 2003-02-04 21:47:12
|
Preamble
========

The big recent cvs commit regarding the dyndrivers database was at the top of my TODO list. It is a necessary step towards clean configuration/building and also packaging for Debian. I am not yet completely happy with my implementation, but my changes (apparently) did not break PLplot. My simple test script succeeded, at least.

I am committing without previous discussion here because, as Alan likes to say, getting novelties into cvs HEAD is the best way to foster discussions.

The Problem
===========

The way PLplot used to get information about the available devices provided by the dyndrivers was through the DATA_DIR/drivers.db file. This file was generated at configuration time and parsed when the library was initialized. This approach has two drawbacks:

1) Information about the devices is scattered in different places (namely in the driver source file and in configure.ac). This is ugly and may result in an unnecessary maintenance burden.

2) Since the list of available devices is hardcoded in drivers.db, it is almost impossible to do clean packaging of PLplot. In Debian, for instance, packaging is granular in order to reduce the dependencies: plplot-xwin, plplot-tk, plplot-gd, etc. Users can install a subset of the available packages at will. However, they will always get the full list of available devices when plinit is called. That is not a critical problem, but it is annoying.

The Solution
============

I have elaborated a full fix for this problem, but I just committed an intermediary solution for it. Here is how it works:

1) drivers.db does not exist anymore.

2) In each driver file, there is a global declaration like this:

    char* DEVICE_INFO_gd = "jpeg:JPEG file:0:gd:40:jpeg\n"
                           "png:PNG file:0:gd:39:png";

   containing the entries that used to go into drivers.db. If a driver provides more than one device, their entries must be separated by a newline character ('\n').

3) When the library is initialized, if dyndrivers are enabled, the drivers directory is scanned for the *.la files. Each driver found is dlopened and the DEVICE_INFO_<driver> symbol is read and put in a temporary file.

4) This temporary file plays the role of the old drivers.db file.

Notice that I made minimal changes to Geoff's code in plcore.c. In the plInitDispatchTable function, I replaced the initial code (where drivers.db was scanned) by the scanning of the drivers directory described in point 3 above. Also, all references to drivers.db and DRIVERS_DB have disappeared from the sources.

Drawbacks and improvements
==========================

I see two potential problems with my approach:

1) Portability. I am using new libc functions (POSIX tmpfile, opendir, readdir and closedir). Although I am following the recommended procedure found in the Autoconf docs (i.e. using AC_HEADER_DIRENT and a couple of tests with HAVE_DIRENT_H), I am sure that there will be some weird system out there (Macintosh, say) for which my code won't compile. If that happens, we have to port that part of the code.

2) With my approach, I have to open each and every module before using them. This may appear as a regression with respect to the "cache file" approach provided by the use of drivers.db. In terms of performance, with our current computer power, the overhead is negligible.

However, as I wrote above, my original design was much better, but harder to implement (of course). It looks like this:

a) Forget about entries a la drivers.db.

b) In each driver file <driver>.c define symbols DEV_DESC_*, DEV_SEQ_*, DEV_TAG_*, etc. These symbols can be used when filling fields in function plD_dispatch_init_<device>. This would further reduce the maintenance problem due to duplication of information.

c) At build time (not configuration time), a small C program dlopens the <driver>.c files, gets the symbols described above and writes the associated device entries in <driver>.rc (or whichever name).

d) Those <driver>.rc files are installed in the drivers directory, along with the <driver>.la and <driver>.so files.

e) When PLplot is initialized, the <driver>.rc files are scanned.

If I have some time before the 5.2.1 release, I will try to implement this idea.

Postscript
==========

In doing my changes, I noticed that the following drivers have never had entries in drivers.db: plbuf.c and next.c. Does anyone know why? I introduced the DEVICE_INFO_<driver> symbol in all drivers listed in variable EXTRA_LTLIBRARIES of drivers/Makefile.am.

I also noticed that the pstex entry was wrongly written in drivers.db. Apparently, this bug has never bothered our users...

--
Rafael
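To make the entry format in point 2 of "The Solution" concrete, here is a minimal sketch of how a DEVICE_INFO_<driver> string might be split into per-device records. The struct, the function names and the field names (name, description, type, driver, seq, tag) are assumptions inferred from the example entry "jpeg:JPEG file:0:gd:40:jpeg"; they are not taken from plcore.c, which parses a temporary file rather than the string itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record; field names are guesses based on the example
 * entry "jpeg:JPEG file:0:gd:40:jpeg", not on the PLplot sources. */
typedef struct {
    char name[32];         /* device name, e.g. "jpeg"        */
    char description[64];  /* human-readable description      */
    int  type;             /* device type code                */
    char driver[32];       /* driver providing the device     */
    int  seq;              /* sequence number                 */
    char tag[32];          /* symbol tag                      */
} DeviceEntry;

/* Copy at most dstsize-1 characters of src[0..len) into dst. */
static void copy_field(char *dst, size_t dstsize, const char *src, size_t len)
{
    if (len >= dstsize)
        len = dstsize - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Parse one colon-separated entry of the given length.
 * Returns 0 on success, -1 unless there are exactly six fields. */
static int parse_entry(const char *line, size_t len, DeviceEntry *e)
{
    const char *field[6];
    size_t flen[6];
    size_t n = 0, start = 0, i;
    char num[16];

    for (i = 0; i <= len; i++) {
        if (i == len || line[i] == ':') {
            if (n == 6)
                return -1;                 /* too many fields */
            field[n] = line + start;
            flen[n] = i - start;
            n++;
            start = i + 1;
        }
    }
    if (n != 6)
        return -1;                         /* too few fields */

    copy_field(e->name, sizeof e->name, field[0], flen[0]);
    copy_field(e->description, sizeof e->description, field[1], flen[1]);
    copy_field(num, sizeof num, field[2], flen[2]);
    e->type = atoi(num);
    copy_field(e->driver, sizeof e->driver, field[3], flen[3]);
    copy_field(num, sizeof num, field[4], flen[4]);
    e->seq = atoi(num);
    copy_field(e->tag, sizeof e->tag, field[5], flen[5]);
    return 0;
}

/* Split info on '\n' and parse each entry into out[0..max).
 * Returns the number of entries, or -1 on a malformed entry or overflow. */
int parse_device_info(const char *info, DeviceEntry *out, size_t max)
{
    size_t count = 0;
    const char *p = info;

    assert(info != NULL && out != NULL);
    while (*p != '\0') {
        size_t len = strcspn(p, "\n");
        if (len > 0) {
            if (count == max || parse_entry(p, len, &out[count]) != 0)
                return -1;
            count++;
        }
        p += len;
        if (*p == '\n')
            p++;
    }
    return (int) count;
}
```

Calling parse_device_info() on the DEVICE_INFO_gd string shown above would yield two records, one for jpeg and one for png.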
From: Rafael L. <lab...@ps...> - 2003-02-05 11:33:33
|
[ Sorry if this message is duplicated. I sent it originally to Joao only and tried to bounce it from my MUA. Since I did not receive it from the listserver, I am trying again, as a forward. BTW, why are the plplot-devel archives @ SF not accessible?]

* Joao Cardoso <jc...@fe...> [2003-02-05 02:53]:

> That's true, but the target should be to get a full-working 5.2.1, that
> should be, in my opinion, a bug-correcting release. Introducing such stuff
> can compromise it. But we have not defined what 5.2.1 should be, and I'm
> too conservative.

Well, my changes do not introduce new features to PLplot. BTW, they are transparent for the users and only slightly important for the developers (actually, I think my changes improve maintainability).

> I have myself lots of new stuff, and I'm refraining myself to commit them.
> I hope that 5.2.2 or even 5.3.0 follows shortly.

If you are planning to make changes that will either destabilize the code or introduce new features like changes in the API, I think you should refrain until 5.2.1 is out.

> Ah, you use a tmp file just to reuse Geoff's code. But there is no reason to
> not change that also latter, right?

You got it. I have nothing against improving the code and putting all the information in a structure that can also be cached in the <driver>.rc files. The important point here is that we should not have a single file (like drivers.db) that contains the information about all available drivers.

> > 1) Portability. I am using new libc functions (POSIX tmpfile, opendir,
> > readdir and closedir). Although am I following the recommended
> > procedure found in the Autoconf docs (i.e. using AC_HEADER_DIRENT and a
> > couple of tests with HAVE_DIRENT_H), I am sure that there will be some
> > weird system out there (MacIntosh, say) for which my code won't compile.
> > If that happens, we have to port that part of the code.
>
> Don't create the file :)

From the smiley, I guess you are joking. I did not get the joke, though...

> > c) At build time (not configuration time), a small C program dlopen the
> > <driver>.c files,
>
> You mean "open" and not dlopen(), right?

No, I meant dlopen (lt_dlopenext, to be more precise).

> No, you mean dlopen() the <driver>.so (I don't know the current suffix)

Yes, I mistakenly wrote <driver>.c.

> I prefer the full dynamic one, i.e., reading the already build drivers. I
> don't think that performance will suffer.

It is not only a matter of time performance. I was also wondering about the fact that when all the modules are dynamically loaded, all the libraries (tcl, gd, gnome, etc.) will be unnecessarily dynamically linked. I am just speculating about this, though.

> Do you have some figures or only feelings?

I have no figures, but my feeling is that it does not affect performance at all. I have here:

    $ cat /proc/cpuinfo | egrep "^(model n|cpu M|bogo)"
    model name : Pentium III (Coppermine)
    cpu MHz    : 929.842
    bogomips   : 1854.66

> Your second idea implies that the small program that scans the source files
> [..]

No, it would "scan" the <driver>.la files (see above).

> [...] needs information from the configure step to know what drivers are
> desired by the user and supported by the system. This complicates matters.

Yes, the list of dynamic drivers is built at configure time and stored in the variable DYNAMIC_DRIVERS, which is passed to drivers/Makefile.am. However, there will be no further complication in the build process. The following addition to drivers/Makefile.am should do the job:

    drivers_DATA = $(drivers_LTLIBRARIES:.la=.rc)
    %.rc: %.la get_drv_info
            @echo get_drv_info $< $@

where get_drv_info.c is the C program that I mentioned already.

> With the first idea, by contrary, only already build drivers, i.e, user
> desired and system supported, are scanned.

If there is no drawback with this solution, I will adopt it (it is in CVS already). However, I would prefer an "info cached" approach.

> Also, with the second idea I think that the driver-ids (the magic numbers)
> can go away -- ah, no, for historical reasons the xwin driver must be
> number one, the tk driver number 2, etc. hmm, there is another solution for
> this but let discuss it latter.

Please notice that these magic numbers (seqnum) are not so "magic". They only give a hint about how to order the drivers. Think of them as a "priority level". BTW, the gnome driver also has seqnum = 1.

> If performance is really an issue, then one could implements a mixing of
> the two ideas: don't generate drivers.db at configure nor build time. At
> run time if drivers.db does not exists, scan the drivers and build it. This
> way the performance loss will only occurs once. If a new driver is latter
> added to the directory (not probable, but possible, after all this is the
> only advantage of dyndrivers), one could first compare the time-stamp of
> all drivers versus the drivers.db file, and rebuild drivers.db if a driver
> is more recent. Reading time-stamp is fast, I believe.

This is a nice possibility, and maybe the best solution, but someone has to implement it. It adds complexity to the code. Volunteers? ;-)

--
Rafael
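The time-stamp comparison Joao suggests could be sketched roughly as follows, using the same POSIX calls (opendir, readdir, closedir) discussed above plus stat. This is only an illustration under stated assumptions: the function names cache_is_stale() and has_la_suffix(), the fixed path buffer size, and the choice to look only at *.la files are all hypothetical and do not come from the PLplot sources.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Return 1 if name ends in ".la" and has a non-empty stem, else 0. */
int has_la_suffix(const char *name)
{
    size_t len = strlen(name);
    return len > 3 && strcmp(name + len - 3, ".la") == 0;
}

/* Decide whether the cache file must be rebuilt.
 * Returns 1 if the cache is missing or older than some *.la file in
 * drivers_dir, 0 if it is up to date, -1 if drivers_dir cannot be read. */
int cache_is_stale(const char *drivers_dir, const char *cache_path)
{
    struct stat cache_st, drv_st;
    struct dirent *entry;
    DIR *dir;
    char path[4096];
    int stale = 0;
    int n;

    assert(drivers_dir != NULL && cache_path != NULL);

    if (stat(cache_path, &cache_st) != 0)
        return 1;                       /* no cache yet: build it */

    dir = opendir(drivers_dir);
    if (dir == NULL)
        return -1;

    while (!stale && (entry = readdir(dir)) != NULL) {
        if (!has_la_suffix(entry->d_name))
            continue;
        n = snprintf(path, sizeof path, "%s/%s", drivers_dir, entry->d_name);
        if (n < 0 || (size_t) n >= sizeof path)
            continue;                   /* skip over-long paths */
        if (stat(path, &drv_st) == 0 && drv_st.st_mtime > cache_st.st_mtime)
            stale = 1;                  /* a driver is newer than the cache */
    }
    closedir(dir);
    return stale;
}
```

A caller would rebuild the cache whenever cache_is_stale() returns 1. Note that this check only notices drivers that were added or updated; a driver that was removed from the directory leaves the cache looking current, so a real implementation would need an extra test for that case.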
From: Rafael L. <lab...@ps...> - 2003-02-05 12:32:39
|
* Rafael Laboissiere <lab...@ps...> [2003-02-05 12:23]:

> drivers_DATA = $(drivers_LTLIBRARIES:.la=.rc)
> %.rc: %.la get_drv_info
>         @echo get_drv_info $< $@

Please take that "@echo" out. I used it for testing the idea here. It works, anyway.

I guess I am typing too fast and posting without reading what I wrote...

--
Rafael
From: Alan W. I. <ir...@be...> - 2003-02-05 15:01:16
|
On Wed, 5 Feb 2003, Rafael Laboissiere wrote: > > Also, with the second idea I think that the driver-ids (the magic numbers) > > can go away -- ah, no, for historical reasons the xwin driver must be > > number one, the tk driver number 2, etc. hmm, there is another solution for > > this but let discuss it latter. > > Please, notice that these magic number (seqnum) are not so "magic". They > only give a hint about how to order them. Think on them like "priority > level". BTW, the gnome driver also has seqnum = 1. I assume you are talking about your new code here. For the old code the numbers had to be unique (and seqnum was 6 for the gnome driver). Of course, that was a maintenance burden so if they just become non-unique priority numbers now, that is a significant improvement. Alan __________________________ Alan W. Irwin email: ir...@be... phone: 250-727-2902 Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the Canadian Centre for Climate Modelling and Analysis (www.cccma.bc.ec.gc.ca) and the PLplot scientific plotting software package (plplot.org). __________________________ Linux-powered Science __________________________ |
From: Rafael L. <lab...@ps...> - 2003-02-05 15:29:57
|
* Alan W. Irwin <ir...@be...> [2003-02-05 06:59]:

> I assume you are talking about your new code here. For the old code the
> numbers had to be unique

Not necessarily. In the old code, the entries were sorted with qsort on the seq field of the PLDispatchTable entries. This means that, when two drivers claim the same number, of of them will (randomly) get first in the list. However, it is guaranteed that they will appear both before (above) drivers with greater (lesser) sequence numbers. This is way I see this sequence number as a kind of "priority level".

> (and seqnum was 6 for the gnome driver).

Right, I thought I had put "1" in the past.

> Of course, that was a maintenance burden so if they just become non-unique
> priority numbers now, that is a significant improvement.

This is exactly the idea behind my last proposal.

--
Rafael
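The ordering behaviour described here follows from how qsort works: it guarantees ascending order of the key but makes no promise about the relative order of entries whose keys compare equal. A minimal sketch, assuming a simplified stand-in struct (DispatchEntry and sort_by_seq are hypothetical names, not the real PLDispatchTable or plcore.c code):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a dispatch table entry. */
typedef struct {
    const char *name;  /* device name                    */
    int seq;           /* sequence ("priority") number   */
} DispatchEntry;

/* qsort comparator: ascending seq. Entries with equal seq compare as
 * equal, so their relative order after sorting is unspecified. */
static int compare_seq(const void *a, const void *b)
{
    const DispatchEntry *ea = a;
    const DispatchEntry *eb = b;
    return (ea->seq > eb->seq) - (ea->seq < eb->seq);
}

/* Sort the table in place by sequence number. */
void sort_by_seq(DispatchEntry *table, size_t n)
{
    assert(table != NULL || n == 0);
    qsort(table, n, sizeof table[0], compare_seq);
}
```

With entries such as {"psc", 30}, {"other", 1}, {"ps", 29}, {"xwin", 1}, {"tk", 2}, the two entries with seq 1 always end up in the first two slots, in either order, followed by tk, ps and psc. That is the sense in which the number acts as a priority level rather than a unique identifier.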
From: Rafael L. <lab...@ps...> - 2003-02-05 15:32:40
|
[Errata. I am typing too fast...]

* Rafael Laboissiere <lab...@ps...> [2003-02-05 16:19]:

> drivers claim the same number, of of them will (randomly) get first in the
                                 ^^ *one* of them

> This is way I see this sequence number as a kind of "priority level".
          ^^^ *why*

--
Rafael
From: Rafael L. <lab...@ps...> - 2003-02-05 22:37:11
|
* Alan W. Irwin <ir...@be...> [2003-02-05 14:07]:

> Rafael, you should be able to reproduce this bug if you do exactly what I
> did since we have very similar systems. Joao's choice of configure options
> may be important or there may be something left over in your build or
> install location that makes it work on your system but not on ours. But the
> fresh checkout and the above rm -rf should take care of that potential
> difference between us.

I cannot replicate the bug here. I did exactly what you suggested: fresh cvs checkout in an empty dir, ./bootstrap.sh, make, and make install (having run rm -rf on the install destination before). All x??c.c examples compile and work perfectly.

I am puzzled.

--
Rafael
From: Alan W. I. <ir...@be...> - 2003-02-05 23:20:13
|
On Wed, 5 Feb 2003, Rafael Laboissiere wrote:

> * Alan W. Irwin <ir...@be...> [2003-02-05 14:07]:
>
> > Rafael, you should be able to reproduce this bug if you do exactly what I
> > did since we have very similar systems. Joao's choice of configure options
> > may be important or there may be something left over in your build or
> > install location that makes it work on your system but not on ours. But the
> > fresh checkout and the above rm -rf should take care of that potential
> > difference between us.
>
> I cannot replicate the bug here. I did exactly what you suggested: fresh
> cvs checkout in a empty dir, ./bootstrap.sh, make, and make install (haveing
> rm -rf the install destination before). All x??c.c examples compile and
> work perfectly.
>
> I am puzzled.

So am I!

You didn't say so, but I presume you configured with exactly the same options and have the latest stable autotools like Joao and me (not the Debian ones).

Or maybe it is the gcc version?

I have Debian woody

Version: 2:2.95.4-14

If everything is the same between us, then try valgrind on x01c in case there are memory problems that just happen not to have any nasty consequences on your particular machine (memory management bugs often give different results on different machines because they are history dependent).

If valgrind shows no problems at all, then I am really REALLY puzzled. Bitrot has set in, it's the end of the world as I know it, I am going crazy....;-) Or should that be crazier....;-)

Alan
From: Rafael L. <lab...@ps...> - 2003-02-05 23:31:16
|
* Alan W. Irwin <ir...@be...> [2003-02-05 15:18]:

> You didn't say so, but I presume you configured with exactly same options
> and have latest stable autotools like Joao and me (not Debian ones).

I have:

    $ ./bootstrap.sh
    Running aclocal (GNU automake) 1.7.2... done
    Running libtoolize (GNU libtool) 1.4.2a... done
    Running autoheader (GNU Autoconf) 2.57... done
    Running automake (GNU automake) 1.7.2... done
    Running autoconf (GNU Autoconf) 2.57... done

I am not using the latest version of libtool (1.4.3). That may explain the difference. I will try to use 1.4.3, but that is not going to be before this weekend (I have two busy days before me).

> Or maybe it is gcc version?
>
> I have Debian woody
>
> Version: 2:2.95.4-14

I have here:

    Version: 2:2.95.4-17

--
Rafael
From: Joao C. <jc...@fe...> - 2003-02-06 00:03:08
|
On Wednesday 05 February 2003 23:21, Rafael Laboissiere wrote:
> * Alan W. Irwin <ir...@be...> [2003-02-05 15:18]:
> > You didn't say so, but I presume you configured with exactly same options
> > and have latest stable autotools like Joao and me (not Debian ones).
>
> I have :
>
> $ ./bootstrap.sh
> Running aclocal (GNU automake) 1.7.2... done
> Running libtoolize (GNU libtool) 1.4.2a... done
> Running autoheader (GNU Autoconf) 2.57... done
> Running automake (GNU automake) 1.7.2... done
> Running autoconf (GNU Autoconf) 2.57... done
>
> I am not using the latest version of libtool (1.4.3).

I noticed that "file x01c" in the build tree reports an executable ELF, not a script file as is usual with libtool. That's why I make x01c in the install directory, but I still get an executable!

> That may explain the
> difference. I will try to use 1.4.3, but that is not going to be before
> this weekend (I have two busy days before me).
>
> > Or maybe it is gcc version?
> >
> > I have Debian woody
> >
> > Version: 2:2.95.4-14

mine is gcc-3.2

> I have here:
>
> Version: 2:2.95.4-17
From: Alan W. I. <ir...@be...> - 2003-02-06 00:57:17
|
On Thu, 6 Feb 2003, Rafael Laboissiere wrote: > * Alan W. Irwin <ir...@be...> [2003-02-05 15:18]: > > > You didn't say so, but I presume you configured with exactly same options > > and have latest stable autotools like Joao and me (not Debian ones). > > I have : > > $ ./bootstrap.sh > Running aclocal (GNU automake) 1.7.2... done > Running libtoolize (GNU libtool) 1.4.2a... done > Running autoheader (GNU Autoconf) 2.57... done > Running automake (GNU automake) 1.7.2... done > Running autoconf (GNU Autoconf) 2.57... done [then got diverted into version stuff which might be relevant and forgot to mention the configure options]. If you had different configure options than Joao and me but are too tired to re-run the test, could you at least tell us what those configuration options are? I might find they work on my system which would imply the bug is configure option dependent and not autotools version dependent. Joao's further comment about his quite different gcc version seems to take that version dependence out of contention. Alan __________________________ Alan W. Irwin email: ir...@be... phone: 250-727-2902 Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the Canadian Centre for Climate Modelling and Analysis (www.cccma.bc.ec.gc.ca) and the PLplot scientific plotting software package (plplot.org). __________________________ Linux-powered Science __________________________ |
From: Rafael L. <lab...@ps...> - 2003-02-06 08:05:23
|
* Alan W. Irwin <ir...@be...> [2003-02-05 16:56]:

> If you had different configure options than Joao and me but are too tired
> to re-run the test, could you at least tell us what those configuration
> options are? I might find they work on my system which would imply the bug
> is configure option dependent and not autotools version dependent.

Here is exactly what I am doing.

    $ rm -rf plplot
    $ cvs -d rla...@cv...:/cvsroot/plplot co plplot
    $ cd plplot
    $ ./bootstrap.sh
    $ destdir=/var/tmp/plplot
    $ ./configure --prefix=$destdir --disable-tcl --disable-itcl \
        --disable-python --enable-dyndrivers --with-double
    $ rm -rf $destdir
    $ make install
    $ cd examples/c
    $ make x01c
    $ ./x01c -dev xwin

All the C demos compile and work fine here.

I am now using Libtool 1.4.3 from Debian unstable. Please notice that I had to make changes in bootstrap.sh in order to get this version of Libtool working properly with Autoconf 2.57 and Automake 1.7.2 (see my last cvs commit).

If you guys cannot replicate my successful build with the latest version of bootstrap.sh, I will be completely puzzled.

--
Rafael
From: Alan W. I. <ir...@be...> - 2003-02-06 18:10:20
|
On Thu, 6 Feb 2003, Rafael Laboissiere wrote:

> * Alan W. Irwin <ir...@be...> [2003-02-05 16:56]:
>
> > If you had different configure options than Joao and me but are too tired
> > to re-run the test, could you at least tell us what those configuration
> > options are? I might find they work on my system which would imply the bug
> > is configure option dependent and not autotools version dependent.
>
> Here is exactly what I am doing.
>
> $ rm -rf plplot
> $ cvs -d rla...@cv...:/cvsroot/plplot co plplot
> $ cd plplot
> $ ./bootstrap.sh
> $ destdir=/var/tmp/plplot
> $ ./configure --prefix=$destdir --disable-tcl --disable-itcl \
>     --disable-python --enable-dyndrivers --with-double
> $ rm -rf $destdir
> $ make install
> $ cd examples/c
> $ make x01c
> $ ./x01c -dev xwin

I confirm that set of options works, and we are back in a rational universe again after 24 hours where I wasn't sure what kind of universe we were in ....;-) I will need more investigation to see whether Joao's set of options still doesn't work.

There is some ambiguity here because I realize there was a possibility that my old "system" versions of autotools were interfering. (My path pointed to the new versions, but some of the old configuration files might have been interfering.) So the first thing I did was remove those system versions completely.

apt-get --purge remove autoconf libtool
Reading Package Lists... Done
Building Dependency Tree... Done
The following packages will be REMOVED:
  autoconf* autoconf2.13* automake1.5* libtool*
0 packages upgraded, 0 newly installed, 4 to remove and 0 not upgraded.
Need to get 0B of archives. After unpacking 5022kB will be freed.
Do you want to continue? [Y/n]
(Reading database ... 92196 files and directories currently installed.)
Removing automake1.5 ...
Removing libtool ...
Removing autoconf ...
Removing autoconf2.13 ...
dpkg - warning: while removing autoconf2.13, directory `/etc/autoconf2.13' not empty so not removed.
Removing `diversion of /usr/bin/autoconf to /usr/bin/autoconf2.50 by autoconf2.13'
Removing `diversion of /usr/bin/autoheader to /usr/bin/autoheader2.50 by autoconf2.13'
Removing `diversion of /usr/bin/autoreconf to /usr/bin/autoreconf2.50 by autoconf2.13'
Purging configuration files for autoconf2.13 ...
root@starling> ls /etc/autoconf2.13/
root@starling> rmdir /etc/autoconf2.13/

(Note the autoconf2.13 above, which always adds some version uncertainty if you are using the Debian versions of the autotools.) That is why I keep urging Rafael to get rid of his Debian versions of autotools to be consistent with the rest of us, but in this case it apparently does not make a difference because I confirm his result when I am using the latest stable version of autotools from FSF.

Fresh plplot checkout as of 16:46 UT.

./bootstrap.sh '-I /home/software/autotools/install/share/libtool/'
Running aclocal (GNU automake) 1.7.2... done
Running autoheader (GNU Autoconf) 2.57... done
Running automake (GNU automake) 1.7.2...configure.ac: installing `./install-sh'
configure.ac: installing `./mkinstalldirs'
configure.ac: installing `./missing'
configure.ac:456: installing `./config.guess'
configure.ac:456: installing `./config.sub'
configure.ac:456: required file `./ltmain.sh' not found
Makefile.am:25: required directory ./libltdl does not exist
bindings/c++/Makefile.am: installing `./depcomp'
drivers/Makefile.am: installing `./compile'
Makefile.am:25: required directory ./libltdl does not exist
done
Running libtoolize (GNU libtool) 1.4.3... done
Running autoconf (GNU Autoconf) 2.57... done

I think you need quotes around the -I option to bootstrap.sh as above (contrary to your commit message). Otherwise, the $1 in bootstrap.sh will just catch the -I and miss the rest.

rm -rf /usr/local/plplot_at
./configure --prefix=/usr/local/plplot_at --disable-tcl --disable-itcl \
  --disable-python --enable-dyndrivers --with-double > & configure.out
make >& make.out
make install >& make_install.out

All the *.out files looked fine.

cd /usr/local/plplot_at/lib/plplot5.2.0/examples
cd c ; make ; cd c++ ; make ; cd f77 ; make ; cd ..
./plplot-test.sh
Plplot library version: 5.2.0
Opened x01c.ps
Opened x02c.ps
Opened x03c.ps
Opened x04c.ps
Opened x05c.ps
Opened x06c.ps
Opened x07c.ps
Opened x08c.ps
Opened x09c.ps
Opened x10c.ps
Opened x11c.ps
Opened x12c.ps
Opened x13c.ps
Opened x15c.ps
Opened x16c.ps
Opened x18c.ps
Opened x19c.ps
Opened x01cc.ps
Opened x01f.ps
Opened x02f.ps
Opened x03f.ps
Opened x04f.ps
Opened x05f.ps
Opened x06f.ps
Opened x07f.ps
Opened x08f.ps
Opened x09f.ps
Opened x10f.ps
Opened x11f.ps
Opened x12f.ps
Opened x13f.ps
Opened x16f.ps

All these results are identical with the plplot-5.2.0 results.

Joao, do you confirm this set of options is working for you as well (after removing all traces of old autotools)?

Next, I will try Joao's set of options starting from a fresh checkout.

Alan
From: Alan W. I. <ir...@be...> - 2003-02-06 18:38:14
|
On Thu, 6 Feb 2003, Alan W. Irwin wrote: > Next, I will try Joao's set of options starting from fresh checkout. Everything done identically from fresh checkout except for ./configure --prefix=/usr/local/plplot_at --enable-octave \ --enable-dyndrivers --disable-static --with-double > & configure.out AND THE ANSWER IS.... immediate segfault from x01c and x08c (I didn't bother to try anything else). So we really do live in a rational universe. The problem Joao found, and I confirmed yesterday still exists, and it depends on the exact configure options above and disappears if you use Rafael's options. Rafael, do you confirm this? Joao, I still think it is important for you to confirm that Rafael's exact set of options works on your system so we know we are all on the same page. Alan __________________________ Alan W. Irwin email: ir...@be... phone: 250-727-2902 Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the Canadian Centre for Climate Modelling and Analysis (www.cccma.bc.ec.gc.ca) and the PLplot scientific plotting software package (plplot.org). __________________________ Linux-powered Science __________________________ |
From: Alan W. I. <ir...@be...> - 2003-02-06 18:47:07
|
P.S. the only options that should affect the C examples are --enable-dyndrivers --with-double (which are common to Rafael and Joao's options) and --disable-static (which Joao has and Rafael does not). Thus, I bet there is some problem with the present dynamic drivers configuration or implementation that is incompatible with --disable-static. I hope that strong clue allows Rafael to find and fix the problem without too much further effort. Alan __________________________ Alan W. Irwin email: ir...@be... phone: 250-727-2902 Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the Canadian Centre for Climate Modelling and Analysis (www.cccma.bc.ec.gc.ca) and the PLplot scientific plotting software package (plplot.org). __________________________ Linux-powered Science __________________________ |
From: Rafael L. <lab...@ps...> - 2003-02-06 23:01:14
|
* Alan W. Irwin <ir...@be...> [2003-02-06 10:45]: > P.S. the only options that should affect the C examples are > --enable-dyndrivers --with-double (which are common to Rafael and Joao's > options) and --disable-static (which Joao has and Rafael does not). > > Thus, I bet there is some problem with the present dynamic drivers > configuration or implementation that is incompatible with --disable-static. > I hope that strong clue allows Rafael to find and fix the problem without > too much further effort. Well, when I use: ./configure --prefix=/var/tmp/plplot --enable-octave --enable-dyndrivers \ --disable-static --with-double \ --disable-python --disable-tcl --disable-itcl then everything works fine. Notice that the --disable-static option has been included above. Weird. -- Rafael |
From: Rafael L. <lab...@ps...> - 2003-02-06 23:39:46
|
* Rafael Laboissiere <lab...@ps...> [2003-02-06 23:50]:

> * Alan W. Irwin <ir...@be...> [2003-02-06 10:45]:
>
> > P.S. the only options that should affect the C examples are
> > --enable-dyndrivers --with-double (which are common to Rafael and Joao's
> > options) and --disable-static (which Joao has and Rafael does not).
> >
> > Thus, I bet there is some problem with the present dynamic drivers
> > configuration or implementation that is incompatible with --disable-static.
> > I hope that strong clue allows Rafael to find and fix the problem without
> > too much further effort.
>
> Well, when I use:
>
> ./configure --prefix=/var/tmp/plplot --enable-octave --enable-dyndrivers \
>     --disable-static --with-double \
>     --disable-python --disable-tcl --disable-itcl
>
> then everything works fine. Notice that the --disable-static option has
> been included above.

I investigated this further and I think that you are following the wrong trail with --disable-static. Also, your assumption that the only options that affect the C examples are --enable-dyndrivers, --with-double, and --disable-static is wrong. The option --disable-tcl *does* affect the C examples.

This is the minimal pair that I found:

C examples segfault:

    ./configure --enable-dyndrivers

C examples work fine:

    ./configure --enable-dyndrivers --disable-tcl

I think I am getting close to the source of the problem, and it has nothing to do with --disable-static.

Alan, could you please confirm the minimal pair above?

--
Rafael
From: Alan W. I. <ir...@be...> - 2003-02-07 02:53:28
|
On Fri, 7 Feb 2003, Rafael Laboissiere wrote: > > This is the minimal pair that I found: > > C examples segfault: > ./configure --enable-dyndrivers > > C examples work fine: > ./configure --enable-dyndrivers --disable-tcl > > Alan, could you please confirm the minimal pair above? Not precisely. ./configure --enable-dyndrivers --prefix=blah gives me *no* segfault for ./x01c -dev xwin. However, when I ran x01c with valgrind, I did get lots of "invalid reads from memory" messages and an eventual segfault. The most important point, however, is you do see segfaults on your machine so you have confirmed there is a severe problem, and you therefore have something that you can debug for your situation. Good luck in figuring this out! Alan __________________________ Alan W. Irwin email: ir...@be... phone: 250-727-2902 Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the Canadian Centre for Climate Modelling and Analysis (www.cccma.bc.ec.gc.ca) and the PLplot scientific plotting software package (plplot.org). __________________________ Linux-powered Science __________________________ |
From: Rafael L. <lab...@ps...> - 2003-02-07 08:41:01
|
* Alan W. Irwin <ir...@be...> [2003-02-06 18:52]: > The most important point, however, is you do see segfaults on your machine > so you have confirmed there is a severe problem, and you therefore have > something that you can debug for your situation. > > Good luck in figuring this out! I think I found the source of the problem, see my last cvs commit. Could you guys confirm that HEAD works for you now? -- Rafael |
From: <jc...@fe...> - 2003-02-07 14:18:50
|
On Friday 07 February 2003 08:30, Rafael Laboissiere wrote:
| * Alan W. Irwin <ir...@be...> [2003-02-06 18:52]:
| > The most important point, however, is you do see segfaults on your
| > machine so you have confirmed there is a severe problem, and you
| > therefore have something that you can debug for your situation.
| >
| > Good luck in figuring this out!
|
| I think I found the source of the problem, see my last cvs commit.
| Could you guys confirm that HEAD works for you now?

Yes,

[jcard@feup] ./bootstrap.sh
Running aclocal (GNU automake) 1.7.2... done
Running autoheader (GNU Autoconf) 2.57... done
Running automake (GNU automake) 1.7.2... done
Running libtoolize (GNU libtool) 1.4.3... done
Running autoconf (GNU Autoconf) 2.57... done

./configure --enable-octave --enable-dyndrivers --disable-static
--with-double --prefix=/usr/local/test

And after install x01c works.

Joao
From: Rafael L. <lab...@ps...> - 2003-02-07 14:49:08
|
* João Cardoso <jc...@fe...> [2003-02-07 14:21]: > Yes, > > [jcard@feup] ./bootstrap.sh > Running aclocal (GNU automake) 1.7.2... done > Running autoheader (GNU Autoconf) 2.57... done > Running automake (GNU automake) 1.7.2... done > Running libtoolize (GNU libtool) 1.4.3... done > Running autoconf (GNU Autoconf) 2.57... done > > ./configure --enable-octave --enable-dyndrivers --disable-static > --with-double --prefix=/usr/local/test > > And after install x01c works. Uff... I am relieved. I can go out for the weekend knowing that HEAD is not in a completely bad shape. The memory management problems are still there, though... -- Rafael |
From: <jc...@fe...> - 2003-02-07 14:49:20
|
On Wednesday 05 February 2003 16:55, João Cardoso wrote:
| On Wednesday 05 February 2003 13:55, Rafael Laboissiere wrote:
| | * Rafael Laboissiere <lab...@ps...> [2003-02-04 22:37]:
| | > c) At build time (not configuration time), a small C program
| | > dlopen the <driver>.c files, get the symbols described above and
| | > write the associated device entries in <driver>.rc (or whichever
| | > name).
| |
| | Just to make things a bit more concrete, here is the design that I
| | have in mind. I will take the ps.c driver as an example. The
| | following global variables will be defined in the driver source:
| |
| |     char* DEVICES_ps = "ps:psc";
| |
| |     char* DESCRIPTION_ps = "PostScript File (monochrome)";
| |     int TYPE_ps = plDevType_FileOriented;
| |     int SEQNUM_ps = 29;
| |     char* SYMTAG_ps = "psm";
| |
| |     char* DESCRIPTION_psc = "PostScript File (color)";
| |     int TYPE_psc = plDevType_FileOriented;
| |     int SEQNUM_psc = 30;
| |     char* SYMTAG_psc = "psc";
| |
| | Of course, these variables should be used in the
| | plD_dispatch_init_* functions. In the case of ps.c:
| |
| |     ps_dispatch_init_helper( pdt,
| |         DESCRIPTION_ps, "ps",
| |         TYPE_ps, SEQNUM_ps,
| |         (plD_init_fp) plD_init_psm );
| |
| |     ps_dispatch_init_helper( pdt,
| |         DESCRIPTION_psc, "psc",
| |         TYPE_psc, SEQNUM_psc,
| |         (plD_init_fp) plD_init_psc );
| |
| | This will ensure that things are not defined twice in different
| | places (like with the current DEVICE_INFO_* variable).
| |
| | The small C program that will generate the <driver>.rc file from
| | the <driver>.la file, would do: (1) dlopen the module; (2) get the
| | DEVICES_* symbol and parse its device components (in the ps.c
| | example, that would be "ps" and "psc"); (3) for each one of the
| | devices found, get the symbols DESCRIPTION_<dev>, TYPE_<dev>,
| | SEQNUM_<dev>, and SYMTAG_<dev>; (4) create the temp file that is
| | parsed by Geoff's code. (This could be improved along Joao's
| | suggestion of creating a structure instead of writing the
| | information in a temp file.)
| |
| | We have to decide first whether it is not worth abandoning the
| | current approach and adopting this "cached info" approach above.

Now that HEAD is working again, let's continue.

I measured the time that your code in plInitDispatchTable() takes to load
all drivers, and it is only 20ms (wall clock) for all 33 drivers, on a
P3@700MHz (with a small but continuous load of 1). It looks like it is
pretty efficient.

Your concern that dlopen() would load all libraries that a driver needs
does not apply, as this would only happen if some driver code is
executed, which is not the case. As libtool's info says:

    "Unresolved symbols in the module are resolved using
    its dependency libraries"

As the symbols you are looking for are in the modules, no further
loading will occur.

Thus, given that the current implementation is efficient enough, avoids
the need of another program to build the drivers.rc file, is fully
dynamic because if more drivers are added to the directory they will be
recognized without further intervention,... let's keep it as is.

Joao

| I decided to give it a try but got a seg fault:
|
| [jcard@feup] gdb x01c
| (gdb) run -dev xwin
| Starting program: /usr/local/test/lib/plplot5.2.0/examples/c/x01c -dev xwin
|
| Program received signal SIGSEGV, Segmentation fault.
| 0x0806a46e in lt_dlsym (handle=0x807b0e0,
|     symbol=0xbfffef50 "DEVICE_INFO_pstex") at ltdl.c:3330
| 3330 lensym = LT_STRLEN (symbol) + LT_STRLEN (handle->loader->sym_prefix)
| (gdb) where
| #0 0x0806a46e in lt_dlsym (handle=0x807b0e0,
|     symbol=0xbfffef50 "DEVICE_INFO_pstex") at ltdl.c:3330
| #1 0x08053d38 in plInitDispatchTable () at plcore.c:1638
| #2 0x08049b9d in plMergeOpts (options=0x8073500,
|     name=0x11 <Address 0x11 out of bounds>, notes=0x11) at plargs.c:699
| #3 0x08049362 in main ()
| #4 0x400674a2 in __libc_start_main () from /lib/libc.so.6
|
| | There is also Joao's suggestion for dynamically building the
| | drivers.db, but I am too lazy to implement that (and I am not sure
| | it is a superior approach).
| |
| | What do you think? I accept suggestions for better variable names.
|
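Steps (2) and (3) of the proposal quoted in this message (split the DEVICES_<driver> list into device names, then form the per-device symbol names to look up) can be sketched in C roughly as follows. This is only an illustration under stated assumptions: split_devices, make_symbol, MAX_DEVS and NAME_LEN are hypothetical names that do not come from the PLplot sources, and the actual dlopen/dlsym lookup is left out.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DEVS 8    /* hypothetical limit on devices per driver */
#define NAME_LEN 32   /* hypothetical limit on a device name, including NUL */

/* Split a colon-separated device list such as "ps:psc" into names.
 * Empty or over-long components are skipped.
 * Returns the number of names stored (at most max). */
static int split_devices(const char *list, char names[][NAME_LEN], int max)
{
    int count = 0;

    while (*list != '\0' && count < max) {
        size_t len = strcspn(list, ":");   /* length up to next ':' or end */

        if (len > 0 && len < NAME_LEN) {
            memcpy(names[count], list, len);
            names[count][len] = '\0';
            count++;
        }
        list += len;
        if (*list == ':')
            list++;                        /* skip the separator */
    }
    return count;
}

/* Build a per-device symbol name, e.g. ("SEQNUM", "psc") -> "SEQNUM_psc".
 * Returns 0 on success, -1 if the buffer is too small. */
static int make_symbol(char *buf, size_t size, const char *prefix,
                       const char *dev)
{
    int n = snprintf(buf, size, "%s_%s", prefix, dev);

    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

A generator along these lines would call split_devices on the string read from DEVICES_ps and then make_symbol for each of DESCRIPTION, TYPE, SEQNUM and SYMTAG before looking the resulting names up in the module.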
From: Rafael L. <lab...@ps...> - 2003-02-08 21:38:43
|
* João Cardoso <jc...@fe...> [2003-02-07 14:51]:

> I measured the time that your code in plInitDispatchTable() takes to load
> all drivers, and it is only 20ms (wall clock) for all 33 drivers, on a
> P3@700MHz (with a small but continuous load of 1). It looks like it is
> pretty efficient.
>
> Your concern that dlopen() would load all libraries that a driver needs
> does not apply, as this would only happen if some driver code is
> executed, which is not the case. As libtool's info says:
>
>     "Unresolved symbols in the module are resolved using
>     its dependency libraries"
>
> As the symbols you are looking for are in the modules, no further
> loading will occur.
>
> Thus, given that the current implementation is efficient enough, avoids
> the need of another program to build the drivers.rc file, is fully
> dynamic because if more drivers are added to the directory they will be
> recognized without further intervention,... let's keep it as is.

Thanks for addressing those two points which I was too lazy to evaluate
(efficiency and loading of libraries). Although I am not totally happy with
the current implementation, I will keep it as is for now.

I am pretty convinced that the memory management problems that you had when
using lt_dlclose come only from the libltdl code (use of a private realloc
along with the system's malloc). This has been discussed on the libtool
mailing list and fixed in libtool's cvs tree. There is good news here:
version 1.5 is coming soon:

    http://mail.gnu.org/archive/html/libtool/2003-02/msg00014.html

Next step: Debian packages.

--
Rafael
|
From: Rafael L. <lab...@ps...> - 2003-02-10 07:24:04
|
* Maurice LeBrun <mj...@ga...> [2003-02-09 21:13]:

> I've not been following the discussion very closely, but does this mean that
> the code currently loads all dynamic drivers at startup time? If true, this
> seems against the spirit of dynamic drivers, and adds unnecessarily to the
> memory used by the application.

This is what I thought at the beginning and this is why I proposed the
alternative design with the <driver>.rc files. However, here is what Joao
wrote some days ago:

* João Cardoso <jc...@fe...> [2003-02-07 14:51]:

> Your concern that dlopen() would load all libraries that a driver needs
> does not apply, as this would only happen if some driver code is
> executed, which is not the case. As libtool's info says:
>
>     "Unresolved symbols in the module are resolved using
>     its dependency libraries"
>
> As the symbols you are looking for are in the modules, no further
> loading will occur.

I hope that this is true for all architectures. At any rate, as Alan
pointed out, in the current design the driver modules should be dlclosed
after the plD_DEVICE_INFO_<driver> variable is obtained.

--
Rafael
|
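The sequence discussed in this message (open the module, read the info variable, close the module again) might look roughly like the sketch below. It assumes plain POSIX dlopen/dlsym/dlclose as a stand-in for the libltdl calls (lt_dlopenext, lt_dlsym, lt_dlclose) mentioned in the thread, and read_device_info is a hypothetical helper, not a PLplot function. The string is copied before the module is closed because the original lives inside the module.

```c
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated copy of the string held in the global
 * `char *` variable named `symbol` inside the module at `module_path`,
 * or NULL on any failure. The caller frees the result. */
static char *read_device_info(const char *module_path, const char *symbol)
{
    void *handle;
    char **info;
    char *copy = NULL;

    handle = dlopen(module_path, RTLD_LAZY);
    if (handle == NULL)
        return NULL;                    /* module missing or not loadable */

    /* The symbol is a global `char *`, so dlsym yields a pointer to it. */
    info = (char **) dlsym(handle, symbol);
    if (info != NULL && *info != NULL) {
        copy = malloc(strlen(*info) + 1);
        if (copy != NULL)
            strcpy(copy, *info);        /* copy while the module is loaded */
    }

    dlclose(handle);                    /* the copy remains valid afterwards */
    return copy;
}
```

A call such as read_device_info(path, "plD_DEVICE_INFO_gd") would then return the entries for the gd driver, assuming the module exports that symbol. On systems with an older glibc the program has to be linked with -ldl.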
From: Alan W. I. <ir...@be...> - 2003-02-07 16:40:57
|
On Fri, 7 Feb 2003, Rafael Laboissiere wrote:

> * Alan W. Irwin <ir...@be...> [2003-02-06 18:52]:
>
> > The most important point, however, is you do see segfaults on your machine
> > so you have confirmed there is a severe problem, and you therefore have
> > something that you can debug for your situation.
> >
> > Good luck in figuring this out!
>
> I think I found the source of the problem, see my last cvs commit. Could
> you guys confirm that HEAD works for you now?

Yes with a qualification. No segfaults for ./x01c with the "Joao"
configuration. However,

valgrind --num-callers=100 ./x01c -dev psc -o temp.ps

reports 3 memory management errors (all described as "Conditional jump or
move depends on uninitialised value(s)") having to do with the call to
lt_dlopenext at plcore.c line 1634.

Memory management errors with this description are often non-consequential
because they typically come from code which jumps depending on the truth of
condition1 OR condition2 where condition1 depends on the uninitialized
values and condition2 is true (and thus the jump occurs correctly despite
the memory management error). To get rid of this valgrind error message when
it occurred directly within PLplot code, I simply used condition2 OR
condition1 so that condition1 was never executed when condition2 was true.

The actual jump with the memory management problem occurs some 20 layers
deeper into libltdl and the libraries it calls so ordinarily you would
dismiss it as a problem with library code, but I pursued this further and
think it is solely a problem with the xwin device.

First, note that the exact same valgrind test produces no such messages for
the 5.2.0 version with the psc device. However, *with 5.2.0* the same 3
valgrind error messages are produced using -dev xwin. So I believe -dev
xwin is just not quite set up correctly to be dynamic, but the rest of the
devices are okay. The reason I say this is the new code loops over every
device (as a replacement for drivers.db) and I attribute the 3 valgrind
messages I found above to the xwin device.

Summary: the new code loops over every device so will trigger the complete
collection of valgrind errors for all devices (just xwin in this case) while
the old code just looked at the user-specified device so gets no valgrind
errors (except when the user specified xwin).

So to become valgrind-clean we should do the following:

(1) Figure out what is wrong with the dynamic setup of -dev xwin compared to
*all* the other devices.

(2) Deal with the many memory leaks (valgrind defines these to be unfreed
memory at end of programme). valgrind found that all the ones specific to
PLplot are generated by the dynamic driver code, and some even had clobbered
pointers by end of programme. For more details about these problems, please
see the PROBLEMS file, and
http://sourceforge.net/mailarchive/message.php?msg_id=1785861.

(3) Deal with the recent problem Rafael had with using lt_dlclose. That
problem may automatically get solved once the memory leaks are dealt with.

Alan
|
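The "condition2 OR condition1" reordering described in this message relies on C's short-circuit evaluation of ||. A minimal sketch, assuming a hypothetical struct reading and function should_skip that are not taken from the PLplot sources:

```c
#include <stdbool.h>

/* Hypothetical record: `value` is only meaningful when `has_value` is true. */
struct reading {
    bool has_value;   /* always initialised */
    int  value;       /* may be left unset when has_value is false */
};

/* The always-initialised flag is tested first. Because || short-circuits,
 * `value` is never read when the flag alone decides the outcome, so no
 * branch depends on an uninitialised value. */
static bool should_skip(const struct reading *r)
{
    return !r->has_value || r->value < 0;
}
```

With the operands the other way round (r->value < 0 || !r->has_value) the result would be the same whenever has_value is false, but the possibly unset value would be read first, which is the kind of access valgrind reports.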