From: Alan W. I. <ir...@be...> - 2003-02-07 16:40:57
On Fri, 7 Feb 2003, Rafael Laboissiere wrote:

> * Alan W. Irwin <ir...@be...> [2003-02-06 18:52]:
>
> > The most important point, however, is you do see segfaults on your machine
> > so you have confirmed there is a severe problem, and you therefore have
> > something that you can debug for your situation.
> >
> > Good luck in figuring this out!
>
> I think I found the source of the problem, see my last cvs commit.  Could
> you guys confirm that HEAD works for you now?

Yes, with a qualification.  No segfaults for ./x01c with the "Joao"
configuration.  However,

    valgrind --num-callers=100 ./x01c -dev psc -o temp.ps

reports 3 memory management errors (all described as "Conditional jump or
move depends on uninitialised value(s)") having to do with the call to
lt_dlopenext at plcore.c line 1634.

Memory management errors with this description are often non-consequential
because they typically come from code which jumps depending on the truth of
condition1 OR condition2, where condition1 depends on the uninitialized
values and condition2 is true (and thus the jump occurs correctly despite
the memory management error).  To get rid of this valgrind error message
when it occurred directly within PLplot code, I simply used condition2 OR
condition1 so that condition1 was never executed when condition2 was true.

The actual jump with the memory management problem occurs some 20 layers
deeper into libltdl and the libraries it calls, so ordinarily you would
dismiss it as a problem with library code, but I pursued this further and
think it is solely a problem with the xwin device.

First, note that the exact same valgrind test produces no such messages for
the 5.2.0 version with the psc device.  However, *with 5.2.0* the same 3
valgrind error messages are produced using -dev xwin.  So I believe -dev
xwin is just not quite set up correctly to be dynamic, but the rest of the
devices are okay.
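
As an aside, here is a minimal sketch of the condition1 OR condition2
reordering described above.  The function names and the evaluation counter
are hypothetical stand-ins for illustration only; this is not the actual
PLplot code, and it only shows how C's short-circuit evaluation of ||
decides whether condition1 gets evaluated at all.

```c
#include <stdbool.h>

/* Hypothetical counter: how many times condition1 was actually evaluated. */
int cond1_evaluations = 0;

/* Hypothetical stand-in for a test that reads a value which, in the real
 * scenario, might be uninitialised.  Here the caller always passes valid,
 * initialised memory so the sketch itself has defined behaviour. */
bool condition1(const int *maybe_uninit)
{
    cond1_evaluations++;
    return *maybe_uninit != 0;
}

/* Original order: condition1 is always evaluated, so a checker such as
 * valgrind would see the read even when condition2 alone decides the jump. */
bool jump_original(const int *maybe_uninit, bool condition2)
{
    return condition1(maybe_uninit) || condition2;
}

/* Reordered: || short-circuits, so condition1 is skipped entirely
 * whenever condition2 is already true. */
bool jump_reordered(const int *maybe_uninit, bool condition2)
{
    return condition2 || condition1(maybe_uninit);
}
```

Both orderings return the same result; the reordering only avoids the read
in the case where condition2 is true.  When condition2 is false, condition1
is still evaluated, so the reordering does nothing for that case.  None of
this changes the diagnosis above: -dev xwin looks like the only device that
is not set up correctly to be dynamic.
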
The reason I say this is that the new code loops over every device (as a
replacement for drivers.db), and I attribute the 3 valgrind messages I found
above to the xwin device.

Summary: the new code loops over every device so will trigger the complete
collection of valgrind errors for all devices (just xwin in this case) while
the old code just looked at the user-specified device so gets no valgrind
errors (except when the user specified xwin).

So to become valgrind-clean we should do the following:

(1) Figure out what is wrong with the dynamic setup of -dev xwin compared to
*all* the other devices.

(2) Deal with the many memory leaks (valgrind defines these to be unfreed
memory at end of programme).  valgrind found that all the ones specific to
PLplot are generated by the dynamic driver code, and some even had clobbered
pointers by end of programme.  For more details about these problems, please
see the PROBLEMS file, and
http://sourceforge.net/mailarchive/message.php?msg_id=1785861.

(3) Deal with the recent problem Rafael had with using lt_dlclose.  That
problem may automatically get solved once the memory leaks are dealt with.

Alan
__________________________
Alan W. Irwin
email: ir...@be...
phone: 250-727-2902
Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the Canadian Centre for Climate Modelling and
Analysis (www.cccma.bc.ec.gc.ca) and the PLplot scientific plotting software
package (plplot.org).
__________________________

Linux-powered Science
__________________________