From: Alan W. I. <ir...@be...> - 2017-07-16 23:30:24
|
On 2017-07-16 11:43+0100 Phil Rosenberg wrote: > Hi Alan et al > > Please find attached some patches that show some bare bones start to using > setjmp and longjmp exception style error handling in plplot. Please have a > look at the log for a detailed description of what is going on and how it > works. > > What this is: A first start with enough code to show the basic principles > of how this could work. > > What this isn't: A fully working version that should go anywhere near the > actual repo. I have only tested to see if it compiles, builds and that x00c > runs. In fact I get some tcl related exception on exit, which I'm not sure > if I just caused or if it was there before. > > I am sending out this super early untested code that may still have major > bugs and is based on whatever (quite old) plplot source I happened to have > on my machine so that you can see the PRINCIPLES of how it works. And > because I have other stuff on today so I might not get chance to look at > this for a while and I know Alan was keen to see something asap. So feel > free to be critical, but please not too critical :-) > > Alan, have a look, have a play. Thoughts welcome. Hi Phil: Thanks for responding so far beyond my request. I used "git am" apply your first two commits (which should be handled as a separate topic) cleanly to a private topic branch I created here called "memory_allocation". That clean result was based on the tip of the current master branch. I similarly applied your four commits cleanly to a private topic branch which I call "exception_handling". i.e., I assumed for this topic branch the memory_allocation topic would be merged before the exception handling part of your commits, although I doubt that would be actually necessary, i.e., both these topics are completely independent of each other. I extensively tested your "memory_allocation" topic as follows. I configured cmake with the option -DVALGRIND_ALL_TESTS=ON and built the test_c_psc target (which builds all our c standard examples for the psc device and runs them all using valgrind). Those valgrind reports were perfect, i.e., the list of such reports without the phrases "no leaks are possible" and "0 errors from 0 contexts") is empty as can be seen from software@raven> ls examples/*.log examples/valgrind.x00c.psc.log examples/valgrind.x11c.psc.log examples/valgrind.x22c.psc.log examples/valgrind.x01c.psc.log examples/valgrind.x12c.psc.log examples/valgrind.x23c.psc.log examples/valgrind.x02c.psc.log examples/valgrind.x13c.psc.log examples/valgrind.x24c.psc.log examples/valgrind.x03c.psc.log examples/valgrind.x14c.psc.log examples/valgrind.x25c.psc.log examples/valgrind.x04c.psc.log examples/valgrind.x15c.psc.log examples/valgrind.x26c.psc.log examples/valgrind.x05c.psc.log examples/valgrind.x16c.psc.log examples/valgrind.x27c.psc.log examples/valgrind.x06c.psc.log examples/valgrind.x17c.psc.log examples/valgrind.x28c.psc.log examples/valgrind.x07c.psc.log examples/valgrind.x18c.psc.log examples/valgrind.x29c.psc.log examples/valgrind.x08c.psc.log examples/valgrind.x19c.psc.log examples/valgrind.x30c.psc.log examples/valgrind.x09c.psc.log examples/valgrind.x20c.psc.log examples/valgrind.x31c.psc.log examples/valgrind.x10c.psc.log examples/valgrind.x21c.psc.log examples/valgrind.x33c.psc.log and the empty results from software@raven> grep -L "no leaks are possible" examples/valgrind.x??c.psc.log software@raven> grep -L "0 errors from 0 contexts" examples/valgrind.x??c.psc.log Furthermore, I like what you have done with plmalloc, plrealloc, and plfree where if inside those routines malloc, realloc, or free respectively fail, plexit is called with appropriate error message. Assuming, then you want to replace all our current direct calls to malloc, realloc and free with these pl* variants, this solves a major problem we have which was all the wide range of different treatments of how malloc, realloc and free errors were checked, i.e., in the best cases plexit is called, but more usually no error checking is done at all! So this is a most encouraging result. However, there are some issues with this topic branch that I noticed * This approach should be extended to calloc as well. * You are not finished, e.g., I notice many direct calls to malloc still within our code base. * The following two compile warnings need to be addressed: [ 18%] Building C object src/CMakeFiles/plplot.dir/plcore.c.o cd /home/software/plplot/HEAD/build_dir/src && /usr/lib/ccache/cc -DPLPLOT_HAVE_CONFIG_H -DUSINGDLL -Dplplot_EXPORTS -I/home/ software/plplot/HEAD/plplot.git/include -I/home/software/plplot/HEAD/plplot.git/lib/qsastime -I/home/software/plplot/HEAD/buil d_dir -I/home/software/plplot/HEAD/build_dir/include -O3 -fvisibility=hidden -Wuninitialized -fPIC -I/usr/include -o CMake Files/plplot.dir/plcore.c.o -c /home/software/plplot/HEAD/plplot.git/src/plcore.c /home/software/plplot/HEAD/plplot.git/src/plcore.c: In function ‘plinitialisememorylist’: /home/software/plplot/HEAD/plplot.git/src/plcore.c:117:9: warning: ignoring return value of ‘realloc’, declared with attribute warn_unused_result [-Wunused-result] realloc( pls->memorylist, n * sizeof ( void* ) ); ^ /home/software/plplot/HEAD/plplot.git/src/plcore.c: In function ‘plmalloc’: /home/software/plplot/HEAD/plplot.git/src/plcore.c:140:9: warning: ignoring return value of ‘realloc’, declared with attribute warn_unused_result [-Wunused-result] realloc( pls->memorylist, n * sizeof ( void* ) ); ^ * Once you feel you have completed development of this private topic branch, then you should test it like I did above (assuming you have good access to a Linux box or I could do that testing if you like when the time comes), and assuming all the valgrind results are still clean, then you should merge it to the master branch completely independent of what happens with your exception handling topic branch. That finishes the memory allocation topic, and I now want to move on to the exception handling topic. For that topic branch I built the x00c and ps targets OK (except for the memory allocation branchh code warnings I mentioned above), but that result segfaulted at run time, e.g., software@raven> examples/c/x00c -dev psc -o test.psc Segmentation fault Here are the valgrind results for this important memory management issue software@raven> valgrind examples/c/x00c -dev psc -o test.psc ==24594== Memcheck, a memory error detector ==24594== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al. ==24594== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info ==24594== Command: examples/c/x00c -dev psc -o test.psc ==24594== ==24594== Conditional jump or move depends on uninitialised value(s) ==24594== at 0x53F1EE5: longjmp (longjmp.c:32) ==24594== by 0x4E840B0: c_plenv (in /home/software/plplot/HEAD/build_dir/src/libplplot.so.14.0.0) ==24594== by 0x400858: main (in /home/software/plplot/HEAD/build_dir/examples/c/x00c) ==24594== ==24594== Warning: client switching stacks? SP change: 0xffefffdc8 --> 0x3d8d115408920055 ==24594== to suppress, use: --max-stackframe=4435220191945818765 or greater ==24594== Use of uninitialised value of size 8 ==24594== at 0x53F1F52: __longjmp (__longjmp.S:60) ==24594== ==24594== Jump to the invalid address stated on the next line ==24594== at 0x3D8D115408920055: ??? ==24594== Address 0x3d8d115408920055 is not stack'd, malloc'd or (recently) free'd ==24594== ==24594== ==24594== Process terminating with default action of signal 11 (SIGSEGV) ==24594== Bad permissions for mapped region at address 0x3D8D115408920055 ==24594== at 0x3D8D115408920055: ??? ==24594== Use of uninitialised value of size 8 ==24594== at 0x4A236C0: _vgnU_freeres (vg_preloaded.c:58) ==24594== ==24594== Invalid write of size 8 ==24594== at 0x4A236C0: _vgnU_freeres (vg_preloaded.c:58) ==24594== Address 0x3d8d11540892004d is not stack'd, malloc'd or (recently) free'd ==24594== ==24594== ==24594== Process terminating with default action of signal 11 (SIGSEGV) ==24594== General Protection Fault ==24594== at 0x4A236C0: _vgnU_freeres (vg_preloaded.c:58) ==24594== ==24594== HEAP SUMMARY: ==24594== in use at exit: 76,682 bytes in 288 blocks ==24594== total heap usage: 360 allocs, 72 frees, 124,441 bytes allocated ==24594== ==24594== LEAK SUMMARY: ==24594== definitely lost: 0 bytes in 0 blocks ==24594== indirectly lost: 0 bytes in 0 blocks ==24594== possibly lost: 0 bytes in 0 blocks ==24594== still reachable: 76,682 bytes in 288 blocks ==24594== suppressed: 0 bytes in 0 blocks ==24594== Rerun with --leak-check=full to see details of leaked memory ==24594== ==24594== For counts of detected and suppressed errors, rerun with: -v ==24594== Use --track-origins=yes to see where uninitialised values come from ==24594== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 0 from 0) Segmentation fault So you need to address this before we can do any extensive run-time testing of your exception handling ideas. Of course, you did warn us that this code was quite preliminary so issues like above are to be expected. Furthermore, in my view, discussion of real code developers can try for themselves always beats discussion of code that doesn't exist yet. So I very much appreciate you supplying us with this preliminary exception handling code to focus our further discussion. I do have some general points to make from just looking at the broad picture of what you are trying to accomplish here with the exception handling macros PLTRY, PLCATCH, PLENDTRY, and PLTHROW. * There is currently no ability for PLTHROW to propagate an error message. Is that a fundamental limitation of setjmp/longjmp based C exception handling or could that feature be easily implemented? If the latter, then that gives us the option to use PLTHROW wherever errors occur rather than always calling plexit which then prints the error message before it uses PLTHROW. So for example in plmalloc, plcalloc, plrealloc, and plfree you could use PLTHROW directly rather than calling plexit. However, if implementation of handling error messages for PLTHROW is difficult/impossible, then I would be content to call plexit everywhere so that there would only be one PLTHROW (within plexit after printing out the error message) in our code as now on this topic branch. * Can you explain why you have implemented all the PLTRY blocks in src/plvpor.c (with presumably the idea in mind to use PLTRY in a lot more places?) Wouldn't it be better to use just one PLTRY block? For example, could you get away with just that one PLTRY block in the library initialization code? If so, the corresponding PLCATCH block there could write out a message saying you reached that catch block (to confirm that the whole exception handling system is fundamentally working within our C library), and then exit (for now) in just that one place rather than doing nothing about whatever PLplot library error was detected. * I suggest you protect your exception handling code (but not the separate topic branch code concerning pl* versions of malloc, calloc, realloc, and free) with an PL_IMPLEMENT_EXCEPTION_HANDLING macro that is controlled (using #cmakedefine in plplot_config.h.in) by a cmake option in cmake/modules/plplot.cmake of the same name which defaults to OFF and which is self-documented as highly experimental. Once you do this, then it makes it possible to merge this topic quite rapidly to master (to facilitate further collaborative development of our exception handling system) since the user would have to go out of their way to set -DPL_IMPLEMENT_EXCEPTION_HANDLING=ON once they found PL_IMPLEMENT_EXCEPTION_HANDLING in CMakeCache.txt. And that file includes the self-documentation of each option so it should be obvious to such users that turning that option ON is highly experimental. Alan __________________________ Alan W. Irwin Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |