From: Alan W. I. <ir...@be...> - 2017-07-24 02:59:23
|
Hi Phil: On 2017-07-23 22:00+0100 Phil Rosenberg wrote: > On 23 July 2017 at 12:27, Alan W. Irwin <ir...@be...> wrote: >> I suggested in my big e-mail to you early this week (Mon, 17 Jul 2017 >> 14:59:00 -0700 (PDT)) concerning your second iteration a method of doing >> this using a list of plstream heap pointers. Please respond to that >> suggestion. > > When you said a list I'm not sure I understood. Do you mean a linked > list rather than an array? No, I was using list in a generic sense so it you store that information (and also access it) in an array form that would be fine with me. So basically the idea is whatever you are doing now in plfreeall for handling plstream heap pointers correctly should also be done in plmalloc, plcalloc, plrealloc, and plfree so we can replace all calls of malloc, calloc, realloc, and free in our code with calls to the pl* variants of those. > >> >> Also please respond to my suggestion (with appropriate code changes >> while I handle the CMake part) in that e-mail concerning >> >> #define PL_IMPLEMENT_EXCEPTION_HANDLING and >> #ifdef PL_IMPLEMENT_EXCEPTION_HANDLING >> Please go back to look at that original e-mail for this sub-topic, but to summarize what I meant is please implement this preprocessor instruction idea for iteration 4, and I will subsequently implement a CMake option to control whether PL_IMPLEMENT_EXCEPTION_HANDLING is #defined or not. >> On 2017-07-22 22:47+0100 Phil Rosenberg wrote: >> >>> [in that same e-mail] You asked why every API call required a PLTRY block. >>> You are correct >>> that actually they don't. Every API call that may result in a call to >>> plexit needs a PLTRY block. >> >> >> I initially didn't understand this explanation, but after _much_ too >> long a time considering various scenarios to explain my point of view >> the light finally dawned. Exception handling does not allow you to >> transfer control to arbitrary places in the code (a stupid assumption >> I was making). Instead, it merely allows you to unwind the call graph >> from the plexit call (in this case that is the only current use of >> PLTHROW in our source code) to return control to a routine potentially >> far up the call stack from there such as _any_ of the public API >> routines (as you said) that have a possible call to plexit somewhere >> lower down on their call graph. > > Yes correct, we just unwind the stack to a certain point in the call > sequence (or whatever terminology you choose as you mentioned in your > ps and pps). This is good, because the user application is still sat > waiting for our API function to return. We don't want to jump to some > specific execution point, we need that function to return and for the > user to find out somehow that there has been an error. Thanks for that confirmation. It is a big relief to me to finally move beyond my previous misconception. > >> >> So sorry for my previous noise on this sub-topic, but >> now we are finally on the same page in this regard, I have some >> further questions and comments about your implementation of >> exception handling and how you are using that exception handling >> to deal with bad errors. >> >> * If you miss putting a PLTRY and PLCATCH block in one of those public >> API cases where a potential call to plexit is somewhere lower on the >> call graph, then what does the current code do? Does it simply >> crash (i.e., is that the potential cause of the serious issues I >> have reported to you in my previous post) or do you handle that >> internal error case by, e.g., emitting an "internal error: unhandled >> exception" message followed by an exit? > > I think that currently something very odd. The value of > pls->currentjumpbuffer used in PLTHROW would be either out of date or > NULL, so probably some sort of crash. This is why it is important that > every API entry function includes a PLTRY, PLCATCH, PLENDTRY sequence. > Of course we could set things up to check for this scenario, but then > what would we do? Call exit()? I had been trying to think if there was > a C method to check at compile time, but I cannot think of one (there > would be ways in C++ I think). But the API changes so infrequently > that I think just ensuring all API functions include the PLTRY, > PLCATCH, PLENDTRY code is probably okay - open to suggestions though. Assuming we develop this proof-of-concept into the real deal and merge this topic to master (and so far I haven't seen any issues in this proof-of-concept that would preclude reaching that eventual result), there will be lots of functions where we need to add PLTRY and PLCATCH blocks. So given that situation where unhandled exceptions may be a fact of life for us for some time (and possibly any time in the future when new topics with e.g., additional externally accessible API are merged to master) I think we should handle this situation with a specific error message and calling exit rather than crashing with, e.g., a segfault. That conclusion is also based on my impression that is exactly how C++ deals with the unhandled exception issue. Furthermore, I think this is fundamentally a more satisfactory result than a crash whenever an unhandled exception occurs (especially if we copy this approach for our other non-plplot C libraries and our C device drivers, see further discussion below of this possibility). I mentioned "lots of functions" above. Note those will not just be limited to the official "c_*" API functions. Instead, our long-standing visibility work has determined all functions that are externally accessed in our tests, and we have effectively labelled all of those with the PLDLLIMPEXP visibility macro that is #defined in include/pldll.h.in. So that means identification of all the functions that need PLTRY/PLCATCH blocks has already been done for us. However, there are a maximum of 326 of them spread over 6 files as revealed by software@raven> grep -l 'PLDLLIMPEXP *' include/*.h* |grep -v pldll include/drivers.h include/ltdl_win32.h include/pdf.h include/plplot.h include/plplotP.h include/plstrm.h software@raven> grep 'PLDLLIMPEXP *' include/*.h* |grep -v pldll |wc -l 326 (Note, I have included drivers.h and pdf.h in the above list because in some configurations all driver code is absorbed into the core PLplot library.) I emphasize you don't have to be too concerned about inserting many more PLTRY/PLCATCH blocks for your proof-of-concept. But assuming the time comes in the relative near future when your C exception handling proof-of-concept has matured sufficiently so you at least want to merge it to the master branch as an option (see discussion above of the PL_IMPLEMENT_EXCEPTION_HANDLING macro) for PLplot users, then potentially there is a lot of editing work here with the prospect of dealing with typographical bugs in that work. Thus, would it be possible to deal with this situation automatically with appropriate macro to generate the code (consisting of a PLTRY block containing an appropriate call to the original renamed function, and empty PLCATCH block and all visibility issues taken care of)? However, if you don't think this automated approach is possible or it is too much work or too error prone, then you can count on me to help you with the required manual editing work for those 300+ functions when the time comes. >> * I also suggest you should implement a PLplot library state variable >> (let's call it plplot_library_status) that is initialized to 0 (good >> status) when the library is initialized but set to 1 (bad status) by >> plexit. Then follow up by implementing a c_plstatus() public API >> that returns the value of library_status and which emits a message >> such as ("library had bad error so it had to be shut down. To use >> it again you need to initialize the library by calling one of the >> plinit family of routines".) Then an external application could call >> c_plstatus to check on the status of the library after any call to >> our public API and act accordingly. >> >> Or it could be sloppy about that (i.e., our standard C examples >> might only call plstatus in one example to make sure it works). In >> which case after the library is shutdown (see above) by an internal >> call to plexit there would be a blizzard of further plabort (due to >> plsc->level being 0) and plexit messages after the first one because >> of that library shutdown. But there would be no call to the dreaded >> exit routine so there is at least the chance that the external >> application will survive that sloppiness. And if plexit does not >> call plend, I think the chances of an external application >> encountering a showstopper error are much worse. > > Hmm, this is quite interesting actually. As I said above I would like > to not totally kill the PLStream - a good example could be that maybe > plexit gets called due to a bad path for a shapefile for a map. This > maybe could be fixed if the calling program checks for an error state. > Of course people are lazy :-) so it might not get checked. In my > initial thinking, the map plot would fail but all other plotting would > carry on. maybe it would be better to set the PLStream to an error > state and have all further API calls fail. Maybe the user could clear > the error state to permit plotting to continue. I welcome thoughts on > this. > > In terms of an error flag that you suggested. I am thinking perhaps > all calls to plerr result in an error message being logged in the > PLStream, plus when we reach the top level PLTRY/PLCATCH/PLENDTRY we > call a user provided error callback. Then we have an API function to > get the list of errors, another to clear the list of errors and one to > check if there are any errors (which would just just check if the > number of error messages in the log is greater than 0). So it is like > your flag, but with some extra detail. I have now thought a bit more about this issue with propagating library error conditions to other C libraries and executables. Because of the #includes that are normally used by all such libraries that call the core PLplot library, these "other libraries and executables" have access to the same PLTRY, PLCATCH, and PLTHROW macros that the core PLplot library does. So I am thinking along the lines of all instances of PLplot calls occurring in our standard examples, e.g., pladv(0); being replaced by pladv(0); if(plstatus()) PLTHROW(NULL); The point is, this is a bit cleaner than pladv(0); if(plstatus()) { fprintf(stderr "There was an error in one of the PLplot library calls\n"); exit(1); } So I hope that first form using PLTHROW is possible in the standard examples. In any case, no matter how you decide to propagate errors across C library boundaries, I think our ultimate goal should be to use exception handling (with at least a PLTRY and PLCATCH block in each of all our C libraries and standard C executables and interface our C exception handling with native language exception handling or C++ device drivers, bindings, and standard examples, and for bindings and examples for all non-C++ languages (e.g., Ada, D, Python, OCaml) which support native exception handling. I do realize I am speculating quite a bit here, but I hope you take it in the intended spirit which is to generate discussion about the absolute best way to propagate library errors across library boundaries considering these future needs. To summarize, I think to help finish this proof-of-concept you need to * fix the bugs I found for iteration 3, * implement the "exit" idea for the case of unhandled exceptions to avoid crashing in that case, * implement the idea concerning the PL_IMPLEMENT_EXCEPTION_HANDLING macro, * finalize how you are going to propagate errors across library boundaries, and * demonstrate a completely unlazy version of one of the C examples which itself has a PLTRY block and PLCATCH block in its main routine and PLTHROWs associated with each PLplot call executed if plstatus() (or whatever you decide to replace it with) indicates a library error condition. And I need to * Test all your remaining iterations until this proof of concept is finished including devising some additional tests with a local patch (obviously not to be committed) to generate lots of calls to plexit through many different potential call stacks to test propagation of errors for most PLplot commands of most examples including the non-lazy one and all the lazy ones. * Implement a commit that removes trailing blanks if you don't do that yourself. * Implement a commit that adds a CMake option to control whether the PL_IMPLEMENT_EXCEPTION_HANDLING C macro is #defined or not. Once all tests are passed with the finished proof-of-concept then we can safely merge this to master (with PL_IMPLEMENT_EXCEPTION_HANDLING defaulting to undefined), and then afterward expand to full PLTRY coverage (automatically or manually implemented) for all externally visible PLplot functions. And at some point introduce completely consistent error handling for all our memory allocation calls whether they correspond to a plstream pointer or not. Once we are completely happy with the exception handling in libplplot and our C examples, we should turn the default option ON so PL_IMPLEMENT_EXCEPTION_HANDLING is #defined by default for most of our users, and as time permits extend exception handling to our other C libraries, our C and C++ device drivers and our bindings and examples for all supported languages that implement their own native exception handling. So it is going to be quite a lot of work to get from here to there. But this is a lofty goal (to eliminate exit completely from our entire code base). And I think the results as we gradually eliminate the exit call from our entire code base (except for internal errors due to exceptions which are not handled by accident). will be well worth it. Also, as we phase in each component of this work, that phase is not and all or nothing proposition since each component can be protected by its own non-default CMake option (just like I am proposing with the PL_IMPLEMENT_EXCEPTION_HANDLING macro) for the proof-of-concept once it is finished. So the work can be considerably spread out over a long time and remain on our master branch during all of that time. > Thanks for the thoughts and comments so far. They are very welcome. And the same goes for your thoughts and comments as well! Furthermore, since the lack of error handling has long been one of the most important fundamental PLplot library issues, it is great to see all your C exception-handler ideas fleshed out and implemented in your current proof-of-concept rather than us having a rather speculative discussion with absolutely no proof-of-concept code to help keep the discussion focussed. Best wishes, Alan __________________________ Alan W. Irwin Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |