From: Alan W. I. <ir...@be...> - 2016-02-28 03:10:41
On 2016-02-27 23:22-0000 Phil Rosenberg wrote:

> I do agree that memory allocations should not cause exits. [...] A failure of a single very large allocation does not always imply the OS is in crisis.

I now agree.

> Yes I'm happy to start putting an exception based error propagation together.

Thanks very much for being willing to actively lead this effort.

> Perhaps it would be good for Hazen and Alan to start looking in more detail at implementing the error code propagation? If there are tools to help with that, then maybe it will be easier than I suspect.

I am now confident of the return code method because of the 'warn_unused_result' gcc attribute capability, but let's not even start the substantial work required for this alternative unless it is made necessary because you run into some showstopper issue with the C exception-based alternative.

> I started putting a plmalloc and similar set of functions together today. I'll drop those round for comments asap.

Good idea, and I look forward to seeing what you come up with for the design of plmalloc and similar. Also, I would be happy to help propagate the use of plmalloc and similar throughout our code base immediately in this release cycle thus effectively getting rid of all those cases where we have previously been lazy about checking for memory allocation errors.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________
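Editorial illustration: the thread never shows the design of plmalloc, so the following is only a guess at what "keeping track of memory and avoiding leaks" could look like in C. Every name here (demo_plmalloc, demo_plfree, demo_plfree_all, demo_live_count) is hypothetical and is not taken from PLplot. The sketch records each allocation in a linked list so that a failed allocation returns NULL instead of exiting, and so that anything still outstanding can be released in one call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tracker node: one per live allocation (not PLplot code). */
typedef struct demo_node
{
    void             *ptr;
    struct demo_node *next;
} demo_node;

static demo_node *demo_head = NULL; /* list of live allocations */
static size_t    demo_live  = 0;    /* number of live allocations */

/* Allocate and record a block; returns NULL on failure instead of exiting.
 * A zero-size request may also yield NULL, depending on the C library. */
void *demo_plmalloc( size_t size )
{
    demo_node *node = malloc( sizeof ( *node ) );
    if ( node == NULL )
        return NULL;
    node->ptr = malloc( size );
    if ( node->ptr == NULL )
    {
        free( node );
        return NULL;
    }
    node->next = demo_head;
    demo_head  = node;
    demo_live++;
    return node->ptr;
}

/* Release one tracked block; an unknown pointer trips an assertion. */
void demo_plfree( void *ptr )
{
    demo_node **link = &demo_head;
    if ( ptr == NULL )
        return;
    for (; *link != NULL; link = &( *link )->next )
    {
        if ( ( *link )->ptr == ptr )
        {
            demo_node *dead = *link;
            *link = dead->next;
            free( dead->ptr );
            free( dead );
            demo_live--;
            return;
        }
    }
    assert( 0 && "pointer was not allocated by demo_plmalloc" );
}

/* Release everything still tracked, e.g. after an error aborts a call. */
void demo_plfree_all( void )
{
    while ( demo_head != NULL )
        demo_plfree( demo_head->ptr );
}

/* Number of blocks currently tracked; useful for leak checks in tests. */
size_t demo_live_count( void )
{
    return demo_live;
}
```

The static list is not thread safe, which matters given the thread-safety goal discussed in this thread; a real design would presumably hang the list off a per-stream or per-context structure.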
From: Phil R. <p.d...@gm...> - 2016-02-27 23:23:02
Hi Alan, Hazen and anyone else interested.

I do agree that memory allocations should not cause exits. I think these cases might be less rare than you think, Alan. I work with some very large datasets and have hit memory allocation failures moderately often, although maybe that was more when I was using older versions of Visual Studio and it built 32-bit executables as standard. A failure of a single very large allocation does not always imply the OS is in crisis.

Yes I'm happy to start putting an exception based error propagation together. Perhaps it would be good for Hazen and Alan to start looking in more detail at implementing the error code propagation? If there are tools to help with that, then maybe it will be easier than I suspect. If both methods are being implemented and one of them hits a big obstacle, then we still have another option.

I started putting a plmalloc and similar set of functions together today. I'll drop those round for comments asap. Even if we don't go down an exceptions route they should help with generally keeping track of memory and avoiding leaks.

Phil

-----Original Message-----
From: "Alan W. Irwin" <ir...@be...>
Sent: 27/02/2016 22:47
To: "Hazen Babcock" <hba...@ma...>; "Phil Rosenberg" <p.d...@gm...>; "PLplot development list" <Plp...@li...>
Subject: Re: [Plplot-devel] Error report system plus thread safety

To Hazen and Phil:

Here is important additional information concerning the return code method of implementing an error report system that should make that method very much easier to maintain and test.

Phil originally expressed concern about the amount of developer time it would take to implement AND maintain propagation of return codes through all relevant caller paths, and he also proved my proposed method of testing potentially would not cover all caller paths.
From: Alan W. I. <ir...@be...> - 2016-02-27 22:47:45
To Hazen and Phil:

Here is important additional information concerning the return code method of implementing an error report system that should make that method very much easier to maintain and test.

Phil originally expressed concern about the amount of developer time it would take to implement AND maintain propagation of return codes through all relevant caller paths, and he also proved my proposed method of testing potentially would not cover all caller paths. That combined with the degree of complexity in the caller paths illustrated by the doxygen results was also making me concerned about PLplot using the return code method of error reporting.

However, I just discovered a method with gcc that would make the implementation and maintenance for that method a lot less of a burden. That method is as follows (see discussion in <http://stackoverflow.com/questions/2042780/how-to-raise-warning-if-return-value-is-disregarded-gcc-or-static-code-check> where the focus is on C++, but the method also works for C):

The method uses the 'warn_unused_result' gcc attribute. A complication is that we already use the 'visibility( "default" )' attribute for some gcc cases in include/pldll.h(.in), and combinations of attributes must be done with a special attribute list syntax. So the bottom line is both the 'warn_unused_result' and 'visibility( "default" )' attributes would have to be combined appropriately in include/pldll.h(.in) to form the various PLDLLIMPEXP results there. And that file would likely also be the best place to implement the following

#if defined ( __GNUC__ )
  #define PL_WARN_UNUSED __attribute__((warn_unused_result))
#else
  #define PL_WARN_UNUSED
#endif

(Note that the PL_WARN_UNUSED macro would be needed for all functions (e.g., static ones) that don't have some form of PLDLLIMPEXP already deployed in the PLplot headers.) So, for example, a static function would be declared as

static PLINT PL_WARN_UNUSED pl<function>

Anyhow, assuming all this 'warn_unused_result' attribute infrastructure had been set up, then gcc should warn whenever an attempt is made by C/C++ code that calls a PLplot function to ignore the return code set by that function. So this gcc 'warn_unused_result' attribute would be an enormous help in finding and especially testing and maintaining all caller paths that are relevant to return code propagation.

Anyhow, because of this (very) recent discovery I am once again confident of going ahead with an implementation of the return code method of error reporting (including even memory allocation issues since the cost of those has suddenly gone down) early in the next release cycle, and it appears Hazen is ready to help with that implementation as well.

At the same time I have nothing against the alternative setjmp/longjmp C exception handling error reporting mechanism favored by Phil except that I don't completely understand it. But I would be willing to follow where Phil actively leads on that so let's hear from Phil about whether his current time constraints allow him to actively lead (i.e., doing some active coding and testing rather than making suggestions from the side) such a development before we make the final decision about which error reporting mechanism to use.

Alan
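Editorial illustration of the message above: a minimal sketch of how the PL_WARN_UNUSED macro might be used, assuming gcc or a compatible compiler. Only PL_WARN_UNUSED, PLINT and the two attribute names come from the message; DEMO_EXPORT_CHECKED, the error codes and the demo_* functions are hypothetical names invented here, and the PLINT typedef is a stand-in for the one in the PLplot headers. The combined macro shows the comma-separated attribute list form that gcc accepts.

```c
#include <assert.h>

#if defined ( __GNUC__ )
  /* Macro as proposed in the message. */
  #define PL_WARN_UNUSED         __attribute__( ( warn_unused_result ) )
  /* Hypothetical combined form using a comma-separated attribute list. */
  #define DEMO_EXPORT_CHECKED    __attribute__( ( visibility( "default" ), warn_unused_result ) )
#else
  #define PL_WARN_UNUSED
  #define DEMO_EXPORT_CHECKED
#endif

typedef int PLINT; /* stand-in; the real typedef lives in the PLplot headers */

/* Hypothetical error codes, for illustration only. */
enum { DEMO_OK = 0, DEMO_ERR_RANGE = 1 };

/* Hypothetical internal helper that reports failure through its return code. */
static PLINT PL_WARN_UNUSED demo_check_width( PLINT width )
{
    return ( width < 0 ) ? DEMO_ERR_RANGE : DEMO_OK;
}

/* Hypothetical exported function that propagates the helper's code upward.
 * Writing "demo_check_width( width );" here and discarding the result
 * should make gcc emit an unused-result warning. */
DEMO_EXPORT_CHECKED PLINT demo_set_width( PLINT width )
{
    PLINT rc = demo_check_width( width );
    if ( rc != DEMO_OK )
        return rc;
    /* ... the width would be stored here ... */
    return DEMO_OK;
}

/* Both outcomes are consumed rather than ignored, so no warning is expected. */
void demo_selfcheck( void )
{
    assert( demo_set_width( 2 ) == DEMO_OK );
    assert( demo_set_width( -1 ) == DEMO_ERR_RANGE );
}
```

On compilers that do not define __GNUC__ both macros expand to nothing, so the code still builds but the unused-result check is simply not performed.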
From: Alan W. I. <ir...@be...> - 2016-02-27 20:30:50
Hi Chris:

For a bit of background information for my response to your comment below note that my counts show that 62 per cent of current plexit calls are associated with memory allocation errors, but I believe that fraction will quickly exceed 90 per cent once we become rigorous about checking for memory allocation issues. So what we do with the memory allocation errors potentially could matter a lot in terms of implementation/maintenance costs for our error report system.

On 2016-02-27 10:02-0500 Chris Marshall wrote:

> The problem with immediate exit on memory errors is that they cannot be caught. For example, trying to make a perl binding to plplot would essentially crash perl if the plplot did an exit.

Understood.

Memory allocation fails only because there are PLplot bugs in the size of memory requested or because the system is out of memory. Out of memory is clearly an emergency situation where the operating system is likely shutting down various programs arbitrarily in any case to try to claw back some memory. So under these emergency conditions an immediate exit from PLplot (and the calling environment like perl) is not a particularly bad tactic. And such out-of-memory emergency conditions are quite rare in any case.

On the other hand, if we do make memory allocation errors part of our error report system, it does give at least some chance (if the OS does not kill the perl session first) for the user to save work in an out-of-memory emergency and if the issue is due to a PLplot size bug, it gives them (and us) a chance to replicate the issue and debug it further.

So my conclusion is there is a modest benefit to making memory allocation errors part of the error report system, but that benefit has to be weighed against the implementation/maintenance costs.

From what Phil has said, if we can get the C setjmp/longjmp exception handling method of error reporting to work then including memory allocation errors as part of that report system would be fairly trivial so it would be a no-brainer to do that. But if we decide to use the return code method of error reporting the costs of including memory allocation errors in the system would be significantly higher (significantly more independent caller paths would have to be dealt with) so we would have to weigh those costs against the modest (in my view) benefit discussed above.

Alan
From: Chris M. <dev...@gm...> - 2016-02-27 15:02:18
On 2/24/2016 15:34, Alan W. Irwin wrote:
> On 2016-02-24 06:52-0500 Hazen Babcock wrote:
>
> [Alan]
>>> Anyhow, I am convinced by my above estimate that the propagation part of the work will require similar effort to the rest of the project, I do like the simplicity of the return code method that has proved to be so useful in the ephcom case, and I think the above testing method will insure the whole effort will give reliable return code results. Also, I think we are talking a few weeks of one-man effort rather than months to get this done completely. Therefore, I stand willing (early in the next release cycle) to make this happen. Of course, many hands make light work so I would welcome some help with this project.
>
> [Hazen]
>> I agree with your time estimate. I also agree that the return codes approach is the right way to do it, and even if it does end up taking longer than expected it will be more than worth the effort. So I'd be happy to help. I guess we'd start with a header file containing a list of possible error codes, and then split up the editing of the various files in the src/ directory between those who volunteer?
>
> Hi Hazen:
>
> I think that would be a good approach, and thanks very much for volunteering to help with this important project.
>
> Note, I have recently argued for an immediate exit approach for memory allocation errors which would vastly simplify what we had to do. But if you feel strongly that even those errors that only appear for emergency conditions should be included in the error reporting system, I would be willing to go along with that.

The problem with immediate exit on memory errors is that they cannot be caught. For example, trying to make a perl binding to plplot would essentially crash perl if the plplot did an exit. In FreeGLUT, the fix was to allow an error handler to be called rather than exit. At that point a python/perl/... binding could handle the problem.

One example of a use case is when you are running an interactive session in the scripting language. Arguably, running a routine in the shell shouldn't kill the shell.

--Chris

> Also, as noted before I won't have time to work on this project until the current release is out the door (still something like a month or so away). However, if you do have time to work on this now, please just go ahead on a private topic branch rather than waiting for me. Right after the current release is out the door you could push your matured topic branch to master or else if it is not matured by then, I would be willing and highly motivated to add a lot of energy to the project at that stage with the goal of getting this topic matured quickly so we can push it to master not too long after the current release is done.
>
> Alan
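Editorial illustration of the handler idea Chris describes: a minimal sketch, in C, of a library that calls a user-installed error handler instead of exiting. None of these names come from PLplot or FreeGLUT; demo_set_error_handler, demo_report_fatal and demo_binding_handler are hypothetical and exist only to show the shape of the mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical handler signature: receives an error code and a message. */
typedef void ( *demo_error_handler_t )( int errcode, const char *msg );

static demo_error_handler_t demo_handler = NULL; /* NULL means "exit as before" */

/* Install (or clear, with NULL) the handler; a binding would call this once. */
void demo_set_error_handler( demo_error_handler_t handler )
{
    demo_handler = handler;
}

/* Stand-in for the library's fatal-error routine. */
void demo_report_fatal( int errcode, const char *msg )
{
    assert( msg != NULL );
    if ( demo_handler != NULL )
    {
        /* Give the caller (e.g. a perl or python binding) the chance to
         * turn the error into something its own runtime can handle. */
        demo_handler( errcode, msg );
        return;
    }
    /* No handler installed: keep the old behaviour. */
    fprintf( stderr, "fatal error %d: %s\n", errcode, msg );
    exit( EXIT_FAILURE );
}

/* Example handler a binding might install: it just records the code. */
int demo_last_code = 0;

void demo_binding_handler( int errcode, const char *msg )
{
    (void) msg; /* unused in this sketch */
    demo_last_code = errcode;
}
```

One design point this sketch leaves open: after the handler returns, the library code that detected the error is still on the call stack, so the library still needs a way to unwind (return codes or a longjmp) before control gets back to the user.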
From: Alan W. I. <ir...@be...> - 2016-02-27 06:26:13
On 2016-02-26 22:45-0500 Hazen Babcock wrote:

> On 02/26/2016 05:18 PM, Alan W. Irwin wrote:
[...]
>> But I have to admit you [Phil] have already made a fairly convincing argument for error reporting via C exception handling, and if you look at the caller graphs I prepared there are some instances (even if we have immediate exit for memory emergencies) where the return value approach would be quite complicated.
>
> Such as?

Hi Hazen:

To give you an immediate fairly complex example look at the call graph and caller graph of grline in <http://plplot.sourceforge.net/doxygen/html/plcore_8c.html>. The caller graph includes some 50 functions while its call graph includes calls to plexit and one of those calls in plsave_set_locale is not anything to do with memory allocation so there is no question we would have to deal with it in our error report system.

So that is already a pretty complicated example, but I just discovered that is just scratching the surface because the large call/caller graphs generated by doxygen are partitioned (as indicated by the red box around some references) so a given caller graph being presented on a single plot is in complex cases a subset of the complete result. For example, the caller graph for plsave_set_locale (in <http://plplot.sourceforge.net/doxygen/html/plctrl_8c.html> this time) does not include all the complications of the grline caller graph. Instead, it summarizes that 50-function caller graph as a red box around grline with no further details. So I think to get the full picture for either a large call graph or caller graph you have to click on all the red boxes.

Anyhow, please browse and click on these graphs including the red boxes to draw your own conclusions about the complexity of the call graphs that the return code approach for error reporting would have to deal with.

I will stop there because I don't have the technical expertise to answer the concerns you expressed about using C exception handling (implemented with setjmp/longjmp) as the complete basis of the error reporting system for PLplot that Phil apparently has in mind. However, I assume Phil will answer your concerns.

Alan
From: Hazen B. <hba...@ma...> - 2016-02-27 03:45:42
On 02/26/2016 05:18 PM, Alan W. Irwin wrote:
> @Hazen:
>
> Please pay close attention to this ongoing discussion and participate in it further because I don't want you to start work on the return value approach for reporting errors like you have already volunteered to do just in case the (likely) decision is made that we go with error reporting based on C exception handling instead.

Don't worry, I promise not to start any time soon..

> On 2016-02-26 17:34-0000 Phil Rosenberg wrote:
>
> [Phil]
>>>> I am a fairly recent convert to exceptions, but I now use them throughout my C++ code, because using exceptions totally frees me from having to worry about errors. I don't have to worry about checking return values or anything like that, I am free to concentrate on the interesting aspects of the code writing, rather than getting bogged down in error checking and avoiding memory leaks. But more importantly, if used in well styled code, exceptions are bomb proof, they actually make it impossible to miss or fail to deal with an error.
>
> [Alan]
>>> Is this possibility only available in C++? If so, it does not help us with our core C library which does need an error reporting system now. We do not want to use the possibility that we _might_ move to C++ in the future for our core library as an excuse for inaction on this important topic for our current core C library.
>
> [Phil]
>> Regarding this point - C can do similar things using the setjmp()/longjmp() function and a resource pool/tracker. It is not as simple to implement as in C++, but I still think it would be much easier to implement and much easier to support than trying to propagate error codes through our internals. I still strongly recommend a read of that book chapter and if you ever feel like experimenting with C++ that book is excellent, although a bit out of date now we have C++11 and beyond.
>>
>> It just dawned on me that we could simply put longjmp(errcode); inside plexit() and then all we need to do is add the code to catch this in the API functions and the job is complete.
>>
>> Anyway please just have a read about exceptions before we make this call.
>
> @Phil:
>
> I promise to do that later today. But I have to admit you have already made a fairly convincing argument for error reporting via C exception handling, and if you look at the caller graphs I prepared there are some instances (even if we have immediate exit for memory emergencies) where the return value approach would be quite complicated.

Such as?

> Furthermore, from my skimming of <https://en.wikipedia.org/wiki/Setjmp.h> it appears that your preferred setjmp()/longjmp() approach is usable for C exception handling just like you have claimed above.

I still favor the return value based approach. If one was to start again from scratch this is the way (I think) the library would be written, so I don't see why it would be done differently now. Yes it will probably be more work to implement, but I don't think it will be particularly hard to maintain as a compiler like gcc is more than happy to warn you if you are ignoring returned values. And yes the exit will not be clean, as there is likely to be memory that was allocated but not freed. However the possibility exists with this approach of going back in the future and cleaning all of that up, something which I don't think is possible with a setjmp() approach.

Though I guess we could start with the setjmp() approach as a possibly quicker way to make the API change and then go back and clean up the internals at our leisure to use a return based approach.

-Hazen
From: Alan W. I. <ir...@be...> - 2016-02-26 22:18:34
@Hazen:

Please pay close attention to this ongoing discussion and participate in it further because I don't want you to start work on the return value approach for reporting errors like you have already volunteered to do just in case the (likely) decision is made that we go with error reporting based on C exception handling instead.

More below.

On 2016-02-26 17:34-0000 Phil Rosenberg wrote:

[Phil]
>>> I am a fairly recent convert to exceptions, but I now use them throughout my C++ code, because using exceptions totally frees me from having to worry about errors. I don't have to worry about checking return values or anything like that, I am free to concentrate on the interesting aspects of the code writing, rather than getting bogged down in error checking and avoiding memory leaks. But more importantly, if used in well styled code, exceptions are bomb proof, they actually make it impossible to miss or fail to deal with an error.

[Alan]
>> Is this possibility only available in C++? If so, it does not help us with our core C library which does need an error reporting system now. We do not want to use the possibility that we _might_ move to C++ in the future for our core library as an excuse for inaction on this important topic for our current core C library.

[Phil]
> Regarding this point - C can do similar things using the setjmp()/longjmp() function and a resource pool/tracker. It is not as simple to implement as in C++, but I still think it would be much easier to implement and much easier to support than trying to propagate error codes through our internals. I still strongly recommend a read of that book chapter and if you ever feel like experimenting with C++ that book is excellent, although a bit out of date now we have C++11 and beyond.
>
> It just dawned on me that we could simply put longjmp(errcode); inside plexit() and then all we need to do is add the code to catch this in the API functions and the job is complete.
>
> Anyway please just have a read about exceptions before we make this call.

@Phil:

I promise to do that later today. But I have to admit you have already made a fairly convincing argument for error reporting via C exception handling, and if you look at the caller graphs I prepared there are some instances (even if we have immediate exit for memory emergencies) where the return value approach would be quite complicated. Furthermore, from my skimming of <https://en.wikipedia.org/wiki/Setjmp.h> it appears that your preferred setjmp()/longjmp() approach is usable for C exception handling just like you have claimed above.

So the current status is both Hazen and I are keen to see a proper error report system for PLplot implemented soon. To the point where we were both willing to implement that with the return code method early in the next release cycle. But you have made some good arguments that that primitive but easy-to-understand method should not be used, and that C exception handling using setjmp()/longjmp() should be used instead as the basis for the error report system.

I cannot speak for Hazen, but the major difficulty I have with the setjmp()/longjmp() method is I don't understand exactly how it should be implemented. So would you be willing to implement what you had in mind now on a private topic branch with the goal of merging it into master early in the next release cycle? (Or even in this release cycle if you can get something matured within the next few weeks.) Of course, if some part of that task was just routine editing, I would be willing to help with that, but it sounds like you might have something in mind that hardly requires much editing at all.

Note, also that once you had a brand-new exception-based error report system matured and merged to master, I think you would pick up a lot of volunteers (I would be one of them) to go through our code with a fine-toothed comb looking for other errors we should be reporting (after every memory allocation, file manipulation, etc.) with that error report system.

Alan
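Editorial illustration of the setjmp()/longjmp() idea discussed above: a minimal sketch, not Phil's actual design, of an internal error routine that jumps back to the API boundary instead of exiting. All names (demo_plexit, demo_internal_draw, demo_api_line and the static variables) are hypothetical. Note that the standard longjmp() takes two arguments, a jmp_buf and an int, so the error code is stashed in a variable here and retrieved after the jump.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical library-wide state; static data like this is not thread safe. */
static jmp_buf demo_env;            /* where to resume when an error occurs */
static int     demo_env_active = 0; /* nonzero while an API call is running */
static int     demo_last_error = 0; /* code recorded just before the jump */

/* Stand-in for plexit(): jump to the API boundary if one is set up,
 * otherwise fall back to exiting. */
static void demo_plexit( const char *msg, int errcode )
{
    assert( errcode != 0 ); /* zero is reserved for success */
    fprintf( stderr, "demo error %d: %s\n", errcode, msg );
    if ( demo_env_active )
    {
        demo_last_error = errcode;
        longjmp( demo_env, 1 );
    }
    exit( EXIT_FAILURE );
}

/* Stand-in for deeply nested internal code that detects an error. */
static void demo_internal_draw( int npoints )
{
    if ( npoints < 0 )
        demo_plexit( "negative point count", 2 );
    /* ... normal drawing work would happen here ... */
}

/* Stand-in for a public API function: the only place that "catches". */
int demo_api_line( int npoints )
{
    if ( setjmp( demo_env ) != 0 )
    {
        /* Reached only via longjmp() from demo_plexit(). */
        demo_env_active = 0;
        return demo_last_error;
    }
    demo_env_active = 1;
    demo_internal_draw( npoints );
    demo_env_active = 0;
    return 0;
}
```

Two limits of this sketch are worth keeping in mind. Memory allocated between setjmp() and longjmp() is not released by the jump, which is why Phil pairs the idea with a resource pool/tracker. And a single static jmp_buf would be overwritten if one API function called another, so a real implementation would need to handle nesting.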
From: Alan W. I. <ir...@be...> - 2016-02-26 21:34:07
|
On 2016-02-26 17:34-0000 Phil Rosenberg wrote: > We have discussed a few API breaking changes recently. Error > reporting, thread safety, C++ API changes and the Fortran API changes > are pretty close to complete by the seems of things. > > Is it time to bring up the possibility of PLPlot 6 again and how we > intend to manage that? Hi Phil: Good question concerning PLplot 6 which deserves its own subject line. I have noticed there are some large free software projects (e.g., Linux kernel, octave) where they simply make cumulative backwards incompatible changes that are more or less well documented and leave it at that. And that is the model we are currently following right now with PLplot. From my perspective, the big advantage of this model for handling backward-incompatible change is it allows users to slowly adjust to such changes as they occur rather than the alternative of us maintaining strict backwards compatibility at the expense of accumulating huge cruft, and then getting rid of all such cruft in one (likely buggy since so much change is involved) "major release" change that everybody hates! For more on the Linux kernel release history see <https://en.wikipedia.org/wiki/Linux_kernel#Maintenance>. You can see there that they ran to very high "patch" numbers in the 2.6.<patch> series before transitioning to 3.0, and also fairly high "minor" numbers in 3.<minor> before moving to 4.0. In fact, IIRC, there was essentially no difference between 2.6.39 and 3.0.0. In other words the only reason for bumping the major number at that stage was the kernel developers were getting tired of the high patch numbers for 2.6.<patch>, and also I believe they were concerned those "patch" releases were much more than simple bug fixing releases, and moving to kernel 3.0 allowed them to use the patch number for such ("bug-fixing only") releases. 
The turnover from Linux kernel 3 to 4 had similar "against high version number" motivations rather than introducing a major new feature (see <http://www.zdnet.com/article/linux-kernel-turns-over-release-odometer-to-4-0/>). Actually, PLplot is also following a similar model right now where the release numbers are 5.<minor>.<patch> where I bump just the patch number if there is mostly just bug fixing in the release (e.g., the release of 5.11.1), but I bump the minor number and zero the patch number (e.g., the forthcoming 5.12.0) when there are major developments (e.g., the new Fortran binding) in the release. If we decide to continue this model, then following what was done for the Linux kernel, PLplot 6.0.0 might not be anything special other than a desire on our part to start the minor numbers at zero again. But in my book "12" is not that high, and I would only start worrying about PLplot 6 once we get to PLplot-5.20 or so which will likely be quite a long time from now. In sum, version numbers are fairly arbitrary and each project tends to interpret them in various ways. The above is my interpretation of major, minor, and patch in the PLplot-<major>.<minor>.patch that I release, and I intend to stick with that model. So I plan to use a new minor release version when we introduce the change to no static variables and/or the introduction of an error reporting system. But at some point someone else may take over as PLplot release manager, and they may have their own strong ideas on how to interpret major, minor, and patch in our release numbers. 
All the above being said, I think regardless of the philosophy of the release manager we should always make it as easy as possible for our users to stick with us through our PLplot-<major>.<minor>.<patch> releases so we should never introduce backwards-incompatible API changes in those releases unless there is a compelling and well-documented motivation (e.g., the cruft-reduction and better consistency of the present Fortran binding changes that I discuss in the 5.12.0 README.release file that is currently being prepared). Alan __________________________ Alan W. Irwin Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |
From: Phil R. <p.d...@gm...> - 2016-02-26 17:34:54
|
We have discussed a few API breaking changes recently. Error reporting, thread safety, C++ API changes and the Fortran API changes are pretty close to complete by the seems of things. Is it time to bring up the possibility of PLPlot 6 again and how we intend to manage that? Alan, there are some further points below > >> I am a fairly recent convert to exceptions, but I now use them >> throughout my C++ code, because using exceptions totally frees me from >> having to worry about errors. I don't have to worry about checking >> return values or anything like that, I am free to concentrate on the >> interesting aspects of the code writing, rather than getting bogged >> down in error checking and avoiding memory leaks. But more >> importantly, if used in well styled code, exceptions are bomb proof, >> they actually make it impossible to miss or fail to deal with an >> error. > > > Is this possibility only available in C++? If so, it does not help us > with our core C library which does need an error reporting system now. > We do not want to use the possibility that we _might_ move to C++ > in the future for our core library as an excuse for inaction on this > important topic for our current core C library. > Regarding this point - C can do similar things using the setjmp()/longjmp() function and a resource pool/tracker. It is not as simple to implement as in C++, but I still think it would be much easier to implement and much easier to support than trying to propagate error codes through our internals. I still strongly recommend a read of that book chapter and if you ever feel like experimenting with C++ that book is excellent, although a bit out of date now we have C++11 and beyond. It just dawned on me that we could simply put longjmp(errcode); inside plexit() and then all we need to do is add the code to catch this in the API functions and the job is complete. Anyway please just have a read about exceptions before we make this call. 
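The "longjmp inside plexit(), catch in the API functions" idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: every name prefixed with sketch_ is hypothetical and not part of PLplot, the standard longjmp() takes a jmp_buf as its first argument (so the real call needs one more argument than the shorthand above), and the resource pool/tracker needed to free memory after a jump is left out entirely.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* All names below are hypothetical; this is not PLplot code. */
static jmp_buf sketch_env;        /* jump target set at the API boundary */
static int     sketch_armed;      /* nonzero while an API call is active */
static int     sketch_last_error; /* code recorded by the failing routine */

/* Stand-in for plexit(): report, record the code, and unwind to the
 * API boundary.  If no API call is active, fall back to exiting. */
static void sketch_plexit(int errcode, const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    sketch_last_error = errcode;
    if (sketch_armed)
        longjmp(sketch_env, 1);
    exit(EXIT_FAILURE);
}

/* An internal routine somewhere down the call chain; it returns nothing
 * and its callers need no error checks. */
static void sketch_internal(int fail)
{
    if (fail)
        sketch_plexit(2, "sketch_internal: simulated failure");
}

/* A public API function: the only place that has to deal with errors.
 * Nested API calls are not handled in this sketch. */
int sketch_api_call(int fail)
{
    if (setjmp(sketch_env) != 0) {
        /* Arrived here via longjmp() from sketch_plexit(). */
        sketch_armed = 0;
        return sketch_last_error;
    }
    sketch_armed = 1;
    sketch_internal(fail);
    sketch_armed = 0;
    return 0;
}
```

The error code is kept in a separate variable rather than taken from the value of setjmp(), because the C standard only allows setjmp() in a few simple expression contexts such as the controlling expression of an if statement.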
>However, it should be noted that memory allocation errors (for >debugged code that calls malloc et al. correctly) are essentially >always a sign of an emergency condition where you have run out of >memory due to some process leaking memory like crazy. And remember >this is an extremely rare emergency condition in any case. For >example, in my decades of experience with PLplot I have _never_ seen a >plexit message concerning memory allocation failing. So my conclusion >is an immediate exit is unlikely to irritate users in this >one particular memory allocation error case. I would suggest that we should report back on memory allocation failures rather than exit. Even with a memory allocation failure, someone using plplot in an application (e.g. in GDL) may be able to offer their user a final chance to save their work, or may be able to free some memory. I'm not sure I have ever hit a memory allocation error within plplot, but I certainly regularly hit memory allocation errors during data analysis in general. Phil |
From: Alan W. I. <ir...@be...> - 2016-02-25 02:20:30
|
Hi Hazen: The propagation issue can be restated as follows: For all functions where plexit is currently called (except possibly the memory allocation issues which we might handle differently) you need to propagate the return code backwards through the "caller graph". The doxygen application uses the terminology "caller graph" for the call graph _to_ a given function and simply uses "call graph" for the call graph _from_ a given function so I adopt that terminology here. Note, however, that the wikipedia article <https://en.wikipedia.org/wiki/Call_graph> on call graphs doesn't distinguish those two cases. Anyhow, it turns out that doxygen can generate both call graphs and caller graphs so I configured (commit b5d5b01) that by setting HAVE_DOT, CALL_GRAPH and CALLER_GRAPH to YES in doc/Doxyfile.in; installed the graphviz and doxygen software on my Debian system; configured cmake with -DBUILD_DOX_DOC=ON; successfully built the build_doxygen target; and uploaded the html results generated by that target to our website. Please check out those doxygen-generated results by looking at <http://plplot.sourceforge.net/doxygen/html/plcore_8c.html> which documents src/plcore.c and associated source files. The caller graphs are the ones that are relevant to the present issue of propagating return codes. So to take one example, assume in the static void function plSelectDev that is defined in plcore.c you have just replaced plexit( "plSelectDev: Too many tries." ); by return plerror_return(PL_TOO_MANY_TRIES, "plSelectDev: Too many tries."); where PL_TOO_MANY_TRIES is the return code you have chosen to use to identify this particular error, plerror_return simply writes the message to stderr and returns the specified error code, and you have also modified the type of plSelectDev so it can return a value, and in particular return a 0 value on success. Now you want to propagate the case when the return code of plSelectDev is non-zero. 
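A minimal sketch of the replacement just described, assuming a simplified stand-in for plSelectDev. PL_TOO_MANY_TRIES and plerror_return are the names used above; the numeric value, the tries/limit parameters and the caller are invented for illustration and do not reflect the real plcore.c code.

```c
#include <stdio.h>

/* The name comes from the message above; the value is an assumption. */
#define PL_TOO_MANY_TRIES 1

/* Write the message to stderr and hand back the specified error code. */
static int plerror_return(int code, const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    return code;
}

/* Simplified stand-in for plSelectDev: formerly void, now returns 0 on
 * success.  The parameters are hypothetical. */
static int sketch_plSelectDev(int tries, int limit)
{
    if (tries > limit)
        return plerror_return(PL_TOO_MANY_TRIES,
                              "plSelectDev: Too many tries.");
    return 0;
}

/* Hypothetical caller: each function in the caller graph needs a check
 * like this one so the code travels back towards the public API. */
int sketch_caller(int tries)
{
    int rc = sketch_plSelectDev(tries, 10);
    if (rc != 0)
        return rc;          /* propagate the error unchanged */
    /* ... normal work would continue here ... */
    return 0;
}
```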
For that purpose, the caller graph for plSelectDev can be found in <http://plplot.sourceforge.net/doxygen/html/plcore_8c.html>, and that immediately shows that you have to be concerned about propagation to just five other functions. That visual result is much more convenient than grepping through the source code to find out this information so I think these type of doxygen results are going to be an enormous help to us. One minor caveat is my favorite "konqueror" browser refuses to horizontally scroll the wide caller graphs so is largely unusable for browsing these results. Fortunately, I do have access to iceweasel (the Debian brand for the firefox browser) and that has no trouble with horizontal scrolling or other rendering issues for these results so you might want to try firefox there if you are having any rendering trouble with your favorite browser (if that is different than firefox). Alan __________________________ Alan W. Irwin Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |
From: Alan W. I. <ir...@be...> - 2016-02-24 20:34:48
|
On 2016-02-24 06:52-0500 Hazen Babcock wrote: [Alan] >> Anyhow, I am convinced by my above estimate that the propagation part >> of the work will require similar effort to the rest of the project, I >> do like the simplicity of the return code method that has proved to be >> so useful in the ephcom case, and I think the above testing method >> will ensure the whole effort will give reliable return code results. >> Also, I think we are talking a few weeks of one-man effort rather than >> months to get this done completely. Therefore, I stand willing (early >> in the next release cycle) to make this happen. Of course, many hands >> make light work so I would welcome some help with this project. > [Hazen] > I agree with your time estimate. I also agree that the return codes > approach is the right way to do it, and even if it does end up taking > longer than expected it will be more than worth the effort. So I'd be > happy to help. I guess we'd start with a header file containing a list > of possible error codes, and then split up the editing of the various > files in the src/ directory between those who volunteer? Hi Hazen: I think that would be a good approach, and thanks very much for volunteering to help with this important project. Note, I have recently argued for an immediate exit approach for memory allocation errors which would vastly simplify what we had to do. But if you feel strongly that even those errors that only appear for emergency conditions should be included in the error reporting system, I would be willing to go along with that. Also, as noted before I won't have time to work on this project until the current release is out the door (still something like a month or so away). However, if you do have time to work on this now, please just go ahead on a private topic branch rather than waiting for me. 
Right after the current release is out the door you could push your matured topic branch to master or else if it is not matured by then, I would be willing and highly motivated to add a lot of energy to the project at that stage with the goal of getting this topic matured quickly so we can push it to master not too long after the current release is done. Alan __________________________ Alan W. Irwin Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |
From: Alan W. I. <ir...@be...> - 2016-02-24 19:40:36
|
On 2016-02-24 10:33-0000 Phil Rosenberg wrote: [Alan] >> I believe there are some 300 functions in our public and private API, >> and not all of those will be affected by this (because there are no >> error conditions they generate or they need to propagate). So I would >> estimate "hundreds" of functions rather than "thousands" would be >> touched by this propagation effort. [Phil] > Note that some functions are called tens of times within plplot, and > with an error code return mechanism it is each call that needs a > check. That point is correct. So it is possible our C examples might never exercise a particular call chain that needs checking, or that call chain might be there to be checked, but it could be masked in tests because it is always called after a different call chain to the same function in every one of our C examples. Nevertheless, our first line of defence is due diligence with grep which should easily find all instances where a particular routine is being called within our code base. So combining that with the test I described would likely catch the vast majority of errors in the error reporting. Therefore, I think this approach is already far better than "good enough". In any case this is a large change for users since before they could (largely) rely on PLplot to exit when there were errors and now suddenly the action to take concerning errors is on their shoulders instead. So we should write up this change extensively in the release notes including a section about how to report errors in the report system itself where there is an initial error message followed either by no non-zero return code or else some other PLplot crash before control is returned to the calling programme. But again, I think because of the due diligence noted above this would be quite a rare case for our users to contend with. > Also some functions return values; this would need > restructuring with the potential for introducing bugs - possibly very > hard to find bugs. 
I agree this is an issue that will need to be addressed _if_ there exists a PLplot function that currently returns a value and that function also has potential error conditions to report or propagate. However, this is likely a small problem (or may turn out to not be a problem at all) because the number of PLplot functions with return values now is quite small. >> Also, I think the editing task for propagating return codes would be a >> similar effort to the editing task for replacing the call to plexit by >> the generation of the equivalent message to stderr and an immediate >> return with non-zero return code. Note, the number of plexit calls is >> 220, > By contrast with an exception model we have less than 1 change to make > per plexit call. These can almost be done by find and replace. You may > wonder why this is less than. I imagine almost all of those are memory > allocation failures. If we create a plmalloc function which allocates > and tracks memory then this function can generate the exception if the > memory allocation fails. This removes all requirements to even check > whether a memory allocation has succeeded. Almost all the plexit calls > vanish by writing just one function, but even better all the cases > where we were lazy and didn't bother to check if our allocation > succeeded (I wonder how many of those exist in plplot right now?) get > automatically fixed. I agree that the vast majority of our current plexit calls are due to memory allocation errors and furthermore I agree with you we have been lazy about checking for such memory allocation errors so that fraction would become substantially larger if we started to check for such errors in all memory allocation cases. However, it should be noted that memory allocation errors (for debugged code that calls malloc et al. correctly) are essentially always a sign of an emergency condition where you have run out of memory due to some process leaking memory like crazy. 
And remember this is an extremely rare emergency condition in any case. For example, in my decades of experience with PLplot I have _never_ seen a plexit message concerning memory allocation failing. So my conclusion is an immediate exit is unlikely to irritate users in this one particular memory allocation error case. So let's simplify the problem by wrapping all our malloc calls with plmalloc (and calloc calls by plcalloc) as you suggest above. The difference with what you suggest above is I think those wrappers should rigorously check validity of arguments (i.e., sizes must be positive), automatically check for allocation errors, and if such an error is encountered print out a standard "likely out of memory" error message before doing an immediate exit. From what you have said above, I think you will agree with me that single change would make the task of implementing a proper error report system for the remaining errors at least an order of magnitude easier. > I am a fairly recent convert to exceptions, but I now use them > throughout my C++ code, because using exceptions totally frees me from > having to worry about errors. I don't have to worry about checking > return values or anything like that, I am free to concentrate on the > interesting aspects of the code writing, rather than getting bogged > down in error checking and avoiding memory leaks. But more > importantly, if used in well styled code, exceptions are bomb proof, > they actually make it impossible to miss or fail to deal with an > error. Is this possibility only available in C++? If so, it does not help us with our core C library which does need an error reporting system now. We do not want to use the possibility that we _might_ move to C++ in the future for our core library as an excuse for inaction on this important topic for our current core C library. [...] > Anyway, sorry if this sounds negative, it's not meant to be. 
As I said > I've found exceptions quite recently and they have revolutionised my > coding style, so I feel quite passionate about them. OK. By the way, I am not negative about C++, but I just don't have time right now to give it a high priority. Note that could change since even at this late date I am still learning and growing. :-) For example, I have many decades of Fortran 77 experience but only in the last couple of months have I finally learned modern Fortran as a result of my recent collaboration with Arjen on rewriting our Fortran binding. As a result I am now a strong advocate of modern Fortran (it has some high-level capabilities that are wonderfully useful in scientific computing) while before I was mainly indifferent to it. So as a result of this experience, I intend to convert both te_gen (a Fortran subproject of timeephem) and FreeEOS to modern Fortran. Live and learn.... Alan __________________________ Alan W. Irwin Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |
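The exit-on-failure wrappers proposed in the message above (argument check, allocation check, standard message, immediate exit) might look something like the following. The names plmalloc and plcalloc come from the thread; the signatures and the message wording are assumptions, and this variant does not track allocations or raise an exception as in Phil's version.

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch only: signatures and messages are assumptions, not PLplot code. */

/* malloc wrapper: rejects a zero size and exits on allocation failure. */
void *plmalloc(size_t size)
{
    void *p;

    if (size == 0) {
        fprintf(stderr, "plmalloc: size must be positive\n");
        exit(EXIT_FAILURE);
    }
    p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "plmalloc: allocation failed, likely out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

/* calloc wrapper with the same checks; the memory is zero-initialised. */
void *plcalloc(size_t nmemb, size_t size)
{
    void *p;

    if (nmemb == 0 || size == 0) {
        fprintf(stderr, "plcalloc: sizes must be positive\n");
        exit(EXIT_FAILURE);
    }
    p = calloc(nmemb, size);
    if (p == NULL) {
        fprintf(stderr, "plcalloc: allocation failed, likely out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}
```

With wrappers of this kind a caller can use the returned pointer without a NULL check, which is what removes most of the existing plexit calls from the propagation problem.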
From: Alan W. I. <ir...@be...> - 2016-02-24 17:18:46
|
On 2016-02-24 10:33-0000 Phil Rosenberg wrote: > Hi Alan > I've cut and pasted some bits together of your last couple of emails > > > >> However, isn't being forced to use separate threads in order to >> (automatically) have separate PLplot contexts for each thread a >> relatively minor inconvenience compared to the very much larger user >> inconvenience required by adding a context address to most API calls? >> > Not necessarily. This may force someone to restructure their code as > they would have to have separate contexts in separate functions and > may actually generate race conditions that people will need to deal > with using mutexes. Without introducing backwards incompatibilities in our entire API we could certainly add a getter, plgcontext and setter, plscontext, for PLplotContextAddress to our API similarly to the way that plgstrm and plsstrm work now. Alan __________________________ Alan W. Irwin Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |
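One way to picture the plgcontext/plscontext suggestion above, assuming C11 thread-local storage is available. The names plgcontext, plscontext and PLplotContextAddress are from the message; the structure contents, the signatures (modelled loosely on the plgstrm/plsstrm pair) and sketch_set_stream are hypothetical.

```c
#include <stddef.h>

/* Hypothetical context structure; the real one would hold all the state
 * that is currently kept in static variables. */
typedef struct PLplotContext {
    int current_stream;
} PLplotContext;

/* One context pointer per thread (C11 _Thread_local is assumed). */
static _Thread_local PLplotContext *PLplotContextAddress = NULL;

/* Setter: make ctx the active context for the calling thread. */
void plscontext(PLplotContext *ctx)
{
    PLplotContextAddress = ctx;
}

/* Getter: report the calling thread's active context. */
void plgcontext(PLplotContext **p_ctx)
{
    *p_ctx = PLplotContextAddress;
}

/* Hypothetical API call that works on the active context without taking
 * a context argument.  Returns 0 on success, -1 if no context is set. */
int sketch_set_stream(int strm)
{
    if (PLplotContextAddress == NULL)
        return -1;
    PLplotContextAddress->current_stream = strm;
    return 0;
}
```

In this arrangement a single thread can hold several contexts and switch between them with plscontext, which is the use case raised against a purely thread-local design, while the signatures of existing API calls stay unchanged.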
From: Hazen B. <hba...@ma...> - 2016-02-24 11:52:08
|
> >> I obviously don't have any idea of the size of ephcom, but PLplot > must have thousands of internal function calls that would need the > return value checking. > > I believe there are some 300 functions in our public and private API, > and not all of those will be affected by this (because there are no > error conditions they generate or they need to propagate). So I would > estimate "hundreds" of functions rather than "thousands" would be > touched by this propagation effort. > > Also, I think the editing task for propagating return codes would be a > similar effort to the editing task for replacing the call to plexit by > the generation of the equivalent message to stderr and an immediate > return with non-zero return code. Note, the number of plexit calls is > 220, from the results of > > find . -name "*.c" |xargs grep plexit | wc -l > > That number is roughly equivalent to the estimate I made above of the > number of functions that would be affected by the propagation effort. > That is the basis of my claim that the two efforts would be > comparable. There are also a few functions in the API that already return values that would have to be changed. So this would also mean updating all of the language bindings and probably some of the examples. > Anyhow, I am convinced by my above estimate that the propagation part > of the work will require similar effort to the rest of the project, I > do like the simplicity of the return code method that has proved to be > so useful in the ephcom case, and I think the above testing method > will ensure the whole effort will give reliable return code results. > Also, I think we are talking a few weeks of one-man effort rather than > months to get this done completely. Therefore, I stand willing (early > in the next release cycle) to make this happen. Of course, many hands > make light work so I would welcome some help with this project. I agree with your time estimate. 
I also agree that the return codes approach is the right way to do it, and even if it does end up taking longer than expected it will be more than worth the effort. So I'd be happy to help. I guess we'd start with a header file containing a list of possible error codes, and then split up the editing of the various files in the src/ directory between those who volunteer? -Hazen |
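The header file of error codes suggested above could start out along these lines. Only PL_TOO_MANY_TRIES appears elsewhere in this thread; the type name, the other codes, the message strings and plerror_string are placeholders invented for the sketch.

```c
/* Hypothetical contents of an error-code header; not an agreed design. */
typedef enum {
    PL_SUCCESS = 0,          /* zero means no error */
    PL_TOO_MANY_TRIES,       /* name taken from the thread */
    PL_SKETCH_BAD_ARGUMENT,  /* placeholder */
    PL_SKETCH_FILE_ERROR     /* placeholder */
} PLerror;

/* Map a code to a short description, e.g. for messages on stderr. */
const char *plerror_string(PLerror code)
{
    switch (code) {
    case PL_SUCCESS:             return "no error";
    case PL_TOO_MANY_TRIES:      return "too many tries";
    case PL_SKETCH_BAD_ARGUMENT: return "bad argument";
    case PL_SKETCH_FILE_ERROR:   return "file error";
    }
    return "unknown error";
}
```

Keeping every code in one enumeration gives the volunteers editing different files in src/ a single place to add new codes without collisions.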
From: Phil R. <p.d...@gm...> - 2016-02-24 10:33:26
|
Hi Alan I've cut and pasted some bits together of your last couple of emails > However, isn't being forced to use separate threads in order to > (automatically) have separate PLplot contexts for each thread a > relatively minor inconvenience compared to the very much larger user > inconvenience required by adding a context address to most API calls? > Not necessarily. This may force someone to restructure their code as they would have to have separate contexts in separate functions and may actually generate race conditions that people will need to deal with using mutexes. >I believe there are some 300 functions in our public and private API, >and not all of those will be affected by this (because there are no >error conditions they generate or they need to propagate). So I would >estimate "hundreds" of functions rather than "thousands" would be >touched by this propagation effort. Note that some functions are called tens of times within plplot, and with an error code return mechanism it is each call that needs a check. Also some functions return values; this would need restructuring with the potential for introducing bugs - possibly very hard to find bugs. In each function there would need to be an examination of all the memory allocations and code would need to be introduced at each check point to do freeing to avoid memory leaks. >Also, I think the editing task for propagating return codes would be a >similar effort to the editing task for replacing the call to plexit by >the generation of the equivalent message to stderr and an immediate >return with non-zero return code. Note, the number of plexit calls is >220, By contrast with an exception model we have less than 1 change to make per plexit call. These can almost be done by find and replace. You may wonder why this is less than. I imagine almost all of those are memory allocation failures. 
If we create a plmalloc function which allocates and tracks memory then this function can generate the exception if the memory allocation fails. This removes all requirements to even check whether a memory allocation has succeeded. Almost all the plexit calls vanish by writing just one function, but even better all the cases where we were lazy and didn't bother to check if our allocation succeeded (I wonder how many of those exist in plplot right now?) get automatically fixed. I am a fairly recent convert to exceptions, but I now use them throughout my C++ code, because using exceptions totally frees me from having to worry about errors. I don't have to worry about checking return values or anything like that, I am free to concentrate on the interesting aspects of the code writing, rather than getting bogged down in error checking and avoiding memory leaks. But more importantly, if used in well styled code, exceptions are bomb proof, they actually make it impossible to miss or fail to deal with an error. However, I used to think exceptions were just a feature that I didn't really understand and probably didn't need to. The thing that changed my mind was some reading about coding style. Can I strongly recommend that you have a look at chapter 10 of C++ In Action: Industrial-Strength Programming Techniques. It has been published online for free and you can find this chapter at http://www.relisoft.com/book/tech/5resource.html. The important bit is from the beginning to the subsection Ownership of Resources. I know you are not big into C++, but all you have to know to understand this part of the chapter is that in C++ new is basically like malloc, delete is basically like free, when you create an object (either statically or using new) a function called its constructor is automatically called and when it goes out of scope (or if you delete it after creating it with new) then a function called its destructor is called. 
It may help to know that earlier in the book they create a calculator project as an example which gets briefly referred to. Reading this chapter was a revelation to me! >For each replacement of a "plexit" call you simply check >(by testing with a temporary fake error message and temporary fake >non-zero return code) that the fake return code propagates correctly. I'm not sure how you practically do this? Nor am I sure this is maintainable. Don't forget that a program may reach many potential plexit calls with a single API call and it may reach the same plexit call multiple times, but in different states. We also need to check for memory leaks. The testing required sounds huge to me. Anyway, sorry if this sounds negative, it's not meant to be. As I said I've found exceptions quite recently and they have revolutionised my coding style, so I feel quite passionate about them. Phil |
From: Alan W. I. <ir...@be...> - 2016-02-24 09:11:13
|
On 2016-02-24 14:18+1030 Jonathan Woithe wrote: > Hi Alan > > On Tue, Feb 23, 2016 at 11:00:24AM -0800, Alan W. Irwin wrote: [..] >> It appears everyone so far is happy with the context approach. >> Furthermore, it appears now with thread-local storage that the address of >> the context does not need to be an argument which makes life a lot >> simpler for our users. > > From my perspective as a user of plplot I think having the explicit context > argument in the C API has distinct advantages. [...] > I certainly understand the need to keep API changes to a minimum to avoid > inflicting undue pain on users. At the same time, while thread-local global > context storage may provide thread safety it means that people who require > multiple plot contexts could not achieve this unless they split their work > across multiple threads. True. However, isn't being forced to use separate threads in order to (automatically) have separate PLplot contexts for each thread a relatively minor inconvenience compared to the very much larger user inconvenience required by adding a context address to most API calls? > On a related matter, the placement of the context argument at the end of the > argument list in the original proposal was intriguing. Most libraries I've > used which include such an argument include it as the first argument and > this makes more sense to my brain at least. What is the rationale for > having it at the end of the list? Does it make it easier to support earlier > API versions which lack the new argument? If we did end up having to include a context address as an argument to every PLplot call, then I agree my initial choice of making that argument last is likely wrong, and it should be first instead. However, I also think this question is now likely moot (see my response to your question above). You also asked a question about our C++ API, but I will leave that answer to our C++ experts. Alan __________________________ Alan W. 
Irwin Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |
From: Jonathan W. <jw...@ju...> - 2016-02-24 03:49:35
|
Hi Alan

On Tue, Feb 23, 2016 at 11:00:24AM -0800, Alan W. Irwin wrote:
> Hi Phil:
> On 2016-02-23 11:21-0000 Phil Rosenberg wrote:
>
> > Hi Alan and Jim
> > I entirely advocate this. This is the same model that libcurl uses
> > too. In libcurl the "context" variable is a typecast void* so is
> > entirely opaque to the user, but it gets cast to a structure
> > internally. I presume the idea would be that the user would use one
> > context per thread?
> >
> [...]
> > So in summary:
> >
> > Pass in a context for helping thread safety - definitely
>
> It appears everyone so far is happy with the context approach.
> Furthermore, it appears now with thread-local storage that the address of
> the context does not need to be an argument which makes life a lot
> simpler for our users.

From my perspective as a user of plplot I think having the explicit context argument in the C API has distinct advantages. I have written several programs where multiple plot streams were being maintained in parallel from a common data source. I got it working, but the need to deal with a single global plot context made this more difficult than it might otherwise have been.

Wrapping this context up into a context structure is a great first step. However, the full benefit won't be available if this is just stored in thread-local storage, because a single thread will still be dealing with what is essentially a global context.

I certainly understand the need to keep API changes to a minimum to avoid inflicting undue pain on users. At the same time, while thread-local global context storage may provide thread safety, it means that people who require multiple plot contexts could not achieve this unless they split their work across multiple threads. The inclusion of an explicit context argument addresses both considerations.

On a related matter, the placement of the context argument at the end of the argument list in the original proposal was intriguing. Most libraries I've used which include such an argument include it as the first argument and this makes more sense to my brain at least. What is the rationale for having it at the end of the list? Does it make it easier to support earlier API versions which lack the new argument?

While there's talk about APIs, I also encountered what seemed to be a slight oddity in the C++ API under plplot 5.10.0. In particular there doesn't appear to be any plstream method equivalent to plsetqtdev(). Instead one must call plsetqtdev(), which relies on the global "current stream" variable plsc. The practical upshot is that one must ensure that no new plstream is created between the creation of a plstream and its companion call to plsetqtdev(). It would be much neater and more consistent if one could just do

    plstream *pls = new plstream;
    pls->sdev("extqt");
    pls->plsetqtdev(...);

Regards
  jonathan
|
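The multiple-contexts-in-one-thread case raised in the message above can be illustrated with a small C sketch. Nothing here is PLplot API: DemoContext and the demo_* functions are hypothetical stand-ins, chosen only to show an explicit context passed as the first argument, with two contexts kept alive side by side in a single thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the proposed PLplotContext: it holds the
 * state that would otherwise live in file-scope static variables. */
typedef struct {
    int pen_width;
    int points_drawn;
} DemoContext;

/* Allocate a context with default state; returns NULL on failure. */
DemoContext *demo_create(void)
{
    DemoContext *ctx = calloc(1, sizeof *ctx);
    if (ctx != NULL)
        ctx->pen_width = 1;
    return ctx;
}

/* The context comes first; a nonzero return signals an error. */
int demo_set_pen(DemoContext *ctx, int width)
{
    if (ctx == NULL || width < 1)
        return 1;
    ctx->pen_width = width;
    return 0;
}

int demo_line(DemoContext *ctx, int n_points)
{
    if (ctx == NULL || n_points < 0)
        return 1;
    ctx->points_drawn += n_points;
    return 0;
}

void demo_destroy(DemoContext *ctx)
{
    free(ctx);
}

/* Two plot streams fed from one data source in a single thread. */
int demo_two_streams(void)
{
    DemoContext *screen = demo_create();
    DemoContext *file   = demo_create();
    int rc = 1;

    if (screen != NULL && file != NULL) {
        rc = demo_set_pen(file, 3);
        if (rc == 0) rc = demo_line(screen, 100);
        if (rc == 0) rc = demo_line(file, 100);
        /* Changing one context leaves the other untouched. */
        assert(rc != 0 || screen->pen_width == 1);
    }
    demo_destroy(screen);
    demo_destroy(file);
    return rc;
}
```

With a thread-local context alone, the same program would need one thread per stream; the explicit argument removes that constraint at the cost of the longer call signatures discussed in the thread.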
From: Alan W. I. <ir...@be...> - 2016-02-24 03:17:46
|
Hi Phil:

On 2016-02-23 23:36-0000 Phil Rosenberg wrote:

> I obviously don't have any idea of the size of ephcom, but Plplot must have thousands of internal function calls that would need the return value checking.

I believe there are some 300 functions in our public and private API, and not all of those will be affected by this (because there are no error conditions they generate or need to propagate). So I would estimate that "hundreds" of functions rather than "thousands" would be touched by this propagation effort.

Also, I think the editing task for propagating return codes would be a similar effort to the editing task for replacing each call to plexit by the generation of the equivalent message to stderr and an immediate return with a non-zero return code. Note that the number of plexit calls is 220, from the results of

    find . -name "*.c" |xargs grep plexit | wc -l

That number is roughly equivalent to the estimate I made above of the number of functions that would be affected by the propagation effort. That is the basis of my claim that the two efforts would be comparable.

> I'm not sure there is any way we can reliably catch them all

I am glad you asked that really important question. You stumped me for a while, but I have now thought of one test method which I believe would be almost completely reliable. Here is how you would perform that test.

For each replacement of a "plexit" call you simply check (by testing with a temporary fake error message and temporary fake non-zero return code) that the fake return code propagates correctly. That check would be done by running all our C examples with the appropriate device. Normally, that device would just be the svg or psc one, but you would need to be more specific about the device whenever you are testing propagation of return codes from a particular device to libplplot.

The point is that if the return code propagation is working correctly, the fake error message (generated right in the routine where the error occurs) should always be accompanied by a non-zero return code from one of the C examples. So this method tests the propagation of the non-zero return code both throughout our devices and C library and through the C examples (which is a good thing).

Of course, there is one obvious caveat about the reliability of this test method: our C examples might never call (directly or indirectly) the function where the fake error message and fake non-zero return code are being used for testing propagation. So extra caution would have to be used for the case where no C example prints out the temporary fake error message. But I don't think that case will happen very often because our C examples test libplplot pretty thoroughly.

Anyhow, my estimate above convinces me that the propagation part of the work will require similar effort to the rest of the project. I do like the simplicity of the return code method that has proved to be so useful in the ephcom case, and I think the above testing method will ensure the whole effort gives reliable return code results. Also, I think we are talking a few weeks of one-man effort rather than months to get this done completely. Therefore, I stand willing (early in the next release cycle) to make this happen. Of course, many hands make light work, so I would welcome some help with this project.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca).
Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |
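The fake-error propagation check described in the message above can be shown in miniature. This is only a sketch under assumed names: low_level, mid_level, api_entry, run_with_fake_error and the fake_error_for_test switch are hypothetical, not PLplot symbols, and in the real test the role of the final check would be played by the exit status of a C example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical switch: a nonzero value injects a temporary fake error
 * at the site that used to call plexit(). */
static int fake_error_for_test = 0;

/* Deepest routine: previously it would have called plexit() here. */
static int low_level(int value)
{
    if (fake_error_for_test != 0) {
        fprintf(stderr, "FAKE ERROR injected in low_level\n");
        return fake_error_for_test;
    }
    if (value < 0) {
        fprintf(stderr, "low_level: negative value\n");
        return 1;
    }
    return 0;
}

/* Intermediate routine: checks its callee and returns at once on error. */
static int mid_level(int value)
{
    int rc = low_level(value);
    if (rc != 0)
        return rc;
    /* ... further work happens only on success ... */
    return 0;
}

/* Public entry point: the status a C example would see. */
int api_entry(int value)
{
    return mid_level(value);
}

/* Run the entry point once with a fake error code injected. */
int run_with_fake_error(int code, int value)
{
    int rc;
    assert(code != 0);          /* zero would mean "no injection" */
    fake_error_for_test = code;
    rc = api_entry(value);
    fake_error_for_test = 0;
    return rc;
}
```

If run_with_fake_error(42, 5) returns 42, the chain between the injection site and the entry point propagates correctly; if the fake message appears on stderr but the status comes back as 0, some caller along the way dropped the return code.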
From: Phil R. <p.d...@gm...> - 2016-02-23 23:36:51
|
Hi Alan Sorry for top posting, I'm replying from my phone. Just to be clear the strategy I am suggesting is only internal, so that at the API boundary plplot just returns an error code. I obviously don't have any idea of the size of ephcom, but Plplot must have thousands of internal function calls that would need the return value checking. I'm not sure there is any way we can reliably catch them all and the time investment would be huge. However, discussions about this shouldn't necessarily impact the api change. We can still move to a context variable and a error return code and the internals can be changed as we need. Phil -----Original Message----- From: "Alan W. Irwin" <ir...@be...> Sent: 23/02/2016 19:00 To: "Phil Rosenberg" <p.d...@gm...> Cc: "Jim Dishaw" <ji...@di...>; "PLplot development list" <Plp...@li...> Subject: Error report system plus thread safety Hi Phil: On 2016-02-23 11:21-0000 Phil Rosenberg wrote: > Hi Alan and Jim > I entirely advocate this. This is the same model that libcurl uses > too. In libcurl the "context" variable is a typecast void* so is > entirely opaque to the user, but it gets cast to a structure > internally. I presume the idea would be that the user would use one > context per thread? > [...] > So in summary: > > Pass in a context for helping thread safety - definitely It appears everyone so far is happy with the context approach. Furthermore, it appears now with thread-local storage that the address of the context does not need to be an argument which makes life a lot simpler for our users. > Decide how we wish to report an error, could be return val, callback > or a plgeterr( context ) call. In the ephcom case, David Howells was kind enough to implement both thread safety and an error reporting system. The implementation of the latter uses return values (simply but correctly propagated by each internal call of an ephcom routine checking for a non-zero error return and making an immediate return if one of those is detected). 
That implementation also uses some C tricks to make it convenient for a routine encountering an error to print out an error message in standard form and return with non-zero exit status with one statement. Furthermore, virtually all PLplot routines right now do not have a return value which allows us to use return values as part of our error reporting system. So I lean toward following the ephcom implementation of an error reporting system based on return values. Also note that if an application or a library that depends on PLplot wants to ignore the returned status code for all our C API he can do so with the above model. So it appears to me that we can implement an error reporting system and also thread safety without disrupting our users in any way which is a huge plus in my book. > If we want to stick with C: > Create a memory pool for allocating memory and ensuring we avoid memory leaks > Use longjmp for reporting errors back to the API entry point, but bear > in mind issues with jumping over C++ code I prefer the return value approach, see above. > My personal view is that the error reporting and removing exit calls > is actually more important than thread safety. I think we are in basic agreement on this, but I would phrase it slightly differently; I won't consider PLplot to be a first-class library until we have both a good error reporting system AND thread safety implemented. So I am greedy for both. :-) Alan __________________________ Alan W. Irwin Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). 
__________________________ Linux-powered Science __________________________ |
From: Alan W. I. <ir...@be...> - 2016-02-23 19:00:34
|
Hi Phil: On 2016-02-23 11:21-0000 Phil Rosenberg wrote: > Hi Alan and Jim > I entirely advocate this. This is the same model that libcurl uses > too. In libcurl the "context" variable is a typecast void* so is > entirely opaque to the user, but it gets cast to a structure > internally. I presume the idea would be that the user would use one > context per thread? > [...] > So in summary: > > Pass in a context for helping thread safety - definitely It appears everyone so far is happy with the context approach. Furthermore, it appears now with thread-local storage that the address of the context does not need to be an argument which makes life a lot simpler for our users. > Decide how we wish to report an error, could be return val, callback > or a plgeterr( context ) call. In the ephcom case, David Howells was kind enough to implement both thread safety and an error reporting system. The implementation of the latter uses return values (simply but correctly propagated by each internal call of an ephcom routine checking for a non-zero error return and making an immediate return if one of those is detected). That implementation also uses some C tricks to make it convenient for a routine encountering an error to print out an error message in standard form and return with non-zero exit status with one statement. Furthermore, virtually all PLplot routines right now do not have a return value which allows us to use return values as part of our error reporting system. So I lean toward following the ephcom implementation of an error reporting system based on return values. Also note that if an application or a library that depends on PLplot wants to ignore the returned status code for all our C API he can do so with the above model. So it appears to me that we can implement an error reporting system and also thread safety without disrupting our users in any way which is a huge plus in my book. 
> If we want to stick with C: > Create a memory pool for allocating memory and ensuring we avoid memory leaks > Use longjmp for reporting errors back to the API entry point, but bear > in mind issues with jumping over C++ code I prefer the return value approach, see above. > My personal view is that the error reporting and removing exit calls > is actually more important than thread safety. I think we are in basic agreement on this, but I would phrase it slightly differently; I won't consider PLplot to be a first-class library until we have both a good error reporting system AND thread safety implemented. So I am greedy for both. :-) Alan __________________________ Alan W. Irwin Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca). Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |
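The ephcom source is not quoted in this thread, so the following is only a guess at what "print a standard-form message and return non-zero in one statement" could look like in C. The macros PL_FAIL and PL_CHECK, the error codes and both functions are hypothetical names introduced for illustration.

```c
#include <stdio.h>

/* Hypothetical macro: print a message in a standard form and return a
 * nonzero status, all in one statement at the error site. */
#define PL_FAIL(code, msg)                                      \
    do {                                                        \
        fprintf(stderr, "*** PLPLOT ERROR (%s:%d): %s\n",       \
                __FILE__, __LINE__, (msg));                     \
        return (code);                                          \
    } while (0)

/* Hypothetical macro: propagate a callee's nonzero status unchanged. */
#define PL_CHECK(call)                                          \
    do {                                                        \
        int pl_rc_ = (call);                                    \
        if (pl_rc_ != 0)                                        \
            return pl_rc_;                                      \
    } while (0)

enum { PL_OK = 0, PL_ERR_BADARG = 1, PL_ERR_LEVEL = 2 };

/* Internal routine that detects an error condition. */
static int check_level(int level)
{
    if (level < 3)
        PL_FAIL(PL_ERR_LEVEL, "please set up window first");
    return PL_OK;
}

/* API-level routine: propagates internal errors, reports its own. */
int draw_points(int level, int n)
{
    PL_CHECK(check_level(level));
    if (n < 0)
        PL_FAIL(PL_ERR_BADARG, "negative point count");
    return PL_OK;
}
```

Because C lets a caller discard an int result, existing code that calls such a routine as a statement keeps compiling unchanged, which is the backwards-compatibility point made in the message above.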
From: Alan W. I. <ir...@be...> - 2016-02-23 18:10:51
|
On 2016-02-23 00:50-0500 Jim Dishaw wrote: > >> On Feb 22, 2016, at 5:48 PM, Alan W. Irwin <ir...@be...> wrote: >> >> @Everybody: now on to my C idea for thread safety. >> >> My idea for implementing that (closely following what was done for the >> C ephcom library case where David Howells implemented an ephcom >> context to help provide an API that did not depend on static >> variables) is to eliminate all static variables by using a >> PLplotContext struct that contains all data that is currently stored >> as static variables. >> >> Once that is implemented then the proper non-static way to use PLplot >> would be to do something like the following: >> >> PLplotContext * context; >> >> context = CreatePLplotContext(); >> plparseopts(..., context); >> plinit(context); >> // other ordinary PLplot calls as usual but with >> // context as last argument, e.g., >> plline(..., context); >> >> plend(context); >> >> where CreatePLplotContext would malloc a PLplotContext, and all the >> PLplot API calls would refer to that extra context argument whenever >> they needed access to any part of what were previously static variables >> (e.g., such as plsc). >> >> The context-sensitive plend would do everything that plend currently does plus >> destroy the context by freeing it. >> >> So far this follows pretty exactly what David Howells implemented for >> ephcom. Does everybody agree this general idea (a comprehensive API >> change where a context argument was added to every function call) would >> go a long way toward making PLplot thread safe? >> > > I think that is a good approach. I have done something similar in a mixed C/Fortran environment and it appeared to work, though I did not rigorously test the implementation. The hard part will be chasing down all the static allocations—they abound in the string handling. I had a patch that I put together that removed all the static char arrays used to format information/error messages, but it never made it into the code. 
> I still have the patch and can update and push it to repository now that i can commit.
>
>> We found in the ephcom case that both the Python and Fortran bindings
>> for ephcom could pass the C pointer to a context as arguments. So we
>> could implement those bindings in a non-static way using the ephcom
>> equivalent of CreatePLplotContext above. But we retained the static
>> version of the API just in case some future binding of ephcom was for
>> a language that could not pass C pointer arguments, and we would also
>> want to retain the static API for similar reasons in the PLplot case.
>>
>> For static versions of plinit, plinit variants (e.g., plstar), and/or
>> all PLplot routines that can legitimately be called before plinit
>> would call CreatePLplotContext internally if the static variable
>> PLplotContextAddress was non-NULL and store that pointer in
>> PLplotContextAddress which would be referenced by every static PLplot
>> routine (but PLplotContextAddress would be completely ignored by the
>> non-static API).
>>
>
> I seem to recall there is a way to get per-thread initialization of variables in C. Is that the direction you want to go or do you want the context address to be shared by all threads?

Thanks for that idea, which I frankly had never heard of before. Now that you have drawn my attention to that possibility (e.g., <https://en.wikipedia.org/wiki/Thread-local_storage>), it looks like a vast simplification: we would no longer need to carry the context argument to each routine, because the context could instead be stored in a static thread-local variable. In other words, if I am interpreting that article properly, what I have described as the static case could be made thread safe by simply changing the static variable PLplotContextAddress to a static thread-local variable.

One complication for this exciting prospect for PLplot thread safety is that, according to the article above, this C capability was only introduced in C11. So it is apparently going to take a while for C compilers to support thread-local capability in a standard way. For example, according to <https://gcc.gnu.org/onlinedocs/gcc-3.3/gcc/Thread-Local.html> gcc does support thread-local storage, but currently in a non-standard way using the "__thread" keyword. On my system the following compiles without issues for gcc-4.9.2:

    irwin@raven> cat test_thread_local.c
    static __thread int foo = 0;
    irwin@raven> gcc -c test_thread_local.c -o test.o

So that support for the __thread keyword looks promising.

Anyhow, I think the next step forward is to follow up with the "static" implementation I described above, with PLplotContextAddress initially declared for testing purposes as

    static __thread PLplotContext * PLplotContextAddress;

for gcc (4.9.2 and above) and

    static PLplotContext * PLplotContextAddress;

otherwise. I hope someone here is keen to give this idea a try.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________
|
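A minimal sketch of the thread-local variant discussed above, assuming the context is created lazily on first use (that is, while the per-thread pointer is still NULL). DemoContext, DEMO_TLS and the demo_* functions are hypothetical; the sketch selects the C11 keyword _Thread_local when the compiler reports C11 and falls back to the GCC-style __thread extension otherwise.

```c
#include <assert.h>
#include <stdlib.h>

/* Choose a thread-local storage spelling. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#  define DEMO_TLS _Thread_local
#elif defined(__GNUC__)
#  define DEMO_TLS __thread
#else
#  define DEMO_TLS              /* no TLS: one context shared by all threads */
#endif

/* Hypothetical stand-in for PLplotContext. */
typedef struct {
    int level;
} DemoContext;

/* One pointer per thread; plays the role of PLplotContextAddress. */
static DEMO_TLS DemoContext *demo_context_address = NULL;

/* Return this thread's context, creating it on first use.
 * Returns NULL if the allocation fails. */
DemoContext *demo_current_context(void)
{
    if (demo_context_address == NULL)
        demo_context_address = calloc(1, sizeof *demo_context_address);
    return demo_context_address;
}

/* A "static API" routine: no context argument, state found via TLS. */
int demo_init(void)
{
    DemoContext *ctx = demo_current_context();
    if (ctx == NULL)
        return 1;
    ctx->level = 1;
    return 0;
}

/* Current level, or -1 if no context could be created. */
int demo_level(void)
{
    DemoContext *ctx = demo_current_context();
    return ctx != NULL ? ctx->level : -1;
}

/* Destroy the calling thread's context only. */
void demo_end(void)
{
    free(demo_context_address);
    demo_context_address = NULL;
}

/* Single-thread usage; another thread would get its own context
 * the first time it called into the API. */
void demo_usage(void)
{
    int rc = demo_init();
    assert(rc == 0);
    (void) rc;
    demo_end();
}
```

The existing call signatures stay as they are, and each thread that calls demo_init ends up with its own state. A single thread still sees exactly one context, which is the limitation raised elsewhere in this thread.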
From: Phil R. <p.d...@gm...> - 2016-02-23 11:22:00
|
Hi Alan and Jim

I entirely advocate this. This is the same model that libcurl uses too. In libcurl the "context" variable is a typedef'd void*, so it is entirely opaque to the user, but it gets cast to a structure internally. I presume the idea would be that the user would use one context per thread?

There are two other highly related items that go hand in hand with this: error reporting and removing the exit calls. Actually this is where I feel C++ has an advantage over C, but things can be done in C too.

Basically, when we hit an error that we cannot deal with, such as a memory allocation failure, we need to return all the way back to the API entry point and somehow make an error code available to the user. At the same time we need to make sure we don't generate a segfault by accessing our failed allocation, and we need to free any memory along the way to avoid memory leaks.

The naïve, error-prone and extremely labour-intensive way to do this is to return error codes from all our internal functions, make sure we check them and do cleanup at every stage. This will never ever work. The library is just too complex.

C++ does this very well. We can use an array class for our memory allocations which has a destructor that frees the memory. This means that when an array goes out of scope the memory is automatically freed. C++ also has an error reporting mechanism called exception throwing. If we have code which can throw an exception then we put it inside a special block called a try-catch block. If at any point, no matter how many layers down the function call stack, the code hits an error that it cannot deal with, it throws an exception using the throw keyword. At this point the code will return all the way back to the try-catch block and execute the code in the catch section. Otherwise the catch code is ignored.
The amazing thing about this is that along the return path all scopes are exited "cleanly", causing the destructors of all objects to be called and freeing all memory. This is an incredibly clean way to do things.

However, if we wish to stick with C, which is probably a valid thing to do, then there are some similar things we can do. Instead of exception throwing C has longjmp(), which does a similar thing. The main problem is that we still need to free our memory. We can do this by creating a plmalloc and a plfree function which will keep track of what memory is allocated/deallocated. If longjmp gets called, part of the error handling will be to check whether any memory is still allocated and deallocate it. From my understanding this longjmp and memory pool model is pretty standard in C error reporting. Here is an example of how it would work:

    void c_plline( PLINT n, const PLFLT *x, const PLFLT *y, PLplotContext context )
    {
        plstream *plsc = (plstream *) context;
        int val;
        // setjmp returns 0 when called directly. If we later call longjmp
        // using the same value for env, we return here with a nonzero
        // return value (a bit like a goto).
        val = setjmp( plsc->env );
        if ( val == 0 ) // run our code if we haven't had a longjmp call
        {
            if ( plsc->level < 3 )
            {
                // longjmp takes the jump buffer first, then the nonzero value
                longjmp( plsc->env, PLERR_INITLEVEL3 );
            }
            // somewhere in the depths here we could call longjmp and
            // we would end up back at the setjmp call
            plP_drawor_poly( x, y, n, plsc );
        }
        // do cleanup, just in case we forgot to do it in the code, save our
        // error code and perhaps report an error message
        plfreeall();
        reporterror( val, plsc ); // this could do any number of things
    }

Things do get a little more complicated than this. For example, if internally we call an API function we need to push the env variable onto a stack to avoid overwriting it, and we must avoid longjmping over C++ code, otherwise we get undefined behaviour. There are some libraries that implement all this already, e.g.
https://github.com/guillermocalvo/exceptions4c So in summary: Pass in a context for helping thread safety - definitely Decide how we wish to report an error, could be return val, callback or a plgeterr( context ) call. If we want to stick with C: Create a memory pool for allocating memory and ensuring we avoid memory leaks Use longjmp for reporting errors back to the API entry point, but bear in mind issues with jumping over C++ code If we would be interested in using C++: use an array object for memory allocation and automatic freeing use exceptions to report errors back to the API entry point This would be much much much (I could add a lot of muches here) more robust than doing things with C, but it does require a philosophical change to how plplot is written. My personal view is that the error reporting and removing exit calls is actually more important than thread safety. Note that even if we decide to use C++ internally we must maintain a C interface. C++ at a library interface can be very bad. Phil On 23 February 2016 at 05:50, Jim Dishaw <ji...@di...> wrote: > >> On Feb 22, 2016, at 5:48 PM, Alan W. Irwin <ir...@be...> wrote: >> >> @Everybody: now on to my C idea for thread safety. >> >> My idea for implementing that (closely following what was done for the >> C ephcom library case where David Howells implemented an ephcom >> context to help provide an API that did not depend on static >> variables) is to eliminate all static variables by using a >> PLplotContext struct that contains all data that is currently stored >> as static variables. 
>> >> Once that is implemented then the proper non-static way to use PLplot >> would be to do something like the following: >> >> PLplotContext * context; >> >> context = CreatePLplotContext(); >> plparseopts(..., context); >> plinit(context); >> // other ordinary PLplot calls as usual but with >> // context as last argument, e.g., >> plline(..., context); >> >> plend(context); >> >> where CreatePLplotContext would malloc a PLplotContext, and all the >> PLplot API calls would refer to that extra context argument whenever >> they needed access to any part of what were previously static variables >> (e.g., such as plsc). >> >> The context-sensitive plend would do everything that plend currently does plus >> destroy the context by freeing it. >> >> So far this follows pretty exactly what David Howells implemented for >> ephcom. Does everybody agree this general idea (a comprehensive API >> change where a context argument was added to every function call) would >> go a long way toward making PLplot thread safe? >> > > I think that is a good approach. I have done something similar in a mixed C/Fortran environment and it appeared to work, though I did not rigorously test the implementation. The hard part will be chasing down all the static allocations—they abound in the string handling. I had a patch that I put together that removed all the static char arrays used to format information/error messages, but it never made it into the code. I still have the patch and can update and push it to repository now that i can commit. > >> We found in the ephcom case that both the Python and Fortran bindings >> for ephcom could pass the C pointer to a context as arguments. So we >> could implement those bindings in a non-static way using the ephcom >> equivalent of CreatePLplotContext above. 
But we retained the static >> version of the API just in case some future binding of ephcom was for >> a language that could not pass C pointer arguments, and we would also >> want to retain the static API for similar reasons in the PLplot case. >> >> For static versions of plinit, plinit variants (e.g., plstar), and/or >> all PLplot routines that can legitimately be called before plinit >> would call CreatePLplotContext internally if the static variable >> PLplotContextAddress was non-NULL and store that pointer in >> PLplotContextAddress which would be referenced by every static PLplot >> routine (but PLplotContextAddress would be completely ignored by the >> non-static API). >> > > I seem to recall there is a way to get per-thread initialization of variables in C. Is that the direction you want to go or do you want the context address to be shared by all threads? > >> For backwards compatibility (e.g., to support those who don't care >> about thread safety and who do not want to change their code) we would >> want to retain the same name for the static API that are used >> now. But I would like to use the same names for the non-static cases >> as well if that is possible (say by following the approach discussed >> at <http://stackoverflow.com/questions/1472138/c-default-arguments>.) >> >> Further discussion is encouraged and welcome! Also, I am well aware I >> have glossed over lots of details here. That is because I frankly >> don't completely understand all those details! :-) Nevertheless, >> assuming I have expressed the overview correctly of what would be >> required, I hope someone will be inspired by that overview to go ahead >> and implement the non-static C alternative as a very large step toward >> the ability to use our library in a thread-safe way. >> >> Alan >> __________________________ >> Alan W. Irwin >> >> Astronomical research affiliation with Department of Physics and Astronomy, >> University of Victoria (astrowww.phys.uvic.ca). 
>> >> Programming affiliations with the FreeEOS equation-of-state >> implementation for stellar interiors (freeeos.sf.net); the Time >> Ephemerides project (timeephem.sf.net); PLplot scientific plotting >> software package (plplot.sf.net); the libLASi project >> (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); >> and the Linux Brochure Project (lbproject.sf.net). >> __________________________ >> >> Linux-powered Science >> __________________________ >> >> ---------- Forwarded message ---------- >> Date: Tue, 10 Dec 2002 02:52:43 -0600 (CST) >> From: Maurice LeBrun <mj...@ga...> >> To: Alan W. Irwin <ir...@be...> >> Cc: PLplot development list <Plp...@li...> >> Subject: Re: [Plplot-devel] plinit, plend, plinit sequence now works, >> but I am having second thoughts >> >> I can't tell you the specific answer to your questions due to how maniacally >> busy I am these days (having just joined Lightspeed Semiconductor), but I can >> elucidate some of the plplot design ideas that have historically been >> un-documented. And, I can give it in an object-oriented context, which >> (because it is "canonical") is a lot nicer than the "this is the way it should >> work" approach I've used historically. :) >> >> This also includes proposals for change to how we do it now -- i.e. the >> behavior of plinit(). I've always been somewhat bothered by the way stream 0 >> vs stream N is handled (this bugged me back in '94 but I wasn't exactly >> swimming in free time then either). >> >> When plplot starts, you have the statically pre-allocated stream, stream 0. >> Yes the stream 0 that I hate b/c it's not allocated on the heap like a proper >> data-structure/object (in the plframe widget I automatically start from stream >> 1). >> >> So stream 0 is like a class definition and an instance rolled into one. What >> I think we need to do is get rid of the "instance" part of this and leave >> stream 0 as a "class definition" only. 
In this case all the command line >> arguments and initial pls...() calls (before plinit) serve to override the >> default initializion of the class variables, i.e. set stream 0 parameters only >> >> In other words, stream 0 becomes the template for all plplot streams. You can >> use it, but once you've called plinit() you have your own "instance" of the >> plplot "object" -- i.e. you have a new stream with all the relevant state info >> copied from stream 0. If you change it, it dies when your stream dies with >> plend1(). >> >> If you really want to change stream 0 (i.e. the "plplot object" "class data") >> you can always set your stream number to 0 and fire away. >> >> To summarize: >> >> plinit() creates a new stream, copied from stream 0 >> plend1() deletes that stream >> plend() deletes all streams, except of course stream 0 which is "class data" >> >> Let me know if any of this helps. >> >> -- >> Maurice LeBrun mj...@ga... >> Research Organization for Information Science and Technology of Japan (RIST) >> >> ------------------------------------------------------------------------------ >> Site24x7 APM Insight: Get Deep Visibility into Application Performance >> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >> Monitor end-to-end web transactions and take corrective actions now >> Troubleshoot faster and improve end-user experience. Signup Now! >> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 >> _______________________________________________ >> Plplot-devel mailing list >> Plp...@li... >> https://lists.sourceforge.net/lists/listinfo/plplot-devel > > > ------------------------------------------------------------------------------ > Site24x7 APM Insight: Get Deep Visibility into Application Performance > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month > Monitor end-to-end web transactions and take corrective actions now > Troubleshoot faster and improve end-user experience. Signup Now! 
|
From: Jim D. <ji...@di...> - 2016-02-23 06:07:36
|
> On Feb 22, 2016, at 5:48 PM, Alan W. Irwin <ir...@be...> wrote:
>
> @Everybody: now on to my C idea for thread safety.
>
> My idea for implementing that (closely following what was done for the C ephcom library case where David Howells implemented an ephcom context to help provide an API that did not depend on static variables) is to eliminate all static variables by using a PLplotContext struct that contains all data that is currently stored as static variables.
>
> Once that is implemented then the proper non-static way to use PLplot would be to do something like the following:
>
> PLplotContext * context;
>
> context = CreatePLplotContext();
> plparseopts(..., context);
> plinit(context);
> // other ordinary PLplot calls as usual but with
> // context as last argument, e.g.,
> plline(..., context);
>
> plend(context);
>
> where CreatePLplotContext would malloc a PLplotContext, and all the PLplot API calls would refer to that extra context argument whenever they needed access to any part of what were previously static variables (e.g., such as plsc).
>
> The context-sensitive plend would do everything that plend currently does plus destroy the context by freeing it.
>
> So far this follows pretty exactly what David Howells implemented for ephcom. Does everybody agree this general idea (a comprehensive API change where a context argument was added to every function call) would go a long way toward making PLplot thread safe?

I think that is a good approach. I have done something similar in a mixed C/Fortran environment and it appeared to work, though I did not rigorously test the implementation. The hard part will be chasing down all the static allocations; they abound in the string handling. I put together a patch that removed all the static char arrays used to format information/error messages, but it never made it into the code. I still have the patch and can update it and push it to the repository now that I can commit.
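For illustration only, the kind of change described above (replacing a static char array used for message formatting) might look like the following sketch. The function names `format_error_static` and `format_error_r` are hypothetical and are not taken from PLplot or from the patch mentioned; the sketch only shows why a static buffer is shared state and how a caller-supplied buffer avoids that.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical "before": the static buffer is shared by every caller,
 * so two threads formatting messages at the same time can overwrite
 * each other's text. */
const char *format_error_static(const char *routine, int code)
{
    static char buffer[128];
    snprintf(buffer, sizeof buffer, "%s: error code %d", routine, code);
    return buffer;
}

/* Hypothetical "after": the caller owns the buffer, so the function
 * keeps no state of its own. The return value is what snprintf
 * returns: the length the full message needs, excluding the
 * terminating null byte. */
int format_error_r(char *buffer, size_t size, const char *routine, int code)
{
    return snprintf(buffer, size, "%s: error code %d", routine, code);
}
```

With the second form, each thread (or each context struct) can hold its own buffer, at the cost of an extra pair of arguments at every call site.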
> We found in the ephcom case that both the Python and Fortran bindings for ephcom could pass the C pointer to a context as arguments. So we could implement those bindings in a non-static way using the ephcom equivalent of CreatePLplotContext above. But we retained the static version of the API just in case some future binding of ephcom was for a language that could not pass C pointer arguments, and we would also want to retain the static API for similar reasons in the PLplot case.
>
> The static versions of plinit, plinit variants (e.g., plstar), and all PLplot routines that can legitimately be called before plinit would call CreatePLplotContext internally if the static variable PLplotContextAddress was NULL, and store that pointer in PLplotContextAddress, which would be referenced by every static PLplot routine (but PLplotContextAddress would be completely ignored by the non-static API).

I seem to recall there is a way to get per-thread initialization of variables in C. Is that the direction you want to go, or do you want the context address to be shared by all threads?

> For backwards compatibility (e.g., to support those who don't care about thread safety and who do not want to change their code) we would want to retain the same names for the static API that are used now. But I would like to use the same names for the non-static cases as well if that is possible (say by following the approach discussed at <http://stackoverflow.com/questions/1472138/c-default-arguments>.)
>
> Further discussion is encouraged and welcome! Also, I am well aware I have glossed over lots of details here. That is because I frankly don't completely understand all those details! :-) Nevertheless, assuming I have expressed the overview correctly of what would be required, I hope someone will be inspired by that overview to go ahead and implement the non-static C alternative as a very large step toward the ability to use our library in a thread-safe way.
>
> Alan
> __________________________
> Alan W. Irwin
>
> Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca).
>
> Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net).
> __________________________
>
> Linux-powered Science
> __________________________
>
> ---------- Forwarded message ----------
> Date: Tue, 10 Dec 2002 02:52:43 -0600 (CST)
> From: Maurice LeBrun <mj...@ga...>
> To: Alan W. Irwin <ir...@be...>
> Cc: PLplot development list <Plp...@li...>
> Subject: Re: [Plplot-devel] plinit, plend, plinit sequence now works, but I am having second thoughts
>
> I can't tell you the specific answer to your questions due to how maniacally busy I am these days (having just joined Lightspeed Semiconductor), but I can elucidate some of the plplot design ideas that have historically been undocumented. And, I can give it in an object-oriented context, which (because it is "canonical") is a lot nicer than the "this is the way it should work" approach I've used historically. :)
>
> This also includes proposals for change to how we do it now -- i.e. the behavior of plinit(). I've always been somewhat bothered by the way stream 0 vs stream N is handled (this bugged me back in '94 but I wasn't exactly swimming in free time then either).
>
> When plplot starts, you have the statically pre-allocated stream, stream 0. Yes the stream 0 that I hate b/c it's not allocated on the heap like a proper data-structure/object (in the plframe widget I automatically start from stream 1).
>
> So stream 0 is like a class definition and an instance rolled into one. What I think we need to do is get rid of the "instance" part of this and leave stream 0 as a "class definition" only. In this case all the command line arguments and initial pls...() calls (before plinit) serve to override the default initialization of the class variables, i.e. set stream 0 parameters only.
>
> In other words, stream 0 becomes the template for all plplot streams. You can use it, but once you've called plinit() you have your own "instance" of the plplot "object" -- i.e. you have a new stream with all the relevant state info copied from stream 0. If you change it, it dies when your stream dies with plend1().
>
> If you really want to change stream 0 (i.e. the "plplot object" "class data") you can always set your stream number to 0 and fire away.
>
> To summarize:
>
> plinit() creates a new stream, copied from stream 0
> plend1() deletes that stream
> plend() deletes all streams, except of course stream 0 which is "class data"
>
> Let me know if any of this helps.
>
> --
> Maurice LeBrun mj...@ga...
> Research Organization for Information Science and Technology of Japan (RIST)
|
From: Alan W. I. <ir...@be...> - 2016-02-22 22:48:58
|
@Phil: I am particularly interested in your reaction here because you had the idea before that you could implement PLplot in a thread-safe way by using C++ as the core language, i.e., rewriting PLplot in C++. I don't rule out the possibility, but one intermediate step toward that goal might be to implement the idea below in C, and then change our C++ binding so that it automatically uses a PLplotContext (which would presumably go a long way toward solving your current issues with PLplot thread safety and wxwidgets).

@Everybody: now on to my C idea for thread safety.

The PLplot core C library is currently not thread safe, and I think we are all agreed that it is important to address that issue since it is a significant barrier to entry for some. For example, if I recall correctly it was Ruby on Rails developers who publicly expressed that they wanted no part of PLplot because of this thread safety issue, and presumably other library developers are silently avoiding PLplot for this same reason. In addition, once it becomes possible to use PLplot in a thread-safe way, it should make life much easier for Phil's development work on the wxwidgets device.

We would go a long way toward thread safety if we got rid of all static variables, so the rest of this post focuses on that issue from the C perspective.

My idea for implementing that (closely following what was done for the C ephcom library case where David Howells implemented an ephcom context to help provide an API that did not depend on static variables) is to eliminate all static variables by using a PLplotContext struct that contains all data that is currently stored as static variables.

Once that is implemented then the proper non-static way to use PLplot would be to do something like the following:

PLplotContext * context;

context = CreatePLplotContext();
plparseopts(..., context);
plinit(context);
// other ordinary PLplot calls as usual but with
// context as last argument, e.g.,
plline(..., context);

plend(context);

where CreatePLplotContext would malloc a PLplotContext, and all the PLplot API calls would refer to that extra context argument whenever they needed access to any part of what were previously static variables (e.g., such as plsc).

The context-sensitive plend would do everything that plend currently does plus destroy the context by freeing it.

So far this follows pretty exactly what David Howells implemented for ephcom. Does everybody agree this general idea (a comprehensive API change where a context argument was added to every function call) would go a long way toward making PLplot thread safe?

We found in the ephcom case that both the Python and Fortran bindings for ephcom could pass the C pointer to a context as arguments. So we could implement those bindings in a non-static way using the ephcom equivalent of CreatePLplotContext above. But we retained the static version of the API just in case some future binding of ephcom was for a language that could not pass C pointer arguments, and we would also want to retain the static API for similar reasons in the PLplot case.

The static versions of plinit, plinit variants (e.g., plstar), and all PLplot routines that can legitimately be called before plinit would call CreatePLplotContext internally if the static variable PLplotContextAddress was NULL, and store that pointer in PLplotContextAddress, which would be referenced by every static PLplot routine (but PLplotContextAddress would be completely ignored by the non-static API).

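As a minimal sketch of the two layers proposed in this message, assuming a toy library rather than PLplot itself: every name below (`DemoContext`, `demo_create_context`, the `demo_*_ctx` functions and their static counterparts) is hypothetical and merely stands in for PLplotContext, CreatePLplotContext, and the PLplot routines. The context-aware functions take the context as their last argument; the static wrappers create a hidden default context when none exists yet and forward to the context-aware versions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for PLplotContext: state that would otherwise
 * live in static variables is gathered into one struct. */
typedef struct {
    int initialized;
    int lines_drawn;
} DemoContext;

/* Analogue of CreatePLplotContext: allocate a zeroed context
 * (returns NULL if the allocation fails). */
DemoContext *demo_create_context(void)
{
    return calloc(1, sizeof(DemoContext));
}

/* Context-aware API: every call reaches its state through the
 * last argument. */
void demo_init_ctx(DemoContext *context)
{
    assert(context != NULL);
    context->initialized = 1;
}

void demo_line_ctx(int n_points, DemoContext *context)
{
    assert(context != NULL && context->initialized);
    (void) n_points;             /* a real routine would draw here */
    context->lines_drawn++;
}

int demo_lines_drawn_ctx(const DemoContext *context)
{
    return context->lines_drawn;
}

/* Like the proposed context-aware plend: finish, then free the context. */
void demo_end_ctx(DemoContext *context)
{
    free(context);
}

/* Static API: one hidden default context, created on first use. */
static DemoContext *demo_default_context = NULL;

static DemoContext *demo_get_default(void)
{
    if (demo_default_context == NULL)
        demo_default_context = demo_create_context();
    return demo_default_context;
}

void demo_init(void)         { demo_init_ctx(demo_get_default()); }
void demo_line(int n_points) { demo_line_ctx(n_points, demo_get_default()); }
int  demo_lines_drawn(void)  { return demo_lines_drawn_ctx(demo_get_default()); }

void demo_end(void)
{
    demo_end_ctx(demo_default_context);  /* free(NULL) is a no-op */
    demo_default_context = NULL;
}
```

In this sketch the default pointer is an ordinary static variable, so the static API is no more thread safe than before; only code that passes its own context avoids shared state. Declaring the pointer with the C11 `_Thread_local` storage class would be one way to give each thread its own default context, if that behaviour were wanted.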
For backwards compatibility (e.g., to support those who don't care about thread safety and who do not want to change their code) we would want to retain the same names for the static API that are used now. But I would like to use the same names for the non-static cases as well if that is possible (say by following the approach discussed at <http://stackoverflow.com/questions/1472138/c-default-arguments>.)

Further discussion is encouraged and welcome! Also, I am well aware I have glossed over lots of details here. That is because I frankly don't completely understand all those details! :-) Nevertheless, assuming I have expressed the overview correctly of what would be required, I hope someone will be inspired by that overview to go ahead and implement the non-static C alternative as a very large step toward the ability to use our library in a thread-safe way.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________

---------- Forwarded message ----------
Date: Tue, 10 Dec 2002 02:52:43 -0600 (CST)
From: Maurice LeBrun <mj...@ga...>
To: Alan W. Irwin <ir...@be...>
Cc: PLplot development list <Plp...@li...>
Subject: Re: [Plplot-devel] plinit, plend, plinit sequence now works, but I am having second thoughts

I can't tell you the specific answer to your questions due to how maniacally busy I am these days (having just joined Lightspeed Semiconductor), but I can elucidate some of the plplot design ideas that have historically been undocumented. And, I can give it in an object-oriented context, which (because it is "canonical") is a lot nicer than the "this is the way it should work" approach I've used historically. :)

This also includes proposals for change to how we do it now -- i.e. the behavior of plinit(). I've always been somewhat bothered by the way stream 0 vs stream N is handled (this bugged me back in '94 but I wasn't exactly swimming in free time then either).

When plplot starts, you have the statically pre-allocated stream, stream 0. Yes the stream 0 that I hate b/c it's not allocated on the heap like a proper data-structure/object (in the plframe widget I automatically start from stream 1).

So stream 0 is like a class definition and an instance rolled into one. What I think we need to do is get rid of the "instance" part of this and leave stream 0 as a "class definition" only. In this case all the command line arguments and initial pls...() calls (before plinit) serve to override the default initialization of the class variables, i.e. set stream 0 parameters only.

In other words, stream 0 becomes the template for all plplot streams. You can use it, but once you've called plinit() you have your own "instance" of the plplot "object" -- i.e. you have a new stream with all the relevant state info copied from stream 0. If you change it, it dies when your stream dies with plend1().

If you really want to change stream 0 (i.e. the "plplot object" "class data") you can always set your stream number to 0 and fire away.

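The template model described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not PLplot's stream code: `DemoStream`, `MAX_STREAMS`, and the `demo_stream_*` functions are all hypothetical names. Slot 0 is statically allocated "class data"; the init function copies it into a new heap-allocated stream, and the two end functions free one stream or every stream except slot 0.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_STREAMS 8            /* arbitrary limit for the sketch */

/* Hypothetical per-stream state. */
typedef struct {
    int    color;
    double char_height;
} DemoStream;

/* Slot 0 is the statically allocated template ("class data"). */
static DemoStream template_stream = { 1, 4.0 };
static DemoStream *streams[MAX_STREAMS] = { &template_stream };

/* Look up a stream by number; NULL if the slot is empty or out of range. */
DemoStream *demo_stream_get(int n)
{
    return (n >= 0 && n < MAX_STREAMS) ? streams[n] : NULL;
}

/* plinit-like: allocate a new stream, copy the template into it, and
 * return its number, or -1 if no slot or memory is available. */
int demo_stream_init(void)
{
    for (int i = 1; i < MAX_STREAMS; i++) {
        if (streams[i] == NULL) {
            DemoStream *s = malloc(sizeof *s);
            if (s == NULL)
                return -1;
            *s = *streams[0];    /* copy state from stream 0 */
            streams[i] = s;
            return i;
        }
    }
    return -1;
}

/* plend1-like: delete one stream; stream 0 is never freed. */
void demo_stream_end1(int n)
{
    if (n > 0 && n < MAX_STREAMS) {
        free(streams[n]);
        streams[n] = NULL;
    }
}

/* plend-like: delete every stream except the template in slot 0. */
void demo_stream_end(void)
{
    for (int i = 1; i < MAX_STREAMS; i++)
        demo_stream_end1(i);
}
```

Changing a field through `demo_stream_get(0)` before calling `demo_stream_init()` plays the role of the pre-init calls that override the defaults: the new stream starts with the modified values, and later changes to that stream leave the template untouched.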
To summarize:

plinit() creates a new stream, copied from stream 0
plend1() deletes that stream
plend() deletes all streams, except of course stream 0 which is "class data"

Let me know if any of this helps.

--
Maurice LeBrun mj...@ga...
Research Organization for Information Science and Technology of Japan (RIST)
|