From: Ivo R. <iv...@iv...> - 2017-06-01 10:58:51
2017-05-29 13:20 GMT+02:00 FEVOTTE Francois <fra...@ed...>:
> Dear Valgrind developers,
>
> First, please forgive us if this post is out of place on this list.
>
> We would like to introduce Verrou [1], a floating-point error diagnostics
> tool based on Valgrind. The idea behind the tool is that it replaces all
> floating-point operations with randomly rounded ones: instead of always
> rounding non-representable results to the nearest floating-point number,
> one of the two nearest floating-point numbers is chosen at random.
> Results of the instrumented program thus become realizations of a random
> variable, whose dispersion gives an estimate of the impact of accumulated
> floating-point round-off errors during program execution. In the computer
> arithmetic community, this technique is known as the asynchronous CESTAC
> method, a variant of Monte Carlo arithmetic. More details can be found in
> Verrou's user manual [2].
>
> This work was pursued at EDF R&D [3], but we think such a tool might be
> of broader interest, especially since Valgrind's "Project Suggestions"
> page lists the detection of floating-point inaccuracies as a topic of
> interest. We would also like to take the opportunity of this message to
> thank Josef Weidendorfer, who kindly helped us get started with the
> development of a new Valgrind tool back when this project began in 2014.
>
> We have just released (under the GPLv2) version 1.0.0 of Verrou, which
> we believe to be stable enough for others to use. So please let us know
> of any comments you might have about this tool.

Dear François and Bruno,

Thank you for sharing information about your new Valgrind tool with the Valgrind developers. I am also Cc'ing the Valgrind users list, because it is the users who will actually be working with this tool.

From the Valgrind (tooling) perspective it looks quite neat. However, I do not know enough about floating-point rounding modes to judge how practical it is for finding real issues.
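For readers unfamiliar with the technique, the random-rounding idea described above can be modelled in a few lines of Python. This is only an illustrative sketch of the concept, not how Verrou actually instruments binaries; it uses the standard TwoSum error-free transformation to recover the rounding residual, and assumes Python 3.9+ for math.nextafter:

```python
import math
import random
import statistics

def two_sum(a, b):
    """Knuth's TwoSum error-free transformation:
    s + err equals a + b exactly."""
    s = a + b
    bp = s - a
    err = (a - (s - bp)) + (b - bp)
    return s, err

def add_random_round(a, b):
    """Addition with random rounding: when the exact sum is not
    representable, pick one of the two bracketing floats at random."""
    s, err = two_sum(a, b)
    if err == 0.0:
        return s  # exact result: nothing to randomise
    # The other candidate is the neighbouring float on the side of
    # the true (higher-precision) result.
    other = math.nextafter(s, math.copysign(math.inf, err))
    return random.choice([s, other])

def sum_random(xs):
    """Sum a list using randomly rounded additions."""
    acc = 0.0
    for x in xs:
        acc = add_random_round(acc, x)
    return acc

# Repeated runs of the instrumented computation give a distribution
# whose spread estimates the accumulated round-off error.
samples = [sum_random([0.1] * 1000) for _ in range(100)]
print(statistics.mean(samples), statistics.stdev(samples))
```

A non-zero standard deviation across runs signals that round-off errors accumulate in the computation, which is the dispersion estimate the quoted description refers to.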
So I asked some of my colleagues for their thoughts and comments. Your responses to these are welcome.

----------------------------------------------------------------

Comment #1:

My first reaction is that just using random rounding might be considerably less interesting than also being able to do precision bounding. The latter might help with questions like "do I need to switch between float and double?" and the like.

I also wonder whether random rounding leads to tremendous understatement or overstatement of a rounding problem. I can imagine the former, since random choices might tend to cancel each other out. I could also imagine the latter, since interval arithmetic (the most pessimistic rounding) was rather incapable of judging the numerical stability of conventional algorithms. I'm more inclined to bet on the former. Looking at the top Google hits on Monte Carlo arithmetic makes this stuff sound a little researchy and unproven (old hits, many from the same author), but I didn't look closely.

Comment #2:

I once tracked down a numerical problem in a SPEC benchmark by modifying the compiler to do arithmetic both in the precision specified by the program and in a higher precision, and then print a warning when the two diverged. (Of course, that meant the higher-precision results had to be stored in a table hashed by the address of the lower-precision results; the higher-precision value also had to be reset whenever the variable was assigned from a source that didn't have an associated higher-precision value, for example via I/O.) That worked pretty well, although of course it slowed the program down a bit.

Comment #3:

While I've recently been reading up on the design of elementary functions for speed and accuracy, I won't claim to be a master of numerical methods. With that disclaimer, I will say that the idea of "random rounding" makes me uncomfortable, in part because any method that does not give repeatable results creates difficulty for debugging.
Also, some types of cumulative numerical instability will not be revealed by random rounding. On the other hand, the general problem of identifying numerical instability in large applications is a tough one. If "random rounding" has been shown to help identify some problems, then it could be considered one of several valid numerical stability tests and a useful tool in the numerical analyst's toolbox.

Personally, I like the approach of doing test runs of an application at higher precision to see whether the results change. That approach is often supported by a compiler switch and software libraries for the higher precision, requiring modest programming effort and a one-time investment in slow test runs. I'm sure this tool also requires slow test runs, as it talks about repeated runs over different portions of the application to determine the source of the largest round-off variation. For elementary functions, such as those found in libm, there are more rigorous methods than either of the above for proving that the worst-case error does not exceed defined bounds.

Whether this particular tool will become important to customers is unknown. Many more tools are developed than are widely used. Adoption may be determined in part by the 'marketing' of it by its developers. Some approaches get a lot of buzz and then fade away; I put interval arithmetic in that category. Maybe not gone forever, but not driving any major purchase decisions. Others grow and eventually become part of everyone's base expectations (Perl, Java, ...).

---------------------------------------------------------------------------

What will happen at this point?

1. I hope we can discuss this tool further.
2. We can add a link to your tool on this page:
   http://valgrind.org/downloads/variants.html
3. If the community agrees that this tool is worth adding to the Valgrind
   source code repository, then you can initiate talks about integrating it.
   But 1. needs to happen first.

Kind regards,
Ivosh Raisr