|
From: FEVOTTE F. <fra...@ed...> - 2017-05-29 11:21:08
|
Dear Valgrind developers,
first, please forgive us if this post is out of place in this list.
We would like to introduce Verrou [1], a floating-point error diagnostics tool based on Valgrind. The idea behind the tool is that it replaces all floating-point operations by randomly rounded ones (which means that instead of always rounding non-representable results to the nearest floating-point number, one of the two nearest floating-point numbers is chosen randomly). Instrumented program results thus become realizations of a random variable, the dispersion of which gives an estimation of the impact of the accumulation of floating-point round-off errors during program execution. In the computer arithmetic community, this technique is known as an asynchronous CESTAC method, which is a variant of Monte-Carlo arithmetic. More details can be found in Verrou's user manual [2].
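To make the random-rounding idea concrete, here is a minimal Python sketch. It is not how Verrou is implemented (Verrou instruments binaries through Valgrind); it only mimics the effect on a plain summation, using exact rational arithmetic to locate the two doubles that surround each exact result. The names random_round, random_sum and sample_dispersion are invented for this illustration, and math.nextafter requires Python 3.9 or later.

```python
import math
import random
import statistics
from fractions import Fraction


def random_round(exact, rng):
    """Round an exact rational to one of its two neighbouring doubles, picked at random."""
    nearest = float(exact)  # Fraction -> float conversion rounds to nearest
    if Fraction(nearest) == exact:
        return nearest  # representable results are left unchanged
    if Fraction(nearest) < exact:
        lo, hi = nearest, math.nextafter(nearest, math.inf)
    else:
        lo, hi = math.nextafter(nearest, -math.inf), nearest
    return lo if rng.random() < 0.5 else hi


def random_sum(values, rng):
    """Sum `values`, randomly rounding the result of every addition."""
    acc = 0.0
    for v in values:
        acc = random_round(Fraction(acc) + Fraction(v), rng)
    return acc


def sample_dispersion(values, runs=20):
    """Repeat the randomly rounded sum and return (mean, standard deviation)."""
    samples = [random_sum(values, random.Random(seed)) for seed in range(runs)]
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    mean, spread = sample_dispersion([0.1] * 1000)
    print(f"mean = {mean!r}, standard deviation = {spread!r}")
```

The standard deviation across runs plays the role of the dispersion mentioned above: the larger it is relative to the mean, the fewer digits of the result can be trusted.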
This work was pursued at EDF R&D [3], but we think such a tool might be of broader interest, especially since Valgrind's "Project Suggestions" page lists the detection of floating-point inaccuracies as a topic of interest. We would also like to take the opportunity of this message to thank Josef Weidendorfer, who kindly helped us get started with the development of a new Valgrind tool, back when this project began in 2014. We just released (under the GPLv2) version 1.0.0 of Verrou, which we believe to be stable enough for others to use. So please let us know of any comments you might have about this tool.
In any case, many thanks to the Valgrind development team: Verrou would not exist without your outstanding work. And the other codes we develop would be a lot buggier and less performant if it were not for Valgrind tools...
François Févotte and Bruno Lathuilière
[1] http://github.com/edf-hpc/verrou
[2] http://edf-hpc.github.io/verrou/vr-manual.html
[3] EDF is France's main electricity utility. Its R&D department develops a lot of numerical simulation software, which is used in crucial parts of our industrial process.
--
François FÉVOTTE
Research Engineer
EDF – R&D – PERICLES
I23 (Analysis and Numerical Modeling)
7 boulevard Gaspard Monge
91120 Palaiseau
fra...@ed...
Phone: +33 1 78 19 44 23
This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except with formal approval.
If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message.
E-mail communication cannot be guaranteed to be timely, secure, error-free or virus-free.
|
|
Re: [Valgrind-developers] Request For Comments: Verrou,
a Valgrind tool for floating-point debugging
From: Ivo R. <iv...@iv...> - 2017-06-01 10:58:51
|
2017-05-29 13:20 GMT+02:00 FEVOTTE Francois <fra...@ed...>:
> Dear Valgrind developers,
>
> first, please forgive us if this post is out of place in this list.
>
> We would like to introduce Verrou [1], a floating-point error diagnostics
> tool based on Valgrind. The idea behind the tool is that it replaces all
> floating-point operations by randomly rounded ones (which means that
> instead of always rounding non-representable results to the nearest
> floating-point number, one of the two nearest floating-point numbers is
> chosen randomly). Instrumented program results thus become realizations of
> a random variable, the dispersion of which gives an estimation of the
> impact of the accumulation of floating-point round-off errors during
> program execution. In the computer arithmetic community, this technique is
> known as an asynchronous CESTAC method, which is a variant of Monte-Carlo
> arithmetic. More details can be found in Verrou's user manual [2].
>
> This work was pursued at EDF R&D [3], but we think such a tool might be of
> broader interest, especially since Valgrind's "Project Suggestions" page
> lists the detection of floating-point inaccuracies as a topic of interest.
> We also would like to take the opportunity of this message to thank Josef
> Weidendorfer, who kindly helped us getting started with the development of
> a new Valgrind tool, back when this project began in 2014. We just released
> (under the GPLv2) version 1.0.0 of Verrou, which we believe to be stable
> enough for others to use. So please let us know of any comments you might
> have about this tool.
>

Dear François and Bruno,

Thank you for sharing information about the new Valgrind tool with the Valgrind developers. I am also Cc'ing Valgrind users because it's actually users who will be using this tool.

From the Valgrind (tooling) perspective it looks quite neat. But I do not know enough about floating-point rounding modes to judge how practical it is for finding real issues. So I asked some of my colleagues for their thoughts and comments. Your comments about these are welcome.

----------------------------------------------------------------
Comment #1:

My first reaction is that just using random rounding might be considerably less interesting than also being able to do precision bounding. The latter might be able to help with questions like "do I need to switch between float and double?" and stuff like that.

I also wonder if random rounding leads to tremendous understatement or overstatement of a rounding problem. I can imagine the former, since random choices might tend to cancel each other out. I could also imagine the latter, since interval arithmetic (most pessimistic rounding) was rather incapable of judging the numerical stability of conventional algorithms. I'm more inclined to bet on the former.

Looking at top google hits on Monte Carlo arithmetic makes this stuff sound a little researchy and unproven (old hits, many from the same author), but I didn't look closely.

Comment #2:

I once tracked down a numerical problem with a SPEC benchmark by modifying the compiler to do arithmetic in both the precision specified by the program and a higher precision, and then to print a warning when they diverged. (Of course, that meant that the higher precision results had to be stored in a table hashed by the address of the lower precision results; also it had to reset the higher precision value when the variable was assigned from some source that didn’t have an associated higher precision value, for example via I/O.) That worked pretty well, although of course it slowed the program down a bit.

Comment #3:

While I've recently been reading up on design of elementary functions for speed and accuracy, I won't claim to be a master of numerical methods. With that disclaimer, I will say that the idea of "random rounding" makes me uncomfortable, in part because any method which does not give repeatable results creates difficulty for debugging. Also, some types of cumulative numerical instabilities will not be shown by random rounding.

On the other hand, the general problem of identifying numerical instability in large applications is a tough problem. If "random rounding" has been shown to help identify some problems, then it could be considered one of several valid numerical stability tests and a useful tool in the numerical analyst's toolbox.

Personally, I like the approach of doing test runs of an application at higher precision to see if the results change. That approach is often supported by use of a compiler switch and SW libraries for the higher precision, requiring modest programming effort and a one time investment of slow test runs. I'm sure this tool also requires slow test runs as it talks about repeated runs of different portions of the application to determine the source of maximum round-off variation.

For elementary functions, such as those found in libm, there are more rigorous methods than either of the above for proving the worst case error does not exceed defined bounds.

Whether this particular tool will become important to customers is unknown. Many more tools are developed than are widely used. It may be determined in part by the 'marketing' of it by its developers. Some approaches get a lot of buzz and then fade away. I class interval arithmetic in that category. Maybe not gone forever, but not driving any major purchase decisions. Others grow and eventually become part of everyone's base expectations (Perl, Java, ...).
---------------------------------------------------------------------------

What will happen at this point?

1. I hope we can discuss this tool further.
2. We can add a link at this page: http://valgrind.org/downloads/variants.html pointing to your tool.
3. If the community agrees that this tool will be worth adding to the Valgrind source code repository then you can initiate talks about integrating it.

But 1. needs to happen first.

Kind regards,
Ivosh Raisr
|
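The shadow-computation approach described in Comment #2 can be sketched in a few lines of Python. This is only an illustration of the principle, not the compiler modification the comment refers to: the class name Shadowed and its methods are invented here, and exact rational arithmetic (fractions.Fraction) stands in for "a higher precision". A value built without a shadow, for example one read from I/O, restarts its shadow from the double itself, as the comment describes.

```python
from fractions import Fraction


class Shadowed:
    """A double paired with an exact 'shadow' of the same computation."""

    def __init__(self, value, shadow=None):
        self.value = float(value)
        # No associated shadow (e.g. a value read from I/O): reset it to the double.
        self.shadow = Fraction(self.value) if shadow is None else Fraction(shadow)

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Shadowed) else Shadowed(other)

    def __add__(self, other):
        other = self._lift(other)
        return Shadowed(self.value + other.value, self.shadow + other.shadow)

    def __sub__(self, other):
        other = self._lift(other)
        return Shadowed(self.value - other.value, self.shadow - other.shadow)

    def __mul__(self, other):
        other = self._lift(other)
        return Shadowed(self.value * other.value, self.shadow * other.shadow)

    def divergence(self):
        """Relative gap between the double and its shadow (absolute if the shadow is 0)."""
        gap = abs(Fraction(self.value) - self.shadow)
        return float(gap / abs(self.shadow)) if self.shadow != 0 else float(gap)

    def diverged(self, tolerance=1e-9):
        """True when the working-precision result has drifted away from the shadow."""
        return self.divergence() > tolerance


if __name__ == "__main__":
    big = Shadowed(1e16)
    result = (big + 1.0) - big  # cancellation: the double loses the added 1.0
    if result.diverged():
        print("warning: double =", result.value, "shadow =", result.shadow)
```

Unlike random rounding, this check is fully repeatable, at the cost of carrying a second value through every operation.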
|
Re: [Valgrind-developers] Request For Comments: Verrou,
a Valgrind tool for floating-point debugging
From: Roland M. <rol...@nr...> - 2017-06-01 11:47:01
|
On Mon, May 29, 2017 at 1:20 PM, FEVOTTE Francois
<fra...@ed...> wrote:
> Dear Valgrind developers,
>
> first, please forgive us if this post is out of place in this list.
>
> We would like to introduce Verrou [1], a floating-point error diagnostics
> tool based on Valgrind. The idea behind the tool is that it replaces all
> floating-point operations by randomly rounded ones (which means that instead
> of always rounding non-representable results to the nearest floating-point
> number, one of the two nearest floating-point numbers is chosen randomly).
It would be nice to be able to "define" the randomness here, e.g. by
providing a pseudo-random-number generator and a command line option
to provide the "seed" value. Point is that you can actually debug
issues in a deterministic&&repeatable way *IF* they happen.
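As a small illustration of the point about repeatability, the Python sketch below shows how a seed taken from the command line makes a stream of random rounding decisions reproducible. The option name --seed and the function names are hypothetical; they are not taken from Verrou or Valgrind.

```python
import argparse
import random


def parse_seed(argv):
    """Read an optional integer seed from the command line (hypothetical --seed option)."""
    parser = argparse.ArgumentParser(description="toy random-rounding driver")
    parser.add_argument("--seed", type=int, default=None,
                        help="seed for the rounding PRNG; omit for a fresh seed")
    return parser.parse_args(argv).seed


def rounding_decisions(seed, count):
    """Return `count` up/down decisions (True = round up) drawn from a seeded PRNG."""
    rng = random.Random(seed)  # seed=None falls back to OS entropy
    return [rng.random() < 0.5 for _ in range(count)]


if __name__ == "__main__":
    seed = parse_seed(None)  # None means: use sys.argv[1:]
    print(seed, rounding_decisions(seed, 8))
```

Running the script twice with the same --seed value yields the same decisions, so a run that produced a strange result can be replayed under a debugger.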
> Instrumented program results thus become realizations of a random variable,
> the dispersion of which gives an estimation of the impact of the
> accumulation of floating-point round-off errors during program execution. In
> the computer arithmetic community, this technique is known as an
> asynchronous CESTAC method, which is a variant of Monte-Carlo arithmetic.
> More details can be found in Verrou's user manual [2].
>
> This work was pursued at EDF R&D [3], but we think such a tool might be of
> broader interest, especially since Valgrind's "Project Suggestions" page
> lists the detection of floating-point inaccuracies as a topic of interest.
Another issue is to make sure things like +nan/-nan and NaNs with
payloads work correctly with your tool since there are lots of
applications which use this kind of stuff for error ("error" as in
"error message") propagation.
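For reference, an IEEE 754 double is a NaN when all exponent bits are set and the mantissa is non-zero; the sign bit and the remaining mantissa bits (the payload) are free to carry information. The Python sketch below builds such values and compares them bit for bit, which is one way results from a native run and an instrumented run could be checked. The helper names are invented for this example, and it assumes the common convention that the top mantissa bit marks a quiet NaN.

```python
import math
import struct

PAYLOAD_MASK = (1 << 51) - 1  # mantissa bits below the quiet bit


def make_nan(payload, negative=False):
    """Build a quiet NaN double carrying `payload` in its low 51 mantissa bits."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 51 bits")
    bits = (0x7FF8 << 48) | payload  # exponent all ones + quiet bit
    if negative:
        bits |= 1 << 63
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def float_bits(x):
    """Return the 64-bit pattern of a double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def nan_payload(x):
    """Extract the payload of a NaN built by make_nan."""
    return float_bits(x) & PAYLOAD_MASK


def same_nan(a, b):
    """True when two NaNs agree in sign and payload (NaN == NaN is always False)."""
    return math.isnan(a) and math.isnan(b) and float_bits(a) == float_bits(b)


if __name__ == "__main__":
    err = make_nan(0x1234, negative=True)
    print(hex(float_bits(err)), hex(nan_payload(err)))
```

Whether an arithmetic operation propagates the payload of its NaN operand depends on the hardware, so a comparison like same_nan is best applied to values the program stores and reloads rather than to arbitrary computed results.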
----
Bye,
Roland
--
__ . . __
(o.\ \/ /.o) rol...@nr...
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)
|
|
Re: [Valgrind-developers] Request For Comments: Verrou,
a Valgrind tool for floating-point debugging
From: FEVOTTE F. <fra...@ed...> - 2017-06-02 13:21:45
|
On Thursday, 1 June 2017 at 13:46 +0200, Roland Mainz wrote:
> It would be nice to be able to "define" the randomness here, e.g. by
> providing a pseudo-random-number generator and a command line option
> to provide the "seed" value.

We are planning to add a command-line switch to provide the seed value (https://github.com/edf-hpc/verrou/issues/3), but I don't think that we will go any further than that. In other words, we don't plan to change the PRNG or to let the user define it.

> Point is that you can actually debug
> issues in a deterministic&&repeatable way *IF* they happen.

In order to have a deterministic way to perturb results, we have a "farthest" rounding mode, which always rounds in the opposite direction to the standard nearest rounding mode (it leaves representable values unchanged, though). This is documented here:
http://edf-hpc.github.io/verrou/vr-manual.html#vr-manual.feat.rounding-mode

> Another issue is to make sure things like +nan/-nan and NaNs with
> payloads work correctly with your tool since there are lots of
> applications which use this kind of stuff for error ("error" as in
> "error message") propagation.

This is a very good point, thanks! I just checked and it appears that Verrou currently preserves NaN values, but sometimes changes an infinite value into a (large but) finite one. Also, NaN payloads are sometimes changed. So there is some work to do here. I opened an issue to handle this:
https://github.com/edf-hpc/verrou/issues/4

Thanks for your comment,
François
|
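The behaviour of the "farthest" rounding mode described in this reply can be imitated in Python as follows. This is a sketch of the rounding rule only, under the assumption that the exact result is available as a rational number; it says nothing about how Verrou implements the mode. The function name round_farthest is invented, and math.nextafter requires Python 3.9 or later.

```python
import math
from fractions import Fraction


def round_farthest(exact):
    """Round an exact rational to the neighbouring double that round-to-nearest rejects."""
    nearest = float(exact)  # Fraction -> float conversion rounds to nearest
    if Fraction(nearest) == exact:
        return nearest  # representable values are left unchanged
    if Fraction(nearest) < exact:
        # nearest lies below the exact value, so the farther neighbour is above it
        return math.nextafter(nearest, math.inf)
    return math.nextafter(nearest, -math.inf)


if __name__ == "__main__":
    exact = Fraction(1, 10)
    print(repr(float(exact)), repr(round_farthest(exact)))
```

Because no random number is involved, every run perturbs the computation in the same way, which is what makes the mode usable for debugging.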
|
From: FEVOTTE F. <fra...@ed...> - 2017-06-02 12:31:32
|
Hello,

On Thursday, 1 June 2017 at 11:29 +0000, joh...@si... wrote:
> This is potentially interesting for what I do. Is the documentation at
> http://edf-hpc.github.io/verrou/vr-manual.html up to date?

Yes, it should be.

> Something that would be very useful would be control of the seed of
> the random number generator, so that we could repeat and debug cases
> that gave strange results.

I actually think we had this feature in earlier versions of Verrou, and I'm not sure when and how it disappeared. But I opened an issue on GitHub (https://github.com/edf-hpc/verrou/issues/3) and will (re)introduce this feature as soon as I can.

> The code I work on is a large mathematical modeller, with perhaps
> 70,000 functions. Rather than using an exclusion file, it would be
> very useful to have an inclusion file as an alternative, where one
> could specify only the functions that should be subject to
> perturbation. We'd like this because it’s quite difficult to
> adequately test various numerical algorithms outside the context of
> the modeller.

This one is also on our to-do list. One way to work around the problem is to let Verrou generate the whole list of functions it encounters during a test run, and then provide this list as an exclusion list. If you give the full list, nothing will be perturbed; if you comment out some functions in the list, only these functions will be perturbed. The generation of exclusion lists is documented here:
http://edf-hpc.github.io/verrou/vr-manual.html#idm6262

Thank you for your interest in Verrou.

Best regards,
François
|
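The workaround described in this reply (generate the full exclusion list, then comment out the functions to perturb) is easy to script. The Python sketch below assumes, without having checked it against the manual, that each entry of the list starts with the function name followed by whitespace and that a leading "#" marks a comment; the function name comment_out_selected is invented for this example. Adjust it to the actual file format before use.

```python
def comment_out_selected(exclusion_lines, perturb):
    """Comment out the entries whose function name is in `perturb`.

    Assumed format (not verified): "function_name<whitespace>object", "#" = comment.
    Commented-out entries are no longer excluded, so only they get perturbed.
    """
    result = []
    for line in exclusion_lines:
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        if fields and not is_comment and fields[0] in perturb:
            result.append("# " + line)
        else:
            result.append(line)
    return result


if __name__ == "__main__":
    generated = [
        "main\t/path/to/modeller",
        "dot_product\t/path/to/modeller",
        "solve_tridiagonal\t/path/to/modeller",
    ]
    for entry in comment_out_selected(generated, {"dot_product"}):
        print(entry)
```

With a list of some 70,000 functions, this turns the generated exclusion file into the equivalent of an inclusion file for the handful of routines under study.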