|
From: Tina H. <tin...@gm...> - 2014-01-21 21:49:28
|
I am new to this list. Can anyone help me dissect a problem with Valgrind's long double FP math on x86-64 CPUs? We're getting major malfunctions in our applications because the result of any long double operation (say y=sinl(x)) contains rubbish in the least significant bits.

Tina
--
Tina Harriott - Women in Mathematics
Contact: tin...@gm... |
|
From: John R. <jr...@Bi...> - 2014-01-21 21:59:35
|
> Can anyone guide me dissect a problem with
> valgrinds long double fp math on x86-64 cpus? We're getting major
> malfunctions in our applications because any long double operation
> (say y=sinl(x)) contains rubbish in the least significant bits.

This is a well-known and long-standing property of valgrind. For instance:
http://valgrind.10908.n7.nabble.com/valgrind-does-not-handle-long-double-td41633.html
[Search the 'net for "valgrind long double".]

There is no solution. If IEEE-754 'double' is not good enough for you, then valgrind is not good enough for you. |
|
From: Tom H. <to...@co...> - 2014-01-21 23:04:08
|
On 21/01/14 21:49, Tina Harriott wrote:

> I am new to this list. Can anyone guide me dissect a problem with
> valgrinds long double fp math on x86-64 cpus? We're getting major
> malfunctions in our applications because any long double operation
> (say y=sinl(x)) contains rubbish in the least significant bits.

See the "Limitations" section of the manual:

http://www.valgrind.org/docs/manual/manual-core.html#manual-core.limits

Specifically the section about floating point limitations.

Tom
--
Tom Hughes (to...@co...)
http://compton.nu/ |
|
From: Lionel C. <lio...@gm...> - 2014-01-21 23:19:35
|
On 22 January 2014 00:03, Tom Hughes <to...@co...> wrote:
> On 21/01/14 21:49, Tina Harriott wrote:
>
>> I am new to this list. Can anyone guide me dissect a problem with
>> valgrinds long double fp math on x86-64 cpus? We're getting major
>> malfunctions in our applications because any long double operation
>> (say y=sinl(x)) contains rubbish in the least significant bits.
>
> See the "Limitations" section of the manual:
>
> http://www.valgrind.org/docs/manual/manual-core.html#manual-core.limits
>
> Specifically the section about floating point limitations.

"...Whether or not this is critical remains to be seen..."

Yeah, that comment is so 'funny' that it hurts again. The difference between 64-bit math and 80-bit math is whether an MBDA Meteor missile will hit its target or not (Michael Östergren holds a personal grudge against valgrind because of weeks lost due to this particular embarrassing bug screwing up simulations, btw), whether the beams in the LHC will meet the (intended!) target or not, and whether math in SAS software works or not (warranting a warning in the written documentation that running with valgrind to test 3rd party plugins is not supported). So the list of things which do *NOT* work with valgrind is *impressive* and hurts high-value projects, IMHO warranting at least the removal of that mocking comment "...Whether or not this is critical remains to be seen...". Please.

Lionel |
|
From: Philippe W. <phi...@sk...> - 2014-01-22 21:36:49
|
On Wed, 2014-01-22 at 00:19 +0100, Lionel Cons wrote:
> [snip]
> So the list of things which do *NOT* work with
> valgrind is *impressive* and hurt high value projects, IMHO warranting
> at least the removal of that mocking comment "...Whether or not this
> is critical remains to be seen...". Please.

Indeed, it seems clear that many applications have problems with this aspect. It would be better to rephrase the doc :).

Now, maybe these applications should be made compilable with 64-bit floats; they would/should then work properly both natively and under valgrind.

The gcc documentation says for -mfpmath=sse:

  The resulting code should be considerably faster in the majority of
  cases and avoid the numerical instability problems of 387 code, but
  may break some existing code that expects temporaries to be 80
  bits.

So, you might try to compile your app with the above flag (I guess you might need a #ifdef or so to have a typedef that is 80 bits without the above, and 64 bits with the above).

But of course, we all agree it would be nice to have 80-bit floats properly supported by Valgrind. It is just that nobody has the time/money/effort to spend on that :(.

Philippe |
|
From: Tina H. <tin...@gm...> - 2014-01-29 20:54:12
|
On 22 January 2014 22:36, Philippe Waroquiers <phi...@sk...> wrote:
> [snip]
> But of course, we all agree it would be nice to have 80 bits floats
> properly supported by Valgrind. It is just nobody has time/money/effort
> to spend on that :(.

Kickstarter project maybe?

Tina
--
Tina Harriott - Women in Mathematics
Contact: tin...@gm... |
|
From: Lionel C. <lio...@gm...> - 2014-02-13 11:48:30
|
On 29 January 2014 21:54, Tina Harriott <tin...@gm...> wrote:
> On 22 January 2014 22:36, Philippe Waroquiers <phi...@sk...> wrote:
> [snip]
>> But of course, we all agree it would be nice to have 80 bits floats
>> properly supported by Valgrind. It is just nobody has time/money/effort
>> to spend on that :(.
>
> Kickstarter project maybe?

Philippe?

Lionel |
|
From: Philippe W. <phi...@sk...> - 2014-02-22 02:19:20
|
On Thu, 2014-02-13 at 12:48 +0100, Lionel Cons wrote:
> >> But of course, we all agree it would be nice to have 80 bits floats
> >> properly supported by Valgrind. It is just nobody has time/money/effort
> >> to spend on that :(.
> >
> > Kickstarter project maybe?
>
> Philippe?

Looks interesting, but I do not think I am qualified to create (and realise) this project for various reasons (lack of time, lack of knowledge, and I believe some legal conditions).

But as I understand it, Julian is doing some (professional, not amateur like me) work on Valgrind. He might be more interested.

Philippe |
|
From: Julian S. <js...@ac...> - 2014-02-24 10:03:08
|
>>> But of course, we all agree it would be nice to have 80 bits floats
>>> properly supported by Valgrind.

To support this for 64-bit processes would require, roughly:

* add an F80 (80-bit floating point) type to IR
* add relevant 80-bit equivalents of the relevant IROps (AddF80, SubF80, SinF80, etc)
* change the front end (guest_amd64_toIR.c) to generate IR that uses those new IROps
* change the back end (host_amd64_isel.c) to generate 80 bit FP instructions from that IR.

Much of the back end stuff could be imported from the x86 (32-bit) compilation pipeline. That already has machinery to generate x87 code and in particular to deal with the x87 register-stack weirdness.

That would get a baseline simulator (--tool=none) that works OK. Getting Memcheck to work requires extra steps:

* add an I80 (80-bit integer) type to IR
* add 80 bit versions of the few IROps that Memcheck requires (CmpwNEZ80, CmpNEZ80, Left80)
* add instruction selection to deal with those. The tricky bit is that these will have to be generated into register-pairs, in the style of the existing iselInt128Expr, except that only the lowest 16 bits of the high-part register are used.

And comprehensive test cases, of course. These are the most important single piece, since they can be used to drive the rest of the development.

J |
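As a rough idea of what the smallest such test case might check, here is a sketch in C. It is an assumption about what a useful probe looks like, not part of any existing Valgrind test suite, and the function name `long_double_full_precision` is hypothetical. It adds `LDBL_EPSILON` to 1 and checks that the result is distinguishable from 1, which requires the addition to keep every mantissa bit of `long double`.

```c
#include <float.h>

/* Hypothetical probe: returns 1 if long double addition keeps the full
 * LDBL_MANT_DIG-bit mantissa, 0 if the low-order bits are lost (for
 * example because the arithmetic was carried out in 64-bit double). */
int long_double_full_precision(void)
{
    /* volatile prevents the compiler from evaluating this at build time */
    volatile long double one = 1.0L;
    volatile long double eps = LDBL_EPSILON;
    volatile long double sum = one + eps;
    return sum != one;
}
```

Run natively, the probe should return 1. If the emulation rounds the sum to a 53-bit mantissa, 1 + 2^-63 collapses to 1 on x86-64 and the probe returns 0, so a test built on it would fail until 80-bit support is in place. A real suite would need the same kind of check for every x87 operation and rounding mode.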
|
From: Roland M. <rol...@nr...> - 2014-02-24 14:26:20
|
On Mon, Feb 24, 2014 at 11:02 AM, Julian Seward <js...@ac...> wrote:
>
>>>> But of course, we all agree it would be nice to have 80 bits floats
>>>> properly supported by Valgrind.
>
> To support this for 64-bit processes would require, roughly:
>
> * add an F80 (80-bit floating point) type to IR
> * add relevant 80-bit equivalents of the relevant IROps
>   (AddF80, SubF80, SinF80, etc)
> * change the front end (guest_amd64_toIR.c) to generate
>   IR that uses those new IROps
> * change the back end (host_amd64_isel.c) to generate 80
>   bit FP instructions from that IR.
>
> Much of the back end stuff could be imported from the x86 (32-bit)
> compilation pipeline. That already has machinery to generate
> x87 code and in particular to deal with the x87 register-stack
> wierdness.
>
> That would get a baseline simulator (--tool=none) that works OK,
> Getting Memcheck to work requires extra steps:
[snip]

Just curious: Does valgrind have 128bit floating-point support somewhere? If "yes" ... could core parts of it be adopted/"dumbed down" to do some of the 80bit parts?

----

Bye,
Roland

--
  __ .  . __
 (o.\ \/ /.o) rol...@nr...
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;) |
|
From: Florian K. <fl...@ei...> - 2014-02-24 15:14:04
|
On 02/24/2014 03:26 PM, Roland Mainz wrote:
> On Mon, Feb 24, 2014 at 11:02 AM, Julian Seward <js...@ac...> wrote:
>
> Just curious:
> Does valgrind have 128bit floating-point support somewhere? If "yes"
> ... could core parts of it be adopted/"dumbed down" to do some of the
> 80bit parts ?

The s390 port supports 128-bit floating point. You could look at that for how it is integrated. But it won't buy you much otherwise.

The problem with this task is not that it is difficult. The issue is that it is mostly busy work and the testing part is as unsexy as it is important. Maybe somebody with ties into Intel or AMD can convince them to sponsor a person to do the work....

Florian |
|
From: Patrick J. L. <lop...@gm...> - 2014-01-23 02:44:32
|
On Wed, Jan 22, 2014 at 1:36 PM, Philippe Waroquiers <phi...@sk...> wrote:
>
> The gcc documentation says for -mfpmath=sse:
>
>   The resulting code should be considerably faster in the majority of
>   cases and avoid the numerical instability problems of 387 code, but
>   may break some existing code that expects temporaries to be 80
>   bits.

"Considerably faster" is, if anything, an understatement. The performance advantage of "double" vs. "long double" on Intel CPUs is large, and it grows with each generation of hardware and compilers. (SSE registers can hold two doubles, AVX can hold four, and compilers keep getting smarter about vectorization.)

Do any common platforms, other than x86/x86_64, offer more-than-64-bit "long double"?

A quick search turns up this bug report:

https://bugs.kde.org/show_bug.cgi?id=164298

The commentary there plus the "CLOSED WONTFIX" resolution make it fairly clear how the Valgrind maintainers feel about this issue. I would not expect to see "long double" support in Valgrind until someone outside the core team offers patches and/or money.

- Pat |
|
From: Dallman, J. <joh...@si...> - 2014-01-23 10:12:58
|
> Do any common platforms, other than x86/x86_64, offer more-than-64-bit "long double"?

Not that they support as full-speed hardware operations, AFAIK. SPARC has defined registers and instructions for 128-bit floating point, but implements them as sequences of operations on 64-bit floats, so they aren't terribly fast.

IBM System/390 onwards supports 128-bit float in hardware, according to Wikipedia.

--
John Dallman |
|
From: Christian B. <bor...@de...> - 2014-01-23 11:16:17
|
On 23/01/14 11:12, Dallman, John wrote:
>> Do any common platforms, other than x86/x86_64, offer more-than-64-bit "long double"?
>
> Not that they support as full speed hardware operations, AFAIK. SPARC has defined
> registers and instructions for 128-bit floating point, but implements them as
> sequences of operations on 64-bit floats, so they aren't terribly fast.
>
> IBM System/390 onwards supports 128-bit float in hardware, according to Wikipedia.

Yes, s390 does use 128-bit float for long double (if -mlong-double-128 is used, which has been the default for a while). The valgrind port for s390x supports that, but I think that the valgrind test coverage for 128-bit float is not that good.

Christian |
|
From: Irek S. <isz...@gm...> - 2014-01-23 23:26:17
|
On Thu, Jan 23, 2014 at 11:12 AM, Dallman, John <joh...@si...> wrote:
>> Do any common platforms, other than x86/x86_64, offer more-than-64-bit "long double"?
>
> Not that they support as full speed hardware operations, AFAIK. SPARC has defined
> registers and instructions for 128-bit floating point, but implements them as
> sequences of operations on 64-bit floats, so they aren't terribly fast.

This is totally wrong.

1. The SPARCV9 specification *requires* support for 128-bit floating point, including load/store and all other operations supported for single and double floating point. It also mandates specific libc interfaces. Each 128-bit SPARCV9 floating point register uses two 64-bit floating point registers, so you have half the number of registers but can still have decent performance.

2. Until very recently the SPARCV9 hardware *implementations* did not support the 128-bit floating point instructions in hardware, but since the SPARCV9 specification mandates their support they are emulated by kernel traps and separately via the mandated libc wrappers. This changed with recent Fujitsu CPU versions which implement 128-bit floating point in hardware, including support for register renaming, with performance almost as fast as the double instructions. Older Fujitsu SPARCV9 CPU implementations have varying degrees of SPARCV9 128-bit floating point support, but at least the load/store and add instructions were always supported.

3. Some SPARC emulators like Transitive implement full SPARCV9 128-bit floating point instruction support, i.e. using the SPARCV9 instructions directly results in faster execution than going through the libc wrappers and having them executed by Transitive.

> IBM System/390 onwards supports 128-bit float in hardware, according to Wikipedia.

Right

Irek |
|
From: Alex B. <ker...@be...> - 2014-05-06 17:17:59
|
Irek Szczesniak <isz...@gm...> writes:

> On Thu, Jan 23, 2014 at 11:12 AM, Dallman, John
> <joh...@si...> wrote:
>>> Do any common platforms, other than x86/x86_64, offer more-than-64-bit "long double"?
>>
>> Not that they support as full speed hardware operations, AFAIK. SPARC has defined
>> registers and instructions for 128-bit floating point, but implements them as
>> sequences of operations on 64-bit floats, so they aren't terribly fast.
>
<snip>
> 3. Some SPARC emulators like Transitive implement full SPARCV9 128bit
> floating point instruction support, i.e. using the SPARCV9
> instructions directly results in faster execution than going through
> the libc wrappers and have them executed by Transitive
<snip>

FWIW the Transitive DBT also had to disable the generation of x87 code by the compiler as it could seriously screw up NaN propagation when interpreting SPARC floating point instructions.

--
Alex Bennée |