Thread: [Valgrind-users] valgrind breaks any long double operation on x86-64

[Valgrind-users] valgrind breaks any long double operation on x86-64

From: Tina H. <tin...@gm...> - 2014-01-21 21:49:28

I am new to this list. Can anyone help me dissect a problem with
Valgrind's long double FP math on x86-64 CPUs? We're getting major
malfunctions in our applications because the result of any long double
operation (say y=sinl(x)) contains rubbish in the least significant bits.

Tina
-- 
Tina Harriott  - Women in Mathematics
Contact: tin...@gm...
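
A minimal sketch of a self-check for the symptom described above, assuming a C99 compiler and a platform where long double is wider than double (as on x86-64 Linux). The function names are made up for illustration and do not come from the thread. The check relies only on the definition of LDBL_EPSILON, so it should pass natively; under a tool that evaluates long double arithmetic at 64-bit double precision it would be expected to fail.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if 1.0L + LDBL_EPSILON is still
   distinguishable from 1.0L at run time, 0 otherwise.  The volatile
   qualifiers keep the compiler from folding the sum at compile time. */
int long_double_keeps_epsilon(void)
{
    volatile long double one = 1.0L;
    volatile long double eps = LDBL_EPSILON;
    volatile long double sum = one + eps;
    return sum != one;
}

/* Print the verdict together with the mantissa widths from <float.h>. */
void report_long_double_precision(void)
{
    printf("DBL_MANT_DIG=%d LDBL_MANT_DIG=%d extended precision %s\n",
           DBL_MANT_DIG, LDBL_MANT_DIG,
           long_double_keeps_epsilon() ? "intact" : "lost");
}
```

Calling report_long_double_precision() once at program start, both natively and under the tool, gives a quick indication of whether the low bits of long double results can be trusted in that environment.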

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: John R. <jr...@Bi...> - 2014-01-21 21:59:35

> Can anyone help me dissect a problem with
> Valgrind's long double FP math on x86-64 CPUs? We're getting major
> malfunctions in our applications because the result of any long double
> operation (say y=sinl(x)) contains rubbish in the least significant bits.

This is a well-known and long-standing property of valgrind.  For instance:
http://valgrind.10908.n7.nabble.com/valgrind-does-not-handle-long-double-td41633.html
[Search the 'net for "valgrind long double".]

There is no solution.  If IEEE-754 'double' is not good enough for you,
then valgrind is not good enough for you.

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Tom H. <to...@co...> - 2014-01-21 23:04:08

On 21/01/14 21:49, Tina Harriott wrote:

> I am new to this list. Can anyone help me dissect a problem with
> Valgrind's long double FP math on x86-64 CPUs? We're getting major
> malfunctions in our applications because the result of any long double
> operation (say y=sinl(x)) contains rubbish in the least significant bits.

See the "Limitations" section of the manual:

   http://www.valgrind.org/docs/manual/manual-core.html#manual-core.limits

Specifically the section about floating point limitations.

Tom

-- 
Tom Hughes (to...@co...)
http://compton.nu/

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Lionel C. <lio...@gm...> - 2014-01-21 23:19:35

On 22 January 2014 00:03, Tom Hughes <to...@co...> wrote:
> See the "Limitations" section of the manual:
>
>    http://www.valgrind.org/docs/manual/manual-core.html#manual-core.limits
>
> Specifically the section about floating point limitations.

"...Whether or not this is critical remains to be seen..."

Yeah, that comment is so 'funny' that it hurts again. The difference
between 64bit math and 80bit math is whether an MBDA Meteor missile
will hit its target or not (Michael Östergren holds a personal grudge
against valgrind because of weeks lost due to this particular
embarrassing bug screwing up simulations btw), whether the beams in
the LHC will meet the (intended!) target or not, whether math in SAS
software works or not (warranting a warning in the written
documentation that running with valgrind to test 3rd party plugins is
not supported). So the list of things which do *NOT* work with
valgrind is *impressive* and hurts high value projects, IMHO warranting
at least the removal of that mocking comment "...Whether or not this
is critical remains to be seen...". Please.

Lionel

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Philippe W. <phi...@sk...> - 2014-01-22 21:36:49

On Wed, 2014-01-22 at 00:19 +0100, Lionel Cons wrote:
> "...Whether or not this is critical remains to be seen..."
> 
> Yeah, that comment is so 'funny' that it hurts again. The difference
> between 64bit math and 80bit math is whether an MBDA Meteor missile
> will hit its target or not (Michael Östergren holds a personal grudge
> against valgrind because of weeks lost due to this particular
> embarrassing bug screwing up simulations btw), whether the beams in
> the LHC will meet the (intended!) target or not, whether math in SAS
> software works or not (warranting a warning in the written
> documentation that running with valgrind to test 3rd party plugins is
> not supported). So the list of things which do *NOT* work with
> valgrind is *impressive* and hurts high value projects, IMHO warranting
> at least the removal of that mocking comment "...Whether or not this
> is critical remains to be seen...". Please.

Indeed, it seems clear that many applications have problems
with this aspect. It would be better to rephrase the doc :).

Now, maybe these applications should be compilable with
64-bit floats, so that they would work properly both natively
and under valgrind.

The gcc documentation says for -mfpmath=sse: 

  The resulting code should be considerably faster in the majority of
  cases and avoid the numerical instability problems of 387 code, but
  may break some existing code that expects temporaries to be 80
  bits.

So, you might try to compile your app with the above flag
(I guess you might need an #ifdef or so to have a typedef that
is 80 bits without the above, and 64 bits with the above).

But of course, we all agree it would be nice to have 80-bit floats
properly supported by Valgrind. It is just that nobody has the
time/money/effort to spend on that :(.

Philippe
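
The typedef switch suggested above could look roughly like the following. This is only a sketch: the macro APP_REAL_IS_DOUBLE and the names real_t, REAL_C, REAL_FMT, real_sin and mean_of are invented for illustration. One caveat: on x86-64, GCC already defaults to -mfpmath=sse for float and double, while variables declared long double still go through the x87 unit, so the switch most likely has to be driven by a project-defined macro (passed with -D) rather than by the -mfpmath flag itself.

```c
#include <math.h>
#include <stddef.h>

/* Build with -DAPP_REAL_IS_DOUBLE for runs under Valgrind
   (hypothetical macro name, not a GCC or Valgrind feature). */
#ifdef APP_REAL_IS_DOUBLE
typedef double real_t;
#define REAL_C(x)   x           /* plain double literal        */
#define REAL_FMT    "%.17g"     /* printf format for real_t    */
#define real_sin(x) sin(x)
#else
typedef long double real_t;
#define REAL_C(x)   x##L        /* long double literal         */
#define REAL_FMT    "%.21Lg"
#define real_sin(x) sinl(x)
#endif

/* Application code is written against real_t only, so a single
   rebuild switches every computation to the chosen precision. */
real_t mean_of(const real_t *values, size_t count)
{
    real_t sum = REAL_C(0.0);
    for (size_t i = 0; i < count; i++)
        sum += values[i];
    return count ? sum / (real_t)count : REAL_C(0.0);
}
```

Code that calls real_sin() needs the math library at link time (typically -lm). Results will of course differ in the last bits between the two builds, so tolerances in tests need to be chosen with the narrower type in mind.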

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Tina H. <tin...@gm...> - 2014-01-29 20:54:12

On 22 January 2014 22:36, Philippe Waroquiers
<phi...@sk...> wrote:
> But of course, we all agree it would be nice to have 80-bit floats
> properly supported by Valgrind. It is just that nobody has the
> time/money/effort to spend on that :(.

Kickstarter project maybe?

Tina
-- 
Tina Harriott  - Women in Mathematics
Contact: tin...@gm...

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Lionel C. <lio...@gm...> - 2014-02-13 11:48:30

On 29 January 2014 21:54, Tina Harriott <tin...@gm...> wrote:
> Kickstarter project maybe?

Philippe?


Lionel

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Philippe W. <phi...@sk...> - 2014-02-22 02:19:20

On Thu, 2014-02-13 at 12:48 +0100, Lionel Cons wrote:
> >> But of course, we all agree it would be nice to have 80-bit floats
> >> properly supported by Valgrind. It is just that nobody has the
> >> time/money/effort to spend on that :(.
> >
> > Kickstarter project maybe?
> 
> Philippe?
>  
Looks interesting, but I do not think I am qualified to create
(and realise) this project for various reasons
(lack of time, lack of knowledge, and I believe some legal conditions).

But as I understand it, Julian is doing some (professional, not amateur
like me) work on Valgrind.
He might be more interested.

Philippe

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Julian S. <js...@ac...> - 2014-02-24 10:03:08

>>> But of course, we all agree it would be nice to have 80-bit floats
>>> properly supported by Valgrind.

To support this for 64-bit processes would require, roughly:

* add an F80 (80-bit floating point) type to IR
* add relevant 80-bit equivalents of the relevant IROps
  (AddF80, SubF80, SinF80, etc)
* change the front end (guest_amd64_toIR.c) to generate
  IR that uses those new IROps
* change the back end (host_amd64_isel.c) to generate 80
  bit FP instructions from that IR.

Much of the back end stuff could be imported from the x86 (32-bit)
compilation pipeline.  That already has machinery to generate
x87 code and in particular to deal with the x87 register-stack
weirdness.

That would get a baseline simulator (--tool=none) that works OK.
Getting Memcheck to work requires extra steps:

* add an I80 (80-bit integer) type to IR
* add 80 bit versions of the few IROps that Memcheck requires
  (CmpwNEZ80, CmpNEZ80, Left80)
* add instruction selection to deal with those.  The tricky bit
  is that these will have to be generated into register-pairs,
  in the style of the existing iselInt128Expr, except that only
  the lowest 16 bits of the high-part register are used.

And comprehensive test cases, of course.  These are the most
important single piece, since they can be used to drive the rest
of the development.

J
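
The register-pair idea in the Memcheck steps above can be sketched in plain C. This is not VEX code and none of these names exist in Valgrind; the type pair80 and the pair80_* functions are hypothetical. The semantics are assumed by analogy with the existing 64-bit ops: CmpNEZ tests for non-zero, CmpwNEZ widens that test to an all-ones or all-zeros word, and Left computes x | -x.

```c
#include <stdint.h>

/* Illustrative only: an 80-bit value held as a pair of 64-bit words,
   where just the low 16 bits of 'hi' are meaningful. */
typedef struct {
    uint64_t lo;   /* bits 0..63 */
    uint64_t hi;   /* bits 64..79 in the low 16 bits, rest ignored */
} pair80;

#define PAIR80_HI_MASK UINT64_C(0xFFFF)

/* CmpNEZ80 analogue: 1 if any of the 80 bits is set, else 0. */
int pair80_cmpnez(pair80 x)
{
    return (x.lo | (x.hi & PAIR80_HI_MASK)) != 0;
}

/* CmpwNEZ80 analogue: all 80 bits set if x is non-zero, else all clear. */
pair80 pair80_cmpwnez(pair80 x)
{
    pair80 r;
    int nonzero = pair80_cmpnez(x);
    r.lo = nonzero ? UINT64_MAX : 0;
    r.hi = nonzero ? PAIR80_HI_MASK : 0;
    return r;
}

/* Left80 analogue: x | -x, using an 80-bit two's-complement negation. */
pair80 pair80_left(pair80 x)
{
    uint64_t hi = x.hi & PAIR80_HI_MASK;
    uint64_t neg_lo = ~x.lo + 1;   /* wraps to 0 when lo == 0 */
    /* the +1 carries into the high part only when the low word is 0 */
    uint64_t neg_hi = (~hi + (x.lo == 0)) & PAIR80_HI_MASK;
    pair80 r;
    r.lo = x.lo | neg_lo;
    r.hi = hi | neg_hi;
    return r;
}
```

The masking with PAIR80_HI_MASK is the part that differs from a 128-bit pair: whatever sits in the upper 48 bits of the high word has to be ignored on input and kept clear on output.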

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Roland M. <rol...@nr...> - 2014-02-24 14:26:20

On Mon, Feb 24, 2014 at 11:02 AM, Julian Seward <js...@ac...> wrote:
>
>>>> But of course, we all agree it would be nice to have 80-bit floats
>>>> properly supported by Valgrind.
>
> To support this for 64-bit processes would require, roughly:
>
> * add an F80 (80-bit floating point) type to IR
> * add relevant 80-bit equivalents of the relevant IROps
>   (AddF80, SubF80, SinF80, etc)
> * change the front end (guest_amd64_toIR.c) to generate
>   IR that uses those new IROps
> * change the back end (host_amd64_isel.c) to generate 80
>   bit FP instructions from that IR.
>
> Much of the back end stuff could be imported from the x86 (32-bit)
> compilation pipeline.  That already has machinery to generate
> x87 code and in particular to deal with the x87 register-stack
> weirdness.
>
> That would get a baseline simulator (--tool=none) that works OK.
> Getting Memcheck to work requires extra steps:
[snip]

Just curious:
Does valgrind have 128bit floating-point support somewhere? If "yes"
... could core parts of it be adapted/"dumbed down" to do some of the
80bit parts?

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) rol...@nr...
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Florian K. <fl...@ei...> - 2014-02-24 15:14:04

On 02/24/2014 03:26 PM, Roland Mainz wrote:
> On Mon, Feb 24, 2014 at 11:02 AM, Julian Seward <js...@ac...> wrote:
> 
> Just curious:
> Does valgrind have 128bit floating-point support somewhere? If "yes"
> ... could core parts of it be adapted/"dumbed down" to do some of the
> 80bit parts?

The s390 port supports 128-bit floating point. You could look at that
for how it is integrated. But it won't buy you much otherwise.

The problem with this task is not that it is difficult. The issue is
that it is mostly busy work and the testing part is as unsexy as it is
important.

Maybe somebody with ties into Intel or AMD can convince them to sponsor
a person to do the work....

   Florian

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Patrick J. L. <lop...@gm...> - 2014-01-23 02:44:32

On Wed, Jan 22, 2014 at 1:36 PM, Philippe Waroquiers
<phi...@sk...> wrote:
>
> The gcc documentation says for -mfpmath=sse:
>
>   The resulting code should be considerably faster in the majority of
>   cases and avoid the numerical instability problems of 387 code, but
>   may break some existing code that expects temporaries to be 80
>   bits.

"Considerably faster" is, if anything, an understatement. The
performance advantage of "double" vs. "long double" on Intel CPUs is
large, and it grows with each generation of hardware and compilers.
(SSE registers can hold two doubles, AVX can hold four, and compilers
keep getting smarter about vectorization.)

Do any common platforms, other than x86/x86_64, offer more-than-64-bit
"long double"?

A quick search turns up this bug report:

https://bugs.kde.org/show_bug.cgi?id=164298

The commentary there plus the "CLOSED WONTFIX" resolution make it
fairly clear how the Valgrind maintainers feel about this issue.

I would not expect to see "long double" support in Valgrind until
someone outside the core team offers patches and/or money.

 - Pat
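
On the question of which platforms offer a wider-than-64-bit long double: a program can at least report what its own toolchain provides. The sketch below guesses the format from LDBL_MANT_DIG in <float.h>, assuming a radix-2 floating point implementation; the function name is hypothetical and the mapping covers only the common cases.

```c
#include <float.h>

/* Hypothetical helper: guess the long double format from the number
   of mantissa bits reported by <float.h> (assumes FLT_RADIX == 2). */
const char *long_double_format(void)
{
    switch (LDBL_MANT_DIG) {
    case 53:  return "same as double (IEEE binary64)";
    case 64:  return "x87 80-bit extended";
    case 106: return "IBM double-double";
    case 113: return "IEEE binary128";
    default:  return "unknown";
    }
}
```

This only describes the type the compiler uses; it says nothing about whether the operations run in hardware or are emulated in software.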

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Dallman, J. <joh...@si...> - 2014-01-23 10:12:58

> Do any common platforms, other than x86/x86_64, offer more-than-64-bit "long double"?

Not that they support as full speed hardware operations, AFAIK. SPARC has defined
registers and instructions for 128-bit floating point, but implements them as
sequences of operations on 64-bit floats, so they aren't terribly fast.

IBM System/390 onwards supports 128-bit float in hardware, according to Wikipedia.

--
John Dallman

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Christian B. <bor...@de...> - 2014-01-23 11:16:17

On 23/01/14 11:12, Dallman, John wrote:
>> Do any common platforms, other than x86/x86_64, offer more-than-64-bit "long double"?
> 
> Not that they support as full speed hardware operations, AFAIK. SPARC has defined
> registers and instructions for 128-bit floating point, but implements them as
> sequences of operations on 64-bit floats, so they aren't terribly fast.
> 
> IBM System/390 onwards supports 128-bit float in hardware, according to Wikipedia.

Yes, s390 does use 128 bit float for long double (if -mlong-double-128 is used, which
has been the default for a while). The valgrind port of s390x supports that, but I think
that the valgrind test coverage for 128 bit float is not that good.

Christian

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Irek S. <isz...@gm...> - 2014-01-23 23:26:17

On Thu, Jan 23, 2014 at 11:12 AM, Dallman, John
<joh...@si...> wrote:
>> Do any common platforms, other than x86/x86_64, offer more-than-64-bit "long double"?
>
> Not that they support as full speed hardware operations, AFAIK. SPARC has defined
> registers and instructions for 128-bit floating point, but implements them as
> sequences of operations on 64-bit floats, so they aren't terribly fast.

This is totally wrong.
1. The SPARCV9 specification *requires* support for 128bit floating
point, including load/store and all other operations supported for
single and double floating point operations. It also mandates specific
libc interfaces. Each 128bit SPARCV9 floating point register uses two
64bit floating point registers, so you have half the number of
registers but can still have decent performance.

2. Until very recently the SPARCV9 hardware *implementations* did not
support the 128bit floating point instructions in hardware, but since
the SPARCV9 specification mandates their support they are emulated by
kernel traps and separately via the mandated libc wrappers. This
changed with recent Fujitsu CPU versions which implement 128bit
floating point in hardware, including support for register renaming,
with performance almost as fast as the double instructions. Older
Fujitsu SPARCV9 CPU implementations have varying degrees of SPARCV9
128bit floating point support, but at least the load/store and add
instructions were always supported.

3. Some SPARC emulators like Transitive implement full SPARCV9 128bit
floating point instruction support, i.e. using the SPARCV9
instructions directly results in faster execution than going through
the libc wrappers and having them executed by Transitive.

> IBM System/390 onwards supports 128-bit float in hardware, according to Wikipedia.

Right

Irek

Re: [Valgrind-users] valgrind breaks any long double operation on x86-64

From: Alex B. <ker...@be...> - 2014-05-06 17:17:59

Irek Szczesniak <isz...@gm...> writes:

> On Thu, Jan 23, 2014 at 11:12 AM, Dallman, John
> <joh...@si...> wrote:
>>> Do any common platforms, other than x86/x86_64, offer more-than-64-bit "long double"?
>>
>> Not that they support as full speed hardware operations, AFAIK. SPARC has defined
>> registers and instructions for 128-bit floating point, but implements them as
>> sequences of operations on 64-bit floats, so they aren't terribly fast.
>
<snip>
> 3. Some SPARC emulators like Transitive implement full SPARCV9 128bit
> floating point instruction support, i.e. using the SPARCV9
> instructions directly results in faster execution than going through
> the libc wrappers and having them executed by Transitive.
<snip>

FWIW the Transitive DBT also had to disable the generation of x87 code
by the compiler as it could seriously screw up NaN propagation when
interpreting SPARC floating point instructions. 

-- 
Alex Bennée