|
From: Vladislav Y. <vl...@so...> - 2021-10-20 15:12:49
|
Hi,
I am using Valgrind 3.17.0 and noticed strange behavior while running
code compiled with clang:
#include <cmath>
#include <cstdio>

int main() {
    long double x = std::cbrtl(0.0L);
    printf("%Lf", x);
    return 0;
}
This example gives completely wrong output:
clang++ cbrtl.cpp -o cbrtl && valgrind ./cbrtl
==338929== Memcheck, a memory error detector
==338929== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==338929== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==338929== Command: ./cbrtl
==338929==
-0.178840
==338929==
==338929== HEAP SUMMARY:
==338929== in use at exit: 0 bytes in 0 blocks
==338929== total heap usage: 2 allocs, 2 frees, 73,728 bytes allocated
==338929==
==338929== All heap blocks were freed -- no leaks are possible
==338929==
==338929== For lists of detected and suppressed errors, rerun with: -s
==338929== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
However, using the -m32 option for clang or the gcc compiler (both 32- and
64-bit) prints 0.0 as expected. I know that Valgrind does not support 80-bit
long double and converts it to 64-bit.
I think this might cause the issue, as the original code is compiled for
80-bit long doubles. It might also be an issue with clang.
Regards,
Vlad Yaglamunov
|
|
From: Paul F. <pj...@wa...> - 2021-10-20 16:16:41
|
> Message of 20/10/21 17:14
> From: "Vladislav Yaglamunov"
> To: val...@li...
> Cc:
> Subject: [Valgrind-users] Cubic root of zero gives wrong result with clang 64-bit
>
> Hi,
> I am using Valgrind 3.17.0 and noticed a strange behavior while running a code compiled with clang:
>
> #include <cmath>
> #include <cstdio>
>
> int main() {
>     long double x = std::cbrtl(0.0L);
>     printf("%Lf", x);
>
>     return 0;
> }
I just did a test with g++. I had to modify the code a bit to prevent the compiler from replacing the cbrtl() call with just a zero:
#include <cmath>
#include <cstdio>

int main() {
    volatile long double arg{0.0L};
    long double x = std::cbrtl(arg);
    printf("%Lf", x);
    return 0;
}
This seems to work OK, both default and -m32.
The asm for that is
00000000004011a5 <main>:
4011a5: 55 push %rbp
4011a6: 48 89 e5 mov %rsp,%rbp
4011a9: 48 83 ec 20 sub $0x20,%rsp
4011ad: d9 ee fldz
4011af: db 7d e0 fstpt -0x20(%rbp)
4011b2: db 6d e0 fldt -0x20(%rbp)
4011b5: 48 8d 64 24 f0 lea -0x10(%rsp),%rsp
4011ba: db 3c 24 fstpt (%rsp)
4011bd: e8 8e fe ff ff callq 401050
4011c2: 48 83 c4 10 add $0x10,%rsp
4011c6: db 7d f0 fstpt -0x10(%rbp)
4011c9: ff 75 f8 pushq -0x8(%rbp)
4011cc: ff 75 f0 pushq -0x10(%rbp)
4011cf: bf 04 20 40 00 mov $0x402004,%edi
4011d4: b8 00 00 00 00 mov $0x0,%eax
4011d9: e8 62 fe ff ff callq 401040
4011de: 48 83 c4 10 add $0x10,%rsp
4011e2: b8 00 00 00 00 mov $0x0,%eax
4011e7: c9 leaveq
4011e8: c3 retq
I'll have a go with clang++ at home tonight.
A+
Paul
|
|
From: Paul F. <pj...@wa...> - 2021-10-20 18:13:12
|
On 10/20/21 16:16, Paul FLOYD wrote:
> This seems to work OK, both default and -m32.
Well, it works with clang++:
FreeBSD clang version 11.0.1 (gi...@gi...:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe)
Target: x86_64-unknown-freebsd13.0
I also tried clang++ 8, 9 and 12.
The asm for clang++12 for my modified code is
0000000000201920 <main>:
201920: 55 push %rbp
201921: 48 89 e5 mov %rsp,%rbp
201924: 48 83 ec 40 sub $0x40,%rsp
201928: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
20192f: d9 ee fldz
201931: db 7d e0 fstpt -0x20(%rbp)
201934: db 6d e0 fldt -0x20(%rbp)
201937: 48 89 e0 mov %rsp,%rax
20193a: db 38 fstpt (%rax)
20193c: e8 bf 00 00 00 call 201a00 <cbrtl@plt>
201941: db 7d d0 fstpt -0x30(%rbp)
201944: db 6d d0 fldt -0x30(%rbp)
201947: 48 89 e0 mov %rsp,%rax
20194a: db 38 fstpt (%rax)
20194c: bf c9 05 20 00 mov $0x2005c9,%edi
201951: 31 c0 xor %eax,%eax
201953: e8 b8 00 00 00 call 201a10 <printf@plt>
201958: 31 c0 xor %eax,%eax
20195a: 48 83 c4 40 add $0x40,%rsp
20195e: 5d pop %rbp
20195f: c3 ret
Can you post the asm that you are getting? I'll see if I can try clang
on Fedora 34.
A+
Paul
|
|
From: Paul F. <pj...@wa...> - 2021-10-20 19:41:07
|
On 20/10/2021 20:13, Paul Floyd wrote:
> Can you post the asm that you are getting ? I'll see if I can try
> clang on Fedora 34.
>
Third time lucky. Reproduced on Fedora 34 with clang++
clang version 12.0.1 (Fedora 12.0.1-1.fc34)
0000000000401140 <main>:
401140: 55 push %rbp
401141: 48 89 e5 mov %rsp,%rbp
401144: 48 83 ec 40 sub $0x40,%rsp
401148: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
40114f: d9 ee fldz
401151: db 7d e0 fstpt -0x20(%rbp)
401154: db 6d e0 fldt -0x20(%rbp)
401157: 48 89 e0 mov %rsp,%rax
40115a: db 38 fstpt (%rax)
40115c: e8 df fe ff ff callq 401040 <cbrtl@plt>
401161: db 7d d0 fstpt -0x30(%rbp)
401164: db 6d d0 fldt -0x30(%rbp)
401167: 48 89 e0 mov %rsp,%rax
40116a: db 38 fstpt (%rax)
40116c: bf 10 20 40 00 mov $0x402010,%edi
401171: 31 c0 xor %eax,%eax
401173: e8 b8 fe ff ff callq 401030 <printf@plt>
401178: 31 c0 xor %eax,%eax
40117a: 48 83 c4 40 add $0x40,%rsp
40117e: 5d pop %rbp
40117f: c3 retq
That looks very much like what I saw on FreeBSD.
Unless someone else has an idea, this is going to need some debugging
inside Valgrind.
A+
Paul
|
|
From: Florian W. <fw...@de...> - 2021-10-21 19:03:25
|
* Paul Floyd:
> Unless someone else has an Idea this is going to need some debugging
> inside Valgrind.
It's probably the glibc implementation that is incorrectly executed by
valgrind. “g++ -fno-builtin” reproduces the issue with the original
sources.
|
|
From: Paul F. <pj...@wa...> - 2021-10-21 21:02:50
|
On 21/10/2021 20:45, Florian Weimer wrote:
> * Paul Floyd:
>
>> Unless someone else has an Idea this is going to need some debugging
>> inside Valgrind.
> It's probably the glibc implementation that is incorrectly executed by
> valgrind. “g++ -fno-builtin” reproduces the issue with the original
> sources.
Yes, this is what is happening.
Stepping through libc cbrtl with debuginfo, outside of Valgrind the return is where the arrow is, presumably because fpclassify says that x is FP_ZERO. xe is an int; I can't imagine comparing it to zero is going to be a problem.
   54       if (xe == 0 && fpclassify (x) <= FP_ZERO)
  >55         return x + x;
The enum values returned by fpclassify are FP_NAN 0, FP_INFINITE 1, FP_ZERO 2.
Inside Valgrind this test is false, and the (wrong) value that is calculated corresponds to the expressions and constants in the code.
# define fpclassify(x) __builtin_fpclassify (FP_NAN, FP_INFINITE, \
    FP_NORMAL, FP_SUBNORMAL, FP_ZERO, x)
From what I see this builtin performs
/* fpclassify(x) ->
     isnan(x) ? FP_NAN :
       (fabs(x) == Inf ? FP_INFINITE :
         (fabs(x) >= DBL_MIN ? FP_NORMAL :
           (x == 0 ? FP_ZERO : FP_SUBNORMAL))).  */
For the first test, it looks like there is a 'fucomi'. That will set the parity flag if the input is NaN, but I only see ZF set.
The next test is another 'fucomi' comparing to the largest long double, followed by a 'ja', testing for Inf. The result is just CF set - less than.
The third test is a 'fcompi' with ldbl_min. This sets ZF, and is followed by a 'jae'. That's wrong. That would mean 0.0 == ldbl_min.
If internally we are working with double precision then it's to be expected that ldbl_min underflows to 0.0 and the comparison is true.
It's a shame the libc test is done in this order. I don't see any easy fix for this.
A+
Paul
|