From: Carl E. L. <ce...@us...> - 2017-04-26 16:03:32
On Wed, 2017-04-26 at 16:11 +0200, Julian Seward wrote:
> Carl,
>
> There are various refinements in memcheck/mc_translate.c that are related
> to accurate definedness tracking in this kind of case, for
> Iop_Cmp{32,64}, Iop_{Add,Sub}{32,64}, and Iop_CmpORD{32,64}{S,U}, the
> last of which are PPC64 specials. But they don't seem to have had any
> effect here.
>
> To diagnose this we really need to see the front end translation of the
> assembly fragment you showed, plus the uninstrumented, optimised IR
> relating to it. You should be able to get those with --tool=none
> --trace-flags=10001000 --trace-notbelow=<whatever>. Please get those
> and then we can see what's going on. Maybe!
>
> J
>
Julian:
Here is the test program.
#include <string.h>

int main ()
{
    char str1[15];
    char str2[15];

    strcpy(str1, "abcdef");
    strcpy(str2, "ABCDEF");
    return strcmp(str1, str2);
}
Here is a dump of the Power binary produced by gcc7 for the code we are
interested in.
0000000010000460 <main>:
#include <string.h>
int main ()
{
10000460: 02 10 40 3c lis r2,4098
10000464: 00 7f 42 38 addi r2,r2,32512
char str1[15];
char str2[15];
strcpy(str1, "abcdef");
strcpy(str2, "ABCDEF");
10000468: fe ff 22 3d addis r9,r2,-2
1000046c: 40 8a 29 81 lwz r9,-30144(r9)
10000470: fe ff 42 3d addis r10,r2,-2
10000474: 46 8a 4a 89 lbz r10,-30138(r10)
{
10000478: c1 ff 21 f8 stdu r1,-64(r1)
strcpy(str1, "abcdef");
1000047c: fe ff a2 3c addis r5,r2,-2
10000480: 38 8a a5 80 lwz r5,-30152(r5)
10000484: fe ff c2 3c addis r6,r2,-2
10000488: 3c 8a c6 a0 lhz r6,-30148(r6)
1000048c: fe ff e2 3c addis r7,r2,-2
10000490: 3e 8a e7 88 lbz r7,-30146(r7)
strcpy(str2, "ABCDEF");
10000494: fe ff 02 3d addis r8,r2,-2
10000498: 44 8a 08 a1 lhz r8,-30140(r8)
1000049c: 20 00 21 91 stw r9,32(r1)
100004a0: 26 00 41 99 stb r10,38(r1)
return strcmp(str1, str2);
100004a4: 30 00 21 39 addi r9,r1,48
100004a8: 20 00 41 39 addi r10,r1,32
strcpy(str1, "abcdef");
100004ac: 30 00 a1 90 stw r5,48(r1)
100004b0: 34 00 c1 b0 sth r6,52(r1)
100004b4: 36 00 e1 98 stb r7,54(r1)
strcpy(str2, "ABCDEF");
100004b8: 24 00 01 b1 sth r8,36(r1)
return strcmp(str1, str2);
100004bc: 28 4c 20 7d ldbrx r9,0,r9
100004c0: 28 54 40 7d ldbrx r10,0,r10
100004c4: 51 48 6a 7c subf. r3,r10,r9
100004c8: 1c 00 82 40 bne 100004e4 <main+0x84>
100004cc: f8 1b 2a 7d cmpb r10,r9,r3
100004d0: 00 00 aa 2f cmpdi cr7,r10,0
100004d4: 38 00 9e 41 beq cr7,1000050c <main+0xac>
}
100004d8: b4 07 63 7c extsw r3,r3
100004dc: 40 00 21 38 addi r1,r1,64
100004e0: 20 00 80 4e blr
return strcmp(str1, str2);
100004e4: 00 00 00 39 li r8,0
100004e8: f8 53 23 7d cmpb r3,r9,r10
100004ec: f8 43 28 7d cmpb r8,r9,r8
100004f0: 38 1b 03 7d orc r3,r8,r3
100004f4: 74 00 63 7c cntlzd r3,r3
100004f8: 08 00 63 38 addi r3,r3,8
100004fc: 30 1e 29 79 rldcl r9,r9,r3,56
10000500: 30 1e 4a 79 rldcl r10,r10,r3,56
10000504: 50 48 6a 7c subf r3,r10,r9
10000508: d0 ff ff 4b b 100004d8 <main+0x78>
The loads in question are the two ldbrx instructions at addresses
100004bc and 100004c0. Each one reads eight bytes, so the inlined strcmp
loads partially uninitialized values. The compiler uses the cmpb
instruction to make sure it only looks at the valid bytes, but as we
said, Valgrind doesn't know that.
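To make the role of cmpb concrete, here is a rough C model of the sequence from 100004bc to 10000508 as I read the disassembly. It is only a sketch: the function names (cmpb_model, ldbrx_model, cntlzd_model, rotl_low_byte, strcmp_first8) are made up for illustration, the instructions are emulated by hand rather than taken from any Valgrind or gcc source, and it covers only the first eight bytes of each string.

```c
#include <stdint.h>

/* Model of cmpb: each result byte is 0xFF where the bytes of a and b
   match, and 0x00 where they differ. */
static uint64_t cmpb_model(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        uint64_t mask = (uint64_t)0xFF << (8 * i);
        if ((a & mask) == (b & mask))
            r |= mask;
    }
    return r;
}

/* Model of ldbrx on a little-endian machine: the first byte in memory
   ends up as the most significant byte of the register. */
static uint64_t ldbrx_model(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Model of cntlzd: number of leading zero bits (64 for x == 0). */
static unsigned cntlzd_model(uint64_t x)
{
    unsigned n = 0;
    while (n < 64 && !(x & ((uint64_t)1 << 63))) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Model of "rldcl rX,rX,sh,56": rotate left by sh mod 64, keep low byte. */
static uint64_t rotl_low_byte(uint64_t x, unsigned sh)
{
    sh &= 63;
    if (sh != 0)
        x = (x << sh) | (x >> (64 - sh));
    return x & 0xFF;
}

/* Hypothetical helper: returns 1 and sets *result when the first eight
   bytes decide the comparison, 0 when all eight match with no NUL. */
static int strcmp_first8(const unsigned char *s1, const unsigned char *s2,
                         int *result)
{
    uint64_t a = ldbrx_model(s1);
    uint64_t b = ldbrx_model(s2);

    if (a == b) {                       /* subf. / bne not taken */
        if (cmpb_model(a, 0) == 0)      /* cmpb against 0: no NUL found */
            return 0;
        *result = 0;
        return 1;
    }

    /* cmpb, cmpb, orc: 0xFF in every byte that is NUL in s1 or differs. */
    uint64_t stop = cmpb_model(a, 0) | ~cmpb_model(a, b);
    unsigned sh = cntlzd_model(stop) + 8;   /* cntlzd, addi */

    /* rldcl, rldcl, subf: extract the first "stop" byte from each side. */
    *result = (int)rotl_low_byte(a, sh) - (int)rotl_low_byte(b, sh);
    return 1;
}
```

In this model the bytes after the first NUL or the first mismatch never reach the result, which is the property Valgrind would need to understand in order to stay quiet about the uninitialized tail of each buffer.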
I ran valgrind as:
valgrind --tool=none --trace-flags=10001000 --trace-notbelow=1408 ./bug80497-gcc7 > bug80497-debug 2>&1
It took a little experimenting, but it looks like SB 1408 corresponds to
the beginning of main and the above assembly code runs through SB 1409.
I edited the valgrind output down to just SB 1408 and SB 1409 and have
attached it as a file. I have the complete output in case I threw away
too much. Thanks for your help on this.
Carl Love