|
From: Christian B. <bor...@de...> - 2015-04-21 09:09:38
|
Am 21.04.2015 um 04:20 schrieb Christian Borntraeger:
> none/tests/bigcode (stdout)
> none/tests/bigcode (stderr)
[...]
> --tools=none,memcheck,callgrind,helgrind,cachegrind,drd,massif --reps=3 --vg=../valgrind-new --vg=../valgrind-old
> -- Running tests in perf ----------------------------------------------
> -- bigcode1 --
> bigcode1 valgrind-new:0.22s no:
> *** Command returned non-zero (2816)
> *** See perf.{cmd,stdout,stderr} to determine what went wrong.
I looked into these bugs and the problem is that the gcc on SLES11 compiles for z900 which
only has instructions with small immediates. So gcc thinks that f should load some values
from the literal pool with a relative instruction:
00000000800007b4 <f>:
800007b4: e3 d0 f0 68 00 24 stg %r13,104(%r15)
800007ba: c0 d0 00 00 02 3f larl %r13,80000c38 <------ relative load (2*0x23f bytes)
800007c0: b9 04 00 02 lgr %r0,%r2
800007c4: b9 04 00 23 lgr %r2,%r3
800007c8: 18 10 lr %r1,%r0
800007ca: 54 10 d0 04 n %r1,4(%r13)
800007ce: 12 11 ltr %r1,%r1
800007d0: a7 a4 00 08 jhe 800007e0 <f+0x2c>
800007d4: a7 1a ff ff ahi %r1,-1
[...]
This of course fails miserably as soon as the function is copied (wrong values
even without values, crashes with valgrind) and the literal is not.
Several ways of fixing this
(a) using -march=z9-109 or later avoids literal pool (this system introduced extended immidiate values). This will happen anyway on recent distros (z9 was introduced 10 years ago, so most distros
have a default compiler option for z9 or later)
(b) further increase func size.
I guess the tilegx fix r15095 recently fixes the same issue - the code can not be that big. ZhiGang,
can you double check if your fix was also actually a literal pool value and not code?
In our case we are beyond 1024 bytes, see above: 7ba ---> c38 is 1150 bytes
Now this patch fixes the issue:
Index: perf/bigcode.c
===================================================================
--- perf/bigcode.c (revision 15120)
+++ perf/bigcode.c (working copy)
@@ -20,7 +20,8 @@
#endif
#include "tests/sys_mman.h"
-#define FN_SIZE 1024 // Must be big enough to hold the compiled f()
+#define FN_SIZE 1280 // Must be big enough to hold the compiled f()
+ // and any literal pool that might be used
#define N_LOOPS 20000 // Should be divisible by four
#define RATIO 4 // Ratio of code sizes between the two modes
So I am tempted to check in aboves patch.
Now: This actually shows, that the mmap PROT_WRITE | PROT_EXEC is wrong. Strictly
speaking we __need__ PROT_READ as well. It just does not matter as PROT_EXEC implies
PROT_READ on a page table level.
Opinions?
Christian
|
|
From: Florian K. <fl...@ei...> - 2015-04-21 12:20:07
|
On 21.04.2015 11:09, Christian Borntraeger wrote: > Several ways of fixing this > (a) using -march=z9-109 or later avoids literal pool (this system introduced extended immidiate values). This will happen anyway on recent distros (z9 was introduced 10 years ago, so most distros > have a default compiler option for z9 or later) > > (b) further increase func size. Increasing FN_SIZE sounds better than fiddling with compiler options. > Now: This actually shows, that the mmap PROT_WRITE | PROT_EXEC is wrong. Strictly > speaking we __need__ PROT_READ as well. It just does not matter as PROT_EXEC implies > PROT_READ on a page table level. You cannot execute an instruction without reading it first from somewhere. So, to me, specifying PROT_READ in addition to PROT_EXEC is redundant. We should leave the prot bits as is because they gave rise to r15075. In other words, this is a testcase for that change. Florian |