|
From: Carl L. <ce...@us...> - 2020-02-13 21:54:12
|
Valgrind developers: The PPC architecture has a number of "new" regression errors. There appear to be two root causes for these issues. Not sure if the first of these issues has caused issues on other 64-bit architectures as well. Bug 416760 - ppc64le Assertion 'VG_IS_16_ALIGNED(sizeof(struct rt_sigframe))' failed https://bugs.kde.org/show_bug.cgi?id=416760 Bug 417427 - commit to fix vki_siginfo_t definition created numerous regression errors on PPC64 https://bugs.kde.org/show_bug.cgi?id=417427 Basically the issue comes from commit: commit 3bac39a10abf292d332bb20ab58c6dd5c28f9108 Author: Eugene Syromyatnikov <ev...@gm...> Date: Fri Mar 8 04:07:00 2019 +0100 include/vki: fix vki_siginfo_t definition on amd64, arm64, and ppc64 As it turned out, the size of vki_siginfo_t is incorrect on these 64-bit architectures: (gdb) p sizeof(vki_siginfo_t) $1 = 136 (gdb) ptype struct vki_siginfo type = struct vki_siginfo { int si_signo; int si_errno; int si_code; union { int _pad[29]; struct {...} _kill; struct {...} _timer; struct {...} _rt; struct {...} _sigchld; struct {...} _sigfault; struct {...} _sigpoll; } _sifields; } etc. The issue is the struct rt_sigframe is not properly aligned. The following patch fixed the issue on ppc64 reducing the number of regression errors from == 649 tests, 38 stderr failures, 13 stdout failures, 1 stderrB failure, 5 stdoutB failures, 2 post failures to == 649 tests, 6 stderr failures, 3 stdout failures, 0 stderrB failures, 2 stdoutB failures, 2 post failures == ----------------------------------------------- --- coregrind/m_sigframe/sigframe-ppc64-linux.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/coregrind/m_sigframe/sigframe-ppc64-linux.c b/coregrind/m_sigframe/sigframe-ppc64-linux.c index 0406f3c..b54c4e0 100644 --- a/coregrind/m_sigframe/sigframe-ppc64-linux.c +++ b/coregrind/m_sigframe/sigframe-ppc64-linux.c @@ -112,7 +112,7 @@ struct rt_sigframe { vki_siginfo_t info; struct vg_sig_private priv; UChar abigap[288]; // unused -}; +} __attribute__ ((aligned (16))); #define SET_SIGNAL_LR(zztst, zzval) \ do { tst->arch.vex.guest_LR = (zzval); \ -- 2.7.4 --------------------------------------------------- I would like to get some testing done on other architectures to see if it fixes issues on other systems and if it causes any additional issues before committing this patch to Valgrind. Thanks for your help on this. Carl Love |
|
From: Carl L. <ce...@us...> - 2020-02-13 22:22:25
|
Valgrind developers:
Per my discussion with Mark on slack, I should clarify my comment on
this patch.
> On Thu, 2020-02-13 at 13:53 -0800, Carl Love wrote:
> Valgrind developers:
>
> The PPC architecture has a number of "new" regression errors. There
> appear to be two root causes for these issues. Not sure if the first
> of these issues has caused issues on other 64-bit architectures as
> well.
The fix is actually in a PPC64 specific file. The error occurs because
PPC64 checks that the structure is aligned properly. Not sure if other
architectures check the alignment or not. If they do, then they could
see the same issue. The structure is defined on a per architecture
basis which means other architectures would need to implement a similar
fix in their architecture specific file.
The patch talked about 64-bit architectures specifically amd64 arm64.
Hence I was thinking architecture independent as I was writing the
email and forgot that the actual fix was in a ppc64 specific file. I
guess I forgot that detail while working on the second patch.
Sorry for the confusion.
Carl Love
|
|
From: Mark W. <ma...@kl...> - 2020-02-14 13:25:21
|
Hi,
On Thu, 2020-02-13 at 14:22 -0800, Carl Love wrote:
> The fix is actually in a PPC64 specific file. The error occurs
> because
> PPC64 checks that the structure is aligned properly. Not sure if other
> architectures check the alignment or not. If they do, then they could
> see the same issue. The structure is defined on a per architecture
> basis which means other architectures would need to implement a similar
> fix in their architecture specific file.
>
> The patch talked about 64-bit architectures specifically amd64 arm64.
> Hence I was thinking architecture independent as I was writing the
> email and forgot that the actual fix was in a ppc64 specific file. I
> guess I forgot that detail while working on the second patch.
I don't believe any other architecture has a check that the structure
is aligned correctly. So it isn't causing immediate problems on other
architectures. I cannot tell whether they should or shouldn't have such
a check though.
For ppc64 and ppc64le I tested both patches and they seem to work
nicely.
Fedora release 30 (Thirty)
Linux 5.2.9-200.fc30.ppc64le #1 SMP Fri Aug 16 21:03:00 UTC 2019
GNU C Library (GNU libc) stable release version 2.29.
g++ 9.1.1 and binutils 2.31.1
Fedora release 28 (Twenty Eight)
Linux 5.0.16-100.fc28.ppc64 #1 SMP Tue May 14 17:55:15 UTC 2019
GNU C Library (GNU libc) stable release version 2.27.
g++ 8.3.1 and binutils 2.29.1
I have also backported the first patch to fedora (since that already
had the original patch by Eugene backported) and it seems to resolve
all issues (the second patch isn't needed because grail hasn't been
backported).
On ppc64[be] I do see an issue with the vbit-test:
*** Incorrect result for operator Iop_Or1
opnd 0: vbits = 00000001 value = 00
opnd 1: vbits = 00000000 value = 01
result: vbits = 00000001 value = 00
expect: vbits = 00000000
*** Incorrect result for operator Iop_Or1
opnd 0: vbits = 00000000 value = 01
opnd 1: vbits = 00000001 value = 01
result: vbits = 00000001 value = 00
expect: vbits = 00000000
*** Incorrect result for operator Iop_Or1
opnd 0: vbits = 00000001 value = 01
opnd 1: vbits = 00000000 value = 00
result: vbits = 00000001 value = 00
expect: vbits = 00000000
*** Incorrect result for operator Iop_Or1
opnd 0: vbits = 00000000 value = 00
opnd 1: vbits = 00000001 value = 00
result: vbits = 00000001 value = 00
expect: vbits = 00000000
|
|
From: Aleksandar R. <ale...@rt...> - 2020-02-14 14:23:49
|
Hi, The vbit-test has that issue on all BE platforms. There is a fix attached on KDE #417238, it works for mips64 BE. Aleksandar On 14.2.20. 14:25, Mark Wielaard wrote: > > On ppc64[be] I do see an issue with the vbit-test: > > *** Incorrect result for operator Iop_Or1 > opnd 0: vbits = 00000001 value = 00 > opnd 1: vbits = 00000000 value = 01 > result: vbits = 00000001 value = 00 > expect: vbits = 00000000 > *** Incorrect result for operator Iop_Or1 > opnd 0: vbits = 00000000 value = 01 > opnd 1: vbits = 00000001 value = 01 > result: vbits = 00000001 value = 00 > expect: vbits = 00000000 > *** Incorrect result for operator Iop_Or1 > opnd 0: vbits = 00000001 value = 01 > opnd 1: vbits = 00000000 value = 00 > result: vbits = 00000001 value = 00 > expect: vbits = 00000000 > *** Incorrect result for operator Iop_Or1 > opnd 0: vbits = 00000000 value = 00 > opnd 1: vbits = 00000001 value = 00 > result: vbits = 00000001 value = 00 > expect: vbits = 00000000 > > > > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers |
|
From: John R. <jr...@bi...> - 2020-02-14 14:54:21
|
On 2/14/20 13:25 UTC, Mark Wielaard wrote:
> On ppc64[be] I do see an issue with the vbit-test:
The test itself is incorrect. First of all, the expected value is known but not shown.
Second, for a bit-wise OR operation, if either input has a valid '1' bit in some bit position,
then the result also has a valid '1' bit in that bit position. An input bit that is a valid '1',
makes the other input bit in that bit position into a "don't care" bit.
[A similar rule applies to AND operation with a valid '0' bit.]
Corrected lines are shown below the incorrect ones:
>
> *** Incorrect result for operator Iop_Or1
> opnd 0: vbits = 00000001 value = 00
> opnd 1: vbits = 00000000 value = 01
> result: vbits = 00000001 value = 00
> expect: vbits = 00000000
expect: vbits = 00000000 value = 01
> *** Incorrect result for operator Iop_Or1
> opnd 0: vbits = 00000000 value = 01
> opnd 1: vbits = 00000001 value = 01
> result: vbits = 00000001 value = 00
> expect: vbits = 00000000
expect: vbits = 00000001 value = 01
> *** Incorrect result for operator Iop_Or1
> opnd 0: vbits = 00000001 value = 01
> opnd 1: vbits = 00000000 value = 00
> result: vbits = 00000001 value = 00
> expect: vbits = 00000000 expect: vbits = 00000001 value = 01
> *** Incorrect result for operator Iop_Or1
> opnd 0: vbits = 00000000 value = 00
> opnd 1: vbits = 00000001 value = 00
> result: vbits = 00000001 value = 00
> expect: vbits = 00000000
expect: vbits = 00000000 value = 00
|
|
From: Carl L. <ce...@us...> - 2020-02-14 15:59:08
|
Mark:
On Fri, 2020-02-14 at 14:25 +0100, Mark Wielaard wrote:
> For ppc64 and ppc64le I tested both patches and they seem to work
> nicely.
>
> Fedora release 30 (Thirty)
> Linux 5.2.9-200.fc30.ppc64le #1 SMP Fri Aug 16 21:03:00 UTC 2019
> GNU C Library (GNU libc) stable release version 2.29.
> g++ 9.1.1 and binutils 2.31.1
>
> Fedora release 28 (Twenty Eight)
> Linux 5.0.16-100.fc28.ppc64 #1 SMP Tue May 14 17:55:15 UTC 2019
> GNU C Library (GNU libc) stable release version 2.27.
> g++ 8.3.1 and binutils 2.29.1
>
> I have also backported the first patch to fedora (since that already
> had the original patch by Eugene backported) and it seems to resolve
> all issues (the second patch isn't needed because grail hasn't been
> backported).
Thanks for the testing.
>
> On ppc64[be] I do see an issue with the vbit-test:
I looked at the fix mentioned by Aleksandar. It looks like it is
architecture independent. I think we have a BE PPC64 machine around.
I will see if I can reproduce the issue and then try the fix in KDE
#417238.
Carl Love
|
|
From: Mark W. <ma...@kl...> - 2020-02-14 16:04:31
|
On Fri, 2020-02-14 at 07:58 -0800, Carl Love wrote: > > On ppc64[be] I do see an issue with the vbit-test: > > I looked at the fix mentioned by Aleksandar. It looks like it is > architecture independent. I think we have a BE PPC64 machine > around. > I will see if I can reproduce the issue and then try the fix in KDE > #417238. I just tried that patch on the ppc64[be] machine and it does indeed resolve the vbit-test issues. Cheers, Mark |
|
From: Carl L. <ce...@us...> - 2020-02-14 16:59:44
|
Mark:
On Fri, 2020-02-14 at 17:04 +0100, Mark Wielaard wrote:
> On Fri, 2020-02-14 at 07:58 -0800, Carl Love wrote:
> > > On ppc64[be] I do see an issue with the vbit-test:
> >
> > I looked at the fix mentioned by Aleksandar. It looks like it is
> > architecture independent. I think we have a BE PPC64 machine
> > around.
> > I will see if I can reproduce the issue and then try the fix in KDE
> > #417238.
>
> I just tried that patch on the ppc64[be] machine and it does indeed
> resolve the vbit-test issues.
Ditto, I tried on a Power 8 machine running Red Hat 7.6 (Maipo) in BE
mode. The fix by Aleksandar fixes the Vbit test errors. I will add a
note to that bugzilla to that effect.
Carl Love
|