From: Dmitry V. L. <ld...@al...> - 2016-12-14 10:58:19
|
Starting with version 4.13, strace follows the schedule of linux kernel
and new versions of strace are released along with new version of linux
kernel. So strace 4.15 is tagged and uploaded.

This is the first strace release that supports syscall fault injection,
the implementation is based on the prototype developed by Nahim El Atmani
as a part of his GSoC 2016 strace project.

strace 4.15 would not be as good as it is without significant assistance
by Eugene Syromyatnikov who authored around half of all commits since 4.14.

I'd like to use this opportunity to thank all who contributed to this release.

$ git tag -v v4.15 2> /dev/null | sed '0,/:$/d'
Andreas Schwab
Dmitry V. Levin
Elvira Khabirova
Eugene Syromyatnikov
Gleb Fotengauer-Malinovskiy
JingPiao Chen
Mikulas Patocka
Nahim El Atmani
Sean Stangl
Thomas De Schampheleire

--
ldv |
From: Steve M. <st...@ei...> - 2016-12-19 18:52:20
|
On Wed, Dec 14, 2016 at 01:58:11PM +0300, Dmitry V. Levin wrote:
>Starting with version 4.13, strace follows the schedule of linux kernel
>and new versions of strace are released along with new version of linux
>kernel. So strace 4.15 is tagged and uploaded.
>
>This is the first strace release that supports syscall fault injection,
>the implementation is based on the prototype developed by Nahim El Atmani
>as a part of his GSoC 2016 strace project.
>
>strace 4.15 would not be as good as it is without significant assistance
>by Eugene Syromyatnikov who authored around half of all commits since 4.14.
>
>I'd like to use this opportunity to thank all who contributed to this release.
>
>$ git tag -v v4.15 2> /dev/null | sed '0,/:$/d'

I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3]
on Debian machines.

The 32-bit builds are both showing issues with fault injection. I
can't follow what the code is meant to be doing here, so no
ideas on what's wrong. :-(

The mips64el failures are problems in pwritev and readahead, then a
segfault in newfstatat. The backtrace from that isn't very helpful;
maybe it's a toolchain problem here:

(gdb) r
Starting program: /home/93sam/build/strace/strace.git/tests/newfstatat
warning: GDB can't find the start of the function at 0xfff7fcc2c8.

GDB is unable to find the start of the function at 0xfff7fcc2c8
and thus can't determine the size of that function's stack frame.
This means that GDB may be unable to access that stack frame, or
the frames below it.
This problem is most likely caused by an invalid program counter or
stack pointer.
However, if you think GDB should simply search farther back
from 0xfff7fcc2c8 for code which looks like the beginning of a
function, you can increase the range of the search using the `set
heuristic-fence-post' command.
warning: GDB can't find the start of the function at 0xfff7fcd04c.
warning: GDB can't find the start of the function at 0xfff7f14038.

Program received signal SIGSEGV, Segmentation fault.
0x000000fff7f14038 in ?? () from /lib/mips64el-linux-gnuabi64/libc.so.6
(gdb) bt
#0 0x000000fff7f14038 in ?? () from /lib/mips64el-linux-gnuabi64/libc.so.6

[1] https://www.einval.com/debian/strace/build-logs/mips/2016-12-19-040520-log-minkus-TESTFAIL.txt
[2] https://www.einval.com/debian/strace/build-logs/mipsel/2016-12-19-152924-log-eller-TESTFAIL.txt
[3] https://www.einval.com/debian/strace/build-logs/mips64el/2016-12-19-155929-log-eller-TESTFAIL.txt

--
Steve McIntyre, Cambridge, UK. st...@ei...
There's no sensation to compare with this
Suspended animation, A state of bliss |
From: Dmitry V. L. <ld...@al...> - 2017-01-13 02:33:15
|
On Wed, Jan 11, 2017 at 02:25:56PM +0000, James Cowgill wrote: > The newfstatat testcase on mips64 currently fails because: > - The BOGUS_STRUCT_STAT test segfaults inside glibc. > - The result of the fstatat call gives incorrect dates because the > kernel struct stat uses unsigned int timestamps. > > Fix by using avoiding the glibc wrapper and using the relevant syscall > directly. This obviously avoids the first problem, and avoids the second > problem because print_stat always sign extends dates (unlike glibc which > will zero extend them). Unfortunately, this makes fstatat64.test fail on x86 because of struct stat mismatch. If tests/fstatat.c is changed to define USE_ASM_STAT, then tests/fstatat64.c would need more stat64 related definitions (like in tests/fstat64.c), and I cannot tell off-hand whether fstatat64 takes struct stat64 on all platforms like it's declared in kernel's include/linux/syscalls.h file. btw, tests/fstatat.c is the last user of tests/xstatx.c that does not define USE_ASM_STAT yet. > --- > tests/fstatat.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/tests/fstatat.c b/tests/fstatat.c > index ec55ca04..3981b4ec 100644 > --- a/tests/fstatat.c > +++ b/tests/fstatat.c > @@ -28,7 +28,7 @@ > #ifdef HAVE_FSTATAT > > # define TEST_SYSCALL_INVOKE(sample, pst) \ > - fstatat(AT_FDCWD, sample, pst, AT_SYMLINK_NOFOLLOW) > + syscall(TEST_SYSCALL_NR, AT_FDCWD, sample, pst, AT_SYMLINK_NOFOLLOW) > # define PRINT_SYSCALL_HEADER(sample) \ > do { \ > int saved_errno = errno; \ > @@ -38,6 +38,8 @@ > printf(", AT_SYMLINK_NOFOLLOW) = %s\n", sprintrc(rc)); \ > } while (0) > > +# define USE_ASM_STAT > + > # include "xstatx.c" > > #else -- ldv |
From: Dmitry V. L. <ld...@al...> - 2017-01-13 20:38:14
|
On Fri, Jan 13, 2017 at 05:33:07AM +0300, Dmitry V. Levin wrote: > On Wed, Jan 11, 2017 at 02:25:56PM +0000, James Cowgill wrote: > > The newfstatat testcase on mips64 currently fails because: > > - The BOGUS_STRUCT_STAT test segfaults inside glibc. > > - The result of the fstatat call gives incorrect dates because the > > kernel struct stat uses unsigned int timestamps. > > > > Fix by using avoiding the glibc wrapper and using the relevant syscall > > directly. This obviously avoids the first problem, and avoids the second > > problem because print_stat always sign extends dates (unlike glibc which > > will zero extend them). > > Unfortunately, this makes fstatat64.test fail on x86 > because of struct stat mismatch. > > If tests/fstatat.c is changed to define USE_ASM_STAT, then > tests/fstatat64.c would need more stat64 related definitions > (like in tests/fstat64.c), and I cannot tell off-hand whether > fstatat64 takes struct stat64 on all platforms like it's declared > in kernel's include/linux/syscalls.h file. > > btw, tests/fstatat.c is the last user of tests/xstatx.c that > does not define USE_ASM_STAT yet. OK, I've changed both newfstatat and fstatat64 tests to invoke syscalls directly. -- ldv |
From: James C. <jco...@de...> - 2017-02-13 17:00:38
|
Hi, On 13/02/17 14:40, Dmitry V. Levin wrote: > On Mon, Jan 09, 2017 at 08:54:52PM +0300, Dmitry V. Levin wrote: >> On Mon, Jan 09, 2017 at 05:31:37PM +0000, James Cowgill wrote: >>> On 06/01/17 00:51, Dmitry V. Levin wrote: >>>> On Tue, Jan 03, 2017 at 06:02:44PM +0000, James Cowgill wrote: >>>>> On 20/12/16 00:36, Steve McIntyre wrote: >>>>>> On Tue, Dec 20, 2016 at 03:16:17AM +0300, Dmitry V. Levin wrote: >>>>>>> On Tue, Dec 20, 2016 at 12:50:29AM +0100, Nahim El Atmani wrote: >>>>>>>> On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: >>>>>>>>> I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] >>>>>>>>> on Debian machines. >>>>>>>>> >>>>>>>>> The 32-bit builds are both showing issues with fault injection. I >>>>>>>>> can't follow what the code is meant to be doing here, so no >>>>>>>>> ideas on what's wrong. :-( >>>>>>>> >>>>>>>> As Dmitry said earlier: >>>>>>>> >>>>>>>> On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: >>>>>>>>> Well, the mips kernel does not implement substitution of syscall numbers >>>>>>>> >>>>>>>> So it looks like the test has failed to SKIP on this target. >>>>>>> >>>>>>> As I'm not 100% sure there is no kernel support for mips, I decided >>>>>>> not to skip the test on mips until somebody investigates. >>>>>> >>>>>> Ah, OK. James Cowgill is my friendly local mips expert - let's see >>>>>> what he thinks... :-) >>>>> >>>>> I've had a look and I think there is a kernel bug here - specifically >>>>> affecting 32-bit programs run on 64-bit kernels (like all the Debian >>>>> buildds and the porterbox are). An extra PTRACE_SYSCALL stop is >>>>> happening which confuses everything. I'll look some more tomorrow. >>>> >>>> Indeed, it seems to be a kernel bug in scall64-o32.S and scall64-n32.S. 
>>>> >>>> According to current arch/mips/kernel/scall64-o32.S:trace_a_syscall >>>> (the same applies to arch/mips/kernel/scall64-n32.S), >>>> >>>> if the syscall number after the first syscall_trace_enter call is out >>>> of range, there is a jump to not_o32_scall which in turn jumps to >>>> arch/mips/kernel/scall64-64.S:handle_sys64 (or to handle_sysn32 which >>>> then jumps on to handle_sys64). >>>> >>>> Handle_sys64, unsurprisingly, does all over again, starting with >>>> a syscall_trace_enter call, which appears to be the second one >>>> and causes that extra syscall stop you observe with 32-bit tracees >>>> running on 64-bit kernels. >>> >>> Just going through all the MIPS testsuite bugs again: >>> >>> fault_injection* >>> The kernel bug above (no patch yet). >> >> I couldn't find an easy fix for this kernel bug. Any ideas? > > OK, there is not going to be any fix for this kernel bug in v4.10, > so I'd rather disable scno tampering tests when MIPS ABI is o32 > but the kernel is n64. > > Is there any simple way for MIPS o32 userspace to find out whether > the kernel is not a native MIPS o32? Something less hackish > than manually invoking a MIPS n64 syscall? uname -m is a bit less hackish: 32-bit kernel: $(uname -m) = mips 64-bit kernel: $(uname -m) = mips64 Note there is no difference between big and little endian here. Thanks, James |
From: Josh S. <ji...@re...> - 2017-02-13 19:12:14
|
On 02/13/2017 08:47 AM, James Cowgill wrote: >> Is there any simple way for MIPS o32 userspace to find out whether >> the kernel is not a native MIPS o32? Something less hackish >> than manually invoking a MIPS n64 syscall? > > uname -m is a bit less hackish: > > 32-bit kernel: $(uname -m) = mips > 64-bit kernel: $(uname -m) = mips64 > > Note there is no difference between big and little endian here. This can still lie, e.g. under setarch. $ setarch i386 uname -rm 4.9.6-200.fc25.x86_64 i686 It doesn't even matter that my /usr/bin/uname is a 64-bit binary. And while Fedora puts the arch in the release string, you can't rely on that everywhere. But I don't know any truly foolproof way. |
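As a rough illustration of the point above, the following C sketch reads the same two pieces of information a test could look at: the machine string returned by uname(2) and the process personality. The helper names `reported_machine` and `has_linux32_personality` are hypothetical and not part of strace. The sketch only shows why the `uname -m` answer depends on the personality; it is not a reliable way to detect a 64-bit kernel from 32-bit userspace, and the thread does not identify one.

```c
#include <stddef.h>
#include <string.h>
#include <sys/personality.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: copy the machine string reported by uname(2)
 * into buf.  Returns 0 on success, -1 on failure.  The string can be
 * altered by the personality (e.g. under setarch), so it is a hint,
 * not proof of the kernel's native ABI.
 */
int reported_machine(char *buf, size_t len)
{
	struct utsname u;

	if (len == 0 || uname(&u) != 0)
		return -1;
	strncpy(buf, u.machine, len - 1);
	buf[len - 1] = '\0';
	return 0;
}

/*
 * Hypothetical helper: personality(0xffffffff) queries the current
 * persona without changing it.  Returns 1 if the execution domain is
 * PER_LINUX32, 0 if it is something else, -1 if the query failed.
 */
int has_linux32_personality(void)
{
	int persona = personality(0xffffffff);

	if (persona == -1)
		return -1;
	return (persona & PER_MASK) == PER_LINUX32;
}
```

If `has_linux32_personality()` returns 1, a machine string such as `i686` or `mips` says nothing about whether the kernel underneath is 64-bit, which is the situation shown in the setarch example above.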
From: Dmitry V. L. <ld...@al...> - 2017-02-14 09:28:07
|
On Mon, Feb 13, 2017 at 04:47:05PM +0000, James Cowgill wrote: > Hi, > > > On 13/02/17 14:40, Dmitry V. Levin wrote: > > On Mon, Jan 09, 2017 at 08:54:52PM +0300, Dmitry V. Levin wrote: > >> On Mon, Jan 09, 2017 at 05:31:37PM +0000, James Cowgill wrote: > >>> On 06/01/17 00:51, Dmitry V. Levin wrote: > >>>> On Tue, Jan 03, 2017 at 06:02:44PM +0000, James Cowgill wrote: > >>>>> On 20/12/16 00:36, Steve McIntyre wrote: > >>>>>> On Tue, Dec 20, 2016 at 03:16:17AM +0300, Dmitry V. Levin wrote: > >>>>>>> On Tue, Dec 20, 2016 at 12:50:29AM +0100, Nahim El Atmani wrote: > >>>>>>>> On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: > >>>>>>>>> I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] > >>>>>>>>> on Debian machines. > >>>>>>>>> > >>>>>>>>> The 32-bit builds are both showing issues with fault injection. I > >>>>>>>>> can't follow what the code is meant to be doing here, so no > >>>>>>>>> ideas on what's wrong. :-( > >>>>>>>> > >>>>>>>> As Dmitry said earlier: > >>>>>>>> > >>>>>>>> On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: > >>>>>>>>> Well, the mips kernel does not implement substitution of syscall numbers > >>>>>>>> > >>>>>>>> So it looks like the test has failed to SKIP on this target. > >>>>>>> > >>>>>>> As I'm not 100% sure there is no kernel support for mips, I decided > >>>>>>> not to skip the test on mips until somebody investigates. > >>>>>> > >>>>>> Ah, OK. James Cowgill is my friendly local mips expert - let's see > > >>>>>> what he thinks... :-) > >>>>> > >>>>> I've had a look and I think there is a kernel bug here - specifically > >>>>> affecting 32-bit programs run on 64-bit kernels (like all the Debian > >>>>> buildds and the porterbox are). An extra PTRACE_SYSCALL stop is > >>>>> happening which confuses everything. I'll look some more tomorrow. > >>>> > >>>> Indeed, it seems to be a kernel bug in scall64-o32.S and scall64-n32.S. 
> >>>> > >>>> According to current arch/mips/kernel/scall64-o32.S:trace_a_syscall > >>>> (the same applies to arch/mips/kernel/scall64-n32.S), > >>>> > >>>> if the syscall number after the first syscall_trace_enter call is out > >>>> of range, there is a jump to not_o32_scall which in turn jumps to > >>>> arch/mips/kernel/scall64-64.S:handle_sys64 (or to handle_sysn32 which > >>>> then jumps on to handle_sys64). > >>>> > >>>> Handle_sys64, unsurprisingly, does all over again, starting with > >>>> a syscall_trace_enter call, which appears to be the second one > >>>> and causes that extra syscall stop you observe with 32-bit tracees > >>>> running on 64-bit kernels. > >>> > >>> Just going through all the MIPS testsuite bugs again: > >>> > >>> fault_injection* > >>> The kernel bug above (no patch yet). > >> > >> I couldn't find an easy fix for this kernel bug. Any ideas? > > > > OK, there is not going to be any fix for this kernel bug in v4.10, > > so I'd rather disable scno tampering tests when MIPS ABI is o32 > > but the kernel is n64. > > > > Is there any simple way for MIPS o32 userspace to find out whether > > the kernel is not a native MIPS o32? Something less hackish > > than manually invoking a MIPS n64 syscall? > > uname -m is a bit less hackish: > > 32-bit kernel: $(uname -m) = mips > 64-bit kernel: $(uname -m) = mips64 No, it didn't work out, 64-bit kernel pretends it's mips: http://www.einval.com/debian/strace/build-logs/mipsel/2017-02-14-040242-log-eller-TESTFAIL.txt -- ldv |
From: James C. <jco...@de...> - 2017-02-14 10:57:40
|
Hi, On 14/02/17 09:27, Dmitry V. Levin wrote: > On Mon, Feb 13, 2017 at 04:47:05PM +0000, James Cowgill wrote: >> On 13/02/17 14:40, Dmitry V. Levin wrote: >>> On Mon, Jan 09, 2017 at 08:54:52PM +0300, Dmitry V. Levin wrote: >>>> On Mon, Jan 09, 2017 at 05:31:37PM +0000, James Cowgill wrote: >>>>> On 06/01/17 00:51, Dmitry V. Levin wrote: >>>>>> On Tue, Jan 03, 2017 at 06:02:44PM +0000, James Cowgill wrote: >>>>>>> On 20/12/16 00:36, Steve McIntyre wrote: >>>>>>>> On Tue, Dec 20, 2016 at 03:16:17AM +0300, Dmitry V. Levin wrote: >>>>>>>>> On Tue, Dec 20, 2016 at 12:50:29AM +0100, Nahim El Atmani wrote: >>>>>>>>>> On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: >>>>>>>>>>> I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] >>>>>>>>>>> on Debian machines. >>>>>>>>>>> >>>>>>>>>>> The 32-bit builds are both showing issues with fault injection. I >>>>>>>>>>> can't follow what the code is meant to be doing here, so no >>>>>>>>>>> ideas on what's wrong. :-( >>>>>>>>>> >>>>>>>>>> As Dmitry said earlier: >>>>>>>>>> >>>>>>>>>> On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: >>>>>>>>>>> Well, the mips kernel does not implement substitution of syscall numbers >>>>>>>>>> >>>>>>>>>> So it looks like the test has failed to SKIP on this target. >>>>>>>>> >>>>>>>>> As I'm not 100% sure there is no kernel support for mips, I decided >>>>>>>>> not to skip the test on mips until somebody investigates. >>>>>>>> >>>>>>>> Ah, OK. James Cowgill is my friendly local mips expert - let's see >> >>>>>>>> what he thinks... :-) >>>>>>> >>>>>>> I've had a look and I think there is a kernel bug here - specifically >>>>>>> affecting 32-bit programs run on 64-bit kernels (like all the Debian >>>>>>> buildds and the porterbox are). An extra PTRACE_SYSCALL stop is >>>>>>> happening which confuses everything. I'll look some more tomorrow. >>>>>> >>>>>> Indeed, it seems to be a kernel bug in scall64-o32.S and scall64-n32.S. 
>>>>>> >>>>>> According to current arch/mips/kernel/scall64-o32.S:trace_a_syscall >>>>>> (the same applies to arch/mips/kernel/scall64-n32.S), >>>>>> >>>>>> if the syscall number after the first syscall_trace_enter call is out >>>>>> of range, there is a jump to not_o32_scall which in turn jumps to >>>>>> arch/mips/kernel/scall64-64.S:handle_sys64 (or to handle_sysn32 which >>>>>> then jumps on to handle_sys64). >>>>>> >>>>>> Handle_sys64, unsurprisingly, does all over again, starting with >>>>>> a syscall_trace_enter call, which appears to be the second one >>>>>> and causes that extra syscall stop you observe with 32-bit tracees >>>>>> running on 64-bit kernels. >>>>> >>>>> Just going through all the MIPS testsuite bugs again: >>>>> >>>>> fault_injection* >>>>> The kernel bug above (no patch yet). >>>> >>>> I couldn't find an easy fix for this kernel bug. Any ideas? >>> >>> OK, there is not going to be any fix for this kernel bug in v4.10, >>> so I'd rather disable scno tampering tests when MIPS ABI is o32 >>> but the kernel is n64. >>> >>> Is there any simple way for MIPS o32 userspace to find out whether >>> the kernel is not a native MIPS o32? Something less hackish >>> than manually invoking a MIPS n64 syscall? >> >> uname -m is a bit less hackish: >> >> 32-bit kernel: $(uname -m) = mips >> 64-bit kernel: $(uname -m) = mips64 > > No, it didn't work out, 64-bit kernel pretends it's mips: > http://www.einval.com/debian/strace/build-logs/mipsel/2017-02-14-040242-log-eller-TESTFAIL.txt Hmm and from the kernel source it looks like you cannot unset PER_LINUX32 once it's been set: http://lxr.free-electrons.com/source/arch/mips/kernel/linux32.c#L121 So I don't think there is any nice way to tell. James |
From: Nahim El A. <nah...@na...> - 2016-12-20 00:08:49
|
Hi Steve, On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: > I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] > on Debian machines. > > The 32-bit builds are both showing issues with fault injection. I > can't follow what the code is meant to be doing here, so no > ideas on what's wrong. :-( As Dmitry said earlier: On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: > Well, the mips kernel does not implement substitution of syscall numbers So it looks like the test has failed to SKIP on this target. Best, -- Nahim El Atmani <nah...@na...> https://brokenpi.pe/ |
From: Dmitry V. L. <ld...@al...> - 2016-12-20 00:16:25
|
Hi, On Tue, Dec 20, 2016 at 12:50:29AM +0100, Nahim El Atmani wrote: > Hi Steve, > > On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: > > I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] > > on Debian machines. > > > > The 32-bit builds are both showing issues with fault injection. I > > can't follow what the code is meant to be doing here, so no > > ideas on what's wrong. :-( > > As Dmitry said earlier: > > On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: > > Well, the mips kernel does not implement substitution of syscall numbers > > So it looks like the test has failed to SKIP on this target. As I'm not 100% sure there is no kernel support for mips, I decided not to skip the test on mips until somebody investigates. -- ldv |
From: Steve M. <st...@ei...> - 2016-12-20 00:36:20
|
On Tue, Dec 20, 2016 at 03:16:17AM +0300, Dmitry V. Levin wrote: >Hi, > >On Tue, Dec 20, 2016 at 12:50:29AM +0100, Nahim El Atmani wrote: >> Hi Steve, >> >> On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: >> > I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] >> > on Debian machines. >> > >> > The 32-bit builds are both showing issues with fault injection. I >> > can't follow what the code is meant to be doing here, so no >> > ideas on what's wrong. :-( >> >> As Dmitry said earlier: >> >> On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: >> > Well, the mips kernel does not implement substitution of syscall numbers >> >> So it looks like the test has failed to SKIP on this target. > >As I'm not 100% sure there is no kernel support for mips, I decided >not to skip the test on mips until somebody investigates. Ah, OK. James Cowgill is my friendly local mips expert - let's see what he thinks... :-) -- Steve McIntyre, Cambridge, UK. st...@ei... You lock the door And throw away the key There's someone in my head but it's not me |
From: James C. <jco...@de...> - 2017-01-03 18:16:36
|
Hi, On 20/12/16 00:36, Steve McIntyre wrote: > On Tue, Dec 20, 2016 at 03:16:17AM +0300, Dmitry V. Levin wrote: >> On Tue, Dec 20, 2016 at 12:50:29AM +0100, Nahim El Atmani wrote: >>> On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: >>>> I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] >>>> on Debian machines. >>>> >>>> The 32-bit builds are both showing issues with fault injection. I >>>> can't follow what the code is meant to be doing here, so no >>>> ideas on what's wrong. :-( >>> >>> As Dmitry said earlier: >>> >>> On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: >>>> Well, the mips kernel does not implement substitution of syscall numbers >>> >>> So it looks like the test has failed to SKIP on this target. >> >> As I'm not 100% sure there is no kernel support for mips, I decided >> not to skip the test on mips until somebody investigates. > > Ah, OK. James Cowgill is my friendly local mips expert - let's see > what he thinks... :-) I've had a look and I think there is a kernel bug here - specifically affecting 32-bit programs run on 64-bit kernels (like all the Debian buildds and the porterbox are). An extra PTRACE_SYSCALL stop is happening which confuses everything. I'll look some more tomorrow. The test passes OK if it's run on a 32-bit kernel though so the kernel does somewhat support it. Thanks, James |
From: Steve M. <st...@ei...> - 2017-01-04 15:22:22
|
Happy New Year James!

On Tue, Jan 03, 2017 at 06:02:44PM +0000, James Cowgill wrote:
>On 20/12/16 00:36, Steve McIntyre wrote:
>>
>> Ah, OK. James Cowgill is my friendly local mips expert - let's see
>> what he thinks... :-)
>
>I've had a look and I think there is a kernel bug here - specifically
>affecting 32-bit programs run on 64-bit kernels (like all the Debian
>buildds and the porterbox are). An extra PTRACE_SYSCALL stop is
>happening which confuses everything. I'll look some more tomorrow.

Cool, thanks.

>The test passes OK if it's run on a 32-bit kernel though so the kernel
>does somewhat support it.

Ah, OK. For *now*, I've simply disabled these tests for the Debian mips
builds so that we can get strace into testing safely before the
Stretch release. Happy to tweak that later if you find a better
answer.

--
Steve McIntyre, Cambridge, UK. st...@ei...
Is there anybody out there? |
From: Dmitry V. L. <ld...@al...> - 2017-01-06 00:51:26
|
Hi, On Tue, Jan 03, 2017 at 06:02:44PM +0000, James Cowgill wrote: > On 20/12/16 00:36, Steve McIntyre wrote: > > On Tue, Dec 20, 2016 at 03:16:17AM +0300, Dmitry V. Levin wrote: > >> On Tue, Dec 20, 2016 at 12:50:29AM +0100, Nahim El Atmani wrote: > >>> On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: > >>>> I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] > >>>> on Debian machines. > >>>> > >>>> The 32-bit builds are both showing issues with fault injection. I > >>>> can't follow what the code is meant to be doing here, so no > >>>> ideas on what's wrong. :-( > >>> > >>> As Dmitry said earlier: > >>> > >>> On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: > >>>> Well, the mips kernel does not implement substitution of syscall numbers > >>> > >>> So it looks like the test has failed to SKIP on this target. > >> > >> As I'm not 100% sure there is no kernel support for mips, I decided > >> not to skip the test on mips until somebody investigates. > > > > Ah, OK. James Cowgill is my friendly local mips expert - let's see > > what he thinks... :-) > > I've had a look and I think there is a kernel bug here - specifically > affecting 32-bit programs run on 64-bit kernels (like all the Debian > buildds and the porterbox are). An extra PTRACE_SYSCALL stop is > happening which confuses everything. I'll look some more tomorrow. Indeed, it seems to be a kernel bug in scall64-o32.S and scall64-n32.S. According to current arch/mips/kernel/scall64-o32.S:trace_a_syscall (the same applies to arch/mips/kernel/scall64-n32.S), if the syscall number after the first syscall_trace_enter call is out of range, there is a jump to not_o32_scall which in turn jumps to arch/mips/kernel/scall64-64.S:handle_sys64 (or to handle_sysn32 which then jumps on to handle_sys64). 
Handle_sys64, unsurprisingly, does all over again, starting with a syscall_trace_enter call, which appears to be the second one and causes that extra syscall stop you observe with 32-bit tracees running on 64-bit kernels. -- ldv |
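The control flow described in the message above can be sketched as a small toy model in C. This is not kernel code: the function names only echo the assembly labels mentioned in the message, the 4000 base is the usual start of the MIPS o32 syscall number range, and the table size `O32_COUNT` is a made-up value for illustration. The model counts how many syscall-entry tracing stops occur, depending on whether the syscall number left behind by the tracer is still within the o32 range.

```c
#include <stdbool.h>

/* Toy model only; O32_COUNT is a hypothetical table size. */
enum { O32_BASE = 4000, O32_COUNT = 400 };

static int trace_stops;

/* Stands in for the kernel's syscall_trace_enter call: one tracer stop. */
static void syscall_trace_enter_model(void)
{
	trace_stops++;
}

/* Stands in for handle_sys64: it starts over, tracing stop included. */
static void handle_sys64_model(long scno)
{
	(void) scno;
	syscall_trace_enter_model();
}

static bool in_o32_range(long scno)
{
	return scno >= O32_BASE && scno < O32_BASE + O32_COUNT;
}

/*
 * Models the traced o32 entry path on a 64-bit kernel.
 * scno_after_tracer is the syscall number as left by the tracer
 * during the first stop.  Returns the number of entry stops seen.
 */
int count_entry_stops_o32(long scno_after_tracer)
{
	trace_stops = 0;
	syscall_trace_enter_model();		/* first, expected stop */
	if (!in_o32_range(scno_after_tracer))	/* the not_o32_scall path */
		handle_sys64_model(scno_after_tracer);
	return trace_stops;
}
```

In this model an untouched number such as 4001 yields one stop, while a number the tracer has replaced with an out-of-range value such as -1 yields two, matching the extra stop observed with 32-bit tracees on 64-bit kernels.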
From: James C. <jco...@de...> - 2017-01-09 17:32:02
|
Hi, On 06/01/17 00:51, Dmitry V. Levin wrote: > On Tue, Jan 03, 2017 at 06:02:44PM +0000, James Cowgill wrote: >> On 20/12/16 00:36, Steve McIntyre wrote: >>> On Tue, Dec 20, 2016 at 03:16:17AM +0300, Dmitry V. Levin wrote: >>>> On Tue, Dec 20, 2016 at 12:50:29AM +0100, Nahim El Atmani wrote: >>>>> On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: >>>>>> I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] >>>>>> on Debian machines. >>>>>> >>>>>> The 32-bit builds are both showing issues with fault injection. I >>>>>> can't follow what the code is meant to be doing here, so no >>>>>> ideas on what's wrong. :-( >>>>> >>>>> As Dmitry said earlier: >>>>> >>>>> On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: >>>>>> Well, the mips kernel does not implement substitution of syscall numbers >>>>> >>>>> So it looks like the test has failed to SKIP on this target. >>>> >>>> As I'm not 100% sure there is no kernel support for mips, I decided >>>> not to skip the test on mips until somebody investigates. >>> >>> Ah, OK. James Cowgill is my friendly local mips expert - let's see >>> what he thinks... :-) >> >> I've had a look and I think there is a kernel bug here - specifically >> affecting 32-bit programs run on 64-bit kernels (like all the Debian >> buildds and the porterbox are). An extra PTRACE_SYSCALL stop is >> happening which confuses everything. I'll look some more tomorrow. > > Indeed, it seems to be a kernel bug in scall64-o32.S and scall64-n32.S. > > According to current arch/mips/kernel/scall64-o32.S:trace_a_syscall > (the same applies to arch/mips/kernel/scall64-n32.S), > > if the syscall number after the first syscall_trace_enter call is out > of range, there is a jump to not_o32_scall which in turn jumps to > arch/mips/kernel/scall64-64.S:handle_sys64 (or to handle_sysn32 which > then jumps on to handle_sys64). 
>
> Handle_sys64, unsurprisingly, does all over again, starting with
> a syscall_trace_enter call, which appears to be the second one
> and causes that extra syscall stop you observe with 32-bit tracees
> running on 64-bit kernels.

Just going through all the MIPS testsuite bugs again:

fault_injection*
 The kernel bug above (no patch yet).

readahead
 glibc bug fixed in 2.25
 https://sourceware.org/bugzilla/show_bug.cgi?id=21026

pwritev
 I debugged this to an Octeon specific kernel bug in Octeon's optimized
 copy_from_memory implementation :S I've sent a patch to the linux-mips
 list to fix this.
 https://www.linux-mips.org/archives/linux-mips/2017-01/msg00094.html

newfstatat
 Firstly we need to define TEST_BOGUS_STRUCT_STAT somewhere otherwise the
 test just segfaults in glibc.

 After that it suffers from the negative dates problem which happens when
 glibc copies struct stat on mips n64. I thought I fixed this - did
 something change?

Thanks,
James |
From: Dmitry V. L. <ld...@al...> - 2017-01-09 17:55:00
|
Hi, On Mon, Jan 09, 2017 at 05:31:37PM +0000, James Cowgill wrote: > On 06/01/17 00:51, Dmitry V. Levin wrote: > > On Tue, Jan 03, 2017 at 06:02:44PM +0000, James Cowgill wrote: > >> On 20/12/16 00:36, Steve McIntyre wrote: > >>> On Tue, Dec 20, 2016 at 03:16:17AM +0300, Dmitry V. Levin wrote: > >>>> On Tue, Dec 20, 2016 at 12:50:29AM +0100, Nahim El Atmani wrote: > >>>>> On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: > >>>>>> I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] > >>>>>> on Debian machines. > >>>>>> > >>>>>> The 32-bit builds are both showing issues with fault injection. I > >>>>>> can't follow what the code is meant to be doing here, so no > >>>>>> ideas on what's wrong. :-( > >>>>> > >>>>> As Dmitry said earlier: > >>>>> > >>>>> On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: > >>>>>> Well, the mips kernel does not implement substitution of syscall numbers > >>>>> > >>>>> So it looks like the test has failed to SKIP on this target. > >>>> > >>>> As I'm not 100% sure there is no kernel support for mips, I decided > >>>> not to skip the test on mips until somebody investigates. > >>> > >>> Ah, OK. James Cowgill is my friendly local mips expert - let's see > >>> what he thinks... :-) > >> > >> I've had a look and I think there is a kernel bug here - specifically > >> affecting 32-bit programs run on 64-bit kernels (like all the Debian > >> buildds and the porterbox are). An extra PTRACE_SYSCALL stop is > >> happening which confuses everything. I'll look some more tomorrow. > > > > Indeed, it seems to be a kernel bug in scall64-o32.S and scall64-n32.S. 
> > > > According to current arch/mips/kernel/scall64-o32.S:trace_a_syscall > > (the same applies to arch/mips/kernel/scall64-n32.S), > > > > if the syscall number after the first syscall_trace_enter call is out > > of range, there is a jump to not_o32_scall which in turn jumps to > > arch/mips/kernel/scall64-64.S:handle_sys64 (or to handle_sysn32 which > > then jumps on to handle_sys64). > > > > Handle_sys64, unsurprisingly, does all over again, starting with > > a syscall_trace_enter call, which appears to be the second one > > and causes that extra syscall stop you observe with 32-bit tracees > > running on 64-bit kernels. > > Just going through all the MIPS testsuite bugs again: > > fault_injection* > The kernel bug above (no patch yet). I couldn't find an easy fix for this kernel bug. Any ideas? > readahead > glibc bug fixed in 2.25 > https://sourceware.org/bugzilla/show_bug.cgi?id=21026 I applied a workaround for this: v4.15-266-ge752ef6. > pwritev > I debugged this to an Octeon specific kernel bug in Octeon's optimized > copy_from_memory implementation :S I've sent a patch to the linux-mips > list to fix this. > https://www.linux-mips.org/archives/linux-mips/2017-01/msg00094.html Great. > newfstatat > Firstly we need to define TEST_BOGUS_STRUCT_STAT somewhere otherwise the > test just segfaults in glibc. Could you send a patch for this, please? > After that it suffers from the negative dates problem which happens when > glibc copies struct stat on mips n64. I thought I fixed this - did > something change? There was a big rework of struct stat{,64} decoding, see commit v4.13-80-ga7c4ee4. I reverted that workaround for mips n64 (commit v4.13-81-g6fc5338) as I believe it's no longer needed, otherwise other related tests would fail. -- ldv |
From: James C. <jam...@co...> - 2017-01-11 14:39:22
|
The newfstatat testcase on mips64 currently fails because:
- The BOGUS_STRUCT_STAT test segfaults inside glibc.
- The result of the fstatat call gives incorrect dates because the
  kernel struct stat uses unsigned int timestamps.

Fix by avoiding the glibc wrapper and using the relevant syscall
directly. This obviously avoids the first problem, and avoids the second
problem because print_stat always sign extends dates (unlike glibc which
will zero extend them).
---
 tests/fstatat.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/fstatat.c b/tests/fstatat.c
index ec55ca04..3981b4ec 100644
--- a/tests/fstatat.c
+++ b/tests/fstatat.c
@@ -28,7 +28,7 @@
 #ifdef HAVE_FSTATAT
 
 # define TEST_SYSCALL_INVOKE(sample, pst) \
-	fstatat(AT_FDCWD, sample, pst, AT_SYMLINK_NOFOLLOW)
+	syscall(TEST_SYSCALL_NR, AT_FDCWD, sample, pst, AT_SYMLINK_NOFOLLOW)
 # define PRINT_SYSCALL_HEADER(sample) \
 	do { \
 		int saved_errno = errno; \
@@ -38,6 +38,8 @@
 		printf(", AT_SYMLINK_NOFOLLOW) = %s\n", sprintrc(rc)); \
 	} while (0)
 
+# define USE_ASM_STAT
+
 # include "xstatx.c"
 
 #else
--
2.11.0 |
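The date mismatch described in the commit message above comes down to how a 32-bit timestamp field is widened to a 64-bit value. A minimal sketch in C, with hypothetical helper names that do not appear in strace or glibc:

```c
#include <stdint.h>

/*
 * Widen a raw 32-bit timestamp the way a zero-extending copy would:
 * the bit pattern is treated as unsigned, so the result is never
 * negative.
 */
long long widen_zero_extend(uint32_t raw)
{
	return (long long) raw;
}

/*
 * Widen the same bits the way a sign-extending decoder would: the bit
 * pattern is reinterpreted as a signed 32-bit value first.  The
 * conversion to int32_t assumes the usual two's complement wraparound
 * that GCC and Clang provide.
 */
long long widen_sign_extend(uint32_t raw)
{
	return (long long) (int32_t) raw;
}
```

For small values both helpers agree, but the bit pattern 0xffffffff becomes 4294967295 with zero extension and -1 with sign extension, so a test that compares dates printed from the two sources sees a mismatch for timestamps before the epoch.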
From: Dmitry V. L. <ld...@al...> - 2017-02-13 14:40:38
|
Hi, On Mon, Jan 09, 2017 at 08:54:52PM +0300, Dmitry V. Levin wrote: > On Mon, Jan 09, 2017 at 05:31:37PM +0000, James Cowgill wrote: > > On 06/01/17 00:51, Dmitry V. Levin wrote: > > > On Tue, Jan 03, 2017 at 06:02:44PM +0000, James Cowgill wrote: > > >> On 20/12/16 00:36, Steve McIntyre wrote: > > >>> On Tue, Dec 20, 2016 at 03:16:17AM +0300, Dmitry V. Levin wrote: > > >>>> On Tue, Dec 20, 2016 at 12:50:29AM +0100, Nahim El Atmani wrote: > > >>>>> On Mon, 19 Dec 2016 18:30:29 +0000, Steve McIntyre wrote: > > >>>>>> I'm seeing test suite failures on mips[1], mipsel[2] and mips64el[3] > > >>>>>> on Debian machines. > > >>>>>> > > >>>>>> The 32-bit builds are both showing issues with fault injection. I > > >>>>>> can't follow what the code is meant to be doing here, so no > > >>>>>> ideas on what's wrong. :-( > > >>>>> > > >>>>> As Dmitry said earlier: > > >>>>> > > >>>>> On Tue, 15 Nov 2016 15:43:59 +0300, Dmitry V. Levin wrote: > > >>>>>> Well, the mips kernel does not implement substitution of syscall numbers > > >>>>> > > >>>>> So it looks like the test has failed to SKIP on this target. > > >>>> > > >>>> As I'm not 100% sure there is no kernel support for mips, I decided > > >>>> not to skip the test on mips until somebody investigates. > > >>> > > >>> Ah, OK. James Cowgill is my friendly local mips expert - let's see > > >>> what he thinks... :-) > > >> > > >> I've had a look and I think there is a kernel bug here - specifically > > >> affecting 32-bit programs run on 64-bit kernels (like all the Debian > > >> buildds and the porterbox are). An extra PTRACE_SYSCALL stop is > > >> happening which confuses everything. I'll look some more tomorrow. > > > > > > Indeed, it seems to be a kernel bug in scall64-o32.S and scall64-n32.S. 
> > > > > > According to current arch/mips/kernel/scall64-o32.S:trace_a_syscall > > > (the same applies to arch/mips/kernel/scall64-n32.S), > > > > > > if the syscall number after the first syscall_trace_enter call is out > > > of range, there is a jump to not_o32_scall which in turn jumps to > > > arch/mips/kernel/scall64-64.S:handle_sys64 (or to handle_sysn32 which > > > then jumps on to handle_sys64). > > > > > > Handle_sys64, unsurprisingly, does all over again, starting with > > > a syscall_trace_enter call, which appears to be the second one > > > and causes that extra syscall stop you observe with 32-bit tracees > > > running on 64-bit kernels. > > > > Just going through all the MIPS testsuite bugs again: > > > > fault_injection* > > The kernel bug above (no patch yet). > > I couldn't find an easy fix for this kernel bug. Any ideas? OK, there is not going to be any fix for this kernel bug in v4.10, so I'd rather disable scno tampering tests when MIPS ABI is o32 but the kernel is n64. Is there any simple way for MIPS o32 userspace to find out whether the kernel is not a native MIPS o32? Something less hackish than manually invoking a MIPS n64 syscall? -- ldv |