|
From: Tom H. <to...@co...> - 2013-04-18 02:03:10
Attachments:
diffs.txt
|
valgrind revision: 13371
VEX revision: 2707
C compiler: gcc (GCC) 4.8.0 20130322 (Red Hat 4.8.0-1)
GDB: GNU gdb (GDB) Fedora (7.5.91.20130323-14.fc19)
Assembler: GNU assembler version 2.23.52.0.1-6.fc19 20130226
C library: GNU C Library (GNU libc) stable release version 2.17
uname -mrs: Linux 3.8.6-203.fc18.x86_64 x86_64
Vendor version: Fedora release 19 (Schrödinger's Cat)

Nightly build on bristol ( x86_64, Fedora 19 (Schrödinger's Cat) )
Started at 2013-04-18 02:31:36 BST
Ended at 2013-04-18 03:02:51 BST
Results differ from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 652 tests, 6 stderr failures, 2 stdout failures, 3 stderrB failures, 2 stdoutB failures, 0 post failures ==
gdbserver_tests/mcbreak (stdout)
gdbserver_tests/mcbreak (stdoutB)
gdbserver_tests/mcbreak (stderrB)
gdbserver_tests/mcinfcallRU (stderr)
gdbserver_tests/mcinfcallWSRU (stderr)
gdbserver_tests/mcinfcallWSRU (stderrB)
gdbserver_tests/mcmain_pic (stdout)
gdbserver_tests/mcmain_pic (stderr)
gdbserver_tests/mcmain_pic (stdoutB)
gdbserver_tests/mcmain_pic (stderrB)
memcheck/tests/dw4 (stderr)
memcheck/tests/origin5-bz2 (stderr)
exp-sgcheck/tests/hackedbz2 (stderr)

=================================================
== Results from 24 hours ago ==
=================================================

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 652 tests, 3 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
memcheck/tests/dw4 (stderr)
memcheck/tests/origin5-bz2 (stderr)
exp-sgcheck/tests/hackedbz2 (stderr)

=================================================
== Difference between 24 hours ago and now ==
=================================================

*** old.short 2013-04-18 02:47:05.443066400 +0100
--- new.short 2013-04-18 03:02:51.481306260 +0100
***************
*** 8,10 ****
! == 652 tests, 3 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
  memcheck/tests/dw4 (stderr)
--- 8,20 ----
! == 652 tests, 6 stderr failures, 2 stdout failures, 3 stderrB failures, 2 stdoutB failures, 0 post failures ==
! gdbserver_tests/mcbreak (stdout)
! gdbserver_tests/mcbreak (stdoutB)
! gdbserver_tests/mcbreak (stderrB)
! gdbserver_tests/mcinfcallRU (stderr)
! gdbserver_tests/mcinfcallWSRU (stderr)
! gdbserver_tests/mcinfcallWSRU (stderrB)
! gdbserver_tests/mcmain_pic (stdout)
! gdbserver_tests/mcmain_pic (stderr)
! gdbserver_tests/mcmain_pic (stdoutB)
! gdbserver_tests/mcmain_pic (stderrB)
  memcheck/tests/dw4 (stderr)
|
|
From: Tom H. <to...@co...> - 2013-04-18 15:10:57
|
On 18/04/13 03:02, Tom Hughes wrote:

> ! gdbserver_tests/mcbreak (stdout)
> ! gdbserver_tests/mcbreak (stdoutB)
> ! gdbserver_tests/mcbreak (stderrB)
> ! gdbserver_tests/mcinfcallRU (stderr)
> ! gdbserver_tests/mcinfcallWSRU (stderr)
> ! gdbserver_tests/mcinfcallWSRU (stderrB)
> ! gdbserver_tests/mcmain_pic (stdout)
> ! gdbserver_tests/mcmain_pic (stderr)
> ! gdbserver_tests/mcmain_pic (stdoutB)
> ! gdbserver_tests/mcmain_pic (stderrB)

These failures are caused by the change I committed yesterday to
respect the PT_GNU_STACK information and make stacks non-executable
when possible.

It seems that when GDB wants to call a function in the target it is
poking an INT3 instruction onto the target's stack, then poking the
address of that instruction onto the stack as the return address and
resuming at the address of the function it wants to call.

So when the target returns from that function it tries to execute on
the stack and fails...

Presumably when gdb is running the target directly it turns on execute
permission for the stack?

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/
|
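For readers unfamiliar with PT_GNU_STACK, here is a minimal sketch in C of how a loader could read that hint from a program header table. It assumes a Linux <elf.h>; the names gnu_stack_hint and enum stack_hint are made up for illustration and are not taken from the Valgrind sources. A missing header is reported as a separate case, since what a loader does then is a policy choice this thread does not describe.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical result type: what the program headers say about the stack. */
enum stack_hint {
    STACK_HINT_ABSENT = -1,  /* no PT_GNU_STACK header present */
    STACK_HINT_NX     = 0,   /* header present, PF_X clear */
    STACK_HINT_EXEC   = 1    /* header present, PF_X set */
};

/* Scan n program headers and report what PT_GNU_STACK requests. */
enum stack_hint gnu_stack_hint(const Elf64_Phdr *phdrs, size_t n)
{
    assert(phdrs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (phdrs[i].p_type == PT_GNU_STACK)
            return (phdrs[i].p_flags & PF_X) ? STACK_HINT_EXEC
                                             : STACK_HINT_NX;
    }
    return STACK_HINT_ABSENT;
}
```

A caller would pass the program header array of the main executable and map the stack without execute permission when the result is STACK_HINT_NX.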
|
From: Julian S. <js...@ac...> - 2013-04-30 17:05:30
|
On 04/18/2013 05:10 PM, Tom Hughes wrote:
> On 18/04/13 03:02, Tom Hughes wrote:

> These failures are caused by the change I committed yesterday to
> respect the PT_GNU_STACK information and make stacks non-executable
> when possible.
>
> It seems that when GDB wants to call a function in the target it is
> poking an INT3 instruction onto the target's stack, then poking the
> address of that instruction onto the stack as the return address and
> resuming at the address of the function it wants to call.

Sounds plausible, but Philippe will know for sure.

Did this get resolved yet?

J
|
|
From: Tom H. <to...@co...> - 2013-04-30 17:51:34
|
On 30/04/13 18:04, Julian Seward wrote:

> On 04/18/2013 05:10 PM, Tom Hughes wrote:
>> On 18/04/13 03:02, Tom Hughes wrote:
>
>> These failures are caused by the change I committed yesterday to
>> respect the PT_GNU_STACK information and make stacks non-executable
>> when possible.
>>
>> It seems that when GDB wants to call a function in the target it is
>> poking an INT3 instruction onto the target's stack, then poking the
>> address of that instruction onto the stack as the return address and
>> resuming at the address of the function it wants to call.
>
> Sounds plausible, but Philippe will know for sure.
>
> Did this get resolved yet?

No. I was hoping Philippe would have some ideas.

I did look a bit further at it, and it seems that when the target is run
directly under gdb without valgrind it does things in a different way
and doesn't use the stack as the return address.

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/
|
|
From: Philippe W. <phi...@sk...> - 2013-05-01 16:31:30
|
On Tue, 2013-04-30 at 18:51 +0100, Tom Hughes wrote:
> On 30/04/13 18:04, Julian Seward wrote:
> > On 04/18/2013 05:10 PM, Tom Hughes wrote:
> >> On 18/04/13 03:02, Tom Hughes wrote:
> >
> >> These failures are caused by the change I committed yesterday to
> >> respect the PT_GNU_STACK information and make stacks non-executable
> >> when possible.
> >>
> >> It seems that when GDB wants to call a function in the target it is
> >> poking an INT3 instruction onto the target's stack, then poking the
> >> address of that instruction onto the stack as the return address and
> >> resuming at the address of the function it wants to call.
> >
> > Sounds plausible, but Philippe will know for sure.
> >
> > Did this get resolved yet?
>
> No. I was hoping Philippe would have some ideas.

Oops, I missed the original mail, so here is some late feedback.

>
> I did look a bit further at it, and it seems that when the target is run
> directly under gdb without valgrind it does things in a different way
> and doesn't use the stack as the return address.

The i386/amd64 gdb target was changed in GDB 7.5 to use 'ON_STACK'
inferior. GDB 7.4 and before are using AT_ENTRY_POINT.
With ON_STACK, GDB always uses the stack to insert a breakpoint
for the return address of an inferior function call.

I do not think that GDB behaves differently when using a gdbserver
or when running natively (i.e. GDB does not change the stack
permissions).

I think the difference in behaviour is due to Valgrind, which has to
decode instructions before they are executed.

When the inferior function returns, in an execution under GDB or an
execution under a "normal" gdbserver, the breakpoint is encountered
before the instruction is executed. GDB regains control, and restores
the situation before the inferior function call.
(not completely sure about the above. If the stack page is not
executable, maybe some special code in the kernel will cause SIGTRAP
to be generated anyway if there is an INT3 instruction).

There is a testcase for "returning" to an nx page at:
wget http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/ret-to-nxpage.c?cvsroot=systemtap
If you compile and run ret-to-nxpage.c (amd64/Deb 6), it works properly
natively. It fails under Valgrind.

When running under Valgrind: when the called function returns, Valgrind
has to decode and translate the instruction(s) on the stack.
(note: after some discussions with GDB developers, the insertion of
0xcc/INT3 on the stack by GDB was added especially for Valgrind in GDB
7.5: without this 0xcc, the stack contains "random" bytes, which caused
Valgrind's instruction decoder to report invalid instructions).
The resulting translated instructions will not be executed, as GDB will
regain control due to the breakpoint being encountered (a breakpoint at
an instruction is translated by Valgrind into a call to the Valgrind
gdbserver, to let GDB regain control).

With PT_GNU_STACK making the stack non-executable, Valgrind now
reports an error and crashes.

I checked that the stack permission is not changed by GDB 7.5,
on Debian 6:

(gdb) bt
#0  whoami (msg=0x602010 "hello") at t.c:29
#1  <function called from gdb>
#2  main (argc=1, argv=0x7fffffffe678) at t.c:105
(gdb) info frame 0
Stack frame at 0x7fffffffe460:
 rip = 0x400be8 in whoami (t.c:29); saved rip 0x7fffffffe46f
 called by frame at 0x7fffffffe468
 source language c.
 Arglist at 0x7fffffffe450, args: msg=0x602010 "hello"
 Locals at 0x7fffffffe450, Previous frame's sp is 0x7fffffffe460
 Saved registers:
  rbp at 0x7fffffffe450, rip at 0x7fffffffe458
(gdb) x/1xb 0x7fffffffe46f
0x7fffffffe46f: 0xcc
(gdb) shell grep stack /proc/16264/maps
7ffffffea000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
(gdb) c
Continuing.
pid 16264 Thread 16264 hello
(gdb)

So, to have inferior function calls working with PT_GNU_STACK and
GDB >= 7.5, it seems something more sophisticated will be needed
in Valgrind.

Maybe when an nx page is detected, have m_translate.c check whether
there is a breakpoint at this address. If there is a breakpoint,
generate a SIGTRAP rather than a SIGSEGV?
Or maybe call gdbserver directly?

I will experiment with the above and give feedback.

Philippe
|
|
From: Tom H. <to...@co...> - 2013-05-01 16:55:42
|
On 01/05/13 17:31, Philippe Waroquiers wrote:

> When the inferior function returns, in an execution under GDB or an
> execution under a "normal" gdbserver, the breakpoint is encountered
> before the instruction is executed. GDB regains control, and restore
> the situation before the inferior function call.

So normally gdb will rely on a hardware breakpoint using the debug
registers to trigger and the INT3 will never try and execute?

> When running under Valgrind: when the called function returns, Valgrind
> has to decode and translate the instruction(s) on the stack.
> (note: after some discussions with GDB developers, the insertion of
> 0xcc/INT3 on the stack by GDB was added especially for Valgrind in GDB
> 7.5: without this 0xcc, the stack contains "random" bytes, which caused
> Valgrind instruction decoder to report invalid instructions).

Right, so it sounds like it was relying on a hardware breakpoint which
is why it didn't need the actual instruction but valgrind does.

> Maybe when an nx page is detected, have m_translate.c checking if
> there is a breakpoint at this address. If there is a breakpoint,
> rather generate a SIGTRAP than a SIGSEGV ?

Or make the valgrind gdbserver support "hardware" breakpoints as I think
it already does for watchpoints, and have execution stop with SIGTRAP
when a breakpoint address is hit without valgrind trying to decode and
execute the instruction?

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/
|
|
From: Philippe W. <phi...@sk...> - 2013-05-01 20:39:31
|
On Wed, 2013-05-01 at 17:55 +0100, Tom Hughes wrote:
> On 01/05/13 17:31, Philippe Waroquiers wrote:
>
> > When the inferior function returns, in an execution under GDB or an
> > execution under a "normal" gdbserver, the breakpoint is encountered
> > before the instruction is executed. GDB regains control, and restore
> > the situation before the inferior function call.
>
> So normally gdb will rely on a hardware breakpoint using the debug
> registers to trigger and the INT3 will never try and execute?
I investigated the behaviour of "native" GDB in more depth:
The INT3 in a non-executable page is effectively never executed.
However, GDB does not use hw breakpoints for that.
When the inferior function returns, a SIGSEGV is generated.
However, GDB handles this SIGSEGV as a SIGTRAP when there is a
breakpoint at the address that caused the SEGV.
In such a case, GDB instructs the inferior process to ignore
the SEGV.

With PT_GNU_STACK, Valgrind's m_translate.c also generates
a SIGSEGV, properly received by GDB via gdbsrv.
However, in the case of Valgrind, there is no way for V to
ignore the SEGV: ignoring such a SEGV would require
V to properly continue execution somewhere in m_translate.c.
Probably doable but not straightforward => other solution
in the patch below.
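
The native GDB behaviour described above can be sketched as a small classification function. This is a hypothetical model for illustration only, not GDB code: the name effective_stop_signal and the flat array of breakpoint addresses are assumptions.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: decide how a debugger treats a stop of the inferior.
   A SIGSEGV raised at an address that carries a breakpoint is handled as
   if it were the SIGTRAP of that breakpoint; anything else is unchanged. */
int effective_stop_signal(int sig, uint64_t fault_pc,
                          const uint64_t *breakpoints, size_t n_breakpoints)
{
    assert(breakpoints != NULL || n_breakpoints == 0);
    if (sig != SIGSEGV)
        return sig;
    for (size_t i = 0; i < n_breakpoints; i++) {
        if (breakpoints[i] == fault_pc)
            return SIGTRAP;
    }
    return SIGSEGV;
}
```

In this model, a return to the 0xcc byte on a non-executable stack counts as a breakpoint hit, while a fault anywhere else stays a real SIGSEGV.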
> Or make the valgrind gdbserver support "hardware" breakpoints as I think
> it already does for watchpoints, and have execution stop with SIGTRAP
> when a breakpoint address is hit without valgrind trying to decode and
> execute the instruction?
Valgrind "emulates" hardware watchpoints by using the memcheck
A-bits machinery.
Breakpoints are translated into a call to gdbserver.
V accepts both Z0 and Z1 packets (software and hardware breakpoints,
respectively), but they are implemented the same way (and GDB uses
Z0 by default).

The patch below fixes the nx stack problem by still agreeing
to translate when there is a gdbsrv breakpoint at the address to
translate in an nx segment.
Translating in such a case has no effect, as
GDB will clean up before this translation is really executed
(only the gdbserver call at the beginning of the translation is
executed, not the "real" instructions after).

If there are no comments on the patch, I will commit in a day or two.

Philippe
Index: coregrind/m_translate.c
===================================================================
--- coregrind/m_translate.c (revision 13380)
+++ coregrind/m_translate.c (working copy)
@@ -745,9 +745,9 @@
/* --------- Various helper functions for translation --------- */
/* Look for reasons to disallow making translations from the given
- segment. */
+ segment/addr. */
-static Bool translations_allowable_from_seg ( NSegment const* seg )
+static Bool translations_allowable_from_seg ( NSegment const* seg, Addr addr )
{
# if defined(VGA_x86) || defined(VGA_s390x) || defined(VGA_mips32) \
|| defined(VGA_mips64)
@@ -757,7 +757,16 @@
# endif
return seg != NULL
&& (seg->kind == SkAnonC || seg->kind == SkFileC || seg->kind == SkShmC)
- && (seg->hasX || (seg->hasR && allowR));
+ && (seg->hasX
+ || (seg->hasR && allowR)
+ || VG_(has_gdbserver_breakpoint) (addr));
+ /* If GDB/gdbsrv has inserted a breakpoint at addr, we assume this
+ is a valid location to translate, even if seg not executable.
+ This is needed for inferior function calls from GDB: GDB inserts a
+ breakpoint on the stack, and expects to regain control before the
+ breakpoint instruction at the breakpoint address is really
+ executed. For this, the breakpoint instruction must be translated
+ so as to have the call to gdbserver executed. */
}
@@ -852,7 +861,7 @@
allow a chase. */
/* Destination not in a plausible segment? */
- if (!translations_allowable_from_seg(seg))
+ if (!translations_allowable_from_seg(seg, addr))
goto dontchase;
/* Destination is redirected? */
@@ -1418,7 +1427,7 @@
{ /* BEGIN new scope specially for 'seg' */
NSegment const* seg = VG_(am_find_nsegment)(addr);
- if ( (!translations_allowable_from_seg(seg))
+ if ( (!translations_allowable_from_seg(seg, addr))
|| addr == TRANSTAB_BOGUS_GUEST_ADDR ) {
if (VG_(clo_trace_signals))
VG_(message)(Vg_DebugMsg, "translations not allowed here (0x%llx)"
Index: coregrind/pub_core_gdbserver.h
===================================================================
--- coregrind/pub_core_gdbserver.h (revision 13380)
+++ coregrind/pub_core_gdbserver.h (working copy)
@@ -76,6 +76,9 @@
Bool VG_(gdbserver_point) (PointKind kind, Bool insert,
Addr addr, int len);
+/* True if there is a breakpoint at addr. */
+Bool VG_(has_gdbserver_breakpoint) (Addr addr);
+
/* Entry point invoked by vgdb when it uses ptrace to cause a gdbserver
invocation. A magic value is passed by vgdb in check as a verification
that the call has been properly pushed by vgdb. */
Index: coregrind/m_gdbserver/m_gdbserver.c
===================================================================
--- coregrind/m_gdbserver/m_gdbserver.c (revision 13380)
+++ coregrind/m_gdbserver/m_gdbserver.c (working copy)
@@ -394,6 +394,15 @@
return True;
}
+Bool VG_(has_gdbserver_breakpoint) (Addr addr)
+{
+ GS_Address *g;
+ if (!gdbserver_called)
+ return False;
+ g = VG_(HT_lookup) (gs_addresses, (UWord)HT_addr(addr));
+ return (g != NULL && g->kind == GS_break);
+}
+
Bool VG_(is_watched)(PointKind kind, Addr addr, Int szB)
{
Word n_elems;
|
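The lookup added by the patch can be modelled outside Valgrind as follows. This is a simplified sketch: struct point_table, enum point_kind and has_breakpoint are hypothetical stand-ins for the gs_addresses hash table, GS_break and VG_(has_gdbserver_breakpoint), and a linear scan replaces the hash lookup. The early return mirrors the gdbserver_called check in the patch; the reading that the table may not exist before gdbserver has first run is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the gdbserver point bookkeeping. */
enum point_kind { POINT_BREAK, POINT_WATCH };

struct point {
    uint64_t addr;
    enum point_kind kind;
};

struct point_table {
    bool initialised;            /* models the gdbserver_called flag */
    const struct point *points;
    size_t n_points;
};

/* True only if the table is live and holds a breakpoint (not a
   watchpoint) at exactly addr. */
bool has_breakpoint(const struct point_table *t, uint64_t addr)
{
    assert(t != NULL);
    if (!t->initialised)
        return false;
    for (size_t i = 0; i < t->n_points; i++) {
        if (t->points[i].addr == addr && t->points[i].kind == POINT_BREAK)
            return true;
    }
    return false;
}
```

Checking the kind matters because the same table also records watchpoints, which should not make an address translatable.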
|
From: Julian S. <js...@ac...> - 2013-05-02 09:54:03
|
On 05/01/2013 10:39 PM, Philippe Waroquiers wrote:
> If no comment on the patch, I will commit in one day or two.
Looks OK, except one comment:
> return seg != NULL
> && (seg->kind == SkAnonC || seg->kind == SkFileC || seg->kind == SkShmC)
> - && (seg->hasX || (seg->hasR && allowR));
> + && (seg->hasX
> + || (seg->hasR && allowR)
> + || VG_(has_gdbserver_breakpoint) (addr));
Should the last condition ..
|| VG_(has_gdbserver_breakpoint) (addr));
maybe be
|| (seg->hasR && VG_(has_gdbserver_breakpoint) (addr)));
? I am thinking of the case where the segment doesn't even have read
permissions, but nevertheless VG_(has_gdbserver_breakpoint) returns True.
In that case, I think VEX will segfault when it starts to make the
translation, no? We at least need read permissions on any page that we
make a translation from.
I am not sure about this though -- I can't remember if there is some
existing logic to deal with the situation where a page is mapped x-only,
since in that situation vex will need to have r-permission too. So
maybe this isn't a problem. I don't know.
J
|
|
From: Philippe W. <phi...@sk...> - 2013-05-02 13:18:36
|
On Thu, 2013-05-02 at 11:53 +0200, Julian Seward wrote:
> On 05/01/2013 10:39 PM, Philippe Waroquiers wrote:
> > If no comment on the patch, I will commit in one day or two.
>
> Looks OK, except one comment:
>
> > return seg != NULL
> > && (seg->kind == SkAnonC || seg->kind == SkFileC || seg->kind == SkShmC)
> > - && (seg->hasX || (seg->hasR && allowR));
> > + && (seg->hasX
> > + || (seg->hasR && allowR)
> > + || VG_(has_gdbserver_breakpoint) (addr));
>
> Should the last condition ..
>
> || VG_(has_gdbserver_breakpoint) (addr));
>
> maybe be
>
> || (seg->hasR && VG_(has_gdbserver_breakpoint) (addr)));
>
> ? I am thinking of the case where the segment doesn't even have read
> permissions, but nevertheless VG_(has_gdbserver_breakpoint) returns True.
> In that case, I think VEX will segfault when it starts to make the
> translation, no? We at least need read permissions on any page that we
> make a translation from.
>
> I am not sure about this though -- I can't remember if there is some
> existing logic to deal with the situation where a page is mapped x-only,
> since in that situation vex will need to have r-permission too. So
> maybe this isn't a problem. I don't know.
For the case gdbsrv needs (an inferior function call with a stack made
non-executable by PT_GNU_STACK), hasR will be True (the breakpoint is
on the stack, which is for sure rw) and hasX will be False.
=> I will change the condition to:
&& (seg->hasX
|| (seg->hasR && (allowR
|| VG_(has_gdbserver_breakpoint) (addr))));
Note that the current code accepts translating a hasX segment
without explicitly checking hasR. Maybe a hasX segment automatically
has hasR?
It looks better not to change this part of the condition.
Thanks for the review, will retest and commit
Philippe
|
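To compare the three variants of the permission test discussed in this thread, here is a reduced model in C. It keeps only the hasR/hasX/allowR/breakpoint part of the condition; the seg != NULL and segment-kind checks are left out, and struct seg_perms and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of NSegment to the two permission bits used here. */
struct seg_perms {
    bool has_r;
    bool has_x;
};

/* Condition before the patch: the breakpoint is not considered. */
bool allow_original(const struct seg_perms *s, bool allow_r, bool has_bp)
{
    assert(s != NULL);
    (void)has_bp;
    return s->has_x || (s->has_r && allow_r);
}

/* Condition in the first version of the patch: a breakpoint is enough,
   even when the segment is not readable. */
bool allow_first_patch(const struct seg_perms *s, bool allow_r, bool has_bp)
{
    assert(s != NULL);
    return s->has_x || (s->has_r && allow_r) || has_bp;
}

/* Condition after review: a breakpoint only helps on a readable segment. */
bool allow_final(const struct seg_perms *s, bool allow_r, bool has_bp)
{
    assert(s != NULL);
    return s->has_x || (s->has_r && (allow_r || has_bp));
}
```

For a readable, non-executable stack with a breakpoint and allowR false, the original condition rejects the translation and both patched versions accept it. The two patched versions differ only on a segment with neither read nor execute permission, which is the case raised in the review.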