|
From: Nicholas N. <nj...@cs...> - 2005-02-02 22:49:02
|
Hi,
Valgrind currently only supports the 'int' instruction when it is "int
0x80", which is used for a system call on x86/Linux. Some Java
implementations use the x86 'int' ("interrupt") instruction when certain
exceptions are thrown. Steve Blackburn of ANU was having problems with
using Cachegrind on some Java programs because of this. I made a quick
attempt at adding support for these instructions, but failed, so I'm
asking here about it. I tried adding a new kind of basic-block-ending
Jmp, and then tried adding a new UCode instruction, INT. I made some
progress but didn't really get anywhere.
AIUI, an interrupt basically causes a jump to a particular interrupt
handler within the kernel. This is tricky for Valgrind, because we can't
let jumps go just anywhere, otherwise we'll lose control and the client
will undoubtedly crash very quickly. So the pertinent question is this:
how does control return to the client in user-space? With system calls
(int 0x80), control returns -- once the kernel has done its thing -- to
the subsequent instruction. Is the same true of all interrupts?
Another question: how do we know what the kernel does while servicing the
interrupt? Would it require, as with syscalls, that Valgrind tells the
tool about certain events such as memory reads and writes? I guess it may
well depend on what the kernel does with each interrupt, and so support
would have to be added one interrupt at a time?
Basically, if anyone knows how these interrupts work, and has ideas about
how to support them, I'd appreciate knowing about it. Thanks.
N
|
|
From: Jeremy F. <je...@go...> - 2005-02-03 01:28:51
Attachments:
int3-emu.patch
|
On Wed, 2005-02-02 at 16:48 -0600, Nicholas Nethercote wrote:
> Valgrind currently only supports the 'int' instruction when it is "int
> 0x80", which is used for a system call on x86/Linux. Some Java
> implementations use the x86 'int' ("interrupt") instruction when certain
> exceptions are thrown. Steve Blackburn of ANU was having problems with
> using Cachegrind on some Java programs because of this. I made a quick
> attempt at adding support for these instructions, but failed, so I'm
> asking here about it. I tried adding a new kind of basic-block-ending
> Jmp, and then tried adding a new UCode instruction, INT. I made some
> progress but didn't really get anywhere.
Um, are you sure? I think the only interrupts which are usable under Linux
are int3 and int $0x80. int3 is the breakpoint instruction, and is a
special case because it has a 1 byte opcode rather than 2 bytes.
If you try to run any other interrupt, you just get a GPF, which looks
like a SIGSEGV to user mode (currently we get this wrong by generating a
SIGILL, but nothing cares).
> Basically, if anyone knows how these interrupts work, and have ideas about
> how to support them, I'd appreciate knowing about it. Thanks.
I don't think we can in any meaningful way, except for int3. If there's
a real need, I would do it with a simple helper call.
If they are using int3, I implemented it the other day. (Attached, but
it is out of date with respect to the baseBlock removal.)
J
|
|
From: Steve B. <Ste...@an...> - 2005-02-03 03:33:36
|
Hi Jeremy,
I don't know much about valgrind, but...
> Um, are you sure? I think the only interrupts which usable under Linux
> are int3 and int $0x80. int3 is the breakpoint instruction, and is a
> special case because it has a 1 byte opcode rather than 2 bytes.
In Jikes RVM we generate a number of software traps using the INT instruction.
These are int 40 through to int 43. We use these to catch conditions such as
array bounds violations, throw control to a signal handler which then starts our
(Java) exception handling mechanism. So, yes, we do use these, and we catch
them in a regular signal handler.
> If you try to run any other interrupt, you just get a GPF, which looks
> like a SIGSEGV to user mode (currently we get this wrong by generating a
> SIGILL, but nothing cares).
OK. Perhaps this is right. Perhaps we just look at the culprit instruction and
use the fact that it was an int rather than a load/store to determine what
condition we're handling. We do nonetheless generate an int instruction....
Perhaps all that needs to be done is for valgrind to implement the behavior you
describe above: make it look like a SIGSEGV.
Cheers,
--Steve
==21231== Cachegrind, an I1/D1/L2 cache profiler for x86-linux.
==21231== Copyright (C) 2002-2004, and GNU GPL'd, by Nicholas Nethercote et al.
==21231== Using valgrind-2.2.0, a program supervision framework for x86-linux.
==21231== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
==21231== For more details, rerun with: -v
==21231==
disInstr: unhandled instruction bytes: 0xCC 0xDF 0x32 0x57
at 0x5732DF9C: ???
Just to give some context, here's a couple of snippets from our baseline
compiler source code:
protected final void emit_ldiv() {
    // (1) zero check
    asm.emitMOV_Reg_RegDisp(T0, SP, 0);
    asm.emitOR_Reg_RegDisp(T0, SP, 4);
    VM_ForwardReference fr1 = asm.forwardJcc(asm.NE);
    asm.emitINT_Imm(VM_Runtime.TRAP_DIVIDE_BY_ZERO + RVM_TRAP_BASE); // trap if divisor is 0
    ....
public final void emitINT_Imm (int v) {
    if (VM.VerifyAssertions) VM._assert(v <= 0xFF);
    int miStart = mi;
    if (v == 3) { // special case interrupt
        setMachineCodes(mi++, (byte) 0xCC);
    } else {
        setMachineCodes(mi++, (byte) 0xCD);
        setMachineCodes(mi++, (byte) v);
    }
    if (lister != null) lister.I(miStart, "INT", v);
}
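The two encodings that emitINT_Imm produces can be read back with a few lines of C, which may help when interpreting an "unhandled instruction bytes" message like the one above (its first byte, 0xCC, is the one-byte int3 form). This is only a minimal sketch: the names int_insn and decode_int are hypothetical and are not taken from Valgrind or Jikes RVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result of decoding a software-interrupt instruction (hypothetical type). */
typedef struct {
    int length;  /* bytes consumed: 1 for int3, 2 for "int imm8", 0 if not an INT */
    int vector;  /* interrupt vector, or -1 if the bytes are not an INT */
} int_insn;

/* Decode the two x86 forms that emitINT_Imm can emit:
 *   0xCC      int3 (one-byte breakpoint form)
 *   0xCD ib   int imm8
 * `code` points at the first instruction byte; `avail` is the number of
 * readable bytes starting there. */
int_insn decode_int(const uint8_t *code, size_t avail)
{
    int_insn r = { 0, -1 };

    assert(code != NULL);
    if (avail >= 1 && code[0] == 0xCC) {
        r.length = 1;
        r.vector = 3;
    } else if (avail >= 2 && code[0] == 0xCD) {
        r.length = 2;
        r.vector = code[1];
    }
    return r;
}
```

Called on the bytes 0xCC 0xDF 0x32 0x57 from the log, decode_int reports a one-byte instruction with vector 3; called on 0xCD followed by an immediate, it reports a two-byte instruction with that immediate as the vector.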
|
|
From: Jeremy F. <je...@go...> - 2005-02-03 04:15:42
|
On Thu, 2005-02-03 at 14:33 +1100, Steve Blackburn wrote:
> In Jikes RVM we generate a number of software traps using the INT instruction.
> These are int 40 through to int 43. We use these to catch conditions such as
> array bounds violations, throw control to a signal handler which then starts our
> (Java) exception handling mechanism. So, yes, we do use these, and we catch
> them in a regular signal handler.
Oh, OK, I see.
> Perhaps all that needs to be done is for valgrind to implement the behavior you
> describe above: make it look like a SIGSEGV.
Right, that's easy. I don't think you can tell from the signal info alone
which int instruction it was, so we can easily simulate the effect of them
all by calling a helper with, say, INT $99 in it.
Hm, need to make sure that all the VCPU state is up to date at that point,
so you can see it from the signal handler. Do you look at other CPU state
from the handler, or just EIP? Do you expect to be able to continue after
the INT instruction, or does it always raise a Java exception?
J
|
|
From: Chris J. <ch...@at...> - 2005-02-03 21:17:56
|
> On Thu, 2005-02-03 at 14:33 +1100, Steve Blackburn wrote:
> > In Jikes RVM we generate a number of software traps using the INT
> > instruction.
> > These are int 40 through to int 43. We use these to catch conditions
> > such as array bounds violations, throw control to a signal handler
> > which then starts our (Java) exception handling mechanism. So, yes,
> > we do use these, and we catch them in a regular signal handler.
>
> Oh, OK, I see.
>
> > Perhaps all that needs to be done is for valgrind to implement the
> > behavior you describe above: make it look like a SIGSEGV.
>
> Right, that's easy. I don't think you can tell from the signal info
> alone which int instruction it was, so we can easily simulate the
> effect of them all by calling a helper with, say, INT $99 in it. Hm,
> need to make sure that all the VCPU state is up to date at that point,
> so you can see it from the signal handler. Do you look at other CPU
> state from the handler, or just EIP? Do you expect to be able to
> continue after the INT instruction, or does it always raise a Java
> exception?
I think the breakpoints patch I've posted to this list could be informative
here since it handles INT $3. The same principles could be extended to
handle other interrupts. Basically what the patch does is translate the trap
instruction into code to return from the innerloop with a particular TRC
value. This value is then caught in VG_(scheduler) which raises a real
signal (SIGTRAP in this case). Raising a real signal has the benefit it can
be seen by a debugger.
Chris
|
|
From: Jeremy F. <je...@go...> - 2005-02-04 00:26:54
|
On Thu, 2005-02-03 at 21:17 +0000, Chris January wrote:
> I think the breakpoints patch I've posted to this list could be informative
> here since it handles INT $3. The same principles could be extended to
> handle other interrupts. Basically what the patch does is translate the trap
> instruction into code to return from the innerloop with a particular TRC
> value. This value is then caught in VG_(scheduler) which raises a real
> signal (SIGTRAP in this case). Raising a real signal has the benefit it can
> be seen by a debugger.
Actually, that's what my patch does, only much more simply. It calls a
helper which invokes a real int3 instruction; the generated SIGTRAP is then
delivered to the thread using the normal signal machinery.
J
|
|
From: Chris J. <ch...@at...> - 2005-02-04 09:20:31
|
> On Thu, 2005-02-03 at 21:17 +0000, Chris January wrote:
> > I think the breakpoints patch I've posted to this list could be
> > informative here since it handles INT $3. The same principles could be
> > extended to handle other interrupts. Basically what the patch does is
> > translate the trap instruction into code to return from the innerloop
> > with a particular TRC value. This value is then caught in
> > VG_(scheduler) which raises a real signal (SIGTRAP in this case).
> > Raising a real signal has the benefit it can be seen by a debugger.
>
> Actually, that's what my patch does, only much more simply.
> It calls a helper which invokes a real int3 instruction; the
> generated SIGTRAP is then delivered to the thread using the
> normal signal machinery.
Doesn't that mean %eip isn't in the baseBlock/VG_(threads) at exception
time?
Chris
|
|
From: Jeremy F. <je...@go...> - 2005-02-04 17:59:07
|
On Fri, 2005-02-04 at 09:16 +0000, Chris January wrote:
> > Actually, that's what my patch does, only much more simply.
> > It calls a helper which invokes a real int3 instruction; the
> > generated SIGTRAP is then delivered to the thread using the
> > normal signal machinery.
>
> Doesn't that mean %eip isn't in the baseBlock/VG_(threads) at exception
> time?
The INT/ud2 instructions are considered to be the end of the basic block, so
all the VCPU state is flushed out. EIP will point to the client's INT
instruction (so it doesn't matter what kind of INT we use to raise the
signal, so long as it raises the right kind of signal).
J
|
|
From: Chris J. <ch...@at...> - 2005-02-04 20:38:53
|
> On Fri, 2005-02-04 at 09:16 +0000, Chris January wrote:
> > > Actually, that's what my patch does, only much more simply.
> > > It calls a helper which invokes a real int3 instruction; the
> > > generated SIGTRAP is then delivered to the thread using the
> > > normal signal machinery.
> >
> > Doesn't that mean %eip isn't in the baseBlock/VG_(threads) at
> > exception time?
>
> The INT/ud2 instructions are considered to be the end of the
> basic block, so all the VCPU state is flushed out. EIP will
> point to the client's INT instruction (so it doesn't matter
> what kind of INT we use to raise the signal, so long as it
> raises the right kind of signal).
Sorry, maybe I'm not looking at your patch correctly but I can't see how the
VCPU state is flushed out before the exception is generated. Could you
explain please? As I understand it the VCPU state is only flushed after the
exception has occurred, the kernel has queued the SIGTRAP signal, Valgrind
has received the signal and longjmp'ed out of the scheduler. Any external
program monitoring the program looking for traps, for example, will see the
wrong instruction pointer, even if they look in the baseBlock/VG_(threads)
structure instead of the real regs.
Chris
|
|
From: Jeremy F. <je...@go...> - 2005-02-04 22:56:49
|
On Fri, 2005-02-04 at 20:38 +0000, Chris January wrote:
> Sorry, maybe I'm not looking at your patch correctly but I can't see how the
> VCPU state is flushed out before the exception is generated. Could you
> explain please? As I understand it the VCPU state is only flushed after the
> exception has occurred, the kernel has queued the SIGTRAP signal, Valgrind
> has received the signal and longjmp'ed out of the scheduler. Any external
> program monitoring the program looking for traps, for example, will see the
> wrong instruction pointer, even if they look in the baseBlock/VG_(threads)
> structure instead of the real regs.
EIP in the ThreadState (previously baseBlock) is always up to date, because
it's updated after every instruction. If the INT never completes (which it
won't), EIP in the ThreadState will be left pointing to the client
instruction which triggered the exception.
The other register state may be still wrong; I think vg_from_ucode might
defer flushing it until the last moment, which is after the call to the
helper. That just means that you need to run with --single-step=yes, like
you do with any other program which requires precise exceptions.
J
|
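The point about EIP can be illustrated with a toy model. The sketch below is not Valgrind code: toy_thread_state and toy_run are hypothetical names, and the "CPU" understands only nop (0x90), int3 (0xCC) and int imm8 (0xCD ib). It assumes the saved EIP is written at each instruction boundary, so that when an INT traps and never completes, the saved EIP still names the INT itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-thread saved register state. */
typedef struct {
    uint32_t eip;
} toy_thread_state;

/* Run a toy instruction stream located at guest address `base`.
 * Returns 1 if an INT trapped, 0 if the stream ran to the end,
 * -1 on an opcode this toy model does not know. */
int toy_run(toy_thread_state *ts, const uint8_t *code, size_t len, uint32_t base)
{
    size_t off = 0;

    assert(ts != NULL && code != NULL);
    while (off < len) {
        /* Commit EIP at the instruction boundary, before executing. */
        ts->eip = base + (uint32_t)off;

        uint8_t op = code[off];
        if (op == 0x90) {
            off += 1;            /* nop completes; move to the next insn */
        } else if (op == 0xCC || op == 0xCD) {
            return 1;            /* INT traps and never completes */
        } else {
            return -1;           /* outside the toy instruction set */
        }
    }
    ts->eip = base + (uint32_t)len;  /* fell off the end of the stream */
    return 0;
}
```

For the stream nop, nop, int $0x28 at base 0x1000, toy_run returns 1 and leaves eip at 0x1002, the address of the INT, which is what a signal handler inspecting the faulting instruction would need.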
|
From: Julian S. <js...@ac...> - 2005-07-06 21:02:03
|
Reviving this long-dead thread ...
I'm looking at reimplementing INT in Valgrind 3. V3 has a completely
new JIT compared to Valgrind 2, and INT died in the transition
(obviously int $0x80 works, but nothing else).
So I'm trying to understand the requirements people have.
Chris, you appear to require that execution of int3 causes SIGTRAP
to be delivered.
Steve, you appear to require that execution of int $0x40 .. $0x43
cause SIGTRAP to be delivered?
So I'm inclined to arrange that SIGTRAP is delivered when executing
int $anything-other-than-0x80. How does that sound?
J
|
|
From: Jeremy F. <je...@go...> - 2005-07-09 20:00:58
|
Julian Seward wrote:
>Reviving this long-dead thread ...
>
>I'm looking at reimplementing INT in Valgrind 3. V3 has a completely
>new JIT compared to Valgrind 2, and INT died in the transition
>(obviously int $0x80 works, but nothing else).
>
>So I'm trying to understand the requirements people have.
>
>Chris, you appear to require that execution of int3 causes SIGTRAP
>to be delivered.
>
>Steve, you appear to require that execution of int $0x40 .. $0x43
>cause SIGTRAP to be delivered?
>
>So I'm inclined to arrange that SIGTRAP is delivered when executing
>int $anything-other-than-0x80. How does that sound?
>
SIGTRAP is specifically for the breakpoint trap, which is int 3. The
others end up generating SIGSEGV because the CPU raises a GPF when a
user-mode process tries to raise a disallowed software interrupt.
Fortunately (in this case) GPF fault frames contain very little real
information, so synthesizing an accurate signal from them is easy.
Certainly the easiest way in 2.4 was to just run an "int $40" (or
something; doesn't matter which interrupt it is) as a helper callout,
but I don't know how that fits into the Vex way of doing things.
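The behaviour described in this thread can be summarised as a small lookup: int $0x80 is the system call, int 3 delivers SIGTRAP, and any other vector ends up as SIGSEGV via the general protection fault. The sketch below only restates that description; the names int_effect and linux_x86_int_effect are hypothetical, and this is not how Valgrind or the kernel implements it.

```c
#include <assert.h>
#include <signal.h>

/* What a user-mode "int $vector" does on x86/Linux, per this thread. */
typedef enum {
    INT_IS_SYSCALL,     /* int $0x80: system call, returns to the next insn */
    INT_RAISES_SIGNAL   /* anything else: a signal is delivered */
} int_effect_kind;

typedef struct {
    int_effect_kind kind;
    int signo;          /* signal number, or 0 for the syscall case */
} int_effect;

int_effect linux_x86_int_effect(unsigned vector)
{
    int_effect e;

    assert(vector <= 0xFF);          /* the immediate is a single byte */
    if (vector == 0x80) {
        e.kind = INT_IS_SYSCALL;
        e.signo = 0;
    } else if (vector == 3) {
        e.kind = INT_RAISES_SIGNAL;
        e.signo = SIGTRAP;           /* breakpoint trap */
    } else {
        e.kind = INT_RAISES_SIGNAL;
        e.signo = SIGSEGV;           /* GPF seen from user mode */
    }
    return e;
}
```

Under this summary, the Jikes RVM trap vectors fall into the last branch, so a tool emulating them would need to deliver SIGSEGV rather than SIGTRAP.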
J
|
|
From: Steve B. <Ste...@an...> - 2005-07-09 00:40:27
Attachments:
Attached Message
Attached Message
|
Hi Julian,
>I'm looking at reimplementing INT in Valgrind 3. V3 has a completely
>new JIT compared to Valgrind 2, and INT died in the transition
>(obviously int $0x80 works, but nothing else).
>
>So I'm trying to understand the requirements people have.
>
>Chris, you appear to require that execution of int3 causes SIGTRAP
>to be delivered.
>
>Steve, you appear to require that execution of int $0x40 .. $0x43
>cause SIGTRAP to be delivered?
>
My requirements are really just that I see the same behavior under valgrind
as when running natively, given that we do use int $0x40...$0x43. As I
understand it, we just get a signal which looks like a SIGSEGV and we figure
things out from there by looking at the gregs.
I've attached the correspondence I had with Jeremy which led to his fix.
Cheers,
--Steve
|
|
From: Chris J. <ch...@at...> - 2005-07-15 15:32:35
|
> Reviving this long-dead thread ...
>
> I'm looking at reimplementing INT in Valgrind 3. V3 has a completely
> new JIT compared to Valgrind 2, and INT died in the transition
> (obviously int $0x80 works, but nothing else).
>
> So I'm trying to understand the requirements people have.
>
> Chris, you appear to require that execution of int3 causes SIGTRAP
> to be delivered.
>
> Steve, you appear to require that execution of int $0x40 .. $0x43
> cause SIGTRAP to be delivered?
>
> So I'm inclined to arrange that SIGTRAP is delivered when executing
> int $anything-other-than-0x80. How does that sound?
That sounds ok.
Chris
|
|
From: Julian S. <js...@ac...> - 2007-06-29 09:21:12
|
Both trunk and the 3.2 branch have supported int3 for a while now.
Is that what you need?
J
On Friday 29 June 2007 03:53, Steve Blackburn wrote:
> Hi Julian,
>
> Two years later, we've finally moved our valgrind work forward to V3,
> and it seems that INT fell through the cracks. Any chance of it
> going back in?
>
> Thanks,
>
> --Steve
>
> On 07/07/2005, at 7:01 AM, Julian Seward wrote:
> > Reviving this long-dead thread ...
> >
> > I'm looking at reimplementing INT in Valgrind 3. V3 has a completely
> > new JIT compared to Valgrind 2, and INT died in the transition
> > (obviously int $0x80 works, but nothing else).
> >
> > So I'm trying to understand the requirements people have.
> >
> > Chris, you appear to require that execution of int3 causes SIGTRAP
> > to be delivered.
> >
> > Steve, you appear to require that execution of int $0x40 .. $0x43
> > cause SIGTRAP to be delivered?
> >
> > So I'm inclined to arrange that SIGTRAP is delivered when executing
> > int $anything-other-than-0x80. How does that sound?
> >
> > J
|
|
From: Julian S. <js...@ac...> - 2007-06-29 10:09:39
|
On Friday 29 June 2007 10:41, Steve Blackburn wrote:
> Hi Julian,
>
> We had a discussion in April 2005 and Jeremy added support for INT3
> prior to v3.0. That worked wonderfully for us.
>
> Recently we upgraded to the most recent release and tested the head
> and our system no longer works :-/ We now get valgrind throwing an
> error,
ok .. send a small test case *and* the error that you're currently getting.
J
|
|
From: KJK::Hyperion <hac...@re...> - 2005-02-03 00:54:20
|
At 23.48 02/02/2005, Nicholas Nethercote wrote:
>So the pertinent question is this: how does control return to the client
>in user-space?
An interrupt almost invariably translates into a signal raised to the
calling thread. Interrupts were designed with that use in mind (or, most
probably, signals and exceptions were designed with interrupts in mind...).
A really small subset (in Linux, only # 0x80) is actually an abuse and is
used in place of call far <segment>:0 ("call gate" - IIRC, in AT&T syntax
it's lcall <segment>, 0), an abuse probably justified by raw performance
figures (I read somewhere Windows 95 used an invalid opcode, because at the
time the invalid opcode trap was the fastest way to get into kernel mode.
Nowadays we have sysenter)
>Another question: how do we know what the kernel does while servicing the
>interrupt?
You make an educated guess. In most cases, though, it only results in a
signal and the question really is which signal number for a given interrupt
|