|
From: Filipe C. <fi...@gm...> - 2008-04-17 14:58:32
|
Hi,
We have noticed that the assembly generated by a "boring goto" has an
indirect jump to $dispatcher_addr.
example:
-- goto {Boring} 0x80483A1:I32
movl $0x80483A1,%eax ; movl $dispatcher_addr,%edx ; jmp *%edx
Why the indirection? The dispatcher address (vta->dispatch) is known
since the start of the translation and (as far as we can tell) isn't
changed in runtime. Couldn't we just emit a jmp $dispatcher_addr?
Thanks in advance,
- Filipe Cabecinhas
|
|
From: Julian S. <js...@ac...> - 2008-04-17 15:24:48
|
On Thursday 17 April 2008 16:58, Filipe Cabecinhas wrote:
> Hi,
>
> We have noticed that the assembly generated by a "boring goto" has an
> indirect jump to $dispatcher_addr.
>
> example:
> -- goto {Boring} 0x80483A1:I32
> movl $0x80483A1,%eax ; movl $dispatcher_addr,%edx ; jmp *%edx
>
>
> Why the indirection? The dispatcher address (vta->dispatch) is known
> since the start of the translation and (as far as we can tell) isn't
> changed in runtime. Couldn't we just emit a jmp $dispatcher_addr?
Because .. x86 doesn't contain an instruction "jmp $32-bit-constant",
unfortunately. It does contain "jmp $32-bit-offset-relative-to-pc",
which is what is normally used (+ there is a short form with an 8-bit
signed offset).
Unfortunately vex is constrained to generate position independent code,
which means a relative jump to a fixed address can't be used, since its
offset depends on where the code ends up. Position independence makes other parts of
the engineering simpler. Specifically, it means vex can generate the
code somewhere, and valgrind's core (m_translate, m_transtab) can copy
the resulting translation to wherever it likes for long term storage.
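To illustrate the point about copying translations around, here is a minimal sketch that encodes both forms by hand. The dispatcher address and the two code locations are made-up values, and the helper names are hypothetical, not vex functions.

```python
import struct

def indirect_jump(guest_next: int, dispatcher: int) -> bytes:
    """movl $guest_next,%eax ; movl $dispatcher,%edx ; jmp *%edx"""
    return (b"\xB8" + struct.pack("<I", guest_next)    # B8 id: movl $imm32,%eax
            + b"\xBA" + struct.pack("<I", dispatcher)  # BA id: movl $imm32,%edx
            + b"\xFF\xE2")                             # FF /4, modrm 0xE2: jmp *%edx

def relative_jump(site: int, target: int) -> bytes:
    """E9 cd: jmp rel32. The offset counts from the end of the 5-byte insn."""
    rel = (target - (site + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", rel)

DISPATCHER = 0x38001000                   # assumed address, for illustration only
SITE_A, SITE_B = 0x40000000, 0x50000000   # two places the code might be copied to

# The indirect sequence never mentions its own address, so its bytes are
# the same wherever it is stored.
print(indirect_jump(0x80483A1, DISPATCHER).hex())

# The relative form bakes the distance to the dispatcher into the insn,
# so the same jump needs different bytes at SITE_A and SITE_B.
print(relative_jump(SITE_A, DISPATCHER).hex())
print(relative_jump(SITE_B, DISPATCHER).hex())
```

The indirect form costs 12 bytes against 5 for the relative jump, which is the price paid for being able to copy the bytes unchanged.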
You'll see the same problem for calls to helper functions, since there's
no call-absolute-address insn either. Except worse, since most SBs
instrumented by most tools have multiple helper calls in them.
There are two possible solutions, both kinda ugly and complex
(but standard and well-known):
1. Generate relative jmp/call insns. Also, generate a relocation
table for the translation, so we know how to adjust the offsets
when the translation is moved.
2. Valgrind decides beforehand where the translation will end up,
and vex generates it directly to that address.
(1) is generally more flexible, since it allows moving the translation
as many times as desired over its lifetime. Also (2) is difficult
because in general you might not know where you want the translation
to go until you know how big it is.
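Option (1) can be sketched roughly as follows. This is only an illustration of the relocation-table idea, not how vex or valgrind's core does anything; the function names are hypothetical and the code is assumed to be generated as if it started at address 0.

```python
import struct

def emit_with_relocs(targets):
    """Emit one `jmp rel32` per absolute target, as if the code began at
    address 0. Returns (code, relocs); relocs holds the offset of every
    rel32 field that must be adjusted when the code is placed."""
    code = bytearray()
    relocs = []
    for target in targets:
        code += b"\xE9"                    # jmp rel32 opcode
        relocs.append(len(code))           # remember where the rel32 field is
        end_of_insn = len(code) + 4
        code += struct.pack("<I", (target - end_of_insn) & 0xFFFFFFFF)
    return bytes(code), relocs

def place_at(code, relocs, base):
    """Copy the translation to `base`, patching each recorded rel32 field.
    Moving the code up by `base` shortens every offset by `base`."""
    out = bytearray(code)
    for off in relocs:
        (rel,) = struct.unpack_from("<I", out, off)
        struct.pack_into("<I", out, off, (rel - base) & 0xFFFFFFFF)
    return bytes(out)

def jump_target(code, off, base):
    """Where the jmp whose rel32 field sits at `off` lands, for code at `base`."""
    (rel,) = struct.unpack_from("<i", code, off)
    return (base + off + 4 + rel) & 0xFFFFFFFF

code, relocs = emit_with_relocs([0x38001000, 0x38002000])
for base in (0x40000000, 0x50000000):
    placed = place_at(code, relocs, base)
    print(hex(base), [hex(jump_target(placed, off, base)) for off in relocs])
```

Keeping the unplaced code and its table around is what allows the translation to be placed more than once; option (2) needs no table but fixes the address before the size is known.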
J
|
|
From: Filipe C. <fi...@gm...> - 2008-04-17 18:40:32
|
Hi
On 17 Apr, 2008, at 16:19, Julian Seward wrote:
>
> Because .. x86 doesn't contain an instruction "jmp $32-bit-constant",
> unfortunately. It does contain "jmp $32-bit-offset-relative-to-pc",
> which is what is normally used (+ there is a short form with an 8-bit
> signed offset).
>
I see. We forgot to check that little detail, unfortunately.
We'll take a closer look at the block chaining for valgrind 2, and
maybe think about more optimizations.
Thanks for the reply,
- Filipe Cabecinhas
|
|
From: John R.
|
> We have noticed that the assembly generated by a "boring goto" has an
> indirect jump to $dispatcher_addr.
>
> example:
> -- goto {Boring} 0x80483A1:I32
> movl $0x80483A1,%eax ; movl $dispatcher_addr,%edx ; jmp *%edx
>
>
> Why the indirection? The dispatcher address (vta->dispatch) is known
> since the start of the translation and (as far as we can tell) isn't
> changed in runtime. Couldn't we just emit a jmp $dispatcher_addr?
That would work on i686 but not on x86_64 or powerpc or powerpc64.
Those other architectures have limited displacement for direct branches
and calls.
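A rough way to see the displacement limits is a reachability check like the one below. It assumes the x86_64 `jmp rel32` form (signed 32-bit offset from the end of the 5-byte instruction) and the PowerPC I-form `b` (24-bit field shifted left by 2, relative to the branch itself); the function names and addresses are illustrative only.

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a two's-complement `bits`-bit integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def x86_64_rel32_reaches(site: int, target: int) -> bool:
    # jmp rel32: displacement is measured from the end of the 5-byte insn
    return fits_signed(target - (site + 5), 32)

def ppc_branch_reaches(site: int, target: int) -> bool:
    # I-form `b`: 24-bit LI field, shifted left 2, giving a 26-bit signed
    # byte displacement from the address of the branch insn
    disp = target - site
    return disp % 4 == 0 and fits_signed(disp, 26)

# Within about 2 GiB the x86_64 direct jump works...
print(x86_64_rel32_reaches(0x400000, 0x38001000))
# ...but not across a 64-bit address space.
print(x86_64_rel32_reaches(0x400000, 0x7F0000000000))
# PowerPC direct branches reach roughly +/- 32 MiB.
print(ppc_branch_reaches(0x10000000, 0x10000000 + 0x2000000))
```

On i686 the 32-bit offset wraps around the whole address space, so the check is always satisfied there.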
--
|
|
From: Tom H. <to...@co...> - 2008-04-17 16:05:31
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
> On Thursday 17 April 2008 16:58, Filipe Cabecinhas wrote:
>
>> We have noticed that the assembly generated by a "boring goto" has an
>> indirect jump to $dispatcher_addr.
>>
>> example:
>> -- goto {Boring} 0x80483A1:I32
>> movl $0x80483A1,%eax ; movl $dispatcher_addr,%edx ; jmp *%edx
>>
>>
>> Why the indirection? The dispatcher address (vta->dispatch) is known
>> since the start of the translation and (as far as we can tell) isn't
>> changed in runtime. Couldn't we just emit a jmp $dispatcher_addr?
>
> Because .. x86 doesn't contain an instruction "jmp $32-bit-constant",
> unfortunately. It does contain "jmp $32-bit-offset-relative-to-pc",
> which is what is normally used (+ there is a short form with an 8-bit
> signed offset).
You can do it in 32 bit mode can't you? Using FF/4 with a modrm
that encodes a 32 bit constant? In 64 bit mode that becomes the
encoding for a 32 bit RIP relative address though.
The same goes for call using the FF/2 opcode.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Julian S. <js...@ac...> - 2008-04-17 18:48:21
|
> > Because .. x86 doesn't contain an instruction "jmp $32-bit-constant",
> > unfortunately. It does contain "jmp $32-bit-offset-relative-to-pc",
> > which is what is normally used (+ there is a short form with an 8-bit
> > signed offset).
>
> You can do it in 32 bit mode can't you? Using FF/4 with a modrm
> that encodes a 32 bit constant? In 64 bit mode that becomes the
> encoding for a 32 bit RIP relative address though.
>
> The same goes for call using the FF/2 opcode.
Nearly, but, alas, no. FF/4 $imm32 is "goto *(void*)$imm32" and
not merely "goto $imm32", that is, it dereferences its argument.
Ditto FF/2.
J
|
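The FF/4 and FF/2 encodings with a bare 32-bit displacement can be spelled out as below. This is a small sketch for 32-bit mode with hypothetical helper names; `next_pc` is a toy model of the dereference, with a dictionary standing in for memory.

```python
import struct

def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModRM fields into one byte."""
    return (mod << 6) | (reg << 3) | rm

def jmp_mem_abs32(addr: int) -> bytes:
    """FF /4 with mod=00, rm=101 (disp32 only): jmp *addr."""
    return bytes([0xFF, modrm(0b00, 4, 0b101)]) + struct.pack("<I", addr)

def call_mem_abs32(addr: int) -> bytes:
    """FF /2 with mod=00, rm=101 (disp32 only): call *addr."""
    return bytes([0xFF, modrm(0b00, 2, 0b101)]) + struct.pack("<I", addr)

def next_pc(insn: bytes, memory: dict) -> int:
    """Toy model: the disp32 names a memory cell, and the jump goes to
    whatever address is stored in that cell, not to disp32 itself."""
    (addr,) = struct.unpack_from("<I", insn, 2)
    return memory[addr]

insn = jmp_mem_abs32(0x38001000)
print(insn.hex())
print(hex(next_pc(insn, {0x38001000: 0x0804F000})))
```

So using this form would need the dispatcher's address stored in a known memory cell, which is still an indirect jump, only through memory instead of through %edx.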