|
From: Nicholas N. <nj...@ca...> - 2004-07-25 21:40:44
|
Hi,
I'm attempting an x86-64 port. I've got Valgrind compiling, which required
allowing for pointers and ints to be different sizes, and changing the
register set, and excising a whole lot of code that I didn't want to deal
with at the moment (eg. all the tools except Nulgrind, a lot of the signal
handling, etc).
Now I'm fighting through the startup stuff. I'm proceeding until
something breaks, then I try to fix it. Or ignore it, if it doesn't look
too dangerous. Here's what I've done so far, perhaps people can assist...
- I noticed that the s/start/_ume_entry in stage2.lds is unnecessary since
the -Wl,-e,_ume_entry option is passed to stage2. No matter.
- I changed kickstart_base to 0x70000000; having it at 0xb0000000 I get
lots of weird linker errors; it complains about truncations, I don't
understand.
- fix_auxv() isn't working, because there are apparently only 2 auxv
entries, neither of which are the ones we want. On x86 there are usually
17 AFAICT. Any ideas? Does anyone know a program that reads auxv entries
so I can check this is really happening? Currently, I'm just ignoring
this matter and proceeding anyway.
[Actually, the two entries are the added ones AT_UME_{PADFD,EXECFD}, which
makes me think that the original auxv is getting clobbered somehow. I
will investigate further.]
- When ume_go() is called to call stage2's main(), I get a seg fault, I
can't work out why. ume_go()'s x86 definition puzzles me:
/*
Jump to a particular EIP with a particular ESP. This is intended
to simulate the initial CPU state when the kernel starts a program
after exec; it therefore also clears all the other registers.
*/
void ume_go(addr_t eip, addr_t esp)
{
asm volatile ("movl %1, %%esp;" /* set esp */
"pushl %%eax;" /* push esp */
"xorl %%eax,%%eax;" /* clear registers */
"xorl %%ebx,%%ebx;"
"xorl %%ecx,%%ecx;"
"xorl %%edx,%%edx;"
"xorl %%esi,%%esi;"
"xorl %%edi,%%edi;"
"xorl %%ebp,%%ebp;"
"ret" /* return into entry */
: : "a" (eip), "r" (esp));
/* we should never get here */
for(;;)
asm volatile("ud2");
}
(I have of course changed this for x86-64.)
I can't see how %eip is set by this -- I don't understand inline asm
very well but there's no mention of %0, and what's the "pushl %eax" for?
Is eip being passed in register %eax? Surely not without
__attribute__((regparm(n)))?
- I find the whole startup (up until stage2's main()) sequence very
confusing. Several different files, several different headers, no use of
the VG_() macro so it's hard to see what's exported and what's local, very
few comments despite it doing some decidedly tricky stuff... hmm.
All ideas welcome. I could try tarballing up the mess I've got in case
anyone else wants a go. Thanks.
N
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-25 22:17:07
|
On Sun, 25 Jul 2004, Nicholas Nethercote wrote:
> - fix_auxv() isn't working, because there are apparently only 2 auxv entries,
> neither of which are the ones we want. On x86 there are usually 17 AFAICT.
> Any ideas? Does anyone know a program that reads auxv entries so I can check
> this is really happening? Currently, I'm just ignoring this matter and
> proceeding anyway.
>
> [Actually, the two entries are the added ones AT_UME_{PADFD,EXECFD}, which
> makes me think that the original auxv is getting clobbered somehow. I will
> investigate further.]
Ok, worked this out -- auxv wasn't being found properly because I hadn't
changed the size of the auxv type for 64-bit, and also I wasn't accounting
for argc/argv all being 64 bits wide as well. It now gets quite a bit further
before crashing :) My other questions still stand, though...
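For anyone wanting to check the same thing, here is a minimal sketch of locating auxv on the initial stack when every slot, argc included, is one machine word wide. The names word_auxv_t, find_auxv and count_auxv are made up for illustration and are not Valgrind's; the layout assumed is the usual Linux one (argc, argv[], NULL, envp[], NULL, auxv pairs ending in a type-0 entry).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ELF auxv entry: both fields are one
   machine word (4 bytes on x86, 8 bytes on x86-64). */
typedef struct {
    uintptr_t a_type;
    uintptr_t a_val;
} word_auxv_t;

#define TOY_AT_NULL 0   /* terminating entry type, same value as AT_NULL */

/* sp points at the argc slot of the initial stack. */
static const word_auxv_t *find_auxv(const uintptr_t *sp)
{
    assert(sp != NULL);
    uintptr_t argc = sp[0];                  /* a full word, not an int */
    const uintptr_t *p = sp + 1 + argc + 1;  /* skip argc, argv[], NULL */
    while (*p != 0)                          /* skip envp[] */
        p++;
    p++;                                     /* skip envp's NULL */
    return (const word_auxv_t *)p;
}

/* Number of entries before the terminating TOY_AT_NULL entry. */
static size_t count_auxv(const word_auxv_t *av)
{
    size_t n = 0;
    while (av[n].a_type != TOY_AT_NULL)
        n++;
    return n;
}
```

Stepping by uintptr_t rather than int is the whole point: with 4-byte steps on a 64-bit stack the walk stops at the wrong place and finds the wrong entries.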
N
|
|
From: Jeremy F. <je...@go...> - 2004-07-26 02:51:24
|
On Sun, 2004-07-25 at 22:40 +0100, Nicholas Nethercote wrote:
> Now I'm fighting through the startup stuff. I'm proceeding until
> something breaks, then I try to fix it. Or ignore it, if it doesn't look
> too dangerous. Here's what I've done so far, perhaps people can assist...
>
> - I noticed that the s/start/_ume_entry in stage2.lds is unnecessary since
> the -Wl,-e,_ume_entry option is passed to stage2. No matter.
>
> - I changed kickstart_base to 0x70000000; having it at 0xb0000000 I get
> lots of weird linker errors; it complains about truncations, I don't
> understand.
Are you generating a x86-32 or -64 executable? If the former, surely
you'd be using the same binutils as on a plain x86-32 machine?
For 64-bit processes, we're going to have some interesting issues here.
Ideally we should be able to run x86-32 programs under Valgrind with the
process in a 64-bit configuration, so that Valgrind can exist entirely
outside the 4G address space, leaving the whole thing for client use.
Unfortunately the x86-64 tools don't support generating code with 64-bit
relocations, so all code must be below 4G (I think including .so's).
But at least we can put all the heap/shadow/etc above 4G.
But at the very least, since we have a full 4G address space to use, we
don't need to make any consideration for the kernel portion (ie, we can
put Valgrind up at e8000000 or something).
I think ultimately, for a full 64-bit process version, we're going to
have to use another address space configuration. I'm thinking of
putting the code down very low (below the client executable, say from
4M-16M), and sticking all the data up very high (~512Gbyte, or
something). That way the client gets a large address space in between
to play with.
> - fix_auxv() isn't working, because there are apparently only 2 auxv
> entries, neither of which are the ones we want. On x86 there are usually
> 17 AFAICT. Any ideas? Does anyone know a program that reads auxv entries
> so I can check this is really happening? Currently, I'm just ignoring
> this matter and proceeding anyway.
Strange. I suspect there's some dodgy code which assumes sizeof(int) ==
sizeof(void *) in there.
> [Actually, the two entries are the added ones AT_UME_{PADFD,EXECFD}, which
> makes me think that the original auxv is getting clobbered somehow. I
> will investigate further.]
>
> - When ume_go() is called to call stage2's main(), I get a seg fault, I
> can't work out why. ume_go()'s x86 definition puzzles me:
>
> /*
> Jump to a particular EIP with a particular ESP. This is intended
> to simulate the initial CPU state when the kernel starts a program
> after exec; it therefore also clears all the other registers.
> */
> void ume_go(addr_t eip, addr_t esp)
> {
> asm volatile ("movl %1, %%esp;" /* set esp */
> "pushl %%eax;" /* push esp */
> "xorl %%eax,%%eax;" /* clear registers */
> "xorl %%ebx,%%ebx;"
> "xorl %%ecx,%%ecx;"
> "xorl %%edx,%%edx;"
> "xorl %%esi,%%esi;"
> "xorl %%edi,%%edi;"
> "xorl %%ebp,%%ebp;"
>
> "ret" /* return into entry */
> : : "a" (eip), "r" (esp));
> /* we should never get here */
> for(;;)
> asm volatile("ud2");
> }
>
> (I have of course changed this for x86-64.)
>
> I can't see how %eip is set by this -- I don't understand inline asm
> very well but there's no mention of %0, and what's the "pushl %eax" for?
> Is eip being passed in register %eax? Surely not without
> __attribute__((regparm(n)))?
Well, "a" (eip) sets %eax to equal eip. The push puts it on the stack,
and the ret jumps to *(esp) - ie, eip. The comment is wrong: it should
be
"pushl %%eax;" /* push eip */
> - I find the whole startup (up until stage2's main()) sequence very
> confusing. Several different files, several different headers, no use of
> the VG_() macro so it's hard to see what's exported and what's local, very
> few comments despite it doing some decidedly tricky stuff... hmm.
Sorry, my bad. The ume stuff is somewhat standalone, though there isn't
much point. Though ume is used in two distinct ways (to load stage2 and
to load the client).
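As for ume_go() itself, the push/ret sequence explained earlier in this message can be modelled in plain C. This is only a toy sketch: toy_cpu, toy_push, toy_ret and toy_ume_go are invented names, the stack is a small word array, and esp is a word index rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_STACK_WORDS 16

/* Toy CPU state: just the registers the sequence touches. */
typedef struct {
    uint32_t eip;
    uint32_t esp;                     /* index into stack[], in words */
    uint32_t eax;
    uint32_t stack[TOY_STACK_WORDS];
} toy_cpu;

/* pushl: move the stack pointer down one word, then store. */
static void toy_push(toy_cpu *c, uint32_t v)
{
    assert(c->esp > 0 && c->esp <= TOY_STACK_WORDS);
    c->esp--;
    c->stack[c->esp] = v;
}

/* ret: pop the word at the top of the stack into eip. */
static void toy_ret(toy_cpu *c)
{
    assert(c->esp < TOY_STACK_WORDS);
    c->eip = c->stack[c->esp];
    c->esp++;
}

static void toy_ume_go(toy_cpu *c, uint32_t eip, uint32_t esp)
{
    c->eax = eip;          /* the "a" (eip) input constraint */
    c->esp = esp;          /* movl %1, %%esp */
    toy_push(c, c->eax);   /* pushl %%eax: pushes eip, not esp */
    c->eax = 0;            /* xorl %%eax,%%eax (other registers likewise) */
    toy_ret(c);            /* ret: eip <- pushed value */
}
```

In this model the net effect is that eip holds the entry point, esp is back at the requested value, and the one word just below the new stack pointer has been overwritten.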
The VG_() stuff is somewhat redundant now, since nothing is "exported"
in the same way. That is, we don't need to do name mangling to prevent
namespace clashes with the client libraries. There's a bit of a policy
decision to be made here - how much should Valgrind keep implementing
for itself, and how much should we use libc stuff. A lot of the VG_()
functions are simple duplicates of libc.
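To make the name-mangling point concrete, here is a minimal sketch of a prefixing macro of this kind. The vgPlain_ prefix and the strlen duplicate are illustrative assumptions, not a copy of Valgrind's headers.

```c
#include <assert.h>
#include <stddef.h>

/* Paste a fixed prefix onto every core symbol so that it cannot clash
   with a same-named symbol in the client's libraries. */
#define VG_(name) vgPlain_##name

/* A libc duplicate: the linker sees vgPlain_strlen, not strlen. */
size_t VG_(strlen)(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

Callers write VG_(strlen)("...") and the preprocessor turns that into a call to vgPlain_strlen. If the client and Valgrind no longer share a symbol namespace, the prefix prevents nothing and only serves as a visual marker.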
J
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-26 09:35:12
|
On Sun, 25 Jul 2004, Jeremy Fitzhardinge wrote:
> Are you generating a x86-32 or -64 executable? If the former, surely
> you'd be using the same binutils as on a plain x86-32 machine?
x86-64. I'm aiming just at running 64-bit code first, then I'll worry
about 32-bit code. I'm not sure how much mixing of 32-bit and 64-bit code
is possible, that could become interesting.
> For 64-bit processes, we're going to have some interesting issues here.
> Ideally we should be able to run x86-32 programs under Valgrind with the
> process in a 64-bit configuration, so that Valgrind can exist entirely
> outside the 4G address space, leaving the whole thing for client use.
> Unfortunately the x86-64 tools don't support generating code with 64-bit
> relocations, so all code must be below 4G (I think including .so's).
> But at least we can put all the heap/shadow/etc above 4G.
>
> But at the very least, since we have a full 4G address space to use, we
> don't need to make any consideration for the kernel portion (ie, we can
> put Valgrind up at e8000000 or something.
>
> I think ultimately, for a full 64-bit process version, we're going to
> have to use another address space configuration. I'm thinking of
> putting the code down very low (below the client executable, say from
> 4M-16M), and sticking all the data up very high (~512Gbyte, or
> something). That way the client gets a large address space in between
> to play with.
On x86-64, 64-bit executable get mapped to 0x400000 -- that's only 4MB
above zero. 32-bit executables get mapped to 0x8048000 as on x86 but
libraries get loaded very low, eg. 0x552000. So going below the
executable doesn't look feasible.
>> - fix_auxv() isn't working, because there are apparently only 2 auxv
>
> Strange. I suspect there's some dodgy code which assumes sizeof(int) ==
> sizeof(void *) in there.
Yes, I changed that and it's working now. sizeof(argc) is actually
8-bytes; I'm seeing that in a few places, eg. syscalls, where things that
are ints in C are actually 8-bytes... it's a bit confusing.
> The VG_() stuff is somewhat redundant now, since nothing is "exported"
> in the same way. That is, we don't need to do name mangling to prevent
> namespace clashes with the client libraries.
Why not? I like the VG_() anyway as it marks the exported functions
clearly.
N
|
|
From: Jeremy F. <je...@go...> - 2004-07-26 19:49:34
|
On Mon, 2004-07-26 at 10:35 +0100, Nicholas Nethercote wrote:
> On Sun, 25 Jul 2004, Jeremy Fitzhardinge wrote:
>
> > Are you generating a x86-32 or -64 executable? If the former, surely
> > you'd be using the same binutils as on a plain x86-32 machine?
>
> x86-64. I'm aiming just at running 64-bit code first, then I'll worry
> about 32-bit code. I'm not sure how much mixing of 32-bit and 64-bit code
> is possible, that could become interesting.
Well, I think there's definite benefits in running 32-bit code in 64-bit
mode if available, because of all the extra address space and registers.
> On x86-64, 64-bit executable get mapped to 0x400000 -- that's only 4MB
> above zero. 32-bit executables get mapped to 0x8048000 as on x86 but
> libraries get loaded very low, eg. 0x552000. So going below the
> executable doesn't look feasible.
I hadn't realized it was that low. But 4M is a lot of code - we can
easily fit stage2, a tool and libc into 4M (especially if we change all
the large static tables in core, memcheck and addrcheck to allocated
memory).
> > The VG_() stuff is somewhat redundant now, since nothing is "exported"
> > in the same way. That is, we don't need to do name mangling to prevent
> > namespace clashes with the client libraries.
>
> Why not? I like the VG_() anyway as it marks the exported functions
> clearly.
Well, with FV the client and Valgrind don't share a dynamic linker, so
their symbols are in entirely different namespaces. The VG_() is from
the LD_PRELOAD days to stop Valgrind's symbol names from overlapping
with the clients.
By "export", what do you mean? Exported to whom? Mostly they're
internal; some are available for tool use, but there's no distinction
made in the naming - it just depends on which header file they're
declared in.
J
|
|
From: Bryan O'S. <bo...@se...> - 2004-07-26 20:51:26
|
On Sun, 2004-07-25 at 19:51 -0700, Jeremy Fitzhardinge wrote:
> On Sun, 2004-07-25 at 22:40 +0100, Nicholas Nethercote wrote:
> > Now I'm fighting through the startup stuff. I'm proceeding until
> > something breaks, then I try to fix it. Or ignore it, if it doesn't look
> > too dangerous. Here's what I've done so far, perhaps people can assist...
> >
> > - I noticed that the s/start/_ume_entry in stage2.lds is unnecessary since
> > the -Wl,-e,_ume_entry option is passed to stage2. No matter.
> >
> > - I changed kickstart_base to 0x70000000; having it at 0xb0000000 I get
> > lots of weird linker errors; it complains about truncations, I don't
> > understand.
>
> Are you generating a x86-32 or -64 executable? If the former, surely
> you'd be using the same binutils as on a plain x86-32 machine?
You're getting relocation errors because under the small x86_64 code
model, relocations are signed 32-bit quantities.
> Unfortunately the x86-64 tools don't support generating code with 64-bit
> relocations, so all code must be below 4G (I think including .so's).
They do support 64-bit relocations; it's just not the default. The
issue is that all the system shobjs are compiled with the small code
model, so you can't mix and match code that has 32-bit relocations with
code that doesn't.
<b
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-26 21:10:22
|
On Mon, 26 Jul 2004, Bryan O'Sullivan wrote:
>>> - I changed kickstart_base to 0x70000000; having it at 0xb0000000 I get
>>> lots of weird linker errors; it complains about truncations, I don't
>>> understand.
>
> You're getting relocation errors because under the small x86_64 code
> model, relocations are signed 32-bit quantities.
Ah, that makes sense; that's why I couldn't go above 0x80000000. Is there
any way around it?
N
|
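The 0x80000000 limit can be checked with a few lines of C. This is a sketch that assumes the failing relocations are the sign-extended 32-bit kind described above; fits_signed32 and encode_rel32s are hypothetical helper names, not part of binutils or Valgrind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr can be stored in a 4-byte field that the CPU
   sign-extends back to 64 bits without changing its value. */
static bool fits_signed32(uint64_t addr)
{
    int64_t v = (int64_t)addr;
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* The value that would be stored in the 4-byte relocation field. */
static int32_t encode_rel32s(uint64_t addr)
{
    assert(fits_signed32(addr));   /* otherwise the linker truncates */
    return (int32_t)(int64_t)addr;
}
```

Under that assumption 0x70000000 fits, while 0xb0000000 has bit 31 set and would come back sign-extended as 0xffffffffb0000000, which is the kind of truncation the linker reports.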
|
From: Jeremy F. <je...@go...> - 2004-07-26 21:55:45
|
On Mon, 2004-07-26 at 13:51 -0700, Bryan O'Sullivan wrote:
> They do support 64-bit relocations; it's just not the default. The
> issue is that all the system shobjs are compiled with the small code
> model, so you can't mix and match code that has 32-bit relocations with
> code that doesn't.
I thought the tools only support small and medium, but not large? If we
can stash all the Valgrind code below the main executable, small should
be OK, right? Pointers are still 64-bit, so we can malloc stuff high?
J
|