|
From: Nicholas N. <nj...@ca...> - 2004-07-29 09:02:40
|
Hi,

I've got Opteron working up to the point where the first BB is being
translated. This means the entire startup procedure works, or at
least is un-broken enough to get that far; also the dispatch loop
must be working at least partially.

If anyone wants a look, grab

  www.cl.cam.ac.uk/~njn25/vg-opteron.tar.bz2

It's just a tarball of my workspace. It unpacks into a directory called
'head6'. Do the usual autogen.sh/configure/make sequence. I did a test,
and it worked for me, but it's possible something will go wrong on other
machines. I don't want to make a branch yet, it is too premature.

The code isn't very pretty. I'm taking the most direct route, just
commenting out all the x86/32-bit-specific code and replacing it with
x86-64/64-bit code. Each such place is marked with "OOO" (three
capital Os, not three zeroes). It prints a whole lot of debugging
output at startup.

If anyone wants to look at this seriously, I recommend reading at least
parts of the AMD64 ABI available at http://www.x86-64.org/documentation.

I'd love to hear feedback, and particularly bug fixes :)

Hmm, if a diff against the current CVS HEAD would be an easier way of
obtaining this, let me know and I'll put that up (although I'll need to
sync with the HEAD first).

N
|
|
From: Tom H. <th...@cy...> - 2004-07-29 09:23:03
|
In message <Pin...@he...>
Nicholas Nethercote <nj...@ca...> wrote:
> I've got Opteron working up to the point where the first BB is being
> translated. This means the entire startup procedure works, or at
> least is un-broken enough to get that far; also the dispatch loop
> must be working at least partially.
I just tried it on our box and it seems to get to about the same
point, then dies with a fatal signal delivered on wrong stack error.
> The code isn't very pretty. I'm taking the most direct route, just
> commenting out all the x86/32-bit-specific code and replacing it with
> x86-64/64-bit code. Each such place is marked with "OOO" (three
> capital Os, not three zeroes). It prints a whole lot of debugging
> output at startup.
There's one in vg_include.h that is OOö actually ;-)
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-29 22:43:28
|
On Thu, 29 Jul 2004, Jeremy Fitzhardinge wrote:
>> - Are the types all meant to match exactly the "real" types?
>>   vki_ksigaction doesn't match sigaction, for one. (I've only looked at
>>   a few so far.)
>
> They're only supposed to match the syscall interface types. Ignore
> anything in /usr/include (except /usr/include/(linux,asm)/), since that
> will just mislead you.

Ok... this is exactly the sort of thing I would really like to see
explained in comments. I'm not a kernel hacker and I find a lot of the
kernel-interface stuff, especially signals and threads, quite
bewildering.

Do you get your type definitions from /usr/include/linux and
/usr/include/asm? Actually, I guess you get them from
include/asm-whatever in the Linux source, right?

> The kernel interface is fixed for all time (well, it extends), so
> there's no serious problem with having our own copy of all this - like
> glibc does - but it does cause duplicated effort.

Ok, that makes me feel slightly better. But it's still not great.

> I think we should probably strictly use the stdint types in
> vg_kerneliface, so that we're explicit about the sizes of everything.

What are "stdint types"? You mean like 'int' and 'short' rather than our
'Int' and 'Short'? Yes, I agree keeping the definitions as close as
possible to the originals is a good idea.

N
|
|
From: Tom H. <th...@cy...> - 2004-07-29 22:53:37
|
In message <Pin...@he...>
Nicholas Nethercote <nj...@ca...> wrote:
> On Thu, 29 Jul 2004, Jeremy Fitzhardinge wrote:
>
> > I think we should probably strictly use the stdint types in
> > vg_kerneliface, so that we're explicit about the sizes of everything.
>
> What are "stdint types"? You mean like 'int' and 'short' rather than our
> 'Int' and 'Short'? Yes, I agree keeping the definitions as close as
> possible to the originals is a good idea.
I think he meant the types in the stdint.h header.
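
For anyone who has not met it, stdint.h is the C99 header that defines
exact-width integer types such as uint32_t and uint64_t. A rough sketch of
what using them would buy follows. The struct name and fields are
hypothetical (not taken from vg_kerneliface.h), and the compile-time checks
assume a C11 compiler:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface struct: every field states its own width, so the
   layout does not depend on what 'long' means on the build machine. */
struct demo_vki_record {
    uint32_t events;   /* always 4 bytes */
    uint32_t pad;      /* explicit padding, always 4 bytes */
    uint64_t data;     /* always 8 bytes */
};

/* Compile-time checks (C11): the build fails if the layout ever drifts. */
_Static_assert(sizeof(struct demo_vki_record) == 16, "unexpected size");
_Static_assert(offsetof(struct demo_vki_record, data) == 8,
               "unexpected offset");

/* Small accessor so the size can also be checked at run time. */
size_t demo_vki_record_size(void)
{
    return sizeof(struct demo_vki_record);
}
```

Note that fields the kernel itself declares as 'long' really do differ
between architectures, so a per-arch header would still be needed for those.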
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Paul M. <pa...@sa...> - 2004-08-01 01:27:18
|
Tom Hughes writes:
> Well the kernel using magic numbers is pretty screwy, yes ;-)

Why? It's not like the kernel could ever change to using different
numbers. I would consider 0, 1 and 2 to be well-known constants. :)

Paul.
|
|
From: Tom H. <th...@cy...> - 2004-08-01 07:34:52
|
In message <166...@ca...>
Paul Mackerras <pa...@sa...> wrote:
> Tom Hughes writes:
>
> > Well the kernel using magic numbers is pretty screwy, yes ;-)
>
> Why? It's not like the kernel could ever change to using different
> numbers. I would consider 0, 1 and 2 to be well-known constants. :)
It's still bad software engineering practice if only because it makes
the code harder to read - if I see a condition that tests for origin
being SEEK_CUR then I can immediately understand it. If I see a condition
testing for the number 1 then I have no idea what that means without
going and looking up details elsewhere.
Plus if somebody is writing the code to handle seeks in a new filesystem
what is the risk of them making a mistake with the numbers relative to
that of them making a mistake with named constants?
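
To make the readability point concrete, here is a small sketch. The DEMO_
names and the helper function are made up for illustration and are not the
kernel's code; only the values 0, 1 and 2 are the well-known lseek 'whence'
numbering:

```c
#include <stdint.h>

/* Named constants carrying the well-known 'whence' values 0, 1 and 2. */
enum { DEMO_SEEK_SET = 0, DEMO_SEEK_CUR = 1, DEMO_SEEK_END = 2 };

/* Hypothetical helper: compute the new file position for a seek request.
   Returns -1 if 'whence' is not one of the three known values. */
int64_t demo_seek_target(int64_t cur, int64_t size, int64_t offset,
                         int whence)
{
    switch (whence) {
    case DEMO_SEEK_SET: return offset;          /* from start of file */
    case DEMO_SEEK_CUR: return cur + offset;    /* from current position */
    case DEMO_SEEK_END: return size + offset;   /* from end of file */
    default:            return -1;              /* unknown origin */
    }
}
```

With bare 'case 0:', 'case 1:' and 'case 2:' labels the function would
behave identically, but each branch would need its comment to be understood.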
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-29 09:34:59
|
On Thu, 29 Jul 2004, Tom Hughes wrote:
>> I've got Opteron working up to the point where the first BB is being
>> translated. This means the entire startup procedure works, or at
>> least is un-broken enough to get that far; also the dispatch loop
>> must be working at least partially.
>
> I just tried it on our box and it seems to get to about the same
> point, then dies with a fatal signal delivered on wrong stack error.
That's what I get too, so at least that's consistent; thanks, Tom.
I think the generated code is doing something bad that causes the signal.
I haven't made any changes to the code generated other than accounting for
8-byte words in the baseBlock, so it's probably seg faulting somehow.
Doing vg_{to,from}_ucode.c is going to be tricky.
Unfortunately, the FATAL SIGNAL error is not the correct response -- when
I make x86-Valgrind throw a signal from generated code (by embedding a
'ud2' in there) it dies with a much nicer "Process terminating with
default action of signal 4 (SIGILL): dumping core". So something's screwy
with signals.
N
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-29 15:08:30
|
On Thu, 29 Jul 2004, Nicholas Nethercote wrote:
> Unfortunately, the FATAL SIGNAL error is not the correct response -- when I
> make x86-Valgrind throw a signal from generated code (by embedding a 'ud2' in
> there) it dies with a much nicer "Process terminating with default action of
> signal 4 (SIGILL): dumping core". So something's screwy with signals.
I fixed this; problem was that one of the signal structs (can't remember
which one, now) in vg_kerneliface.h was the wrong size -- one of the
fields is 32-bits on x86, but 64-bits on x86-64.
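
A minimal sketch of how this kind of mismatch arises, using hypothetical
structs (the struct in question is not named above, so these are stand-ins):
a field the kernel declares as 'unsigned long' is 4 bytes on x86 Linux but
8 bytes on x86-64 Linux, so a hand-copied definition that froze the 32-bit
width is silently the wrong size on the 64-bit target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel-side layout: 'flags' is an unsigned long, so it is
   32 bits wide on x86 and 64 bits wide on x86-64 Linux. */
struct demo_kernel_sigaction {
    void *handler;
    unsigned long flags;
};

/* Hypothetical hand-copied layout that hard-codes the x86 width. */
struct demo_copied_sigaction {
    void *handler;
    uint32_t flags;
};

/* Returns 1 if the copied field has the same width as the kernel-side
   field on the machine this is compiled for, 0 otherwise. */
int demo_flags_width_matches(void)
{
    return sizeof(((struct demo_kernel_sigaction *)0)->flags)
        == sizeof(((struct demo_copied_sigaction *)0)->flags);
}
```

On a 32-bit x86 build the function returns 1; on an x86-64 Linux build it
returns 0, which is the situation described above.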
Seeing this, I've been looking through vg_kerneliface.h for similar
examples. There are a few. This raises a few questions about this file
in general:
- I'm unclear about its exact use. It's there because we can't #include
the real headers, right? Can someone remind me why that is?
- Are the types all meant to match exactly the "real" types?
vki_ksigaction doesn't match sigaction, for one. (I've only looked at
a few so far.)
- Our definitions use a mix of built-in types (eg. unsigned long) and our
types (eg. ULong). This is ugly.
This whole file is a dirty hack. So is vg_unistd.h. Is there a better
way to do the same thing?
N
|
|
From: Paul M. <pa...@sa...> - 2004-07-29 16:21:54
|
Nicholas Nethercote writes:
> This whole file is a dirty hack. So is vg_unistd.h. Is there a better
> way to do the same thing?

I have been wondering the same thing, in the context of the PPC port
(yes I am still working on abstracting all the x86-specific stuff).

My best idea at the moment is a set of arch-specific directories under
include/, like we have under coregrind/. But we end up with an awful
lot of -I flags on the compile commands though. Anybody got a better
idea?

Paul.
|
|
From: Jeremy F. <je...@go...> - 2004-07-29 17:30:50
|
On Thu, 2004-07-29 at 11:20 -0500, Paul Mackerras wrote:
> I have been wondering the same thing, in the context of the PPC port
> (yes I am still working on abstracting all the x86-specific stuff).
>
> My best idea at the moment is a set of arch-specific directories under
> include/, like we have under coregrind/. But we end up with an awful
> lot of -I flags on the compile commands though. Anybody got a better
> idea?

We could copy the kernel build process and use a symlink.

J
|
|
From: Jeremy F. <je...@go...> - 2004-07-29 17:30:20
|
On Thu, 2004-07-29 at 16:08 +0100, Nicholas Nethercote wrote:
> I fixed this; problem was that one of the signal structs (can't remember
> which one, now) in vg_kerneliface.h was the wrong size -- one of the
> fields is 32-bits on x86, but 64-bits on x86-64.
>
> Seeing this, I've been looking through vg_kerneliface.h for similar
> examples. There are a few. This raises a few questions about this file
> in general:
>
> - I'm unclear about its exact use. It's there because we can't #include
>   the real headers, right? Can someone remind me why that is?

The libc headers are only vaguely related to the kernel interface
headers. We could include linux/* headers, but that could turn into a
pretty messy tarpit.

My plan was to have separate vg_kerneliface headers for each arch/OS
combination, since they are all separately defined.

> - Are the types all meant to match exactly the "real" types?
>   vki_ksigaction doesn't match sigaction, for one. (I've only looked at
>   a few so far.)

They're only supposed to match the syscall interface types. Ignore
anything in /usr/include (except /usr/include/(linux,asm)/), since that
will just mislead you.

> - Our definitions use a mix of built-in types (eg. unsigned long) and our
>   types (eg. ULong). This is ugly.
>
> This whole file is a dirty hack. So is vg_unistd.h. Is there a better
> way to do the same thing?

The kernel interface is fixed for all time (well, it extends), so
there's no serious problem with having our own copy of all this - like
glibc does - but it does cause duplicated effort.

I think we should probably strictly use the stdint types in
vg_kerneliface, so that we're explicit about the sizes of everything.

J
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-29 22:33:58
|
On Thu, 29 Jul 2004, Paul Mackerras wrote:
>> This whole file is a dirty hack. So is vg_unistd.h. Is there a better
>> way to do the same thing?
>
> I have been wondering the same thing, in the context of the PPC port
> (yes I am still working on abstracting all the x86-specific stuff).
>
> My best idea at the moment is a set of arch-specific directories under
> include/, like we have under coregrind/. But we end up with an awful
> lot of -I flags on the compile commands though. Anybody got a better
> idea?

There are two different issues here -- how to know about kernel
structures, and how to abstract out the arch/OS details.

For the latter, I think arch- and OS-specific directories are the right
way to go. That's how the Linux kernel does it, AFAICT. You'd need at
most three -I flags, ie:

  -Iarch/ -IOS/ -Iarch-OS/

eg:

  -Ix86/ -Ilinux/ -Ix86-linux/

for the arch-specific code, OS-specific code, and arch+OS specific code.

N
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-29 22:43:44
|
On Thu, 29 Jul 2004, Nicholas Nethercote wrote:
> eg:
>
>   -Ix86/ -Ilinux/ -Ix86-linux/

or symlinks, as Jeremy says.

N
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-29 22:44:36
|
On Thu, 29 Jul 2004, Jeremy Fitzhardinge wrote:
> The libc headers are only vaguely related to the kernel interface
> headers. We could include linux/* headers, but that could turn into a
> pretty messy tarpit.

Why is that a messy tarpit?

N
|
|
From: Jeremy F. <je...@go...> - 2004-07-30 19:12:30
|
On Thu, 2004-07-29 at 23:44 +0100, Nicholas Nethercote wrote:
> On Thu, 29 Jul 2004, Jeremy Fitzhardinge wrote:
>
> > The libc headers are only vaguely related to the kernel interface
> > headers. We could include linux/* headers, but that could turn into a
> > pretty messy tarpit.
>
> Why is that a messy tarpit?

Mostly because you can't mix kernel and libc headers. They define lots
of duplicates. That's why glibc has its own set of kernel headers.

J
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-30 10:45:07
|
On Thu, 29 Jul 2004, Bob Friesenhahn wrote:
>>> The libc headers are only vaguely related to the kernel interface
>>> headers. We could include linux/* headers, but that could turn into a
>>> pretty messy tarpit.
>>
>> Why is that a messy tarpit?
>
> Probably because the Linux kernel API is necessarily pretty stable but the
> kernel header files change with each release.

But if we're only using the API parts of the headers, as we are -- only
certain types and certain constants -- is that still a problem?

N
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-30 10:47:00
|
On Thu, 29 Jul 2004, Bob Friesenhahn wrote:
> Does it seem to you that valgrind would benefit if part of it was included in
> the Linux kernel and glibc environment? Maybe once Linux stabilizes a bit
> more it can offer a "valgrind" mode. Programs would run faster and valgrind
> wouldn't have to replicate so much functionality.

It's nice that you say "once Linux stabilizes"... I'd be more concerned
about Valgrind stabilising :)

Solaris 10 has dtrace in it, which has a lot of similar features to
Valgrind. But I can't see Valgrind and Linux merging any time soon. If
you ran the idea past a kernel developer, I'd be interested to hear
their reaction :)

N
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-30 13:28:40
|
On Thu, 29 Jul 2004, Jeremy Fitzhardinge wrote:
> They're only supposed to match the syscall interface types. Ignore
> anything in /usr/include (except /usr/include/(linux,asm)/), since that
> will just mislead you.
The constants VKI_SEEK_{SET,CUR,END} are in vg_kerneliface.h. I can't
find SEEK_{SET,CUR,END} in the linux-2.6.7 sources. I can find them in
/usr/include/fcntl.h.
Am I missing something obvious, or is this screwy?
N
|
|
From: Tom H. <th...@cy...> - 2004-07-30 13:40:22
|
In message <Pin...@he...>
Nicholas Nethercote <nj...@ca...> wrote:
> On Thu, 29 Jul 2004, Jeremy Fitzhardinge wrote:
>
>> They're only supposed to match the syscall interface types. Ignore
>> anything in /usr/include (except /usr/include/(linux,asm)/), since that
>> will just mislead you.
>
> The constants VKI_SEEK_{SET,CUR,END} are in vg_kerneliface.h. I can't
> find SEEK_{SET,CUR,END} in the linux-2.6.7 sources. I can find them
> in /usr/include/fcntl.h.
It looks like the kernel is using magic numbers instead of named
constants... Look at the llseek routines in fs/read_write.c
for the sheer horrible truth.
> Am I missing something obvious, or is this screwy?
Well the kernel using magic numbers is pretty screwy, yes ;-)
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-30 16:11:23
|
On Thu, 29 Jul 2004, Jeremy Fitzhardinge wrote:
> They're only supposed to match the syscall interface types. Ignore
> anything in /usr/include (except /usr/include/(linux,asm)/), since that
> will just mislead you.

Looks like 'struct vki_epoll_event' is really the glibc definition,
right?

N
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-30 16:54:45
|
On Fri, 30 Jul 2004, Nicholas Nethercote wrote:
>> They're only supposed to match the syscall interface types. Ignore
>> anything in /usr/include (except /usr/include/(linux,asm)/), since that
>> will just mislead you.
>
> Looks like 'struct vki_epoll_event' is really the glibc definition, right?

And what about vg_unsafe.h? Its presence means that a whole lot of the
types used in vg_syscalls.c are really the glibc versions, right? This
stuff seems like a giant mess...

N
|
|
From: Jeremy F. <je...@go...> - 2004-07-30 19:14:51
|
On Fri, 2004-07-30 at 17:54 +0100, Nicholas Nethercote wrote:
> On Fri, 30 Jul 2004, Nicholas Nethercote wrote:
>
> >> They're only supposed to match the syscall interface types. Ignore
> >> anything in /usr/include (except /usr/include/(linux,asm)/), since that
> >> will just mislead you.
> >
> > Looks like 'struct vki_epoll_event' is really the glibc definition, right?
>
> And what about vg_unsafe.h? Its presence means that a whole lot of the
> types used in vg_syscalls.c are really the glibc versions, right? This
> stuff seems like a giant mess...

Yup. vg_unsafe really is unsafe - we shouldn't use it at all (I've been
actively incrementally removing things from it and putting copies into
vg_kerneliface, and certainly not adding anything new to it).

Of course, some things we're dealing with *are* glibc types (stuff
around malloc, and other intercepted glibc routines, etc), so it's
appropriate to use the glibc definitions there.

That too requires care, because we're not paying attention to symbol
versioning, and glibc uses it extensively to handle
backwards-compatibility.

J
|
|
From: Nicholas N. <nj...@ca...> - 2004-07-30 21:36:54
|
On Fri, 30 Jul 2004, Jeremy Fitzhardinge wrote:
>> And what about vg_unsafe.h? Its presence means that a whole lot of the
>> types used in vg_syscalls.c are really the glibc versions, right? This
>> stuff seems like a giant mess...
>
> Yup. vg_unsafe really is unsafe - we shouldn't use it at all (I've been
> actively incrementally removing things from it and putting copies into
> vg_kerneliface, and certainly not adding anything new to it).
>
> Of course, some things we're dealing with *are* glibc types (stuff
> around malloc, and other intercepted glibc routines, etc), so it's
> appropriate to use the glibc definitions there.

Sure. But vg_unsafe.h is only used by vg_syscalls.c for providing types
for talking to syscalls. So it should die a miserable death.

> That too requires care, because we're not paying attention to symbol
> versioning, and glibc uses it extensively to handle
> backwards-compatibility.

Oh goody, one more thing to get wrong... is this likely to be a problem
in practice for us?

N
|