|
From: Nicholas N. <nj...@ca...> - 2004-08-19 14:20:05
|
Hi,
I'm trying to get my head around system calls and their wrappers, kernel
headers, and the difference between kernel types and glibc types. This is
in relation to the Opteron port, and indeed all future $ARCH/Linux ports.
Here's what I understand to be correct:
1. Many glibc and kernel types differ. For the syscall wrappers in
vg_syscalls.c, and for the syscalls called directly from various functions
in vg_mylibc.c, we need to use the kernel types. When using glibc
functions (eg. in ume.c) we need to use the glibc ones.
The fact that we use a lot of glibc types by importing in vg_unsafe.h is
quite bogus, and we're getting away with it only because the glibc and
kernel types are often the same. (And it's likely that there are some
bugs hidden in there because of this, eg. getting sizes wrong in
pre_mem_read, things like that.)
[Nb: why do glibc and the kernel have different types? Is this so glibc
can have more control, and change the types if it wants?]
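To make the size-mismatch risk in point 1 concrete, here is a minimal sketch comparing glibc's sigset_t with a kernel-style definition. The name vki_sigset_t follows the proposed "vki_" prefix, but the definition and the constant VKI_NSIG = 64 are assumptions modelled on the i386/x86_64 kernel headers, not copied from vg_kerneliface.h. The two helper functions are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* make sigset_t visible under strict ISO C modes */
#include <signal.h>
#include <stddef.h>

/* Assumption: the kernel tracks 64 signals, as on i386 and x86_64 Linux. */
#define VKI_NSIG        64
#define VKI_NSIG_BPW    (8 * sizeof(unsigned long))   /* bits per word */
#define VKI_NSIG_WORDS  (VKI_NSIG / VKI_NSIG_BPW)

/* Kernel-style signal set: just enough words to hold VKI_NSIG bits. */
typedef struct {
    unsigned long sig[VKI_NSIG_WORDS];
} vki_sigset_t;

/* Size that a wrapper would use if it borrowed the glibc type. */
size_t glibc_sigset_size(void)  { return sizeof(sigset_t); }

/* Size that the kernel actually reads or writes (under the assumption above). */
size_t kernel_sigset_size(void) { return sizeof(vki_sigset_t); }
```

With glibc the first size is typically much larger than the second, so a wrapper that passed sizeof(sigset_t) to pre_mem_read would check more memory than the kernel touches.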
2. Therefore we should get rid of vg_unsafe.h, and instead put our own
versions of the kernel types (prefixed with "vki_") in vg_kerneliface.h.
In fact, at least some of the #includes in vg_unsafe.h aren't necessary,
at least on my system -- eg. <sys/mman.h>.
3. A lot of the types in vg_kerneliface.h don't have the "vki_" prefix,
but they almost certainly should, because otherwise it's inconsistent and
becomes unclear what types are in use.
4. Recent versions of the Linux source (eg. 2.6.3 doesn't have it, 2.6.7
does have it) have a file include/linux/syscalls.h with a big list of
syscall prototypes, eg:
    asmlinkage long sys_open(const char __user *filename,
                             int flags, int mode);
Note that this is different to the glibc open(), which returns an 'int'.
Does anyone know about this? It seems pretty definitive. Most syscalls
are in there (268 of about 284), and for those that are, I think the
prototypes are correct for all platforms. (Although note that the meaning
of the types within the prototypes varies from platform to platform.)
However, despite the seeming genericity, some of those pass arguments in
strange ways -- eg. select() on i386 uses the strange args-in-memory-block
method.
Then include/asm/unistd.h seems to have all the arch-specific ones, eg:
  asm-i386/unistd.h:
    asmlinkage int sys_clone(struct pt_regs regs);
  asm-x86_64/unistd.h:
    asmlinkage long sys_clone(unsigned long clone_flags, unsigned long newsp,
                              void *parent_tid, void *child_tid,
                              struct pt_regs regs);
I think these prototypes are what should be in the comments for each PRE()
wrapper, as opposed to the ones from the man pages that are mostly
currently in there.
Also, I think we should:
- have generic wrappers for the generic syscalls (eg. open())
- have per-arch wrappers for the non-generic syscalls (eg. clone())
- not sure about the generic-but-weird ones (eg. select())
I'm imagining changing things so that each syscall is specified with a
macro like this:
    SYS(long, open, const char*, filename, int, flags, int, mode)
and that as much as possible is derived from that -- eg. the printing for
--trace-syscalls=yes, the checking of args (which I've postponed for the
moment because there are complications), everything except the sys_flags
and the PRE()/POST() wrapper actions.
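One way the single-specification idea could work is the "X-macro" pattern, sketched below. This is only an illustration of the technique, not a proposed patch: SYSCALL_LIST, SyscallDesc, find_syscall() and format_syscall_trace() are all hypothetical names, only three-argument syscalls are handled, and the read entry is added purely to give the table a second row.

```c
#include <stdio.h>
#include <string.h>
#include <stddef.h>

/* One SYS() line per syscall; everything below is derived from this list. */
#define SYSCALL_LIST(SYS) \
    SYS(long,    open, const char *, filename, int,    flags, int,    mode) \
    SYS(ssize_t, read, unsigned int, fd,       char *, buf,   size_t, count)

/* Expansion 1: an index constant for each syscall. */
#define SYS_AS_INDEX(ret, name, t1, a1, t2, a2, t3, a3) SYSIDX_##name,
enum { SYSCALL_LIST(SYS_AS_INDEX) SYSIDX_COUNT };

/* Expansion 2: a descriptor holding the stringified prototype. */
typedef struct {
    const char *ret_type;
    const char *name;
    const char *arg_type[3];
    const char *arg_name[3];
} SyscallDesc;

#define SYS_AS_DESC(ret, name, t1, a1, t2, a2, t3, a3) \
    { #ret, #name, { #t1, #t2, #t3 }, { #a1, #a2, #a3 } },
static const SyscallDesc syscall_table[SYSIDX_COUNT] = {
    SYSCALL_LIST(SYS_AS_DESC)
};

/* Look a syscall up by name; returns its index, or -1 if it is not listed. */
int find_syscall(const char *name)
{
    for (int i = 0; i < SYSIDX_COUNT; i++)
        if (strcmp(syscall_table[i].name, name) == 0)
            return i;
    return -1;
}

/* Format a trace line from the derived argument names and raw register
   values. Returns the snprintf() result, or -1 for an invalid index. */
int format_syscall_trace(int idx, const unsigned long args[3],
                         char *out, size_t outlen)
{
    if (idx < 0 || idx >= SYSIDX_COUNT)
        return -1;
    const SyscallDesc *d = &syscall_table[idx];
    return snprintf(out, outlen, "sys_%s(%s=0x%lx, %s=0x%lx, %s=0x%lx)",
                    d->name,
                    d->arg_name[0], args[0],
                    d->arg_name[1], args[1],
                    d->arg_name[2], args[2]);
}
```

The same list could be expanded a third time to generate argument checks, which is the part postponed above. Syscalls with other argument counts would need either one macro per arity or a variadic scheme.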
---
I'm happy to do (2) and (3) above now, if people agree they're the right
thing to do. I will do (4) when working on the Opteron port if that is
the right thing. Please let me know if I've got anything wrong, or am
missing anything important.
Thanks.
N
|
|
From: Paul M. <pa...@sa...> - 2004-08-20 01:10:10
|
Nicholas Nethercote writes:

> Here's what I understand to be correct:
>
> 1. Many glibc and kernel types differ. For the syscall wrappers in
> vg_syscalls.c, and for the syscalls called directly from various functions
> in vg_mylibc.c, we need to use the kernel types. When using glibc
> functions (eg. in ume.c) we need to use the glibc ones.

Yes.

> [Nb: why do glibc and the kernel have different types? Is this so glibc
> can have more control, and change the types if it wants?]

It's partly for backwards compatibility, I think, and maybe partly for
POSIX compliance. Once a difference arises, it tends to persist. Also,
the glibc people decided that they needed 64-bit dev_t but Linus vetoed
that for the kernel as being just too enormously greater than what was
actually needed.

> 2. Therefore we should get rid of vg_unsafe.h, and instead put our own
> versions of the kernel types (prefixed with "vki_") in vg_kerneliface.h.

We will still be doing some glibc calls, won't we, though?

> 4. Recent versions of the Linux source (eg. 2.6.3 doesn't have it, 2.6.7
> does have it) have a file include/linux/syscalls.h with a big list of
> syscall prototypes, eg:
>
> asmlinkage long sys_open(const char __user *filename,
> int flags, int mode);
>
> Note that this is different to the glibc open(), which returns an 'int'.
>
> Does anyone know about this? It seems pretty definitive. Most syscalls

That is a list of prototypes of the internal functions that implement
the various system calls. It looks like all system calls get vectored
directly to the corresponding sys_blah() routine on x86 (except for a
few that call old_blah() routines), but that is not true on other
platforms; for example on ppc we intercept a few by vectoring them to
a ppc_blah() function which then generally calls sys_blah(). The
prototype would generally be the same though.

> However, despite the seeming genericity, some of those pass arguments in
> strange ways -- eg. select() on i386 uses the strange args-in-memory-block
> method.

On x86, the select system call invokes old_select(), which copies in
the arguments and calls sys_select().

> Then include/asm/unistd.h seems to have all the arch-specific ones, eg:
>
> asm-i386/unistd.h:
> asmlinkage int sys_clone(struct pt_regs regs);

Note that the regs are passed by value! The regs parameter here is
just the exception frame established by the system call entry, not
anything the user program has to provide.

> Also, I think we should:
> - have generic wrappers for the generic syscalls (eg. open())
> - have per-arch wrappers for the non-generic syscalls (eg. clone())
> - not sure about the generic-but-weird ones (eg. select())

Select isn't entirely generic: x86 passes the args in a block in
memory but ppc passes them in the normal fashion.

> I'm happy to do (2) and (3) above now, if people agree they're the right
> thing to do. I will do (4) when working on the Opteron port if that is
> the right thing. Please let me know if I've got anything wrong, or am
> missing anything important.

I have a bunch of changes that I would like to see go in that start
abstracting things so that PPC will fit in. The first change is to
move the machine-dependent fields of ThreadState into a single `arch'
member whose type is defined in an arch-dependent header. I'll send a
patch out shortly.

Paul.
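The old_select() path described in this message can be modelled in user space roughly as follows. This is a sketch, not kernel code: the layout of sel_arg_block is an assumption based on the i386 argument block, pointers are simplified to void *, and model_old_select() and model_sys_select() are hypothetical stand-ins for the real entry points.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed layout of the single memory block an i386 caller passes. */
typedef struct {
    unsigned long n;
    void *inp, *outp, *exp;   /* fd_set pointers in the real interface */
    void *tvp;                /* struct timeval pointer in the real interface */
} sel_arg_block;

/* Stand-in for the generic five-argument entry point. For illustration it
   just echoes n back so the caller can see the arguments arrived. */
long model_sys_select(unsigned long n, void *inp, void *outp,
                      void *exp, void *tvp)
{
    (void)inp; (void)outp; (void)exp; (void)tvp;
    return (long)n;
}

/* x86-style entry: one pointer argument; copy the block in, then forward. */
long model_old_select(const sel_arg_block *arg)
{
    if (arg == NULL)
        return -EFAULT;          /* models a failed copy from user space */
    sel_arg_block a = *arg;      /* models copying the block in */
    return model_sys_select(a.n, a.inp, a.outp, a.exp, a.tvp);
}
```

For a syscall wrapper, the consequence is that the x86 version has to validate the block itself before it can look at the five values inside, while a ppc version would see the five arguments directly.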
|
From: Nicholas N. <nj...@ca...> - 2004-08-20 13:22:04
|
On Fri, 20 Aug 2004, Paul Mackerras wrote:

>> 2. Therefore we should get rid of vg_unsafe.h, and instead put our own
>> versions of the kernel types (prefixed with "vki_") in vg_kerneliface.h.
>
> We will still be doing some glibc calls, won't we, though?

Yes. Eg. stage1.c and ume.c use glibc functions, as does vg_main.c. But
for the syscall wrappers, and when we call syscalls directly in
vg_mylibc.c, we're dealing directly with the kernel.

>> 4. Recent versions of the Linux source (eg. 2.6.3 doesn't have it, 2.6.7
>> does have it) have a file include/linux/syscalls.h with a big list of
>> syscall prototypes, eg:
>>
>> asmlinkage long sys_open(const char __user *filename,
>> int flags, int mode);
>
> That is a list of prototypes of the internal functions that implement
> the various system calls. It looks like all system calls get vectored
> directly to the corresponding sys_blah() routine on x86 (except for a
> few that call old_blah() routines), but that is not true on other
> platforms; for example on ppc we intercept a few by vectoring them to
> a ppc_blah() function which then generally calls sys_blah().

Ok. I'm interested in the general case for all architectures. Basically
I want to know which Linux syscall wrappers can be handled with the same
code on all architectures, ie. go in common-linux/ (or whatever it gets
called), and which syscall wrappers must go in x86-linux/ or ppc-linux/
or wherever. I'm hoping that most can go in common-linux/ to reduce the
amount of code.

> The prototype would generally be the same though.

Only generally the same? Since syscalls are the main way that user
programs interact with the kernel, I would expect that there is,
somewhere, a way of determining exactly what the interface between them
is. I was hoping include/linux/syscalls.h might have served as that
official interface, but it seems like this is not quite the case? If
not, how/where is this interface officially defined? Is it just
implicitly scattered throughout the kernel?

> Select isn't entirely generic, x86 passes the args in a block in
> memory but ppc passes them in the normal fashion.

Is there a way of knowing which syscalls are entirely generic and which
ones have arch-specific aspects?

> I have a bunch of changes that I would like to see go in that start
> abstracting things so that PPC will fit in. The first change is to
> move the machine-dependent fields of ThreadState into a single `arch'
> member whose type is defined in an arch-dependent header. I'll send a
> patch out shortly.

Yes, it makes sense to get the directory structure right first, as
discussed in my reply to your patch message.

N
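For reference, the `arch' member idea quoted above might look something like the sketch below. Everything here is an assumption made for illustration: the register fields, the VG_ARCH_PPC switch, the ThreadArchState name and the accessor functions are hypothetical, and a real tree would put each ThreadArchState definition in its own arch-dependent header rather than behind an #if.

```c
#include <stdint.h>
#include <string.h>

#if defined(VG_ARCH_PPC)   /* hypothetical build switch */
/* Would live in a ppc-specific header. */
typedef struct {
    uint32_t gpr[32];
    uint32_t nip;          /* next instruction pointer */
} ThreadArchState;
#define ARCH_PC(a) ((a)->nip)
#else
/* Would live in an x86-specific header. */
typedef struct {
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp, esp;
    uint32_t eip;
} ThreadArchState;
#define ARCH_PC(a) ((a)->eip)
#endif

/* Machine-independent part: all arch-specific state hides behind `arch'. */
typedef struct {
    int             tid;
    ThreadArchState arch;
} ThreadState;

/* Zero the arch state and record the thread id. */
void thread_init(ThreadState *ts, int tid)
{
    memset(&ts->arch, 0, sizeof ts->arch);
    ts->tid = tid;
}

/* Generic code reads and writes the program counter via these accessors. */
unsigned long arch_get_pc(const ThreadArchState *a) { return ARCH_PC(a); }
void arch_set_pc(ThreadArchState *a, unsigned long pc) { ARCH_PC(a) = (uint32_t)pc; }
```

Generic code then touches only tid and the accessors, so moving it into a common directory does not drag register names along with it.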