|
From: Tom H. <to...@co...> - 2005-03-29 13:53:09
|
A number of routines in vg_mylibc.c are simple wrappers around system calls which means that they are OS and/or platform specific and shouldn't be in the core code. In fact a number of them are currently disabled on amd64 because the system call they need to make is different there to x86-linux. One obvious answer would be to move them into the linux or x86-linux/amd64-linux directories as appropriate. They would to keep the VG_() designation though as it would be OS dependent whether they needed to be in the OS or platform directory and therefore have a VGO_() or VGP() designation. The only one that have been moved so far seems to be mmap which was done by creating a VGP_DO_MMAP macro although that is also used in a number of other places. Does anybody have any opinions on how best to deal with with this? and is anybody already working on resolving this or is it clear for me to take a look at it? Tom -- Tom Hughes (to...@co...) http://www.compton.nu/ |
|
From: Julian S. <js...@ac...> - 2005-03-29 16:39:21
|
On Tuesday 29 March 2005 14:53, Tom Hughes wrote:
> A number of routines in vg_mylibc.c are simple wrappers around
> system calls which means that they are OS and/or platform specific
> and shouldn't be in the core code. In fact a number of them are
> currently disabled on amd64 because the system call they need to
> make is different there to x86-linux.
>
> One obvious answer would be to move them into the linux or
> x86-linux/amd64-linux directories as appropriate. They would
> to keep the VG_() designation though as it would be OS dependent
> whether they needed to be in the OS or platform directory and
> therefore have a VGO_() or VGP() designation.
>
> The only one that have been moved so far seems to be mmap which
> was done by creating a VGP_DO_MMAP macro although that is also
> used in a number of other places.
>
> Does anybody have any opinions on how best to deal with with
> this? and is anybody already working on resolving this or is
> it clear for me to take a look at it?
It certainly needs cleaning up. I mused on this a while back.
What I want to happen is for a Kernel-services Abstraction Layer
(Kal) to appear. In the same way that Moz has NPSR, OOo has
SAL, etc. Kal will provide services like mmap etc whilst hiding
the details of how the call is done on different platforms. Of
course Kal will be far simpler than NPSR, SAL, etc, but the idea
is the same.
So that's something worth looking into. I plan to do this myself
anyway at some stage, but if you want to look into it, please do.
A related problem is that in various places there are
direct use of syscalls (viz, do_syscallN(args)). This is
at odds with portability, and those should be replaced with
calls to the Kal interface. My intention is that, apart from
the syscall wrappers, doing syscalls directly is banned: all
requests for kernel services must go through Kal.
Another thing is that I don't think the current division of the
source base into generic stuff, x86 stuff, amd64 stuff, etc, is
necessarily the best top-level structure. Although it does
separate out the plat-dependent stuff, it obscures module
boundaries -- which imo are the most important thing to have
a clear picture of, from a maintenance/understanding point of view.
This is because, for example, to build such a Kal module, you
would have some generic stuff in one set of dirs/files, but also
bits of Kal scattered in the x86-linux, amd64-linux, etc, trees.
Instead, it would be better to put the Kal module/subsystem
in its own directory, kal, and put all the arch/os specific bits
just for Kal in there:
kal/pub_core_kal.h -- services exported from kal; core but not
tools may use these
kal/pub_tool_kal.h -- services exported from kal; both core and
tools may use these
kal/any_other_name.h -- private header files for use only within
kal/
kal/kal-x86-linux.c -- x86-linux specific implementation
kal/kal-amd64-linux.c -- amd64-linux specific implementation
kal/any_other_name.c -- generic implementation
I hope to diffuse the code base towards this kind of module structure
on an ongoing basis. An auxiliary idea is then to have a perl script
which runs at build time, which examines #include statements and
thereby mechanically enforces the requirement that only the publically
visible interfaces of each module are used outside of that module.
J
|
|
From: Nicholas N. <nj...@cs...> - 2005-03-30 00:29:18
|
On Tue, 29 Mar 2005, Julian Seward wrote: > What I want to happen is for a Kernel-services Abstraction Layer > (Kal) to appear. In the same way that Moz has NPSR, OOo has > SAL, etc. Kal will provide services like mmap etc whilst hiding > the details of how the call is done on different platforms. Of > course Kal will be far simpler than NPSR, SAL, etc, but the idea > is the same. > > So that's something worth looking into. I plan to do this myself > anyway at some stage, but if you want to look into it, please do. Tom, perhaps you could generate some patches and post them to the list to see how they turn out. It will be interesting to see how much libc stuff is platform-specific. N |
|
From: Tom H. <to...@co...> - 2005-03-30 08:48:41
Attachments:
kal.patch
|
In message <Pin...@ch...>
Nicholas Nethercote <nj...@cs...> wrote:
> On Tue, 29 Mar 2005, Julian Seward wrote:
>
>> What I want to happen is for a Kernel-services Abstraction Layer
>> (Kal) to appear. In the same way that Moz has NPSR, OOo has
>> SAL, etc. Kal will provide services like mmap etc whilst hiding
>> the details of how the call is done on different platforms. Of
>> course Kal will be far simpler than NPSR, SAL, etc, but the idea
>> is the same.
>>
>> So that's something worth looking into. I plan to do this myself
>> anyway at some stage, but if you want to look into it, please do.
>
> Tom, perhaps you could generate some patches and post them to the list
> to see how they turn out. It will be interesting to see how much libc
> stuff is platform-specific.
Attached is a first cut at a kernel abstraction layer. It removes
most of the direct system calls from the coregrind directory and
moves them into KAL routines. As a side effect it implements the
missing mylibc functionality for amd64.
I've made a start on removing VKI types from the KAL interface as
any abstraction layer will need to be independent of the types used
to communicate with the kernel. Many routines still use VKI types
in their interface however.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Jeremy F. <je...@go...> - 2005-03-30 09:58:04
|
Tom Hughes wrote: >Attached is a first cut at a kernel abstraction layer. It removes >most of the direct system calls from the coregrind directory and >moves them into KAL routines. As a side effect it implements the >missing mylibc functionality for amd64. > >I've made a start on removing VKI types from the KAL interface as >any abstraction layer will need to be independent of the types used >to communicate with the kernel. Many routines still use VKI types >in their interface however. > > I think we should just use the libc API for this, if not the system glibc. Would uclibc be a good basis (http://www.uclibc.org/)? It would get us out of the business of maintaining a C library as well. If nothing else, the kal_* functions may as well be called VG_(getpid)(), etc, rather than either changing all the callers or having a file full of stub functions. > Int VG_(getppid) ( void ) > { >- Int res; >- res = VG_(do_syscall0)(__NR_getppid); >- return res; >+ return VG_(getppid)(); > } > > This might take a while to terminate... >+Addr VG_(kal_mmap)(Addr start, SizeT length, UInt prot, UInt flags, >+ UInt fd, OffT offset) >+{ >+ UWord args[6]; >+ >+ args[0] = start; >+ args[1] = length; >+ args[2] = prot; >+ args[3] = flags; >+ args[4] = fd; >+ args[5] = offset; >+ >+ return VG_(do_syscall1)(__NR_mmap, (UWord)args); >+} > > I think mmap2 would be preferable here. J |
|
From: Tom H. <to...@co...> - 2005-03-30 13:17:58
|
In message <424...@go...>
Jeremy Fitzhardinge <je...@go...> wrote:
> If nothing else, the kal_* functions may as well be called
> VG_(getpid)(), etc, rather than either changing all the callers or
> having a file full of stub functions.
I was a bit unsure whether or not to get rid of the libc layer
altogether - in many cases it makes sense as the routines are
just stubs to forward to the KAL routines but some like mmap
and friends have extra logic in the libc layer to update the
segment mappings and similar.
>> Int VG_(getppid) ( void )
>> {
>>- Int res;
>>- res = VG_(do_syscall0)(__NR_getppid);
>>- return res;
>>+ return VG_(getppid)();
>> }
>>
> This might take a while to terminate...
Fixed in my local tree.
>>+Addr VG_(kal_mmap)(Addr start, SizeT length, UInt prot, UInt flags,
>>+ UInt fd, OffT offset)
>>+{
>>+ UWord args[6];
>>+
>>+ args[0] = start;
>>+ args[1] = length;
>>+ args[2] = prot;
>>+ args[3] = flags;
>>+ args[4] = fd;
>>+ args[5] = offset;
>>+
>>+ return VG_(do_syscall1)(__NR_mmap, (UWord)args);
>>+}
>>
> I think mmap2 would be preferable here.
The code was all copied from what the libc code was already doing
but you're probably right. It might make sense to use rt_sigpending
on x86 as well, that way we can share with amd64. The other signal
routines already use the rt version anyway.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Julian S. <js...@ac...> - 2005-03-31 10:32:47
|
This strikes me as a bit like the kernel-headers issue. Although 1 and
2 are attractive from the amount-of-work point of view, the only sane way
to decouple ourselves from the assumptions/difficulties of pre-supplied
kernel headers on random distros Mutant-Linux (etc) was to ship our own.
And so it will be here. Using glibc brings two problems:
* We then have to figure out all the bad ways glibc and valgrind
can interact, be sure we've discovered them all, implement workarounds
for them all, ensure those workarounds don't give rise to more
such problems, and ensure those workarounds don't mess up
maintainability too much.
* What happens on platforms where there is no glibc? MacOSX
and the BSDs? Solaris?
> Currently we use glibc in a number of modules: vg_main.c, ume.c, stage1.c,
> vg_stabs.c, vg_symtab2.c, $PLATFORM/core_platform.h, $ARCH/state.c,
> vg_pthreadmodel.c, vg_skiplist.c.
It's worth distinguishing between "uses an external header" and
"uses glibc-supplied functions". The first is not such a big
deal (eg, #include <elf.h>). The second is more of a problem.
I had a go at enumerating what we are using in the files Nick
mentioned. Nick, are you sure you found all files using libc
stuff? Anyway:
vg_skiplist: random -- trivially replaced
vg_pthreadmodel: pthread.h, for pthread types
-- seems unavoidable if we want to do pthread modelling
x86/state.c: ptrace, in ptrace_setregs_from_tst
-- not sure what this is for; it does not look to me like
"primary functionality" though
x86-linux/core_platform.h: setjmp.h (setjmp, longjmp, jmp_buf)
-- we need a setjmp/longjmp facility. 1.0.X used GCC's
__builtin_setjmp/__builtin_longjmp to avoid this
dependency; am surprised it is back.
vg_symtab2.c: elf.h, for ELF types/structs/consts, no fns
-- seems fairly harmless, conceivably we could copy the headers
if need be
vg_stabs.c: a.out.h, for stabs types/structs/consts, no fns
-- ditto
stage1.c: not sure this is a problem
-- stage1 can use what it likes since stage2 completely
gets rid of it at startup.
vg_ume.c:
fprintf exit open perror read close sscanf strchr malloc
assert pread strerror memcmp memset free mmap strdup printf
geteuid getegid getgroups stderr errno
These all look like stuff we either do supply ourselves or
could easily do
vg_main.c:
fork ptrace kill fprintf mmap munmap snprintf fstat malloc
read close getenv printf strlen memcmp free access strdup strchr
opendir strerror readdir strncmp closedir dlopen dlerror dlsym
dlclose abort
Apart from ptrace (what's that for? is it essential) the
dl* functions are the only worrying ones.
Most of the glibc functions we use, we already have or can easily
supply. The only stuff which is worrying is:
setjmp / longjmp
dlopen/dlsym/dlclose
ptrace, possibly
Assuming ptrace is fairly harmless (wrapper round the syscall) then
the real worriers are dlopen/dlsym/dlclose and possibly setjmp/longjmp.
J
|
|
From: Greg P. <gp...@us...> - 2005-03-31 19:21:01
|
Julian Seward writes: > * What happens on platforms where there is no glibc? MacOSX > and the BSDs? Solaris? Using the native libc on Mac OS X is hard. Valgrind can't share the client's libc because of data collisions and deadlocks. It's hard for Valgrind to load its own copy of libc because libc is a specially built library with difficult contraints on its position in the address space. I'd recommend that Valgrind duplicate any necessary libc functionality, especially if the requirements remain small. > Most of the glibc functions we use, we already have or can easily > supply. The only stuff which is worrying is: > > setjmp / longjmp These can be duplicated without too much trouble. They're processor-specific, but small and well-understood. > dlopen/dlsym/dlclose These are in the dynamic linker on Mac OS X, not libc, and Valgrind does load its own copy of the dynamic linker. Not a problem for Mac OS X. > ptrace, possibly I'm not sure how much of ptrace() actually works on Mac OS X. -- Greg Parker gp...@us... |
|
From: Jeremy F. <je...@go...> - 2005-03-31 21:25:41
|
Julian Seward wrote:
>This strikes me as a bit like the kernel-headers issue. Although 1 and
>2 are attractive from the amount-of-work point of view, the only sane way
>to decouple ourselves from the assumptions/difficulties of pre-supplied
>kernel headers on random distros Mutant-Linux (etc) was to ship our own.
>
>
Yep, which is why option 2 is to ship someone else's (small, simple)
libc with Valgrind. We get the option to patch it if it has really
unwanted behaviour, but otherwise leave it untouched.
> x86/state.c: ptrace, in ptrace_setregs_from_tst
> -- not sure what this is for; it does not look to me like
> "primary functionality" though
>
>
That's part of the attach-gdb stuff. That should all be hidden behind a
single arch/OS-specific attach_db() interface. And it should be a vki
header rather than a libc one.
> x86-linux/core_platform.h: setjmp.h (setjmp, longjmp, jmp_buf)
> -- we need a setjmp/longjmp facility. 1.0.X used GCC's
> __builtin_setjmp/__builtin_longjmp to avoid this
> dependency; am surprised it is back.
>
>
setjmp is pretty harmless; it does what you'd expect (ie, copies values
between register and memory). sigsetjmp plays with signal state, which
is unwanted. We could switch back to using __builtin_set/longjmp; I
changed it for a bit because I was using longjmp's ability to pass a
value back to setjmp (__builtin_longjmp can only pass back '1'), but
decided to do it another way in the end.
> vg_symtab2.c: elf.h, for ELF types/structs/consts, no fns
> -- seems fairly harmless, conceivably we could copy the headers
> if need be
>
> vg_stabs.c: a.out.h, for stabs types/structs/consts, no fns
> -- ditto
>
> stage1.c: not sure this is a problem
> -- stage1 can use what it likes since stage2 completely
> gets rid of it at startup.
>
> vg_ume.c:
> fprintf exit open perror read close sscanf strchr malloc
> assert pread strerror memcmp memset free mmap strdup printf
> geteuid getegid getgroups stderr errno
>
> These all look like stuff we either do supply ourselves or
> could easily do
>
>
The only complexity is that ume.c is used from both stage1 and stage2,
so replacing these calls would mean linking our home-rolled libc into
stage1 as well. Also, a lot of them are redundant; the scanf is used as
part of the /proc/self/maps parser, but it should probably just use
parseprocselfmaps.
> vg_main.c:
> fork ptrace kill fprintf mmap munmap snprintf fstat malloc
> read close getenv printf strlen memcmp free access strdup strchr
> opendir strerror readdir strncmp closedir dlopen dlerror dlsym
> dlclose abort
>
> Apart from ptrace (what's that for? is it essential) the
> dl* functions are the only worrying ones.
>
>
Most of these are used before Valgrind's internal infrastructure is up.
If the internal allocator can be initialized much eariler, then we can
use our own versions of a lot of these.
>Most of the glibc functions we use, we already have or can easily
>supply. The only stuff which is worrying is:
>
> setjmp / longjmp
>
>
We can go back to using the builtin versions, no problem.
> dlopen/dlsym/dlclose
>
>
Pretty unavoidable, and these have been surprisingly non-troublesome (I
can't think of an instance of a distro-specific bug being in unable to
load a tool .so).
> ptrace, possibly
>
>
Should be deeply hidden in a platform-specific corner.
J
|
|
From: Nicholas N. <nj...@cs...> - 2005-04-01 04:47:16
|
On Thu, 31 Mar 2005, Julian Seward wrote: > I had a go at enumerating what we are using in the files Nick > mentioned. Nick, are you sure you found all files using libc > stuff? Anyway: I just grep'd for "#include <"... N |
|
From: Tom H. <to...@co...> - 2005-03-30 13:12:50
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
> Instead, it would be better to put the Kal module/subsystem
> in its own directory, kal, and put all the arch/os specific bits
> just for Kal in there:
>
> kal/pub_core_kal.h -- services exported from kal; core but not
> tools may use these
> kal/pub_tool_kal.h -- services exported from kal; both core and
> tools may use these
How are we going to decide what is and isn't exported? Do we make
everything core only to start with and move it to the tool header
if a tool needs it? or make everything available to tools unless
we have some reason to believe it can only work in the core?
> kal/kal-x86-linux.c -- x86-linux specific implementation
> kal/kal-amd64-linux.c -- amd64-linux specific implementation
> kal/any_other_name.c -- generic implementation
I'm not sure you can have a generic implementation of anything in the
kernel abstraction layer ;-)
I also added a kal-linux.c for things common to all linux builds
regardless of architecture.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Nicholas N. <nj...@cs...> - 2005-03-30 15:15:43
|
On Wed, 30 Mar 2005, Tom Hughes wrote: >> Instead, it would be better to put the Kal module/subsystem >> in its own directory, kal, and put all the arch/os specific bits >> just for Kal in there: >> >> kal/pub_core_kal.h -- services exported from kal; core but not >> tools may use these >> kal/pub_tool_kal.h -- services exported from kal; both core and >> tools may use these > > How are we going to decide what is and isn't exported? Do we make > everything core only to start with and move it to the tool header > if a tool needs it? or make everything available to tools unless > we have some reason to believe it can only work in the core? The former; the tool should see as little as possible. That's how it has been done in core.h and tool.h. >> kal/kal-x86-linux.c -- x86-linux specific implementation >> kal/kal-amd64-linux.c -- amd64-linux specific implementation >> kal/any_other_name.c -- generic implementation > > I'm not sure you can have a generic implementation of anything in the > kernel abstraction layer ;-) I guess it depends on whether you keep as a separate layer, as you did in the patch, or fold that into KAL. N |
|
From: Julian S. <js...@ac...> - 2005-03-30 16:15:23
|
On Wednesday 30 March 2005 16:15, Nicholas Nethercote wrote: > On Wed, 30 Mar 2005, Tom Hughes wrote: > >> Instead, it would be better to put the Kal module/subsystem > >> in its own directory, kal, and put all the arch/os specific bits > >> just for Kal in there: > >> > >> kal/pub_core_kal.h -- services exported from kal; core but not > >> tools may use these > >> kal/pub_tool_kal.h -- services exported from kal; both core and > >> tools may use these > > > > How are we going to decide what is and isn't exported? Do we make > > everything core only to start with and move it to the tool header > > if a tool needs it? or make everything available to tools unless > > we have some reason to believe it can only work in the core? > > The former; the tool should see as little as possible. That's how it has > been done in core.h and tool.h. Yup. Tools may see only a subset of what the core may see. So (as is the case with Tom's patch) tool.h exports as little as possible, and as much as possible (here, everything) is in core.h. The only change then needed is that core.h #includes tool.h so as to guarantee the subset/superset relation. > >> kal/kal-x86-linux.c -- x86-linux specific implementation > >> kal/kal-amd64-linux.c -- amd64-linux specific implementation > >> kal/any_other_name.c -- generic implementation > > > > I'm not sure you can have a generic implementation of anything in the > > kernel abstraction layer ;-) > > I guess it depends on whether you keep as a separate layer, as you did in > the patch, or fold that into KAL. I presume you're referring to what to do about VG_(strlen) et al? I think those could go in a different module as they are all pretty harmless. > If nothing else, the kal_* functions may as well be called > VG_(getpid)(), etc, rather than either changing all the callers or > having a file full of stub functions. No ... I prefer to keep the kal_ prefix. It isn't intrusive and it avoids the burden of having to remember whether VG_(obscure_foo) is a Kal function or not. In the same way that all the POSIX pthreads functions and types are trivially identifiable from their names. J |
|
From: Tom H. <to...@co...> - 2005-03-30 16:29:19
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
> On Wednesday 30 March 2005 16:15, Nicholas Nethercote wrote:
>
> > If nothing else, the kal_* functions may as well be called
> > VG_(getpid)(), etc, rather than either changing all the callers or
> > having a file full of stub functions.
>
> No ... I prefer to keep the kal_ prefix. It isn't intrusive and
> it avoids the burden of having to remember whether VG_(obscure_foo)
> is a Kal function or not. In the same way that all the POSIX
> pthreads functions and types are trivially identifiable from their
> names.
The other option of course is a new VGK_() or VGKAL_() prefix.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Nicholas N. <nj...@cs...> - 2005-03-31 04:36:10
|
On Wed, 30 Mar 2005, Tom Hughes wrote: >> Instead, it would be better to put the Kal module/subsystem >> in its own directory, kal, and put all the arch/os specific bits >> just for Kal in there: >> >> kal/pub_core_kal.h -- services exported from kal; core but not >> tools may use these >> kal/pub_tool_kal.h -- services exported from kal; both core and >> tools may use these >> >> kal/kal-x86-linux.c -- x86-linux specific implementation >> kal/kal-amd64-linux.c -- amd64-linux specific implementation >> kal/any_other_name.c -- generic implementation Hmm... we were just discussing how having all the arch-specific code in one module/directory wasn't really the right thing to do -- that it's better to follow modules boundaries, and possibly have some arch-specific code in each module. Isn't there a similar situation here? Ie. really the module (or modules) here is our libc replacement or 'utility' module(s) or whatever you want to call it. In which case we shouldn't have a separate kal/ directory just for the arch-specific things. Does that make sense? N |
|
From: Julian S. <js...@ac...> - 2005-03-30 15:19:13
|
> > Tom, perhaps you could generate some patches and post them to the list > > to see how they turn out. It will be interesting to see how much libc > > stuff is platform-specific. > > Attached is a first cut at a kernel abstraction layer. It removes > most of the direct system calls from the coregrind directory and > moves them into KAL routines. As a side effect it implements the > missing mylibc functionality for amd64. Tom That looks like an excellent start. Does it compile/build on both x86 and amd64? > I've made a start on removing VKI types from the KAL interface as > any abstraction layer will need to be independent of the types used > to communicate with the kernel. Many routines still use VKI types > in their interface however. I guess kal basically implements a tiny subset of libc functionality. We need to consider a way to deal with returning error values. Currently those functions return -1 if there's a problem, but: * that precludes them returning -1 as a legitimate return value * having -1 as the only return value may not give enough detail for the caller to diagnose an error return and take appropriate action Not sure how to fix this. We could copy the libc madness and have an errno which gets set with a value indicating the problem. Or we could give all kal functions an Int* first parameter into which an error value is written if there is an error. In either case we'd need an kal-specific enum holding error codes (KAL_EINVAL, KAL_EBADF, etc), and, for each target, a function which converts kernel return values for the target into KAL_E* values. Comments? > I think we should just use the libc API for this, if not the system > glibc. Yes. The above suggestion effectively enshrines that, modulo perhaps sidestepping the errno uglyness for returning error codes. > Would uclibc be a good basis (http://www.uclibc.org/)? It would > get us out of the business of maintaining a C library as well. Currently we only use a tiny subset of the libc functionality. Its behaviour is well defined and understood and we haven't had a problem maintaining it. If we were going to use most/all of libc then uclibc might be a good choice. But considering that we don't have a maintenance problem, using someone else's libc just imports another load of assumptions and constraints to work around, and I don't want that. Besides, it would then constrain us to platforms on which uclibc works. J |
|
From: Tom H. <to...@co...> - 2005-03-30 16:08:47
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
>> > Tom, perhaps you could generate some patches and post them to the list
>> > to see how they turn out. It will be interesting to see how much libc
>> > stuff is platform-specific.
>>
>> Attached is a first cut at a kernel abstraction layer. It removes
>> most of the direct system calls from the coregrind directory and
>> moves them into KAL routines. As a side effect it implements the
>> missing mylibc functionality for amd64.
>
> That looks like an excellent start. Does it compile/build on both
> x86 and amd64?
It does now - the version I sent before had a few minor issues
on x86 but I've just fixed them.
>> I've made a start on removing VKI types from the KAL interface as
>> any abstraction layer will need to be independent of the types used
>> to communicate with the kernel. Many routines still use VKI types
>> in their interface however.
>
> I guess kal basically implements a tiny subset of libc functionality.
> We need to consider a way to deal with returning error values.
> Currently those functions return -1 if there's a problem, but:
>
> * that precludes them returning -1 as a legitimate return value
>
> * having -1 as the only return value may not give enough detail
> for the caller to diagnose an error return and take appropriate
> action
>
> Not sure how to fix this. We could copy the libc madness and have
> an errno which gets set with a value indicating the problem.
> Or we could give all kal functions an Int* first parameter into
> which an error value is written if there is an error.
Yep, the whole question of error codes was another thing that was
a bit thorny when I was doing it. I mostly tried to push the is_kerror
tests down into KAL but some of them are still outside which is wrong.
I agree that we really want out of band error reporting, which means
either reserving the return value for the error and returning any
other values through reference arguments, or adding an error argument
to each routine, or using a global.
> In either case we'd need an kal-specific enum holding error codes
> (KAL_EINVAL, KAL_EBADF, etc), and, for each target, a function
> which converts kernel return values for the target into KAL_E*
> values.
Sounds about right. Error codes presumably have to be in the tool
visible header, along with any structures that tool visible functions
want to use in their interface.
>> I think we should just use the libc API for this, if not the system
>> glibc.
>
> Yes. The above suggestion effectively enshrines that, modulo
> perhaps sidestepping the errno uglyness for returning error codes.
Well the other thing that I did for several routines was to use
a simpler interface that didn't rely on kernel structures, so for
example gettimeofday just returns a ULong and nanosleep takes a
ULong rather than faffing about with a structure.
Likewise for getrlimit/setrlimit which were the other ones where I
made an attempt to remove VKI types from the interface.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Tom H. <to...@co...> - 2005-03-31 09:08:39
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
> I presume you'll have an updated patch after a bit more iteration?
I've got an updated patch now - it's getting a bit large so I've
gzipped it and put it on the web at:
http://www.compton.nu/kal.patch.gz
It builds on x86 and amd64. There is still work to be done removing
the vki structures from the kal interface but the error handling is
there and the stub routines in mylibc have been removed.
I had to move the tool header file to the main includes directory
so the tools could see it.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Tom H. <to...@co...> - 2005-03-31 14:50:09
|
In message <Pin...@ch...>
Nicholas Nethercote <nj...@cs...> wrote:
> On Thu, 31 Mar 2005, Tom Hughes wrote:
>
>> I've got an updated patch now - it's getting a bit large so I've
>> gzipped it and put it on the web at:
>>
>> http://www.compton.nu/kal.patch.gz
>>
>> It builds on x86 and amd64. There is still work to be done removing
>> the vki structures from the kal interface but the error handling is
>> there and the stub routines in mylibc have been removed.
>>
>> I had to move the tool header file to the main includes directory
>> so the tools could see it.
>
> Which header file is that? tool.h is already in include/ ...
The pub_tool_kal.h one that Julian described in the original design.
There are already pub_tool_execontext.h and pub_tool_stacktrace.h
files in there which I think came from you didn't they?
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Nicholas N. <nj...@cs...> - 2005-03-31 14:57:21
|
On Thu, 31 Mar 2005, Tom Hughes wrote: >>> I had to move the tool header file to the main includes directory >>> so the tools could see it. >> >> Which header file is that? tool.h is already in include/ ... > > The pub_tool_kal.h one that Julian described in the original design. Ah, that's fine. > There are already pub_tool_execontext.h and pub_tool_stacktrace.h > files in there which I think came from you didn't they? Yep. N |
|
From: Julian S. <js...@ac...> - 2005-03-30 16:36:57
|
> > That looks like an excellent start. Does it compile/build on both > > x86 and amd64? > > It does now - the version I sent before had a few minor issues > on x86 but I've just fixed them. Cool. > Yep, the whole question of error codes was another thing that was > a bit thorny when I was doing it. I mostly tried to push the is_kerror > tests down into KAL but some of them are still outside which is wrong. I entirely agree. > I agree that we really want out of band error reporting, which means > either reserving the return value for the error and returning any > other values through reference arguments, or adding an error argument > to each routine, or using a global. A global (errno) is ugly and gives problems should we ever get into a genuinely multithreaded environment. Returning the normal return as a ref param is a bit cumbersome, because often we want to see the return value but ignore the error value (eg, kal_getpid; that cannot fail). Hence it seems to me the cleanest solution is for the error value to be a var parameter which we mandate always to be the first param. We allow it to be NULL. Then we can conveniently do pid = VG_(kal_getpid)(NULL); which is not excessively intrusive. > > In either case we'd need an kal-specific enum holding error codes > > (KAL_EINVAL, KAL_EBADF, etc), and, for each target, a function > > which converts kernel return values for the target into KAL_E* > > values. > > Sounds about right. Error codes presumably have to be in the tool > visible header, along with any structures that tool visible functions > want to use in their interface. Sounds exactly right. > >> I think we should just use the libc API for this, if not the system > >> glibc. > > > > Yes. The above suggestion effectively enshrines that, modulo > > perhaps sidestepping the errno uglyness for returning error codes. > > Well the other thing that I did for several routines was to use > a simpler interface that didn't rely on kernel structures, so for > example gettimeofday just returns a ULong and nanosleep takes a > ULong rather than faffing about with a structure. That I like -- it sounds like a good thing. In fact, we need to not have any kernel types, structs or consts in the kal interface. > Likewise for getrlimit/setrlimit which were the other ones where I > made an attempt to remove VKI types from the interface. Yup. I presume you'll have an updated patch after a bit more iteration? J |
|
From: Tom H. <to...@co...> - 2005-03-30 18:06:51
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
> > >> I think we should just use the libc API for this, if not the system
> > >> glibc.
> > >
> > > Yes. The above suggestion effectively enshrines that, modulo
> > > perhaps sidestepping the errno uglyness for returning error codes.
> >
> > Well the other thing that I did for several routines was to use
> > a simpler interface that didn't rely on kernel structures, so for
> > example gettimeofday just returns a ULong and nanosleep takes a
> > ULong rather than faffing about with a structure.
>
> That I like -- it sounds like a good thing. In fact, we need to not
> have any kernel types, structs or consts in the kal interface.
The real point I was trying to make was that I wasn't sticking
religiously to the libc interface and creating a kal_timeval struct
but was doing whatever was easiest for the caller to use.
> > Likewise for getrlimit/setrlimit which were the other ones where I
> > made an attempt to remove VKI types from the interface.
>
> Yup.
>
> I presume you'll have an updated patch after a bit more iteration?
Yup. I'm working on it now. Did we ever reach a decision on whether
to drop routines from vg_mylibc that become simple veneers over a
single KAL routine?
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Julian S. <js...@ac...> - 2005-03-30 18:14:53
|
> Yup. I'm working on it now. Did we ever reach a decision on whether > to drop routines from vg_mylibc that become simple veneers over a > single KAL routine? I think we should -- that is, there's no point in calling into mylibc if that then calls onwards to KAL and does nothing else. In other words mylibc becomes more or less split into two parts: KAL for dealing with the kernel, and whatever is left over in mylibc, which is presumably vanilla stuff like strlen etc. Does that sound sensible? J |
|
From: Nicholas N. <nj...@cs...> - 2005-03-31 03:38:52
|
On Wed, 30 Mar 2005, Jeremy Fitzhardinge wrote: >> Attached is a first cut at a kernel abstraction layer. > > I think we should just use the libc API for this, if not the system glibc. > Would uclibc be a good basis (http://www.uclibc.org/)? It would get us out > of the business of maintaining a C library as well. Perhaps we should stop, think about this, and make a decision before proceeding further, since there it's an important issue and there is some disagreement. Currently we use glibc in a number of modules: vg_main.c, ume.c, stage1.c, vg_stabs.c, vg_symtab2.c, $PLATFORM/core_platform.h, $ARCH/state.c, vg_pthreadmodel.c, vg_skiplist.c. The first three in that list use libc a lot, the others are more minor uses (some only for types and constants, I think). And we also have our own libc implementations of lots of functions. And we've got some external library code in Valgrind (the demangler). So we're using an awkward mix. I see three possibilities here. 1. Use glibc externally, as we do a little bit now. Pros: saves us implementation work. Especialy helpful with complex functions like dlopen() and dlsym() Cons: unclear impact on correctness. Eg. what happens if a library function calls malloc()? 2. Copy a libc into Valgrind, eg. uClibc. Pros: as for (1) Cons: lots of extra code, much of it unused 3. Implement our own libc. Pros: eliminates dependency on outside world, great for correctness guarantees Cons: implementation effort, esp. on different platforms. AIUI, Julian favours (3), Jeremy favours either (1) or (2). I'm leaning towards (3). I've undoubtedly oversimplified the issues involved and missed out some important details. Anyone care to add their two cents? I figure it's a good idea to discuss this and come to some sort of conclusion/agreement before too much implementation effort occurs. N |
|
From: Robert W. <rj...@du...> - 2005-03-31 03:52:19
|
> AIUI, Julian favours (3), Jeremy favours either (1) or (2). I'm leaning=20 > towards (3). I was leaning towards (1) or (2), but after our last discussion on this I'm now in the (3) camp. External code =3D=3D figuring out what side effects it has, which is error-prone. Maintaining our own libc isn't easy, but it does give us a lot more control, and who said this had to be easy? Regards, Robert. --=20 Robert Walsh Amalgamated Durables, Inc. - "We don't make the things you buy." Email: rj...@du... |