Re: [Valgrind-developers] tool-specific CLOs

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

On Fri, 2004-01-09 at 09:56, Nicholas Nethercote wrote:
> Behaviour should be the same as before;  where VALGRIND_OPTS was read
> before, I know read VALGRIND_OPTS plus the two .rc files.

OK,  I think env vars are more "local" though (since they're per-shell),
so VALGRIND_OPTS should take precidence over ~/.valgrindrc (and maybe
./.valgrindrc), but obviously not the actual command line.

> > > I think this also fixes bug 71126, thanks to Tom for his patch.
> >
> > I still think this could do with a more radical rearrangement.  I don't
> > think main() should be parsing the command line at all, or loading the
> > tool.  If it simply generates a unified vg_argv and passes it to
> > VG_(main), VG_(main) can do the rest.  This means that the address space
> > should remain padded until VG_(main) is ready to unpad it (ie, after
> > doing dlopen on the tool).
> 
> Can you clarify what you see as the conceptual difference between main()
> and VG_(main)()?

There is none - it's a historical artifact.  At one point stage2 and
valgrind.so were separate entities, with stage2 dlopening valgrind.so. 
At that point it seemed like a good idea to do all the dlopening in
main(), and set VG_(main) off with everything set up.  Now that they're
linked together, it doesn't seem so important.

Part of the complexity of the current initialization is that a lot of
low-level Valgrind mechanisms are set up in VG_(main), but the stuff in
main() should really be using them.

Using VG_(mmap) is mandatory once the padding has been removed, and
VG_(mmap) depends on the allocator having been set up. And because we'd
like to allow the use of any library, we should probably set up
intercept functions for plain old mmap/__libc_mmap/etc. 
VG_(mmap/munmap/mprotect/ec) also depend on the Segment list being set
up.

Also, all the error reporting in main() should probably use Valgrind
standard error reporting mechanisms, rather than just fprintf()ing
things.

In other words, it's a mess which just happens to mostly work.

> One possibility is that main() is a bit like __libc_start_main(), ie sets
> things like argv/argc up ready for the program.  Then VG_(main)() is a bit
> like main() for a normal program -- start doing real stuff.  In which
> case, moving the command-line parsing to main() makes perfect sense.
> 
> However, to me, the unpadding feels like it should be done as part of the
> setup stuff in main().
> 
>   1. pad
>   2. setup argv
>   3. partially parse argv to find tool
>   4. load tool.so + valgrind.so
>   5. do_exec the client binary
>   6. unpad
> 
> Because it's all a bit mixed up at the start -- necessary due to tool
> selection from the command-line -- there doesn't seem to be an obvious cut
> point between main() and VG_(main)().

The pad() really does have to happen very early, because it must get
done before anyone else calls mmap(), but we can unpad pretty late
without consequence (at any time up to running the first client
instruction).  Hm.  The most important pad, the client address space,
has already been done by stage1, so it is already in place by the time
stage2's main() is called.

Setting up argv currently uses malloc(), which is a bit troublesome,
because it may want to use brk().  brk() happens to work at the moment,
because stage1 sets up the brk segment, but this is non-portable and
unreliable.  This means we also have to intercept brk/sbrk and implement
them in terms of VG_(mmap) so that libraries can use them.

In general, for the moment, I'd like to keep using VG_() versions of
functions as much as possible, and move over to the libc versions in a
systematic way.  That's because a lot of the VG_() versions are not
simple clones of the libc functions, and/or may have other important
side-effects which we rely on.  I broke this rule a lot in stage2.c and
ume.c, mostly because of the historical unavailability of the VG_()
functions in those contexts (and ume.c still needs to link into stage1,
which has no VG_() functions available).

So, I think the init ordering should be:

     1. stage1
             1. pad the client address space
             2. do_exec stage2
     2. stage2:main()
             1. (do something to cope with .init sections of any
                libraries we're using)
             2. initialize Segment list to reflect all current mappings
                (VG_(init_memory)
             3. init Valgrind heap, make mmap() call VG_(mmap), deal
                with brk/sbrk.
             4. init early error reporting (ie default output sink)
             5. set up any other padding
             6. collate arguments from all the possible sources,
                generating a unified vg_argv[] array
     3. stage2:VG_(main)()
             1. parse vg_argv
             2. print errors/usage/early failures
             3. dlopen() the tool
             4. init everything else (may need to be after
                do_exec(client) if it depends on the shape/placement of
                the client address space)
             5. remove all padding/clear all mappings out of client
                address space
             6. do_exec client/set up client stack
             7. start VCPU

As you say, the break between main() and VG_(main) is pretty arbitrary. 
Perhaps we should just drop stage2.c, move everything into vg_main.c,
rename VG_(main)() to main(), get rid of KickStartParams (another
historical name).

	J