|
From: Nicholas N. <nj...@ca...> - 2004-01-05 16:34:53
|
Hi, A current problem with VALGRIND_OPTS is that you can't put tool-specific options in it, otherwise you'll get problems if you run a different tool. Eg. a Memcheck-specific option will cause Cachegrind to fall over. I have a patch that addresses this, whereby you can prefix an option with "--toolname:" and it will only be run where "toolname" matches the chosen tool. Eg, with this: --memcheck:--leak-check the --leak-check will only be applied for Memcheck, and ignored for other tools. The way I have coded it, this "--toolname:" prefix works in VALGRIND_OPTS or on the command line... the latter wasn't really necessary, but it doesn't seem to be a problem. I was originally thinking of using the syntax "toolname:--option", but it's easier to have "--" at the start, because stage2 considers any command line word that doesn't begin with '-' to be the start of the executable. It wouldn't be hard to allow ./.valgrindrc and ~/.valgrindrc files, too, which can contain (prefixed and non-prefixed) options, just like the VALGRIND_OPTS var. What do people think? N |
|
From: Josef W. <Jos...@gm...> - 2004-01-07 10:45:27
|
Am Montag, 5. Januar 2004 17:34 schrieb Nicholas Nethercote: > I was originally thinking of using the syntax "toolname:--option", but What about --toolname:option ? BTW, this doesn't allow column chars in any option any more, right? > It wouldn't be hard to allow ./.valgrindrc and ~/.valgrindrc files, too, > which can contain (prefixed and non-prefixed) options, just like the > VALGRIND_OPTS var. I vote for this. Cheers, Josef |
|
From: Tom H. <th...@cy...> - 2004-01-07 11:02:53
|
In message <200...@gm...>
Josef Weidendorfer <Jos...@gm...> wrote:
> Am Montag, 5. Januar 2004 17:34 schrieb Nicholas Nethercote:
>> I was originally thinking of using the syntax "toolname:--option", but
>
> What about --toolname:option ?
I was thinking much the same thing.
> BTW, this doesn't allow column chars in any option any more, right?
column chars?
>> It wouldn't be hard to allow ./.valgrindrc and ~/.valgrindrc files, too,
>> which can contain (prefixed and non-prefixed) options, just like the
>> VALGRIND_OPTS var.
>
> I vote for this.
Me to. It would be nice if it could handle spaces in option values
somehow as well - they are now handled in command line arguments but
not currently in argument taken from the environment.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Nicholas N. <nj...@ca...> - 2004-01-07 12:04:27
|
On Wed, 7 Jan 2004, Tom Hughes wrote:
> > What about --toolname:option ?
>
> I was thinking much the same thing.
Good idea, I'll do that.
> > BTW, this doesn't allow column chars in any option any more, right?
>
> column chars?
Colon (':') chars? Yes, they wouldn't be allowed, but that doesn't seem
like a great loss? If that's a problem, maybe --toolname::option, or
something even more obscure?
> >> It wouldn't be hard to allow ./.valgrindrc and ~/.valgrindrc files, too,
> >> which can contain (prefixed and non-prefixed) options, just like the
> >> VALGRIND_OPTS var.
> >
> > I vote for this.
>
> Me to. It would be nice if it could handle spaces in option values
> somehow as well - they are now handled in command line arguments
Really? Do any of the existing options handle them? Does it require
quoting? Seems like knowing when an option ends and a command name begins
is problematic if spaces are allowed, eg. in "--foo=bar baz" is "baz" part
of the --foo option or the name of the executable?
> but not currently in argument taken from the environment.
N
|
|
From: Josef W. <Jos...@gm...> - 2004-01-07 22:40:22
|
Am Mittwoch, 7. Januar 2004 12:02 schrieb Tom Hughes: > In message <200...@gm...> > > BTW, this doesn't allow column chars in any option any more, right? > column chars? Oops, I meant ":" (colon). Josef |
|
From: Tom H. <th...@cy...> - 2004-01-07 12:50:34
|
In message <Pin...@re...>
Nicholas Nethercote <nj...@ca...> wrote:
> On Wed, 7 Jan 2004, Tom Hughes wrote:
>
>> >> It wouldn't be hard to allow ./.valgrindrc and ~/.valgrindrc files, too,
>> >> which can contain (prefixed and non-prefixed) options, just like the
>> >> VALGRIND_OPTS var.
>> >
>> > I vote for this.
>>
>> Me to. It would be nice if it could handle spaces in option values
>> somehow as well - they are now handled in command line arguments
>
> Really? Do any of the existing options handle them? Does it require
> quoting? Seems like knowing when an option ends and a command name begins
> is problematic if spaces are allowed, eg. in "--foo=bar baz" is "baz" part
> of the --foo option or the name of the executable?
No existing options require spaces, but the --db-command option that
is in my patch for bug 69489 does as the command to start the debugger
will normally include spaces.
For the .valgrindrc files it should be difficult if they expect one
option per line, for the env var some form of quoting would have to
be supported.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Nicholas N. <nj...@ca...> - 2004-01-07 13:45:08
|
On Wed, 7 Jan 2004, Tom Hughes wrote: > No existing options require spaces, but the --db-command option that > is in my patch for bug 69489 does as the command to start the debugger > will normally include spaces. > > For the .valgrindrc files it should be difficult if they expect one > option per line, for the env var some form of quoting would have to > be supported. If quoting is supported in the env var (as seems necessary), it shouldn't be hard to support it in the .valgrindrc file too -- I was planning to mmap the .valgrindrc file instead of reading it line by line, so that it can be handled by the same code that handles the env var. N |
|
From: Nicholas N. <nj...@ca...> - 2004-01-08 18:44:06
|
Hi,
I'm working on this, and I want to read ~/.valgrindrc. However, I can't
use '~' in C because it's a shell thing, I think.
How can I find the home directory -- is getenv("HOME") the right way?
(Is $HOME always defined?)
Thanks
N
|
|
From: Robert W. <rj...@du...> - 2004-01-08 18:52:03
|
> I'm working on this, and I want to read ~/.valgrindrc. However, I can't
> use '~' in C because it's a shell thing, I think.
>
> How can I find the home directory -- is getenv("HOME") the right way?
> (Is $HOME always defined?)
It's not guaranteed to be, but it nearly always is. Another way is to
use getpwuid(getuid()) and pull out the pw_dir field from the result.
That should always work, unless the UID isn't in your password
database. Happily, with full virtualisation, this should now be usable
(getpwuid() is in libc, and actually more complicated than you might
imagine since it has to handle NIS, /etc/passwd, LDAP, etc.)
Regards,
Robert.
|
|
From: Jeremy F. <je...@go...> - 2004-01-08 21:04:40
|
On Thu, 2004-01-08 at 10:44, Nicholas Nethercote wrote:
> How can I find the home directory -- is getenv("HOME") the right way?
> (Is $HOME always defined?)
No, but if it isn't, then just don't look for the file. There's no
point in getting too heroic over it.
It might also be useful to look in "." as well, since you might want a
program-specific config file. Or perhaps have an env var for looking
for multiple config files? Is this getting all too complex?
J
|
|
From: Nicholas N. <nj...@ca...> - 2004-01-08 21:20:24
|
On Thu, 8 Jan 2004, Jeremy Fitzhardinge wrote: > No, but if it isn't, then just don't look for the file. There's no > point in getting too heroic over it. Yeah, I'll do that. > It might also be useful to look in "." as well, since you might want a > program-specific config file. Or perhaps have an env var for looking > for multiple config files? Is this getting all too complex? I'm currently looking in three places: 1. ~/.valgrindrc 2. VALGRIND_OPTS 3. ./.valgrindrc Args are added in that order, the idea being that more 'local' args are added later, to override less local ones. (1) and (2) could be swapped, but (3) should definitely be last, as it's clearly the most local. N |
|
From: Nicholas N. <nj...@ca...> - 2004-01-09 16:20:06
Attachments:
diff
|
On Thu, 8 Jan 2004, Nicholas Nethercote wrote: > I'm currently looking in three places: > > 1. ~/.valgrindrc > 2. VALGRIND_OPTS > 3. ./.valgrindrc > > Args are added in that order, the idea being that more 'local' args are > added later, to override less local ones. (1) and (2) could be swapped, > but (3) should definitely be last, as it's clearly the most local. I've swapped, giving: 1. VALGRIND_OPTS 2. ~/.valgrindrc 3. ./.valgrindrc Diff is attached, feedback welcome. I hope the way I mmap'd in the .rc files and didn't munmap them is reasonable. I think this also fixes bug 71126, thanks to Tom for his patch. N |
|
From: Nicholas N. <nj...@ca...> - 2004-01-09 16:27:44
|
On Fri, 9 Jan 2004, Nicholas Nethercote wrote: > Diff is attached, feedback welcome. Oh, I haven't added support for quoting to allow spaces within args... N |
|
From: Doug R. <df...@nl...> - 2004-01-09 16:49:02
|
On Fri, 2004-01-09 at 16:20, Nicholas Nethercote wrote: > On Thu, 8 Jan 2004, Nicholas Nethercote wrote: > > > I'm currently looking in three places: > > > > 1. ~/.valgrindrc > > 2. VALGRIND_OPTS > > 3. ./.valgrindrc > > > > Args are added in that order, the idea being that more 'local' args are > > added later, to override less local ones. (1) and (2) could be swapped, > > but (3) should definitely be last, as it's clearly the most local. > > I've swapped, giving: > > 1. VALGRIND_OPTS > 2. ~/.valgrindrc > 3. ./.valgrindrc > > Diff is attached, feedback welcome. I hope the way I mmap'd in the .rc > files and didn't munmap them is reasonable. > > I think this also fixes bug 71126, thanks to Tom for his patch. This will have problems if the file is exactly getpagesize() bytes long since the mmap'ed image won't be null terminated. |
|
From: Jeremy F. <je...@go...> - 2004-01-09 17:38:18
|
On Fri, 2004-01-09 at 08:20, Nicholas Nethercote wrote: > I've swapped, giving: > > 1. VALGRIND_OPTS > 2. ~/.valgrindrc > 3. ./.valgrindrc Where does the command-line come into this? What happens with _VALGRIND_CLO? I think if the latter is set, then everything else should be ignored, since it should contain the complete union of all command line options anyway. > Diff is attached, feedback welcome. I hope the way I mmap'd in the .rc > files and didn't munmap them is reasonable. Why mmap? Why not just allocate a chunk of memory and read? > I think this also fixes bug 71126, thanks to Tom for his patch. I still think this could do with a more radical rearrangement. I don't think main() should be parsing the command line at all, or loading the tool. If it simply generates a unified vg_argv and passes it to VG_(main), VG_(main) can do the rest. This means that the address space should remain padded until VG_(main) is ready to unpad it (ie, after doing dlopen on the tool). J |
|
From: Nicholas N. <nj...@ca...> - 2004-01-09 17:40:11
|
On Fri, 9 Jan 2004, Doug Rabson wrote:
> > Diff is attached, feedback welcome. I hope the way I mmap'd in the .rc
> > files and didn't munmap them is reasonable.
>
> This will have problems if the file is exactly getpagesize() bytes long
> since the mmap'ed image won't be null terminated.
Would changing this:
f_clo = mmap(0, s1.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
to this
f_clo = mmap(0, s1.st_size+1, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
fix it?
N
|
|
From: Doug R. <df...@nl...> - 2004-01-09 17:47:03
|
On Fri, 2004-01-09 at 17:40, Nicholas Nethercote wrote: > On Fri, 9 Jan 2004, Doug Rabson wrote: > > > > Diff is attached, feedback welcome. I hope the way I mmap'd in the .rc > > > files and didn't munmap them is reasonable. > > > > This will have problems if the file is exactly getpagesize() bytes long > > since the mmap'ed image won't be null terminated. > > Would changing this: > > f_clo = mmap(0, s1.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0); > > to this > > f_clo = mmap(0, s1.st_size+1, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0); > > fix it? Probably. It does seem easier to malloc+read the contents though. |
|
From: Nicholas N. <nj...@ca...> - 2004-01-09 17:56:49
|
On Fri, 9 Jan 2004, Jeremy Fitzhardinge wrote: > > I've swapped, giving: > > > > 1. VALGRIND_OPTS > > 2. ~/.valgrindrc > > 3. ./.valgrindrc > > Where does the command-line come into this? What happens with > _VALGRIND_CLO? I think if the latter is set, then everything else > should be ignored, since it should contain the complete union of all > command line options anyway. Behaviour should be the same as before; where VALGRIND_OPTS was read before, I know read VALGRIND_OPTS plus the two .rc files. > > Diff is attached, feedback welcome. I hope the way I mmap'd in the .rc > > files and didn't munmap them is reasonable. > > Why mmap? Why not just allocate a chunk of memory and read? Didn't think of that. I'll change it. > > I think this also fixes bug 71126, thanks to Tom for his patch. > > I still think this could do with a more radical rearrangement. I don't > think main() should be parsing the command line at all, or loading the > tool. If it simply generates a unified vg_argv and passes it to > VG_(main), VG_(main) can do the rest. This means that the address space > should remain padded until VG_(main) is ready to unpad it (ie, after > doing dlopen on the tool). Can you clarify what you see as the conceptual difference between main() and VG_(main)()? One possibility is that main() is a bit like __libc_start_main(), ie sets things like argv/argc up ready for the program. Then VG_(main)() is a bit like main() for a normal program -- start doing real stuff. In which case, moving the command-line parsing to main() makes perfect sense. However, to me, the unpadding feels like it should be done as part of the setup stuff in main(). 1. pad 2. setup argv 3. partially parse argv to find tool 4. load tool.so + valgrind.so 5. do_exec the client binary 6. unpad Because it's all a bit mixed up at the start -- necessary due to tool selection from the command-line -- there doesn't seem to be an obvious cut point between main() and VG_(main)(). N |
|
From: Josef W. <Jos...@gm...> - 2004-01-09 18:53:19
|
Am Mittwoch, 7. Januar 2004 13:04 schrieb Nicholas Nethercote:
> Colon (':') chars? Yes, they wouldn't be allowed, but that doesn't seem
> like a great loss? If that's a problem, maybe --toolname::option, or
> something even more obscure?
In calltree, in some options there can symbol names be specified, e.g. where
to start collecting, and so on. Unfortunately, for C++, colons can be part of
the symbol name.
As my options are like --option=... it would work with your scheme if we only
allow alpha(numeric?) characters as part of a tool/skin name, e.g. not '='.
Is this acceptable?
Josef
|
|
From: Nicholas N. <nj...@ca...> - 2004-01-10 12:56:39
|
On Fri, 9 Jan 2004, Josef Weidendorfer wrote:
> > Colon (':') chars? Yes, they wouldn't be allowed, but that doesn't seem
> > like a great loss? If that's a problem, maybe --toolname::option, or
> > something even more obscure?
>
> In calltree, in some options there can symbol names be specified, e.g. where
> to start collecting, and so on. Unfortunately, for C++, colons can be part of
> the symbol name.
> As my options are like --option=... it would work with your scheme if we only
> allow alpha(numeric?) characters as part of a tool/skin name, e.g. not '='.
> Is this acceptable?
Sounds fine, I'll fix.
N
|
|
From: Jeremy F. <je...@go...> - 2004-01-10 08:23:11
|
On Fri, 2004-01-09 at 09:56, Nicholas Nethercote wrote:
> Behaviour should be the same as before; where VALGRIND_OPTS was read
> before, I know read VALGRIND_OPTS plus the two .rc files.
OK, I think env vars are more "local" though (since they're per-shell),
so VALGRIND_OPTS should take precidence over ~/.valgrindrc (and maybe
./.valgrindrc), but obviously not the actual command line.
> > > I think this also fixes bug 71126, thanks to Tom for his patch.
> >
> > I still think this could do with a more radical rearrangement. I don't
> > think main() should be parsing the command line at all, or loading the
> > tool. If it simply generates a unified vg_argv and passes it to
> > VG_(main), VG_(main) can do the rest. This means that the address space
> > should remain padded until VG_(main) is ready to unpad it (ie, after
> > doing dlopen on the tool).
>
> Can you clarify what you see as the conceptual difference between main()
> and VG_(main)()?
There is none - it's a historical artifact. At one point stage2 and
valgrind.so were separate entities, with stage2 dlopening valgrind.so.
At that point it seemed like a good idea to do all the dlopening in
main(), and set VG_(main) off with everything set up. Now that they're
linked together, it doesn't seem so important.
Part of the complexity of the current initialization is that a lot of
low-level Valgrind mechanisms are set up in VG_(main), but the stuff in
main() should really be using them.
Using VG_(mmap) is mandatory once the padding has been removed, and
VG_(mmap) depends on the allocator having been set up. And because we'd
like to allow the use of any library, we should probably set up
intercept functions for plain old mmap/__libc_mmap/etc.
VG_(mmap/munmap/mprotect/ec) also depend on the Segment list being set
up.
Also, all the error reporting in main() should probably use Valgrind
standard error reporting mechanisms, rather than just fprintf()ing
things.
In other words, it's a mess which just happens to mostly work.
> One possibility is that main() is a bit like __libc_start_main(), ie sets
> things like argv/argc up ready for the program. Then VG_(main)() is a bit
> like main() for a normal program -- start doing real stuff. In which
> case, moving the command-line parsing to main() makes perfect sense.
>
> However, to me, the unpadding feels like it should be done as part of the
> setup stuff in main().
>
> 1. pad
> 2. setup argv
> 3. partially parse argv to find tool
> 4. load tool.so + valgrind.so
> 5. do_exec the client binary
> 6. unpad
>
> Because it's all a bit mixed up at the start -- necessary due to tool
> selection from the command-line -- there doesn't seem to be an obvious cut
> point between main() and VG_(main)().
The pad() really does have to happen very early, because it must get
done before anyone else calls mmap(), but we can unpad pretty late
without consequence (at any time up to running the first client
instruction). Hm. The most important pad, the client address space,
has already been done by stage1, so it is already in place by the time
stage2's main() is called.
Setting up argv currently uses malloc(), which is a bit troublesome,
because it may want to use brk(). brk() happens to work at the moment,
because stage1 sets up the brk segment, but this is non-portable and
unreliable. This means we also have to intercept brk/sbrk and implement
them in terms of VG_(mmap) so that libraries can use them.
In general, for the moment, I'd like to keep using VG_() versions of
functions as much as possible, and move over to the libc versions in a
systematic way. That's because a lot of the VG_() versions are not
simple clones of the libc functions, and/or may have other important
side-effects which we rely on. I broke this rule a lot in stage2.c and
ume.c, mostly because of the historical unavailability of the VG_()
functions in those contexts (and ume.c still needs to link into stage1,
which has no VG_() functions available).
So, I think the init ordering should be:
1. stage1
1. pad the client address space
2. do_exec stage2
2. stage2:main()
1. (do something to cope with .init sections of any
libraries we're using)
2. initialize Segment list to reflect all current mappings
(VG_(init_memory)
3. init Valgrind heap, make mmap() call VG_(mmap), deal
with brk/sbrk.
4. init early error reporting (ie default output sink)
5. set up any other padding
6. collate arguments from all the possible sources,
generating a unified vg_argv[] array
3. stage2:VG_(main)()
1. parse vg_argv
2. print errors/usage/early failures
3. dlopen() the tool
4. init everything else (may need to be after
do_exec(client) if it depends on the shape/placement of
the client address space)
5. remove all padding/clear all mappings out of client
address space
6. do_exec client/set up client stack
7. start VCPU
As you say, the break between main() and VG_(main) is pretty arbitrary.
Perhaps we should just drop stage2.c, move everything into vg_main.c,
rename VG_(main)() to main(), get rid of KickStartParams (another
historical name).
J
|
|
From: Nicholas N. <nj...@ca...> - 2004-01-21 12:02:57
Attachments:
startup-merge.patch
order
|
On Sat, 10 Jan 2004, Jeremy Fitzhardinge wrote: > So, I think the init ordering should be: > > 1. stage1 > 1. pad the client address space > 2. do_exec stage2 > 2. stage2:main() > 1. (do something to cope with .init sections of any > libraries we're using) > 2. initialize Segment list to reflect all current mappings > (VG_(init_memory) > 3. init Valgrind heap, make mmap() call VG_(mmap), deal > with brk/sbrk. > 4. init early error reporting (ie default output sink) > 5. set up any other padding > 6. collate arguments from all the possible sources, > generating a unified vg_argv[] array > 3. stage2:VG_(main)() > 1. parse vg_argv > 2. print errors/usage/early failures > 3. dlopen() the tool > 4. init everything else (may need to be after > do_exec(client) if it depends on the shape/placement of > the client address space) > 5. remove all padding/clear all mappings out of client > address space > 6. do_exec client/set up client stack > 7. start VCPU > > As you say, the break between main() and VG_(main) is pretty arbitrary. > Perhaps we should just drop stage2.c, move everything into vg_main.c, > rename VG_(main)() to main(), get rid of KickStartParams (another > historical name). Attached is a draft patch that does this -- removes stage2.c, moving it into vg_main.c. This allowed me to remove quite a lot of cruft, because the two were pretty closely intertwined. It also includes my .valgrindrc stuff, and a fix for bug 71126, as previously mentioned. There are a couple of shortcomings still: no support for quoting in the .valgrindrc files, which Tom wants, and tool-specific CLOs can't have colons in them, which Josef wants. It seems to work ok (modulo the TLS patch that just went in and breaks pthreaded programs for me, but that happens without my startup-merge patch too). I tried my best to just refactor things, and not change any actual behaviour. I've attached another file that lists all the tasks now done in main(), and their dependencies. It doesn't look very much like your list above, Jeremy; mostly I kept the existing order of things. Valgrind has certainly become a tricky sucker to start up. I think these dependencies should eventually go into comments in main(); there are already some dependencies listed for the code that was VG_(main)(), and they're very useful. There is certainly potential for the order of startup events to change, and for neatening up, and fixing the mentioned shortcomings. However, if nobody objects, I'd like to put this patch in sooner rather than later since any intermediate commits could cause some pretty heinous clashes. N |