|
From: Robert W. <rj...@du...> - 2004-04-12 00:29:36
|
Hi all, I think I finally have a working patch for bug 73655. I've tested it on RedHat 9 and everything appears to be working just fine. I'd really like it if someone could test on something else (Fedora, say) just so I'm not borking any of the other builds. If someone could volunteer for this, I'd really appreciate it. (My Fedora machine is now an Athlon64, which means I can't test this without actually porting Valgrind. Of course, now that my Fedora machine is an Athlon64, I have a lot of motivation to do this. I'll start on this as soon as the contractors put my house back together again and I get a free minute from work... :-) The patch is attached to the bug, or you can grab it from http://www.durables.org/software/valgrind. Regards, Robert. -- Robert Walsh Amalgamated Durables, Inc. "We bring dead things to life" |
|
From: Nicholas N. <nj...@ca...> - 2004-04-13 10:25:51
|
On Sun, 11 Apr 2004, Robert Walsh wrote: > I think I finally have a working patch for bug 73655. I've tested it on > RedHat 9 and everything appears to be working just fine. I'd really like > it if someone could test on something else (Fedora, say) just so I'm not > borking any of the other builds. If someone could volunteer for this, I'd > really appreciate it. I tried it; that test case works on my, er, RH 9 box too. What about the memcheck/tests/new_override.cpp test? The expected output in the .exp file is that Valgrind's 'new' takes precedence, which is actually wrong here. With your patch, I'm still getting V's new taking precedence. Also, you're now generating vg_intercept.c and vg_replace_malloc.c, from files with .vgi extensions... we already have similar files with .base extensions (vg_skin.h.base)... should be consistent here. Finally, how does your patch actually work? What did you change? thanks N |
|
From: Robert W. <rj...@du...> - 2004-04-13 18:57:47
|
> I tried it; that test case works on my, er, RH 9 box too. Yea! Now we know it's safe :-) > What about the memcheck/tests/new_override.cpp test? The expected output > in the .exp file is that Valgrind's 'new' takes precedence, which is > actually wrong here. With your patch, I'm still getting V's new taking > precedence. The test is borked. new Test[2] doesn't ever call the operator new defined in that file because it's looking for operator new[] instead. I'll fix the test in the patch and make a new release of the patch later tonight. > Also, you're now generating vg_intercept.c and vg_replace_malloc.c, from > files with .vgi extensions... we already have similar files with .base > extensions (vg_skin.h.base)... should be consistent here. Yes. I'll do that, too. > Finally, how does your patch actually work? What did you change? First, the problem: the problem was that Valgrind was always intercepting operator new, even when your program defined it's own version of it. This is why new_override.cpp would never have worked even if it had been overriding operator new[]. This was happening because the preloaded vg_inject.so had an exported operator new symbol. This was to work around the following issue: vg_inject.so used an init segment to set up the intercept table. However, there was no guarantee that vg_inject.so would have it's init segment run before any other shared object. This meant that a shared object that had an init section that called operator new might actually have ended up using the glibc version instead of the Valgrind one if the intercept table hadn't been set up yet, causing all sorts of confusion. However, exporting the operator new function in vg_inject.so worked around this issue by ensuring that the Valgrind one was always used. Of course, this meant that the Valgrind one was always used, even if your program defined it's own version. The solution was to get rid of any symbol called "operator new" (or, actually, _Znwj) from the vg_inject.so file. This meant that a program-defined operator new got resolved properly. However, we needed some way of setting up the intercept table before hand. Jeremy and I talked about this for a few hours and decided to go with the solution I adopted. We now mangle the symbols in vg_inject.so so that they can be demangled to indicate exactly what library and function they are overriding. When a .so file is loaded, we slurp in the symbol table, pick out those symbols that match the mangled syntax we defined, demangle the symbol into a library name/function name pair and set up the table that way. In order to make the whole thing a bit more palatable, we use a pre-processor to turn a human-readable version of the file (the .vgi files) into the mangled version. The above also goes for all other overridden symbols, not just operator new. Does that make sense? Thanks for the feedback. Regards, Robert. |
|
From: Nicholas N. <nj...@ca...> - 2004-04-13 20:33:33
|
On Tue, 13 Apr 2004, Robert Walsh wrote: > The test is borked. new Test[2] doesn't ever call the operator new > defined in that file because it's looking for operator new[] instead. > I'll fix the test in the patch and make a new release of the patch later > tonight. Ah, of course. > First, the problem: the problem was that Valgrind was always > intercepting operator new, even when your program defined it's own > version of it. This is why new_override.cpp would never have worked > even if it had been overriding operator new[]. [snip] > Does that make sense? Sounds plausible. Is it documented? I'm not sure if the comments in vg_intercept.vgi now match with the reality. A precis of your description could be what's needed... Also, I see in the generated vg_intercept.c lines like this: int VG_INTERCEPT(soname$3Alibc$2Eso$2E6$3A__raise)(int) How does that work? What's with the dollar signs? N |
|
From: Tom H. <th...@cy...> - 2004-04-13 20:37:55
|
In message <Pin...@ye...>
Nicholas Nethercote <nj...@ca...> wrote:
> Also, I see in the generated vg_intercept.c lines like this:
>
> int VG_INTERCEPT(soname$3Alibc$2Eso$2E6$3A__raise)(int)
>
> How does that work? What's with the dollar signs?
Well looking at it I'd guess $xx is a hex escape, as $2E would be
a '.' character which makes sense for libc.so in that line.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Robert W. <rj...@du...> - 2004-04-13 20:43:46
|
> Sounds plausible. Is it documented? Phngh. > I'm not sure if the comments in > vg_intercept.vgi now match with the reality. A precis of your description > could be what's needed... Okey dokey. I'll put a better description extracted from my message into vg_intercept.c.base. > Also, I see in the generated vg_intercept.c lines like this: > > int VG_INTERCEPT(soname$3Alibc$2Eso$2E6$3A__raise)(int) > > How does that work? What's with the dollar signs? VG_INTERCEPT is a C macro that just sticks a magic preamble onto the symbol so that we can pick it up automatically when we're scanning the symbol table. The dollar symbols are legal in symbols in gcc, believe it or not. We use them to delimit bits of the symbol. The two numbers following it are hex digits of a character that isn't legal in a symbol, so $3A == ':' and $2E == '.'. So this function is: soname:libc.so.6:__raise This is plucked apart by the scanning code to set up the intercept table. This basically says "the function called _vgi__soname$3Alibc$2Eso$2E6$3A__raise is an intercept for __raise in the shared object called libc.so.6." Ugly, but it works. :-) If you can think of a cleaner mechanism, I'd like to hear about it. Regards, Robert. |
|
From: Nicholas N. <nj...@ca...> - 2004-04-13 20:50:40
|
On Tue, 13 Apr 2004, Robert Walsh wrote: > The dollar symbols are legal in symbols in gcc, believe > it or not. We use them to delimit bits of the symbol. The two numbers > following it are hex digits of a character that isn't legal in a symbol, > so $3A == ':' and $2E == '.'. So this function is: > > soname:libc.so.6:__raise > > This is plucked apart by the scanning code to set up the intercept > table. This basically says "the function called > _vgi__soname$3Alibc$2Eso$2E6$3A__raise is an intercept for __raise in > the shared object called libc.so.6." > > Ugly, but it works. :-) If you can think of a cleaner mechanism, I'd > like to hear about it. I probably would have just replaced them with underscores, but hey, if it works... it's not like any compiler other than GCC can compile Valgrind. N |
|
From: Nicholas N. <nj...@ca...> - 2004-04-13 21:02:17
|
On Tue, 13 Apr 2004, Nicholas Nethercote wrote: > I probably would have just replaced them with underscores, but hey, if it > works... it's not like any compiler other than GCC can compile Valgrind. Rereading that... last bit might sound sarcastic, but it wasn't meant to be; certain features (esp. nested functions) mean GCC is the only compiler that can compile Valgrind, AFAIK. N |
|
From: Robert W. <rj...@du...> - 2004-04-13 21:09:27
|
> > I probably would have just replaced them with underscores, but hey, if it > > works... it's not like any compiler other than GCC can compile Valgrind. > > Rereading that... last bit might sound sarcastic, but it wasn't meant to > be; certain features (esp. nested functions) mean GCC is the only > compiler that can compile Valgrind, AFAIK. Right. I spent about 2 whole minutes agonising over what character to use as the escape and choose a dollar symbol. We can replace it with something else if it every becomes an issue. Regards, Robert. |
|
From: Nicholas N. <nj...@ca...> - 2004-04-13 21:23:11
|
On Tue, 13 Apr 2004, Robert Walsh wrote: > The test is borked. new Test[2] doesn't ever call the operator new > defined in that file because it's looking for operator new[] instead. > I'll fix the test in the patch and make a new release of the patch later > tonight. So if a program has its own version of new or new[], Valgrind doesn't know anything about them, and any heap blocks allocated with them won't be checked as normal? Is it possible to have a custom 'new' but a non-custom 'delete'? That could cause false free errors? N |
|
From: Tom H. <th...@cy...> - 2004-04-13 21:37:03
|
In message <Pin...@ye...>
Nicholas Nethercote <nj...@ca...> wrote:
> So if a program has its own version of new or new[], Valgrind doesn't know
> anything about them, and any heap blocks allocated with them won't be
> checked as normal? Is it possible to have a custom 'new' but a non-custom
> 'delete'? That could cause false free errors?
Well it's possible in as much as you could write code to do that, but
it would usually be a bad plan as the custom new would presumably have
allocated memory in some special way that an ordinary delete wouldn't
know how to free.
I guess you might have a custom new that does an ordinary new and then
initialises the memory or something, so that the ordinary delete would
be fine.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: Robert W. <rj...@du...> - 2004-04-13 21:40:09
|
> So if a program has its own version of new or new[], Valgrind doesn't know
> anything about them, and any heap blocks allocated with them won't be
> checked as normal?
Well, most of these new and deletes are going to be doing some sort of
allocation and freeing at the end of the day. I can see several
scenarios:
* custom allocator doesn't actually allocate anything.
* custom allocator allocates memory by doing a call to the glibc
version of the allocator after looking up it's address using
dlsym(RTLD_NEXT, ...).
* custom allocator allocates memory by doing a call to malloc or
some other existing allocator.
* custom allocator allocates memory using some custom mechanism,
such as a memory pool.
In the first scenario, nothing happens. That's just weird anyway, and I
can't imagine it happening very often, but you never know. In any case,
nothing is allocated and life goes on. Valgrind won't mind at all.
In the second scenario, the glibc allocator is eventually called, but
that call gets intercepted by Valgrind and treated as normal. This is
probably most often used for error-checking malloc wrappers and the
like.
In the third scenario, operator new or whatever calls malloc or
something like that. This too will get intercepted by Valgrind as a
regular malloc. This is what the new_override.cpp test does.
In the fourth scenario, everything is up in the air, but I don't see why
Valgrind should care, as long as the underlying memory allocation
mechanism is intercepted.
If I'm forgetting anything, let me know.
> Is it possible to have a custom 'new' but a non-custom
> 'delete'? That could cause false free errors?
Yes, but I don't think they'd be false errors. If your custom operator
new() uses malloc, then your custom operator delete() should use
free... If there is not custom operator delete(), then you're doing
something wrong and Valgrind would be correct in pointing that out.
Regards,
Robert.
|
|
From: Nicholas N. <nj...@ca...> - 2004-04-14 09:07:29
|
On Tue, 13 Apr 2004, Robert Walsh wrote: > Well, most of these new and deletes are going to be doing some sort of > allocation and freeing at the end of the day. I can see several > scenarios: > > * custom allocator doesn't actually allocate anything. > * custom allocator allocates memory by doing a call to the glibc > version of the allocator after looking up it's address using > dlsym(RTLD_NEXT, ...). > * custom allocator allocates memory by doing a call to malloc or > some other existing allocator. > * custom allocator allocates memory using some custom mechanism, > such as a memory pool. > > In the first scenario, nothing happens. That's just weird anyway, and I > can't imagine it happening very often, but you never know. In any case, > nothing is allocated and life goes on. Valgrind won't mind at all. > > In the second scenario, the glibc allocator is eventually called, but > that call gets intercepted by Valgrind and treated as normal. This is > probably most often used for error-checking malloc wrappers and the > like. > > In the third scenario, operator new or whatever calls malloc or > something like that. This too will get intercepted by Valgrind as a > regular malloc. This is what the new_override.cpp test does. > > In the fourth scenario, everything is up in the air, but I don't see why > Valgrind should care, as long as the underlying memory allocation > mechanism is intercepted. > > If I'm forgetting anything, let me know. All sounds good to me. With the fourth scenario, they would have to use client requests as with any other custom allocator. Thanks for clearing this up for me. N |
|
From: Robert W. <rj...@du...> - 2004-04-13 21:47:37
|
Hi all, I've updated the patch on the KDE Bugzilla site with feedback from Nick (comments, filename changes, etc): http://bugs.kde.org/show_bug.cgi?id=73655 Have at it! Regards, Robert. |