|
From: Julian S. <js...@ac...> - 2011-08-11 19:30:43
|
On Monday, July 25, 2011, Jakub Jelinek wrote:
> On Mon, Jul 25, 2011 at 03:05:56PM +0200, Julian Seward wrote:
> > One other thing: when you say "change somewhat the behavior of the
> > program" what do you mean? The redirects are all intended to be
> > functionally identical to the functions they replace.
>
> By that I mean that the .got.plt section with your approach will actually
> contain a pointer to valgrind's replacement. No normal code will probably
> look at that, but perhaps something like ltrace might.

Sorry to go on about this even more .. I've been thinking about the
redirection machinery some more, with a view to fixing 275284
(memmove vs memcpy redirection swampage) and I ended up reading the
long discussion on 206013, the ifunc-redir report, as background.

I'm concerned that the ifunc-redir fix makes the redirection machinery
complex, and fixing 275284 is going to make it even more complex.

So my question: is it actually necessary to redirect ifunc functions?
Couldn't we just add normal redirections for all possible target addresses
that an ifunc could point at? Would that work? There surely can't
be more than a handful of (eg) strlen implementations in libc, can there?

What I'm contemplating is a way to separate the action of writing a
replacement function from that of specifying which functions it is a
replacement for. At the moment these concepts are glued together, and
the fact that some function X is a replacement for Y is encoded in
X's name. Hence if I want to have replacements for both "strlen" and
"__specialised_implementation_of_strlen", I have to invoke the STRLEN
macro in mc_replace_strmem.c once for each name, and I end up with two
copies of the replacement function in vgpreload_memcheck-amd64-linux.so.
Which is stupid.

If such a separation were possible, then I could

(1) write a single replacement for (eg) strlen, then add
    specifications saying that all the known variants of strlen
    in libc should be redirected to it, hence making the ifunc
    stuff more or less redundant, and

(2) write a single memmove replacement, and specify that all
    known libc variants of both memmove and memcpy should be redirected
    to it. This would fix 275284. (Of course it would still be necessary
    to generalise the symbol table stuff to allow multiple symbols at the
    same address, as per https://bugs.kde.org/show_bug.cgi?id=275284#c12)
    This would also fix the problem described in c13 of that bug.

J
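
The separation described in this message could look roughly like the sketch below: one replacement body, plus a table of specifications naming every symbol that should be redirected to it. This is only an illustration and not Valgrind's actual redirection mechanism. The names `redir_spec`, `redir_specs`, `find_replacement` and `replacement_strlen`, the soname pattern "libc.so*", and the variant symbol names are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Generic function-pointer type used to store any replacement. */
typedef void (*generic_fn)(void);

/* Hypothetical: the replacement body, written exactly once. */
size_t replacement_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Hypothetical specification entry: "redirect symbol from_name,
   found in object soname, to the function `to`". */
struct redir_spec {
    const char *soname;     /* object the original function lives in */
    const char *from_name;  /* symbol to be replaced */
    generic_fn  to;         /* shared replacement */
};

/* Several symbols map to the same replacement; no code is duplicated.
   The variant names here are illustrative only. */
const struct redir_spec redir_specs[] = {
    { "libc.so*", "strlen",         (generic_fn)replacement_strlen },
    { "libc.so*", "__strlen_sse2",  (generic_fn)replacement_strlen },
    { "libc.so*", "__strlen_sse42", (generic_fn)replacement_strlen },
};

/* Return the replacement specified for a symbol, or NULL if there is none.
   The soname is compared literally; no wildcard matching is attempted. */
generic_fn find_replacement(const char *soname, const char *name)
{
    size_t count = sizeof redir_specs / sizeof redir_specs[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(redir_specs[i].soname, soname) == 0 &&
            strcmp(redir_specs[i].from_name, name) == 0)
            return redir_specs[i].to;
    }
    return NULL;
}
```

With a table like this, adding a newly discovered libc variant would mean adding one row rather than instantiating another copy of the replacement function.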
|
|
From: Tom H. <to...@co...> - 2011-08-11 23:31:51
|
On 11/08/11 20:27, Julian Seward wrote:

> So my question: is it actually necessary to redirect ifunc functions?
> Couldn't we just add normal redirections for all possible target addresses
> that an ifunc could point at? Would that work? There surely can't
> be more than a handful of (eg) strlen implementations in libc, can there?

Obviously that would work, at least as long as the alternate functions
that the ifunc selects from remain in the symbol table.

The only issues are really those which Jakub pointed out to do with
potentially increased maintenance in having to add new symbols, and the
risk that the functions might disappear from the symbol table altogether
some day.

I wonder if maybe the answer is to go with your plan, but also to
include your previous scheme of intercepting the ifunc call and directing
it to a routine which returns the address of our replacement routine.

Once you separate the defining of the replacement routines from the
declaration of the mapping it might be easy to have a version of the
macro that handles the declaration which also generates a simple wrapper
routine which the ifunc can be redirected to?

Tom

-- 
Tom Hughes (to...@co...)
http://compton.nu/
|
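
One possible shape for the macro suggested above is sketched here: a single declaration records the mapping and also generates a small wrapper that simply returns the replacement's address, which is the job an ifunc resolver has. Everything in the sketch is hypothetical, including `DECLARE_REDIR`, `redir_decl`, the generated `ifunc_wrapper_for_*` names and the `__strlen_sse42` symbol; it does not reflect any existing Valgrind macro.

```c
#include <stddef.h>

/* Generic function-pointer type used to store any replacement. */
typedef void (*generic_fn)(void);

/* Hypothetical mapping record: the symbol being replaced, the shared
   replacement, and a generated wrapper an ifunc could be redirected to. */
struct redir_decl {
    const char *from_name;
    generic_fn  replacement;
    generic_fn (*ifunc_wrapper)(void);
};

/* Hypothetical macro: declares the mapping and generates the wrapper.
   The wrapper takes no arguments and returns the replacement's address. */
#define DECLARE_REDIR(sym, repl)                              \
    generic_fn ifunc_wrapper_for_##sym(void)                  \
    {                                                         \
        return (generic_fn)(repl);                            \
    }                                                         \
    const struct redir_decl redir_decl_for_##sym = {          \
        #sym, (generic_fn)(repl), ifunc_wrapper_for_##sym     \
    }

/* The replacement itself is defined once, separately from any mapping. */
size_t replacement_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Two mappings, one replacement body. */
DECLARE_REDIR(strlen, replacement_strlen);
DECLARE_REDIR(__strlen_sse42, replacement_strlen);
```

Because the wrapper and the replacement are defined in the same object, the wrapper can return the address without any further symbol lookup.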
|
From: Julian S. <js...@ac...> - 2011-08-12 07:44:00
|
> Obviously that would work, at least as long as the alternate functions
> that the ifunc selects from remain in the symbol table.
>
> The only issues are really those which Jakub pointed out to do with
> potentially increased maintenance in having to add new symbols, and the
> risk that the functions might disappear from the symbol table altogether
> some day.

Well, yes .. we can't really rely on them always being in dynsym.
Perhaps better to forget that plan.

> I wonder if maybe the answer is to go with your plan, but also to
> include your previous scheme of intercepting the ifunc call and directing
> it to a routine which returns the address of our replacement routine.

Yes, maybe. One thing though: IIUC, the ifunc will get called as part of
the dynamic linker fixing up the first call to (eg) strlen. Suppose we
redirect the ifunc to a routine which returns the address of our
replacement routine, but for whatever reason, finding that address also
requires resolution by the dynamic linker. So we'd effectively have a
dynamic linker query resulting in a nested query. Is that OK?
(Just trying to think of every way this could screw up before trying
it out ..)

> Once you separate the defining of the replacement routines from the
> declaration of the mapping it might be easy to have a version of the
> macro that handles the declaration which also generates a simple wrapper
> routine which the ifunc can be redirected to?

Yes, that's conceivable. Although I haven't yet thought of a good way
to declare the mapping.

J
|
|
From: Tom H. <to...@co...> - 2011-08-12 07:51:12
|
On 12/08/11 08:41, Julian Seward wrote:

>> Obviously that would work, at least as long as the alternate functions
>> that the ifunc selects from remain in the symbol table.
>>
>> The only issues are really those which Jakub pointed out to do with
>> potentially increased maintenance in having to add new symbols, and the
>> risk that the functions might disappear from the symbol table altogether
>> some day.
>
> Well, yes .. we can't really rely on them always being in dynsym.
> Perhaps better to forget that plan.

Well as Jakub said, they're not in dynsym now. They are in symtab
though, at least at the moment, and we scan both.

>> I wonder if maybe the answer is to go with your plan, but also to
>> include your previous scheme of intercepting the ifunc call and directing
>> it to a routine which returns the address of our replacement routine.
>
> Yes, maybe. One thing though: IIUC, the ifunc will get called as part of
> the dynamic linker fixing up the first call to (eg) strlen. Suppose we
> redirect the ifunc to a routine which returns the address of our
> replacement routine, but for whatever reason, to find that address also
> requires resolution by the dynamic linker. So we'd effectively have a
> dynamic linker query resulting in a nested query. Is that OK?
> (Just trying to think of every way this could screw up before trying
> it out ..)

I don't think that would happen, as the address we would be returning
would be in the same shared object.

Tom

-- 
Tom Hughes (to...@co...)
http://compton.nu/
|
|
From: Julian S. <js...@ac...> - 2011-08-12 10:14:42
|
On Friday, August 12, 2011, Tom Hughes wrote:

> > Well, yes .. we can't really rely on them always being in dynsym.
> > Perhaps better to forget that plan.
>
> Well as Jakub said, they're not in dynsym now. They are in symtab
> though, at least at the moment, and we scan both.

I tried it, just to see what would happen. It's already not-feasible on
Ubuntu 10.04.2. I added a bunch of intercepts for __strcmp_sse42 and
similar, and that makes it possible to run /bin/date, but that's about
all. X applications soon wind up in apparently unlabelled bits of libc:

==20218== Invalid read of size 8
==20218==    at 0x5239052: ??? (strcpy.S:94)
==20218==    by 0x54E68FB: _XlcResolveLocaleName+635 (string3.h:107)
==20218==    by 0x54E9FE0: initialize+352 (lcPublic.c:234)
==20218==    by 0x54E9512: initialize+34 (lcGeneric.c:1017)

So we can knock that one on the head.

J
|
|
From: Julian S. <js...@ac...> - 2011-08-12 14:00:30
|
Jakub,

> >> I wonder if maybe the answer is to go with your plan, but also to
> >> include your previous scheme of intercepting the ifunc call and directing
> >> it to a routine which returns the address of our replacement routine.

Where in glibc should I look to see an example of an ifunc? Is (eg) the
function below, from ./sysdeps/x86_64/multiarch/strcmp.S, an ifunc?

I ask because I am trying to make sense of Dodji's comment at
https://bugs.kde.org/show_bug.cgi?id=206013#c5
which implies there is some problem with intercepting the ifunc
routine. I don't really understand the problem in this #c5. Seems
like it implies that the ifunc itself should update the GOT entry,
but that can't be right.

Thanks.

J

/* Define multiple versions only for the definition in libc.  Don't
   define multiple versions for strncmp in static library since we
   need strncmp before the initialization happened.  */
#if (defined SHARED || !defined USE_AS_STRNCMP) && !defined NOT_IN_libc
        .text
ENTRY(STRCMP)
        .type   STRCMP, @gnu_indirect_function
        cmpl    $0, __cpu_features+KIND_OFFSET(%rip)
        jne     1f
        call    __init_cpu_features
1:      leaq    STRCMP_SSE42(%rip), %rax
        testl   $bit_SSE4_2, __cpu_features+CPUID_OFFSET+index_SSE4_2(%rip)
        jnz     2f
        leaq    STRCMP_SSSE3(%rip), %rax
        testl   $bit_SSSE3, __cpu_features+CPUID_OFFSET+index_SSSE3(%rip)
        jnz     2f
        leaq    STRCMP_SSE2(%rip), %rax
2:      ret
END(STRCMP)
|
|
From: Tom H. <to...@co...> - 2011-08-12 14:07:03
|
On 12/08/11 14:57, Julian Seward wrote:

> Where in glibc should I look to see an example of an ifunc? Is (eg) the
> function below, from ./sysdeps/x86_64/multiarch/strcmp.S, an ifunc?

That looks like one.

> which implies there is some problem with intercepting the ifunc
> routine. I don't really understand the problem in this #c5. Seems
> like it implies that the ifunc itself should update the GOT entry,
> but that can't be right.

I think the dynamic linker will do the update - if the symbol is of type
STT_GNU_IFUNC then it will call the associated address (which would be
the routine you quoted) then store the result (just as it would normally
store the result of symbol resolution for a function on the first call)
and then do the initial call.

Future calls would go directly to the resolved address.

Tom

-- 
Tom Hughes (to...@co...)
http://compton.nu/
|
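
The sequence described in this message (call the resolver, store its result, make the initial call, then go direct on later calls) can be modelled in plain C as below. This is a simulation only, not the real dynamic linker: `got_slot` stands in for the GOT entry, `lazy_stub` for the linker's first-call fixup, and `strcmp_ifunc` for a resolver like the one quoted from strcmp.S. All names, and the `cpu_has_fast_path` flag, are assumptions made for the example.

```c
#include <stddef.h>

typedef int (*strcmp_fn)(const char *, const char *);

/* Counts how often the resolver runs, so the "only once" claim is visible. */
int resolver_calls = 0;

/* Stand-in for the baseline implementation. */
int strcmp_generic(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Stand-in for a CPU-specific variant; same behaviour in this model. */
int strcmp_fast(const char *a, const char *b)
{
    return strcmp_generic(a, b);
}

/* Hypothetical CPU feature flag consulted by the resolver. */
int cpu_has_fast_path = 0;

/* The "ifunc": it only returns an address and never touches the slot. */
strcmp_fn strcmp_ifunc(void)
{
    resolver_calls++;
    return cpu_has_fast_path ? strcmp_fast : strcmp_generic;
}

int lazy_stub(const char *a, const char *b);

/* Stand-in for the GOT entry; initially points at the fixup stub. */
strcmp_fn got_slot = lazy_stub;

/* Stand-in for the dynamic linker's fixup on the first call. */
int lazy_stub(const char *a, const char *b)
{
    got_slot = strcmp_ifunc();  /* the "linker" stores the resolver's result */
    return got_slot(a, b);      /* then performs the initial call */
}

/* What a caller does: always an indirect call through the slot. */
int call_strcmp(const char *a, const char *b)
{
    return got_slot(a, b);
}
```

In this model the slot is written by the fixup code, never by the resolver, which matches the reading that the ifunc itself has no need to know where the GOT entry is.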
|
From: Julian S. <js...@ac...> - 2011-08-12 14:16:34
|
On Friday, August 12, 2011, Tom Hughes wrote:

> I think the dynamic linker will do the update - if the symbol is of type
> STT_GNU_IFUNC then it will call the associated address (which would be
> the routine you quoted) then store the result (just as it would normally
> store the result of symbol resolution for a function on the first call)
> and then do the initial call.
>
> Future calls would go directly to the resolved address.

Right -- that's also my understanding. And it matches the ifunc in my
previous message, which clearly does not mess with the GOT (also, how
would it know where it was?)

So I don't understand what problem Dodji is referring to in 206013#c5.

J
|