From: <sv...@va...> - 2007-03-11 13:00:42
|
Author: sewardj Date: 2007-03-11 13:00:34 +0000 (Sun, 11 Mar 2007) New Revision: 6640 Log: It appears glibc-2.5's getenv() function steps along environment strings in 16-bit chunks, which can cause false errors in some cases (sigh). So do the usual thing and replace it. Modified: trunk/memcheck/mc_replace_strmem.c Modified: trunk/memcheck/mc_replace_strmem.c =================================================================== --- trunk/memcheck/mc_replace_strmem.c 2007-03-10 02:27:44 UTC (rev 6639) +++ trunk/memcheck/mc_replace_strmem.c 2007-03-11 13:00:34 UTC (rev 6640) @@ -670,6 +670,40 @@ GLIBC25_MEMPCPY(m_ld_so_1, mempcpy) /* ld.so.1 */ +/* getenv. glibc-2.5 steps along the env strings in 2 byte chunks + which means it sometimes overreads. sigh. */ +#define GLIBC25_GETENV(soname, fnname) \ + char* VG_REPLACE_FUNCTION_ZU(soname,fnname)( const char* name0 ); \ + char* VG_REPLACE_FUNCTION_ZU(soname,fnname)( const char* name0 ) \ + { \ + char** ep; \ + char* cand; \ + char* name; \ + extern char** __environ; \ + if (__environ == NULL || name0 == NULL || name0[0] == '\0') \ + return NULL; \ + for (ep = __environ; *ep; ep++) { \ + cand = *ep; \ + name = (char*)name0; \ + /* advance cand and name until either points at zero or \ + until what they both point at differs. */ \ + while (1) { \ + if (*cand == 0 || *name == 0) \ + break; \ + if (*cand != *name) \ + break; \ + cand++; \ + name++; \ + } \ + if (*name == 0 && *cand == '=') \ + return cand+1; \ + } \ + return NULL; \ + } + +GLIBC25_GETENV(m_libc_soname, getenv) + + /*------------------------------------------------------------*/ /*--- AIX stuff only after this point ---*/ /*------------------------------------------------------------*/ |
From: Dirk M. <dm...@gm...> - 2007-03-11 13:58:40
|
On Sunday, 11. March 2007, sv...@va... wrote: > It appears glibc-2.5's getenv() function steps along environment > strings in 16-bit chunks, which can cause false errors Are you sure that they're false? Dirk |
From: Julian S. <js...@ac...> - 2007-03-11 14:22:42
|
On Sunday 11 March 2007 13:58, Dirk Mueller wrote: > On Sunday, 11. March 2007, sv...@va... wrote: > > It appears glibc-2.5's getenv() function steps along environment > > strings in 16-bit chunks, which can cause false errors > > Are you sure that they're false? Well, um, err, not sure. Good question. I found this when running a complete KDE session on V (openSUSE 10.2, amd64) and got loads of errors from getenv(). I added the replacement function and the noise went away, so it's not KDE that was causing it, which was my immediate concern. Whether glibc's getenv() is broken, or just optimised too much for memcheck, I don't know. Now you mention it, maybe there is something bad going on. At the start of glibc-2.5/stdlib/getenv.c is this: /* Return the value of the environment variable NAME. This implementation is tuned a bit in that it assumes no environment variable has an empty name which of course should always be true. We have a special case for one character names so that for the general case we can assume at least two characters which we can access. By doing this we can avoid using the `strncmp' most of the time. */ but I don't have the energy to dig through the complicated logic and figure out what's going on. The errors were of the form "invalid read of size 2 at the start of a block of size 1". From the glibc getenv() code I don't think that is the name to be looked up, but instead an entry in __environ. So a 1-char entry only has space for "=", so perhaps something in KDE added some bogus entries to the environment and it is then hitting the assumption in the glibc code that no variable has an empty name. So perhaps it is a bug in KDE. Who knows. J |
From: Julian S. <js...@ac...> - 2007-03-11 18:31:57
|
On Sunday 11 March 2007 14:20, Julian Seward wrote: > On Sunday 11 March 2007 13:58, Dirk Mueller wrote: > > On Sunday, 11. March 2007, sv...@va... wrote: > > > It appears glibc-2.5's getenv() function steps along environment > > > strings in 16-bit chunks, which can cause false errors > > > > Are you sure that they're false? > > Well, um, err, not sure. Good question. Maybe glibc's getenv is fine and the problem is there are invalid strings in the environment. The only way I can reproduce this is to add (putenv) a non-terminated string containing just '=' and then do a getenv: #include <stdlib.h> #include <assert.h> #include <stdio.h> int main ( void ) { int r; char* p; p = malloc(1); assert(p); p[0] = '='; r = putenv(p); assert(r == 0); p = getenv("XYZZY"); printf("p = %p\n", p); return 0; } gives ==27447== Invalid read of size 2 ==27447== at 0x4E593FD: getenv (in /lib64/libc-2.5.so) ==27447== by 0x4006A5: main (nullenv.c:14) ==27447== Address 0x4049030 is 0 bytes inside a block of size 1 alloc'd ==27447== at 0x4C22A76: malloc (vg_replace_malloc.c:207) ==27447== by 0x400655: main (nullenv.c:10) Passing any kind of properly terminated string to putenv makes the error go away. So perhaps the environment is constructed wrongly. The first sign of trouble in a KDE run is this: (4) ==27107== Invalid read of size 2 (4) ==27107== at 0x6DE43FD: getenv (in /lib64/libc-2.5.so) (4) ==27107== by 0x4082D7: (within /opt/kde3/bin/kdeinit) (4) ==27107== by 0x408CCE: (within /opt/kde3/bin/kdeinit) (4) ==27107== by 0x6DCFAE3: (below main) (in /lib64/libc-2.5.so) (4) ==27107== Address 0x40708e0 is 0 bytes inside a block of size 1 alloc'd (4) ==27107== at 0x4C22A76: malloc (vg_replace_malloc.c:207) (4) ==27107== by 0x6E25AB1: strdup (in /lib64/libc-2.5.so) (4) ==27107== by 0x404E20: (within /opt/kde3/bin/kdeinit) (4) ==27107== by 0x408CBD: (within /opt/kde3/bin/kdeinit) (4) ==27107== by 0x6DCFAE3: (below main) (in /lib64/libc-2.5.so) Are there debuginfo packages available for KDE on SuSE 10.2 so we can find out who is calling strdup? J |
From: Dirk M. <dm...@gm...> - 2007-03-21 16:04:41
|
On Sunday, 11. March 2007, Julian Seward wrote: > Maybe glibc's getenv is fine and the problem is there are invalid > strings in the environment. The only way I can reproduce this is to > add (putenv) a non-terminated string containing just '=' and then do > a getenv: Yep, I can confirm. I couldn't find another issue either. So there seems to be some issue with the code. are you reverting your getenv wrapper? Dirk |
From: Julian S. <js...@ac...> - 2007-03-21 16:13:06
|
On Wednesday 21 March 2007 16:04, Dirk Mueller wrote: > On Sunday, 11. March 2007, Julian Seward wrote: > > Maybe glibc's getenv is fine and the problem is there are invalid > > strings in the environment. The only way I can reproduce this is to > > add (putenv) a non-terminated string containing just '=' and then do > > a getenv: > > Yep, I can confirm. I couldn't find another issue either. So there seems to > be some issue with the code. are you reverting your getenv wrapper? Dirk, thanks for the confirmation. Yes, the getenv wrapper is already reverted. J |
From: Dirk M. <dm...@gm...> - 2007-03-29 21:19:35
Attachments:
putenv-wrapper.diff
|
On Wednesday, 21. March 2007, Julian Seward wrote: > Dirk, thanks for the confirmation. Yes, the getenv wrapper is already > reverted. Ah, sorry, didn't see it. I've tried to implement a putenv function wrapper in order to be able to pinpoint the place where undefined values are put into the environment (setenv is also to be fixed similarly) instead of having to trace miracle errors during getenv time. the patch is below, and its not working. probably I'm doing something stupid, but I can't find an example for a wrapper function anywhere. Any insights much appreciated. Thanks, Dirk |
From: Julian S. <js...@ac...> - 2007-03-29 21:39:44
|
Dirk > the patch is below, and its not working. probably I'm doing something > stupid, No - this is subtle. > but I can't find an example for a wrapper function anywhere. auxprogs/libmpiwrap.c has a huge collection of them. You've hit the distinction between replacement functions and wrapper functions. A replacement (eg malloc) simply replaces some function; there is no way to call onwards to the original. A wrapper function is more powerful - it gives you a way to find out the original and call onwards to it. I think your wrapper might produce an infinite loop, since the call to putenv just goes back to the wrapper. Try the code below. All this strange stuff is documented at http://www.valgrind.org/docs/manual/manual-core.html#manual-core.wrapping J /* putenv */ int VG_WRAP_FUNCTION_ZU(m_libc_soname, putenv) (char* string); int VG_WRAP_FUNCTION_ZU(m_libc_soname, putenv) (char* string) { OrigFn fn; char* p; VALGRIND_GET_ORIG_FN(fn); // remember the original p = string; if (string) { int len = 0; while (*p++) len++; VALGRIND_CHECK_MEM_IS_DEFINED(string, len+1); } // call original, ultra-magic hacks to avoid recursion unsigned long tmp; CALL_FN_W_W(tmp, fn, (unsigned long)string); return (int)tmp; } |
From: Dirk M. <dm...@gm...> - 2007-03-29 23:11:43
Attachments:
putenv-wrapper.diff
|
On Thursday, 29. March 2007, Julian Seward wrote: > > the patch is below, and its not working. probably I'm doing something > > stupid, > No - this is subtle. Hmm, no :) I was actually looking for something like the call wrapper, but I must have been blind. Thanks a lot for your help! Patch is below. The reason glibc doesn't trigger the memcheck traces itself is that it never actually touches the memory, it just stores the pointer to it in the environment table and is done with it. Ok? Thanks, Dirk |
From: Julian S. <js...@ac...> - 2007-03-29 23:32:26
|
> Patch is below. Nice! Did it allow you to track down the original problem with env vars (now lost in the mists of time) ? My only concerns are (mostly the 2nd point): - could you change 'int result' to 'Word result' (64-bit paranoia w.r.t the horrible CALL_FN_W.. macros) - how do we stop some future gcc-5.7.2 from noticing that "if (p) while (*p++) ;" does not compute anything useful and and therefore optimising it away? Only solution I can think of (is horrible) is: if (p) { while (*p++) ; __asm__ __volatile__( "" : /*out*/ : /*in*/ "r"(p) ); } Since gcc has no way to know what is going in inside the asm, it has to believe our claim that it reads p. Nicer suggestions welcomed. J |
From: Nicholas N. <nj...@cs...> - 2007-03-29 23:40:11
|
On Fri, 30 Mar 2007, Julian Seward wrote: >> Patch is below. > > Nice! Did it allow you to track down the original problem with > env vars (now lost in the mists of time) ? So you've essentially added eager definedness checking for environment variables -- is that right? Nick |
From: Julian S. <js...@ac...> - 2007-03-29 23:59:42
|
On Friday 30 March 2007 00:40, Nicholas Nethercote wrote: > On Fri, 30 Mar 2007, Julian Seward wrote: > >> Patch is below. > > > > Nice! Did it allow you to track down the original problem with > > env vars (now lost in the mists of time) ? > > So you've essentially added eager definedness checking for environment > variables -- is that right? Oh, yes, you're right. Hang on. That's not good. The original reason I wrote a getenv wrapper is I thought that the glibc one over-read memory (like memcpy etc). But I was wrong, it was the application putting bogus strings into the environment. So now Dirk is trying to detect this when at the putenv/addenv point rather than later in getenv, when it is really too late. Sounds reasonable; on the other hand it's a bit specialised and would cause a false error in the case where you added an undefined string to the environment and then never looked it up. Should be be in the business of checking arguments to arbitrary libc functions? I had not realised this before, but the function wrapping stuff makes it possible to do that if we want. J |
From: Ashley P. <as...@qu...> - 2007-03-30 09:51:12
|
Julian Seward wrote: > Should be be in the business of checking arguments to arbitrary libc > functions? I had not realised this before, but the function wrapping > stuff makes it possible to do that if we want. It would be a nice thing to have although a whole heap of work to do it properly. Question is though should memcheck be doing this or would it require it's own tool? How much checking could you perform by taking the output of "valgrind --tool=callgrind --ct-verbose=1 ..." and piping it into another application? There are a number of functions that are done already (malloc(-1) anyone?) but they could be better because currently they only print an error, not the stack and don't play well with the xml output, changing this is another thing that's on my TODO list but I'm not sure what the correct thing to do is... Ashley, |
From: Nicholas N. <nj...@cs...> - 2007-03-30 00:41:05
|
On Fri, 30 Mar 2007, Julian Seward wrote: >> So you've essentially added eager definedness checking for environment >> variables -- is that right? > > Oh, yes, you're right. Hang on. That's not good. > [...] > Should be be in the business of checking arguments to arbitrary libc > functions? I had not realised this before, but the function wrapping > stuff makes it possible to do that if we want. Dunno, it's a bit of a slippery slope. It's a shame the origin-tracking doesn't work very well. Nick |
From: Josef W. <Jos...@gm...> - 2007-03-30 09:24:50
|
On Friday 30 March 2007, Nicholas Nethercote wrote: > On Fri, 30 Mar 2007, Julian Seward wrote: > > >> So you've essentially added eager definedness checking for environment > >> variables -- is that right? > > > > Oh, yes, you're right. Hang on. That's not good. > > [...] > > Should be be in the business of checking arguments to arbitrary libc > > functions? I had not realised this before, but the function wrapping > > stuff makes it possible to do that if we want. > > Dunno, it's a bit of a slippery slope. It's a shame the origin-tracking > doesn't work very well. What are the big problems there? Josef |
From: Nicholas N. <nj...@cs...> - 2007-04-02 03:15:18
|
On Fri, 30 Mar 2007, Josef Weidendorfer wrote: >> Dunno, it's a bit of a slippery slope. It's a shame the origin-tracking >> doesn't work very well. > > What are the big problems there? See this paper: http://www.cs.mu.oz.au/~njn/pubs/origin-tracking-TR2007.pdf You only need to read sections 1 and 6 to understand the Memcheck origin tracking. Nick |
From: Dirk M. <dm...@gm...> - 2007-03-30 09:47:01
|
On Friday, 30. March 2007, Julian Seward wrote: > Oh, yes, you're right. Hang on. That's not good. I've thought about this some days ago. the reason why environment is special compared to other glibc functions is that it is something that interacts with process-external (other subprocesses for example). So it falls into the same class like the defined-ness checking for write(2) or similar syscallls, with the difference, that environment manipulation doesn't involve any significant syscalls (brk() at most). Thats why I want to add these special wrappers. No, I haven't used the code myself yet, but I've added it to the suse package and requested the bug reporter to test it ;) > Sounds reasonable; on the other hand it's a bit specialised and > would cause a false error in the case where you added an undefined > string to the environment and then never looked it up. Well, its the same like you write uninitialized memory to disk and never use it again. Plus you can still write a suppression if you really don't like it :) > Should be be in the business of checking arguments to arbitrary libc > functions? I had not realised this before, but the function wrapping > stuff makes it possible to do that if we want. A more generalized approach would be preferred, yes. So still ok to go ahead with the patch? Dirk |
From: Dirk M. <dm...@gm...> - 2007-03-30 09:56:27
|
On Friday, 30. March 2007, Julian Seward wrote: > - could you change 'int result' to 'Word result' (64-bit paranoia > w.r.t the horrible CALL_FN_W.. macros) done. > - how do we stop some future gcc-5.7.2 from noticing that > "if (p) while (*p++) ;" does not compute anything useful and > and therefore optimising it away? it is a documented feature of gcc to not optimize away empty loops. > Since gcc has no way to know what is going in inside the asm, it > has to believe our claim that it reads p. another solution would be to calculate len and store it in a volatile variable. Dirk |
From: Ashley P. <as...@qu...> - 2007-03-30 10:00:57
|
Dirk Mueller wrote: > Plus you can still write a suppression if you really don't like > it :) Can you? As it's a function wrapper isn't it run in the client address space and and therefore currently unsuppressable? This may have changed without me noticing though. Ashley, |
From: Duncan S. <bal...@fr...> - 2007-03-30 10:13:45
|
> it is a documented feature of gcc to not optimize away empty loops. Not anymore, this feature was removed in the gcc 4.x series. Duncan. |
From: Dirk M. <dm...@gm...> - 2007-03-30 10:37:02
|
On Friday, 30. March 2007, Duncan Sands wrote: > > it is a documented feature of gcc to not optimize away empty loops. > Not anymore, this feature was removed in the gcc 4.x series. Not entirely true: the documentation says: However, the rationale here is that optimization of a nonempty loop cannot produce an empty one. This held for carefully written C compiled with less powerful optimizers but is not always the case for carefully written C++ or with more powerful optimizers. Thus GCC will remove operations from loops whenever it can determine those operations are not externally visible (apart from the time taken to execute them, of course). In case the loop can be proved to be finite, GCC will also remove the loop itself. In this case, there is a memory dereference involved (externally visible) and the compiler cannot prove that the loop is finite (since it doesn't know exit value of p). the next two levels of making sure the compiler doesn't optimize is volatile'ing p and calling an extern non-inline function with p after the loop. (which is a bit difficult because the vgpreload consists of only one compilation unit, I'd have to add a second one for that). Dirk |
From: Dirk M. <dm...@gm...> - 2007-03-30 10:39:17
|
On Friday, 30. March 2007, Ashley Pittman wrote: > Can you? As it's a function wrapper isn't it run in the client address > space and and therefore currently unsuppressable? This may have changed > without me noticing though. Just because it is run in the client it is suppressable (?). In any case created a suppression to test this behaviour and it worked fine for me. Dirk |
From: Ashley P. <as...@qu...> - 2007-03-30 10:45:33
|
Dirk Mueller wrote: > On Friday, 30. March 2007, Ashley Pittman wrote: > >> Can you? As it's a function wrapper isn't it run in the client address >> space and and therefore currently unsuppressable? This may have changed >> without me noticing though. > > Just because it is run in the client it is suppressable (?). In any case > created a suppression to test this behaviour and it worked fine for me. Errors that originate from client requests are reported as "Uninitialised byte(s) found during client check request" and I don't think these can be suppressed currently, it should be fairly easy to add however. The other slightly annoying thing about them is that they don't report the base or the length of the client request. Ashley, |
From: Dirk M. <dm...@gm...> - 2007-03-30 11:01:54
|
On Friday, 30. March 2007, Ashley Pittman wrote: > Errors that originate from client requests are reported as > "Uninitialised byte(s) found during client check request" and I don't > think these can be suppressed currently, it should be fairly easy to add > however. The other slightly annoying thing about them is that they > don't report the base or the length of the client request. I've removed the client request in the 2nd incarnation of the patch, since the string length determination did already issue a memcheck trace, the valgrind client request was just duplicating the same error. Greetings, Dirk |
From: Nicholas N. <nj...@cs...> - 2007-04-01 08:40:18
|
On Fri, 30 Mar 2007, Dirk Mueller wrote: > I've thought about this some days ago. the reason why environment is special > compared to other glibc functions is that it is something that interacts with > process-external (other subprocesses for example). So it falls into the same > class like the defined-ness checking for write(2) or similar syscallls, with > the difference, that environment manipulation doesn't involve any significant > syscalls (brk() at most). Thats why I want to add these special wrappers. Hmm, that does sound reasonable. Nick |