|
From: Alan J. <aj...@st...> - 2008-06-17 19:29:53
|
I found a memory error in depmod(8). I vaguely remembered an LKML
discussion on a similar topic but I couldn't find it. Now I've tried to
describe the problem I find I've answered most of my questions. I'd
still appreciate some comment, even if it's just "Yes, we know".
The invalid read is caused by strlen() is reading past the end of a
malloc()ed string. For efficiency it is reading the string 4 bytes at a
time. valgrind reports this as an error.
I believe this is acceptable *if the reads are guaranteed to be
aligned*. In this case they are, because the string is a variable
length array at the end of a C structure, and the previous field is a
pointer. The string will start on an aligned address, so the read which
includes the terminator will also be aligned. There's no chance of it
crossing a page boundary and provoking a segfault.
This is GCC at it's scariest. If I add a field "char pad;" just before
the variable length array, then the memory errors go away (it calls the
out-of-line strlen() which makes no alignment assumptions).
The problem is that valgrind obviously doesn't have all the information
GCC has. Should valgrind ignore an aligned read which straddles the end
of a malloc()ed block? Or should it be reported as a "Possibly invalid
read", by analogy with the line "possibly lost" in the leak summary?
The read can't cause any harm, but in other cases it might be an
indication of a bug.
OK, I've just found "--partial-loads-ok=yes" which is supposed to let
you ignore these errors. Unfortunately it doesn't seem to work :-(. I
still get the same errors. Also the manpage says this is an option of
last resort for code which violates the C standards. I don't think the
standards didn't consider what instructions GCC might generate as
optimizations. Possibly the documentation should add that GCC may
generate such code of it's own accord.
Sigh. Another piece of hard-won obscure knowledge to add to my
collection. I hope the act of posting it makes it slightly less obscure.
$ valgrind ./depmod -n > /dev/null
==24016== Memcheck, a memory error detector.
==24016== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==24016== Using LibVEX rev 1854, a library for dynamic binary translation.
==24016== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==24016== Using valgrind-3.3.1, a dynamic binary instrumentation framework.
==24016== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==24016== For more details, rerun with: -v
==24016==
==24016== Invalid read of size 4
==24016== at 0x401612: output_aliases (depmod.c:760)
==24016== by 0x40346D: main (depmod.c:1204)
==24016== Address 0x52bb494 is 252 bytes inside a block of size 255 alloc'd
==24016== at 0x4C22FEB: malloc (vg_replace_malloc.c:207)
==24016== by 0x4028D5: grab_module (depmod.c:335)
==24016== by 0x402A85: do_module (depmod.c:597)
==24016== by 0x401B54: grab_dir (depmod.c:641)
==24016== by 0x401B2C: grab_dir (depmod.c:654)
==24016== by 0x401B2C: grab_dir (depmod.c:654)
==24016== by 0x4034EF: main (depmod.c:676)
I stripped down output_aliases() for the purposes of reproduction:
static void output_aliases(struct module *modules, FILE *out)
{
struct module *i;
const char *p;
unsigned long size;
fprintf(out, "# Aliases extracted from modules themselves.\n");
for (i = modules; i; i = i->next) {
/* In place of more complex code using strlen() */
fprintf(out, "%d\n", (int) strlen(i->pathname));
}
}
And here's the disassembly. Note the "add $0x4" and the magic that follows:
.../depmod.c:760
401608: 48 8d b3 d0 00 00 00 lea 0xd0(%rbx),%rsi
40160f: 48 89 f2 mov %rsi,%rdx
401612: 8b 02 mov (%rdx),%eax ;<-- 4 byte
read here <--
401614: 48 83 c2 04 add $0x4,%rdx
401618: 8d 88 ff fe fe fe lea -0x1010101(%rax),%ecx
40161e: f7 d0 not %eax
401620: 21 c1 and %eax,%ecx
401622: 81 e1 80 80 80 80 and $0x80808080,%ecx
401628: 74 e8 je 401612 <output_aliases+0x32>
40162a: 89 c8 mov %ecx,%eax
40162c: 48 89 ef mov %rbp,%rdi
40162f: c1 e8 10 shr $0x10,%eax
401632: f7 c1 80 80 00 00 test $0x8080,%ecx
401638: 0f 44 c8 cmove %eax,%ecx
40163b: 48 8d 42 02 lea 0x2(%rdx),%rax
40163f: 48 0f 44 d0 cmove %rax,%rdx
401643: 00 c9 add %cl,%cl
401645: 48 83 da 03 sbb $0x3,%rdx
401649: 31 c0 xor %eax,%eax
40164b: 48 29 f2 sub %rsi,%rdx
40164e: be 8b 7f 40 00 mov $0x407f8b,%esi
401653: e8 f8 fd ff ff callq 401450 <fprintf@plt>
|
|
From: Nicholas N. <nj...@cs...> - 2008-06-18 01:19:57
|
On Tue, 17 Jun 2008, Alan Jenkins wrote: > The invalid read is caused by strlen() is reading past the end of a > malloc()ed string. For efficiency it is reading the string 4 bytes at a > time. valgrind reports this as an error. > > I believe this is acceptable *if the reads are guaranteed to be > aligned*. In this case they are, because the string is a variable > length array at the end of a C structure, and the previous field is a > pointer. The string will start on an aligned address, so the read which > includes the terminator will also be aligned. There's no chance of it > crossing a page boundary and provoking a segfault. Valgrind tries to replace strlen with its own, less-optimised version, for exactly this reason. Sometimes it doesn't work, eg. if your libc.so is stripped (IIRC). As for "acceptable"... there's "acceptable according to the C standard" and "acceptable because it works on pretty much all modern machines", and those two aren't quite the same. Valgrind/Memcheck tries to find a happy medium between those two, and --partial-loads-ok=yes is one example where you get to choose what behaviour you want. It's surprising that --partial-loads-ok=yes isn't working, though. Nick |
|
From: Alan J. <aj...@st...> - 2008-06-18 09:59:50
|
Nicholas Nethercote wrote: > On Tue, 17 Jun 2008, Alan Jenkins wrote: > >> The invalid read is caused by strlen() is reading past the end of a >> malloc()ed string. For efficiency it is reading the string 4 bytes at a >> time. valgrind reports this as an error. >> >> I believe this is acceptable *if the reads are guaranteed to be >> aligned*. In this case they are, because the string is a variable >> length array at the end of a C structure, and the previous field is a >> pointer. The string will start on an aligned address, so the read which >> includes the terminator will also be aligned. There's no chance of it >> crossing a page boundary and provoking a segfault. > > Valgrind tries to replace strlen with its own, less-optimised version, > for exactly this reason. Sometimes it doesn't work, eg. if your > libc.so is stripped (IIRC). I think it can't work in my case because GCC has inlined it all (I think it's a "builtin function"). If you look at the assembly, I can't see what valgrind could replace here. I went back and checked - there's no error if I compile without optimisations (-O0). Also, this means I can't ignore this error with a single suppression for strlen(); I need a suppression for every function with this inlined strlen(). > As for "acceptable"... there's "acceptable according to the C > standard" and "acceptable because it works on pretty much all modern > machines", and those two aren't quite the same. Valgrind/Memcheck > tries to find a happy medium between those two, and > --partial-loads-ok=yes is one example where you get to choose what > behaviour you want. I think this is a grey area. Not because it happens to work on my machine, but because AFAIK the code responsible is not actually a C function. It's a GCC builtin that's been inlined. Alternatively, you can look it at as a scarily good compiler optimisation. My understanding is that the C standards don't say anything about the exact memory accesses a compiler can generate, other than specifying aliasing rules and volatile objects. I'm happy that valgrind warns about it though, especially since there's an option to control the warnings. > It's surprising that --partial-loads-ok=yes isn't working, though. I've figured that out now too. The option works on my 32 bit system. It does not work on my 64 bit system. On both systems, the invalid read is of size 4. "--partial-loads-ok=yes" is working as documented; it only applies to "word-sized, word-aligned loads". This specific problem wouldn't happen if GCC had a 64-bit version of the aggresively optimised strlen() builtin. Thanks for letting me share my problem. Alan |
|
From: Nicholas N. <nj...@cs...> - 2008-06-18 12:11:42
|
On Wed, 18 Jun 2008, Alan Jenkins wrote: >> As for "acceptable"... there's "acceptable according to the C standard" and >> "acceptable because it works on pretty much all modern machines", and those >> two aren't quite the same. Valgrind/Memcheck tries to find a happy medium >> between those two, and --partial-loads-ok=yes is one example where you get >> to choose what behaviour you want. > > I think this is a grey area. Not because it happens to work on my machine, > but because AFAIK the code responsible is not actually a C function. It's a > GCC builtin that's been inlined. Alternatively, you can look it at as a > scarily good compiler optimisation. My understanding is that the C standards > don't say anything about the exact memory accesses a compiler can generate, > other than specifying aliasing rules and volatile objects. I believe there's something saying that you can move an array pointer one element past the end of an array, but nowhere else outside its array. But Valgrind works at the binary level, and it may not even be C code that its analysing, so that makes things even more complicated. In short, you hit one of the tricky cases. Nick |
|
From: Alan J. <aj...@st...> - 2008-06-18 12:25:03
|
Nicholas Nethercote wrote: > On Wed, 18 Jun 2008, Alan Jenkins wrote: > >>> As for "acceptable"... there's "acceptable according to the C >>> standard" and "acceptable because it works on pretty much all modern >>> machines", and those two aren't quite the same. Valgrind/Memcheck >>> tries to find a happy medium between those two, and >>> --partial-loads-ok=yes is one example where you get to choose what >>> behaviour you want. >> >> I think this is a grey area. Not because it happens to work on my >> machine, but because AFAIK the code responsible is not actually a C >> function. It's a GCC builtin that's been inlined. Alternatively, >> you can look it at as a scarily good compiler optimisation. My >> understanding is that the C standards don't say anything about the >> exact memory accesses a compiler can generate, other than specifying >> aliasing rules and volatile objects. > > I believe there's something saying that you can move an array pointer > one element past the end of an array, but nowhere else outside its > array. But Valgrind works at the binary level, and it may not even be > C code that its analysing, so that makes things even more complicated. > > In short, you hit one of the tricky cases. I certainly did! As a nitpick: what you're thinking of with arrays is a convenience for pointer arithmetic only. You're allowed to generate a pointer one element past the end of an array, but you're not supposed to actually dereference it and try to read the nonexistent element :-). Anyway, I'm happy now I understand what's going on. Valgrind has been extremely useful to me; I can live with a few strange errors that have to be ignored. Alan |
|
From: Bart V. A. <bar...@gm...> - 2008-06-18 12:25:57
|
On Wed, Jun 18, 2008 at 2:11 PM, Nicholas Nethercote <nj...@cs...> wrote: > I believe there's something saying that you can move an array pointer one > element past the end of an array, but nowhere else outside its array. A pointer may point to the element just past an array in C and C++, but I don't think it's allowed to dereference such a pointer. Bart. |
|
From: Nicholas N. <nj...@cs...> - 2008-06-18 22:15:28
|
On Wed, 18 Jun 2008, Bart Van Assche wrote: >> I believe there's something saying that you can move an array pointer one >> element past the end of an array, but nowhere else outside its array. > > A pointer may point to the element just past an array in C and C++, > but I don't think it's allowed to dereference such a pointer. Yes. That's what I meant, I obviously didn't make myself clear! :) Nick |
|
From: Julian S. <js...@ac...> - 2008-06-18 12:40:08
|
On Wednesday 18 June 2008 11:56, Alan Jenkins wrote: > > Valgrind tries to replace strlen with its own, less-optimised version, > > for exactly this reason. Sometimes it doesn't work, eg. if your > > libc.so is stripped (IIRC). > > I think it can't work in my case because GCC has inlined it all (I think > it's a "builtin function"). That's unfortunately correct. > If you look at the assembly, I can't see > what valgrind could replace here. I went back and checked - there's no > error if I compile without optimisations (-O0). Also, this means I > can't ignore this error with a single suppression for strlen(); I need a > suppression for every function with this inlined strlen(). Yes .. known problem. I think you can persuade gcc not to do the inlining if you give it -mno-inline-all-stringops. That would allow you to still use -O and above. I'd be interested to know if this helps. > can look it at as a scarily good compiler optimisation. My > understanding is that the C standards don't say anything about the exact > memory accesses a compiler can generate, other than specifying aliasing > rules and volatile objects. There's basically a tradeoff between performance of generated code vs generating code which is sufficiently uncomplicated to be accurately understandable by Memcheck. Not surprisingly the gcc people are mostly interested in maxing out performance, no matter what .. which occasionally means observability takes second place. Unfortunate but true. Having said that, 99.9% of the time Memcheck has no problems making sense of the generated code. > > It's surprising that --partial-loads-ok=yes isn't working, though. > > I've figured that out now too. The option works on my 32 bit system. > It does not work on my 64 bit system. On both systems, the invalid read > is of size 4. "--partial-loads-ok=yes" is working as documented; Good. (was a bit concerned to hear it wasn't working as documented ..) J |
|
From: Alan J. <aj...@st...> - 2008-06-18 13:05:43
|
Julian Seward wrote: > On Wednesday 18 June 2008 11:56, Alan Jenkins wrote: > > >>> Valgrind tries to replace strlen with its own, less-optimised version, >>> for exactly this reason. Sometimes it doesn't work, eg. if your >>> libc.so is stripped (IIRC). >>> >> I think it can't work in my case because GCC has inlined it all (I think >> it's a "builtin function"). >> > > That's unfortunately correct. > > >> If you look at the assembly, I can't see >> what valgrind could replace here. I went back and checked - there's no >> error if I compile without optimisations (-O0). Also, this means I >> can't ignore this error with a single suppression for strlen(); I need a >> suppression for every function with this inlined strlen(). >> > > Yes .. known problem. I think you can persuade gcc not to do the > inlining if you give it -mno-inline-all-stringops. That would allow > you to still use -O and above. I'd be interested to know if this > helps. > No, it still generates the same invalid read error. I think -mno-inline-all-stringops is the default. Looking at the manpage, GCC will always inline in my case, because it knows my string is aligned. My string is a variable length array at the end of a C structure, and the previous field is a pointer. So I don't think I can disable this specific optimisation. Thanks Alan |
|
From: Doug E. <dj...@go...> - 2008-07-11 05:16:10
|
On Wed, Jun 18, 2008 at 6:00 AM, Alan Jenkins <aj...@st...> wrote: >> Yes .. known problem. I think you can persuade gcc not to do the >> inlining if you give it -mno-inline-all-stringops. That would allow >> you to still use -O and above. I'd be interested to know if this >> helps. >> > No, it still generates the same invalid read error. > > I think -mno-inline-all-stringops is the default. Looking at the > manpage, GCC will always inline in my case, because it knows my string > is aligned. My string is a variable length array at the end of a C > structure, and the previous field is a pointer. So I don't think I can > disable this specific optimisation. -fno-builtin but it is a sledgehammer |