[Valgrind-users] Question: bug in user code, valgrind or glibc?

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Hi,

I have been analyzing thunderbird mail client under valgrind for sometime.
memcheck has been so useful for me to find memory-related errors.
Thank  you for releasing this great tool.

Recently, I noticed an invalid read of 8 bytes warning, which should be 
familiar to all of us.

Interestingly, the initial part of the stack trace is found in a report 
in Qt bug database.
It comes from dynamic loading library support.
https://bugreports.qt.io/browse/QTBUG-90374
It was filed last year.

My system is Debian GNU/Linux and I used gcc to compile thunderbird.
The report was done by someone who uses clang.

I believe the issue lies in a certain version of dl-library, glibc OR 
valgrind? The reason I say valgrind might be to blame, too, is as follows.
(Debian is known to release toolchains very conservatively. I think that 
is why I did not see this issue last year.)

Actually, mine has line numbers slight off due to version differences I 
suspect.

143:39.43 GECKO(115765) ==115769== Invalid read of size 8
143:39.64 GECKO(115765) ==115769==    at 0x4021BF4: strncmp (strcmp.S:175)
143:39.64 GECKO(115765) ==115769==    by 0x400655D: is_dst (dl-load.c:214)
143:39.64 GECKO(115765) ==115769==    by 0x4007666: _dl_dst_count 
(dl-load.c:251)
143:39.64 GECKO(115765) ==115769==    by 0x4007857: 
expand_dynamic_string_token (dl-load.c:393)
143:39.64 GECKO(115765) ==115769==    by 0x40079C7: fillin_rpath.isra.0 
(dl-load.c:465)
143:39.68 GECKO(115765) ==115769==    by 0x4007CC2: decompose_rpath 
(dl-load.c:636)
143:39.68 GECKO(115765) ==115769==    by 0x4009E9D: cache_rpath 
(dl-load.c:678)
143:39.68 GECKO(115765) ==115769==    by 0x4009E9D: cache_rpath 
(dl-load.c:659)
       ... [omitted] ...

My local valgrind dump tells me where the address was allocated.

143:40.60 GECKO(115765) ==115769==  Address 0x27ba3819 is 9 bytes inside 
a block of size 15 alloc'd
143:40.65 GECKO(115765) ==115769==    at 0x483CF9B: malloc 
(vg_replace_malloc.c:380)
143:40.65 GECKO(115765) ==115769==    by 0x402074B: malloc 
(rtld-malloc.h:56)
143:40.65 GECKO(115765) ==115769==    by 0x402074B: strdup (strdup.c:42)
143:40.65 GECKO(115765) ==115769==    by 0x4007C54: decompose_rpath 
(dl-load.c:611)
143:40.65 GECKO(115765) ==115769==    by 0x4009E9D: cache_rpath 
(dl-load.c:678)
143:40.65 GECKO(115765) ==115769==    by 0x4009E9D: cache_rpath 
(dl-load.c:659)
143:40.65 GECKO(115765) ==115769==    by 0x4009E9D: _dl_map_object 
(dl-load.c:2174)
143:40.65 GECKO(115765) ==115769==    by 0x400E4B0: openaux (dl-deps.c:64)
           ... [omission] ...

I *think* this is a valid error case of large-sized READ used in strncmp 
reading beyond the
allocated memory boundary. (strcmp.S shows 8 octets read instead of one 
octet at a time.)

I think such a usage of strdup/str{n}cmp combination is abound in C 
source codes.
So I thought maybe valgrind was reporting something different.
Otherwise, many application programs have to create suppression for this 
type of issue.
That is what I thought initially.

A different type of error I thought initially was, say, for example, 9 bytes
inside a block of size 15 might mean somehow the data contains
uninitialized data in the string area in that position.  However, come
to think of it, if so, strdup would have triggered a valgrind warning
before this.  There is no warning from valgrind for strdup.

Also, I created a test program and realized that in that case, valgrind 
prints

==120076== Conditional jump or move depends on uninitialised value(s)
==120076==    at 0x4843172: strncmp (vg_replace_strmem.c:663)
==120076==    by 0x108778: main (in /home/ishikawa/Dropbox/TB-DIR/a.out)

So the original problem must be the read beyond malloc'ed area boundary.

Now, is dl-library to blame?
I think dl-library has been used literally hundreds of million times or 
more daily and
is hard to think that there is a bug there. (Famous last word).

Dl-library does not have control how long each path strings are (I
think it is trying to record the path components of a loading path),
and thus cannot control valgrind messages generated due to 8-char read
going beyond the malloced memory end. (So probably people have to
create suppression after all. If the particular version has this
issue.)

As for valgrind, can valgrind be somehow more intelligent in this
case?  Maybe creating a substitute strcmp? (I know single char
comparison at a time would be slower than comparing 8 characters at a
time when appropriate).  But at least, this type of surprise warning
would be reduced.

However, we may have a problem here for glibc..  If this read beyond
the malloced region is for real, we have a problem.  I have no idea how
this behavior is constrained or sanctioned by C standard, C library
standard or POSIX standard, but the use of 8 octets strcmp.S can lead
to a real issue possibly unless malloc() does allocate memory chunks
in 8 or larger unit uniformly. Unless glibc makes sure that there is a 
guard area between malloc area and the end of user virtual space.

I have an experience where a bitblt-like CPU instruction expected us to
create a bitmap with a horizontal bit length of multiple of 16 (or 8?).
even if the really used screen size is less than that. So we had to
round it up to the multiple of 16 (or 8?).
I got a bit stingy on memory use and once created a bitmap data with the 
raster line
not appended with this extra octets to make its length  a multiple of 16 
(or 8).
Kaboom. I created this memory area using the C runtime library of the 
CPU/computer maker's OS.
When the CPU bitblt-like instruction accessed the last raster line data, 
it fetches data 16 (or 8?) octets at a time and at the end, it accessed 
beyond the malloced area.
And it was BEYOND the allocated user memory space by the OS.
(The access of 8-byte read for intermediate ratlines ended up reading 
the next allocated rasterline area, and so it was OK.)
So the program crashed due to memory violation.
It took me a couple of weeks to figure this out since bitblt-like 
instruction did not
offer any clue regarding where the address violation occurred.
Also, only one of the screen bitmaps created thusly was at the end of 
user virtual space and it was difficult to realize why the instruction 
crashed seemingly in random manner when it handled other bitmaps without 
a problem.
The CPU vendor intended to use the instruction only for the main display 
screen of its work station and in that case, the memory is preallocated 
in neat 16-multiple horizontal. I tried to use the bitblt-like 
instruction for arbitrary use-defined virtual screen.

So my message here is that there *can* be a grave consequence of this 
malloc and reading larger than originally assumed  chunk behavior, but I 
am not sure where to report this and alert the developers.
Yeah, if malloc() allocates 8 or 16 byte chunks always, it should be OK  
[and we are better off it is built this way due to some standard, glibc 
manifest, or whatever published document which won't change overnight.]

Even in this age of PC users having GBs of memory, I hate to think of 
programs which allocates memory using 3 or 4 octet length...

Chiaki