|
From: Johannes S. <Joh...@gm...> - 2009-01-27 23:33:16
|
Hi,
unfortunately unsubscribed users cannot send to your list, so I am
forwarding this after subscribing.
Short story: we have problems with zlib. No, not those problems.
Apparently, there are uninitialized bytes _in the middle_ of the buffer
filled by deflate().
Maybe someone can shed some light into the issue?
Thanks,
Dscho
---------- Forwarded message ----------
Date: Tue, 27 Jan 2009 22:52:39 +0100 (CET)
From: Johannes Schindelin <Joh...@gm...>
To: Linus Torvalds <tor...@li...>
Cc: zl...@gz..., val...@li...,
Mark Brown <br...@si...>, Jeff King <pe...@pe...>,
Junio C Hamano <gi...@po...>, Git Mailing List <gi...@vg...>
Subject: Re: Valgrind updates
Hi,
[Cc'ed the valgrind-users list, maybe the valgrind Gods can see that our
case is pretty strange, and tell us what we do wrong.]
Note to valgrind experts: this is _not_ about the Conditional thing in
zlib, but about an uninitialized byte _in the middle_ of the zlib output
buffer.
On Tue, 27 Jan 2009, Linus Torvalds wrote:
> Hmm. The zlib faq has a note about zlib doing a conditional on
> uninitialized memory that doesn't matter, and that is what the
> suppression should be about (to avoid a warning about "Conditional jump
> or move depends on uninitialised value").
>
> But that one is documented to not matter for the actual output (zlib
> FAQ#36).
>
> It's possible that zlib really does leave padding bytes around that
> literally don't matter, and that don't get initialized. That really
> would be bad, because it means that the output of git wouldn't be
> repeatable. But I doubt this is the case - original git used to actually
> do the SHA1 over the _compressed_ data, which was admittedly a totally
> and utterly broken design (and we fixed it), but it did work. Maybe it
> worked by luck, but I somehow doubt it.
>
> Some googling did find this:
>
> http://mailman.few.vu.nl/pipermail/sysprog/2008-October/000298.html
>
> which looks very similar: an uninitialized byte in the middle of a
> deflate() packet.
>
> Anyway, I'm just going to Cc 'zl...@gz...', since this definitely is
> _not_ the same issue as in the FAQ, and we're not the only ones seeing it.
>
> [...]
>
> Dscho wrote:
>
> > Yet, the buffer in question is 195 bytes, stream.total_count (which
> > totally agrees with size - stream.avail_out) says it is 58 bytes, and
> > valgrind says that the byte with offset 51 is uninitialized.
>
> The thing to note here is that what we are passing in to "write_buffer()"
> is _exactly_ what zlib deflated for us:
>
> - 'compressed' is the allocation, and is what we used to initialize
> 'stream.next_out' with (at the top of the code sequence above)
>
> - 'size' is gotten from 'stream.total_out' at the end of the compression.
>
> Oh Gods of zlib, please hear our plea for clarification..
To help ye Gods, I put together this almost minimal C program:
-- snip --
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>
int main(int argc, char **argv)
{
const char hdr[] = {
0x74, 0x72, 0x65, 0x65, 0x20, 0x31, 0x36, 0x35,
0x00,
};
int hdrlen = sizeof(hdr);
const char buf[] = {
0x31, 0x30, 0x30, 0x36, 0x34, 0x34, 0x20, 0x66,
0x69, 0x6c, 0x65, 0x31, 0x00, 0x10, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x31, 0x30, 0x30, 0x36, 0x34, 0x34, 0x20,
0x66, 0x69, 0x6c, 0x65, 0x32, 0x00, 0x20, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x31, 0x30, 0x30, 0x36, 0x34, 0x34,
0x20, 0x66, 0x69, 0x6c, 0x65, 0x33, 0x00, 0x30,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x31, 0x30, 0x30, 0x36, 0x34,
0x34, 0x20, 0x66, 0x69, 0x6c, 0x65, 0x34, 0x00,
0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x31, 0x30, 0x30, 0x36,
0x34, 0x34, 0x20, 0x66, 0x69, 0x6c, 0x65, 0x35,
0x00, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00,
};
int len = sizeof(buf);
z_stream stream;
unsigned char *compressed;
int size, ret, i;
FILE *out;
memset(&stream, 0, sizeof(stream));
deflateInit(&stream, Z_BEST_SPEED);
size = 8 + deflateBound(&stream, len+hdrlen);
compressed = malloc(size);
if (!compressed)
return 1;
stream.next_out = compressed;
stream.avail_out = size;
stream.next_in = (unsigned char *)hdr;
stream.avail_in = hdrlen;
while ((ret = deflate(&stream, 0)) == Z_OK)
/* nothing */;
/* deflate() returns Z_BUF_ERROR at this point */
stream.next_in = (unsigned char *)buf;
stream.avail_in = len;
ret = deflate(&stream, Z_FINISH);
if (ret != Z_STREAM_END)
return 1;
if (deflateEnd(&stream) != Z_OK)
return 1;
out = fopen("/dev/null", "w");
fwrite(compressed + 51, 51, 1, out);
fwrite(compressed + 51, 1, 1, stderr);
fflush(out);
fclose(out);
free(compressed);
return 0;
}
-- snap --
... which produces this output...
-- snip --
==6348== Memcheck, a memory error detector.
==6348== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==6348== Using LibVEX rev exported, a library for dynamic binary translation.
==6348== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==6348== Using valgrind-3.5.0.SVN, a dynamic binary instrumentation framework.
==6348== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==6348== For more details, rerun with: -v
==6348==
==6348== Use of uninitialised value of size 8
==6348== at 0x4E2FC5B: (within /usr/lib/libz.so.1.2.3.3)
==6348== by 0x4E317B6: (within /usr/lib/libz.so.1.2.3.3)
==6348== by 0x4E2DF9C: (within /usr/lib/libz.so.1.2.3.3)
==6348== by 0x4E2E654: deflate (in /usr/lib/libz.so.1.2.3.3)
==6348== by 0x400957: main (valgrind-testcase.c:60)
==6348==
==6348== Syscall param write(buf) points to uninitialised byte(s)
==6348== at 0x5103D50: write (in /lib/libc-2.6.1.so)
==6348== by 0x50A9AE2: _IO_file_write (in /lib/libc-2.6.1.so)
==6348== by 0x50A9748: (within /lib/libc-2.6.1.so)
==6348== by 0x50A9A4B: _IO_file_xsputn (in /lib/libc-2.6.1.so)
==6348== by 0x509FDBA: fwrite (in /lib/libc-2.6.1.so)
==6348== by 0x4009D7: main (valgrind-testcase.c:69)
==6348== Address 0x53da87b is 51 bytes inside a block of size 195 alloc'd
==6348== at 0x4C222CB: malloc (in /usr/local/lib/valgrind/amd64-linux/vgpreload_memcheck.so)
==6348== by 0x4008D7: main (valgrind-testcase.c:45)
,==6348==
==6348== Syscall param write(buf) points to uninitialised byte(s)
==6348== at 0x5103D50: write (in /lib/libc-2.6.1.so)
==6348== by 0x50A9AE2: _IO_file_write (in /lib/libc-2.6.1.so)
==6348== by 0x50A9748: (within /lib/libc-2.6.1.so)
==6348== by 0x50A9A83: _IO_do_write (in /lib/libc-2.6.1.so)
==6348== by 0x50AA048: _IO_file_sync (in /lib/libc-2.6.1.so)
==6348== by 0x509EDB9: fflush (in /lib/libc-2.6.1.so)
==6348== by 0x4009E0: main (valgrind-testcase.c:70)
==6348== Address 0x4020000 is not stack'd, malloc'd or (recently) free'd
==6348==
==6348== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 15 from 4)
==6348== malloc/free: in use at exit: 0 bytes in 0 blocks.
==6348== malloc/free: 7 allocs, 7 frees, 268,835 bytes allocated.
==6348== For counts of detected errors, rerun with: -v
==6348== Use --track-origins=yes to see where uninitialised values come from
==6348== All heap blocks were freed -- no leaks are possible.
-- snap --
Note that the error only occurs when fwrite()ing to stderr, not
any other file.
This is with valgrind compiled from a git-svn mirror updated today, i.e.
valgrind-3.5.0.SVN.
Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to maj...@vg...
More majordomo info at http://vger.kernel.org/majordomo-info.html
|
|
From: Julian S. <js...@ac...> - 2009-01-28 00:53:04
|
> Note to valgrind experts: this is _not_ about the Conditional thing in
> zlib, but about an uninitialized byte _in the middle_ of the zlib output
> buffer.
I understand. But -- whilst we're on the subject of the Conditional thing:
I thought I'd suppressed all those zlib FAQ#36 complaints, but apparently
not. Could you try the following and see if it helps?
Index: xfree-4.supp
===================================================================
--- xfree-4.supp (revision 9082)
+++ xfree-4.supp (working copy)
@@ -315,6 +315,7 @@
zlib-1.2.x trickyness (1a): See http://www.zlib.net/zlib_faq.html#faq36
Memcheck:Cond
obj:/*lib*/libz.so.1.2.*
+ ...
obj:/*lib*/libz.so.1.2.*
fun:deflate
}
@@ -329,6 +330,7 @@
zlib-1.2.x trickyness (2a): See http://www.zlib.net/zlib_faq.html#faq36
Memcheck:Value8
obj:/*lib*/libz.so.1.2.*
+ ...
obj:/*lib*/libz.so.1.2.*
fun:deflate
}
@@ -343,6 +345,7 @@
zlib-1.2.x trickyness (3a): See http://www.zlib.net/zlib_faq.html#faq36
Memcheck:Value4
obj:/*lib*/libz.so.1.2.*
+ ...
obj:/*lib*/libz.so.1.2.*
fun:deflate
}
Now to your real question. I believe this is either a bug in Valgrind
or some strangeness with libc. I don't think it has anything to do
with zlib.
One way to make sense of it is to force Memcheck to check the definedness
of the output buffer as soon as you're done with zlib. This is easier to
make sense of than trying to infer what's going on by interpreting complaints
about badness deep in the innards of libc. Thusly:
--- zlib_bad.c-ORIG 2009-01-28 01:25:46.000000000 +0100
+++ zlib_bad.c 2009-01-28 01:41:13.000000000 +0100
@@ -2,6 +2,7 @@
#include <stdlib.h>
#include <string.h>
#include <zlib.h>
+#include <memcheck.h>
int main(int argc, char **argv)
{
@@ -64,12 +65,17 @@
if (deflateEnd(&stream) != Z_OK)
return 1;
+ int nAvail = size - stream.avail_out;
+ printf("nAvail = %d\n", nAvail);
+ VALGRIND_CHECK_MEM_IS_DEFINED(compressed, nAvail);
+#if 0
out = fopen("/dev/null", "w");
fwrite(compressed + 51, 51, 1, out);
fwrite(compressed + 51, 1, 1, stderr);
fflush(out);
fclose(out);
+#endif
free(compressed);
return 0;
}
Program as modified w/ patch runs Memcheck-clean for me. Whereas if
you change the check to
VALGRIND_CHECK_MEM_IS_DEFINED(compressed, nAvail+1);
then I get
nAvail = 58
==3891== Uninitialised byte(s) found during client check request
==3891== at 0x40092D: main (zlib_bad.c:70)
==3891== Address 0x53da882 is 58 bytes inside a block of size 193 alloc'd
==3891== at 0x4C2594E: malloc (vg_replace_malloc.c:207)
==3891== by 0x400859: main (zlib_bad.c:46)
as expected.
That said, I don't have any concrete offerings as to what the problem
really is, or even whether there is a problem. I do know that zlib's
handwritten assembly inner loops tend to cause Memcheck's definedness
tracking to have problems. If you want to chase this further, I suggest you
build zlib from source, with -g -O0, and make sure you use the generic
C versions of deflate et al. Then you might be in with a fighting chance
of getting to the bottom of this.
Please lmk what the outcome is, so I know if it's a bug in Memcheck.
A final comment - you should make friends with --track-origins=yes.
It doesn't help much here, but in general it makes finding the sources
of uninitialised values a whole lot easier.
J
|
|
From: Johannes S. <Joh...@gm...> - 2009-01-29 15:30:42
|
Hi, so it really bothered me. So I actually tried to recreate the Ubuntu package as it displays the weird error (remember: after deflate(), offset 51 in a buffer of 58 was reported uninitialized, yet it appears inflate() wants exactly that byte, so it is definitely not uninitialized). After a bit of researching (why does Debian have to make it so awfully complicated to make a package?) I _could_ reproduce. The issue is that Debian sets UNALIGNED_OK, while zlib's configure does not. Does that ring a bell? Ciao, Dscho |
|
From: Julian S. <js...@ac...> - 2009-01-29 15:39:26
|
> After a bit of researching (why does Debian have to make it so awfully > complicated to make a package?) I _could_ reproduce. The issue is that > Debian sets UNALIGNED_OK, while zlib's configure does not. > > Does that ring a bell? Not here. We have (in Memcheck) some subtleness to do with (mis)alignment of word-sized accesses, but I don't see that they would have to do with this case. Did you configure zlib to use only generic C routines, or is it still using handwritten assembly? At least from where I'm sitting, that's a fairly crucial question. If you can chop this down into a more self contained test case, I'm happy to look at it a bit more. J |
|
From: Johannes S. <Joh...@gm...> - 2009-01-29 19:10:47
|
Hi, On Thu, 29 Jan 2009, Julian Seward wrote: > > After a bit of researching (why does Debian have to make it so awfully > > complicated to make a package?) I _could_ reproduce. The issue is > > that Debian sets UNALIGNED_OK, while zlib's configure does not. > > > > Does that ring a bell? > > Not here. We have (in Memcheck) some subtleness to do with > (mis)alignment of word-sized accesses, but I don't see that they would > have to do with this case. > > Did you configure zlib to use only generic C routines, or is it still > using handwritten assembly? At least from where I'm sitting, that's a > fairly crucial question. I tried to make sure that no direct assembler code was used, and it seems that I was successful. > If you can chop this down into a more self contained test case, I'm happy > to look at it a bit more. So far I did not succeed in chopping it down. (It is still the .c file I sent you, linked to zlib.) The thing that really gets me is that I littered a while() loop with instructions like this: VALGRIND_CHECK_MEM_IS_DEFINED(s->strm->state->pending_out + 49, 1); fprintf(stderr, "Z%02x %p\n", s->strm->state->pending_out[49], s->strm->state->pending_out + 49); I also added such an instruction just before the end of the while loop, and just after it. Unless my brain really turned into tatties, the check after the while loop should not be able to trigger an error unless the one in the while() loop does (yes, I checked that the loop body is actually executed, I substituted the "Z" in the fprintf() with an "X" there, and it prints it). Yet the latter check triggers an error, and the former does not. FWIW the while() loop is in in trees.c, function compress_block() in http://archive.ubuntu.com/ubuntu/pool/main/z/zlib/zlib_1.2.3.3.dfsg.orig.tar.gz Oh, and it had to be configured in this way: CFLAGS="-g -O3 -DUNALIGNED_OK" ./configure If you leave out either the -O3 or the define, the bug does not trigger. And like I said, this is x86_64; I could imagine that this bug is platform dependent. Now I am truly wasted, after a couple of hours of bug-chasing, so if you want to play with the test case, go ahead, I cannot stare at it until I had a good night's sleep. Ciao, Dscho |
|
From: Johannes S. <Joh...@gm...> - 2009-01-30 16:49:13
|
Hi, On Thu, 29 Jan 2009, Johannes Schindelin wrote: > So far I did not succeed in chopping it down. (It is still the .c file I > sent you, linked to zlib.) > > The thing that really gets me is that I littered a while() loop with > instructions like this: > > VALGRIND_CHECK_MEM_IS_DEFINED(s->strm->state->pending_out + 49, 1); > fprintf(stderr, "Z%02x %p\n", s->strm->state->pending_out[49], > s->strm->state->pending_out + 49); > > I also added such an instruction just before the end of the while loop, > and just after it. > > Unless my brain really turned into tatties, the check after the while loop > should not be able to trigger an error unless the one in the while() loop > does (yes, I checked that the loop body is actually executed, I > substituted the "Z" in the fprintf() with an "X" there, and it prints it). > > Yet the latter check triggers an error, and the former does not. > > FWIW the while() loop is in in trees.c, function compress_block() in > http://archive.ubuntu.com/ubuntu/pool/main/z/zlib/zlib_1.2.3.3.dfsg.orig.tar.gz > > Oh, and it had to be configured in this way: > > CFLAGS="-g -O3 -DUNALIGNED_OK" ./configure > > If you leave out either the -O3 or the define, the bug does not trigger. > > And like I said, this is x86_64; I could imagine that this bug is platform > dependent. FWIW the issue is also there with i386 (tested on a Pentium, though). I'll not have time to work on this until next week. Just a heads-up, Dscho |
|
From: Johannes S. <Joh...@gm...> - 2009-01-29 15:04:40
|
Hi, On Wed, 28 Jan 2009, Julian Seward wrote: > > > Note to valgrind experts: this is _not_ about the Conditional thing in > > zlib, but about an uninitialized byte _in the middle_ of the zlib output > > buffer. > > I understand. But -- whilst we're on the subject of the Conditional thing: > I thought I'd suppressed all those zlib FAQ#36 complaints, but apparently > not. Could you try the following and see if it helps? > > Index: xfree-4.supp > =================================================================== > --- xfree-4.supp (revision 9082) > +++ xfree-4.supp (working copy) > @@ -315,6 +315,7 @@ > zlib-1.2.x trickyness (1a): See http://www.zlib.net/zlib_faq.html#faq36 > Memcheck:Cond > obj:/*lib*/libz.so.1.2.* > + ... > obj:/*lib*/libz.so.1.2.* > fun:deflate > } > @@ -329,6 +330,7 @@ > zlib-1.2.x trickyness (2a): See http://www.zlib.net/zlib_faq.html#faq36 > Memcheck:Value8 > obj:/*lib*/libz.so.1.2.* > + ... > obj:/*lib*/libz.so.1.2.* > fun:deflate > } > @@ -343,6 +345,7 @@ > zlib-1.2.x trickyness (3a): See http://www.zlib.net/zlib_faq.html#faq36 > Memcheck:Value4 > obj:/*lib*/libz.so.1.2.* > + ... > obj:/*lib*/libz.so.1.2.* > fun:deflate > } I had something similar in the suppressions file already. > Now to your real question. I believe this is either a bug in Valgrind > or some strangeness with libc. I don't think it has anything to do with > zlib. Actually, after much more painful research, it turns out that I cannot investigate the issue further. Your hint with VALGRIND_CHECK_MEM_IS_DEFINED() showed indeed that valgrind thinks that the single byte at offset 51 (in a buffer of 58) is undefined. So I wanted to use the same macro/function in the source of zlib to find out where the responsible code is. But the issue disappeared then. No matter if I link to zlib-1.2.3, 1.2.3.3.dfsg-5ubuntu1 (the version that is installed on the system I had the issue with) or the dynamic library thereof, the error was no longer reproducible. This is very frustrating, but at least I have a system now where the valgrind tests pass. > Program as modified w/ patch runs Memcheck-clean for me. For me, it did not. > That said, I don't have any concrete offerings as to what the problem > really is, or even whether there is a problem. I do know that zlib's > handwritten assembly inner loops tend to cause Memcheck's definedness > tracking to have problems. If you want to chase this further, I suggest > you build zlib from source, with -g -O0, and make sure you use the > generic C versions of deflate et al. Then you might be in with a > fighting chance of getting to the bottom of this. > > Please lmk what the outcome is, so I know if it's a bug in Memcheck. I could still imagine that there is some obscure insn that is not caught properly, but without a reproducible testcase that I can compile myself... > A final comment - you should make friends with --track-origins=yes. It > doesn't help much here, but in general it makes finding the sources of > uninitialised values a whole lot easier. Thank you! That is indeed very helpful. Sorry to pass my frustration to you... Will keep an eye open, maybe I'll find the time to do more fine-grained research into the assembler-level code. Thanks, Dscho |
|
From: Julian S. <js...@ac...> - 2009-01-29 15:33:26
|
> Sorry to pass my frustration to you... Will keep an eye open, maybe I'll > find the time to do more fine-grained research into the assembler-level > code. Thanks for the update. Over the years there has been a lot of improvements to the accuracy of Memcheck's undefinedness tracking, and this is the first possible-false-positive case I've seen in a while. So if you do manage to produce a smaller test case, or have any further insight, pls let us know. J |
|
From: Julian S. <js...@ac...> - 2009-02-01 13:07:17
|
Going round in circles here. > Your hint with VALGRIND_CHECK_MEM_IS_DEFINED() showed indeed that valgrind > thinks that the single byte at offset 51 (in a buffer of 58) is undefined. I just spent some time chasing this. I cannot reproduce the byte-51-is- undefined problem. It only shows that 58 and later are undefined, which is as expected. This is with various combinations of 32-bit and 64-bit builds, all optimisation levels, and with the system installed zlib and a from-source zlib. This is on x86_64, openSUSE 11.0, gcc (SUSE Linux) 4.3.1 20080507. Can you tell me what exactly I need to do to get the byte-51 undef error, including details of what distro you're using? J |
|
From: Johannes S. <Joh...@gm...> - 2009-02-01 22:30:51
|
Hi, On Sun, 1 Feb 2009, Julian Seward wrote: > Going round in circles here. > > > Your hint with VALGRIND_CHECK_MEM_IS_DEFINED() showed indeed that valgrind > > thinks that the single byte at offset 51 (in a buffer of 58) is undefined. > > I just spent some time chasing this. I cannot reproduce the byte-51-is- > undefined problem. It only shows that 58 and later are undefined, which > is as expected. This is with various combinations of 32-bit and 64-bit > builds, all optimisation levels, and with the system installed zlib and a > from-source zlib. This is on x86_64, openSUSE 11.0, gcc (SUSE Linux) 4.3.1 > 20080507. Sorry, I get it only when configuring with '-O3 -DUNALIGNED_OK' (I thought I mentioned that already...) > Can you tell me what exactly I need to do to get the byte-51 undef > error, including details of what distro you're using? Ubuntu Gutsy and Intrepid, both x86_64, zlib-1.2.3.3-ubuntu5, although I can reproduce even with zlib-1.2.3. The key is to use both -O3 and -DUNALIGNED_OK. Ciao, Dscho |
|
From: Julian S. <js...@ac...> - 2009-02-02 09:54:39
|
> > Can you tell me what exactly I need to do to get the byte-51 undef
> > error, including details of what distro you're using?
>
> Ubuntu Gutsy and Intrepid, both x86_64, zlib-1.2.3.3-ubuntu5, although I
> can reproduce even with zlib-1.2.3. The key is to use both -O3 and
> -DUNALIGNED_OK.
Right. So on Ubuntu 8.04.2 I can easily reproduce it. Situation is now
much clearer.
This is (unfortunately) another version of the FAQ#36 issue.
Exactly why the undefinedness continues to propagate when built on
Ubuntu but not on openSUSE, I don't know, despite a couple hours
of investigation.
The error message in this case is a bit misleading:
Uninitialised byte(s) found during client check request
at 0x400787: main (minimal.c:73)
Address 0x51de87b is 51 bytes inside a block of size 193 alloc'd
at 0x4C2694E: malloc (vg_replace_malloc.c:207)
by 0x4006A6: main (minimal.c:49)
It's true that at the point this VALGRIND_CHECK_MEM_IS_DEFINED was done,
an uninitialised value was observed 51 bytes inside a block you malloced
yourself (so to speak). However, it is not correct to assume that
that byte is uninitialised as a result of that specific malloc. Another
possibility is that it was copied to that place from somewhere else, in an
uninitialised state. And if you add --track-origins=yes the original source
is stated:
Uninitialised value was created by a heap allocation
at 0x4C2694E: malloc (vg_replace_malloc.c:207)
by 0x40491C: zcalloc (zutil.c:306)
by 0x402BEB: deflateInit2_ (deflate.c:289)
by 0x402D40: deflateInit_ (deflate.c:212)
by 0x40068E: main (minimal.c:47)
which confirms it as the same source as all the other (false) uninit-value
errors that Memcheck reports in zlib-1.2.x.
So you can safely ignore it.
See http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.value
for details of Memcheck's definedness-tracking machinery.
I studied the relevant loop in zlib, at deflate.c:1106, and ended up
wondering if it is possible to rewrite it, without loss of performance,
so that it does not need to visit uninitialised memory.
J
|
|
From: Johannes S. <Joh...@gm...> - 2009-02-02 13:38:34
|
Hi, On Mon, 2 Feb 2009, Julian Seward wrote: > > > Can you tell me what exactly I need to do to get the byte-51 undef > > > error, including details of what distro you're using? > > > > Ubuntu Gutsy and Intrepid, both x86_64, zlib-1.2.3.3-ubuntu5, although I > > can reproduce even with zlib-1.2.3. The key is to use both -O3 and > > -DUNALIGNED_OK. > > Right. So on Ubuntu 8.04.2 I can easily reproduce it. Situation is now > much clearer. Thanks! > This is (unfortunately) another version of the FAQ#36 issue. > Exactly why the undefinedness continues to propagate when built on > Ubuntu but not on openSUSE, I don't know, despite a couple hours > of investigation. > > The error message in this case is a bit misleading: > > Uninitialised byte(s) found during client check request > at 0x400787: main (minimal.c:73) > Address 0x51de87b is 51 bytes inside a block of size 193 alloc'd > at 0x4C2694E: malloc (vg_replace_malloc.c:207) > by 0x4006A6: main (minimal.c:49) > > It's true that at the point this VALGRIND_CHECK_MEM_IS_DEFINED was done, > an uninitialised value was observed 51 bytes inside a block you malloced > yourself (so to speak). However, it is not correct to assume that > that byte is uninitialised as a result of that specific malloc. Another > possibility is that it was copied to that place from somewhere else, in an > uninitialised state. And if you add --track-origins=yes the original source > is stated: > > Uninitialised value was created by a heap allocation > at 0x4C2694E: malloc (vg_replace_malloc.c:207) > by 0x40491C: zcalloc (zutil.c:306) > by 0x402BEB: deflateInit2_ (deflate.c:289) > by 0x402D40: deflateInit_ (deflate.c:212) > by 0x40068E: main (minimal.c:47) > > which confirms it as the same source as all the other (false) uninit-value > errors that Memcheck reports in zlib-1.2.x. > > So you can safely ignore it. Can I make valgrind ignore this particular error, without having to mark _my_ code with a suppression? > See http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.value > for details of Memcheck's definedness-tracking machinery. > > I studied the relevant loop in zlib, at deflate.c:1106, and ended up > wondering if it is possible to rewrite it, without loss of performance, > so that it does not need to visit uninitialised memory. Right. Note, however, that the issue only arises with UNLIGNED_OK, which is handled at deflate.c:1138. My suspicion is that it would actually make sense to have code in zlib that _initializes_ the buffer at the end, instead of changing the loop. You could make this initialization an opt-out, so that appliances needing zlib do not need these extra-cycles. Something like this: -- snipsnap -- [PATCH] Make valgrind happy by default Valgrind does not like zlib to use uninitialized memory, but for performance reasons, zlib has to avoid the end-of-buffer check until it quite possibly overran by 8 bytes. As zlib fixes the overrun right after the loop, this is not really a mistake. To make valgrind happy, just memset() the whole buffer, so that all such accesses are no longer marked as accesses to undefined memory. As a typical window size is just 32kB, and the initialization is a one-time cost in deflateInit2_(), it does not hurt performance too much. In case that this code does hurt performance-wise, it can be switched off by defining MAKE_VALGRIND_UNHAPPY. Note: the initialization is only performed if sizeof(uInt) > 2, because zalloc() calls calloc() otherwise anyway. Note alse: the same effect could be achieved in a cleaner manner by using valgrind's VALGRIND_MAKE_MEM_DEFINED() macro, but that would make zlib's source code depend on valgrind's headers. Signed-off-by: Johannes Schindelin <joh...@gm...> --- deflate.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/deflate.c b/deflate.c index 29ce1f6..3032d8e 100644 --- a/deflate.c +++ b/deflate.c @@ -288,6 +288,11 @@ int ZEXPORT deflateInit2_(strm, level, method, windowBits, memLevel, strategy, s->prev = (Posf *) ZALLOC(strm, s->w_size, sizeof(Pos)); s->head = (Posf *) ZALLOC(strm, s->hash_size, sizeof(Pos)); +#ifndef MAKE_VALGRIND_UNHAPPY + if (sizeof(uInt) > 2) + memset(s->window, 0, s->w_size); +#endif + s->lit_bufsize = 1 << (memLevel + 6); /* 16K elements by default */ overlay = (ushf *) ZALLOC(strm, s->lit_bufsize, sizeof(ush)+2); -- 1.6.1.2.556.g3bd7 |
|
From: Johannes S. <Joh...@gm...> - 2009-02-02 17:58:40
|
Hi, On Mon, 2 Feb 2009, Mark Adler wrote: > On Feb 2, 2009, at 1:54 AM, Julian Seward wrote: > >Another possibility is that it was copied to that place from somewhere else, > >in an uninitialised state. > > Except that that byte happens to be exactly the right value it needs to > be in order to part of the valid zlib stream (a 0x2c in this case). > Any other value results in an inflate error on the data. Either an odd > coincidence with a 1/256 probability, or far more likely, that the byte > is in fact not uninitialized. I'm sure that the latter is the case, > since the zlib code sets that byte in the output -- it doesn't have a > mechanism to skip a byte when writing. I would conjecture that > valgrind's propagation of uninitialized values is in error. The problem is indeed the "uninitialized" thing, as far as I read it. At some stage, the "uninitializedness" of the bytes that are looked at during longest_match() transpires to the values deduced from the length o the longest match. > >I studied the relevant loop in zlib, at deflate.c:1106, and ended up > >wondering if it is possible to rewrite it, without loss of performance, > >so that it does not need to visit uninitialised memory. > > I'm sure it is possible, but I am loathe to make any changes to code > that has been *extensively* tested over a decade and is not broken, just > to make a code checker happy. Indeed, but it _is_ sloppy code, from the viewpoint of a correctness checker. I could imagine that it _might_ possible to use a tell-tale detector just for zlib, but I think that the patch I sent should be some lower-impact solution to the problem. Because a problem it is: if I add suppressions for the code that depends on deflated data, I will no longer be able to catch true errors that do depend on uninitialized memory. Ciao, Dscho |
|
From: Julian S. <js...@ac...> - 2009-02-02 18:53:16
|
> On Mon, 2 Feb 2009, Mark Adler wrote: > > On Feb 2, 2009, at 1:54 AM, Julian Seward wrote: > > >Another possibility is that it was copied to that place from somewhere > > > else, in an uninitialised state. > > > > Except that that byte happens to be exactly the right value it needs to > > be in order to part of the valid zlib stream (a 0x2c in this case). > > Any other value results in an inflate error on the data. Either an odd > > coincidence with a 1/256 probability, or far more likely, that the byte > > is in fact not uninitialized. Sloppy wording on my part. I did not mean to imply the byte was really uninitialised. I am sure that what happened is, as Johannes says, > [...] At > some stage, the "uninitializedness" of the bytes that are looked at during > longest_match() transpires to the values deduced from the length o the > longest match. Correct. And whether that happens or not depends on the details of the code that gcc generates, specifically how it interacts with Memcheck's definedness tracking machinery. That's why it appears to be distro-specific -- in fact I think it is more likely to be gcc-version-specific. > > >I studied the relevant loop in zlib, at deflate.c:1106, and ended up > > >wondering if it is possible to rewrite it, without loss of performance, > > >so that it does not need to visit uninitialised memory. > > > > I'm sure it is possible, but I am loathe to make any changes to code > > that has been *extensively* tested over a decade and is not broken, just > > to make a code checker happy. > > Indeed, but it _is_ sloppy code, from the viewpoint of a correctness > checker. Being heavily involved with both Valgrind and bzip2, I sympathise with both points of view. I see two ways out of this. 1. In the short term we can hunt around in the Ubuntu-gcc assembly code to find out what is confusing Memcheck, possibly develop a fix, and move on. 2. In the longer term, rewrite the zlib loop so it doesn't enter uninitialised memory. Given that the (maximum) trip count is known before the loop, I'm fairly sure it is possible to use standard loop transformations: unrolling, vectorization and peeling, to achieve this. I lack enthusiasm for (1). Doing bit-exact definedness tracking for the x86 condition code registers (O S Z A C and P) is expensive, and Memcheck currently has some speed-accuracy tradeoffs that work well in practice for code that doesn't use uninitialised values in this rather subtle way. I am reluctant to special-case it for zlib, as that will likely impose a performance burden on all code handled by Memcheck, and Memcheck is already undesirably slow. Also, (1) is much like playing whack-a-mole. Sure we can track down and fix this hole, and that works until some devilishly clever code generation trick in a future gcc-5.4.3 does something Memcheck doesn't expect, and we are back at square 1 again. J |
|
From: John R.
|
[Julian] >>>> I studied the relevant loop in zlib, at deflate.c:1106, and ended up >>>> wondering if it is possible to rewrite it, without loss of performance, >>>> so that it does not need to visit uninitialised memory. [Mark] >>> I'm sure it is possible, but I am loathe to make any changes to code >>> that has been *extensively* tested over a decade and is not broken, just >>> to make a code checker happy. [Johannes] >> Indeed, but it _is_ sloppy code, from the viewpoint of a correctness >> checker. [Julian] > Being heavily involved with both Valgrind and bzip2, I sympathise with > both points of view. In the broader view, software has evolved so that developers expect that libraries will interact nicely with programmatic checkers of memory accesses. The particular code in zlib meets the old expectations of functional correctness, but not the new expectations of a longer value chain which includes smooth interaction with memcheck. Today's customers of zlib expect more from zlib than yesterday's customers did. In other similar cases, I have modified library code to initialize possible "overrun" bytes before the loop starts, even though this is not strictly necessary for functional correctness. The cost of a few instructions of static code, plus a few machine cycles per dynamic buffer, is repaid by a shorter development cycle and higher reliability in end products which use the library. The library users who highly value minimization of CPU cycles have become outnumbered by the users who highly value short time-to-market for reliable new products that use the library. For examples see http://bitwagon.com/glibc-audit/glibc-audit.html . -- |