|
From: Konstantin S. <kon...@gm...> - 2013-03-11 11:53:24
|
Hi, There was a gcc bug that caused Valgrind/Memcheck to complain about memcpy(x, x, n). http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39480 Since gcc 4.3 it does not generate such code and the problem is gone. However, Clang/LLVM generates memcpy(x, x, n) and they have good reasons not to change this: http://llvm.org/bugs/show_bug.cgi?id=11763 As the result we have incompatibility between Clang and Valgrind, and Clang does not want to fix it. Does Valgrind want to? Any suggestions? % cat memcpy_bug.cc struct LargeStruct { int foo[100]; }; void copy(LargeStruct *a, LargeStruct *b) { *a = *b; } int main() { LargeStruct x; copy(&x, &x); } % gcc -g memcpy_bug.cc ; ~/valgrind/trunk/inst/bin/valgrind -q ./a.out % clang -g memcpy_bug.cc ; ~/valgrind/trunk/inst/bin/valgrind -q ./a.out ==28173== Source and destination overlap in memcpy(0xfff0000d0, 0xfff0000d0, 400) ==28173== at 0x402E280: memcpy@@GLIBC_2.14 (mc_replace_strmem.c:882) ==28173== by 0x400581: copy(LargeStruct*, LargeStruct*) (memcpy_bug.cc:6) ==28173== by 0x4005AC: main (memcpy_bug.cc:11) ==28173== % Thanks, --kcc |
|
From: Tom H. <to...@co...> - 2013-03-11 12:33:49
|
On 11/03/13 11:52, Konstantin Serebryany wrote: > There was a gcc bug that caused Valgrind/Memcheck to complain about > memcpy(x, x, n). > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39480 > Since gcc 4.3 it does not generate such code and the problem is gone. > > However, Clang/LLVM generates memcpy(x, x, n) and they have good reasons not > to change this: http://llvm.org/bugs/show_bug.cgi?id=11763 > > As the result we have incompatibility between Clang and Valgrind, and > Clang does not want to fix it. > Does Valgrind want to? Well the critical question is, does the C standard say this is safe or not? Tom -- Tom Hughes (to...@co...) http://compton.nu/ |
|
From: Konstantin S. <kon...@gm...> - 2013-03-11 13:16:59
|
On Mon, Mar 11, 2013 at 4:33 PM, Tom Hughes <to...@co...> wrote: > On 11/03/13 11:52, Konstantin Serebryany wrote: > >> There was a gcc bug that caused Valgrind/Memcheck to complain about >> memcpy(x, x, n). >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39480 >> Since gcc 4.3 it does not generate such code and the problem is gone. >> >> However, Clang/LLVM generates memcpy(x, x, n) and they have good reasons >> not >> to change this: http://llvm.org/bugs/show_bug.cgi?id=11763 >> >> As the result we have incompatibility between Clang and Valgrind, and >> Clang does not want to fix it. >> Does Valgrind want to? > > > Well the critical question is, does the C standard say this is safe or not? My understanding is that the C/C++ standards do not allow users to call memcpy(x, x, n), but the compiler may choose to generate memcpy(x, y, n) for *x=*y w/o checking x!=y, if it knows that the memcpy implementation is safe. --kcc > > Tom > > -- > Tom Hughes (to...@co...) > http://compton.nu/ |
|
From: Tom H. <to...@co...> - 2013-03-11 13:18:23
|
On 11/03/13 13:16, Konstantin Serebryany wrote: > On Mon, Mar 11, 2013 at 4:33 PM, Tom Hughes <to...@co...> wrote: >> On 11/03/13 11:52, Konstantin Serebryany wrote: >> >>> There was a gcc bug that caused Valgrind/Memcheck to complain about >>> memcpy(x, x, n). >>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39480 >>> Since gcc 4.3 it does not generate such code and the problem is gone. >>> >>> However, Clang/LLVM generates memcpy(x, x, n) and they have good reasons >>> not >>> to change this: http://llvm.org/bugs/show_bug.cgi?id=11763 >>> >>> As the result we have incompatibility between Clang and Valgrind, and >>> Clang does not want to fix it. >>> Does Valgrind want to? >> >> >> Well the critical question is, does the C standard say this is safe or not? > > My understanding is that the C/C++ standards do not allow users to > call memcpy(x, x, n), > but the compiler may choose to generate memcpy(x, y, n) for *x=*y w/o > checking x!=y, > if it knows that the memcpy implementation is safe. So clang are saying they have verified the glibc implementation is safe and have a guarantee from the glibc maintainers not to change it? Tom -- Tom Hughes (to...@co...) http://compton.nu/ |
|
From: Konstantin S. <kon...@gm...> - 2013-03-11 13:22:43
|
On Mon, Mar 11, 2013 at 5:18 PM, Tom Hughes <to...@co...> wrote: > On 11/03/13 13:16, Konstantin Serebryany wrote: >> >> On Mon, Mar 11, 2013 at 4:33 PM, Tom Hughes <to...@co...> wrote: >>> >>> On 11/03/13 11:52, Konstantin Serebryany wrote: >>> >>>> There was a gcc bug that caused Valgrind/Memcheck to complain about >>>> memcpy(x, x, n). >>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39480 >>>> Since gcc 4.3 it does not generate such code and the problem is gone. >>>> >>>> However, Clang/LLVM generates memcpy(x, x, n) and they have good reasons >>>> not >>>> to change this: http://llvm.org/bugs/show_bug.cgi?id=11763 >>>> >>>> As the result we have incompatibility between Clang and Valgrind, and >>>> Clang does not want to fix it. >>>> Does Valgrind want to? >>> >>> >>> >>> Well the critical question is, does the C standard say this is safe or >>> not? >> >> >> My understanding is that the C/C++ standards do not allow users to >> call memcpy(x, x, n), >> but the compiler may choose to generate memcpy(x, y, n) for *x=*y w/o >> checking x!=y, >> if it knows that the memcpy implementation is safe. > > > So clang are saying they have verified the glibc implementation is safe and > have a guarantee from the glibc maintainers not to change it? >> Eli Friedman 2012-01-13 19:47:28 CST >> This was done intentionally in r65699; in practice, memcpy implementations handle exact overlap correctly, and other implementations of struct assignment cause issues. (We're allowed to generate "illegal" calls to memcpy because LLVM is generating platform-specific assembly, not C code.) I don't think this has any stamp from glibc. :) --kcc > > > Tom > > -- > Tom Hughes (to...@co...) > http://compton.nu/ |
|
From: John R. <jr...@bi...> - 2013-03-11 15:03:03
|
On 03/11/2013 06:22 AM, Konstantin Serebryany wrote: > On Mon, Mar 11, 2013 at 5:18 PM, Tom Hughes <to...@co...> wrote: >> So clang are saying they have verified the glibc implementation is safe and >> have a guarantee from the glibc maintainers not to change it? > I don't think this has any stamp from glibc. :) As the clang/llvm threads indicate ( http://llvm.org/bugs/show_bug.cgi?id=11763#c8 , also https://bugs.kde.org/show_bug.cgi?id=148543 ) in practice the only known possibility for trouble is something like 'dcbz' on PowerPC, where the destination cache line might be zeroed before being read as identical overlapping source. Of course this applies only when n >= sizeof(line), actual alignment is known, etc. In practice it would be poor engineering for the C library to implement such an optimization without checking for exact overlap (dst==src). This check costs two instructions of space and usually zero time (all overlapped with previous execution on a 2-way superscalar processor.) Users will pay this cost gladly in order to get robust practical computing. Thus in practice the complaint "memcpy(x,x,n) is not OK" is very similar to noise. Because a suppression has no access to actual arguments and cannot check for src==dst, then there should be an option to turn off the complaint. -- |
|
From: Patrick J. L. <lop...@gm...> - 2013-03-11 16:38:00
|
On Mon, Mar 11, 2013 at 8:03 AM, John Reiser <jr...@bi...> wrote: > > As the clang/llvm threads indicate ( http://llvm.org/bugs/show_bug.cgi?id=11763#c8 , > also https://bugs.kde.org/show_bug.cgi?id=148543 ) > in practice the only known possibility for trouble is something like 'dcbz' on > PowerPC "I cannot imagine why you would want to do that, therefore nobody will ever want to do that"? I think I read about this one in my logic class. Proof By Lack Of Imagination, aka. the "Modus Bogus". It's UNDEFINED BEHAVIOR, for crying out loud. I could have sworn there was this thing called programming by contract, where "undefined behavior" means "don't do that". It's a rule. The funny part is, there is no development group on the planet more aggressively eager to take advantage of this rule than the Clang developers. They seem to think the same rule they apply so zealously to everybody else does not apply to themselves. Hilarious. > In practice it would be poor engineering for the C library to implement such an > optimization without checking for exact overlap (dst==src). This check costs > two instructions of space and usually zero time (all overlapped with previous > execution on a 2-way superscalar processor.) Users will pay this cost gladly > in order to get robust practical computing. By that logic, Clang could emit the self-assignment test before calling memcpy, and it would cost zero time. Maybe you should suggest it to the Clang developers. How far do you think you would get? > Thus in practice the complaint "memcpy(x,x,n) is not OK" is very similar to noise. The whole discussion is ludicrous. If the object is small, then "*x = *y" should be inlined. If it is large enough to bother with a call to memcpy(), then it is large enough that the self-assignment test is negligible. memcpy() has well-defined preconditions; Clang violates them; therefore Clang is buggy. Period, ipso facto, Q.E.D. > Because a suppression has no access to actual arguments and cannot check for src==dst, > then there should be an option to turn off the complaint. Although I disagree with every step of your reasoning, I do agree with your conclusion. Valgrind's purpose is to tell us when our code does something "funny", and overlapping memcpy() certainly qualifies. But sometimes we do something "funny" for a reason, so we need a way to shut up Valgrind. So one way or another, there needs to be a way to silence this warning. Clang is still buggy though. - Pat |
|
From: John R. <jr...@bi...> - 2013-03-12 16:21:18
|
On 03/11/2013 09:37 AM, Patrick J. LoPresti wrote: > On Mon, Mar 11, 2013 at 8:03 AM, John Reiser <jr...@bi...> wrote: >> >> As the clang/llvm threads indicate ( http://llvm.org/bugs/show_bug.cgi?id=11763#c8 , >> also https://bugs.kde.org/show_bug.cgi?id=148543 ) >> in practice the only known possibility for trouble is something like 'dcbz' on >> PowerPC > > "I cannot imagine why you would want to do that, therefore nobody will > ever want to do that"? If you know of other cases then please share details or citations. > > I think I read about this one in my logic class. Proof By Lack Of > Imagination, aka. the "Modus Bogus". > > It's UNDEFINED BEHAVIOR, for crying out loud. I could have sworn there > was this thing called programming by contract, where "undefined > behavior" means "don't do that". It's a rule. Programming standards are tools. I use them where they provide more benefits than costs. The benefit of "memcpy(x,x,n) is undefined" is small, smaller than zero. The supposed benefit actually is a cost. The cost is complexity, inhibition of unification, and development time. The political, public relations benefit of proclaiming "100% conformance" is dwarfed by the practical realities of software development. > > The funny part is, there is no development group on the planet more > aggressively eager to take advantage of this rule than the Clang > developers. They seem to think the same rule they apply so zealously > to everybody else does not apply to themselves. Hilarious. There is a development cost to implementing the code which enforces conformity to the rule. The cost varies over time. "Not now" might be a well-considered approach which justifies the current push-back. > >> In practice it would be poor engineering for the C library to implement such an >> optimization without checking for exact overlap (dst==src). This check costs >> two instructions of space and usually zero time (all overlapped with previous >> execution on a 2-way superscalar processor.) Users will pay this cost gladly >> in order to get robust practical computing. > > By that logic, Clang could emit the self-assignment test before > calling memcpy, and it would cost zero time. Maybe you should suggest > it to the Clang developers. How far do you think you would get? Putting the test inside memcpy does cost at most 1 cycle, but not every client has enough slack; client code can be tighter than the memcpy setup. Distributing the test into clients multiplies the space cost. It also increases the safety risk; a client might forget, or might not know the requirement. > >> Thus in practice the complaint "memcpy(x,x,n) is not OK" is very similar to noise. > > The whole discussion is ludicrous. If the object is small, then "*x = > *y" should be inlined. If it is large enough to bother with a call to > memcpy(), then it is large enough that the self-assignment test is > negligible. No, it is not ludicrous; it's engineering. If the source and destination are not known to to be aligned sufficiently, or if the length is variable, then often the call to memcpy is better than inlining, in space or time or both. > > memcpy() has well-defined preconditions; Clang violates them; > therefore Clang is buggy. Period, ipso facto, Q.E.D. *ALL* code is buggy in some way, yet we are obligated to make progress. This precondition interferes with current progress, and offers only small hope of future benefits. Also, any actual damage caused by violation is likely to be highly visible, and readily worked around: something similar to the intent of "#define memcpy memmove" at compile time, an intercepting overriding shared library at runtime, inserting the test in a few specific places, etc. > >> Because a suppression has no access to actual arguments and cannot check for src==dst, >> then there should be an option to turn off the complaint. > > Although I disagree with every step of your reasoning, I do agree with > your conclusion. Valgrind's purpose is to tell us when our code does > something "funny", and overlapping memcpy() certainly qualifies. But > sometimes we do something "funny" for a reason, so we need a way to > shut up Valgrind. So one way or another, there needs to be a way to > silence this warning. Yes, memcheck must provide a way to silence the warning. > > Clang is still buggy though. > > - Pat -- |
|
From: Patrick J. L. <lop...@gm...> - 2013-03-12 16:58:27
|
Well, much of this is philosophy, where we obviously disagree. So I will skip most of that and mainly respond to the technical error: On Tue, Mar 12, 2013 at 9:22 AM, John Reiser <jr...@bi...> wrote: > > No, it is not ludicrous; it's engineering. If the source and destination > are not known to to be aligned sufficiently, or if the length is variable, > then often the call to memcpy is better than inlining, in space or time > or both. We are talking about assignment (like "*x = *y"). So the length is a compile-time constant, and the source and destination are guaranteed to be aligned correctly. (If not, it is undefined behavior, which the compiler already assumes never happens.) So the decision to call memcpy() versus inlining depends on size and size alone. > *ALL* code is buggy in some way, yet we are obligated to make progress. > This precondition interferes with current progress, and offers only > small hope of future benefits. ...in your opinion. So go convince the glibc maintainers -- or POSIX/C/C++ committees -- to share your opinion and relax the preconditions. If you are so obviously right, it should be easy; it would not even harm backwards compatibility. That would be reasonable. I would even support it; memcpy() should have had memmove() semantics all along, in my view. What is not reasonable is to ignore preconditions on code you do not own on the grounds that they impose an undue cost... in your opinion. Because you never know when your code will run in an environment that assumes them. - Pat |