|
From: Tom H. <to...@co...> - 2005-07-05 16:32:31
|
I've just spent all afternoon chasing a seemingly bogus warning coming from optimised code on an amd64 box and I've finally worked out what seems to be going on.

The routine that is causing the problem has a second argument which is a short containing some flags, so it comes in as %rsi and the compiler moves it to %r15 with:

    movswl %si, %r15d

There is then a second short on the stack which gets initialised to one of two values, and later the function adds 16 to that value if bit 7 of the flags word that was passed in is set. For that operation the compiler generates this:

    mov    0x2bac(%rsp), %edx
    mov    0x2bac(%rsp), %eax
    add    $0x10, %edx
    inc    %r15b
    cmovle %edx, %eax
    mov    %ax, 0x2bac(%rsp)

So it loads 4 bytes from the stack into both %edx and %eax (knowing that the variable is only two bytes, but that it is safe to load 4) then does the add of 16 to the value in %edx, so the top half is still uninitialised.

It then increments the low byte of %r15, which still contains our flags, because it knows that incrementing a value of 128 or more will produce either zero or a negative value so that the LE condition in the next line will hold, but any value less than 128 will produce a positive result so the LE will not hold.

Then it conditionally moves %edx to %eax so that the original value is replaced by the updated value if and only if the top bit of %r15b was set, and finally it saves the low word of %eax back to the stack.

The problem is that memcheck considers the add to make some of the condition flags undefined because the upper half of one of the arguments was undefined. That then causes some of the condition flags coming out of the inc to be undefined which makes the cmov result undefined and so on...

Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
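The claim that "inc %r15b" followed by an LE test is equivalent to testing bit 7 can be checked exhaustively over all 256 byte values. The following is only a small Python model of the x86 flag rules for an 8-bit INC (the function names are made up for illustration and are not taken from the compiler or from valgrind):

```python
def inc8_flags(value):
    """Model the Z, S and O flags produced by an 8-bit INC of `value`."""
    result = (value + 1) & 0xFF
    zf = result == 0              # zero flag: result wrapped to 0
    sf = bool(result & 0x80)      # sign flag: top bit of the result
    of = value == 0x7F            # overflow only for 0x7f -> 0x80
    return zf, sf, of


def le_after_inc(value):
    """LE condition as used by cmovle: ZF set, or SF differs from OF."""
    zf, sf, of = inc8_flags(value)
    return zf or (sf != of)


def mismatches():
    """Byte values where LE-after-INC disagrees with 'bit 7 is set'."""
    return [v for v in range(256) if le_after_inc(v) != bool(v & 0x80)]


if __name__ == "__main__":
    print(mismatches())
```

Note that the carry flag plays no part in the LE condition here, which is relevant to the false positive described above.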
|
From: Julian S. <js...@ac...> - 2005-07-05 17:00:49
|
Yeh, compiler writers are really a rotten bunch, huh :-? Thanks for chasing this around.

> The problem is that memcheck considers the add to make some of the
> condition flags undefined because the upper half of one of the
> arguments was undefined. That then causes some of the condition flags
> coming out of the inc to be undefined which makes the cmov result
> undefined and so on...

That's certainly true for the 2.X line, but vex and the 3.X memcheck are .. well, at least marginally more clever about this kind of thing.

There is some possibility that, going downhill with a following wind, etc, it could be fixed. For starters, can you get me the post-optimisation pre-instrumentation IR resulting from the code fragment containing the cmovle? That is: run with --tool=none (really) --trace-flags=10001000 and probably play some games with --trace-notbelow=<large number> unless you enjoy wading through gigabytes of log files. I might be able to tell if the situation can be improved once I see the IR (the "after tree building" stuff).

J
|
From: Tom H. <to...@co...> - 2005-07-05 17:31:11
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
> > The problem is that memcheck considers the add to make some of the
> > condition flags undefined because the upper half of one of the
> > arguments was undefined. That then causes some of the condition flags
> > coming out of the inc to be undefined which makes the cmov result
> > undefined and so on...
>
> That's certainly true for the 2.X line, but vex and the 3.X memcheck are
> .. well, at least marginally more clever about this kind of thing.
The full problem is, I think, that INC preserves the carry flag, so
that is propagated through the INC unchanged. The ADD left the carry
flag undefined, so the flags after the INC are treated as undefined
and that makes the CMOV appear to produce an undefined result even
though it only looked at the Z, O and S flags.
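
The carry-flag argument above can be illustrated with a toy definedness model. This is only a sketch: it is not how memcheck or vex actually represent flag definedness, and every name in it is hypothetical. It assumes the pessimistic rule from earlier in the thread (an ADD with any undefined operand bits leaves all flags undefined) and compares treating the flags as one unit with tracking each flag separately:

```python
FLAGS = ("C", "P", "A", "Z", "S", "O")


def add_flags_defined(operands_fully_defined):
    """Pessimistic rule: any undefined operand bit taints every flag."""
    return {flag: operands_fully_defined for flag in FLAGS}


def inc_flags_defined(previous, operand_defined):
    """INC rewrites every flag except C, which it passes through."""
    result = {flag: operand_defined for flag in FLAGS}
    result["C"] = previous["C"]
    return result


def cmovle_defined_whole(flags):
    """Flags treated as a single unit: one undefined flag taints all."""
    return all(flags.values())


def cmovle_defined_per_flag(flags):
    """Only the flags that the LE condition reads: Z, S and O."""
    return all(flags[flag] for flag in ("Z", "S", "O"))


if __name__ == "__main__":
    after_add = add_flags_defined(False)            # upper half of %edx undefined
    after_inc = inc_flags_defined(after_add, True)  # %r15b fully defined
    print(cmovle_defined_whole(after_inc))          # bogus-warning case
    print(cmovle_defined_per_flag(after_inc))       # what the cmov really needs
```

In this model the whole-unit check reports the CMOV condition as undefined purely because of the stale carry flag, while the per-flag check does not.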
> There is some possibility that, going downhill with a following wind,
> etc, it could be fixed. For starters, can you get me the post-
> optimisation pre-instrumentation IR resulting from the code fragment
> containing the cmovle? That is: run with --tool=none (really)
> --trace-flags=10001000 and probably play some games with
> --trace-notbelow=<large number> unless you enjoy wading through
> gigabytes of log files. I might be able to tell if the situation can
I'll send you my test program and the trace output.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Julian S. <js...@ac...> - 2005-07-05 17:39:26
|
> > That's certainly true for the 2.X line, but vex and the 3.X memcheck are
> > .. well, at least marginally more clever about this kind of thing.
>
> The full problem is I think that INC preserves the carry flag, so
> that is propagated through the INC unchanged. The ADD left the carry
> flag undefined so the flags after the INC are undefined and that
> makes the CMOV appear to produce an undefined result even though
> it only looked at the Z, O and S flags.

Yeh, but the IR optimiser may be able to knock out some of those bogus dependencies. That's why I want to see the post-optimisation IR.

J