From: Julian S. <js...@ac...> - 2013-01-04 11:22:03
2013 greetings to all.
I've spent a bit of time fine-tuning the IR machinery lately, in
order to try and do better on ARM and particularly Thumb code.
Due to the use of conditionalisation, ARM/Thumb stress the IR
optimisation machinery in weird ways that the other front ends
don't, and we are missing some obvious tricks.
One change I am looking at is changing the guard type of Mux0X
from I8 to I1. This makes it consistent with I1 guard types for
Dirty helpers and for Exits. The use of I8 as a guard comes from
very early experiments supporting x86 via IR. It seemed like a
good idea at the time, but makes no sense really.
Changing the order of the second and third args in Mux0X, so it
matches the familiar C ?: syntax, and possibly renaming it,
would be a nice thing, but does not change the generated code.
If anyone wants to volunteer to do that (and verify the change is
correct :) please speak up.
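
To make the proposed reordering concrete, here is a minimal sketch in
Python. It assumes Mux0X(c, e0, eX) yields e0 when c is zero and eX
otherwise; the name ite is hypothetical and only stands in for whatever
the renamed, reordered form might be called.

```python
def mux0x(cond, expr0, exprX):
    """Assumed current semantics: expr0 if cond == 0, else exprX."""
    return expr0 if cond == 0 else exprX


def ite(cond, if_true, if_false):
    """Hypothetical reordered form, mirroring C's `cond ? a : b`."""
    return if_true if cond else if_false


# With an I1 guard, cond is only ever 0 or 1. Swapping the second and
# third arguments is the whole change; the selected value is the same.
for c in (0, 1):
    assert mux0x(c, "e0", "eX") == ite(c, "eX", "e0")
```

Since only the argument order differs, the generated code should be
unaffected, as noted above.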
------------
The key thing I want to do once that is finished is make the
constant folder able to look 'backwards' through expressions in
the flattened IR, when looking for transformations to do. This
should be easy since the folder now carries around a mapping from
IRTemps to IRExprs. Currently the folder only looks at "one level"
of expression, which means we miss out on opportunities to fold out
pointless identities such as
    CmpNE32(1Uto32(boolexpr), 0) ==> boolexpr
and many others, particularly relating to Memcheck
instrumentation. Some such folding is done in the tree builder,
but that is right at the end of the compilation pipeline, so it
is too late for other transformations to benefit from it. Doing
it early would be so much better. Also, doing it just once in
iropt would avoid various half-baked hacks in the instruction
selectors which try to deal with the worst of it, on an ad hoc
architecture-specific basis.
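
A rough sketch of what "looking backwards" could mean, in Python rather
than the real C code. Everything here is a toy model: expressions are
tuples, statements are (temp, expr) pairs, and chase, fold and
fold_block are hypothetical names, not iropt functions. The only idea
taken from the above is that the folder keeps a temp-to-expression map
and consults it when matching.

```python
def is_const(e, value):
    return e[0] == "const" and e[1] == value


def chase(env, e):
    """Follow a temp back to its defining expression, if we have one."""
    while e[0] == "tmp" and e[1] in env:
        e = env[e[1]]
    return e


def fold(env, e):
    """Fold CmpNE32(1Uto32(b), 0) ==> b, looking through temps."""
    if e[0] == "binop" and e[1] == "CmpNE32" and is_const(e[3], 0):
        inner = chase(env, e[2])
        if inner[0] == "unop" and inner[1] == "1Uto32":
            return inner[2]  # the original boolean atom
    return e


def fold_block(stmts):
    """One forward pass over flattened statements (dst_temp, rhs_expr)."""
    env, out = {}, []
    for dst, rhs in stmts:
        rhs = fold(env, rhs)
        env[dst] = rhs  # remember the definition for later lookups
        out.append((dst, rhs))
    return out


# t0 is some boolean computed earlier; it has no entry in env here.
block = [
    ("t1", ("unop", "1Uto32", ("tmp", "t0"))),
    ("t2", ("binop", "CmpNE32", ("tmp", "t1"), ("const", 0))),
]
folded = fold_block(block)
```

A one-level folder sees only CmpNE32(t1, 0) in the second statement and
cannot simplify it. With the lookup, t2 becomes a plain copy of t0, and
t1 is left for dead-code removal to clean up if nothing else uses it.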
In particular, ARM/Thumb makes it clear the folder needs to learn
about folding nested Mux0Xs. The following transformations would
be useful:
    Mux0X(c, Mux0X(c, x, y), z)  ==> Mux0X(c, x, z)
    Mux0X(c, x, Mux0X(c, y, z))  ==> Mux0X(c, x, z)
    Mux0X(c, x, Mux0X(!c, y, z)) ==> Mux0X(c, x, y)
This last one is interesting because it appears a lot in idiomatic
Thumb code from gcc. To get either X or Y in a register, gcc
generates
    cmp   ...          // set the flags
    moveq rD, X        // rD := X if Z flag set
    movne rD, Y        // rD := Y if Z flag clear
Currently we wind up with an IR expression of the form
    Mux0X(c, X, Mux0X(!c, Y, old-value-of-rD))
Because iropt doesn't currently know that 'c || !c' covers all
possibilities, it keeps the old value of rD alive unnecessarily.
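
The three rules can be sketched on a toy tree representation, again in
Python. This assumes Mux0X(c, e0, eX) selects e0 when c is zero, and
that guards are single bits (I1). The helper names (mux, neg, fold_mux,
evaluate) are made up for the illustration, and the sketch works on
trees rather than flattened IR to keep it short.

```python
def mux(c, e0, eX):
    return ("mux", c, e0, eX)


def neg(c):
    return ("not", c)


def is_mux(e):
    return isinstance(e, tuple) and e[0] == "mux"


def is_negation_of(a, b):
    return a == ("not", b) or b == ("not", a)


def fold_mux(e):
    """Apply the three nested-Mux0X rules, bottom up."""
    if not is_mux(e):
        return e
    _, c, e0, eX = e
    e0, eX = fold_mux(e0), fold_mux(eX)
    if is_mux(e0) and e0[1] == c:
        e0 = e0[2]  # rule 1: c is zero here, inner takes its zero arm
    if is_mux(eX):
        if eX[1] == c:
            eX = eX[3]  # rule 2: c is nonzero here, inner takes its X arm
        elif is_negation_of(eX[1], c):
            eX = eX[2]  # rule 3: inner guard is zero, inner takes its zero arm
    return mux(c, e0, eX)


def evaluate(e, env):
    """Reference semantics, used only to check the rewrite."""
    if is_mux(e):
        _, c, e0, eX = e
        return evaluate(e0, env) if evaluate(c, env) == 0 else evaluate(eX, env)
    if isinstance(e, tuple) and e[0] == "not":
        return 1 - evaluate(e[1], env)
    return env[e]


# The moveq/movne pattern from above.
expr = mux("c", "X", mux(neg("c"), "Y", "old_rD"))
folded = fold_mux(expr)

for c in (0, 1):
    env = {"c": c, "X": 10, "Y": 20, "old_rD": 30}
    assert evaluate(expr, env) == evaluate(folded, env)
```

For the gcc pattern the result is Mux0X(c, X, Y), so the old value of rD
no longer appears in the expression and need not be kept alive.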
------------
Unrelated to all this .. I have also more or less finished a
first implementation of direct IR support for conditional loads
and stores, in the COMEM branch. This improves accuracy of
Memcheck on ARM and should facilitate support of AVX/AVX2
scatter/gather loads/stores. I expect to merge this to trunk in
the next couple of weeks or so. There is still a bit of
adjustment to all the other (non-ARM) backends that needs to be
done first.
J