From: Florian K. <br...@ac...> - 2012-12-30 00:50:17
Attachments: vex-patch
Hello Julian,
I was looking at improving s390 code generation by taking advantage of
memory-to-memory instructions. E.g. there is an insn that has memcpy
semantics and I'd like to use it. But today, there is no opportunity
to do so. Looking at IR optimisation I spotted a few patterns that
could be improved. Here is one:

   t43 = GET:I64(264)
   DIRTY 1:I1 RdFX-gst(312,8) RdFX-gst(336,8) :::
      MC_(helperc_STOREV64be)[rp=1] {0x40100af90}(t5,GET:I64(712))
   STbe(t5) = t43

t43 is not used anywhere else (use count == 1). So I'd like IR optim
to transform the above into:

   DIRTY 1:I1 RdFX-gst(312,8) RdFX-gst(336,8) :::
      MC_(helperc_STOREV64be)[rp=1] {0x40100af90}(t5,GET:I64(712))
   STbe(t5) = GET:I64(264)

For the last IR stmt I can use a memory-to-memory insn. That saves me
one insn and one register.
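This transform is safe: the dirty helper only reads guest state (both
annotations are RdFX-gst) and does not PUT anything, so moving the GET
past it cannot change the value that is read.
However, ado_treebuild_BB assumes that a dirty helper always PUTs and
always STOREs. That is more conservative than it needs to be.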
The attached patch looks at dirty helpers more closely and gives a more
precise answer WRT modifying guest state or memory. It will enable the
above transformation.
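
As a rough illustration of the idea (this is not the attached patch, only
a sketch), the check amounts to looking at the effects a dirty call
declares instead of assuming the worst. The types and names below
(FxKind, DirtyInfo, dirty_may_put, dirty_may_store) are simplified
stand-ins made up for this example, not the real VEX definitions:

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a dirty call's declared effects. */
typedef enum { FX_NONE, FX_READ, FX_WRITE, FX_MODIFY } FxKind;

typedef struct {
   FxKind mem_fx;      /* declared effect on memory, FX_NONE if absent */
   int    n_gst_fx;    /* number of declared guest state effects */
   FxKind gst_fx[4];   /* one entry per declared guest state area */
} DirtyInfo;

/* An effect changes its target if it writes or modifies it. */
static bool fx_writes(FxKind fx)
{
   return fx == FX_WRITE || fx == FX_MODIFY;
}

/* True if the helper may change guest state, i.e. behaves like a PUT. */
static bool dirty_may_put(const DirtyInfo *d)
{
   for (int i = 0; i < d->n_gst_fx; i++)
      if (fx_writes(d->gst_fx[i]))
         return true;
   return false;
}

/* True if the helper may change memory, i.e. behaves like a STORE. */
static bool dirty_may_store(const DirtyInfo *d)
{
   return fx_writes(d->mem_fx);
}
```

For the helper in the example above (two RdFX-gst annotations and no
declared memory effect), both functions would return false, so a GET
can be moved past it.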
I've tested this on x86-64 and s390x with no new regressions.
--stats=yes tells me that about 2000 insns are saved (as is, without
changing insn selection). Runtime on x86-64 is unchanged, as expected;
all differences are within the noise margin.
Any objections to applying it?
Florian