From: Florian K. <br...@ac...> - 2011-04-28 03:14:37
Attachments:
unit-test-patch
I have been thinking about end-to-end unit-testing of the VEX
translation pipeline. My motivation is to set up a test suite
for s390x where I run a single insn (or a handful of them)
through the VEX pipeline and make sure that
  (a) we don't assert
  (b) IR optimizations happen as expected
  (c) code generation takes advantage of insns available only at
      certain hwcaps levels

So I've written a small program VEX/auxprogs/run-insns.c which
reads a stream of encoded instructions from a file and pipes
them through VEX without any instrumentation. For instance,
if this is the input file:

$ cat flogr

A5 4F DE AD   # R4 = 0xDEAD
B9 83 00 14   # R1 = leftmost-bit(R4); R2 = R4 with leftmost-bit=0
00 00

then

    run-insns --s390x --trace-flags=00000010 flogr

will produce

------------------------ Register-allocated code ------------------------

  0   v-loadi   %r1,0xDEAD              8 bytes
  1   v-store   %r1,guest_r4            8 bytes
  2   v-loadi   %r1,48                  8 bytes
  3   v-store   %r1,guest_r1            8 bytes
  4   v-loadi   %r1,0x5EAD              8 bytes
  5   v-store   %r1,guest_r2            8 bytes
  6   v-loadi   %r1,2                   8 bytes
  7   v-store   %r1,guest_CC_OP         8 bytes
  8   v-loadi   %r1,0xDEAD              8 bytes
  9   v-store   %r1,guest_CC_DEP1       8 bytes
 10   v-loadi   %r1,0                   8 bytes
 11   v-store   %r1,guest_CC_DEP2       8 bytes
 12   v-loadi   %r1,0                   8 bytes
 13   v-store   %r1,guest_CC_NDEP       8 bytes
 14   v-loadi   %r1,8                   8 bytes
 15   v-store   %r1,guest_IA            8 bytes
 16   if (always) goto guest_r0         0 bytes

I can easily verify that constant folding and propagation knock
down all the complexity that the initial IR had. Good.
By using --trace-flags=00000001 I can study the generated code
for inefficiencies, i.e. check whether it actually uses the
instructions that are only available when certain hwcaps are present.
So this is quite useful for building up a test suite for various
complex instructions.
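
The constants in the listing above (48 in guest_r1, 0x5EAD in guest_r2) can be cross-checked by hand. The following is a minimal sketch, assuming the semantics given in the input file's comment: the first result is the number of leading zero bits of the 64-bit operand, the second is the operand with its leftmost one bit cleared. The function name `flogr_model` is made up for illustration and is not part of VEX or run-insns.

```python
def flogr_model(value):
    """Model the two results described in the flogr input file comment.

    Returns (leading_zero_count, value_with_leftmost_one_bit_cleared)
    for a 64-bit operand. The all-zero case returning (64, 0) is an
    assumption about the instruction, not something shown in the thread.
    """
    value &= (1 << 64) - 1           # treat the operand as 64 bits wide
    if value == 0:
        return 64, 0
    top = value.bit_length() - 1     # index of the leftmost one bit
    return 63 - top, value & ~(1 << top)


leading_zeros, remainder = flogr_model(0xDEAD)
print(leading_zeros, hex(remainder))
```

For the operand 0xDEAD the leftmost one bit is bit 15, so the two results agree with the immediates that the folded code stores to guest_r1 and guest_r2.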

Essentially, run-insns.c sets up abi_info, arch_info, vex_control,
and VexTranslateArgs based on the command line flags given to it and
then invokes VEX. The whole thing is extensible but currently works
for s390x only.
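
To make the input format concrete, here is a rough sketch of how a file like flogr could be turned into a byte stream before it is handed to the translator. It is not taken from run-insns.c; it only assumes what the example shows: whitespace-separated hex bytes, with `#` starting a comment that runs to the end of the line. The name `parse_insn_text` is hypothetical.

```python
SAMPLE = """\
A5 4F DE AD   # R4 = 0xDEAD
B9 83 00 14   # R1 = leftmost-bit(R4); R2 = R4 with leftmost-bit=0
00 00
"""


def parse_insn_text(text):
    """Return the encoded instruction bytes found in `text`.

    Assumed layout (inferred from the flogr example only): hex byte
    tokens separated by whitespace, '#' starts a comment.
    Raises ValueError for tokens that are not hex bytes in 00..FF.
    """
    out = bytearray()
    for line in text.splitlines():
        code = line.split("#", 1)[0]      # drop the trailing comment
        for token in code.split():
            out.append(int(token, 16))    # append() rejects values > 0xFF
    return bytes(out)


print(parse_insn_text(SAMPLE).hex())
```

The sample yields ten bytes: two 4-byte instructions followed by the `00 00` pair that ends the example.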

The changes to VEX proper are minimal: two new members in vex_control,
that's it. One enables unit-testing and the other enables printing of
symbolic names for guest registers in VCODE (instead of printing the
amode). See above: guest_r1, guest_CC_OP etc.

One thing I'm not exactly sure about is where to put the tests.
I looked at VEX/test but did not understand how it works.
So I've added VEX/tests/s390x and put the tests there. Is that agreeable?

Next, and more complex, is whether I can / should reuse the
vg_regtest machinery. It seems to always want to invoke valgrind, but
that would be overkill here. I have not yet looked at vg_regtest in
detail, so comments on its suitability would be welcome.
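
Independently of whether vg_regtest fits, the comparison step such a test needs is small. The sketch below is only an illustration of that step, not a description of vg_regtest: it compares a produced listing with a stored expected listing and ignores differences in column spacing. The names `normalize` and `compare_listings` are hypothetical.

```python
import difflib


def normalize(listing):
    """Collapse whitespace runs and drop blank lines of a VCODE listing."""
    return [" ".join(line.split()) for line in listing.splitlines() if line.strip()]


def compare_listings(actual, expected):
    """Return unified-diff lines; an empty list means the listings match."""
    return list(
        difflib.unified_diff(
            normalize(expected), normalize(actual),
            "expected", "actual", lineterm="",
        )
    )


expected = "0 v-loadi %r1,0xDEAD 8 bytes\n1 v-store %r1,guest_r4 8 bytes\n"
actual = "  0   v-loadi   %r1,0xDEAD   8 bytes\n  1   v-store   %r1,guest_r4   8 bytes\n"
print(compare_listings(actual, expected) or "match")
```

A driver would run run-insns on each input file, capture its output, and report the diff for any test where the returned list is not empty.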
Florian
Attached is an initial work-in-progress patch.
|
|
From: Julian S. <js...@ac...> - 2011-04-28 07:43:31
Seems like a good idea. There's a kind-of functionality overlap
which you may not be aware of, though (?): VEX/test_main.c is
a program which does something similar, but has two different
features:
(1) it contains a very cut-down, ancient version of the instrumenter
used in memcheck/mc_translate.c, so you can play the same game,
but also check that the back end does a reasonable job on the
IR ops that are generated only by Memcheck (CmpNEZ*, Left*)
(2) it will process a whole stream of blocks, and optionally
can iterate each block as many times as required. This is
useful for profiling VEX.
It reads the ".orig" files, e.g. VEX/orig_x86/exit42.orig.
It's fragile and unmaintained and hacky, but it's sometimes
useful. A clean, more maintainable semi-equivalent would be
no bad thing. Not sure if (1) is worth keeping. (2) is
occasionally useful.
J
|
|
From: Florian K. <br...@ac...> - 2011-05-04 03:46:17
Indeed there is a lot of overlap with test_main.c and combining them
sounds good to me.

One thing: how do you create these .orig files, e.g.
orig_x86/exit42.orig? It does not seem possible to do that with
current trunk. With --trace-flags=00000001 the output contains

GuestBytes 41DCEA7 4 8B 1C 24 C3 000004A3

which has pretty much the same information as the "." lines in
the orig files. IMHO it makes more sense to support this format than
the format in the orig files. That way we can easily create input data
using --trace-flags. The .orig files can be converted to this new
format; that's probably not a big deal.
Does that sound reasonable or would you rather keep the orig files as is?

I've made some changes to run-insns. It supports all architectures now.
run-insns --help shows:

usage: run-insns [options] file

Architecture options
  --x86
  --amd64
  --arm
  --ppc32
  --ppc64
  --s390x

General options
  --address=HEX              address of 1st insn in basic block [default=0]
  --num-bbs=INT              number of bbs to process (only with --orig)
  --num-iter=INT             number of iterations for LibVEX_Translate
  --opt-level=0|1|2          set optimization level
  --orig                     process a '.orig' file
  --stats                    print VEX statistics
  --trace-flags=xxxxxxxx     select what to trace
  --verbosity=0|1|2|...|10   set verbosity

x86 specific options
  --sse1                     host has SSE1 capability
  --sse2                     host has SSE2 capability
  --sse3                     host has SSE3 capability
  --lzcnt                    host has LZCNT insn

amd64 specific options
  --sse3                     host has SSE3 capability
  --cx16                     host has cmpxchg16b support
  --lzcnt                    host has LZCNT insn
  --fs=0                     translate %fs-prefixed insn assuming %fs = 0
  --gs=0x60                  translate %gs-prefixed insn assuming %gs = 0x60

arm specific options
  --vfp                      host supports VFP extensions
  --vfp2                     host supports VFPv2 extensions
  --vfp3                     host supports VFPv3 extensions
  --neon                     host supports NEON extensions
  --archlevel=               host's architecture level

ppc32 specific options
  --fp                       host has floating point capability
  --vmx                      host has VMX (Altivec) capability
  --fx                       host has floating point extensions
  --gx                       host has graphics extensions
  --vx                       host has vector-scalar floating point
  --cache-line=              size of a cache line in bytes
  --dcbz-bytes=              number of bytes zeroed by the dcbz insn
  --dcbzl-bytes=             number of bytes zeroed by the dcbzl insn

ppc64 specific options
  --vmx                      host has VMX (Altivec) capability
  --fx                       host has floating point extensions
  --gx                       host has graphics extensions
  --vx                       host has vector-scalar floating point
  --cache-line=              size of a cache line in bytes
  --dcbz-bytes=              number of bytes zeroes by the dcbz insn
  --dcbzl-bytes=             number of bytes zeroes by the dcbzl insn

s390x specific options
  --ldisp                    host has long-displacement
  --eimm                     host has extended-immediate
  --gie                      host has general instruction extension
  --dfp                      host has decimal floating point
  --fgx                      host has FPR-GR transfer

So you can pretty much control VEX's knobs via the command line.
I've opened #272405 in bugzilla and attached the source code there.

Florian
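
To illustrate why the GuestBytes line is a convenient input format, here is a small sketch that splits the example line from this message into its parts. The field meanings are assumptions read off that one example: a hex address, a decimal byte count, that many hex bytes, and a trailing hex word. The trailing word looks like a checksum; the shift-and-xor recurrence in `rolling_checksum` reproduces 000004A3 for the example bytes, but that reading is an inference, not something stated in the thread. Both function names are hypothetical.

```python
def parse_guest_bytes(line):
    """Split a 'GuestBytes <addr> <count> <bytes...> <trailer>' line.

    Returns (address, data, trailer). The field layout is assumed
    from the single example quoted in the thread.
    """
    fields = line.split()
    if len(fields) < 4 or fields[0] != "GuestBytes":
        raise ValueError("not a GuestBytes line")
    address = int(fields[1], 16)
    count = int(fields[2])                         # assumed to be decimal
    data = bytes(int(tok, 16) for tok in fields[3:-1])
    if len(data) != count:
        raise ValueError("byte count does not match the data")
    trailer = int(fields[-1], 16)
    return address, data, trailer


def rolling_checksum(data):
    """Shift-and-xor over the bytes, kept to 32 bits (assumed scheme)."""
    total = 0
    for byte in data:
        total = ((total << 1) ^ byte) & 0xFFFFFFFF
    return total


address, data, trailer = parse_guest_bytes(
    "GuestBytes 41DCEA7 4 8B 1C 24 C3 000004A3"
)
print(hex(address), data.hex(), trailer == rolling_checksum(data))
```

A converter for the existing .orig files would only need to emit lines in this shape, which is why supporting one format on the input side seems sufficient.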