From: Carl E. L. <ce...@us...> - 2016-11-17 19:25:29
Julian, Valgrind developers:
Philippe found an issue on Power when self-hosting Valgrind. The issue
shows up on Power 7, which does not support ISA 2.07.
The issue is a check that compares what the run-time probe says the
hardware supports against what the system reports.
In short, in initimg-linux.c, at about line 736, we check whether the auxv
vector has the ISA 2.07 flag set and compare that with the
vex_archinfo->hwcaps value that was calculated by Valgrind.
   /*
      The a_val member of this entry is a bit map of hardware
      capabilities. Some bit mask values include:
      PPC_FEATURE2_ARCH_2_07        0x80000000
      PPC_FEATURE2_HAS_HTM          0x40000000
      PPC_FEATURE2_HAS_DSCR         0x20000000
      PPC_FEATURE2_HAS_EBB          0x10000000
      PPC_FEATURE2_HAS_ISEL         0x08000000
      PPC_FEATURE2_HAS_TAR          0x04000000
      PPC_FEATURE2_HAS_VCRYPTO      0x02000000
   */
   auxv_2_07 = (auxv->u.a_val & 0x80000000ULL) == 0x80000000ULL;
   hw_caps_2_07 = (vex_archinfo->hwcaps & VEX_HWCAPS_PPC64_ISA2_07)
                  == VEX_HWCAPS_PPC64_ISA2_07;
   /* Verify the PPC_FEATURE2_ARCH_2_07 setting in HWCAP2
    * matches the setting in VEX HWCAPS.
    */
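As an illustration only, the same bit test can be reproduced in an ordinary user-space program. This is a minimal sketch, not the Valgrind code: it assumes Linux with a C library that provides getauxval() in <sys/auxv.h>, and the names FEATURE2_ARCH_2_07, hwcap2_has and kernel_reports_isa_2_07 are made up for this example (a different macro name is used so it does not clash with any PPC_FEATURE2_ARCH_2_07 the system headers may define).

```c
#include <stdbool.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP2 (Linux) */

/* Bit value taken from the table quoted above; the macro name is
 * hypothetical and chosen to avoid clashing with system headers. */
#define FEATURE2_ARCH_2_07 0x80000000UL

/* True if every bit of 'mask' is set in 'hwcap2'. */
bool hwcap2_has(unsigned long hwcap2, unsigned long mask)
{
    return (hwcap2 & mask) == mask;
}

/* True if the kernel's AT_HWCAP2 entry advertises ISA 2.07.
 * Only meaningful on Power; on other architectures the bit has
 * a different meaning or AT_HWCAP2 may be absent (value 0). */
bool kernel_reports_isa_2_07(void)
{
    return hwcap2_has(getauxval(AT_HWCAP2), FEATURE2_ARCH_2_07);
}
```

Valgrind itself walks the client's auxv entries on the initial stack rather than calling getauxval(), so this only mirrors the comparison, not the surrounding code.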
The issue is that, on the inner Valgrind, vex_archinfo->hwcaps has the
VEX_HWCAPS_PPC64_ISA2_07 bit set.
The vex_archinfo->hwcaps value is set in coregrind/m_initimg/machine.c at
about line 1167:
   /* Check for ISA 2.07 support. */
   have_isa_2_07 = True;
   if (VG_MINIMAL_SETJMP(env_unsup_insn)) {
      have_isa_2_07 = False;
   } else {
      __asm__ __volatile__(".long 0x7c000166"); /* mtvsrd XT,RA */
   }
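For readers unfamiliar with this probing pattern, here is a minimal user-space sketch of the same idea using POSIX sigsetjmp/siglongjmp and a SIGILL handler. It is an assumption-laden stand-in, not Valgrind's implementation: VG_MINIMAL_SETJMP and Valgrind's own signal handling are not used, all names (probe_env, on_sigill, probe_supported, harmless, faulting) are hypothetical, and raise(SIGILL) stands in for executing an unsupported instruction so the example stays portable.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

/* SIGILL handler: jump back to the probe, making sigsetjmp return 1. */
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if candidate() ran to completion, 0 if it raised SIGILL. */
int probe_supported(void (*candidate)(void))
{
    struct sigaction sa, old;
    volatile int supported = 1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGILL, &sa, &old);

    /* Second argument 1: save the signal mask so that siglongjmp
     * restores it and SIGILL is unblocked again after the jump. */
    if (sigsetjmp(probe_env, 1)) {
        supported = 0;          /* reached via siglongjmp from the handler */
    } else {
        candidate();            /* the "instruction" being probed */
    }

    sigaction(SIGILL, &old, NULL);   /* restore the previous handler */
    return supported;
}

/* Stand-ins for a supported and an unsupported instruction. */
void harmless(void) { }
void faulting(void) { raise(SIGILL); }
```

Calling probe_supported(harmless) takes the else branch and leaves the flag set, while probe_supported(faulting) comes back through the handler and clears it, which corresponds to the two return values of VG_MINIMAL_SETJMP() discussed below.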
I have put debug statements into this code for both the inner and outer
valgrind trees. The value returned by VG_MINIMAL_SETJMP() is 1 for the
outer Valgrind and thus have_isa_2_07 is set to False. For the inner
valgrind, the return value is 0 and we do not change have_isa_2_07.
I have studied up a little on the setjmp/longjmp stuff. At this point,
I don't know why they don't seem to work in the inner Valgrind. I was
wondering if anyone had any insight into why it doesn't work on the inner?
Carl Love
On Sat, 2016-11-12 at 11:45 +0100, Philippe Waroquiers wrote:
> Hello Carl,
>
> I am busy investigating a strange behaviour (loss of performance)
> on ppc64 (callgrind tool), probably due to revision r16121.
>
> Doing that, I wanted to do valgrind self hosting
> (see README_DEVELOPERS for more info).
>
> When doing that, the inner valgrind asserts on the assert
> that I have commented below
> I can of course commit a fix that the below assert is not
> checked in an inner setup, but as I do not understand much
> of the below, it would be nice if you could investigate ?
> (no urgency of course, as the bypass is ok for now)
> Thanks
>
> Philippe
>
>
> Index: coregrind/m_initimg/initimg-linux.c
> ===================================================================
> --- coregrind/m_initimg/initimg-linux.c (revision 16120)
> +++ coregrind/m_initimg/initimg-linux.c (working copy)
> @@ -739,7 +739,7 @@
> /* Verify the PPC_FEATURE2_ARCH_2_07 setting in HWCAP2
> * matches the setting in VEX HWCAPS.
> */
> - vg_assert(auxv_2_07 == hw_caps_2_07);
> +// vg_assert(auxv_2_07 == hw_caps_2_07);
> }
>
> break;
>
>
From: Philippe W. <phi...@sk...> - 2016-11-17 20:01:10
On Thu, 2016-11-17 at 11:25 -0800, Carl E. Love wrote:
> I have put debug statements into this code for both the inner and outer
> valgrind trees. The value returned by VG_MINIMAL_SETJMP() is 1 for the
> outer Valgrind and thus have_isa_2_07 is set to False. For the inner
> valgrind, the return value is 0 and we do not change have_isa_2_07.
>
> I have studied up a little on the setjmp/longjmp stuff. At this point,
> I don't know why they don't seem to work in the inner Valgrind. I was
> wondering if anyone had any insight into why it doesn't work on the inner?
I am guessing that VG_MINIMAL_SETJMP is in fact working.
I am guessing that the difference is:
* the outer checks that a capability is (or is not) supported by
trying an instruction.
If the instruction is not supported, a LONGJMP will happen.
On Power 7, that is what happens in the outer, as the outer runs
on the real (hw) cpu.
* When the inner runs, it runs on the CPU simulated by the outer.
Now, if the outer just executes this instruction without checking
the hw cap of the physical cpu (and runs the IR code retranslated
into other instructions), then the inner believes it is on a different
cpu than the hw cpu.
More generally, we might improve on the way to control the virtual
cpu provided by valgrind.
For example, we might imagine having some command line flags
to specify which 'hw features' to simulate.
Of course, when translating to real instructions, only the really
available hw instructions can be generated (which means that it might
not be possible to accept all instructions in the virtual cpu).
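The reasoning above can be made concrete with a toy model. This is only a sketch under stated assumptions: it does not use any Valgrind or VEX code, and every name in it (CAP_ISA_2_06, CAP_ISA_2_07, native_executes, emulated_executes and its parameters) is hypothetical. It models an instruction as the set of capability bits it needs, a real CPU as the set of bits it has, and an emulator as the set of instructions it can translate plus an optional restriction to an advertised feature set.

```c
#include <stdbool.h>

/* Hypothetical capability bits, for this sketch only. */
enum { CAP_ISA_2_06 = 1u << 0, CAP_ISA_2_07 = 1u << 1 };

/* On real hardware an instruction executes only if the CPU has
 * every capability it needs; otherwise it traps. */
bool native_executes(unsigned host_caps, unsigned needed)
{
    return (host_caps & needed) == needed;
}

/* Under an emulator, an instruction executes if the emulator can
 * translate it. If 'enforce' is set, the emulator additionally
 * refuses anything outside the 'advertised' feature set. */
bool emulated_executes(unsigned translatable, unsigned advertised,
                       bool enforce, unsigned needed)
{
    if ((translatable & needed) != needed)
        return false;
    if (enforce && (advertised & needed) != needed)
        return false;
    return true;
}
```

With a host that has only CAP_ISA_2_06, native_executes(host, CAP_ISA_2_07) is false, which is what the outer sees. An emulator that can translate both sets and does not enforce the host's features reports true for the same probe, which is what the inner sees. Setting enforce with the host's capabilities as the advertised set makes the two agree again.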
Philippe