Here is a WIP patch set to fix up memcheck such that guards on dirty helpers are observed properly. The patches have small granularity for ease of review and need to be applied in order. I've regtested on x86-64 with no new regressions. patch-1 is non-functional and just points out the various places that need fixing. It's purely informational and adds a bunch of FIXMEs that will be removed in patch-[234]. patch-2 adds an additional parameter (guard expression) to complainIfUndefined and changes the function such that the complaint is only issued if the condition is true. patch-3 adds an additional parameter (guard expression) to do_shadow_PUT and changes the function to observe the guard expression. If the guard expression is not satisfied, we simply put the old value stored in the guest state (which will be thrown out later in IR optimisation). patch-4 fixes do_shadow_Dirty and adds convenience function expr2vbits_guarded_Load. patch-5 modifies the amd64 insn selector which would otherwise assert. I have not yet investigated what needs to be done to do_origins_Dirty / do_origins_Store. Pointers welcome. Florian |
|
From: Julian S. <js...@ac...> - 2012-07-10 15:47:47
|
Thanks for the patches. Some (inadequate, I fear) review:
> The patches have small granularity for ease of review and need to be
> applied in order. I've regtested on x86-64 with no new regressions.
It'd be reassuring to test some completely deterministic app on x86-64
(eg, perf/bz2.c compiled -O2) to check that the quantity of JIT generated
code is unchanged.
> patch-1 is non-functional and just points out the various places
> that need fixing.
Looks plausible. I didn't verify that that's all the required places :-(
> patch-2 adds an additional parameter (guard expression) to
> complainIfUndefined and changes the function such that the complaint
> is only issued if the condition is true.
Looks plausible. Couple of things:
(trivial)
-
/*------------------------------------------------------------*/
pls can you leave the blank line there? I like 2 blank lines
before section breaks (more artistically pleasing :)
+ /* If the complaint is to be issued under a guard condition, AND that
+ guard condition. */
+ if (guard && ! isTRUE(di->guard)) {
(the following also applies to patches 3 and 4, I think)
I get a bit concerned on seeing constructions along the lines of
if (ir expression has jit-time-evaluatable-some-specific-value) {
// add more IR
}
because then we have to prove that that whatever the extra IR does
will also happen even in the case where the expression did have the
particular value (in this case, 1::Ity_I1) but that could not be known
at JIT time. (eg, if the expression is CmpEQ32(e1,e2))
In this case I think it's safe because you're skipping the And1 (effectively)
in the case where one arg is visibly 1::Ity_I1, hence is harmless.
That said, I would much prefer that you removed the " && ! isTRUE(di->guard)",
hence always constructing the And against di->guard if it is non-NULL, and
rely in ir_opt to tidy up afterwards if di->guard is a constant. All
of the rest of this file creates naive instrumentation IR and relies on
iropt to clean up later. eg there is no special casing of Add32(x,y) for
one of the args being constants, hence always defined.
> patch-3 adds an additional parameter (guard expression) to
> do_shadow_PUT and changes the function to observe the guard expression.
> If the guard expression is not satisfied, we simply put the old value
> stored in the guest state (which will be thrown out later in IR
> optimisation).
+ We assume here, that the validity of GUARD has already been checked.
s/validity/definedness
+ if (guard && ! isTRUE(guard)) {
comments as above
If the guard expression is not satisfied, we simply put the old value
stored in the guest state (which will be thrown out later in IR
optimisation).
I don't think that iropt can remove "PUT(offs) = GET(offs)". Not 100%
sure it can't though. Doesn't affect correctness.
> patch-4 fixes do_shadow_Dirty and adds convenience function
> expr2vbits_guarded_Load.
Presumably there is some way in all this that MC will complain if the
guard expression is undefined. Maybe the existing code already does that?
> patch-5 modifies the amd64 insn selector which would otherwise assert.
+ addInstr(env, AMD64Instr_Test64(0xFFFFFFFF,reg));
Shouldn't that be 0x1 ? IIUC, as you have it, rflags.NZ will get set to 1
if any of the bits in reg are nonzero.
> I have not yet investigated what needs to be done to do_origins_Dirty /
> do_origins_Store. Pointers welcome.
Some useful things to know re origins are:
* they are always 32 bit ints, regardless of the type being shadowed
* the only operation used to combine such ints ("origin tags" / otags)
is Max32U
* because of that, a safe I-know-nothing value for an otag is zero
J
|
|
From: Florian K. <br...@ac...> - 2012-07-13 03:23:43
|
Thanks for looking over the patches..
On 07/10/2012 11:45 AM, Julian Seward wrote:
>
> It'd be reassuring to test some completely deterministic app on x86-64
> (eg, perf/bz2.c compiled -O2) to check that the quantity of JIT generated
> code is unchanged.
>
I did some measurements with a modified patch that incorporates all your
suggested changes. Runtime is unchanged.
Instruction measurements show no significant detriment:
trunk unmodified
bz2 transtab: new 3,698 (99,519 -> 1,566,490; ratio 157:10)
fbench transtab: new 2,336 (55,102 -> 877,541; ratio 159:10)
ffbench transtab: new 2,115 (51,832 -> 828,083; ratio 159:10)
heap transtab: new 1,846 (40,908 -> 663,257; ratio 162:10)
tinycc transtab: new 4,662 (119,656 -> 1,926,502; ratio 161:10)
trunk with patches
bz2 transtab: new 3,698 (99,519 -> 1,566,490; ratio 157:10)
fbench transtab: new 2,336 (55,102 -> 877,534; ratio 159:10)
ffbench transtab: new 2,115 (51,832 -> 828,090; ratio 159:10)
heap transtab: new 1,846 (40,903 -> 663,264; ratio 162:10)
tinycc transtab: new 4,662 (119,646 -> 1,926,509; ratio 161:10)
Virtually identical insn counts.
> Looks plausible. Couple of things:
>
> (trivial)
> -
> /*------------------------------------------------------------*/
>
> pls can you leave the blank line there? I like 2 blank lines
> before section breaks (more artistically pleasing :)
Yes, sure. This was unintentional. I share your artistic preference.
> + /* If the complaint is to be issued under a guard condition, AND that
> + guard condition. */
> + if (guard && ! isTRUE(di->guard)) {
>
...snip ...
> That said, I would much prefer that you removed the " && ! isTRUE(di->guard)",
> hence always constructing the And against di->guard if it is non-NULL, and
> rely in ir_opt to tidy up afterwards if di->guard is a constant.
Ok. No problem. Above insn numbers show that your confidence in the
optimisers is warranted.
> + We assume here, that the validity of GUARD has already been checked.
> s/validity/definedness
Yep.
> I don't think that iropt can remove "PUT(offs) = GET(offs)". Not 100%
> sure it can't though. Doesn't affect correctness.
Throwing those out should be allowed for those guests offsets that do
not require precise memory exceptions. I did not check whether
redundant_put_removal actually throws them out.
> Presumably there is some way in all this that MC will complain if the
> guard expression is undefined. Maybe the existing code already does that?
Yes, the existing code already checks the definedness of the guard
expression.
>> patch-5 modifies the amd64 insn selector which would otherwise assert.
>
> + addInstr(env, AMD64Instr_Test64(0xFFFFFFFF,reg));
>
> Shouldn't that be 0x1 ? IIUC, as you have it, rflags.NZ will get set to 1
> if any of the bits in reg are nonzero.
I wasn't sure about this one. I take your word and change it.
I have an updated patch which I will post when s390 passes. There is
currently one testcase that is failing and I need to find out why. Could
be some latent bug that is exposed by these changes.
Florian
|
|
From: Julian S. <js...@ac...> - 2012-07-13 09:38:49
|
On Friday, July 13, 2012, Florian Krohm wrote: > I did some measurements with a modified patch that incorporates all your > suggested changes. Runtime is unchanged. > Instruction measurements show no significant detriment: > > trunk unmodified > bz2 transtab: new 3,698 (99,519 -> 1,566,490; ratio 157:10) > fbench transtab: new 2,336 (55,102 -> 877,541; ratio 159:10) > ffbench transtab: new 2,115 (51,832 -> 828,083; ratio 159:10) > heap transtab: new 1,846 (40,908 -> 663,257; ratio 162:10) > tinycc transtab: new 4,662 (119,656 -> 1,926,502; ratio 161:10) > > trunk with patches > bz2 transtab: new 3,698 (99,519 -> 1,566,490; ratio 157:10) > fbench transtab: new 2,336 (55,102 -> 877,534; ratio 159:10) > ffbench transtab: new 2,115 (51,832 -> 828,090; ratio 159:10) > heap transtab: new 1,846 (40,903 -> 663,264; ratio 162:10) > tinycc transtab: new 4,662 (119,646 -> 1,926,509; ratio 161:10) > > Virtually identical insn counts. Thanks for the numbers. I am surprised they are different at all considering that IIUC, the instrumentation IR you are generating on all non-s390 platforms should be unchanged. > > I don't think that iropt can remove "PUT(offs) = GET(offs)". Not 100% > > sure it can't though. Doesn't affect correctness. > > Throwing those out should be allowed for those guests offsets that do > not require precise memory exceptions. I did not check whether > redundant_put_removal actually throws them out. redundant_put_removal does a different transformation. It does PUT(offs) = e1 # no intervening GET of offs PUT(offs) = e2 ==> # no intervening GET of offs PUT(offs) = e2 which is a very common idiom in the initial IR, when the independent translations of each instruction are concatenated (as they are). So it won't remove the PUT(offs) = GET(offs) as that doesn't fit this pattern. Maybe doesn't matter; I imagine this will happen only very rarely. > I have an updated patch which I will post when s390 passes. There is > currently one testcase that is failing and I need to find out why. Could > be some latent bug that is exposed by these changes. Good. J |
|
From: Florian K. <br...@ac...> - 2012-07-16 22:21:49
|
Hi,
attached are three patches.
guarded-dirty-patch-1 combines the patches I had sent previously and
addresses these comments:
>> Looks plausible. Couple of things:
>>
>> (trivial)
>> -
>> /*------------------------------------------------------------*/
>>
>> pls can you leave the blank line there? I like 2 blank lines
>> before section breaks (more artistically pleasing :)
>
>> + /* If the complaint is to be issued under a guard condition, AND that
>> + guard condition. */
>> + if (guard && ! isTRUE(di->guard)) {
>>
>> That said, I would much prefer that you removed the " && ! isTRUE(di->guard)",
>> hence always constructing the And against di->guard if it is non-NULL, and
>> rely in ir_opt to tidy up afterwards if di->guard is a constant.
>
>> + We assume here, that the validity of GUARD has already been checked.
>> s/validity/definedness
>
vex-patch is an updated version of the changes to the amd64 insn
selector as suggested in this comment:
>>
>> + addInstr(env, AMD64Instr_Test64(0xFFFFFFFF,reg));
>>
>> Shouldn't that be 0x1 ? IIUC, as you have it, rflags.NZ will get set to 1
>> if any of the bits in reg are nonzero.
>
Finally, there is guarded-dirty-patch-2 which fixes up do_origins_Dirty
to observe guards.
> I have an updated patch which I will post when s390 passes. There is
> currently one testcase that is failing and I need to find out why. Could
> be some latent bug that is exposed by these changes.
This bug was completely unrelated to my patch set (and fixed in r12749).
It just surfaced in this context...
I've reg-tested the patches on x86-64, ppc64, and s390x with no new
regressions.
Florian
|
|
From: Julian S. <js...@ac...> - 2012-07-18 22:46:30
|
> attached are three patches [...] These look OK to me. > I've reg-tested the patches on x86-64, ppc64, and s390x with no new > regressions. Thanks for the extensive testing. Sorry for slow review. J |