From: Josef W. <Jos...@gm...> - 2012-07-23 22:20:28
Hi,

[I should have sent this email to the mailing list as well; doing so now...]

On 23.07.2012 23:43, Philippe Waroquiers wrote:
> On Mon, 2012-07-23 at 23:35 +0200, Josef Weidendorfer wrote:
>> Hi Philippe,
>>
>> not sure this is a known shortcoming. I just chased down
>> bug 303963, and tried to use vgdb to find the difference
>> between behaviour of real hardware and VEX (for the
>> over-complex PCMPxSTRx instruction).
>>
>> However, comparing flags after pcmpistri in vgdb single-stepping
>> was not useful, as flags obviously were not
>> changed. VEX does condition calculation lazily. Perhaps
>> this should be done every time vgdb is asked for the
>> flag register?
>
> Hello Josef,
>
> When GDB asks to examine a memory location or a register, GDB sends
> a "read" packet to the Valgrind gdbserver, which then
> gets the desired value from either memory or from the thread state
> guest register.
> So, if the register state in the thread is not up to date,
> then Valgrind gdbsrv will not return the correct value to GDB.
>
> If the IR code is not computing the flags, it looks not possible
> to have Valgrind gdbsrv computing it on demand
> (at least, I do not see how to do it, or what VEX call to do to
> compute these flags).

I assume for amd64, that would be amd64g_calculate_rflags_all(...)
(in VEX/priv/guest_amd64_helpers.c). However I am not sure if the VEX
registers to use as parameters are fixed (if so, there would be no need
to pass them all the time from generated code?).

> So, we would need a way to force VEX to always compute these flags.
> Maybe --vex-iropt-precise-memory-exns=yes can help for that ?

That did not help, but "--vex-guest-max-insns=1" actually does!
Thanks for helping me to think in the right direction.

Should we add this hint to the documentation somewhere?

Josef

>
> Philippe

From: Philippe W. <phi...@sk...> - 2012-07-23 23:07:55
On Tue, 2012-07-24 at 00:51 +0200, Philippe Waroquiers wrote:
> On Tue, 2012-07-24 at 00:20 +0200, Josef Weidendorfer wrote:
>
> > I assume for amd64, that would be amd64g_calculate_rflags_all(...)
> > (in VEX/priv/guest_amd64_helpers.c). However I am not sure if the VEX
> > registers to use as parameters are fixed (if so, there would be no need
> > to pass them all the time from generated code?).
> Yes, I suppose that cc_op, cc_dep1, cc_dep2, cc_ndep
> have all to be computed by the generated code, depending on what
> instruction has just been executed.
>
> If I understand correctly, Valgrind generated code will compute the
> flags for an instruction only if a following instruction in the
> same block is reading them ?
Looking in valgrind-low-amd64.c:188, I see that V gdbsrv retrieves
the flags to send to GDB using:
    rflags = LibVEX_GuestAMD64_get_rflags (amd64);
which itself calls:
    ULong rflags = amd64g_calculate_rflags_all_WRK(
        vex_state->guest_CC_OP,
        vex_state->guest_CC_DEP1,
        vex_state->guest_CC_DEP2,
        vex_state->guest_CC_NDEP
    );
So, if the guest state is up to date, the flags sent to GDB should also
be correct.
Not clear to me when the guest_CC_* will be up to date.
Assuming these must/will be correct at the end of a block,
--vex-guest-max-insns=1 will then ensure they are always up to date.
Philippe
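
[Editor's illustration] The lazy flag scheme discussed in this message can be sketched with a toy model. This is only a sketch under stated assumptions: every name below (`ToyGuestState`, `toy_calculate_flags`, `toy_record_sub32`, the `TOY_*` constants) is hypothetical and none of it is VEX code. It models only the idea that the guest state holds a description of the last flag-setting operation, and that the flags are materialised from that description when somebody asks.

```c
#include <stdint.h>

/* Hypothetical toy model, not VEX code: the guest state stores a "thunk"
   describing the last flag-setting operation instead of the flags. */
typedef enum { TOY_CC_OP_COPY, TOY_CC_OP_ADD32, TOY_CC_OP_SUB32 } ToyCcOp;

typedef struct {
    ToyCcOp  cc_op;    /* which operation last set the flags */
    uint32_t cc_dep1;  /* first operand (the literal flags for COPY) */
    uint32_t cc_dep2;  /* second operand */
} ToyGuestState;

/* Toy flag bits: carry and zero only. */
enum { TOY_FLAG_C = 1u << 0, TOY_FLAG_Z = 1u << 6 };

/* Materialise the flags from the thunk, on demand. */
uint32_t toy_calculate_flags(const ToyGuestState *st)
{
    uint32_t res, flags = 0;
    switch (st->cc_op) {
    case TOY_CC_OP_COPY:
        return st->cc_dep1;
    case TOY_CC_OP_ADD32:
        res = st->cc_dep1 + st->cc_dep2;
        if (res < st->cc_dep1) flags |= TOY_FLAG_C;          /* wrapped */
        break;
    case TOY_CC_OP_SUB32:
        res = st->cc_dep1 - st->cc_dep2;
        if (st->cc_dep1 < st->cc_dep2) flags |= TOY_FLAG_C;  /* borrow */
        break;
    default:
        return 0;
    }
    if (res == 0) flags |= TOY_FLAG_Z;
    return flags;
}

/* What a translated 32-bit subtract would store, provided the optimiser
   keeps its thunk update. */
void toy_record_sub32(ToyGuestState *st, uint32_t a, uint32_t b)
{
    st->cc_op   = TOY_CC_OP_SUB32;
    st->cc_dep1 = a;
    st->cc_dep2 = b;
}
```

In this model, a reader such as a debugger stub gets correct flags only if the thunk fields were actually written. If the stores done by `toy_record_sub32` are optimised away in the middle of a block, `toy_calculate_flags` returns the flags of whatever operation last wrote the thunk, which matches the stale values observed while single-stepping.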
From: Josef W. <Jos...@gm...> - 2012-07-24 08:36:48
On 24.07.2012 01:07, Philippe Waroquiers wrote:
> On Tue, 2012-07-24 at 00:51 +0200, Philippe Waroquiers wrote:
>> On Tue, 2012-07-24 at 00:20 +0200, Josef Weidendorfer wrote:
>>
>>> I assume for amd64, that would be amd64g_calculate_rflags_all(...)
>>> (in VEX/priv/guest_amd64_helpers.c). However I am not sure if the VEX
>>> registers to use as parameters are fixed (if so, there would be no need
>>> to pass them all the time from generated code?).
>> Yes, I suppose that cc_op, cc_dep1, cc_dep2, cc_ndep
>> have all to be computed by the generated code, depending on what
>> instruction has just been executed.
>>
>> If I understand correctly, Valgrind generated code will compute the
>> flags for an instruction only if a following instruction in the
>> same block is reading them ?
> Looking in valgrind-low-amd64.c:188, I see that V gdbsrv retrieves
> the flags to send to GDB using:
>     rflags = LibVEX_GuestAMD64_get_rflags (amd64);
>
> which itself calls:
>     ULong rflags = amd64g_calculate_rflags_all_WRK(
>         vex_state->guest_CC_OP,
>         vex_state->guest_CC_DEP1,
>         vex_state->guest_CC_DEP2,
>         vex_state->guest_CC_NDEP
>     );
>
> So, if the guest state is up to date, the flags sent to GDB should also
> be correct.

That means that the guest state is not up to date when vgdb is called
within a block. I wonder why this works at all for the other
registers.

Josef

> Not clear to me when the guest_CC_* will be up to date.
> Assuming these must/will be correct at the end of a block,
> --vex-guest-max-insns=1 will then ensure they are always up to date.
>
> Philippe

From: Philippe W. <phi...@sk...> - 2012-07-23 22:51:52
On Tue, 2012-07-24 at 00:20 +0200, Josef Weidendorfer wrote:
> I assume for amd64, that would be amd64g_calculate_rflags_all(...)
> (in VEX/priv/guest_amd64_helpers.c). However I am not sure if the VEX
> registers to use as parameters are fixed (if so, there would be no need
> to pass them all the time from generated code?).
Yes, I suppose that cc_op, cc_dep1, cc_dep2, cc_ndep
have all to be computed by the generated code, depending on what
instruction has just been executed.

If I understand correctly, Valgrind generated code will compute the
flags for an instruction only if a following instruction in the
same block is reading them ?

>
> > So, we would need a way to force VEX to always compute these flags.
> > Maybe --vex-iropt-precise-memory-exns=yes can help for that ?
>
> That did not help, but "--vex-guest-max-insns=1" actually does!
> Thanks for helping me to think in the right direction.
>
> Should we add this hint to the documentation somewhere?
Yes, it looks like a good idea to document that in the gdbserver limitations.

Are both --vex-guest-max-insns=1 and --vex-iropt-precise-memory-exns=yes
needed for an as best as possible equivalence between Valgrind synthetic
cpu and real hardware ?

Or is --vex-guest-max-insns=1 also implying the effect of
--vex-iropt-precise-memory-exns=yes ?

An alternative might be to automatically set these values
to 1 and yes whenever --vgdb=full is given ?

Philippe

From: Julian S. <js...@ac...> - 2012-07-25 08:41:30
> Are both --vex-guest-max-insns=1 and --vex-iropt-precise-memory-exns=yes
> needed for an as best as possible equivalence between Valgrind synthetic
> cpu and real hardware ?
>
> Or is --vex-guest-max-insns=1 also implying the effect of
> --vex-iropt-precise-memory-exns=yes ?

Normally, iropt aggressively optimises the IR: it does redundant Put removal
and Put-to-Get forwarding in this kind of situation (example that ignores
flags)

   %eax += 42
   %eax *= 17

Initially each insn is translated in isolation, giving:

   IMark for the first insn
   t1 = GET(offset-eax)
   t2 = Add32( t1, 42 )
   PUT(offset-eax) = t2

   IMark for the second insn
   t3 = GET(offset-eax)
   t4 = Mul32( t3, 17 )
   PUT(offset-eax) = t4

iropt first does Put-to-Get forwarding, giving this (effectively):

   IMark for the first insn
   t1 = GET(offset-eax)
   t2 = Add32( t1, 42 )
   PUT(offset-eax) = t2

   IMark for the second insn
   t4 = Mul32( t2, 17 )
   PUT(offset-eax) = t4

and then it does redundant Put removal:

   IMark for the first insn
   t1 = GET(offset-eax)
   t2 = Add32( t1, 42 )

   IMark for the second insn
   t4 = Mul32( t2, 17 )
   PUT(offset-eax) = t4

Like this, the value in (guest) %eax can be cached in (host) register(s)
for this pair of guest insns, and the intermediate Put and Get are gone.

When applied to longer sequences of guest insns, these transformations
are very effective at causing guest registers to be cached in host
registers, but they do have the side effect of making the guest register
state mostly not up to date, except at the end of basic blocks.

There are some exceptions. By default, redundant Put removal is
restricted so that the guest PC, SP and frame pointer (if any) are up to
date before any memory accesses (by the guest), so that Memcheck's
helper functions can always take a stack trace if they need to.

--vex-iropt-precise-memory-exns=yes restricts redundant Put removal so
that _all_ guest register values are up to date at guest memory
accesses. This normally is not necessary, but can be important if the
guest is playing nasty games like looking at register contents in
sigcontexts in SEGV signal handlers.

--vex-guest-max-insns=1 "solves" the problem by making each instruction
into its own IRSB, hence making these transformations impossible, and so
the guest state is always up to date. Unfortunately this obviously also
defeats all other transformations (constant folding, dead code removal,
CSE, etc) that have a big effect when multiple insns are translated into
the same IRSB.

Hence my suggestion (re other message) that we have a new command line
flag --precise-register-values, that disables redundant Put removal
completely but has no other effect on iropt. I think this should get you
what you want, but without a big performance overhead.

J
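
[Editor's illustration] The effect of redundant Put removal, and of limiting a block to one instruction, can be sketched on a toy statement list. This is a minimal sketch, assuming a made-up IR: the types and functions below (`ToyStmt`, `toy_remove_redundant_puts`, `toy_optimise`) are hypothetical, are not VEX's data structures or passes, and ignore the PC/SP/memory-access exceptions described above.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical toy IR, far simpler than VEX's. */
typedef enum { TOY_IMARK, TOY_GET, TOY_PUT, TOY_OTHER } ToyKind;

typedef struct {
    ToyKind kind;
    int     offset;  /* guest-state offset for GET/PUT, unused otherwise */
    bool    dead;    /* set by the pass when a PUT is removed */
} ToyStmt;

enum { TOY_NOFF = 16 };  /* offsets must lie in [0, TOY_NOFF) */

/* Walk one block backwards. A PUT to an offset that a later PUT in the
   same block overwrites, with no GET of that offset in between, is
   marked dead. Returns the number of PUTs removed. */
size_t toy_remove_redundant_puts(ToyStmt *stmts, size_t n)
{
    bool overwritten_later[TOY_NOFF] = { false };
    size_t removed = 0;
    for (size_t i = n; i-- > 0; ) {
        ToyStmt *s = &stmts[i];
        if (s->kind == TOY_GET) {
            overwritten_later[s->offset] = false;  /* value is observed */
        } else if (s->kind == TOY_PUT) {
            if (overwritten_later[s->offset]) {
                s->dead = true;
                removed++;
            }
            overwritten_later[s->offset] = true;
        }
    }
    return removed;
}

/* Split the statement list into blocks of at most max_insns guest
   instructions (each starting at an IMark) and optimise every block
   separately. */
size_t toy_optimise(ToyStmt *stmts, size_t n, size_t max_insns)
{
    size_t removed = 0, start = 0, insns = 0;
    for (size_t i = 0; i <= n; i++) {
        bool at_imark = (i < n && stmts[i].kind == TOY_IMARK);
        if (i == n || (at_imark && insns == max_insns)) {
            removed += toy_remove_redundant_puts(stmts + start, i - start);
            start = i;
            insns = 0;
        }
        if (at_imark) insns++;
    }
    return removed;
}
```

Fed the two-instruction example after forwarding (IMark, GET, Add, PUT, IMark, Mul, PUT, all on the same offset), `toy_optimise` with a large `max_insns` marks the first PUT dead, so the toy guest register is stale between the two instructions. With `max_insns` set to 1 the two PUTs never share a block and both survive, at the price of the pass never seeing more than one instruction at a time.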