From: Stephen M. <sm...@CS...> - 2009-03-11 20:10:35
VEX uses a somewhat complex lazy translation scheme in order to accurately reflect the semantics of x86/amd64 condition codes ([ER]FLAGS) without too much of a runtime penalty. This mechanism was designed with Memcheck in mind, inasmuch as it separates the parts of the lazy EFLAGS state that should be relevant to V-bit calculation from those that shouldn't. However, when I was looking over this code in connection with another project, I noticed that it appears to depart slightly from the way Memcheck's V-bit propagation rules are usually defined.

In most cases, Memcheck's V-bit rules are designed to avoid false negatives: Memcheck might set more bits undefined than necessary, but not fewer. The treatment of the flags departs from this philosophy a bit, though, with respect to the inc, dec, rol, and ror instructions, which are unusual in that they set some but not all of the flags. (inc and dec don't affect CF, and rol and ror affect OF and CF only.) VEX's translation saves the old flag bits in order to get the right final ones, but it does so in a way that prevents V bits from propagating through these values.

The comments justify this by saying it is "inconceivable" that a compiler would generate code that kept an undefined CF live across an inc/dec. I think that claim overstates things a bit. Though I've certainly never seen GCC-generated code reuse partial flags values, I think this would be a fairly natural thing to do for hand-written assembly, or for a fancier x86 compiler that modeled each flag bit separately. (Intel's manual explains that inc has this behavior for use with loop counters.) But as far as I know this decision hasn't affected Memcheck's practical bug-finding utility one way or the other.

The real reason I care is that my Flowcheck tool reuses Memcheck's V-bit handling code for a different purpose, tracking bits that contain secret information.
In that context, conservative V-bit propagation is more important, since a Memcheck false negative corresponds to a way of "laundering" secret data. (Drewry and Ormandy's Flayer tool, which also reuses Memcheck for a security-related application, might also care, though I think it might matter less there.) So I need to change this for my own purposes, and I thought I'd ask whether it makes sense to make the change for Memcheck too. In my smoke testing so far, I haven't seen the change generate any false positives, and if any do exist I think they could be fixed or worked around by normal mechanisms like increasing precision elsewhere or adding suppressions.

I've been asked in the past not to send unsolicited VEX patches here, so I'll describe what I changed at a high level, and I can send a patch to anyone who's interested. Basically, I changed the translation of inc, dec, rol, and ror to pass the old flags in CC_DEP2 rather than CC_NDEP. I've also tried And32-ing the saved flags with a mask representing which flags are actually used, though now that I think about it this may never help, since I don't know that the translations can ever produce partially-defined EFLAGS values. Some macros in ghelpers.c and the translation code in toIR.c need changing, as do a bunch of comments, but no code, in gdefs.h. (I also noticed an independent typo in gdefs.h: the comment about rol/ror mentions the "C" flag twice in an apparent contradiction; the second occurrence should be "S" instead.)

Another subtle point that might benefit from more comments is whether, when a word is used to store a single flag value, the flag is stored in the least-significant bit or in the bit corresponding to its location in EFLAGS. I think it's always the latter, but the code for adc/sbb only works because CF happens to be located in the least-significant position.

I've made the change just for x86 so far, but I think the change for amd64 would be completely analogous.

-- Stephen
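
For readers who want to see the difference between the two placements concretely, here is a toy model in C. It is not VEX or Memcheck code: the field names only echo CC_DEP1, CC_DEP2, and CC_NDEP, every type and function (Shadowed, ToyThunk, toy_flags_vbits, and so on) is hypothetical, and the definedness rule is an assumed simplification in which the flags count as undefined whenever any bit of the two "dep" fields is undefined, while the "ndep" field is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only; none of these names exist in VEX or Memcheck. */
typedef struct {
    uint32_t val;    /* the value itself */
    uint32_t vbits;  /* shadow bits: 1 = undefined, 0 = defined */
} Shadowed;

typedef struct {
    Shadowed dep1;   /* stands in for CC_DEP1 */
    Shadowed dep2;   /* stands in for CC_DEP2 */
    Shadowed ndep;   /* stands in for CC_NDEP: ignored for definedness */
} ToyThunk;

#define TOY_CF_MASK 0x001u  /* CF is bit 0 of EFLAGS */
#define TOY_OF_MASK 0x800u  /* OF is bit 11 of EFLAGS */

/* Assumed rule: flags are wholly undefined if any dep1/dep2 bit is. */
static uint32_t toy_flags_vbits(const ToyThunk *t) {
    return (t->dep1.vbits | t->dep2.vbits) ? 0xFFFFFFFFu : 0u;
}

/* Old scheme for inc/dec: saved flags go where V bits are ignored. */
static ToyThunk toy_inc_ndep(Shadowed result, Shadowed old_flags) {
    ToyThunk t = { result, { 0u, 0u }, old_flags };
    return t;
}

/* Changed scheme: saved flags go where V bits are tracked. */
static ToyThunk toy_inc_dep2(Shadowed result, Shadowed old_flags) {
    ToyThunk t = { result, old_flags, { 0u, 0u } };
    return t;
}

/* AND with a constant mask: cleared positions become defined zeros. */
static Shadowed toy_and_const(Shadowed s, uint32_t mask) {
    Shadowed r = { s.val & mask, s.vbits & mask };
    return r;
}

/* Self-check contrasting the two placements. */
static void toy_demo(void) {
    Shadowed result  = { 5u, 0u };           /* fully defined inc result */
    Shadowed undef_c = { 0u, TOY_CF_MASK };  /* old CF is undefined */
    Shadowed undef_o = { 0u, TOY_OF_MASK };  /* only old OF is undefined */

    ToyThunk a = toy_inc_ndep(result, undef_c);
    ToyThunk b = toy_inc_dep2(result, undef_c);
    ToyThunk c = toy_inc_dep2(result, toy_and_const(undef_o, TOY_CF_MASK));

    assert(toy_flags_vbits(&a) == 0u);           /* undefined CF is lost */
    assert(toy_flags_vbits(&b) == 0xFFFFFFFFu);  /* undefined CF is kept */
    assert(toy_flags_vbits(&c) == 0u);           /* mask drops unused OF */
}
```

The third case shows what the And32 mask would buy: it only matters if the saved flags can be partially defined, which, as noted above, may never actually happen in the real translations.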