From: Eyal S. <eya...@gm...> - 2022-11-27 21:51:42
OMG it works! That is so much better than the ideas that I had! Somehow I
had not discovered it! So it appears that there is a spec helper for each
architecture: it makes a pass over the asm that was translated to IR and
sometimes replaces it with something more accurate?

The code that I used, along with the new, working output, is below. Thank
you!
Eyal
diff --git a/VEX/priv/guest_amd64_helpers.c b/VEX/priv/guest_amd64_helpers.c
index abd2a1e37..fa8c4266c 100644
--- a/VEX/priv/guest_amd64_helpers.c
+++ b/VEX/priv/guest_amd64_helpers.c
@@ -1684,6 +1684,21 @@ IRExpr* guest_amd64_spechelper ( const HChar* function_name,
mkU64(0)));
}
+ if (isU64(cc_op, AMD64G_CC_OP_LOGICQ) && isU64(cond, AMD64CondS)) {
+ /* long long and/or/xor, then S --> (ULong)result[63] */
+ return binop(Iop_And64,
+ binop(Iop_Shr64, cc_dep1, mkU8(63)),
+ mkU64(1));
+ }
+ if (isU64(cc_op, AMD64G_CC_OP_LOGICQ) && isU64(cond, AMD64CondNS)) {
+ /* long long and/or/xor, then S --> (ULong) ~ result[63] */
+ return binop(Iop_Xor64,
+ binop(Iop_And64,
+ binop(Iop_Shr64, cc_dep1, mkU8(63)),
+ mkU64(1)),
+ mkU64(1));
+ }
+
/*---------------- LOGICL ----------------*/
if (isU64(cc_op, AMD64G_CC_OP_LOGICL) && isU64(cond, AMD64CondZ)) {
And here is the new, working output:

C: ------ IMark(0x28060F, 4, 0) ------
V: t88 = CmpNEZ8(0x0:I8)
V: t89 = 1Sto64(t88)
V: t90 = Shr64(t87,0x3F:I8)
V: t91 = Or64(t90,t89)
V: t92 = t91
V: t93 = t92
C: t38 = Shr64(t20,0x3F:I8)
V: t94 = Or64(0x1:I64,0x0:I64)
V: t95 = Or64(t38,t93)
V: t96 = And64(t95,t94)
V: t97 = Or64(t93,0x0:I64)
V: t98 = And64(t97,t96)
V: t99 = t98
V: t100 = t99
C: t37 = And64(t38,0x1:I64)
V: t101 = Or64(t100,0x0:I64)
V: t102 = t101
C: t36 = Xor64(t37,0x1:I64)
V: t103 = t102
C: t27 = t36
V: t104 = 64to1(t103)
V: t105 = t104
C: t39 = 64to1(t27)
V: t106 = t105
C: t22 = t39
V: t107 = 1Sto64(t106)
V: t108 = ITE(t22,0x0:I64,0x0:I64)
V: t109 = Or64(t108,t107)
V: t110 = t109
C: t40 = ITE(t22,0x2D18F1:I64,0x2D18F5:I64)
V: t111 = t110
C: t21 = t40
V: PUT(1000) = t111
C: PUT(72) = t21
C: PUT(184) = 0x280613:I64
From: Eyal S. <eya...@gm...> - 2022-11-27 18:54:24
Oh man that sounds way easier! So we just recognize this combination of
cmovns after test and instead of doing the CCall, we do it in IR. And then
mc_translate will do its usual work of tracking V bits for regular IR
instead of for the CCall and it'll get that right.
This is great! I'll get on it soon. Thanks for the tip.
Eyal
From: Julian S. <jse...@gm...> - 2022-11-27 18:10:59
The way most of these problems got fixed in the past is for the IR optimiser
to, effectively, inline the amd64g_calculate_condition call for the specific
arg-pair you have (0x9/0x14). That gets rid of the false data dependency
that you (correctly) diagnosed.
> C: t27 =
> amd64g_calculate_condition[mcx=0x13]{0x5814d3e0}(0x9:I64,0x14:I64,t20,0x0:I64,0x0:I64):I64
For that you need to add a folding rule to guest_amd64_spechelper. Here's
the folding rule for S after a 32-bit-LOGIC operation:
if (isU64(cc_op, AMD64G_CC_OP_LOGICL) && isU64(cond, AMD64CondNS)) {
/* long and/or/xor, then S --> (ULong) ~ result[31] */
return binop(Iop_Xor64,
binop(Iop_And64,
binop(Iop_Shr64, cc_dep1, mkU8(31)),
mkU64(1)),
mkU64(1));
}
Make a 64-bit version of that and add it to the LOGICQ section of the file
just above.
J
From: Eyal S. <eya...@gm...> - 2022-11-27 16:12:07
I don't yet have a small test case but I will work on that. It should be
pretty easy to make. I agree that messing with mcx_mask might not be the
right answer. I'm still trying to figure out a better solution, but I am
confident that valgrind is reporting a "Conditional jump or move depends
on uninitialised value" incorrectly. The error is here:
https://github.com/pcb2gcode/pcb2gcode/actions/runs/3541776121/jobs/5946694353
It only appears when compiling with clang 14. The relevant asm looks like
this:

   0x00000000002805fe <+190>: test %r12,%r12
   0x000000000028060f <+207>: cmovns %rax,%rdi

test performs an AND of register %r12 with itself and uses the result to
set the condition flags. test is described here:
https://en.wikipedia.org/wiki/TEST_(x86_instruction)

cmovns uses the sign flag to decide whether register %rax should be copied
over register %rdi or not. The sign flag was set by the previous test
instruction, and the sign depends only on the most-significant bit of the
result of ANDing %r12 with itself.

Here is the IR that test generated:

C: ------ IMark(0x2805FE, 3, 0) ------
C: PUT(144) = 0x14:I64
V: PUT(1080) = t82
C: PUT(152) = t20
V: PUT(1088) = 0x0:I64
C: PUT(160) = 0x0:I64
C: PUT(168) = 0x0:I64

test didn't actually perform the test; it just stored everything that we
need to know in order to perform the test later. In location 144 we have
0x14, which is AMD64G_CC_OP_LOGICQ. That's because test performed a 64-bit
AND, and AND is a quad-word logic operation. In location 152 we have t20,
which is the IR temporary register that stores the result of the AND. t82
holds the validity bits of the result of the AND. In our case, only the
upper half of the 64-bit word is valid, so it has 0x00000000ffffffff,
because 1 means invalid and 0 means valid.
Here is the IR for the cmovns:

C: ------ IMark(0x28060F, 4, 0) ------
V: t83 = CmpwNEZ64(t82)
V: t84 = Or64(t83,0x0:I64)
V: t85 = CmpwNEZ64(0x0:I64)
V: t86 = Or64(t85,t84)
V: t87 = CmpwNEZ64(t86)
V: t88 = t87
C: t27 = amd64g_calculate_condition[mcx=0x13]{0x5814d3e0}(0x9:I64,0x14:I64,t20,0x0:I64,0x0:I64):I64
V: t89 = 64to1(t88)
V: t90 = t89
C: t22 = 64to1(t27)
V: t91 = 1Sto64(t90)
V: t92 = ITE(t22,0x0:I64,0x0:I64)
V: t93 = Or64(t92,t91)
V: t94 = t93
C: t21 = ITE(t22,0x2D18F1:I64,0x2D18F5:I64)
V: PUT(1000) = t94
C: PUT(72) = t21
C: PUT(184) = 0x280613:I64

The C line calling amd64g_calculate_condition performs the deferred test.
We see in the arguments 0x9, which is AMD64CondNS, because our cmovns
looks at the negation of the sign flag. The 0x14 is the LOGICQ that we saw
before, and t20 is the result register. That is enough information for
amd64g_calculate_condition to figure out the sign flag. Note that the
flags are computed lazily, at the time of the cmovns and not at the time
of the test instruction as the CPU would actually have done.

The validity bits are computed in the first six V lines. The mcx mask is
0x13, which means that the validity bits of arguments 0, 1, and 4 are to
be ignored. There are five arguments to amd64g_calculate_condition, so
only t20 and one 0x0 argument are checked for validity. The validity bits
of t20 are in t82. CmpwNEZ64 returns all ones if any bit of t82 is one.
The lower bits of t82 are indeed ones, so t83 is set to all ones. The rest
of the lines do the same for 0x0 and combine the results. The 64to1
condenses this down to a single bit, indicating that the result is not
valid.

valgrind has made a mistake. It assumes that the sign of "a AND a" cannot
be known if any bit of "a" is unknown. This is wrong. What it should
assume is that the sign of "a AND a" cannot be known only if the MSB is
unknown; the rest of the bits don't matter. This is the cause of the false
report from memcheck.
My solution was to change the mcx_mask from one bit per argument to 64
bits per argument, so that we can mask exactly the bits that we care about
when we check validity. The solution works, though it's a little wrong,
and it's described here:
https://github.com/eyal0/valgrind/compare/master...mcx_masks

Here is a similar cmov as before, with the new feature:

C: ------ IMark(0x28060F, 4, 0) ------
V: t83 = And64(0x8000000000000000:I64,t82)
V: t84 = CmpwNEZ64(t83)
V: t85 = Or64(t84,0x0:I64)
V: t86 = And64(0xFFFFFFFFFFFFFFFF:I64,0x0:I64)
V: t87 = CmpwNEZ64(t86)
V: t88 = Or64(t87,t85)
V: t89 = CmpwNEZ64(t88)
V: t90 = t89
C: t27 = amd64g_calculate_condition[mcx=0xFFFFFFFFFFFFFFFF,0xFFFFFFFFFFFFFFFF,0x7FFFFFFFFFFFFFFF,0x0,0xFFFFFFFFFFFFFFFF]{0x5814d680}(0x9:I64,0x14:I64,t20,0x0:I64,0x0:I64):I64
V: t91 = 64to1(t90)
V: t92 = t91
C: t22 = 64to1(t27)
V: t93 = 1Sto64(t92)
V: t94 = ITE(t22,0x0:I64,0x0:I64)
V: t95 = Or64(t94,t93)
V: t96 = t95
C: t21 = ITE(t22,0x2D18F1:I64,0x2D18F5:I64)
V: PUT(1000) = t96
C: PUT(72) = t21
C: PUT(184) = 0x280613:I64

The CmpwNEZ64 and the Or64 are the same as before, but now we also have
the And64. Notice that for t20 we have masked away all but the MSB,
because for the sign flag we only care about the MSB. However, I did get
this wrong, in that I modified the mcx_masks to use (1<<63) whenever the
test is AMD64CondNS. That's true for the case of test, which performs an
AND. But if the flag came from an addition, then the mask is wrong!

The difficulty here is the lazy computation of the flags: we have split
the information needed into two pieces, which operation (AND) and which
flag (sign). I imagine that lazy evaluation was chosen to improve
performance, because we don't want to calculate all the flags for every
add/sub/etc. It would take too long, and often we don't even need the flag
results. So I'd like any solution to maintain the laziness for
performance. It's also nice that not too much memcheck stuff is in the VEX
code.
It would be great for the VEX code to be tool-agnostic. Granted, the VEX
code already has mcx_mask in it, so it's not a perfect division, but let's
see if we can keep it as agnostic as possible. I thought that I'd be able
to compute the mcx_mask entirely at IR generation time as a constant, but
I cannot, because at that point I only know AMD64CondNS (0x9) and not
LOGICQ (0x14).

I think that I will try to replace the mcx_masks with a pointer to a
function. Just as amd64g_calculate_condition is a CCall that is evaluated
from its arguments, so will this new function be. It'll have the same
arguments, but instead of t20 it'll take t82. It can then do whatever
calculation is necessary to return the validity of each flag. For backward
compatibility, I'll keep the mcx_masks around, and by default we'll have a
function that just reproduces the old behavior. As problems like this one
are found, we can migrate away from the mcx_masks to the more-correct way.

Anyway, that's my plan. Sorry that it's such a long email! Let me know if
you have further questions. Perhaps this will be too complex to get
accepted into valgrind? Ah well.

Eyal
From: Julian S. <jse...@gm...> - 2022-11-27 11:50:40
On 27/11/2022 03:09, Eyal Soha wrote:
> It's a long and boring research that I did but the result is that I'm
> going to change the "mcx_mask" from an array of bits (uint) to an array
> of ulong

Hmm, we shouldn't really need to mess with the mcx_mask at all. What is
the test case that shows the problem?

J
From: Eyal S. <eya...@gm...> - 2022-11-27 02:09:20
Okay, I've figured it out. Thanks for your help! It's long and boring
research that I did, but the result is that I'm going to change the
"mcx_mask" from an array of bits (uint) to an array of ulong so that we
can be more precise. This ought to solve the problem. Let me know if you
want more details. I'll get some code ready and send it out for review.
Maybe it'll get accepted?

If you're curious, here are all the other improvements that I have made to
valgrind in my branch. Maybe they could get merged, too?
https://github.com/eyal0/valgrind/commit/4f130b5b5ea6be2c0c93f6359a88f7a2e51e459e
https://github.com/eyal0/valgrind/commit/eccc56d407f2b2f277765bc0dae42cadb86c133a
https://github.com/eyal0/valgrind/commit/c3a5f06e4620599339601640e9a18c383ecb4e2e

Thanks again for your explanation!

Eyal

On Sat, Nov 26, 2022 at 2:36 PM Mark Wielaard <ma...@kl...> wrote:
> Hi Eyal,
>
> On Sat, Nov 26, 2022 at 01:22:51PM -0700, Eyal Soha wrote:
> > I found a false positive in amd64 conditional move. I'm comfortable
> > fixing it if I can just find how the cmov gets translated into IR for
> > memcheck. I've done work on other IR before but I'm having the hardest
> > time just finding where this code is generated!
> >
> > The issue is that the sign flag is depending upon all bits being
> > defined where actually it only needs the highest bit.
> >
> > Where can I find how cmovnz translates to the valid bit checking IR?
> > If there are docs that will help me, I'm happy to read them. And if
> > not, I'll make docs to describe whatever I'm taught.
>
> The amd64 code is transformed into VEX IR in
> VEX/priv/guest_amd64_toIR.c. Look for "cmov not zero" to find
> cmovnz. Which will call dis_cmov_E_G which uses
> mk_amd64g_calculate_condition. The actual IR created is described in
> VEX/pub/libvex_ir.h. The code that instruments this IR for memcheck
> definedness checking is in memcheck/mc_translate.c.
>
> README_DEVELOPERS has some hints at the end about printing out
> problematic blocks. valgrind can print out various stages of the IR,
> which is really helpful for tracking down where each transformation
> occurs.
>
> Cheers,
>
> Mark