From: <sv...@va...> - 2009-09-18 23:03:48
Author: sewardj
Date: 2009-09-19 00:03:38 +0100 (Sat, 19 Sep 2009)
New Revision: 10892
Log:
Comment-only change, describing the changes in r10891 ("Memcheck-side
handling for new primops [...] to do with tracking definedness flows
exactly in sse2-based strlen routines that are based on pmovmskb.")
Modified:
branches/ICC111/memcheck/mc_translate.c
Modified: branches/ICC111/memcheck/mc_translate.c
===================================================================
--- branches/ICC111/memcheck/mc_translate.c 2009-09-18 17:16:31 UTC (rev 10891)
+++ branches/ICC111/memcheck/mc_translate.c 2009-09-18 23:03:38 UTC (rev 10892)
@@ -120,6 +120,7 @@
static IRType shadowTypeV ( IRType ty );
static IRExpr* expr2vbits ( struct _MCEnv* mce, IRExpr* e );
static IRTemp findShadowTmpB ( struct _MCEnv* mce, IRTemp orig );
+static void schemeS ( struct _MCEnv* mce, IRStmt* st );
/*------------------------------------------------------------*/
@@ -177,9 +178,10 @@
instrumentation process. */
XArray* /* of TempMapEnt */ tmpMap;
- /* MODIFIED: indicates whether "bogus" literals have so far been
- found. Starts off False, and may change to True. */
- Bool bogusLiterals;
+ /* READONLY: indicates whether we should use the "expensive" or
+ "normal" V-bit tracking schemes. See big comment just before
+ stRequiresExpensive(). */
+ Bool expensiveV;
/* READONLY: the guest layout. This indicates which parts of
the guest state should be regarded as 'always defined'. */
@@ -1632,6 +1634,42 @@
}
+/* A better interpretation for Ctz# (count trailing zeroes).
+
+ Previously, we just pessimised this, viz:
+
+ Ctz#(x, x#) = PCast(x#).
+
+ However, with the advent of SSE2 strlen routines based on
+ pmovmskb(GetMSBs8x8) followed by bsf(Ctz32), we need to do better.
+ Specifically we are asked to count the trailing zeroes in a word
+ which has a rightmost defined 1 but undefined bits to the left of
+ it; that's ok.
+
+ The fix is to compute an improvement mask, which has a 0 for all
+ bit positions to the left of the rightmost 1 in the word, and AND
+ this into x# before the PCast. Since 0 denotes "defined", this
+ causes us to ignore the undefinedness of any bits to the left of
+ the rightmost 1 bit. Of course if the rightmost 1 bit or any of
+ the zeroes to its right are undefined then the result as a whole is
+ still undefined.
+
+ Hence: Ctz#(x, x#) = let improver = x ^ (x-1)
+ improved = x# & improver
+ in PCast(improved)
+
+ eg: x = UUUUU10000, x# = 1111100000
+ x-1 = UUUUU01111
+ x ^ (x-1) = 0000011111 (improver)
+ x# & improver = 1111100000 & 0000011111 = 0000000000 (improved)
+ PCast(improved) = 0000000000 ("all defined")
+
+ x = UUUUU10000, x# = 1111110000 (rightmost 1 is undef)
+ x-1 = UUUUU01111
+ x ^ (x-1) = 0000011111 (improver)
+ x# & improver = 1111110000 & 0000011111 = 0000010000 (improved)
+ PCast(improved) = 1111111111 ("all undefined")
+*/
static
IRAtom* expensiveCountTrailingZeroes32 ( MCEnv* mce,
IRAtom* aa, IRAtom* aav )
@@ -2510,13 +2548,13 @@
return mkLazy2(mce, Ity_I64, vatom1, vatom2);
case Iop_Add32:
- if (mce->bogusLiterals)
+ if (mce->expensiveV)
return expensiveAddSub(mce,True,Ity_I32,
vatom1,vatom2, atom1,atom2);
else
goto cheap_AddSub32;
case Iop_Sub32:
- if (mce->bogusLiterals)
+ if (mce->expensiveV)
return expensiveAddSub(mce,False,Ity_I32,
vatom1,vatom2, atom1,atom2);
else
@@ -2533,13 +2571,13 @@
return doCmpORD(mce, op, vatom1,vatom2, atom1,atom2);
case Iop_Add64:
- if (mce->bogusLiterals)
+ if (mce->expensiveV)
return expensiveAddSub(mce,True,Ity_I64,
vatom1,vatom2, atom1,atom2);
else
goto cheap_AddSub64;
case Iop_Sub64:
- if (mce->bogusLiterals)
+ if (mce->expensiveV)
return expensiveAddSub(mce,False,Ity_I64,
vatom1,vatom2, atom1,atom2);
else
@@ -2562,7 +2600,7 @@
case Iop_CmpEQ64:
case Iop_CmpNE64:
- if (mce->bogusLiterals)
+ if (mce->expensiveV)
goto expensive_cmp64;
else
goto cheap_cmp64;
@@ -2579,7 +2617,7 @@
case Iop_CmpEQ32:
case Iop_CmpNE32:
- if (mce->bogusLiterals)
+ if (mce->expensiveV)
goto expensive_cmp32;
else
goto cheap_cmp32;
@@ -2597,7 +2635,7 @@
case Iop_CmpEQ16:
case Iop_CmpNE16:
- if (mce->bogusLiterals)
+ if (mce->expensiveV)
goto expensive_cmp16;
else
goto cheap_cmp16;
@@ -3935,8 +3973,41 @@
/*--- Memcheck main ---*/
/*------------------------------------------------------------*/
-static void schemeS ( MCEnv* mce, IRStmt* st );
+/* Figure out whether we should use the "normal" or the "expensive"
+ instrumentation schemes for V bit tracking. We hope that for the
+ vast majority of blocks we can get away with the normal scheme.
+ But there are a few cases, mostly to do with highly optimised
+ strlen implementations, where extra accuracy is needed. Hence we
+ first scan the block looking for evidence that the expensive
+ schemes are needed.
+ What's affected?
+
+ * Add32 Sub32 Add64 Sub64 (expensiveAddSub)
+
+ * CmpEQ16 CmpNE16 CmpEQ32 CmpNE32 CmpEQ64 CmpNE64
+ (expensiveCmpEQorNE)
+
+ All other primops have unchanging interpretations.
+
+ ExpCmpNE{8,16,32,64} always use the expensive interpretation for
+ EQ/NE. The need to do so is flagged by the front ends, though; we
+ don't establish that here.
+
+ What evidence do we look for?
+
+ * literals of the form 0xFEFEFEFF and related values, which show up
+ in code that does zero-byte detection in 32/64-bit words by games
+ to do with carry chain propagation. This necessitates use of
+ more accurate Add and Sub interpretation. It may well be that
+ this requires expensive EQ/NE interpretations, although I can't
+ remember the reason.
+
+ * appearance of GetMSBs8x8, requiring expensive EQ/NE
+ interpretations for the degenerate-case (argument-is-zero) guards
+ for {Ctz,Clz}{32,64}, since the output from GetMSBs8x8 is fed
+ into a Ctz/Clz operation in x86/amd64 SSE2-based strlen routines.
+*/
static Bool isBogusAtom ( IRAtom* at )
{
ULong n = 0;
@@ -3969,7 +4040,7 @@
);
}
-static Bool checkForBogusLiterals ( /*FLAT*/ IRStmt* st )
+static Bool stRequiresExpensive ( /*FLAT*/ IRStmt* st )
{
Int i;
IRExpr* e;
@@ -4064,7 +4135,7 @@
IRType gWordTy, IRType hWordTy )
{
Bool verboze = 0||False;
- Bool bogus;
+ Bool expensiveV;
Int i, j, first_stmt;
IRStmt* st;
MCEnv mce;
@@ -4095,11 +4166,11 @@
.sb->tyenv and .tmpMap together, so the valid index-set for
those two arrays should always be identical. */
VG_(memset)(&mce, 0, sizeof(mce));
- mce.sb = sb_out;
- mce.trace = verboze;
- mce.layout = layout;
- mce.hWordTy = hWordTy;
- mce.bogusLiterals = False;
+ mce.sb = sb_out;
+ mce.trace = verboze;
+ mce.layout = layout;
+ mce.hWordTy = hWordTy;
+ mce.expensiveV = False;
mce.tmpMap = VG_(newXA)( VG_(malloc), "mc.MC_(instrument).1", VG_(free),
sizeof(TempMapEnt));
@@ -4113,12 +4184,12 @@
tl_assert( VG_(sizeXA)( mce.tmpMap ) == sb_in->tyenv->types_used );
/* Make a preliminary inspection of the statements, to see if there
- are any dodgy-looking literals. If there are, we generate
- extra-detailed (hence extra-expensive) instrumentation in
- places. Scan the whole bb even if dodgyness is found earlier,
- so that the flatness assertion is applied to all stmts. */
+ we need to use the expensive V-bit instrumentation schemes, as
+ per comment at the start of this fn. Scan the whole sb even if
+ dodgyness is found earlier, so that the flatness assertion is
+ applied to all stmts. */
- bogus = False;
+ expensiveV = False;
for (i = 0; i < sb_in->stmts_used; i++) {
@@ -4126,10 +4197,10 @@
tl_assert(st);
tl_assert(isFlatIRStmt(st));
- if (!bogus) {
- bogus = checkForBogusLiterals(st);
- if (0 && bogus) {
- VG_(printf)("bogus: ");
+ if (!expensiveV) {
+ expensiveV = stRequiresExpensive(st);
+ if (0 && expensiveV) {
+ VG_(printf)("expensiveV: ");
ppIRStmt(st);
VG_(printf)("\n");
}
@@ -4137,7 +4208,7 @@
}
- mce.bogusLiterals = bogus;
+ mce.expensiveV = expensiveV;
/* Copy verbatim any IR preamble preceding the first IMark */
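The improver-mask scheme described in the big comment above (Ctz#(x, x#) = PCast(x# & (x ^ (x-1)))) can be modelled as a small standalone C function. This is a sketch, not Memcheck code: `ctz32_shadow` is a hypothetical name, and the V-bit convention follows Memcheck's (a 1 bit in `vbits` means that bit of `x` is undefined).

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical scalar model of expensiveCountTrailingZeroes32.
   vbits has a 1 for each undefined bit of x.  Returns the shadow of
   Ctz32(x): 0 = result fully defined, ~0u = result fully undefined. */
static uint32_t ctz32_shadow(uint32_t x, uint32_t vbits)
{
    /* improver: 1s at and below the rightmost 1 bit of x,
       0s everywhere to its left. */
    uint32_t improver = x ^ (x - 1);
    /* improved: keep only undefinedness at/below the rightmost 1;
       undefined bits to its left are ignored (0 = "defined"). */
    uint32_t improved = vbits & improver;
    /* PCast: any surviving undefined bit taints the whole result. */
    return improved ? ~0u : 0u;
}
```

Running the two worked examples from the comment: with x = UUUUU10000 and x# = 1111100000 (undefined bits only to the left of the rightmost 1) the shadow is all-defined, whereas x# = 1111110000 (rightmost 1 itself undefined) yields all-undefined.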
From: Tom H. <to...@co...> - 2009-09-18 23:03:14
On 19/09/09 00:00, Nicholas Nethercote wrote:
> On Sat, Sep 19, 2009 at 1:28 AM, Tom Hughes <to...@co...> wrote:
>
>> Beyond that it will be a case of somebody going through it and checking
>> each call against the kernel source, which is a slow and horrible job
>> for ioctls.
>
> You're welcome to do that, but not all ioctl patches have received
> this level of scrutiny in the past, especially those involving more
> obscure ioctls.

The ones I have committed generally have ;-)

Tom

--
Tom Hughes (to...@co...) http://www.compton.nu/
From: Nicholas N. <n.n...@gm...> - 2009-09-18 23:00:34
On Sat, Sep 19, 2009 at 1:28 AM, Tom Hughes <to...@co...> wrote:
>
> Beyond that it will be a case of somebody going through it and checking
> each call against the kernel source, which is a slow and horrible job
> for ioctls.

You're welcome to do that, but not all ioctl patches have received
this level of scrutiny in the past, especially those involving more
obscure ioctls.

Nick
From: <sv...@va...> - 2009-09-18 17:16:45
Author: sewardj
Date: 2009-09-18 18:16:31 +0100 (Fri, 18 Sep 2009)
New Revision: 10891
Log:
Memcheck-side handling for new primops committed in vex r1922, to
do with tracking definedness flows exactly in sse2-based strlen
routines that are based on pmovmskb.
* Handle Iop_Ctz32 better, in the sense that we no longer care if
there are undefined bits above the rightmost 1 bit, provided
the rightmost 1 bit is defined, since they don't have any
effect on the definedness of the result.
* Handle Iop_8HLto16 (data steering)
* Handle Iop_GetMSBs8x8 (data steering)
* Handle Iop_ExpCmpNE{16,32,64} -- expensive case non-equality
comparisons
* use expensive interpretations for everything in the entire block
if Iop_GetMSBs8x8 appears anywhere within it.
Modified:
branches/ICC111/memcheck/mc_translate.c
Modified: branches/ICC111/memcheck/mc_translate.c
===================================================================
--- branches/ICC111/memcheck/mc_translate.c 2009-09-18 14:58:40 UTC (rev 10890)
+++ branches/ICC111/memcheck/mc_translate.c 2009-09-18 17:16:31 UTC (rev 10891)
@@ -788,6 +788,15 @@
tl_assert(sameKindedAtoms(vyy,yy));
switch (ty) {
+ case Ity_I16:
+ opOR = Iop_Or16;
+ opDIFD = Iop_And16;
+ opUIFU = Iop_Or16;
+ opNOT = Iop_Not16;
+ opXOR = Iop_Xor16;
+ opCMP = Iop_CmpEQ16;
+ top = mkU16(0xFFFF);
+ break;
case Ity_I32:
opOR = Iop_Or32;
opDIFD = Iop_And32;
@@ -1623,6 +1632,33 @@
}
+static
+IRAtom* expensiveCountTrailingZeroes32 ( MCEnv* mce,
+ IRAtom* aa, IRAtom* aav )
+{
+ IRType ty;
+ IRAtom *improver, *improved;
+ tl_assert(isShadowAtom(mce,aav));
+ tl_assert(isOriginalAtom(mce,aa));
+ tl_assert(sameKindedAtoms(aav,aa));
+
+ ty = Ity_I32;
+ // improver = aa ^ (aa - 1)
+ improver = assignNew('V', mce,ty,
+ binop(Iop_Xor32,
+ aa,
+ assignNew('V', mce,ty,
+ binop(Iop_Sub32,
+ aa,
+ mkU32(1)))));
+ // improved = aav & improver
+ improved = assignNew('V', mce,ty,
+ binop(Iop_And32, aav, improver));
+ // PCast(improved)
+ return mkPCastTo(mce, ty, improved);
+}
+
+
/*------------------------------------------------------------*/
/*--- Scalar shifts. ---*/
/*------------------------------------------------------------*/
@@ -2430,6 +2466,8 @@
case Iop_DivModS128to64:
return mkLazy2(mce, Ity_I128, vatom1, vatom2);
+ case Iop_8HLto16:
+ return assignNew('V', mce, Ity_I16, binop(op, vatom1, vatom2));
case Iop_16HLto32:
return assignNew('V', mce, Ity_I32, binop(op, vatom1, vatom2));
case Iop_32HLto64:
@@ -2520,31 +2558,59 @@
case Iop_Add8:
return mkLeft8(mce, mkUifU8(mce, vatom1,vatom2));
+ /////////////////////////////
+
case Iop_CmpEQ64:
case Iop_CmpNE64:
if (mce->bogusLiterals)
- return expensiveCmpEQorNE(mce,Ity_I64, vatom1,vatom2, atom1,atom2 );
+ goto expensive_cmp64;
else
goto cheap_cmp64;
+
+ expensive_cmp64:
+ return expensiveCmpEQorNE(mce,Ity_I64, vatom1,vatom2, atom1,atom2 );
+
cheap_cmp64:
case Iop_CmpLE64S: case Iop_CmpLE64U:
case Iop_CmpLT64U: case Iop_CmpLT64S:
return mkPCastTo(mce, Ity_I1, mkUifU64(mce, vatom1,vatom2));
+ /////////////////////////////
+
case Iop_CmpEQ32:
case Iop_CmpNE32:
if (mce->bogusLiterals)
- return expensiveCmpEQorNE(mce,Ity_I32, vatom1,vatom2, atom1,atom2 );
+ goto expensive_cmp32;
else
goto cheap_cmp32;
+
+ expensive_cmp32:
+ case Iop_ExpCmpNE32:
+ return expensiveCmpEQorNE(mce,Ity_I32, vatom1,vatom2, atom1,atom2 );
+
cheap_cmp32:
case Iop_CmpLE32S: case Iop_CmpLE32U:
case Iop_CmpLT32U: case Iop_CmpLT32S:
return mkPCastTo(mce, Ity_I1, mkUifU32(mce, vatom1,vatom2));
- case Iop_CmpEQ16: case Iop_CmpNE16:
+ /////////////////////////////
+
+ case Iop_CmpEQ16:
+ case Iop_CmpNE16:
+ if (mce->bogusLiterals)
+ goto expensive_cmp16;
+ else
+ goto cheap_cmp16;
+
+ expensive_cmp16:
+ case Iop_ExpCmpNE16:
+ return expensiveCmpEQorNE(mce,Ity_I16, vatom1,vatom2, atom1,atom2 );
+
+ cheap_cmp16:
return mkPCastTo(mce, Ity_I1, mkUifU16(mce, vatom1,vatom2));
+ /////////////////////////////
+
case Iop_CmpEQ8: case Iop_CmpNE8:
return mkPCastTo(mce, Ity_I1, mkUifU8(mce, vatom1,vatom2));
@@ -2679,10 +2745,12 @@
return mkPCastTo(mce, Ity_I64, vatom);
case Iop_Clz32:
- case Iop_Ctz32:
case Iop_TruncF64asF32:
return mkPCastTo(mce, Ity_I32, vatom);
+ case Iop_Ctz32:
+ return expensiveCountTrailingZeroes32(mce, atom, vatom);
+
case Iop_1Uto64:
case Iop_8Uto64:
case Iop_8Sto64:
@@ -2719,6 +2787,7 @@
case Iop_16HIto8:
case Iop_32to8:
case Iop_64to8:
+ case Iop_GetMSBs8x8:
return assignNew('V', mce, Ity_I8, unop(op, vatom));
case Iop_32to1:
@@ -3916,7 +3985,8 @@
case Iex_Const:
return isBogusAtom(e);
case Iex_Unop:
- return isBogusAtom(e->Iex.Unop.arg);
+ return isBogusAtom(e->Iex.Unop.arg)
+ || e->Iex.Unop.op == Iop_GetMSBs8x8;
case Iex_GetI:
return isBogusAtom(e->Iex.GetI.ix);
case Iex_Binop:
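For context, the SSE2 strlen idiom that r10891 is aimed at can be sketched with intrinsics. This is a simplified illustration, not glibc's implementation: it assumes GCC/Clang on an x86 target with SSE2 and a 16-byte-aligned string, so that each `pmovmskb` result has arbitrary (possibly uninitialised) bytes above the terminator feeding bits above the rightmost 1, while `bsf`/`__builtin_ctz` only depends on bits up to and including that rightmost 1.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stddef.h>
#include <assert.h>

/* Sketch of the pmovmskb + bsf strlen pattern.  Assumes s is
   16-byte aligned; the aligned 16-byte load may read bytes past
   the terminator (never crossing a page), which is exactly why
   Memcheck sees partially-defined pmovmskb outputs here. */
static size_t sse2_strlen(const char *s)
{
    const __m128i zero = _mm_setzero_si128();
    size_t i = 0;
    for (;;) {
        __m128i chunk = _mm_load_si128((const __m128i *)(s + i));
        /* pmovmskb: one mask bit per byte lane, set where byte == 0 */
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
        if (mask)
            return i + (size_t)__builtin_ctz(mask);  /* bsf */
        i += 16;
    }
}
```

The mask bits above the first set bit come from bytes beyond the terminator, so they may be undefined; the Ctz32 handling committed here lets Memcheck accept that without false positives, provided the rightmost 1 bit itself is defined.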
From: <sv...@va...> - 2009-09-18 16:56:43
Author: sewardj
Date: 2009-09-18 17:56:27 +0100 (Fri, 18 Sep 2009)
New Revision: 1922
Log:
Add support to make it possible for Memcheck to track definedness
flows exactly in sse2-based strlen routines that are based on
pmovmskb.
* new primop, Iop_GetMSBs8x8, to represent behaviour of pmovmskb
directly instead of via helper functions, which are opaque to
Memcheck. Front-end and back-end mods to match.
* new primops Iop_ExpCmpNE{8,16,32,64}, which are exactly like
Iop_CmpNE{8,16,32,64}, except carrying the additional hint
that they require expensive definedness tracking. These are
used in bsf/bsr, since it may be the case that their input is
generated by pmovmskb. (so far x86 only; amd64 fe is todo).
Modified:
branches/ICC111/priv/guest_amd64_defs.h
branches/ICC111/priv/guest_amd64_helpers.c
branches/ICC111/priv/guest_amd64_toIR.c
branches/ICC111/priv/guest_x86_defs.h
branches/ICC111/priv/guest_x86_helpers.c
branches/ICC111/priv/guest_x86_toIR.c
branches/ICC111/priv/host_amd64_isel.c
branches/ICC111/priv/host_generic_simd64.c
branches/ICC111/priv/host_generic_simd64.h
branches/ICC111/priv/host_x86_isel.c
branches/ICC111/priv/ir_defs.c
branches/ICC111/priv/ir_opt.c
branches/ICC111/pub/libvex_ir.h
Modified: branches/ICC111/priv/guest_amd64_defs.h
===================================================================
--- branches/ICC111/priv/guest_amd64_defs.h 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/guest_amd64_defs.h 2009-09-18 16:56:27 UTC (rev 1922)
@@ -140,8 +140,6 @@
extern ULong amd64g_calculate_mmx_pmaddwd ( ULong, ULong );
extern ULong amd64g_calculate_mmx_psadbw ( ULong, ULong );
-extern ULong amd64g_calculate_mmx_pmovmskb ( ULong );
-extern ULong amd64g_calculate_sse_pmovmskb ( ULong w64hi, ULong w64lo );
/* --- DIRTY HELPERS --- */
Modified: branches/ICC111/priv/guest_amd64_helpers.c
===================================================================
--- branches/ICC111/priv/guest_amd64_helpers.c 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/guest_amd64_helpers.c 2009-09-18 16:56:27 UTC (rev 1922)
@@ -2277,21 +2277,6 @@
}
/* CALLED FROM GENERATED CODE: CLEAN HELPER */
-ULong amd64g_calculate_mmx_pmovmskb ( ULong xx )
-{
- ULong r = 0;
- if (xx & (1ULL << (64-1))) r |= (1<<7);
- if (xx & (1ULL << (56-1))) r |= (1<<6);
- if (xx & (1ULL << (48-1))) r |= (1<<5);
- if (xx & (1ULL << (40-1))) r |= (1<<4);
- if (xx & (1ULL << (32-1))) r |= (1<<3);
- if (xx & (1ULL << (24-1))) r |= (1<<2);
- if (xx & (1ULL << (16-1))) r |= (1<<1);
- if (xx & (1ULL << ( 8-1))) r |= (1<<0);
- return r;
-}
-
-/* CALLED FROM GENERATED CODE: CLEAN HELPER */
ULong amd64g_calculate_mmx_psadbw ( ULong xx, ULong yy )
{
UInt t = 0;
@@ -2307,15 +2292,7 @@
return (ULong)t;
}
-/* CALLED FROM GENERATED CODE: CLEAN HELPER */
-ULong amd64g_calculate_sse_pmovmskb ( ULong w64hi, ULong w64lo )
-{
- ULong rHi8 = amd64g_calculate_mmx_pmovmskb ( w64hi );
- ULong rLo8 = amd64g_calculate_mmx_pmovmskb ( w64lo );
- return ((rHi8 & 0xFF) << 8) | (rLo8 & 0xFF);
-}
-
/*---------------------------------------------------------------*/
/*--- Helpers for dealing with, and describing, ---*/
/*--- guest state as a whole. ---*/
Modified: branches/ICC111/priv/guest_amd64_toIR.c
===================================================================
--- branches/ICC111/priv/guest_amd64_toIR.c 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/guest_amd64_toIR.c 2009-09-18 16:56:27 UTC (rev 1922)
@@ -9902,7 +9902,7 @@
/* ***--- this is an MMX class insn introduced in SSE1 ---*** */
/* 0F D7 = PMOVMSKB -- extract sign bits from each of 8 lanes in
- mmx(G), turn them into a byte, and put zero-extend of it in
+ mmx(E), turn them into a byte, and put zero-extend of it in
ireg(G). */
if (haveNo66noF2noF3(pfx) && sz == 4
&& insn[0] == 0x0F && insn[1] == 0xD7) {
@@ -9910,14 +9910,10 @@
if (epartIsReg(modrm)) {
do_MMX_preamble();
t0 = newTemp(Ity_I64);
- t1 = newTemp(Ity_I64);
+ t1 = newTemp(Ity_I32);
assign(t0, getMMXReg(eregLO3ofRM(modrm)));
- assign(t1, mkIRExprCCall(
- Ity_I64, 0/*regparms*/,
- "amd64g_calculate_mmx_pmovmskb",
- &amd64g_calculate_mmx_pmovmskb,
- mkIRExprVec_1(mkexpr(t0))));
- putIReg32(gregOfRexRM(pfx,modrm), unop(Iop_64to32,mkexpr(t1)));
+ assign(t1, unop(Iop_8Uto32, unop(Iop_GetMSBs8x8, mkexpr(t0))));
+ putIReg32(gregOfRexRM(pfx,modrm), mkexpr(t1));
DIP("pmovmskb %s,%s\n", nameMMXReg(eregLO3ofRM(modrm)),
nameIReg32(gregOfRexRM(pfx,modrm)));
delta += 3;
@@ -11829,13 +11825,13 @@
t1 = newTemp(Ity_I64);
assign(t0, getXMMRegLane64(eregOfRexRM(pfx,modrm), 0));
assign(t1, getXMMRegLane64(eregOfRexRM(pfx,modrm), 1));
- t5 = newTemp(Ity_I64);
- assign(t5, mkIRExprCCall(
- Ity_I64, 0/*regparms*/,
- "amd64g_calculate_sse_pmovmskb",
- &amd64g_calculate_sse_pmovmskb,
- mkIRExprVec_2( mkexpr(t1), mkexpr(t0) )));
- putIReg32(gregOfRexRM(pfx,modrm), unop(Iop_64to32,mkexpr(t5)));
+ t5 = newTemp(Ity_I32);
+ assign(t5,
+ unop(Iop_16Uto32,
+ binop(Iop_8HLto16,
+ unop(Iop_GetMSBs8x8, mkexpr(t1)),
+ unop(Iop_GetMSBs8x8, mkexpr(t0)))));
+ putIReg32(gregOfRexRM(pfx,modrm), mkexpr(t5));
DIP("pmovmskb %s,%s\n", nameXMMReg(eregOfRexRM(pfx,modrm)),
nameIReg32(gregOfRexRM(pfx,modrm)));
delta += 3;
Modified: branches/ICC111/priv/guest_x86_defs.h
===================================================================
--- branches/ICC111/priv/guest_x86_defs.h 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/guest_x86_defs.h 2009-09-18 16:56:27 UTC (rev 1922)
@@ -137,8 +137,6 @@
extern ULong x86g_calculate_mmx_pmaddwd ( ULong, ULong );
extern ULong x86g_calculate_mmx_psadbw ( ULong, ULong );
-extern UInt x86g_calculate_mmx_pmovmskb ( ULong );
-extern UInt x86g_calculate_sse_pmovmskb ( ULong w64hi, ULong w64lo );
/* --- DIRTY HELPERS --- */
Modified: branches/ICC111/priv/guest_x86_helpers.c
===================================================================
--- branches/ICC111/priv/guest_x86_helpers.c 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/guest_x86_helpers.c 2009-09-18 16:56:27 UTC (rev 1922)
@@ -2424,21 +2424,6 @@
}
/* CALLED FROM GENERATED CODE: CLEAN HELPER */
-UInt x86g_calculate_mmx_pmovmskb ( ULong xx )
-{
- UInt r = 0;
- if (xx & (1ULL << (64-1))) r |= (1<<7);
- if (xx & (1ULL << (56-1))) r |= (1<<6);
- if (xx & (1ULL << (48-1))) r |= (1<<5);
- if (xx & (1ULL << (40-1))) r |= (1<<4);
- if (xx & (1ULL << (32-1))) r |= (1<<3);
- if (xx & (1ULL << (24-1))) r |= (1<<2);
- if (xx & (1ULL << (16-1))) r |= (1<<1);
- if (xx & (1ULL << ( 8-1))) r |= (1<<0);
- return r;
-}
-
-/* CALLED FROM GENERATED CODE: CLEAN HELPER */
ULong x86g_calculate_mmx_psadbw ( ULong xx, ULong yy )
{
UInt t = 0;
@@ -2454,15 +2439,7 @@
return (ULong)t;
}
-/* CALLED FROM GENERATED CODE: CLEAN HELPER */
-UInt x86g_calculate_sse_pmovmskb ( ULong w64hi, ULong w64lo )
-{
- UInt rHi8 = x86g_calculate_mmx_pmovmskb ( w64hi );
- UInt rLo8 = x86g_calculate_mmx_pmovmskb ( w64lo );
- return ((rHi8 & 0xFF) << 8) | (rLo8 & 0xFF);
-}
-
/*---------------------------------------------------------------*/
/*--- Helpers for dealing with segment overrides. ---*/
/*---------------------------------------------------------------*/
Modified: branches/ICC111/priv/guest_x86_toIR.c
===================================================================
--- branches/ICC111/priv/guest_x86_toIR.c 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/guest_x86_toIR.c 2009-09-18 16:56:27 UTC (rev 1922)
@@ -723,6 +723,7 @@
|| op8 == Iop_Shl8 || op8 == Iop_Shr8 || op8 == Iop_Sar8
|| op8 == Iop_CmpEQ8 || op8 == Iop_CmpNE8
|| op8 == Iop_CasCmpNE8
+ || op8 == Iop_ExpCmpNE8
|| op8 == Iop_Not8);
adj = ty==Ity_I8 ? 0 : (ty==Ity_I16 ? 1 : 2);
return adj + op8;
@@ -6308,10 +6309,14 @@
( isReg ? nameIReg(sz, eregOfRM(modrm)) : dis_buf ),
nameIReg(sz, gregOfRM(modrm)));
- /* Generate an 8-bit expression which is zero iff the
- original is zero, and nonzero otherwise */
+ /* Generate an 8-bit expression which is zero iff the original is
+ zero, and nonzero otherwise. Ask for an ExpCmpNE version which,
+ if instrumented by Memcheck, is instrumented expensively, since
+ this may be used on the output of a preceding movmskb insn,
+ which has been known to be partially defined, and in need of
+ careful handling. */
assign( src8,
- unop(Iop_1Uto8, binop(mkSizedOp(ty,Iop_CmpNE8),
+ unop(Iop_1Uto8, binop(mkSizedOp(ty,Iop_ExpCmpNE8),
mkexpr(src), mkU(ty,0))) );
/* Flags: Z is 1 iff source value is zero. All others
@@ -8883,7 +8888,7 @@
/* ***--- this is an MMX class insn introduced in SSE1 ---*** */
/* 0F D7 = PMOVMSKB -- extract sign bits from each of 8 lanes in
- mmx(G), turn them into a byte, and put zero-extend of it in
+ mmx(E), turn them into a byte, and put zero-extend of it in
ireg(G). */
if (sz == 4 && insn[0] == 0x0F && insn[1] == 0xD7) {
modrm = insn[2];
@@ -8892,11 +8897,7 @@
t0 = newTemp(Ity_I64);
t1 = newTemp(Ity_I32);
assign(t0, getMMXReg(eregOfRM(modrm)));
- assign(t1, mkIRExprCCall(
- Ity_I32, 0/*regparms*/,
- "x86g_calculate_mmx_pmovmskb",
- &x86g_calculate_mmx_pmovmskb,
- mkIRExprVec_1(mkexpr(t0))));
+ assign(t1, unop(Iop_8Uto32, unop(Iop_GetMSBs8x8, mkexpr(t0))));
putIReg(4, gregOfRM(modrm), mkexpr(t1));
DIP("pmovmskb %s,%s\n", nameMMXReg(eregOfRM(modrm)),
nameIReg(4,gregOfRM(modrm)));
@@ -10720,11 +10721,9 @@
goto decode_success;
}
- /* 66 0F D7 = PMOVMSKB -- extract sign bits from each of 16 lanes in
- xmm(G), turn them into a byte, and put zero-extend of it in
- ireg(G). Doing this directly is just too cumbersome; give up
- therefore and call a helper. */
- /* UInt x86g_calculate_sse_pmovmskb ( ULong w64hi, ULong w64lo ); */
+ /* 66 0F D7 = PMOVMSKB -- extract sign bits from each of 16 lanes
+ in xmm(E), turn them into a byte, and put zero-extend of it in
+ ireg(G). */
if (sz == 2 && insn[0] == 0x0F && insn[1] == 0xD7) {
modrm = insn[2];
if (epartIsReg(modrm)) {
@@ -10733,11 +10732,11 @@
assign(t0, getXMMRegLane64(eregOfRM(modrm), 0));
assign(t1, getXMMRegLane64(eregOfRM(modrm), 1));
t5 = newTemp(Ity_I32);
- assign(t5, mkIRExprCCall(
- Ity_I32, 0/*regparms*/,
- "x86g_calculate_sse_pmovmskb",
- &x86g_calculate_sse_pmovmskb,
- mkIRExprVec_2( mkexpr(t1), mkexpr(t0) )));
+ assign(t5,
+ unop(Iop_16Uto32,
+ binop(Iop_8HLto16,
+ unop(Iop_GetMSBs8x8, mkexpr(t1)),
+ unop(Iop_GetMSBs8x8, mkexpr(t0)))));
putIReg(4, gregOfRM(modrm), mkexpr(t5));
DIP("pmovmskb %s,%s\n", nameXMMReg(eregOfRM(modrm)),
nameIReg(4,gregOfRM(modrm)));
Modified: branches/ICC111/priv/host_amd64_isel.c
===================================================================
--- branches/ICC111/priv/host_amd64_isel.c 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/host_amd64_isel.c 2009-09-18 16:56:27 UTC (rev 1922)
@@ -815,7 +815,7 @@
This should handle expressions of 64, 32, 16 and 8-bit type. All
results are returned in a 64-bit register. For 32-, 16- and 8-bit
- expressions, the upper 32/16/24 bits are arbitrary, so you should
+ expressions, the upper 32/48/56 bits are arbitrary, so you should
mask or sign extend partial values if necessary.
*/
@@ -1632,6 +1632,20 @@
/* These are no-ops. */
return iselIntExpr_R(env, e->Iex.Unop.arg);
+ case Iop_GetMSBs8x8: {
+ /* Note: the following assumes the helper is of
+ signature
+ UInt fn ( ULong ), and is not a regparm fn.
+ */
+ HReg dst = newVRegI(env);
+ HReg arg = iselIntExpr_R(env, e->Iex.Unop.arg);
+ fn = (HWord)h_generic_calc_GetMSBs8x8;
+ addInstr(env, mk_iMOVsd_RR(arg, hregAMD64_RDI()) );
+ addInstr(env, AMD64Instr_Call( Acc_ALWAYS, (ULong)fn, 1 ));
+ addInstr(env, AMD64Instr_MovZLQ(hregAMD64_RAX(), dst));
+ return dst;
+ }
+
default:
break;
}
Modified: branches/ICC111/priv/host_generic_simd64.c
===================================================================
--- branches/ICC111/priv/host_generic_simd64.c 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/host_generic_simd64.c 2009-09-18 16:56:27 UTC (rev 1922)
@@ -1041,6 +1041,19 @@
);
}
+UInt h_generic_calc_GetMSBs8x8 ( ULong xx )
+{
+ UInt r = 0;
+ if (xx & (1ULL << (64-1))) r |= (1<<7);
+ if (xx & (1ULL << (56-1))) r |= (1<<6);
+ if (xx & (1ULL << (48-1))) r |= (1<<5);
+ if (xx & (1ULL << (40-1))) r |= (1<<4);
+ if (xx & (1ULL << (32-1))) r |= (1<<3);
+ if (xx & (1ULL << (24-1))) r |= (1<<2);
+ if (xx & (1ULL << (16-1))) r |= (1<<1);
+ if (xx & (1ULL << ( 8-1))) r |= (1<<0);
+ return r;
+}
/*---------------------------------------------------------------*/
/*--- end host_generic_simd64.c ---*/
Modified: branches/ICC111/priv/host_generic_simd64.h
===================================================================
--- branches/ICC111/priv/host_generic_simd64.h 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/host_generic_simd64.h 2009-09-18 16:56:27 UTC (rev 1922)
@@ -132,6 +132,7 @@
extern ULong h_generic_calc_Min16Sx4 ( ULong, ULong );
extern ULong h_generic_calc_Min8Ux8 ( ULong, ULong );
+extern UInt h_generic_calc_GetMSBs8x8 ( ULong );
#endif /* ndef __VEX_HOST_GENERIC_SIMD64_H */
Modified: branches/ICC111/priv/host_x86_isel.c
===================================================================
--- branches/ICC111/priv/host_x86_isel.c 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/host_x86_isel.c 2009-09-18 16:56:27 UTC (rev 1922)
@@ -1286,6 +1286,23 @@
/* These are no-ops. */
return iselIntExpr_R(env, e->Iex.Unop.arg);
+ case Iop_GetMSBs8x8: {
+ /* Note: the following assumes the helper is of
+ signature
+ UInt fn ( ULong ), and is not a regparm fn.
+ */
+ HReg xLo, xHi;
+ HReg dst = newVRegI(env);
+ HWord fn = (HWord)h_generic_calc_GetMSBs8x8;
+ iselInt64Expr(&xHi, &xLo, env, e->Iex.Unop.arg);
+ addInstr(env, X86Instr_Push(X86RMI_Reg(xHi)));
+ addInstr(env, X86Instr_Push(X86RMI_Reg(xLo)));
+ addInstr(env, X86Instr_Call( Xcc_ALWAYS, (UInt)fn, 0 ));
+ add_to_esp(env, 2*4);
+ addInstr(env, mk_iMOVsd_RR(hregX86_EAX(), dst));
+ return dst;
+ }
+
default:
break;
}
@@ -1833,7 +1850,8 @@
&& (e->Iex.Binop.op == Iop_CmpEQ16
|| e->Iex.Binop.op == Iop_CmpNE16
|| e->Iex.Binop.op == Iop_CasCmpEQ16
- || e->Iex.Binop.op == Iop_CasCmpNE16)) {
+ || e->Iex.Binop.op == Iop_CasCmpNE16
+ || e->Iex.Binop.op == Iop_ExpCmpNE16)) {
HReg r1 = iselIntExpr_R(env, e->Iex.Binop.arg1);
X86RMI* rmi2 = iselIntExpr_RMI(env, e->Iex.Binop.arg2);
HReg r = newVRegI(env);
@@ -1842,7 +1860,8 @@
addInstr(env, X86Instr_Test32(0xFFFF,X86RM_Reg(r)));
switch (e->Iex.Binop.op) {
case Iop_CmpEQ16: case Iop_CasCmpEQ16: return Xcc_Z;
- case Iop_CmpNE16: case Iop_CasCmpNE16: return Xcc_NZ;
+ case Iop_CmpNE16:
+ case Iop_CasCmpNE16: case Iop_ExpCmpNE16: return Xcc_NZ;
default: vpanic("iselCondCode(x86): CmpXX16");
}
}
@@ -1856,13 +1875,15 @@
|| e->Iex.Binop.op == Iop_CmpLE32S
|| e->Iex.Binop.op == Iop_CmpLE32U
|| e->Iex.Binop.op == Iop_CasCmpEQ32
- || e->Iex.Binop.op == Iop_CasCmpNE32)) {
+ || e->Iex.Binop.op == Iop_CasCmpNE32
+ || e->Iex.Binop.op == Iop_ExpCmpNE32)) {
HReg r1 = iselIntExpr_R(env, e->Iex.Binop.arg1);
X86RMI* rmi2 = iselIntExpr_RMI(env, e->Iex.Binop.arg2);
addInstr(env, X86Instr_Alu32R(Xalu_CMP,rmi2,r1));
switch (e->Iex.Binop.op) {
case Iop_CmpEQ32: case Iop_CasCmpEQ32: return Xcc_Z;
- case Iop_CmpNE32: case Iop_CasCmpNE32: return Xcc_NZ;
+ case Iop_CmpNE32:
+ case Iop_CasCmpNE32: case Iop_ExpCmpNE32: return Xcc_NZ;
case Iop_CmpLT32S: return Xcc_L;
case Iop_CmpLT32U: return Xcc_B;
case Iop_CmpLE32S: return Xcc_LE;
Modified: branches/ICC111/priv/ir_defs.c
===================================================================
--- branches/ICC111/priv/ir_defs.c 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/ir_defs.c 2009-09-18 16:56:27 UTC (rev 1922)
@@ -148,6 +148,8 @@
str = "CasCmpEQ"; base = Iop_CasCmpEQ8; break;
case Iop_CasCmpNE8 ... Iop_CasCmpNE64:
str = "CasCmpNE"; base = Iop_CasCmpNE8; break;
+ case Iop_ExpCmpNE8 ... Iop_ExpCmpNE64:
+ str = "ExpCmpNE"; base = Iop_ExpCmpNE8; break;
case Iop_Not8 ... Iop_Not64:
str = "Not"; base = Iop_Not8; break;
/* other cases must explicitly "return;" */
@@ -371,6 +373,7 @@
case Iop_CatOddLanes16x4: vex_printf("CatOddLanes16x4"); return;
case Iop_CatEvenLanes16x4: vex_printf("CatEvenLanes16x4"); return;
case Iop_Perm8x8: vex_printf("Perm8x8"); return;
+ case Iop_GetMSBs8x8: vex_printf("GetMSBs8x8"); return;
case Iop_CmpNEZ32x2: vex_printf("CmpNEZ32x2"); return;
case Iop_CmpNEZ16x4: vex_printf("CmpNEZ16x4"); return;
@@ -1647,18 +1650,18 @@
UNARY(Ity_I64, Ity_I64);
case Iop_CmpEQ8: case Iop_CmpNE8:
- case Iop_CasCmpEQ8: case Iop_CasCmpNE8:
+ case Iop_CasCmpEQ8: case Iop_CasCmpNE8: case Iop_ExpCmpNE8:
COMPARISON(Ity_I8);
case Iop_CmpEQ16: case Iop_CmpNE16:
- case Iop_CasCmpEQ16: case Iop_CasCmpNE16:
+ case Iop_CasCmpEQ16: case Iop_CasCmpNE16: case Iop_ExpCmpNE16:
COMPARISON(Ity_I16);
case Iop_CmpEQ32: case Iop_CmpNE32:
- case Iop_CasCmpEQ32: case Iop_CasCmpNE32:
+ case Iop_CasCmpEQ32: case Iop_CasCmpNE32: case Iop_ExpCmpNE32:
case Iop_CmpLT32S: case Iop_CmpLE32S:
case Iop_CmpLT32U: case Iop_CmpLE32U:
COMPARISON(Ity_I32);
case Iop_CmpEQ64: case Iop_CmpNE64:
- case Iop_CasCmpEQ64: case Iop_CasCmpNE64:
+ case Iop_CasCmpEQ64: case Iop_CasCmpNE64: case Iop_ExpCmpNE64:
case Iop_CmpLT64S: case Iop_CmpLE64S:
case Iop_CmpLT64U: case Iop_CmpLE64U:
COMPARISON(Ity_I64);
@@ -1672,6 +1675,7 @@
case Iop_Left16: UNARY(Ity_I16,Ity_I16);
case Iop_CmpwNEZ32: case Iop_Left32: UNARY(Ity_I32,Ity_I32);
case Iop_CmpwNEZ64: case Iop_Left64: UNARY(Ity_I64,Ity_I64);
+ case Iop_GetMSBs8x8: UNARY(Ity_I64, Ity_I8);
case Iop_MullU8: case Iop_MullS8:
BINARY(Ity_I8,Ity_I8, Ity_I16);
Modified: branches/ICC111/priv/ir_opt.c
===================================================================
--- branches/ICC111/priv/ir_opt.c 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/priv/ir_opt.c 2009-09-18 16:56:27 UTC (rev 1922)
@@ -1336,16 +1336,22 @@
/* -- CmpNE -- */
case Iop_CmpNE8:
+ case Iop_CasCmpNE8:
+ case Iop_ExpCmpNE8:
e2 = IRExpr_Const(IRConst_U1(toBool(
((0xFF & e->Iex.Binop.arg1->Iex.Const.con->Ico.U8)
!= (0xFF & e->Iex.Binop.arg2->Iex.Const.con->Ico.U8)))));
break;
case Iop_CmpNE32:
+ case Iop_CasCmpNE32:
+ case Iop_ExpCmpNE32:
e2 = IRExpr_Const(IRConst_U1(toBool(
(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
!= e->Iex.Binop.arg2->Iex.Const.con->Ico.U32))));
break;
case Iop_CmpNE64:
+ case Iop_CasCmpNE64:
+ case Iop_ExpCmpNE64:
e2 = IRExpr_Const(IRConst_U1(toBool(
(e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
!= e->Iex.Binop.arg2->Iex.Const.con->Ico.U64))));
Modified: branches/ICC111/pub/libvex_ir.h
===================================================================
--- branches/ICC111/pub/libvex_ir.h 2009-09-18 14:52:41 UTC (rev 1921)
+++ branches/ICC111/pub/libvex_ir.h 2009-09-18 16:56:27 UTC (rev 1922)
@@ -431,6 +431,10 @@
Iop_CasCmpEQ8, Iop_CasCmpEQ16, Iop_CasCmpEQ32, Iop_CasCmpEQ64,
Iop_CasCmpNE8, Iop_CasCmpNE16, Iop_CasCmpNE32, Iop_CasCmpNE64,
+ /* Exactly like CmpNE8/16/32/64, but carrying the additional
+ hint that these need expensive definedness tracking. */
+ Iop_ExpCmpNE8, Iop_ExpCmpNE16, Iop_ExpCmpNE32, Iop_ExpCmpNE64,
+
/* -- Ordering not important after here. -- */
/* Widening multiplies */
@@ -718,6 +722,10 @@
is undefined. */
Iop_Perm8x8,
+ /* MISC CONVERSION -- get high bits of each byte lane, a la
+ x86/amd64 pmovmskb */
+ Iop_GetMSBs8x8, /* I64 -> I8 */
+
/* ------------------ 128-bit SIMD FP. ------------------ */
/* --- 32x4 vector FP --- */
From: Tom H. <to...@co...> - 2009-09-18 15:28:34
On 18/09/09 14:56, Robert Bragg wrote:

> I've attached a patch for coregrind/m_syswrap/syswrap-linux.c that
> describes enough of the drm ioctls that I can remove the memsets in
> libdrm and run the applications I work on with Valgrind knowing what's
> going on. In include/vki/vki-linux-drm.h I've tried to add all the
> structures and ioctl definitions so it should be relatively
> straightforward to add additional drm ioctls to syswrap-linux.c in an
> incremental fashion.

Please file a bug in bugzilla and attach your patch to that so it doesn't get forgotten about.

> I'd love to get some feedback on this patch and to know if this is
> something that could be accepted upstream?

Well, the only obvious thing is the new include file - there is no precedent for that, as we have previously simply added all required structure definitions etc. to vki-linux.h rather than creating new files. Beyond that it will be a case of somebody going through it and checking each call against the kernel source, which is a slow and horrible job for ioctls.

Tom

--
Tom Hughes (to...@co...)
http://www.compton.nu/
From: <sv...@va...> - 2009-09-18 14:58:53
Author: sewardj
Date: 2009-09-18 15:58:40 +0100 (Fri, 18 Sep 2009)
New Revision: 10890

Log:
Swizzle external.

Modified: branches/ICC111/

Property changes on: branches/ICC111
___________________________________________________________________
Name: svn:externals
   - VEX svn://svn.valgrind.org/vex/trunk
   + VEX svn://svn.valgrind.org/vex/branches/ICC111
From: <sv...@va...> - 2009-09-18 14:56:48
Author: sewardj
Date: 2009-09-18 15:56:34 +0100 (Fri, 18 Sep 2009)
New Revision: 10889

Log:
Make a copy of trunk r10888 to experiment with support for Intel-11.1 compiled code.

Added: branches/ICC111/
Copied: branches/ICC111 (from rev 10888, trunk)
From: Robert B. <bo...@o-...> - 2009-09-18 14:55:03
Hi,

As someone who works a lot with OpenGL applications, I've often found it a shame that Valgrind doesn't know about any of the drm ioctls, which can result in a lot of noise when using memcheck. More recently the libdrm developers have started memsetting all their ioctl structures to avoid complaints from Valgrind, but that's not ideal and isn't enough to silence all the warnings.

I've attached a patch for coregrind/m_syswrap/syswrap-linux.c that describes enough of the drm ioctls that I can remove the memsets in libdrm and run the applications I work on with Valgrind knowing what's going on. In include/vki/vki-linux-drm.h I've tried to add all the structures and ioctl definitions, so it should be relatively straightforward to add additional drm ioctls to syswrap-linux.c in an incremental fashion.

For reference, I'm running with Intel i965 hardware with the i915 drm module, and I'm building linux, mesa and libdrm from git master, which also implies I'm running the GEM memory manager. It's likely that this patch is missing some ioctls if your setup diverges from mine, but it's what I can most easily test.

I'd love to get some feedback on this patch, and to know if this is something that could be accepted upstream.

I've also attached a file called drm-ioctls.txt that is basically lots of boilerplate snippets of code generated with various vim macros from the original drm ioctl declarations. Using these as a starting point to add further ioctls should remove some of the tedium.

kind regards,
- Robert

--
Robert Bragg, Intel Open Source Technology Center
From: <sv...@va...> - 2009-09-18 14:53:00
Author: sewardj
Date: 2009-09-18 15:52:41 +0100 (Fri, 18 Sep 2009)
New Revision: 1921

Log:
Make a copy of trunk r1920 to experiment with support for Intel-11.1 compiled code.

Added: branches/ICC111/
Copied: branches/ICC111 (from rev 1920, trunk)
From: Alexander P. <gl...@go...> - 2009-09-18 08:05:45
Nightly build on mcgrind (Darwin 9.7.0 i386)
Started at 2009-09-18 09:06:01 MSD
Ended at 2009-09-18 09:28:14 MSD
Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 433 tests, 22 stderr failures, 1 stdout failure, 0 post failures ==
memcheck/tests/null_socket (stdout)
memcheck/tests/origin5-bz2 (stderr)
memcheck/tests/varinfo1 (stderr)
memcheck/tests/varinfo2 (stderr)
memcheck/tests/varinfo3 (stderr)
memcheck/tests/varinfo4 (stderr)
memcheck/tests/varinfo5 (stderr)
memcheck/tests/varinfo6 (stderr)
none/tests/async-sigs (stderr)
none/tests/faultstatus (stderr)
none/tests/pth_blockedsig (stderr)
helgrind/tests/hg03_inherit (stderr)
helgrind/tests/hg04_race (stderr)
helgrind/tests/hg05_race2 (stderr)
helgrind/tests/rwlock_race (stderr)
helgrind/tests/tc01_simple_race (stderr)
helgrind/tests/tc05_simple_race (stderr)
helgrind/tests/tc06_two_races (stderr)
helgrind/tests/tc06_two_races_xml (stderr)
helgrind/tests/tc16_byterace (stderr)
helgrind/tests/tc18_semabuse (stderr)
helgrind/tests/tc21_pthonce (stderr)
helgrind/tests/tc23_bogus_condwait (stderr)

--
Alexander Potapenko
Software Engineer
Google Moscow