From: <sv...@va...> - 2012-06-24 15:11:55
sewardj 2012-06-24 16:11:48 +0100 (Sun, 24 Jun 2012)
New Revision: 12672
Log:
Update.
Modified files:
trunk/none/tests/amd64/avx-1.c
Modified: trunk/none/tests/amd64/avx-1.c (+63 -0)
===================================================================
--- trunk/none/tests/amd64/avx-1.c 2012-06-24 15:58:08 +01:00 (rev 12671)
+++ trunk/none/tests/amd64/avx-1.c 2012-06-24 16:11:48 +01:00 (rev 12672)
@@ -2034,6 +2034,55 @@
"vpextrw $0x7, %%xmm7, %%r14d",
"vpextrw $0x7, %%xmm7, (%%rax)")
+GEN_test_RandM(VAESENC,
+ "vaesenc %%xmm6, %%xmm8, %%xmm7",
+ "vaesenc (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VAESENCLAST,
+ "vaesenclast %%xmm6, %%xmm8, %%xmm7",
+ "vaesenclast (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VAESDEC,
+ "vaesdec %%xmm6, %%xmm8, %%xmm7",
+ "vaesdec (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VAESDECLAST,
+ "vaesdeclast %%xmm6, %%xmm8, %%xmm7",
+ "vaesdeclast (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VAESIMC,
+ "vaesimc %%xmm6, %%xmm7",
+ "vaesimc (%%rax), %%xmm7")
+
+GEN_test_RandM(VAESKEYGENASSIST_0x00,
+ "vaeskeygenassist $0x00, %%xmm6, %%xmm7",
+ "vaeskeygenassist $0x00, (%%rax), %%xmm7")
+GEN_test_RandM(VAESKEYGENASSIST_0x31,
+ "vaeskeygenassist $0x31, %%xmm6, %%xmm7",
+ "vaeskeygenassist $0x31, (%%rax), %%xmm7")
+GEN_test_RandM(VAESKEYGENASSIST_0xB2,
+ "vaeskeygenassist $0xb2, %%xmm6, %%xmm7",
+ "vaeskeygenassist $0xb2, (%%rax), %%xmm7")
+GEN_test_RandM(VAESKEYGENASSIST_0xFF,
+ "vaeskeygenassist $0xFF, %%xmm6, %%xmm7",
+ "vaeskeygenassist $0xFF, (%%rax), %%xmm7")
+
+GEN_test_RandM(VPCLMULQDQ_0x00,
+ "vpclmulqdq $0x00, %%xmm6, %%xmm8, %%xmm7",
+ "vpclmulqdq $0x00, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VPCLMULQDQ_0x01,
+ "vpclmulqdq $0x01, %%xmm6, %%xmm8, %%xmm7",
+ "vpclmulqdq $0x01, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VPCLMULQDQ_0x10,
+ "vpclmulqdq $0x10, %%xmm6, %%xmm8, %%xmm7",
+ "vpclmulqdq $0x10, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VPCLMULQDQ_0x11,
+ "vpclmulqdq $0x11, %%xmm6, %%xmm8, %%xmm7",
+ "vpclmulqdq $0x11, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VPCLMULQDQ_0xFF,
+ "vpclmulqdq $0xFF, %%xmm6, %%xmm8, %%xmm7",
+ "vpclmulqdq $0xFF, (%%rax), %%xmm8, %%xmm7")
+
/* Comment duplicated above, for convenient reference:
Allowed operands in test insns:
Reg form: %ymm6, %ymm7, %ymm8, %ymm9 and %r14.
@@ -2540,5 +2589,19 @@
DO_D( VPEXTRW_128_0x5 );
DO_D( VPEXTRW_128_0x6 );
DO_D( VPEXTRW_128_0x7 );
+ DO_D( VAESENC );
+ DO_D( VAESENCLAST );
+ DO_D( VAESDEC );
+ DO_D( VAESDECLAST );
+ DO_D( VAESIMC );
+ DO_D( VAESKEYGENASSIST_0x00 );
+ DO_D( VAESKEYGENASSIST_0x31 );
+ DO_D( VAESKEYGENASSIST_0xB2 );
+ DO_D( VAESKEYGENASSIST_0xFF );
+ DO_D( VPCLMULQDQ_0x00 );
+ DO_D( VPCLMULQDQ_0x01 );
+ DO_D( VPCLMULQDQ_0x10 );
+ DO_D( VPCLMULQDQ_0x11 );
+ DO_D( VPCLMULQDQ_0xFF );
return 0;
}
From: <sv...@va...> - 2012-06-24 15:11:46
sewardj 2012-06-24 16:11:38 +0100 (Sun, 24 Jun 2012)
New Revision: 2410
Log:
More AVX insns:
VAESIMC xmm2/m128, xmm1 = VEX.128.66.0F38.WIG DB /r
VAESENC xmm3/m128, xmm2, xmm1 = VEX.128.66.0F38.WIG DC /r
VAESENCLAST xmm3/m128, xmm2, xmm1 = VEX.128.66.0F38.WIG DD /r
VAESDEC xmm3/m128, xmm2, xmm1 = VEX.128.66.0F38.WIG DE /r
VAESDECLAST xmm3/m128, xmm2, xmm1 = VEX.128.66.0F38.WIG DF /r
VPCLMULQDQ imm8, xmm3/m128,xmm2,xmm1
VAESKEYGENASSIST imm8, xmm2/m128, xmm1 = VEX.128.66.0F3A.WIG DF /r
(Jakub Jelinek, ja...@re...), #273475 comments 138 and 139.
Modified files:
trunk/priv/guest_amd64_defs.h
trunk/priv/guest_amd64_helpers.c
trunk/priv/guest_amd64_toIR.c
Modified: trunk/priv/guest_amd64_helpers.c (+17 -13)
===================================================================
--- trunk/priv/guest_amd64_helpers.c 2012-06-24 15:57:59 +01:00 (rev 2409)
+++ trunk/priv/guest_amd64_helpers.c 2012-06-24 16:11:38 +01:00 (rev 2410)
@@ -3546,38 +3546,42 @@
/* For description, see definition in guest_amd64_defs.h */
void amd64g_dirtyhelper_AES (
VexGuestAMD64State* gst,
- HWord opc4,
+ HWord opc4, HWord gstOffD,
HWord gstOffL, HWord gstOffR
)
{
// where the args are
+ V128* argD = (V128*)( ((UChar*)gst) + gstOffD );
V128* argL = (V128*)( ((UChar*)gst) + gstOffL );
V128* argR = (V128*)( ((UChar*)gst) + gstOffR );
+ V128 r;
switch (opc4) {
case 0xDC: /* AESENC */
case 0xDD: /* AESENCLAST */
- ShiftRows (argR);
- SubBytes (argR);
+ r = *argR;
+ ShiftRows (&r);
+ SubBytes (&r);
if (opc4 == 0xDC)
- MixColumns (argR);
- argR->w64[0] = argR->w64[0] ^ argL->w64[0];
- argR->w64[1] = argR->w64[1] ^ argL->w64[1];
+ MixColumns (&r);
+ argD->w64[0] = r.w64[0] ^ argL->w64[0];
+ argD->w64[1] = r.w64[1] ^ argL->w64[1];
break;
case 0xDE: /* AESDEC */
case 0xDF: /* AESDECLAST */
- InvShiftRows (argR);
- InvSubBytes (argR);
+ r = *argR;
+ InvShiftRows (&r);
+ InvSubBytes (&r);
if (opc4 == 0xDE)
- InvMixColumns (argR);
- argR->w64[0] = argR->w64[0] ^ argL->w64[0];
- argR->w64[1] = argR->w64[1] ^ argL->w64[1];
+ InvMixColumns (&r);
+ argD->w64[0] = r.w64[0] ^ argL->w64[0];
+ argD->w64[1] = r.w64[1] ^ argL->w64[1];
break;
case 0xDB: /* AESIMC */
- *argR = *argL;
- InvMixColumns (argR);
+ *argD = *argL;
+ InvMixColumns (argD);
break;
default: vassert(0);
}
Modified: trunk/priv/guest_amd64_defs.h (+4 -4)
===================================================================
--- trunk/priv/guest_amd64_defs.h 2012-06-24 15:57:59 +01:00 (rev 2409)
+++ trunk/priv/guest_amd64_defs.h 2012-06-24 16:11:38 +01:00 (rev 2410)
@@ -238,14 +238,14 @@
(will assert otherwise).
gstOffL and gstOffR are the guest state offsets for the two XMM
- register inputs and/or output. We never have to deal with the memory
- case since that is handled by pre-loading the relevant value into the fake
- XMM16 register.
+ register inputs, gstOffD is the guest state offset for the XMM register
+ output. We never have to deal with the memory case since that is handled
+ by pre-loading the relevant value into the fake XMM16 register.
*/
extern void amd64g_dirtyhelper_AES (
VexGuestAMD64State* gst,
- HWord opc4,
+ HWord opc4, HWord gstOffD,
HWord gstOffL, HWord gstOffR
);
Modified: trunk/priv/guest_amd64_toIR.c (+263 -163)
===================================================================
--- trunk/priv/guest_amd64_toIR.c 2012-06-24 15:57:59 +01:00 (rev 2409)
+++ trunk/priv/guest_amd64_toIR.c 2012-06-24 16:11:38 +01:00 (rev 2410)
@@ -16008,6 +16008,167 @@
}
+static Long dis_AESx ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx, UChar opc )
+{
+ IRTemp addr = IRTemp_INVALID;
+ Int alen = 0;
+ HChar dis_buf[50];
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx, modrm);
+ UInt regNoL = 0;
+ UInt regNoR = (isAvx && opc != 0xDB) ? getVexNvvvv(pfx) : rG;
+
+ /* This is a nasty kludge. We need to pass 2 x V128 to the
+ helper. Since we can't do that, use a dirty
+ helper to compute the results directly from the XMM regs in
+ the guest state. That means for the memory case, we need to
+ move the left operand into a pseudo-register (XMM16, let's
+ call it). */
+ if (epartIsReg(modrm)) {
+ regNoL = eregOfRexRM(pfx, modrm);
+ delta += 1;
+ } else {
+ regNoL = 16; /* use XMM16 as an intermediary */
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ /* alignment check needed ???? */
+ stmt( IRStmt_Put( OFFB_YMM16, loadLE(Ity_V128, mkexpr(addr)) ));
+ delta += alen;
+ }
+
+ void* fn = &amd64g_dirtyhelper_AES;
+ HChar* nm = "amd64g_dirtyhelper_AES";
+
+ /* Round up the arguments. Note that this is a kludge -- the
+ use of mkU64 rather than mkIRExpr_HWord implies the
+ assumption that the host's word size is 64-bit. */
+ UInt gstOffD = ymmGuestRegOffset(rG);
+ UInt gstOffL = regNoL == 16 ? OFFB_YMM16 : ymmGuestRegOffset(regNoL);
+ UInt gstOffR = ymmGuestRegOffset(regNoR);
+ IRExpr* opc4 = mkU64(opc);
+ IRExpr* gstOffDe = mkU64(gstOffD);
+ IRExpr* gstOffLe = mkU64(gstOffL);
+ IRExpr* gstOffRe = mkU64(gstOffR);
+ IRExpr** args
+ = mkIRExprVec_4( opc4, gstOffDe, gstOffLe, gstOffRe );
+
+ IRDirty* d = unsafeIRDirty_0_N( 0/*regparms*/, nm, fn, args );
+ /* It's not really a dirty call, but we can't use the clean
+ helper mechanism here for the very lame reason that we can't
+ pass 2 x V128s by value to a helper, nor get one back. Hence
+ this roundabout scheme. */
+ d->needsBBP = True;
+ d->nFxState = 2;
+ vex_bzero(&d->fxState, sizeof(d->fxState));
+ /* AES{ENC,ENCLAST,DEC,DECLAST} read both registers, and writes
+ the second for !isAvx or the third for isAvx.
+ AESIMC (0xDB) reads the first register, and writes the second. */
+ d->fxState[0].fx = Ifx_Read;
+ d->fxState[0].offset = gstOffL;
+ d->fxState[0].size = sizeof(U128);
+ d->fxState[1].offset = gstOffR;
+ d->fxState[1].size = sizeof(U128);
+ if (opc == 0xDB)
+ d->fxState[1].fx = Ifx_Write;
+ else if (!isAvx || rG == regNoR)
+ d->fxState[1].fx = Ifx_Modify;
+ else {
+ d->fxState[1].fx = Ifx_Read;
+ d->nFxState++;
+ d->fxState[2].fx = Ifx_Write;
+ d->fxState[2].offset = gstOffD;
+ d->fxState[2].size = sizeof(U128);
+ }
+
+ stmt( IRStmt_Dirty(d) );
+ {
+ HChar* opsuf;
+ switch (opc) {
+ case 0xDC: opsuf = "enc"; break;
+ case 0XDD: opsuf = "enclast"; break;
+ case 0xDE: opsuf = "dec"; break;
+ case 0xDF: opsuf = "declast"; break;
+ case 0xDB: opsuf = "imc"; break;
+ default: vassert(0);
+ }
+ DIP("%saes%s %s,%s%s%s\n", isAvx ? "v" : "", opsuf,
+ (regNoL == 16 ? dis_buf : nameXMMReg(regNoL)),
+ nameXMMReg(regNoR),
+ (isAvx && opc != 0xDB) ? "," : "",
+ (isAvx && opc != 0xDB) ? nameXMMReg(rG) : "");
+ }
+ if (isAvx)
+ putYMMRegLane128( rG, 1, mkV128(0) );
+ return delta;
+}
+
+static Long dis_AESKEYGENASSIST ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx )
+{
+ IRTemp addr = IRTemp_INVALID;
+ Int alen = 0;
+ HChar dis_buf[50];
+ UChar modrm = getUChar(delta);
+ UInt regNoL = 0;
+ UInt regNoR = gregOfRexRM(pfx, modrm);
+ UChar imm = 0;
+
+ /* This is a nasty kludge. See AESENC et al. instructions. */
+ modrm = getUChar(delta);
+ if (epartIsReg(modrm)) {
+ regNoL = eregOfRexRM(pfx, modrm);
+ imm = getUChar(delta+1);
+ delta += 1+1;
+ } else {
+ regNoL = 16; /* use XMM16 as an intermediary */
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 1 );
+ /* alignment check ???? . */
+ stmt( IRStmt_Put( OFFB_YMM16, loadLE(Ity_V128, mkexpr(addr)) ));
+ imm = getUChar(delta+alen);
+ delta += alen+1;
+ }
+
+ /* Who ya gonna call? Presumably not Ghostbusters. */
+ void* fn = &amd64g_dirtyhelper_AESKEYGENASSIST;
+ HChar* nm = "amd64g_dirtyhelper_AESKEYGENASSIST";
+
+ /* Round up the arguments. Note that this is a kludge -- the
+ use of mkU64 rather than mkIRExpr_HWord implies the
+ assumption that the host's word size is 64-bit. */
+ UInt gstOffL = regNoL == 16 ? OFFB_YMM16 : ymmGuestRegOffset(regNoL);
+ UInt gstOffR = ymmGuestRegOffset(regNoR);
+
+ IRExpr* imme = mkU64(imm & 0xFF);
+ IRExpr* gstOffLe = mkU64(gstOffL);
+ IRExpr* gstOffRe = mkU64(gstOffR);
+ IRExpr** args
+ = mkIRExprVec_3( imme, gstOffLe, gstOffRe );
+
+ IRDirty* d = unsafeIRDirty_0_N( 0/*regparms*/, nm, fn, args );
+ /* It's not really a dirty call, but we can't use the clean
+ helper mechanism here for the very lame reason that we can't
+ pass 2 x V128s by value to a helper, nor get one back. Hence
+ this roundabout scheme. */
+ d->needsBBP = True;
+ d->nFxState = 2;
+ vex_bzero(&d->fxState, sizeof(d->fxState));
+ d->fxState[0].fx = Ifx_Read;
+ d->fxState[0].offset = gstOffL;
+ d->fxState[0].size = sizeof(U128);
+ d->fxState[1].fx = Ifx_Write;
+ d->fxState[1].offset = gstOffR;
+ d->fxState[1].size = sizeof(U128);
+ stmt( IRStmt_Dirty(d) );
+
+ DIP("%saeskeygenassist $%x,%s,%s\n", isAvx ? "v" : "", (UInt)imm,
+ (regNoL == 16 ? dis_buf : nameXMMReg(regNoL)),
+ nameXMMReg(regNoR));
+ if (isAvx)
+ putYMMRegLane128( regNoR, 1, mkV128(0) );
+ return delta;
+}
+
+
__attribute__((noinline))
static
Long dis_ESC_0F38__SSE4 ( Bool* decode_OK,
@@ -16429,76 +16590,7 @@
DB /r = AESIMC xmm1, xmm2/m128 */
if (have66noF2noF3(pfx) && sz == 2) {
- UInt regNoL = 0;
- UInt regNoR = 0;
-
- /* This is a nasty kludge. We need to pass 2 x V128 to the
- helper. Since we can't do that, use a dirty
- helper to compute the results directly from the XMM regs in
- the guest state. That means for the memory case, we need to
- move the left operand into a pseudo-register (XMM16, let's
- call it). */
- modrm = getUChar(delta);
- if (epartIsReg(modrm)) {
- regNoL = eregOfRexRM(pfx, modrm);
- regNoR = gregOfRexRM(pfx, modrm);
- delta += 1;
- } else {
- regNoL = 16; /* use XMM16 as an intermediary */
- regNoR = gregOfRexRM(pfx, modrm);
- addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
- /* alignment check needed ???? */
- stmt( IRStmt_Put( OFFB_YMM16, loadLE(Ity_V128, mkexpr(addr)) ));
- delta += alen;
- }
-
- void* fn = &amd64g_dirtyhelper_AES;
- HChar* nm = "amd64g_dirtyhelper_AES";
-
- /* Round up the arguments. Note that this is a kludge -- the
- use of mkU64 rather than mkIRExpr_HWord implies the
- assumption that the host's word size is 64-bit. */
- UInt gstOffL = regNoL == 16 ? OFFB_YMM16 : ymmGuestRegOffset(regNoL);
- UInt gstOffR = ymmGuestRegOffset(regNoR);
- IRExpr* opc4 = mkU64(opc);
- IRExpr* gstOffLe = mkU64(gstOffL);
- IRExpr* gstOffRe = mkU64(gstOffR);
- IRExpr** args
- = mkIRExprVec_3( opc4, gstOffLe, gstOffRe );
-
- IRDirty* d = unsafeIRDirty_0_N( 0/*regparms*/, nm, fn, args );
- /* It's not really a dirty call, but we can't use the clean
- helper mechanism here for the very lame reason that we can't
- pass 2 x V128s by value to a helper, nor get one back. Hence
- this roundabout scheme. */
- d->needsBBP = True;
- d->nFxState = 2;
- vex_bzero(&d->fxState, sizeof(d->fxState));
- /* AES{ENC,ENCLAST,DEC,DECLAST} read both registers, and writes
- the second.
- AESIMC (0xDB) reads the first register, and writes the second. */
- d->fxState[0].fx = Ifx_Read;
- d->fxState[0].offset = gstOffL;
- d->fxState[0].size = sizeof(U128);
- d->fxState[1].fx = (opc == 0xDB ? Ifx_Write : Ifx_Modify);
- d->fxState[1].offset = gstOffR;
- d->fxState[1].size = sizeof(U128);
-
- stmt( IRStmt_Dirty(d) );
- {
- HChar* opsuf;
- switch (opc) {
- case 0xDC: opsuf = "enc"; break;
- case 0XDD: opsuf = "enclast"; break;
- case 0xDE: opsuf = "dec"; break;
- case 0xDF: opsuf = "declast"; break;
- case 0xDB: opsuf = "imc"; break;
- default: vassert(0);
- }
- DIP("aes%s %s,%s\n", opsuf,
- (regNoL == 16 ? dis_buf : nameXMMReg(regNoL)),
- nameXMMReg(regNoR));
- }
+ delta = dis_AESx( vbi, pfx, delta, False/*!isAvx*/, opc );
goto decode_success;
}
break;
@@ -17196,6 +17288,33 @@
}
+static IRTemp math_PCLMULQDQ( IRTemp dV, IRTemp sV, UInt imm8 )
+{
+ IRTemp t0 = newTemp(Ity_I64);
+ IRTemp t1 = newTemp(Ity_I64);
+ assign(t0, unop((imm8&1)? Iop_V128HIto64 : Iop_V128to64,
+ mkexpr(dV)));
+ assign(t1, unop((imm8&16) ? Iop_V128HIto64 : Iop_V128to64,
+ mkexpr(sV)));
+
+ IRTemp t2 = newTemp(Ity_I64);
+ IRTemp t3 = newTemp(Ity_I64);
+
+ IRExpr** args;
+
+ args = mkIRExprVec_3(mkexpr(t0), mkexpr(t1), mkU64(0));
+ assign(t2, mkIRExprCCall(Ity_I64,0, "amd64g_calculate_pclmul",
+ &amd64g_calculate_pclmul, args));
+ args = mkIRExprVec_3(mkexpr(t0), mkexpr(t1), mkU64(1));
+ assign(t3, mkIRExprCCall(Ity_I64,0, "amd64g_calculate_pclmul",
+ &amd64g_calculate_pclmul, args));
+
+ IRTemp res = newTemp(Ity_V128);
+ assign(res, binop(Iop_64HLtoV128, mkexpr(t3), mkexpr(t2)));
+ return res;
+}
+
+
__attribute__((noinline))
static
Long dis_ESC_0F3A__SSE4 ( Bool* decode_OK,
@@ -17203,10 +17322,6 @@
Prefix pfx, Int sz, Long deltaIN )
{
IRTemp addr = IRTemp_INVALID;
- IRTemp t0 = IRTemp_INVALID;
- IRTemp t1 = IRTemp_INVALID;
- IRTemp t2 = IRTemp_INVALID;
- IRTemp t3 = IRTemp_INVALID;
UChar modrm = 0;
Int alen = 0;
HChar dis_buf[50];
@@ -17805,18 +17920,18 @@
Int imm8;
IRTemp svec = newTemp(Ity_V128);
IRTemp dvec = newTemp(Ity_V128);
+ modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx, modrm);
- modrm = getUChar(delta);
-
- assign( dvec, getXMMReg( gregOfRexRM(pfx, modrm) ) );
+ assign( dvec, getXMMReg(rG) );
if ( epartIsReg( modrm ) ) {
+ UInt rE = eregOfRexRM(pfx, modrm);
imm8 = (Int)getUChar(delta+1);
- assign( svec, getXMMReg( eregOfRexRM(pfx, modrm) ) );
+ assign( svec, getXMMReg(rE) );
delta += 1+1;
DIP( "pclmulqdq $%d, %s,%s\n", imm8,
- nameXMMReg( eregOfRexRM(pfx, modrm) ),
- nameXMMReg( gregOfRexRM(pfx, modrm) ) );
+ nameXMMReg(rE), nameXMMReg(rG) );
} else {
addr = disAMode( &alen, vbi, pfx, delta, dis_buf,
1/* imm8 is 1 byte after the amode */ );
@@ -17825,34 +17940,10 @@
imm8 = (Int)getUChar(delta+alen);
delta += alen+1;
DIP( "pclmulqdq $%d, %s,%s\n",
- imm8, dis_buf, nameXMMReg( gregOfRexRM(pfx, modrm) ) );
+ imm8, dis_buf, nameXMMReg(rG) );
}
- t0 = newTemp(Ity_I64);
- t1 = newTemp(Ity_I64);
- assign(t0, unop((imm8&1)? Iop_V128HIto64 : Iop_V128to64,
- mkexpr(dvec)));
- assign(t1, unop((imm8&16) ? Iop_V128HIto64 : Iop_V128to64,
- mkexpr(svec)));
-
- t2 = newTemp(Ity_I64);
- t3 = newTemp(Ity_I64);
-
- IRExpr** args;
-
- args = mkIRExprVec_3(mkexpr(t0), mkexpr(t1), mkU64(0));
- assign(t2,
- mkIRExprCCall(Ity_I64,0, "amd64g_calculate_pclmul",
- &amd64g_calculate_pclmul, args));
- args = mkIRExprVec_3(mkexpr(t0), mkexpr(t1), mkU64(1));
- assign(t3,
- mkIRExprCCall(Ity_I64,0, "amd64g_calculate_pclmul",
- &amd64g_calculate_pclmul, args));
-
- IRTemp res = newTemp(Ity_V128);
- assign(res, binop(Iop_64HLtoV128, mkexpr(t3), mkexpr(t2)));
- putXMMReg( gregOfRexRM(pfx,modrm), mkexpr(res) );
-
+ putXMMReg( rG, mkexpr( math_PCLMULQDQ(dvec, svec, imm8) ) );
goto decode_success;
}
break;
@@ -17879,63 +17970,7 @@
case 0xDF:
/* 66 0F 3A DF /r ib = AESKEYGENASSIST imm8, xmm2/m128, xmm1 */
if (have66noF2noF3(pfx) && sz == 2) {
- UInt regNoL = 0;
- UInt regNoR = 0;
- UChar imm = 0;
-
- /* This is a nasty kludge. See AESENC et al. instructions. */
- modrm = getUChar(delta);
- if (epartIsReg(modrm)) {
- regNoL = eregOfRexRM(pfx, modrm);
- regNoR = gregOfRexRM(pfx, modrm);
- imm = getUChar(delta+1);
- delta += 1+1;
- } else {
- regNoL = 16; /* use XMM16 as an intermediary */
- regNoR = gregOfRexRM(pfx, modrm);
- addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 1 );
- /* alignment check ???? . */
- stmt( IRStmt_Put( OFFB_YMM16, loadLE(Ity_V128, mkexpr(addr)) ));
- imm = getUChar(delta+alen);
- delta += alen+1;
- }
-
- /* Who ya gonna call? Presumably not Ghostbusters. */
- void* fn = &amd64g_dirtyhelper_AESKEYGENASSIST;
- HChar* nm = "amd64g_dirtyhelper_AESKEYGENASSIST";
-
- /* Round up the arguments. Note that this is a kludge -- the
- use of mkU64 rather than mkIRExpr_HWord implies the
- assumption that the host's word size is 64-bit. */
- UInt gstOffL = regNoL == 16 ? OFFB_YMM16 : ymmGuestRegOffset(regNoL);
- UInt gstOffR = ymmGuestRegOffset(regNoR);
-
- IRExpr* imme = mkU64(imm & 0xFF);
- IRExpr* gstOffLe = mkU64(gstOffL);
- IRExpr* gstOffRe = mkU64(gstOffR);
- IRExpr** args
- = mkIRExprVec_3( imme, gstOffLe, gstOffRe );
-
- IRDirty* d = unsafeIRDirty_0_N( 0/*regparms*/, nm, fn, args );
- /* It's not really a dirty call, but we can't use the clean
- helper mechanism here for the very lame reason that we can't
- pass 2 x V128s by value to a helper, nor get one back. Hence
- this roundabout scheme. */
- d->needsBBP = True;
- d->nFxState = 2;
- vex_bzero(&d->fxState, sizeof(d->fxState));
- d->fxState[0].fx = Ifx_Read;
- d->fxState[0].offset = gstOffL;
- d->fxState[0].size = sizeof(U128);
- d->fxState[1].fx = Ifx_Write;
- d->fxState[1].offset = gstOffR;
- d->fxState[1].size = sizeof(U128);
- stmt( IRStmt_Dirty(d) );
-
- DIP("aeskeygenassist $%x,%s,%s\n", (UInt)imm,
- (regNoL == 16 ? dis_buf : nameXMMReg(regNoL)),
- nameXMMReg(regNoR));
-
+ delta = dis_AESKEYGENASSIST( vbi, pfx, delta, False/*!isAvx*/ );
goto decode_success;
}
break;
@@ -25052,6 +25087,23 @@
}
break;
+ case 0xDB:
+ case 0xDC:
+ case 0xDD:
+ case 0xDE:
+ case 0xDF:
+ /* VAESIMC xmm2/m128, xmm1 = VEX.128.66.0F38.WIG DB /r */
+ /* VAESENC xmm3/m128, xmm2, xmm1 = VEX.128.66.0F38.WIG DC /r */
+ /* VAESENCLAST xmm3/m128, xmm2, xmm1 = VEX.128.66.0F38.WIG DD /r */
+ /* VAESDEC xmm3/m128, xmm2, xmm1 = VEX.128.66.0F38.WIG DE /r */
+ /* VAESDECLAST xmm3/m128, xmm2, xmm1 = VEX.128.66.0F38.WIG DF /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AESx( vbi, pfx, delta, True/*isAvx*/, opc );
+ if (opc != 0xDB) *uses_vvvv = True;
+ goto decode_success;
+ }
+ break;
+
default:
break;
@@ -26135,6 +26187,46 @@
}
break;
+ case 0x44:
+ /* VPCLMULQDQ imm8, xmm3/m128,xmm2,xmm1 */
+ /* VPCLMULQDQ = VEX.NDS.128.66.0F3A.WIG 44 /r ib */
+ /* 66 0F 3A 44 /r ib = PCLMULQDQ xmm1, xmm2/m128, imm8
+ * Carry-less multiplication of selected XMM quadwords into XMM
+ * registers (a.k.a multiplication of polynomials over GF(2))
+ */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ UChar modrm = getUChar(delta);
+ Int imm8;
+ IRTemp sV = newTemp(Ity_V128);
+ IRTemp dV = newTemp(Ity_V128);
+ UInt rG = gregOfRexRM(pfx, modrm);
+ UInt rV = getVexNvvvv(pfx);
+
+ assign( dV, getXMMReg(rV) );
+
+ if ( epartIsReg( modrm ) ) {
+ UInt rE = eregOfRexRM(pfx, modrm);
+ imm8 = (Int)getUChar(delta+1);
+ assign( sV, getXMMReg(rE) );
+ delta += 1+1;
+ DIP( "vpclmulqdq $%d, %s,%s,%s\n", imm8,
+ nameXMMReg(rE), nameXMMReg(rV), nameXMMReg(rG) );
+ } else {
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf,
+ 1/* imm8 is 1 byte after the amode */ );
+ assign( sV, loadLE( Ity_V128, mkexpr(addr) ) );
+ imm8 = (Int)getUChar(delta+alen);
+ delta += alen+1;
+ DIP( "vpclmulqdq $%d, %s,%s,%s\n",
+ imm8, dis_buf, nameXMMReg(rV), nameXMMReg(rG) );
+ }
+
+ putYMMRegLoAndZU( rG, mkexpr( math_PCLMULQDQ(dV, sV, imm8) ) );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ break;
+
case 0x4A:
/* VBLENDVPS xmmG, xmmE/memE, xmmV, xmmIS4
::: xmmG:V128 = PBLEND(xmmE, xmmV, xmmIS4) (RMVR) */
@@ -26208,6 +26300,14 @@
}
break;
+ case 0xDF:
+ /* VAESKEYGENASSIST imm8, xmm2/m128, xmm1 = VEX.128.66.0F3A.WIG DF /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AESKEYGENASSIST( vbi, pfx, delta, True/*isAvx*/ );
+ goto decode_success;
+ }
+ break;
+
default:
break;
From: <sv...@va...> - 2012-06-24 14:58:15
sewardj 2012-06-24 15:58:08 +0100 (Sun, 24 Jun 2012)
New Revision: 12671
Log:
Update.
Modified files:
trunk/none/tests/amd64/avx-1.c
Modified: trunk/none/tests/amd64/avx-1.c (+88 -0)
===================================================================
--- trunk/none/tests/amd64/avx-1.c 2012-06-24 15:27:06 +01:00 (rev 12670)
+++ trunk/none/tests/amd64/avx-1.c 2012-06-24 15:58:08 +01:00 (rev 12671)
@@ -1966,6 +1966,74 @@
"vmpsadbw $7, %%xmm6, %%xmm8, %%xmm7",
"vmpsadbw $7, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VMOVDDUP_YMMorMEM256_to_YMM,
+ "vmovddup %%ymm8, %%ymm7",
+ "vmovddup (%%rax), %%ymm9")
+
+GEN_test_Monly(VMOVLPS_128_M64_XMM_XMM, "vmovlps (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_Monly(VMOVLPS_128_XMM_M64, "vmovlps %%xmm7, (%%rax)")
+
+GEN_test_RandM(VRCPSS_128,
+ "vrcpss %%xmm6, %%xmm8, %%xmm7",
+ "vrcpss (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VRCPPS_128,
+ "vrcpps %%xmm6, %%xmm8",
+ "vrcpps (%%rax), %%xmm8")
+
+GEN_test_RandM(VRCPPS_256,
+ "vrcpps %%ymm6, %%ymm8",
+ "vrcpps (%%rax), %%ymm8")
+
+GEN_test_RandM(VPSADBW_128,
+ "vpsadbw %%xmm6, %%xmm8, %%xmm7",
+ "vpsadbw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPSIGNB_128,
+ "vpsignb %%xmm6, %%xmm8, %%xmm7",
+ "vpsignb (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPSIGNW_128,
+ "vpsignw %%xmm6, %%xmm8, %%xmm7",
+ "vpsignw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPSIGND_128,
+ "vpsignd %%xmm6, %%xmm8, %%xmm7",
+ "vpsignd (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPMULHRSW_128,
+ "vpmulhrsw %%xmm6, %%xmm8, %%xmm7",
+ "vpmulhrsw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_Monly(VBROADCASTF128,
+ "vbroadcastf128 (%%rax), %%ymm9")
+
+GEN_test_RandM(VPEXTRW_128_0x0,
+ "vpextrw $0x0, %%xmm7, %%r14d",
+ "vpextrw $0x0, %%xmm7, (%%rax)")
+GEN_test_RandM(VPEXTRW_128_0x1,
+ "vpextrw $0x1, %%xmm7, %%r14d",
+ "vpextrw $0x1, %%xmm7, (%%rax)")
+GEN_test_RandM(VPEXTRW_128_0x2,
+ "vpextrw $0x2, %%xmm7, %%r14d",
+ "vpextrw $0x2, %%xmm7, (%%rax)")
+GEN_test_RandM(VPEXTRW_128_0x3,
+ "vpextrw $0x3, %%xmm7, %%r14d",
+ "vpextrw $0x3, %%xmm7, (%%rax)")
+GEN_test_RandM(VPEXTRW_128_0x4,
+ "vpextrw $0x4, %%xmm7, %%r14d",
+ "vpextrw $0x4, %%xmm7, (%%rax)")
+GEN_test_RandM(VPEXTRW_128_0x5,
+ "vpextrw $0x5, %%xmm7, %%r14d",
+ "vpextrw $0x5, %%xmm7, (%%rax)")
+GEN_test_RandM(VPEXTRW_128_0x6,
+ "vpextrw $0x6, %%xmm7, %%r14d",
+ "vpextrw $0x6, %%xmm7, (%%rax)")
+GEN_test_RandM(VPEXTRW_128_0x7,
+ "vpextrw $0x7, %%xmm7, %%r14d",
+ "vpextrw $0x7, %%xmm7, (%%rax)")
+
/* Comment duplicated above, for convenient reference:
Allowed operands in test insns:
Reg form: %ymm6, %ymm7, %ymm8, %ymm9 and %r14.
@@ -2452,5 +2520,25 @@
DO_D( VMPSADBW_128_0x5 );
DO_D( VMPSADBW_128_0x6 );
DO_D( VMPSADBW_128_0x7 );
+ DO_D( VMOVDDUP_YMMorMEM256_to_YMM );
+ DO_D( VMOVLPS_128_M64_XMM_XMM );
+ DO_D( VMOVLPS_128_XMM_M64 );
+ DO_D( VRCPSS_128 );
+ DO_D( VRCPPS_128 );
+ DO_D( VRCPPS_256 );
+ DO_D( VPSADBW_128 );
+ DO_D( VPSIGNB_128 );
+ DO_D( VPSIGNW_128 );
+ DO_D( VPSIGND_128 );
+ DO_D( VPMULHRSW_128 );
+ DO_D( VBROADCASTF128 );
+ DO_D( VPEXTRW_128_0x0 );
+ DO_D( VPEXTRW_128_0x1 );
+ DO_D( VPEXTRW_128_0x2 );
+ DO_D( VPEXTRW_128_0x3 );
+ DO_D( VPEXTRW_128_0x4 );
+ DO_D( VPEXTRW_128_0x5 );
+ DO_D( VPEXTRW_128_0x6 );
+ DO_D( VPEXTRW_128_0x7 );
return 0;
}
From: <sv...@va...> - 2012-06-24 14:58:07
sewardj 2012-06-24 15:57:59 +0100 (Sun, 24 Jun 2012)
New Revision: 2409
Log:
More AVX insns:
VMOVDDUP ymm2/m256, ymm1 = VEX.256.F2.0F.WIG 12 /r
VMOVLPS m64, xmm1, xmm2 = VEX.NDS.128.0F.WIG 12 /r
VMOVLPS xmm1, m64 = VEX.128.0F.WIG 13 /r
VRCPSS xmm3/m64(E), xmm2(V), xmm1(G) = VEX.NDS.LIG.F3.0F.WIG 53 /r
VRCPPS xmm2/m128(E), xmm1(G) = VEX.NDS.128.0F.WIG 53 /r
VRCPPS ymm2/m256(E), ymm1(G) = VEX.NDS.256.0F.WIG 53 /r
VPSADBW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG F6 /r
VPSIGNB xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 08 /r
VPSIGNW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 09 /r
VPSIGND xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 0A /r
VPMULHRSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 0B /r
VBROADCASTF128 m128, ymm1 = VEX.256.66.0F38.WIG 1A /r
VPEXTRW = VEX.128.66.0F3A.W0 15 /r ib
(Jakub Jelinek, ja...@re...), #273475 comment 137.
Modified files:
trunk/priv/guest_amd64_toIR.c
trunk/priv/host_amd64_isel.c
trunk/priv/ir_defs.c
trunk/pub/libvex_ir.h
Modified: trunk/priv/ir_defs.c (+2 -0)
===================================================================
--- trunk/priv/ir_defs.c 2012-06-24 15:26:30 +01:00 (rev 2408)
+++ trunk/priv/ir_defs.c 2012-06-24 15:57:59 +01:00 (rev 2409)
@@ -619,6 +619,7 @@
case Iop_Recip32x2: vex_printf("Recip32x2"); return;
case Iop_Recip32Fx2: vex_printf("Recip32Fx2"); return;
case Iop_Recip32Fx4: vex_printf("Recip32Fx4"); return;
+ case Iop_Recip32Fx8: vex_printf("Recip32Fx8"); return;
case Iop_Recip32x4: vex_printf("Recip32x4"); return;
case Iop_Recip32F0x4: vex_printf("Recip32F0x4"); return;
case Iop_Recip64Fx2: vex_printf("Recip64Fx2"); return;
@@ -2826,6 +2827,7 @@
case Iop_RSqrt32Fx8:
case Iop_Sqrt32Fx8:
case Iop_Sqrt64Fx4:
+ case Iop_Recip32Fx8:
UNARY(Ity_V256, Ity_V256);
default:
Modified: trunk/priv/host_amd64_isel.c (+1 -0)
===================================================================
--- trunk/priv/host_amd64_isel.c 2012-06-24 15:26:30 +01:00 (rev 2408)
+++ trunk/priv/host_amd64_isel.c 2012-06-24 15:57:59 +01:00 (rev 2409)
@@ -3444,6 +3444,7 @@
return;
}
+ case Iop_Recip32Fx8: op = Asse_RCPF; goto do_32Fx8_unary;
case Iop_Sqrt32Fx8: op = Asse_SQRTF; goto do_32Fx8_unary;
case Iop_RSqrt32Fx8: op = Asse_RSQRTF; goto do_32Fx8_unary;
do_32Fx8_unary:
Modified: trunk/priv/guest_amd64_toIR.c (+318 -91)
===================================================================
--- trunk/priv/guest_amd64_toIR.c 2012-06-24 15:26:30 +01:00 (rev 2408)
+++ trunk/priv/guest_amd64_toIR.c 2012-06-24 15:57:59 +01:00 (rev 2409)
@@ -1493,6 +1493,11 @@
return IRExpr_Get( ymmGuestRegLane128offset(ymmreg,laneno), Ity_V128 );
}
+static IRExpr* getYMMRegLane64 ( UInt ymmreg, Int laneno )
+{
+ return IRExpr_Get( ymmGuestRegLane64offset(ymmreg,laneno), Ity_I64 );
+}
+
static IRExpr* getYMMRegLane32 ( UInt ymmreg, Int laneno )
{
return IRExpr_Get( ymmGuestRegLane32offset(ymmreg,laneno), Ity_I32 );
@@ -1516,6 +1521,12 @@
stmt( IRStmt_Put( ymmGuestRegLane64offset(ymmreg,laneno), e ) );
}
+static void putYMMRegLane64 ( UInt ymmreg, Int laneno, IRExpr* e )
+{
+ vassert(typeOfIRExpr(irsb->tyenv,e) == Ity_I64);
+ stmt( IRStmt_Put( ymmGuestRegLane64offset(ymmreg,laneno), e ) );
+}
+
static void putYMMRegLane32F ( UInt ymmreg, Int laneno, IRExpr* e )
{
vassert(typeOfIRExpr(irsb->tyenv,e) == Ity_F32);
@@ -10866,6 +10877,29 @@
}
+static IRTemp math_PSADBW_128 ( IRTemp dV, IRTemp sV )
+{
+ IRTemp s1, s0, d1, d0;
+ s1 = s0 = d1 = d0 = IRTemp_INVALID;
+
+ breakupV128to64s( sV, &s1, &s0 );
+ breakupV128to64s( dV, &d1, &d0 );
+
+ IRTemp res = newTemp(Ity_V128);
+ assign( res,
+ binop(Iop_64HLtoV128,
+ mkIRExprCCall(Ity_I64, 0/*regparms*/,
+ "amd64g_calculate_mmx_psadbw",
+ &amd64g_calculate_mmx_psadbw,
+ mkIRExprVec_2( mkexpr(s1), mkexpr(d1))),
+ mkIRExprCCall(Ity_I64, 0/*regparms*/,
+ "amd64g_calculate_mmx_psadbw",
+ &amd64g_calculate_mmx_psadbw,
+ mkIRExprVec_2( mkexpr(s0), mkexpr(d0)))) );
+ return res;
+}
+
+
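For reference, the per-64-bit-lane semantics that the `amd64g_calculate_mmx_psadbw` helper computes (and which `math_PSADBW_128` applies to both halves of the vector) can be sketched in plain C. The function name below is illustrative only, not part of VEX:

```c
#include <stdint.h>
#include <stdlib.h>

/* One 64-bit PSADBW lane: sum of absolute differences of the eight
   byte pairs, returned as a u16 in the low bits, upper 48 bits zero. */
static uint64_t psadbw_64(uint64_t s, uint64_t d) {
    uint32_t sum = 0;
    for (int i = 0; i < 8; i++) {
        int a = (int)((s >> (8*i)) & 0xFF);
        int b = (int)((d >> (8*i)) & 0xFF);
        sum += (uint32_t)abs(a - b);
    }
    return (uint64_t)(sum & 0xFFFF);
}
```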
static Long dis_MASKMOVDQU ( VexAbiInfo* vbi, Prefix pfx,
Long delta, Bool isAvx )
{
@@ -13818,47 +13852,24 @@
/* 66 0F F6 = PSADBW -- 2 x (8x8 -> 48 zeroes ++ u16) Sum Abs Diffs
from E(xmm or mem) to G(xmm) */
if (have66noF2noF3(pfx) && sz == 2) {
- IRTemp s1V = newTemp(Ity_V128);
- IRTemp s2V = newTemp(Ity_V128);
- IRTemp dV = newTemp(Ity_V128);
- IRTemp s1Hi = newTemp(Ity_I64);
- IRTemp s1Lo = newTemp(Ity_I64);
- IRTemp s2Hi = newTemp(Ity_I64);
- IRTemp s2Lo = newTemp(Ity_I64);
- IRTemp dHi = newTemp(Ity_I64);
- IRTemp dLo = newTemp(Ity_I64);
+ IRTemp sV = newTemp(Ity_V128);
+ IRTemp dV = newTemp(Ity_V128);
modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
if (epartIsReg(modrm)) {
- assign( s1V, getXMMReg(eregOfRexRM(pfx,modrm)) );
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getXMMReg(rE) );
delta += 1;
- DIP("psadbw %s,%s\n", nameXMMReg(eregOfRexRM(pfx,modrm)),
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("psadbw %s,%s\n", nameXMMReg(rE), nameXMMReg(rG));
} else {
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
- assign( s1V, loadLE(Ity_V128, mkexpr(addr)) );
+ assign( sV, loadLE(Ity_V128, mkexpr(addr)) );
delta += alen;
- DIP("psadbw %s,%s\n", dis_buf,
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("psadbw %s,%s\n", dis_buf, nameXMMReg(rG));
}
- assign( s2V, getXMMReg(gregOfRexRM(pfx,modrm)) );
- assign( s1Hi, unop(Iop_V128HIto64, mkexpr(s1V)) );
- assign( s1Lo, unop(Iop_V128to64, mkexpr(s1V)) );
- assign( s2Hi, unop(Iop_V128HIto64, mkexpr(s2V)) );
- assign( s2Lo, unop(Iop_V128to64, mkexpr(s2V)) );
- assign( dHi, mkIRExprCCall(
- Ity_I64, 0/*regparms*/,
- "amd64g_calculate_mmx_psadbw",
- &amd64g_calculate_mmx_psadbw,
- mkIRExprVec_2( mkexpr(s1Hi), mkexpr(s2Hi))
- ));
- assign( dLo, mkIRExprCCall(
- Ity_I64, 0/*regparms*/,
- "amd64g_calculate_mmx_psadbw",
- &amd64g_calculate_mmx_psadbw,
- mkIRExprVec_2( mkexpr(s1Lo), mkexpr(s2Lo))
- ));
- assign( dV, binop(Iop_64HLtoV128, mkexpr(dHi), mkexpr(dLo))) ;
- putXMMReg(gregOfRexRM(pfx,modrm), mkexpr(dV));
+ assign( dV, getXMMReg(rG) );
+ putXMMReg( rG, mkexpr( math_PSADBW_128 ( dV, sV ) ) );
+
goto decode_success;
}
break;
@@ -14000,6 +14011,38 @@
}
+static Long dis_MOVDDUP_256 ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta )
+{
+ IRTemp addr = IRTemp_INVALID;
+ Int alen = 0;
+ HChar dis_buf[50];
+ IRTemp d0 = newTemp(Ity_I64);
+ IRTemp d1 = newTemp(Ity_I64);
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ if (epartIsReg(modrm)) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ DIP("vmovddup %s,%s\n", nameYMMReg(rE), nameYMMReg(rG));
+ delta += 1;
+ assign ( d0, getYMMRegLane64(rE, 0) );
+ assign ( d1, getYMMRegLane64(rE, 2) );
+ } else {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( d0, loadLE(Ity_I64, mkexpr(addr)) );
+ assign( d1, loadLE(Ity_I64, binop(Iop_Add64,
+ mkexpr(addr), mkU64(16))) );
+ DIP("vmovddup %s,%s\n", dis_buf, nameYMMReg(rG));
+ delta += alen;
+ }
+ putYMMRegLane64( rG, 0, mkexpr(d0) );
+ putYMMRegLane64( rG, 1, mkexpr(d0) );
+ putYMMRegLane64( rG, 2, mkexpr(d1) );
+ putYMMRegLane64( rG, 3, mkexpr(d1) );
+ return delta;
+}
+
+
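The lane shuffling done by `dis_MOVDDUP_256` above amounts to duplicating the even 64-bit lanes of the source into each 128-bit half of the destination. A minimal sketch (helper name is illustrative, not VEX code):

```c
#include <stdint.h>

/* 256-bit VMOVDDUP: lanes 0 and 2 of the source are each duplicated
   into the corresponding 128-bit half of the destination. */
static void vmovddup_256(uint64_t dst[4], const uint64_t src[4]) {
    dst[0] = src[0];
    dst[1] = src[0];
    dst[2] = src[2];
    dst[3] = src[2];
}
```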
static Long dis_MOVSxDUP_128 ( VexAbiInfo* vbi, Prefix pfx,
Long delta, Bool isAvx, Bool isL )
{
@@ -16544,6 +16587,61 @@
/*--- ---*/
/*------------------------------------------------------------*/
+static Long dis_PEXTRW ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx )
+{
+ IRTemp addr = IRTemp_INVALID;
+ IRTemp t0 = IRTemp_INVALID;
+ IRTemp t1 = IRTemp_INVALID;
+ IRTemp t2 = IRTemp_INVALID;
+ IRTemp t3 = IRTemp_INVALID;
+ UChar modrm = getUChar(delta);
+ Int alen = 0;
+ HChar dis_buf[50];
+ UInt rG = gregOfRexRM(pfx,modrm);
+ Int imm8_20;
+ IRTemp xmm_vec = newTemp(Ity_V128);
+ IRTemp d16 = newTemp(Ity_I16);
+ HChar* mbV = isAvx ? "v" : "";
+
+ vassert(0==getRexW(pfx)); /* ensured by caller */
+ assign( xmm_vec, getXMMReg(rG) );
+ breakupV128to32s( xmm_vec, &t3, &t2, &t1, &t0 );
+
+ if ( epartIsReg( modrm ) ) {
+ imm8_20 = (Int)(getUChar(delta+1) & 7);
+ } else {
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 1 );
+ imm8_20 = (Int)(getUChar(delta+alen) & 7);
+ }
+
+ switch (imm8_20) {
+ case 0: assign(d16, unop(Iop_32to16, mkexpr(t0))); break;
+ case 1: assign(d16, unop(Iop_32HIto16, mkexpr(t0))); break;
+ case 2: assign(d16, unop(Iop_32to16, mkexpr(t1))); break;
+ case 3: assign(d16, unop(Iop_32HIto16, mkexpr(t1))); break;
+ case 4: assign(d16, unop(Iop_32to16, mkexpr(t2))); break;
+ case 5: assign(d16, unop(Iop_32HIto16, mkexpr(t2))); break;
+ case 6: assign(d16, unop(Iop_32to16, mkexpr(t3))); break;
+ case 7: assign(d16, unop(Iop_32HIto16, mkexpr(t3))); break;
+ default: vassert(0);
+ }
+
+ if ( epartIsReg( modrm ) ) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ putIReg32( rE, unop(Iop_16Uto32, mkexpr(d16)) );
+ delta += 1+1;
+ DIP( "%spextrw $%d, %s,%s\n", mbV, imm8_20,
+ nameXMMReg( rG ), nameIReg32( rE ) );
+ } else {
+ storeLE( mkexpr(addr), mkexpr(d16) );
+ delta += alen+1;
+ DIP( "%spextrw $%d, %s,%s\n", mbV, imm8_20, nameXMMReg( rG ), dis_buf );
+ }
+ return delta;
+}
+
+
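The eight-way switch in `dis_PEXTRW` above selects one 16-bit lane by `imm8 & 7` and zero-extends it. Viewed as an array of words, the semantics reduce to (illustrative helper, not VEX code):

```c
#include <stdint.h>

/* PEXTRW: pick word (imm8 & 7) from a 128-bit vector viewed as
   eight 16-bit lanes, zero-extended to 32 bits. */
static uint32_t pextrw(const uint16_t xmm[8], int imm8) {
    return (uint32_t)xmm[imm8 & 7];
}
```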
static Long dis_PEXTRD ( VexAbiInfo* vbi, Prefix pfx,
Long delta, Bool isAvx )
{
@@ -17423,48 +17521,7 @@
Extract Word from xmm, store in mem or zero-extend + store in gen.reg.
(XMM) */
if (have66noF2noF3(pfx) && sz == 2) {
-
- Int imm8_20;
- IRTemp xmm_vec = newTemp(Ity_V128);
- IRTemp src_word = newTemp(Ity_I16);
-
- modrm = getUChar(delta);
- assign( xmm_vec, getXMMReg( gregOfRexRM(pfx,modrm) ) );
- breakupV128to32s( xmm_vec, &t3, &t2, &t1, &t0 );
-
- if ( epartIsReg( modrm ) ) {
- imm8_20 = (Int)(getUChar(delta+1) & 7);
- } else {
- addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 1 );
- imm8_20 = (Int)(getUChar(delta+alen) & 7);
- }
-
- switch ( imm8_20 ) {
- case 0: assign( src_word, unop(Iop_32to16, mkexpr(t0)) ); break;
- case 1: assign( src_word, unop(Iop_32HIto16, mkexpr(t0)) ); break;
- case 2: assign( src_word, unop(Iop_32to16, mkexpr(t1)) ); break;
- case 3: assign( src_word, unop(Iop_32HIto16, mkexpr(t1)) ); break;
- case 4: assign( src_word, unop(Iop_32to16, mkexpr(t2)) ); break;
- case 5: assign( src_word, unop(Iop_32HIto16, mkexpr(t2)) ); break;
- case 6: assign( src_word, unop(Iop_32to16, mkexpr(t3)) ); break;
- case 7: assign( src_word, unop(Iop_32HIto16, mkexpr(t3)) ); break;
- default: vassert(0);
- }
-
- if ( epartIsReg( modrm ) ) {
- putIReg64( eregOfRexRM(pfx,modrm),
- unop(Iop_16Uto64, mkexpr(src_word)) );
- delta += 1+1;
- DIP( "pextrw $%d, %s,%s\n", imm8_20,
- nameXMMReg( gregOfRexRM(pfx, modrm) ),
- nameIReg64( eregOfRexRM(pfx, modrm) ) );
- } else {
- storeLE( mkexpr(addr), mkexpr(src_word) );
- delta += alen+1;
- DIP( "pextrw $%d, %s,%s\n",
- imm8_20, nameXMMReg( gregOfRexRM(pfx, modrm) ), dis_buf );
- }
-
+ delta = dis_PEXTRW( vbi, pfx, delta, False/*!isAvx*/ );
goto decode_success;
}
break;
@@ -21519,6 +21576,11 @@
delta = dis_MOVDDUP_128( vbi, pfx, delta, True/*isAvx*/ );
goto decode_success;
}
+ /* VMOVDDUP ymm2/m256, ymm1 = VEX.256.F2.0F.WIG 12 /r */
+ if (haveF2no66noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ delta = dis_MOVDDUP_256( vbi, pfx, delta );
+ goto decode_success;
+ }
/* VMOVHLPS xmm3, xmm2, xmm1 = VEX.NDS.128.0F.WIG 12 /r */
/* Insn only exists in reg form */
if (haveNo66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/
@@ -21538,10 +21600,12 @@
*uses_vvvv = True;
goto decode_success;
}
+ /* VMOVLPS m64, xmm1, xmm2 = VEX.NDS.128.0F.WIG 12 /r */
+ /* Insn exists only in mem form, it appears. */
/* VMOVLPD m64, xmm1, xmm2 = VEX.NDS.128.66.0F.WIG 12 /r */
/* Insn exists only in mem form, it appears. */
- if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/
- && !epartIsReg(getUChar(delta))) {
+ if ((have66noF2noF3(pfx) || haveNo66noF2noF3(pfx))
+ && 0==getVexL(pfx)/*128*/ && !epartIsReg(getUChar(delta))) {
UChar modrm = getUChar(delta);
UInt rG = gregOfRexRM(pfx, modrm);
UInt rV = getVexNvvvv(pfx);
@@ -21571,10 +21635,12 @@
break;
case 0x13:
+ /* VMOVLPS xmm1, m64 = VEX.128.0F.WIG 13 /r */
+ /* Insn exists only in mem form, it appears. */
/* VMOVLPD xmm1, m64 = VEX.128.66.0F.WIG 13 /r */
/* Insn exists only in mem form, it appears. */
- if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/
- && !epartIsReg(getUChar(delta))) {
+ if ((have66noF2noF3(pfx) || haveNo66noF2noF3(pfx))
+ && 0==getVexL(pfx)/*128*/ && !epartIsReg(getUChar(delta))) {
UChar modrm = getUChar(delta);
UInt rG = gregOfRexRM(pfx, modrm);
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
@@ -22224,6 +22290,27 @@
}
break;
+ case 0x53:
+ /* VRCPSS xmm3/m32(E), xmm2(V), xmm1(G) = VEX.NDS.LIG.F3.0F.WIG 53 /r */
+ if (haveF3no66noF2(pfx)) {
+ delta = dis_AVX128_E_V_to_G_lo32_unary(
+ uses_vvvv, vbi, pfx, delta, "vrcpss", Iop_Recip32F0x4 );
+ goto decode_success;
+ }
+ /* VRCPPS xmm2/m128(E), xmm1(G) = VEX.NDS.128.0F.WIG 53 /r */
+ if (haveNo66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AVX128_E_to_G_unary_all(
+ uses_vvvv, vbi, pfx, delta, "vrcpps", Iop_Recip32Fx4 );
+ goto decode_success;
+ }
+ /* VRCPPS ymm2/m256(E), ymm1(G) = VEX.NDS.256.0F.WIG 53 /r */
+ if (haveNo66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ delta = dis_AVX256_E_to_G_unary_all(
+ uses_vvvv, vbi, pfx, delta, "vrcpps", Iop_Recip32Fx8 );
+ goto decode_success;
+ }
+ break;
+
case 0x54:
/* VANDPD r/m, rV, r ::: r = rV & r/m */
/* VANDPD = VEX.NDS.128.66.0F.WIG 54 /r */
@@ -23301,14 +23388,20 @@
/* Moves from G to E, so is a store-form insn */
/* Intel docs list this in the VMOVD entry for some reason. */
if (have66noF2noF3(pfx)
- && 0==getVexL(pfx)/*128*/ && 1==getRexW(pfx)/*W1*/
- && epartIsReg(getUChar(delta))) {
+ && 0==getVexL(pfx)/*128*/ && 1==getRexW(pfx)/*W1*/) {
UChar modrm = getUChar(delta);
UInt rG = gregOfRexRM(pfx,modrm);
- UInt rE = eregOfRexRM(pfx,modrm);
- DIP("vmovq %s,%s\n", nameXMMReg(rG), nameIReg64(rE));
- putIReg64(rE, getXMMRegLane64(rG, 0));
- delta += 1;
+ if (epartIsReg(modrm)) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ DIP("vmovq %s,%s\n", nameXMMReg(rG), nameIReg64(rE));
+ putIReg64(rE, getXMMRegLane64(rG, 0));
+ delta += 1;
+ } else {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
+ storeLE( mkexpr(addr), getXMMRegLane64(rG, 0) );
+ DIP("vmovq %s,%s\n", dis_buf, nameXMMReg(rG));
+ delta += alen;
+ }
goto decode_success;
}
/* VMOVD xmm1, m32/r32 = VEX.128.66.0F.W0 7E /r (reg case only) */
@@ -24100,6 +24193,16 @@
}
break;
+ case 0xF6:
+ /* VPSADBW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG F6 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG_complex(
+ uses_vvvv, vbi, pfx, delta,
+ "vpsadbw", math_PSADBW_128 );
+ goto decode_success;
+ }
+ break;
+
case 0xF7:
/* VMASKMOVDQU xmm2, xmm1 = VEX.128.66.0F.WIG F7 /r */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/
@@ -24327,6 +24430,103 @@
}
break;
+ case 0x08:
+ case 0x09:
+ case 0x0A:
+ /* VPSIGNB xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 08 /r */
+ /* VPSIGNW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 09 /r */
+ /* VPSIGND xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 0A /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ IRTemp sV = newTemp(Ity_V128);
+ IRTemp dV = newTemp(Ity_V128);
+ IRTemp sHi, sLo, dHi, dLo;
+ sHi = sLo = dHi = dLo = IRTemp_INVALID;
+ UChar ch = '?';
+ Int laneszB = 0;
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rV = getVexNvvvv(pfx);
+
+ switch (opc) {
+ case 0x08: laneszB = 1; ch = 'b'; break;
+ case 0x09: laneszB = 2; ch = 'w'; break;
+ case 0x0A: laneszB = 4; ch = 'd'; break;
+ default: vassert(0);
+ }
+
+ assign( dV, getXMMReg(rV) );
+
+ if (epartIsReg(modrm)) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getXMMReg(rE) );
+ delta += 1;
+ DIP("vpsign%c %s,%s,%s\n", ch, nameXMMReg(rE),
+ nameXMMReg(rV), nameXMMReg(rG));
+ } else {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( sV, loadLE(Ity_V128, mkexpr(addr)) );
+ delta += alen;
+ DIP("vpsign%c %s,%s,%s\n", ch, dis_buf,
+ nameXMMReg(rV), nameXMMReg(rG));
+ }
+
+ breakupV128to64s( dV, &dHi, &dLo );
+ breakupV128to64s( sV, &sHi, &sLo );
+
+ putYMMRegLoAndZU(
+ rG,
+ binop(Iop_64HLtoV128,
+ dis_PSIGN_helper( mkexpr(sHi), mkexpr(dHi), laneszB ),
+ dis_PSIGN_helper( mkexpr(sLo), mkexpr(dLo), laneszB )
+ )
+ );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ break;
+
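The `dis_PSIGN_helper` calls above implement, per lane, the PSIGN rule: the destination element is negated where the source element is negative, zeroed where it is zero, and passed through otherwise. On one byte lane (illustrative helper, not VEX code):

```c
#include <stdint.h>

/* One PSIGNB lane: d negated where s < 0, zero where s == 0,
   d unchanged where s > 0. */
static int8_t psignb_lane(int8_t d, int8_t s) {
    if (s < 0)  return (int8_t)(-d);
    if (s == 0) return 0;
    return d;
}
```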
+ case 0x0B:
+ /* VPMULHRSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 0B /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ IRTemp sV = newTemp(Ity_V128);
+ IRTemp dV = newTemp(Ity_V128);
+ IRTemp sHi, sLo, dHi, dLo;
+ sHi = sLo = dHi = dLo = IRTemp_INVALID;
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rV = getVexNvvvv(pfx);
+
+ assign( dV, getXMMReg(rV) );
+
+ if (epartIsReg(modrm)) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getXMMReg(rE) );
+ delta += 1;
+ DIP("vpmulhrsw %s,%s,%s\n", nameXMMReg(rE),
+ nameXMMReg(rV), nameXMMReg(rG));
+ } else {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( sV, loadLE(Ity_V128, mkexpr(addr)) );
+ delta += alen;
+ DIP("vpmulhrsw %s,%s,%s\n", dis_buf,
+ nameXMMReg(rV), nameXMMReg(rG));
+ }
+
+ breakupV128to64s( dV, &dHi, &dLo );
+ breakupV128to64s( sV, &sHi, &sLo );
+
+ putYMMRegLoAndZU(
+ rG,
+ binop(Iop_64HLtoV128,
+ dis_PMULHRSW_helper( mkexpr(sHi), mkexpr(dHi) ),
+ dis_PMULHRSW_helper( mkexpr(sLo), mkexpr(dLo) )
+ )
+ );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ break;
+
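The `dis_PMULHRSW_helper` calls above compute, per 16-bit lane, a rounded high-half multiply: the 32-bit product shifted right 14, plus 1, shifted right 1. A one-lane sketch (illustrative helper, not VEX code):

```c
#include <stdint.h>

/* One PMULHRSW lane: t = (a*b) >> 14; result = (t + 1) >> 1,
   truncated to 16 bits. */
static int16_t pmulhrsw_lane(int16_t a, int16_t b) {
    int32_t t = ((int32_t)a * (int32_t)b) >> 14;
    return (int16_t)((t + 1) >> 1);
}
```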
case 0x0C:
/* VPERMILPS xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.W0 0C /r */
if (have66noF2noF3(pfx)
@@ -24497,7 +24697,7 @@
IRExpr* res = binop(Iop_64HLtoV128, mkexpr(t64), mkexpr(t64));
putYMMRegLoAndZU(rG, res);
goto decode_success;
- }
+ }
/* VBROADCASTSS m32, ymm1 = VEX.256.66.0F38.WIG 18 /r */
if (have66noF2noF3(pfx)
&& 1==getVexL(pfx)/*256*/
@@ -24515,8 +24715,8 @@
mkexpr(t64), mkexpr(t64));
putYMMReg(rG, res);
goto decode_success;
- }
- break;
+ }
+ break;
case 0x19:
/* VBROADCASTSD m64, ymm1 = VEX.256.66.0F38.WIG 19 /r */
@@ -24534,9 +24734,26 @@
mkexpr(t64), mkexpr(t64));
putYMMReg(rG, res);
goto decode_success;
- }
- break;
+ }
+ break;
+ case 0x1A:
+ /* VBROADCASTF128 m128, ymm1 = VEX.256.66.0F38.WIG 1A /r */
+ if (have66noF2noF3(pfx)
+ && 1==getVexL(pfx)/*256*/
+ && !epartIsReg(getUChar(delta))) {
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx, modrm);
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ delta += alen;
+ DIP("vbroadcastf128 %s,%s\n", dis_buf, nameYMMReg(rG));
+ IRTemp t128 = newTemp(Ity_V128);
+ assign(t128, loadLE(Ity_V128, mkexpr(addr)));
+ putYMMReg( rG, binop(Iop_V128HLtoV256, mkexpr(t128), mkexpr(t128)) );
+ goto decode_success;
+ }
+ break;
+
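The `Iop_V128HLtoV256` pairing above replicates the loaded 128-bit value into both halves of the 256-bit destination, which is all VBROADCASTF128 does. As plain data movement (illustrative helper, not VEX code):

```c
#include <stdint.h>

/* VBROADCASTF128: copy a 128-bit memory operand into both
   128-bit halves of a 256-bit destination. */
static void vbroadcastf128(uint64_t dst[4], const uint64_t src[2]) {
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = src[0];
    dst[3] = src[1];
}
```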
case 0x1C:
/* VPABSB xmm2/m128, xmm1 = VEX.128.66.0F38.WIG 1C /r */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
@@ -25534,6 +25751,16 @@
}
break;
+ case 0x15:
+ /* VPEXTRW imm8, reg/m16, xmm2 */
+ /* VPEXTRW = VEX.128.66.0F3A.W0 15 /r ib */
+ if (have66noF2noF3(pfx)
+ && 0==getVexL(pfx)/*128*/ && 0==getRexW(pfx)/*W0*/) {
+ delta = dis_PEXTRW( vbi, pfx, delta, True/*isAvx*/ );
+ goto decode_success;
+ }
+ break;
+
case 0x16:
/* VPEXTRD imm8, r32/m32, xmm2 */
/* VPEXTRD = VEX.128.66.0F3A.W0 16 /r ib */
Modified: trunk/pub/libvex_ir.h (+1 -0)
===================================================================
--- trunk/pub/libvex_ir.h 2012-06-24 15:26:30 +01:00 (rev 2408)
+++ trunk/pub/libvex_ir.h 2012-06-24 15:57:59 +01:00 (rev 2409)
@@ -1454,6 +1454,7 @@
Iop_Sqrt32Fx8,
Iop_Sqrt64Fx4,
Iop_RSqrt32Fx8,
+ Iop_Recip32Fx8,
Iop_Max32Fx8, Iop_Min32Fx8,
Iop_Max64Fx4, Iop_Min64Fx4
From: <sv...@va...> - 2012-06-24 14:27:13
sewardj 2012-06-24 15:27:06 +0100 (Sun, 24 Jun 2012)
New Revision: 12670
Log:
Update.
Modified files:
trunk/none/tests/amd64/avx-1.c
Modified: trunk/none/tests/amd64/avx-1.c (+132 -0)
===================================================================
--- trunk/none/tests/amd64/avx-1.c 2012-06-24 15:00:56 +01:00 (rev 12669)
+++ trunk/none/tests/amd64/avx-1.c 2012-06-24 15:27:06 +01:00 (rev 12670)
@@ -1861,7 +1861,111 @@
GEN_test_Monly(VMOVNTPS_256,
"vmovntps %%ymm9, (%%rax)")
+GEN_test_RandM(VPACKSSWB_128,
+ "vpacksswb %%xmm6, %%xmm8, %%xmm7",
+ "vpacksswb (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VPAVGB_128,
+ "vpavgb %%xmm6, %%xmm8, %%xmm7",
+ "vpavgb (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPAVGW_128,
+ "vpavgw %%xmm6, %%xmm8, %%xmm7",
+ "vpavgw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPADDSB_128,
+ "vpaddsb %%xmm6, %%xmm8, %%xmm7",
+ "vpaddsb (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPADDSW_128,
+ "vpaddsw %%xmm6, %%xmm8, %%xmm7",
+ "vpaddsw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPHADDW_128,
+ "vphaddw %%xmm6, %%xmm8, %%xmm7",
+ "vphaddw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPHADDD_128,
+ "vphaddd %%xmm6, %%xmm8, %%xmm7",
+ "vphaddd (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPHADDSW_128,
+ "vphaddsw %%xmm6, %%xmm8, %%xmm7",
+ "vphaddsw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPMADDUBSW_128,
+ "vpmaddubsw %%xmm6, %%xmm8, %%xmm7",
+ "vpmaddubsw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPHSUBW_128,
+ "vphsubw %%xmm6, %%xmm8, %%xmm7",
+ "vphsubw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPHSUBD_128,
+ "vphsubd %%xmm6, %%xmm8, %%xmm7",
+ "vphsubd (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPHSUBSW_128,
+ "vphsubsw %%xmm6, %%xmm8, %%xmm7",
+ "vphsubsw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPABSB_128,
+ "vpabsb %%xmm6, %%xmm7",
+ "vpabsb (%%rax), %%xmm7")
+
+GEN_test_RandM(VPABSW_128,
+ "vpabsw %%xmm6, %%xmm7",
+ "vpabsw (%%rax), %%xmm7")
+
+GEN_test_RandM(VPMOVSXBQ_128,
+ "vpmovsxbq %%xmm6, %%xmm8",
+ "vpmovsxbq (%%rax), %%xmm8")
+
+GEN_test_RandM(VPMOVSXWQ_128,
+ "vpmovsxwq %%xmm6, %%xmm8",
+ "vpmovsxwq (%%rax), %%xmm8")
+
+GEN_test_RandM(VPACKUSDW_128,
+ "vpackusdw %%xmm6, %%xmm8, %%xmm7",
+ "vpackusdw (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPMOVZXBQ_128,
+ "vpmovzxbq %%xmm6, %%xmm8",
+ "vpmovzxbq (%%rax), %%xmm8")
+
+GEN_test_RandM(VPMOVZXWQ_128,
+ "vpmovzxwq %%xmm6, %%xmm8",
+ "vpmovzxwq (%%rax), %%xmm8")
+
+GEN_test_RandM(VPMOVZXDQ_128,
+ "vpmovzxdq %%xmm6, %%xmm8",
+ "vpmovzxdq (%%rax), %%xmm8")
+
+GEN_test_RandM(VMPSADBW_128_0x0,
+ "vmpsadbw $0, %%xmm6, %%xmm8, %%xmm7",
+ "vmpsadbw $0, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VMPSADBW_128_0x1,
+ "vmpsadbw $1, %%xmm6, %%xmm8, %%xmm7",
+ "vmpsadbw $1, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VMPSADBW_128_0x2,
+ "vmpsadbw $2, %%xmm6, %%xmm8, %%xmm7",
+ "vmpsadbw $2, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VMPSADBW_128_0x3,
+ "vmpsadbw $3, %%xmm6, %%xmm8, %%xmm7",
+ "vmpsadbw $3, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VMPSADBW_128_0x4,
+ "vmpsadbw $4, %%xmm6, %%xmm8, %%xmm7",
+ "vmpsadbw $4, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VMPSADBW_128_0x5,
+ "vmpsadbw $5, %%xmm6, %%xmm8, %%xmm7",
+ "vmpsadbw $5, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VMPSADBW_128_0x6,
+ "vmpsadbw $6, %%xmm6, %%xmm8, %%xmm7",
+ "vmpsadbw $6, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VMPSADBW_128_0x7,
+ "vmpsadbw $7, %%xmm6, %%xmm8, %%xmm7",
+ "vmpsadbw $7, (%%rax), %%xmm8, %%xmm7")
+
/* Comment duplicated above, for convenient reference:
Allowed operands in test insns:
Reg form: %ymm6, %ymm7, %ymm8, %ymm9 and %r14.
@@ -2320,5 +2424,33 @@
DO_D( VMOVNTPD_256 );
DO_D( VMOVNTPS_128 );
DO_D( VMOVNTPS_256 );
+ DO_D( VPACKSSWB_128 );
+ DO_D( VPAVGB_128 );
+ DO_D( VPAVGW_128 );
+ DO_D( VPADDSB_128 );
+ DO_D( VPADDSW_128 );
+ DO_D( VPHADDW_128 );
+ DO_D( VPHADDD_128 );
+ DO_D( VPHADDSW_128 );
+ DO_D( VPMADDUBSW_128 );
+ DO_D( VPHSUBW_128 );
+ DO_D( VPHSUBD_128 );
+ DO_D( VPHSUBSW_128 );
+ DO_D( VPABSB_128 );
+ DO_D( VPABSW_128 );
+ DO_D( VPMOVSXBQ_128 );
+ DO_D( VPMOVSXWQ_128 );
+ DO_D( VPACKUSDW_128 );
+ DO_D( VPMOVZXBQ_128 );
+ DO_D( VPMOVZXWQ_128 );
+ DO_D( VPMOVZXDQ_128 );
+ DO_D( VMPSADBW_128_0x0 );
+ DO_D( VMPSADBW_128_0x1 );
+ DO_D( VMPSADBW_128_0x2 );
+ DO_D( VMPSADBW_128_0x3 );
+ DO_D( VMPSADBW_128_0x4 );
+ DO_D( VMPSADBW_128_0x5 );
+ DO_D( VMPSADBW_128_0x6 );
+ DO_D( VMPSADBW_128_0x7 );
return 0;
}
From: <sv...@va...> - 2012-06-24 14:26:38
sewardj 2012-06-24 15:26:30 +0100 (Sun, 24 Jun 2012)
New Revision: 2408
Log:
More AVX insns:
VPACKSSWB r/m, rV, r ::: r = QNarrowBin16Sto8Sx16(rV, r/m)
VPAVGB xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG E0 /r
VPAVGW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG E3 /r
VPADDSB xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG EC /r
VPADDSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG ED /r
VPHADDW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 01 /r
VPHADDD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 02 /r
VPHADDSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 03 /r
VPMADDUBSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 04 /r
VPHSUBW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 05 /r
VPHSUBD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 06 /r
VPHSUBSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 07 /r
VPABSB xmm2/m128, xmm1 = VEX.128.66.0F38.WIG 1C /r
VPABSW xmm2/m128, xmm1 = VEX.128.66.0F38.WIG 1D /r
VPMOVSXBQ xmm2/m16, xmm1 = VEX.128.66.0F38.WIG 22 /r
VPMOVSXWQ xmm2/m32, xmm1 = VEX.128.66.0F38.WIG 24 /r
VPACKUSDW = VEX.NDS.128.66.0F38.WIG 2B /r
VPMOVZXBQ = VEX.128.66.0F38.WIG 32 /r
VPMOVZXWQ xmm2/m32, xmm1 = VEX.128.66.0F38.WIG 34 /r
VPMOVZXDQ xmm2/m64, xmm1 = VEX.128.66.0F38.WIG 35 /r
VMPSADBW = VEX.NDS.128.66.0F3A.WIG 42 /r ib
(Jakub Jelinek, ja...@re...), #273475 comment 136.
Modified files:
trunk/priv/guest_amd64_toIR.c
Modified: trunk/priv/guest_amd64_toIR.c (+528 -283)
===================================================================
--- trunk/priv/guest_amd64_toIR.c 2012-06-24 15:00:27 +01:00 (rev 2407)
+++ trunk/priv/guest_amd64_toIR.c 2012-06-24 15:26:30 +01:00 (rev 2408)
@@ -9219,7 +9219,14 @@
return math_PABS_XMM(aa, 4);
}
+static IRTemp math_PABS_XMM_pap2 ( IRTemp aa ) {
+ return math_PABS_XMM(aa, 2);
+}
+static IRTemp math_PABS_XMM_pap1 ( IRTemp aa ) {
+ return math_PABS_XMM(aa, 1);
+}
+
static IRExpr* dis_PALIGNR_XMM_helper ( IRTemp hi64,
IRTemp lo64, Long byteShift )
{
@@ -14378,6 +14385,104 @@
}
+static Long dis_PHADD_128 ( VexAbiInfo* vbi, Prefix pfx, Long delta,
+ Bool isAvx, UChar opc )
+{
+ IRTemp addr = IRTemp_INVALID;
+ Int alen = 0;
+ HChar dis_buf[50];
+ HChar* str = "???";
+ IROp opV64 = Iop_INVALID;
+ IROp opCatO = Iop_CatOddLanes16x4;
+ IROp opCatE = Iop_CatEvenLanes16x4;
+ IRTemp sV = newTemp(Ity_V128);
+ IRTemp dV = newTemp(Ity_V128);
+ IRTemp sHi = newTemp(Ity_I64);
+ IRTemp sLo = newTemp(Ity_I64);
+ IRTemp dHi = newTemp(Ity_I64);
+ IRTemp dLo = newTemp(Ity_I64);
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rV = isAvx ? getVexNvvvv(pfx) : rG;
+
+ switch (opc) {
+ case 0x01: opV64 = Iop_Add16x4; str = "addw"; break;
+ case 0x02: opV64 = Iop_Add32x2; str = "addd"; break;
+ case 0x03: opV64 = Iop_QAdd16Sx4; str = "addsw"; break;
+ case 0x05: opV64 = Iop_Sub16x4; str = "subw"; break;
+ case 0x06: opV64 = Iop_Sub32x2; str = "subd"; break;
+ case 0x07: opV64 = Iop_QSub16Sx4; str = "subsw"; break;
+ default: vassert(0);
+ }
+ if (opc == 0x02 || opc == 0x06) {
+ opCatO = Iop_InterleaveHI32x2;
+ opCatE = Iop_InterleaveLO32x2;
+ }
+
+ assign( dV, getXMMReg(rV) );
+
+ if (epartIsReg(modrm)) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getXMMReg(rE) );
+ DIP("ph%s %s,%s\n", str, nameXMMReg(rE), nameXMMReg(rG));
+ delta += 1;
+ } else {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
+ if (!isAvx)
+ gen_SEGV_if_not_16_aligned( addr );
+ assign( sV, loadLE(Ity_V128, mkexpr(addr)) );
+ DIP("ph%s %s,%s\n", str, dis_buf, nameXMMReg(rG));
+ delta += alen;
+ }
+
+ assign( dHi, unop(Iop_V128HIto64, mkexpr(dV)) );
+ assign( dLo, unop(Iop_V128to64, mkexpr(dV)) );
+ assign( sHi, unop(Iop_V128HIto64, mkexpr(sV)) );
+ assign( sLo, unop(Iop_V128to64, mkexpr(sV)) );
+
+ /* This isn't a particularly efficient way to compute the
+ result, but at least it avoids a proliferation of IROps,
+ hence avoids complicating all the backends. */
+
+ (isAvx ? putYMMRegLoAndZU : putXMMReg)
+ ( rG,
+ binop(Iop_64HLtoV128,
+ binop(opV64,
+ binop(opCatE,mkexpr(sHi),mkexpr(sLo)),
+ binop(opCatO,mkexpr(sHi),mkexpr(sLo)) ),
+ binop(opV64,
+ binop(opCatE,mkexpr(dHi),mkexpr(dLo)),
+ binop(opCatO,mkexpr(dHi),mkexpr(dLo)) ) ) );
+ return delta;
+}
+
+
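The cat-odd/cat-even trick in `dis_PHADD_128` above realises horizontal pairwise addition. For the PHADDW case, the destination's low four words come from adjacent pairs of the destination operand and the high four from the source. A direct sketch (illustrative helper, not VEX code):

```c
#include <stdint.h>

/* PHADDW: low half of the result holds pairwise sums of d,
   high half holds pairwise sums of s. */
static void phaddw(int16_t dst[8], const int16_t d[8], const int16_t s[8]) {
    for (int i = 0; i < 4; i++) {
        dst[i]     = (int16_t)(d[2*i] + d[2*i + 1]);
        dst[i + 4] = (int16_t)(s[2*i] + s[2*i + 1]);
    }
}
```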
+static IRTemp math_PMADDUBSW_128 ( IRTemp dV, IRTemp sV )
+{
+ IRTemp sVoddsSX = newTemp(Ity_V128);
+ IRTemp sVevensSX = newTemp(Ity_V128);
+ IRTemp dVoddsZX = newTemp(Ity_V128);
+ IRTemp dVevensZX = newTemp(Ity_V128);
+ /* compute dV unsigned x sV signed */
+ assign( sVoddsSX, binop(Iop_SarN16x8, mkexpr(sV), mkU8(8)) );
+ assign( sVevensSX, binop(Iop_SarN16x8,
+ binop(Iop_ShlN16x8, mkexpr(sV), mkU8(8)),
+ mkU8(8)) );
+ assign( dVoddsZX, binop(Iop_ShrN16x8, mkexpr(dV), mkU8(8)) );
+ assign( dVevensZX, binop(Iop_ShrN16x8,
+ binop(Iop_ShlN16x8, mkexpr(dV), mkU8(8)),
+ mkU8(8)) );
+
+ IRTemp res = newTemp(Ity_V128);
+ assign( res, binop(Iop_QAdd16Sx8,
+ binop(Iop_Mul16x8, mkexpr(sVoddsSX), mkexpr(dVoddsZX)),
+ binop(Iop_Mul16x8, mkexpr(sVevensSX), mkexpr(dVevensZX))
+ )
+ );
+ return res;
+}
+
+
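The shift-and-multiply sequence in `math_PMADDUBSW_128` above computes, per 16-bit output lane, the unsigned-byte-times-signed-byte products of a pair of lanes, summed with signed saturation. One output lane (illustrative helper, not VEX code):

```c
#include <stdint.h>

/* One PMADDUBSW output lane: d bytes are unsigned, s bytes are
   signed; the two products are added with signed 16-bit saturation. */
static int16_t pmaddubsw_lane(uint8_t d0, uint8_t d1, int8_t s0, int8_t s1) {
    int32_t sum = (int32_t)d0 * s0 + (int32_t)d1 * s1;
    if (sum > 32767)  sum = 32767;
    if (sum < -32768) sum = -32768;
    return (int16_t)sum;
}
```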
__attribute__((noinline))
static
Long dis_ESC_0F38__SupSSE3 ( Bool* decode_OK,
@@ -14484,70 +14589,7 @@
xmm) and G to G (xmm). */
if (have66noF2noF3(pfx)
&& (sz == 2 || /*redundant REX.W*/ sz == 8)) {
- HChar* str = "???";
- IROp opV64 = Iop_INVALID;
- IROp opCatO = Iop_CatOddLanes16x4;
- IROp opCatE = Iop_CatEvenLanes16x4;
- IRTemp sV = newTemp(Ity_V128);
- IRTemp dV = newTemp(Ity_V128);
- IRTemp sHi = newTemp(Ity_I64);
- IRTemp sLo = newTemp(Ity_I64);
- IRTemp dHi = newTemp(Ity_I64);
- IRTemp dLo = newTemp(Ity_I64);
-
- modrm = getUChar(delta);
-
- switch (opc) {
- case 0x01: opV64 = Iop_Add16x4; str = "addw"; break;
- case 0x02: opV64 = Iop_Add32x2; str = "addd"; break;
- case 0x03: opV64 = Iop_QAdd16Sx4; str = "addsw"; break;
- case 0x05: opV64 = Iop_Sub16x4; str = "subw"; break;
- case 0x06: opV64 = Iop_Sub32x2; str = "subd"; break;
- case 0x07: opV64 = Iop_QSub16Sx4; str = "subsw"; break;
- default: vassert(0);
- }
- if (opc == 0x02 || opc == 0x06) {
- opCatO = Iop_InterleaveHI32x2;
- opCatE = Iop_InterleaveLO32x2;
- }
-
- assign( dV, getXMMReg(gregOfRexRM(pfx,modrm)) );
-
- if (epartIsReg(modrm)) {
- assign( sV, getXMMReg( eregOfRexRM(pfx,modrm)) );
- DIP("ph%s %s,%s\n", str, nameXMMReg(eregOfRexRM(pfx,modrm)),
- nameXMMReg(gregOfRexRM(pfx,modrm)));
- delta += 1;
- } else {
- addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
- gen_SEGV_if_not_16_aligned( addr );
- assign( sV, loadLE(Ity_V128, mkexpr(addr)) );
- DIP("ph%s %s,%s\n", str, dis_buf,
- nameXMMReg(gregOfRexRM(pfx,modrm)));
- delta += alen;
- }
-
- assign( dHi, unop(Iop_V128HIto64, mkexpr(dV)) );
- assign( dLo, unop(Iop_V128to64, mkexpr(dV)) );
- assign( sHi, unop(Iop_V128HIto64, mkexpr(sV)) );
- assign( sLo, unop(Iop_V128to64, mkexpr(sV)) );
-
- /* This isn't a particularly efficient way to compute the
- result, but at least it avoids a proliferation of IROps,
- hence avoids complication all the backends. */
- putXMMReg(
- gregOfRexRM(pfx,modrm),
- binop(Iop_64HLtoV128,
- binop(opV64,
- binop(opCatE,mkexpr(sHi),mkexpr(sLo)),
- binop(opCatO,mkexpr(sHi),mkexpr(sLo))
- ),
- binop(opV64,
- binop(opCatE,mkexpr(dHi),mkexpr(dLo)),
- binop(opCatO,mkexpr(dHi),mkexpr(dLo))
- )
- )
- );
+ delta = dis_PHADD_128( vbi, pfx, delta, False/*isAvx*/, opc );
goto decode_success;
}
/* ***--- these are MMX class insns introduced in SSSE3 ---*** */
@@ -14619,51 +14661,27 @@
Unsigned Bytes (XMM) */
if (have66noF2noF3(pfx)
&& (sz == 2 || /*redundant REX.W*/ sz == 8)) {
- IRTemp sV = newTemp(Ity_V128);
- IRTemp dV = newTemp(Ity_V128);
- IRTemp sVoddsSX = newTemp(Ity_V128);
- IRTemp sVevensSX = newTemp(Ity_V128);
- IRTemp dVoddsZX = newTemp(Ity_V128);
- IRTemp dVevensZX = newTemp(Ity_V128);
+ IRTemp sV = newTemp(Ity_V128);
+ IRTemp dV = newTemp(Ity_V128);
+ modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
- modrm = getUChar(delta);
- assign( dV, getXMMReg(gregOfRexRM(pfx,modrm)) );
+ assign( dV, getXMMReg(rG) );
if (epartIsReg(modrm)) {
- assign( sV, getXMMReg(eregOfRexRM(pfx,modrm)) );
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getXMMReg(rE) );
delta += 1;
- DIP("pmaddubsw %s,%s\n", nameXMMReg(eregOfRexRM(pfx,modrm)),
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("pmaddubsw %s,%s\n", nameXMMReg(rE), nameXMMReg(rG));
} else {
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
gen_SEGV_if_not_16_aligned( addr );
assign( sV, loadLE(Ity_V128, mkexpr(addr)) );
delta += alen;
- DIP("pmaddubsw %s,%s\n", dis_buf,
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("pmaddubsw %s,%s\n", dis_buf, nameXMMReg(rG));
}
- /* compute dV unsigned x sV signed */
- assign( sVoddsSX,
- binop(Iop_SarN16x8, mkexpr(sV), mkU8(8)) );
- assign( sVevensSX,
- binop(Iop_SarN16x8,
- binop(Iop_ShlN16x8, mkexpr(sV), mkU8(8)),
- mkU8(8)) );
- assign( dVoddsZX,
- binop(Iop_ShrN16x8, mkexpr(dV), mkU8(8)) );
- assign( dVevensZX,
- binop(Iop_ShrN16x8,
- binop(Iop_ShlN16x8, mkexpr(dV), mkU8(8)),
- mkU8(8)) );
-
- putXMMReg(
- gregOfRexRM(pfx,modrm),
- binop(Iop_QAdd16Sx8,
- binop(Iop_Mul16x8, mkexpr(sVoddsSX), mkexpr(dVoddsZX)),
- binop(Iop_Mul16x8, mkexpr(sVevensSX), mkexpr(dVevensZX))
- )
- );
+ putXMMReg( rG, mkexpr( math_PMADDUBSW_128( dV, sV ) ) );
goto decode_success;
}
/* 0F 38 04 = PMADDUBSW -- Multiply and Add Packed Signed and
@@ -15647,20 +15665,20 @@
IRTemp srcVec = newTemp(Ity_V128);
UChar modrm = getUChar(delta);
UChar* mbV = isAvx ? "v" : "";
+ UChar how = xIsZ ? 'z' : 's';
+ UInt rG = gregOfRexRM(pfx, modrm);
if ( epartIsReg(modrm) ) {
- assign( srcVec, getXMMReg( eregOfRexRM(pfx, modrm) ) );
+ UInt rE = eregOfRexRM(pfx, modrm);
+ assign( srcVec, getXMMReg(rE) );
delta += 1;
- DIP( "%spmovzxwd %s,%s\n", mbV,
- nameXMMReg( eregOfRexRM(pfx, modrm) ),
- nameXMMReg( gregOfRexRM(pfx, modrm) ) );
+ DIP( "%spmov%cxwd %s,%s\n", mbV, how, nameXMMReg(rE), nameXMMReg(rG) );
} else {
addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
assign( srcVec,
unop( Iop_64UtoV128, loadLE( Ity_I64, mkexpr(addr) ) ) );
delta += alen;
- DIP( "%spmovzxwd %s,%s\n", mbV,
- dis_buf, nameXMMReg( gregOfRexRM(pfx, modrm) ) );
+ DIP( "%spmov%cxwd %s,%s\n", mbV, how, dis_buf, nameXMMReg(rG) );
}
IRExpr* res
@@ -15677,6 +15695,75 @@
}
+static Long dis_PMOVSXWQ_128 ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx )
+{
+ IRTemp addr = IRTemp_INVALID;
+ Int alen = 0;
+ HChar dis_buf[50];
+ IRTemp srcBytes = newTemp(Ity_I32);
+ UChar modrm = getUChar(delta);
+ UChar* mbV = isAvx ? "v" : "";
+ UInt rG = gregOfRexRM(pfx, modrm);
+
+ if ( epartIsReg( modrm ) ) {
+ UInt rE = eregOfRexRM(pfx, modrm);
+ assign( srcBytes, getXMMRegLane32( rE, 0 ) );
+ delta += 1;
+ DIP( "%spmovsxwq %s,%s\n", mbV, nameXMMReg(rE), nameXMMReg(rG) );
+ } else {
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( srcBytes, loadLE( Ity_I32, mkexpr(addr) ) );
+ delta += alen;
+ DIP( "%spmovsxwq %s,%s\n", mbV, dis_buf, nameXMMReg(rG) );
+ }
+
+ (isAvx ? putYMMRegLoAndZU : putXMMReg)
+ ( rG, binop( Iop_64HLtoV128,
+ unop( Iop_16Sto64,
+ unop( Iop_32HIto16, mkexpr(srcBytes) ) ),
+ unop( Iop_16Sto64,
+ unop( Iop_32to16, mkexpr(srcBytes) ) ) ) );
+ return delta;
+}
+
+
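`dis_PMOVSXWQ_128` above reads a 32-bit chunk and sign-extends its two 16-bit halves into two 64-bit lanes. As plain data movement (illustrative helper, not VEX code):

```c
#include <stdint.h>

/* PMOVSXWQ: sign-extend the two low 16-bit words of the source
   into two 64-bit destination lanes. */
static void pmovsxwq(int64_t dst[2], const int16_t src[2]) {
    dst[0] = (int64_t)src[0];
    dst[1] = (int64_t)src[1];
}
```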
+static Long dis_PMOVZXWQ_128 ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx )
+{
+ IRTemp addr = IRTemp_INVALID;
+ Int alen = 0;
+ HChar dis_buf[50];
+ IRTemp srcVec = newTemp(Ity_V128);
+ UChar modrm = getUChar(delta);
+ UChar* mbV = isAvx ? "v" : "";
+ UInt rG = gregOfRexRM(pfx, modrm);
+
+ if ( epartIsReg( modrm ) ) {
+ UInt rE = eregOfRexRM(pfx, modrm);
+ assign( srcVec, getXMMReg(rE) );
+ delta += 1;
+ DIP( "%spmovzxwq %s,%s\n", mbV, nameXMMReg(rE), nameXMMReg(rG) );
+ } else {
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( srcVec,
+ unop( Iop_32UtoV128, loadLE( Ity_I32, mkexpr(addr) ) ) );
+ delta += alen;
+ DIP( "%spmovzxwq %s,%s\n", mbV, dis_buf, nameXMMReg(rG) );
+ }
+
+ IRTemp zeroVec = newTemp( Ity_V128 );
+ assign( zeroVec, IRExpr_Const( IRConst_V128(0) ) );
+
+ (isAvx ? putYMMRegLoAndZU : putXMMReg)
+ ( rG, binop( Iop_InterleaveLO16x8,
+ mkexpr(zeroVec),
+ binop( Iop_InterleaveLO16x8,
+ mkexpr(zeroVec), mkexpr(srcVec) ) ) );
+ return delta;
+}
+
+
/* Handles 128 bit versions of PMOVZXDQ and PMOVSXDQ. */
static Long dis_PMOVxXDQ_128 ( VexAbiInfo* vbi, Prefix pfx,
Long delta, Bool isAvx, Bool xIsZ )
@@ -15767,6 +15854,78 @@
}
+/* Handles 128 bit versions of PMOVSXBQ. */
+static Long dis_PMOVSXBQ_128 ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx )
+{
+ IRTemp addr = IRTemp_INVALID;
+ Int alen = 0;
+ HChar dis_buf[50];
+ IRTemp srcBytes = newTemp(Ity_I16);
+ UChar modrm = getUChar(delta);
+ UChar* mbV = isAvx ? "v" : "";
+ UInt rG = gregOfRexRM(pfx, modrm);
+ if ( epartIsReg(modrm) ) {
+ UInt rE = eregOfRexRM(pfx, modrm);
+ assign( srcBytes, getXMMRegLane16( rE, 0 ) );
+ delta += 1;
+ DIP( "%spmovsxbq %s,%s\n", mbV, nameXMMReg(rE), nameXMMReg(rG) );
+ } else {
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( srcBytes, loadLE( Ity_I16, mkexpr(addr) ) );
+ delta += alen;
+ DIP( "%spmovsxbq %s,%s\n", mbV, dis_buf, nameXMMReg(rG) );
+ }
+
+ (isAvx ? putYMMRegLoAndZU : putXMMReg)
+ ( rG, binop( Iop_64HLtoV128,
+ unop( Iop_8Sto64,
+ unop( Iop_16HIto8, mkexpr(srcBytes) ) ),
+ unop( Iop_8Sto64,
+ unop( Iop_16to8, mkexpr(srcBytes) ) ) ) );
+ return delta;
+}
+
+
+/* Handles 128 bit versions of PMOVZXBQ. */
+static Long dis_PMOVZXBQ_128 ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx )
+{
+ IRTemp addr = IRTemp_INVALID;
+ Int alen = 0;
+ HChar dis_buf[50];
+ IRTemp srcVec = newTemp(Ity_V128);
+ UChar modrm = getUChar(delta);
+ UChar* mbV = isAvx ? "v" : "";
+ UInt rG = gregOfRexRM(pfx, modrm);
+ if ( epartIsReg(modrm) ) {
+ UInt rE = eregOfRexRM(pfx, modrm);
+ assign( srcVec, getXMMReg(rE) );
+ delta += 1;
+ DIP( "%spmovzxbq %s,%s\n", mbV, nameXMMReg(rE), nameXMMReg(rG) );
+ } else {
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( srcVec,
+ unop( Iop_32UtoV128,
+ unop( Iop_16Uto32, loadLE( Ity_I16, mkexpr(addr) ))));
+ delta += alen;
+ DIP( "%spmovzxbq %s,%s\n", mbV, dis_buf, nameXMMReg(rG) );
+ }
+
+ IRTemp zeroVec = newTemp(Ity_V128);
+ assign( zeroVec, IRExpr_Const( IRConst_V128(0) ) );
+
+ (isAvx ? putYMMRegLoAndZU : putXMMReg)
+ ( rG, binop( Iop_InterleaveLO8x16,
+ mkexpr(zeroVec),
+ binop( Iop_InterleaveLO8x16,
+ mkexpr(zeroVec),
+ binop( Iop_InterleaveLO8x16,
+ mkexpr(zeroVec), mkexpr(srcVec) ) ) ) );
+ return delta;
+}
+
+
static Long dis_PHMINPOSUW_128 ( VexAbiInfo* vbi, Prefix pfx,
Long delta, Bool isAvx )
{
@@ -15915,33 +16074,7 @@
/* 66 0F 38 22 /r = PMOVSXBQ xmm1, xmm2/m16
Packed Move with Sign Extend from Byte to QWord (XMM) */
if (have66noF2noF3(pfx) && sz == 2) {
-
- modrm = getUChar(delta);
-
- IRTemp srcBytes = newTemp(Ity_I16);
-
- if ( epartIsReg(modrm) ) {
- assign( srcBytes, getXMMRegLane16( eregOfRexRM(pfx, modrm), 0 ) );
- delta += 1;
- DIP( "pmovsxbq %s,%s\n",
- nameXMMReg( eregOfRexRM(pfx, modrm) ),
- nameXMMReg( gregOfRexRM(pfx, modrm) ) );
- } else {
- addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
- assign( srcBytes, loadLE( Ity_I16, mkexpr(addr) ) );
- delta += alen;
- DIP( "pmovsxbq %s,%s\n",
- dis_buf, nameXMMReg( gregOfRexRM(pfx, modrm) ) );
- }
-
- putXMMReg( gregOfRexRM( pfx, modrm ),
- binop( Iop_64HLtoV128,
- unop( Iop_8Sto64,
- unop( Iop_16HIto8,
- mkexpr(srcBytes) ) ),
- unop( Iop_8Sto64,
- unop( Iop_16to8, mkexpr(srcBytes) ) ) ) );
-
+ delta = dis_PMOVSXBQ_128( vbi, pfx, delta, False/*!isAvx*/ );
goto decode_success;
}
break;
@@ -15960,32 +16093,7 @@
/* 66 0F 38 24 /r = PMOVSXWQ xmm1, xmm2/m32
Packed Move with Sign Extend from Word to QWord (XMM) */
if (have66noF2noF3(pfx) && sz == 2) {
-
- modrm = getUChar(delta);
-
- IRTemp srcBytes = newTemp(Ity_I32);
-
- if ( epartIsReg( modrm ) ) {
- assign( srcBytes, getXMMRegLane32( eregOfRexRM(pfx, modrm), 0 ) );
- delta += 1;
- DIP( "pmovsxwq %s,%s\n",
- nameXMMReg( eregOfRexRM(pfx, modrm) ),
- nameXMMReg( gregOfRexRM(pfx, modrm) ) );
- } else {
- addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
- assign( srcBytes, loadLE( Ity_I32, mkexpr(addr) ) );
- delta += alen;
- DIP( "pmovsxwq %s,%s\n",
- dis_buf, nameXMMReg( gregOfRexRM(pfx, modrm) ) );
- }
-
- putXMMReg( gregOfRexRM( pfx, modrm ),
- binop( Iop_64HLtoV128,
- unop( Iop_16Sto64,
- unop( Iop_32HIto16, mkexpr(srcBytes) ) ),
- unop( Iop_16Sto64,
- unop( Iop_32to16, mkexpr(srcBytes) ) ) ) );
-
+ delta = dis_PMOVSXWQ_128( vbi, pfx, delta, False/*!isAvx*/ );
goto decode_success;
}
break;
@@ -16099,38 +16207,7 @@
/* 66 0F 38 32 /r = PMOVZXBQ xmm1, xmm2/m16
Packed Move with Zero Extend from Byte to QWord (XMM) */
if (have66noF2noF3(pfx) && sz == 2) {
-
- modrm = getUChar(delta);
-
- IRTemp srcVec = newTemp(Ity_V128);
-
- if ( epartIsReg(modrm) ) {
- assign( srcVec, getXMMReg( eregOfRexRM(pfx, modrm) ) );
- delta += 1;
- DIP( "pmovzxbq %s,%s\n",
- nameXMMReg( eregOfRexRM(pfx, modrm) ),
- nameXMMReg( gregOfRexRM(pfx, modrm) ) );
- } else {
- addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
- assign( srcVec,
- unop( Iop_32UtoV128,
- unop( Iop_16Uto32, loadLE( Ity_I16, mkexpr(addr) ))));
- delta += alen;
- DIP( "pmovzxbq %s,%s\n",
- dis_buf, nameXMMReg( gregOfRexRM(pfx, modrm) ) );
- }
-
- IRTemp zeroVec = newTemp(Ity_V128);
- assign( zeroVec, IRExpr_Const( IRConst_V128(0) ) );
-
- putXMMReg( gregOfRexRM( pfx, modrm ),
- binop( Iop_InterleaveLO8x16,
- mkexpr(zeroVec),
- binop( Iop_InterleaveLO8x16,
- mkexpr(zeroVec),
- binop( Iop_InterleaveLO8x16,
- mkexpr(zeroVec), mkexpr(srcVec) ) ) ) );
-
+ delta = dis_PMOVZXBQ_128( vbi, pfx, delta, False/*!isAvx*/ );
goto decode_success;
}
break;
@@ -16149,35 +16226,7 @@
/* 66 0F 38 34 /r = PMOVZXWQ xmm1, xmm2/m32
Packed Move with Zero Extend from Word to QWord (XMM) */
if (have66noF2noF3(pfx) && sz == 2) {
-
- modrm = getUChar(delta);
-
- IRTemp srcVec = newTemp(Ity_V128);
-
- if ( epartIsReg( modrm ) ) {
- assign( srcVec, getXMMReg( eregOfRexRM(pfx, modrm) ) );
- delta += 1;
- DIP( "pmovzxwq %s,%s\n",
- nameXMMReg( eregOfRexRM(pfx, modrm) ),
- nameXMMReg( gregOfRexRM(pfx, modrm) ) );
- } else {
- addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
- assign( srcVec,
- unop( Iop_32UtoV128, loadLE( Ity_I32, mkexpr(addr) ) ) );
- delta += alen;
- DIP( "pmovzxwq %s,%s\n",
- dis_buf, nameXMMReg( gregOfRexRM(pfx, modrm) ) );
- }
-
- IRTemp zeroVec = newTemp( Ity_V128 );
- assign( zeroVec, IRExpr_Const( IRConst_V128(0) ) );
-
- putXMMReg( gregOfRexRM( pfx, modrm ),
- binop( Iop_InterleaveLO16x8,
- mkexpr(zeroVec),
- binop( Iop_InterleaveLO16x8,
- mkexpr(zeroVec), mkexpr(srcVec) ) ) );
-
+ delta = dis_PMOVZXWQ_128( vbi, pfx, delta, False/*!isAvx*/ );
goto decode_success;
}
break;
@@ -16947,6 +16996,59 @@
}
+static IRTemp math_MPSADBW_128 ( IRTemp dst_vec, IRTemp src_vec, UInt imm8 )
+{
+ /* Mask out bits of the operands we don't need. This isn't
+ strictly necessary, but it does ensure Memcheck doesn't
+ give us any false uninitialised value errors as a
+ result. */
+ UShort src_mask[4] = { 0x000F, 0x00F0, 0x0F00, 0xF000 };
+ UShort dst_mask[2] = { 0x07FF, 0x7FF0 };
+
+ IRTemp src_maskV = newTemp(Ity_V128);
+ IRTemp dst_maskV = newTemp(Ity_V128);
+ assign(src_maskV, mkV128( src_mask[ imm8 & 3 ] ));
+ assign(dst_maskV, mkV128( dst_mask[ (imm8 >> 2) & 1 ] ));
+
+ IRTemp src_masked = newTemp(Ity_V128);
+ IRTemp dst_masked = newTemp(Ity_V128);
+ assign(src_masked, binop(Iop_AndV128, mkexpr(src_vec), mkexpr(src_maskV)));
+ assign(dst_masked, binop(Iop_AndV128, mkexpr(dst_vec), mkexpr(dst_maskV)));
+
+ /* Generate 4 64 bit values that we can hand to a clean helper */
+ IRTemp sHi = newTemp(Ity_I64);
+ IRTemp sLo = newTemp(Ity_I64);
+ assign( sHi, unop(Iop_V128HIto64, mkexpr(src_masked)) );
+ assign( sLo, unop(Iop_V128to64, mkexpr(src_masked)) );
+
+ IRTemp dHi = newTemp(Ity_I64);
+ IRTemp dLo = newTemp(Ity_I64);
+ assign( dHi, unop(Iop_V128HIto64, mkexpr(dst_masked)) );
+ assign( dLo, unop(Iop_V128to64, mkexpr(dst_masked)) );
+
+ /* Compute halves of the result separately */
+ IRTemp resHi = newTemp(Ity_I64);
+ IRTemp resLo = newTemp(Ity_I64);
+
+ IRExpr** argsHi
+ = mkIRExprVec_5( mkexpr(sHi), mkexpr(sLo), mkexpr(dHi), mkexpr(dLo),
+ mkU64( 0x80 | (imm8 & 7) ));
+ IRExpr** argsLo
+ = mkIRExprVec_5( mkexpr(sHi), mkexpr(sLo), mkexpr(dHi), mkexpr(dLo),
+ mkU64( 0x00 | (imm8 & 7) ));
+
+ assign(resHi, mkIRExprCCall( Ity_I64, 0/*regparm*/,
+ "amd64g_calc_mpsadbw",
+ &amd64g_calc_mpsadbw, argsHi ));
+ assign(resLo, mkIRExprCCall( Ity_I64, 0/*regparm*/,
+ "amd64g_calc_mpsadbw",
+ &amd64g_calc_mpsadbw, argsLo ));
+
+ IRTemp res = newTemp(Ity_V128);
+ assign(res, binop(Iop_64HLtoV128, mkexpr(resHi), mkexpr(resLo)));
+ return res;
+}
+
static Long dis_EXTRACTPS ( VexAbiInfo* vbi, Prefix pfx,
Long delta, Bool isAvx )
{
@@ -17605,22 +17707,22 @@
/* 66 0F 3A 42 /r ib = MPSADBW xmm1, xmm2/m128, imm8
Multiple Packed Sums of Absolute Differences (XMM) */
if (have66noF2noF3(pfx) && sz == 2) {
-
Int imm8;
IRTemp src_vec = newTemp(Ity_V128);
IRTemp dst_vec = newTemp(Ity_V128);
+ modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx, modrm);
- modrm = getUChar(delta);
-
- assign( dst_vec, getXMMReg( gregOfRexRM(pfx, modrm) ) );
+ assign( dst_vec, getXMMReg(rG) );
if ( epartIsReg( modrm ) ) {
+ UInt rE = eregOfRexRM(pfx, modrm);
+
imm8 = (Int)getUChar(delta+1);
- assign( src_vec, getXMMReg( eregOfRexRM(pfx, modrm) ) );
+ assign( src_vec, getXMMReg(rE) );
delta += 1+1;
DIP( "mpsadbw $%d, %s,%s\n", imm8,
- nameXMMReg( eregOfRexRM(pfx, modrm) ),
- nameXMMReg( gregOfRexRM(pfx, modrm) ) );
+ nameXMMReg(rE), nameXMMReg(rG) );
} else {
addr = disAMode( &alen, vbi, pfx, delta, dis_buf,
1/* imm8 is 1 byte after the amode */ );
@@ -17628,65 +17730,10 @@
assign( src_vec, loadLE( Ity_V128, mkexpr(addr) ) );
imm8 = (Int)getUChar(delta+alen);
delta += alen+1;
- DIP( "mpsadbw $%d, %s,%s\n",
- imm8, dis_buf, nameXMMReg( gregOfRexRM(pfx, modrm) ) );
+ DIP( "mpsadbw $%d, %s,%s\n", imm8, dis_buf, nameXMMReg(rG) );
}
- /* Mask out bits of the operands we don't need. This isn't
- strictly necessary, but it does ensure Memcheck doesn't
- give us any false uninitialised value errors as a
- result. */
- UShort src_mask[4] = { 0x000F, 0x00F0, 0x0F00, 0xF000 };
- UShort dst_mask[2] = { 0x07FF, 0x7FF0 };
-
- IRTemp src_maskV = newTemp(Ity_V128);
- IRTemp dst_maskV = newTemp(Ity_V128);
- assign(src_maskV, mkV128( src_mask[ imm8 & 3 ] ));
- assign(dst_maskV, mkV128( dst_mask[ (imm8 >> 2) & 1 ] ));
-
- IRTemp src_masked = newTemp(Ity_V128);
- IRTemp dst_masked = newTemp(Ity_V128);
- assign(src_masked,
- binop(Iop_AndV128, mkexpr(src_vec), mkexpr(src_maskV)));
- assign(dst_masked,
- binop(Iop_AndV128, mkexpr(dst_vec), mkexpr(dst_maskV)));
-
- /* Generate 4 64 bit values that we can hand to a clean helper */
- IRTemp sHi = newTemp(Ity_I64);
- IRTemp sLo = newTemp(Ity_I64);
- assign( sHi, unop(Iop_V128HIto64, mkexpr(src_masked)) );
- assign( sLo, unop(Iop_V128to64, mkexpr(src_masked)) );
-
- IRTemp dHi = newTemp(Ity_I64);
- IRTemp dLo = newTemp(Ity_I64);
- assign( dHi, unop(Iop_V128HIto64, mkexpr(dst_masked)) );
- assign( dLo, unop(Iop_V128to64, mkexpr(dst_masked)) );
-
- /* Compute halves of the result separately */
- IRTemp resHi = newTemp(Ity_I64);
- IRTemp resLo = newTemp(Ity_I64);
-
- IRExpr** argsHi
- = mkIRExprVec_5( mkexpr(sHi), mkexpr(sLo), mkexpr(dHi), mkexpr(dLo),
- mkU64( 0x80 | (imm8 & 7) ));
- IRExpr** argsLo
- = mkIRExprVec_5( mkexpr(sHi), mkexpr(sLo), mkexpr(dHi), mkexpr(dLo),
- mkU64( 0x00 | (imm8 & 7) ));
-
- assign(resHi, mkIRExprCCall( Ity_I64, 0/*regparm*/,
- "amd64g_calc_mpsadbw",
- &amd64g_calc_mpsadbw,
- argsHi ));
- assign(resLo, mkIRExprCCall( Ity_I64, 0/*regparm*/,
- "amd64g_calc_mpsadbw",
- &amd64g_calc_mpsadbw,
- argsLo ));
-
- IRTemp res = newTemp(Ity_V128);
- assign(res, binop(Iop_64HLtoV128, mkexpr(resHi), mkexpr(resLo)));
-
- putXMMReg( gregOfRexRM( pfx, modrm ), mkexpr(res) );
-
+ putXMMReg( rG, mkexpr( math_MPSADBW_128(dst_vec, src_vec, imm8) ) );
goto decode_success;
}
break;
@@ -22686,6 +22733,18 @@
}
break;
+ case 0x63:
+ /* VPACKSSWB r/m, rV, r ::: r = QNarrowBin16Sto8Sx16(rV, r/m) */
+ /* VPACKSSWB = VEX.NDS.128.66.0F.WIG 63 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG(
+ uses_vvvv, vbi, pfx, delta, "vpacksswb",
+ Iop_QNarrowBin16Sto8Sx16, NULL,
+ False/*!invertLeftArg*/, True/*swapArgs*/ );
+ goto decode_success;
+ }
+ break;
+
case 0x64:
/* VPCMPGTB r/m, rV, r ::: r = rV `>s-by-8s` r/m */
/* VPCMPGTB = VEX.NDS.128.66.0F.WIG 64 /r */
@@ -23763,6 +23822,15 @@
}
break;
+ case 0xE0:
+ /* VPAVGB xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG E0 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AVX128_E_V_to_G(
+ uses_vvvv, vbi, pfx, delta, "vpavgb", Iop_Avg8Ux16 );
+ goto decode_success;
+ }
+ break;
+
case 0xE1:
/* VPSRAW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG E1 /r */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
@@ -23783,6 +23851,15 @@
}
break;
+ case 0xE3:
+ /* VPAVGW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG E3 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AVX128_E_V_to_G(
+ uses_vvvv, vbi, pfx, delta, "vpavgw", Iop_Avg16Ux8 );
+ goto decode_success;
+ }
+ break;
+
case 0xE4:
/* VPMULHUW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG E4 /r */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
@@ -23874,7 +23951,7 @@
uses_vvvv, vbi, pfx, delta, "vpsubsb", Iop_QSub8Sx16 );
goto decode_success;
}
- break;
+ break;
case 0xE9:
/* VPSUBSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG E9 /r */
@@ -23905,6 +23982,24 @@
}
break;
+ case 0xEC:
+ /* VPADDSB xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG EC /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AVX128_E_V_to_G(
+ uses_vvvv, vbi, pfx, delta, "vpaddsb", Iop_QAdd8Sx16 );
+ goto decode_success;
+ }
+ break;
+
+ case 0xED:
+ /* VPADDSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG ED /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AVX128_E_V_to_G(
+ uses_vvvv, vbi, pfx, delta, "vpaddsw", Iop_QAdd16Sx8 );
+ goto decode_success;
+ }
+ break;
+
case 0xEE:
/* VPMAXSW r/m, rV, r ::: r = max-signed16s(rV, r/m) */
/* VPMAXSW = VEX.NDS.128.66.0F.WIG EE /r */
@@ -24196,6 +24291,42 @@
}
break;
+ case 0x01:
+ case 0x02:
+ case 0x03:
+ /* VPHADDW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 01 /r */
+ /* VPHADDD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 02 /r */
+ /* VPHADDSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 03 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_PHADD_128( vbi, pfx, delta, True/*isAvx*/, opc );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ break;
+
+ case 0x04:
+ /* VPMADDUBSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 04 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG_complex(
+ uses_vvvv, vbi, pfx, delta, "vpmaddubsw",
+ math_PMADDUBSW_128 );
+ goto decode_success;
+ }
+ break;
+
+ case 0x05:
+ case 0x06:
+ case 0x07:
+ /* VPHSUBW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 05 /r */
+ /* VPHSUBD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 06 /r */
+ /* VPHSUBSW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 07 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_PHADD_128( vbi, pfx, delta, True/*isAvx*/, opc );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ break;
+
case 0x0C:
/* VPERMILPS xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.W0 0C /r */
if (have66noF2noF3(pfx)
@@ -24406,6 +24537,26 @@
}
break;
+ case 0x1C:
+ /* VPABSB xmm2/m128, xmm1 = VEX.128.66.0F38.WIG 1C /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AVX128_E_to_G_unary(
+ uses_vvvv, vbi, pfx, delta,
+ "vpabsb", math_PABS_XMM_pap1 );
+ goto decode_success;
+ }
+ break;
+
+ case 0x1D:
+ /* VPABSW xmm2/m128, xmm1 = VEX.128.66.0F38.WIG 1D /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AVX128_E_to_G_unary(
+ uses_vvvv, vbi, pfx, delta,
+ "vpabsw", math_PABS_XMM_pap2 );
+ goto decode_success;
+ }
+ break;
+
case 0x1E:
/* VPABSD xmm2/m128, xmm1 = VEX.128.66.0F38.WIG 1E /r */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
@@ -24436,6 +24587,15 @@
}
break;
+ case 0x22:
+ /* VPMOVSXBQ xmm2/m16, xmm1 */
+ /* VPMOVSXBQ = VEX.128.66.0F38.WIG 22 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_PMOVSXBQ_128( vbi, pfx, delta, True/*isAvx*/ );
+ goto decode_success;
+ }
+ break;
+
case 0x23:
/* VPMOVSXWD xmm2/m64, xmm1 = VEX.128.66.0F38.WIG 23 /r */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
@@ -24445,6 +24605,14 @@
}
break;
+ case 0x24:
+ /* VPMOVSXWQ xmm2/m32, xmm1 = VEX.128.66.0F38.WIG 24 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_PMOVSXWQ_128( vbi, pfx, delta, True/*isAvx*/ );
+ goto decode_success;
+ }
+ break;
+
case 0x25:
/* VPMOVSXDQ xmm2/m64, xmm1 = VEX.128.66.0F38.WIG 25 /r */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
@@ -24491,6 +24659,18 @@
}
break;
+ case 0x2B:
+ /* VPACKUSDW r/m, rV, r ::: r = QNarrowBin32Sto16Ux8(rV, r/m) */
+ /* VPACKUSDW = VEX.NDS.128.66.0F38.WIG 2B /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG(
+ uses_vvvv, vbi, pfx, delta, "vpackusdw",
+ Iop_QNarrowBin32Sto16Ux8, NULL,
+ False/*!invertLeftArg*/, True/*swapArgs*/ );
+ goto decode_success;
+ }
+ break;
+
case 0x30:
/* VPMOVZXBW xmm2/m64, xmm1 */
/* VPMOVZXBW = VEX.128.66.0F38.WIG 30 /r */
@@ -24511,6 +24691,15 @@
}
break;
+ case 0x32:
+ /* VPMOVZXBQ xmm2/m16, xmm1 */
+ /* VPMOVZXBQ = VEX.128.66.0F38.WIG 32 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_PMOVZXBQ_128( vbi, pfx, delta, True/*isAvx*/ );
+ goto decode_success;
+ }
+ break;
+
case 0x33:
/* VPMOVZXWD xmm2/m64, xmm1 */
/* VPMOVZXWD = VEX.128.66.0F38.WIG 33 /r */
@@ -24521,6 +24710,23 @@
}
break;
+ case 0x34:
+ /* VPMOVZXWQ xmm2/m32, xmm1 = VEX.128.66.0F38.WIG 34 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_PMOVZXWQ_128( vbi, pfx, delta, True/*isAvx*/ );
+ goto decode_success;
+ }
+ break;
+
+ case 0x35:
+ /* VPMOVZXDQ xmm2/m64, xmm1 = VEX.128.66.0F38.WIG 35 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_PMOVxXDQ_128( vbi, pfx, delta,
+ True/*isAvx*/, True/*xIsZ*/ );
+ goto decode_success;
+ }
+ break;
+
case 0x37:
/* VPCMPGTQ r/m, rV, r ::: r = rV `>s-by-64s` r/m */
/* VPCMPGTQ = VEX.NDS.128.66.0F38.WIG 37 /r */
@@ -25663,6 +25869,45 @@
}
break;
+ case 0x42:
+ /* VMPSADBW imm8, xmm3/m128,xmm2,xmm1 */
+ /* VMPSADBW = VEX.NDS.128.66.0F3A.WIG 42 /r ib */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ UChar modrm = getUChar(delta);
+ Int imm8;
+ IRTemp src_vec = newTemp(Ity_V128);
+ IRTemp dst_vec = newTemp(Ity_V128);
+ UInt rG = gregOfRexRM(pfx, modrm);
+ UInt rV = getVexNvvvv(pfx);
+
+ assign( dst_vec, getXMMReg(rV) );
+
+ if ( epartIsReg( modrm ) ) {
+ UInt rE = eregOfRexRM(pfx, modrm);
+
+ imm8 = (Int)getUChar(delta+1);
+ assign( src_vec, getXMMReg(rE) );
+ delta += 1+1;
+ DIP( "vmpsadbw $%d, %s,%s,%s\n", imm8,
+ nameXMMReg(rE), nameXMMReg(rV), nameXMMReg(rG) );
+ } else {
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf,
+ 1/* imm8 is 1 byte after the amode */ );
+ gen_SEGV_if_not_16_aligned( addr );
+ assign( src_vec, loadLE( Ity_V128, mkexpr(addr) ) );
+ imm8 = (Int)getUChar(delta+alen);
+ delta += alen+1;
+ DIP( "vmpsadbw $%d, %s,%s,%s\n", imm8,
+ dis_buf, nameXMMReg(rV), nameXMMReg(rG) );
+ }
+
+ putYMMRegLoAndZU( rG, mkexpr( math_MPSADBW_128(dst_vec,
+ src_vec, imm8) ) );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ break;
+
case 0x4A:
/* VBLENDVPS xmmG, xmmE/memE, xmmV, xmmIS4
::: xmmG:V128 = PBLEND(xmmE, xmmV, xmmIS4) (RMVR) */
From: <sv...@va...> - 2012-06-24 14:01:03
sewardj 2012-06-24 15:00:56 +0100 (Sun, 24 Jun 2012)
New Revision: 12669
Log:
Update.
Modified files:
trunk/none/tests/amd64/avx-1.c
Modified: trunk/none/tests/amd64/avx-1.c (+84 -0)
===================================================================
--- trunk/none/tests/amd64/avx-1.c 2012-06-24 14:44:35 +01:00 (rev 12668)
+++ trunk/none/tests/amd64/avx-1.c 2012-06-24 15:00:56 +01:00 (rev 12669)
@@ -283,6 +283,9 @@
GEN_test_Monly(VMOVNTDQ_128,
"vmovntdq %%xmm8, (%%rax)")
+GEN_test_Monly(VMOVNTDQ_256,
+ "vmovntdq %%ymm8, (%%rax)")
+
GEN_test_RandM(VMOVUPS_XMM_to_XMMorMEM,
"vmovups %%xmm8, %%xmm7",
"vmovups %%xmm9, (%%rax)")
@@ -329,6 +332,9 @@
GEN_test_Monly(VMOVHPD_128_StoreForm,
"vmovhpd %%xmm8, (%%rax)")
+GEN_test_Monly(VMOVHPS_128_StoreForm,
+ "vmovhps %%xmm8, (%%rax)")
+
GEN_test_RandM(VPCMPEQB_128,
"vpcmpeqb %%xmm9, %%xmm8, %%xmm7",
"vpcmpeqb (%%rax), %%xmm8, %%xmm7")
@@ -353,10 +359,34 @@
"vmaxps %%xmm9, %%xmm8, %%xmm7",
"vmaxps (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VMAXPS_256,
+ "vmaxps %%ymm9, %%ymm8, %%ymm7",
+ "vmaxps (%%rax), %%ymm8, %%ymm7")
+
+GEN_test_RandM(VMAXPD_128,
+ "vmaxpd %%xmm9, %%xmm8, %%xmm7",
+ "vmaxpd (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VMAXPD_256,
+ "vmaxpd %%ymm9, %%ymm8, %%ymm7",
+ "vmaxpd (%%rax), %%ymm8, %%ymm7")
+
GEN_test_RandM(VMINPS_128,
"vminps %%xmm9, %%xmm8, %%xmm7",
"vminps (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VMINPS_256,
+ "vminps %%ymm9, %%ymm8, %%ymm7",
+ "vminps (%%rax), %%ymm8, %%ymm7")
+
+GEN_test_RandM(VMINPD_128,
+ "vminpd %%xmm9, %%xmm8, %%xmm7",
+ "vminpd (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VMINPD_256,
+ "vminpd %%ymm9, %%ymm8, %%ymm7",
+ "vminpd (%%rax), %%ymm8, %%ymm7")
+
GEN_test_RandM(VCVTPS2DQ_128,
"vcvtps2dq %%xmm8, %%xmm7",
"vcvtps2dq (%%rax), %%xmm8")
@@ -996,6 +1026,9 @@
GEN_test_Monly(VMOVHPD_128_LoadForm,
"vmovhpd (%%rax), %%xmm8, %%xmm7")
+GEN_test_Monly(VMOVHPS_128_LoadForm,
+ "vmovhps (%%rax), %%xmm8, %%xmm7")
+
// The y suffix denotes a 256 -> 128 operation
GEN_test_RandM(VCVTPD2PS_256,
"vcvtpd2psy %%ymm8, %%xmm7",
@@ -1796,7 +1829,39 @@
GEN_test_Monly(VLDDQU_256,
"vlddqu 1(%%rax), %%ymm8")
+GEN_test_Monly(VMOVNTDQA_128,
+ "vmovntdqa (%%rax), %%xmm9")
+GEN_test_Monly(VMASKMOVDQU_128,
+ "xchgq %%rax, %%rdi;"
+ "vmaskmovdqu %%xmm8, %%xmm9;"
+ "xchgq %%rax, %%rdi")
+
+GEN_test_Ronly(VMOVMSKPD_128,
+ "vmovmskpd %%xmm9, %%r14d")
+
+GEN_test_Ronly(VMOVMSKPD_256,
+ "vmovmskpd %%ymm9, %%r14d")
+
+GEN_test_Ronly(VMOVMSKPS_128,
+ "vmovmskps %%xmm9, %%r14d")
+
+GEN_test_Ronly(VMOVMSKPS_256,
+ "vmovmskps %%ymm9, %%r14d")
+
+GEN_test_Monly(VMOVNTPD_128,
+ "vmovntpd %%xmm9, (%%rax)")
+
+GEN_test_Monly(VMOVNTPD_256,
+ "vmovntpd %%ymm9, (%%rax)")
+
+GEN_test_Monly(VMOVNTPS_128,
+ "vmovntps %%xmm9, (%%rax)")
+
+GEN_test_Monly(VMOVNTPS_256,
+ "vmovntps %%ymm9, (%%rax)")
+
+
/* Comment duplicated above, for convenient reference:
Allowed operands in test insns:
Reg form: %ymm6, %ymm7, %ymm8, %ymm9 and %r14.
@@ -2236,5 +2301,24 @@
DO_D( VEXTRACTPS_0x3 );
DO_D( VLDDQU_128 );
DO_D( VLDDQU_256 );
+ DO_D( VMAXPS_256 );
+ DO_D( VMAXPD_128 );
+ DO_D( VMAXPD_256 );
+ DO_D( VMINPS_256 );
+ DO_D( VMINPD_128 );
+ DO_D( VMINPD_256 );
+ DO_D( VMOVHPS_128_StoreForm );
+ DO_D( VMOVNTDQ_256 );
+ DO_D( VMOVHPS_128_LoadForm );
+ DO_D( VMOVNTDQA_128 );
+ DO_D( VMASKMOVDQU_128 );
+ DO_D( VMOVMSKPD_128 );
+ DO_D( VMOVMSKPD_256 );
+ DO_D( VMOVMSKPS_128 );
+ DO_D( VMOVMSKPS_256 );
+ DO_D( VMOVNTPD_128 );
+ DO_D( VMOVNTPD_256 );
+ DO_D( VMOVNTPS_128 );
+ DO_D( VMOVNTPS_256 );
return 0;
}
From: <sv...@va...> - 2012-06-24 14:00:37
sewardj 2012-06-24 15:00:27 +0100 (Sun, 24 Jun 2012)
New Revision: 2407
Log:
Even more AVX insns:
VMOVHPS m64, xmm1, xmm2 = VEX.NDS.128.0F.WIG 16 /r
VMOVHPS xmm1, m64 = VEX.128.0F.WIG 17 /r
VMOVNTPD xmm1, m128 = VEX.128.66.0F.WIG 2B /r
VMOVNTPS xmm1, m128 = VEX.128.0F.WIG 2B /r
VMOVNTPD ymm1, m256 = VEX.256.66.0F.WIG 2B /r
VMOVNTPS ymm1, m256 = VEX.256.0F.WIG 2B /r
VMOVMSKPD xmm2, r32 = VEX.128.66.0F.WIG 50 /r
VMOVMSKPD ymm2, r32 = VEX.256.66.0F.WIG 50 /r
VMOVMSKPS xmm2, r32 = VEX.128.0F.WIG 50 /r
VMOVMSKPS ymm2, r32 = VEX.256.0F.WIG 50 /r
VMINPS ymm3/m256, ymm2, ymm1 = VEX.NDS.256.0F.WIG 5D /r
VMINPD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG 5D /r
VMINPD ymm3/m256, ymm2, ymm1 = VEX.NDS.256.66.0F.WIG 5D /r
VMAXPS ymm3/m256, ymm2, ymm1 = VEX.NDS.256.0F.WIG 5F /r
VMAXPD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG 5F /r
VMAXPD ymm3/m256, ymm2, ymm1 = VEX.NDS.256.66.0F.WIG 5F /r
VMOVNTDQ ymm1, m256 = VEX.256.66.0F.WIG E7 /r
VMASKMOVDQU xmm2, xmm1 = VEX.128.66.0F.WIG F7 /r
VMOVNTDQA m128, xmm1 = VEX.128.66.0F38.WIG 2A /r
(Jakub Jelinek, ja...@re...), #273475 comment 135.
Modified files:
trunk/priv/guest_amd64_toIR.c
trunk/priv/host_amd64_isel.c
trunk/priv/ir_defs.c
trunk/pub/libvex_ir.h
Modified: trunk/pub/libvex_ir.h (+4 -1)
===================================================================
--- trunk/pub/libvex_ir.h 2012-06-24 14:44:17 +01:00 (rev 2406)
+++ trunk/pub/libvex_ir.h 2012-06-24 15:00:27 +01:00 (rev 2407)
@@ -1453,7 +1453,10 @@
Iop_Sqrt32Fx8,
Iop_Sqrt64Fx4,
- Iop_RSqrt32Fx8
+ Iop_RSqrt32Fx8,
+
+ Iop_Max32Fx8, Iop_Min32Fx8,
+ Iop_Max64Fx4, Iop_Min64Fx4
}
IROp;
Modified: trunk/priv/guest_amd64_toIR.c (+337 -102)
===================================================================
--- trunk/priv/guest_amd64_toIR.c 2012-06-24 14:44:17 +01:00 (rev 2406)
+++ trunk/priv/guest_amd64_toIR.c 2012-06-24 15:00:27 +01:00 (rev 2407)
@@ -1493,6 +1493,11 @@
return IRExpr_Get( ymmGuestRegLane128offset(ymmreg,laneno), Ity_V128 );
}
+static IRExpr* getYMMRegLane32 ( UInt ymmreg, Int laneno )
+{
+ return IRExpr_Get( ymmGuestRegLane32offset(ymmreg,laneno), Ity_I32 );
+}
+
static void putYMMReg ( UInt ymmreg, IRExpr* e )
{
vassert(typeOfIRExpr(irsb->tyenv,e) == Ity_V256);
@@ -10854,6 +10859,183 @@
}
+static Long dis_MASKMOVDQU ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx )
+{
+ IRTemp regD = newTemp(Ity_V128);
+ IRTemp mask = newTemp(Ity_V128);
+ IRTemp olddata = newTemp(Ity_V128);
+ IRTemp newdata = newTemp(Ity_V128);
+ IRTemp addr = newTemp(Ity_I64);
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rE = eregOfRexRM(pfx,modrm);
+
+ assign( addr, handleAddrOverrides( vbi, pfx, getIReg64(R_RDI) ));
+ assign( regD, getXMMReg( rG ));
+
+ /* Unfortunately can't do the obvious thing with SarN8x16
+ here since that can't be re-emitted as SSE2 code - no such
+ insn. */
+ assign( mask,
+ binop(Iop_64HLtoV128,
+ binop(Iop_SarN8x8,
+ getXMMRegLane64( rE, 1 ),
+ mkU8(7) ),
+ binop(Iop_SarN8x8,
+ getXMMRegLane64( rE, 0 ),
+ mkU8(7) ) ));
+ assign( olddata, loadLE( Ity_V128, mkexpr(addr) ));
+ assign( newdata, binop(Iop_OrV128,
+ binop(Iop_AndV128,
+ mkexpr(regD),
+ mkexpr(mask) ),
+ binop(Iop_AndV128,
+ mkexpr(olddata),
+ unop(Iop_NotV128, mkexpr(mask)))) );
+ storeLE( mkexpr(addr), mkexpr(newdata) );
+
+ delta += 1;
+ DIP("%smaskmovdqu %s,%s\n", isAvx ? "v" : "",
+ nameXMMReg(rE), nameXMMReg(rG) );
+ return delta;
+}
+
+
+static Long dis_MOVMSKPS_128 ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx )
+{
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rE = eregOfRexRM(pfx,modrm);
+ IRTemp t0 = newTemp(Ity_I32);
+ IRTemp t1 = newTemp(Ity_I32);
+ IRTemp t2 = newTemp(Ity_I32);
+ IRTemp t3 = newTemp(Ity_I32);
+ delta += 1;
+ assign( t0, binop( Iop_And32,
+ binop(Iop_Shr32, getXMMRegLane32(rE,0), mkU8(31)),
+ mkU32(1) ));
+ assign( t1, binop( Iop_And32,
+ binop(Iop_Shr32, getXMMRegLane32(rE,1), mkU8(30)),
+ mkU32(2) ));
+ assign( t2, binop( Iop_And32,
+ binop(Iop_Shr32, getXMMRegLane32(rE,2), mkU8(29)),
+ mkU32(4) ));
+ assign( t3, binop( Iop_And32,
+ binop(Iop_Shr32, getXMMRegLane32(rE,3), mkU8(28)),
+ mkU32(8) ));
+ putIReg32( rG, binop(Iop_Or32,
+ binop(Iop_Or32, mkexpr(t0), mkexpr(t1)),
+ binop(Iop_Or32, mkexpr(t2), mkexpr(t3)) ) );
+ DIP("%smovmskps %s,%s\n", isAvx ? "v" : "",
+ nameXMMReg(rE), nameIReg32(rG));
+ return delta;
+}
+
+
+static Long dis_MOVMSKPS_256 ( VexAbiInfo* vbi, Prefix pfx, Long delta )
+{
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rE = eregOfRexRM(pfx,modrm);
+ IRTemp t0 = newTemp(Ity_I32);
+ IRTemp t1 = newTemp(Ity_I32);
+ IRTemp t2 = newTemp(Ity_I32);
+ IRTemp t3 = newTemp(Ity_I32);
+ IRTemp t4 = newTemp(Ity_I32);
+ IRTemp t5 = newTemp(Ity_I32);
+ IRTemp t6 = newTemp(Ity_I32);
+ IRTemp t7 = newTemp(Ity_I32);
+ delta += 1;
+ assign( t0, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,0), mkU8(31)),
+ mkU32(1) ));
+ assign( t1, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,1), mkU8(30)),
+ mkU32(2) ));
+ assign( t2, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,2), mkU8(29)),
+ mkU32(4) ));
+ assign( t3, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,3), mkU8(28)),
+ mkU32(8) ));
+ assign( t4, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,4), mkU8(27)),
+ mkU32(16) ));
+ assign( t5, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,5), mkU8(26)),
+ mkU32(32) ));
+ assign( t6, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,6), mkU8(25)),
+ mkU32(64) ));
+ assign( t7, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,7), mkU8(24)),
+ mkU32(128) ));
+ putIReg32( rG, binop(Iop_Or32,
+ binop(Iop_Or32,
+ binop(Iop_Or32, mkexpr(t0), mkexpr(t1)),
+ binop(Iop_Or32, mkexpr(t2), mkexpr(t3)) ),
+ binop(Iop_Or32,
+ binop(Iop_Or32, mkexpr(t4), mkexpr(t5)),
+ binop(Iop_Or32, mkexpr(t6), mkexpr(t7)) ) ) );
+ DIP("vmovmskps %s,%s\n", nameYMMReg(rE), nameIReg32(rG));
+ return delta;
+}
+
+
+static Long dis_MOVMSKPD_128 ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx )
+{
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rE = eregOfRexRM(pfx,modrm);
+ IRTemp t0 = newTemp(Ity_I32);
+ IRTemp t1 = newTemp(Ity_I32);
+ delta += 1;
+ assign( t0, binop( Iop_And32,
+ binop(Iop_Shr32, getXMMRegLane32(rE,1), mkU8(31)),
+ mkU32(1) ));
+ assign( t1, binop( Iop_And32,
+ binop(Iop_Shr32, getXMMRegLane32(rE,3), mkU8(30)),
+ mkU32(2) ));
+ putIReg32( rG, binop(Iop_Or32, mkexpr(t0), mkexpr(t1) ) );
+ DIP("%smovmskpd %s,%s\n", isAvx ? "v" : "",
+ nameXMMReg(rE), nameIReg32(rG));
+ return delta;
+}
+
+
+static Long dis_MOVMSKPD_256 ( VexAbiInfo* vbi, Prefix pfx, Long delta )
+{
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rE = eregOfRexRM(pfx,modrm);
+ IRTemp t0 = newTemp(Ity_I32);
+ IRTemp t1 = newTemp(Ity_I32);
+ IRTemp t2 = newTemp(Ity_I32);
+ IRTemp t3 = newTemp(Ity_I32);
+ delta += 1;
+ assign( t0, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,1), mkU8(31)),
+ mkU32(1) ));
+ assign( t1, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,3), mkU8(30)),
+ mkU32(2) ));
+ assign( t2, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,5), mkU8(29)),
+ mkU32(4) ));
+ assign( t3, binop( Iop_And32,
+ binop(Iop_Shr32, getYMMRegLane32(rE,7), mkU8(28)),
+ mkU32(8) ));
+ putIReg32( rG, binop(Iop_Or32,
+ binop(Iop_Or32, mkexpr(t0), mkexpr(t1)),
+ binop(Iop_Or32, mkexpr(t2), mkexpr(t3)) ) );
+ DIP("vmovmskpd %s,%s\n", nameYMMReg(rE), nameIReg32(rG));
+ return delta;
+}
+
+
/* Note, this also handles SSE(1) insns. */
__attribute__((noinline))
static
@@ -11730,7 +11912,8 @@
case 0x50:
/* 0F 50 = MOVMSKPS - move 4 sign bits from 4 x F32 in xmm(E)
to 4 lowest bits of ireg(G) */
- if (haveNo66noF2noF3(pfx) && (sz == 4 || sz == 8)) {
+ if (haveNo66noF2noF3(pfx) && (sz == 4 || sz == 8)
+ && epartIsReg(getUChar(delta))) {
/* sz == 8 is a kludge to handle insns with REX.W redundantly
set to 1, which has been known to happen:
@@ -11749,38 +11932,8 @@
AMD docs give no indication that REX.W is even valid for this
insn. */
- modrm = getUChar(delta);
- if (epartIsReg(modrm)) {
- Int src;
- t0 = newTemp(Ity_I32);
- t1 = newTemp(Ity_I32);
- t2 = newTemp(Ity_I32);
- t3 = newTemp(Ity_I32);
- delta += 1;
- src = eregOfRexRM(pfx,modrm);
- assign( t0, binop( Iop_And32,
- binop(Iop_Shr32, getXMMRegLane32(src,0), mkU8(31)),
- mkU32(1) ));
- assign( t1, binop( Iop_And32,
- binop(Iop_Shr32, getXMMRegLane32(src,1), mkU8(30)),
- mkU32(2) ));
- assign( t2, binop( Iop_And32,
- binop(Iop_Shr32, getXMMRegLane32(src,2), mkU8(29)),
- mkU32(4) ));
- assign( t3, binop( Iop_And32,
- binop(Iop_Shr32, getXMMRegLane32(src,3), mkU8(28)),
- mkU32(8) ));
- putIReg32( gregOfRexRM(pfx,modrm),
- binop(Iop_Or32,
- binop(Iop_Or32, mkexpr(t0), mkexpr(t1)),
- binop(Iop_Or32, mkexpr(t2), mkexpr(t3))
- )
- );
- DIP("movmskps %s,%s\n", nameXMMReg(src),
- nameIReg32(gregOfRexRM(pfx,modrm)));
- goto decode_success;
- }
- /* else fall through */
+ delta = dis_MOVMSKPS_128( vbi, pfx, delta, False/*!isAvx*/ );
+ goto decode_success;
}
/* 66 0F 50 = MOVMSKPD - move 2 sign bits from 2 x F64 in xmm(E) to
2 lowest bits of ireg(G) */
@@ -11790,27 +11943,8 @@
66 4c 0f 50 d9 rex64X movmskpd %xmm1,%r11d
20071106: see further comments on MOVMSKPS implementation above.
*/
- modrm = getUChar(delta);
- if (epartIsReg(modrm)) {
- Int src;
- t0 = newTemp(Ity_I32);
- t1 = newTemp(Ity_I32);
- delta += 1;
- src = eregOfRexRM(pfx,modrm);
- assign( t0, binop( Iop_And32,
- binop(Iop_Shr32, getXMMRegLane32(src,1), mkU8(31)),
- mkU32(1) ));
- assign( t1, binop( Iop_And32,
- binop(Iop_Shr32, getXMMRegLane32(src,3), mkU8(30)),
- mkU32(2) ));
- putIReg32( gregOfRexRM(pfx,modrm),
- binop(Iop_Or32, mkexpr(t0), mkexpr(t1))
- );
- DIP("movmskpd %s,%s\n", nameXMMReg(src),
- nameIReg32(gregOfRexRM(pfx,modrm)));
- goto decode_success;
- }
- /* else fall through */
+ delta = dis_MOVMSKPD_128( vbi, pfx, delta, False/*!isAvx*/ );
+ goto decode_success;
}
break;
@@ -13731,47 +13865,9 @@
if (ok) goto decode_success;
}
/* 66 0F F7 = MASKMOVDQU -- store selected bytes of double quadword */
- if (have66noF2noF3(pfx) && sz == 2) {
- modrm = getUChar(delta);
- if (epartIsReg(modrm)) {
- IRTemp regD = newTemp(Ity_V128);
- IRTemp mask = newTemp(Ity_V128);
- IRTemp olddata = newTemp(Ity_V128);
- IRTemp newdata = newTemp(Ity_V128);
- addr = newTemp(Ity_I64);
-
- assign( addr, handleAddrOverrides( vbi, pfx, getIReg64(R_RDI) ));
- assign( regD, getXMMReg( gregOfRexRM(pfx,modrm) ));
-
- /* Unfortunately can't do the obvious thing with SarN8x16
- here since that can't be re-emitted as SSE2 code - no such
- insn. */
- assign(
- mask,
- binop(Iop_64HLtoV128,
- binop(Iop_SarN8x8,
- getXMMRegLane64( eregOfRexRM(pfx,modrm), 1 ),
- mkU8(7) ),
- binop(Iop_SarN8x8,
- getXMMRegLane64( eregOfRexRM(pfx,modrm), 0 ),
- mkU8(7) ) ));
- assign( olddata, loadLE( Ity_V128, mkexpr(addr) ));
- assign( newdata,
- binop(Iop_OrV128,
- binop(Iop_AndV128,
- mkexpr(regD),
- mkexpr(mask) ),
- binop(Iop_AndV128,
- mkexpr(olddata),
- unop(Iop_NotV128, mkexpr(mask)))) );
- storeLE( mkexpr(addr), mkexpr(newdata) );
-
- delta += 1;
- DIP("maskmovdqu %s,%s\n", nameXMMReg( eregOfRexRM(pfx,modrm) ),
- nameXMMReg( gregOfRexRM(pfx,modrm) ) );
- goto decode_success;
- }
- /* else fall through */
+ if (have66noF2noF3(pfx) && sz == 2 && epartIsReg(getUChar(delta))) {
+ delta = dis_MASKMOVDQU( vbi, pfx, delta, False/*!isAvx*/ );
+ goto decode_success;
}
break;
@@ -21578,16 +21674,18 @@
*uses_vvvv = True;
goto decode_success;
}
+ /* VMOVHPS m64, xmm1, xmm2 = VEX.NDS.128.0F.WIG 16 /r */
+ /* Insn exists only in mem form, it appears. */
/* VMOVHPD m64, xmm1, xmm2 = VEX.NDS.128.66.0F.WIG 16 /r */
/* Insn exists only in mem form, it appears. */
- if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/
- && !epartIsReg(getUChar(delta))) {
+ if ((have66noF2noF3(pfx) || haveNo66noF2noF3(pfx))
+ && 0==getVexL(pfx)/*128*/ && !epartIsReg(getUChar(delta))) {
UChar modrm = getUChar(delta);
UInt rG = gregOfRexRM(pfx, modrm);
UInt rV = getVexNvvvv(pfx);
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
delta += alen;
- DIP("vmovhpd %s,%s,%s\n",
+ DIP("vmovhp%c %s,%s,%s\n", have66(pfx) ? 'd' : 's',
dis_buf, nameXMMReg(rV), nameXMMReg(rG));
IRTemp res = newTemp(Ity_V128);
assign(res, binop(Iop_64HLtoV128,
@@ -21611,16 +21709,19 @@
break;
case 0x17:
+ /* VMOVHPS xmm1, m64 = VEX.128.0F.WIG 17 /r */
+ /* Insn exists only in mem form, it appears. */
/* VMOVHPD xmm1, m64 = VEX.128.66.0F.WIG 17 /r */
/* Insn exists only in mem form, it appears. */
- if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/
- && !epartIsReg(getUChar(delta))) {
+ if ((have66noF2noF3(pfx) || haveNo66noF2noF3(pfx))
+ && 0==getVexL(pfx)/*128*/ && !epartIsReg(getUChar(delta))) {
UChar modrm = getUChar(delta);
UInt rG = gregOfRexRM(pfx, modrm);
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
delta += alen;
storeLE( mkexpr(addr), getXMMRegLane64( rG, 1));
- DIP("vmovhpd %s,%s\n", nameXMMReg(rG), dis_buf);
+ DIP("vmovhp%c %s,%s\n", have66(pfx) ? 'd' : 's',
+ nameXMMReg(rG), dis_buf);
goto decode_success;
}
break;
@@ -21896,6 +21997,41 @@
break;
}
+ case 0x2B:
+ /* VMOVNTPD xmm1, m128 = VEX.128.66.0F.WIG 2B /r */
+ /* VMOVNTPS xmm1, m128 = VEX.128.0F.WIG 2B /r */
+ if ((have66noF2noF3(pfx) || haveNo66noF2noF3(pfx))
+ && 0==getVexL(pfx)/*128*/ && !epartIsReg(getUChar(delta))) {
+ UChar modrm = getUChar(delta);
+ UInt rS = gregOfRexRM(pfx, modrm);
+ IRTemp tS = newTemp(Ity_V128);
+ assign(tS, getXMMReg(rS));
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ delta += alen;
+ gen_SEGV_if_not_16_aligned(addr);
+ storeLE(mkexpr(addr), mkexpr(tS));
+ DIP("vmovntp%c %s,%s\n", have66(pfx) ? 'd' : 's',
+ nameXMMReg(rS), dis_buf);
+ goto decode_success;
+ }
+ /* VMOVNTPD ymm1, m256 = VEX.256.66.0F.WIG 2B /r */
+ /* VMOVNTPS ymm1, m256 = VEX.256.0F.WIG 2B /r */
+ if ((have66noF2noF3(pfx) || haveNo66noF2noF3(pfx))
+ && 1==getVexL(pfx)/*256*/ && !epartIsReg(getUChar(delta))) {
+ UChar modrm = getUChar(delta);
+ UInt rS = gregOfRexRM(pfx, modrm);
+ IRTemp tS = newTemp(Ity_V256);
+ assign(tS, getYMMReg(rS));
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ delta += alen;
+ gen_SEGV_if_not_32_aligned(addr);
+ storeLE(mkexpr(addr), mkexpr(tS));
+ DIP("vmovntp%c %s,%s\n", have66(pfx) ? 'd' : 's',
+ nameYMMReg(rS), dis_buf);
+ goto decode_success;
+ }
+ break;
+
case 0x2C:
/* VCVTTSD2SI xmm1/m32, r32 = VEX.LIG.F2.0F.W0 2C /r */
if (haveF2no66noF3(pfx) && 0==getRexW(pfx)/*W0*/) {
@@ -21958,6 +22094,29 @@
}
break;
+ case 0x50:
+ /* VMOVMSKPD xmm2, r32 = VEX.128.66.0F.WIG 50 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_MOVMSKPD_128( vbi, pfx, delta, True/*isAvx*/ );
+ goto decode_success;
+ }
+ /* VMOVMSKPD ymm2, r32 = VEX.256.66.0F.WIG 50 /r */
+ if (have66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ delta = dis_MOVMSKPD_256( vbi, pfx, delta );
+ goto decode_success;
+ }
+ /* VMOVMSKPS xmm2, r32 = VEX.128.0F.WIG 50 /r */
+ if (haveNo66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_MOVMSKPS_128( vbi, pfx, delta, True/*isAvx*/ );
+ goto decode_success;
+ }
+ /* VMOVMSKPS ymm2, r32 = VEX.256.0F.WIG 50 /r */
+ if (haveNo66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ delta = dis_MOVMSKPS_256( vbi, pfx, delta );
+ goto decode_success;
+ }
+ break;
+
case 0x51:
/* VSQRTSS xmm3/m64(E), xmm2(V), xmm1(G) = VEX.NDS.LIG.F3.0F.WIG 51 /r */
if (haveF3no66noF2(pfx)) {
@@ -22393,6 +22552,24 @@
uses_vvvv, vbi, pfx, delta, "vminps", Iop_Min32Fx4 );
goto decode_success;
}
+ /* VMINPS ymm3/m256, ymm2, ymm1 = VEX.NDS.256.0F.WIG 5D /r */
+ if (haveNo66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ delta = dis_AVX256_E_V_to_G(
+ uses_vvvv, vbi, pfx, delta, "vminps", Iop_Min32Fx8 );
+ goto decode_success;
+ }
+ /* VMINPD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG 5D /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AVX128_E_V_to_G(
+ uses_vvvv, vbi, pfx, delta, "vminpd", Iop_Min64Fx2 );
+ goto decode_success;
+ }
+ /* VMINPD ymm3/m256, ymm2, ymm1 = VEX.NDS.256.66.0F.WIG 5D /r */
+ if (have66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ delta = dis_AVX256_E_V_to_G(
+ uses_vvvv, vbi, pfx, delta, "vminpd", Iop_Min64Fx4 );
+ goto decode_success;
+ }
break;
case 0x5E:
@@ -22453,6 +22630,24 @@
uses_vvvv, vbi, pfx, delta, "vmaxps", Iop_Max32Fx4 );
goto decode_success;
}
+ /* VMAXPS ymm3/m256, ymm2, ymm1 = VEX.NDS.256.0F.WIG 5F /r */
+ if (haveNo66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ delta = dis_AVX256_E_V_to_G(
+ uses_vvvv, vbi, pfx, delta, "vmaxps", Iop_Max32Fx8 );
+ goto decode_success;
+ }
+ /* VMAXPD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG 5F /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_AVX128_E_V_to_G(
+ uses_vvvv, vbi, pfx, delta, "vmaxpd", Iop_Max64Fx2 );
+ goto decode_success;
+ }
+ /* VMAXPD ymm3/m256, ymm2, ymm1 = VEX.NDS.256.66.0F.WIG 5F /r */
+ if (have66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ delta = dis_AVX256_E_V_to_G(
+ uses_vvvv, vbi, pfx, delta, "vmaxpd", Iop_Max64Fx4 );
+ goto decode_success;
+ }
break;
case 0x60:
@@ -23642,7 +23837,7 @@
break;
case 0xE7:
- /* MOVNTDQ xmm1, m128 = VEX.128.66.0F.WIG E7 /r */
+ /* VMOVNTDQ xmm1, m128 = VEX.128.66.0F.WIG E7 /r */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
UChar modrm = getUChar(delta);
UInt rG = gregOfRexRM(pfx,modrm);
@@ -23656,6 +23851,20 @@
}
/* else fall through */
}
+ /* VMOVNTDQ ymm1, m256 = VEX.256.66.0F.WIG E7 /r */
+ if (have66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ if (!epartIsReg(modrm)) {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
+ gen_SEGV_if_not_32_aligned( addr );
+ storeLE( mkexpr(addr), getYMMReg(rG) );
+         DIP("vmovntdq %s,%s\n", nameYMMReg(rG), dis_buf);
+ delta += alen;
+ goto decode_success;
+ }
+ /* else fall through */
+ }
break;
case 0xE8:
@@ -23796,6 +24005,15 @@
}
break;
+ case 0xF7:
+ /* VMASKMOVDQU xmm2, xmm1 = VEX.128.66.0F.WIG F7 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/
+ && epartIsReg(getUChar(delta))) {
+ delta = dis_MASKMOVDQU( vbi, pfx, delta, True/*isAvx*/ );
+ goto decode_success;
+ }
+ break;
+
case 0xF8:
/* VPSUBB r/m, rV, r ::: r = rV - r/m */
/* VPSUBB = VEX.NDS.128.66.0F.WIG F8 /r */
@@ -24256,6 +24474,23 @@
}
break;
+ case 0x2A:
+ /* VMOVNTDQA m128, xmm1 = VEX.128.66.0F38.WIG 2A /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/
+ && !epartIsReg(getUChar(delta))) {
+ UChar modrm = getUChar(delta);
+ UInt rD = gregOfRexRM(pfx, modrm);
+ IRTemp tD = newTemp(Ity_V128);
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ delta += alen;
+ gen_SEGV_if_not_16_aligned(addr);
+ assign(tD, loadLE(Ity_V128, mkexpr(addr)));
+ DIP("vmovntdqa %s,%s\n", dis_buf, nameXMMReg(rD));
+ putYMMRegLoAndZU(rD, mkexpr(tD));
+ goto decode_success;
+ }
+ break;
+
case 0x30:
/* VPMOVZXBW xmm2/m64, xmm1 */
/* VPMOVZXBW = VEX.128.66.0F38.WIG 30 /r */
Modified: trunk/priv/ir_defs.c (+6 -0)
===================================================================
--- trunk/priv/ir_defs.c 2012-06-24 14:44:17 +01:00 (rev 2406)
+++ trunk/priv/ir_defs.c 2012-06-24 15:00:27 +01:00 (rev 2407)
@@ -591,19 +591,23 @@
case Iop_Div64Fx2: vex_printf("Div64Fx2"); return;
case Iop_Div64F0x2: vex_printf("Div64F0x2"); return;
+ case Iop_Max32Fx8: vex_printf("Max32Fx8"); return;
case Iop_Max32Fx4: vex_printf("Max32Fx4"); return;
case Iop_Max32Fx2: vex_printf("Max32Fx2"); return;
case Iop_PwMax32Fx4: vex_printf("PwMax32Fx4"); return;
case Iop_PwMax32Fx2: vex_printf("PwMax32Fx2"); return;
case Iop_Max32F0x4: vex_printf("Max32F0x4"); return;
+ case Iop_Max64Fx4: vex_printf("Max64Fx4"); return;
case Iop_Max64Fx2: vex_printf("Max64Fx2"); return;
case Iop_Max64F0x2: vex_printf("Max64F0x2"); return;
+ case Iop_Min32Fx8: vex_printf("Min32Fx8"); return;
case Iop_Min32Fx4: vex_printf("Min32Fx4"); return;
case Iop_Min32Fx2: vex_printf("Min32Fx2"); return;
case Iop_PwMin32Fx4: vex_printf("PwMin32Fx4"); return;
case Iop_PwMin32Fx2: vex_printf("PwMin32Fx2"); return;
case Iop_Min32F0x4: vex_printf("Min32F0x4"); return;
+ case Iop_Min64Fx4: vex_printf("Min64Fx4"); return;
case Iop_Min64Fx2: vex_printf("Min64Fx2"); return;
case Iop_Min64F0x2: vex_printf("Min64F0x2"); return;
@@ -2808,6 +2812,8 @@
case Iop_Mul32Fx8: case Iop_Div32Fx8:
case Iop_AndV256: case Iop_OrV256:
case Iop_XorV256:
+ case Iop_Max32Fx8: case Iop_Min32Fx8:
+ case Iop_Max64Fx4: case Iop_Min64Fx4:
BINARY(Ity_V256,Ity_V256, Ity_V256);
case Iop_V256toV128_1: case Iop_V256toV128_0:
Modified: trunk/priv/host_amd64_isel.c (+4 -0)
===================================================================
--- trunk/priv/host_amd64_isel.c 2012-06-24 14:44:17 +01:00 (rev 2406)
+++ trunk/priv/host_amd64_isel.c 2012-06-24 15:00:27 +01:00 (rev 2407)
@@ -3485,6 +3485,8 @@
case Iop_Sub64Fx4: op = Asse_SUBF; goto do_64Fx4;
case Iop_Mul64Fx4: op = Asse_MULF; goto do_64Fx4;
case Iop_Div64Fx4: op = Asse_DIVF; goto do_64Fx4;
+ case Iop_Max64Fx4: op = Asse_MAXF; goto do_64Fx4;
+ case Iop_Min64Fx4: op = Asse_MINF; goto do_64Fx4;
do_64Fx4:
{
HReg argLhi, argLlo, argRhi, argRlo;
@@ -3505,6 +3507,8 @@
case Iop_Sub32Fx8: op = Asse_SUBF; goto do_32Fx8;
case Iop_Mul32Fx8: op = Asse_MULF; goto do_32Fx8;
case Iop_Div32Fx8: op = Asse_DIVF; goto do_32Fx8;
+ case Iop_Max32Fx8: op = Asse_MAXF; goto do_32Fx8;
+ case Iop_Min32Fx8: op = Asse_MINF; goto do_32Fx8;
do_32Fx8:
{
HReg argLhi, argLlo, argRhi, argRlo;
From: <sv...@va...> - 2012-06-24 13:44:45
sewardj 2012-06-24 14:44:35 +0100 (Sun, 24 Jun 2012)
New Revision: 12668
Log:
Update.
Modified files:
trunk/none/tests/amd64/avx-1.c
Modified: trunk/none/tests/amd64/avx-1.c (+123 -0)
===================================================================
--- trunk/none/tests/amd64/avx-1.c 2012-06-24 14:28:04 +01:00 (rev 12667)
+++ trunk/none/tests/amd64/avx-1.c 2012-06-24 14:44:35 +01:00 (rev 12668)
@@ -176,6 +176,14 @@
"vcvttsd2si %%xmm8, %%r14",
"vcvttsd2si (%%rax), %%r14")
+GEN_test_RandM(VCVTSD2SI_32,
+ "vcvtsd2si %%xmm8, %%r14d",
+ "vcvtsd2si (%%rax), %%r14d")
+
+GEN_test_RandM(VCVTSD2SI_64,
+ "vcvtsd2si %%xmm8, %%r14",
+ "vcvtsd2si (%%rax), %%r14")
+
GEN_test_RandM(VPSHUFB_128,
"vpshufb %%xmm6, %%xmm8, %%xmm7",
"vpshufb (%%rax), %%xmm8, %%xmm7")
@@ -248,6 +256,10 @@
"vcvttss2si %%xmm8, %%r14d",
"vcvttss2si (%%rax), %%r14d")
+GEN_test_RandM(VCVTSS2SI_32,
+ "vcvtss2si %%xmm8, %%r14d",
+ "vcvtss2si (%%rax), %%r14d")
+
GEN_test_RandM(VMOVQ_XMMorMEM64_to_XMM,
"vmovq %%xmm7, %%xmm8",
"vmovq (%%rax), %%xmm8")
@@ -303,6 +315,10 @@
"vcvttss2si %%xmm8, %%r14",
"vcvttss2si (%%rax), %%r14")
+GEN_test_RandM(VCVTSS2SI_64,
+ "vcvtss2si %%xmm8, %%r14",
+ "vcvtss2si (%%rax), %%r14")
+
GEN_test_Ronly(VPMOVMSKB_128,
"vpmovmskb %%xmm8, %%r14")
@@ -1120,6 +1136,32 @@
"vdppd $0xF0, %%xmm6, %%xmm8, %%xmm7",
"vdppd $0x73, (%%rax), %%xmm9, %%xmm6")
+GEN_test_RandM(VDPPS_128_1of4,
+ "vdpps $0x00, %%xmm6, %%xmm8, %%xmm7",
+ "vdpps $0xA5, (%%rax), %%xmm9, %%xmm6")
+GEN_test_RandM(VDPPS_128_2of4,
+ "vdpps $0x5A, %%xmm6, %%xmm8, %%xmm7",
+ "vdpps $0xFF, (%%rax), %%xmm9, %%xmm6")
+GEN_test_RandM(VDPPS_128_3of4,
+ "vdpps $0x0F, %%xmm6, %%xmm8, %%xmm7",
+ "vdpps $0x37, (%%rax), %%xmm9, %%xmm6")
+GEN_test_RandM(VDPPS_128_4of4,
+ "vdpps $0xF0, %%xmm6, %%xmm8, %%xmm7",
+ "vdpps $0x73, (%%rax), %%xmm9, %%xmm6")
+
+GEN_test_RandM(VDPPS_256_1of4,
+ "vdpps $0x00, %%ymm6, %%ymm8, %%ymm7",
+ "vdpps $0xA5, (%%rax), %%ymm9, %%ymm6")
+GEN_test_RandM(VDPPS_256_2of4,
+ "vdpps $0x5A, %%ymm6, %%ymm8, %%ymm7",
+ "vdpps $0xFF, (%%rax), %%ymm9, %%ymm6")
+GEN_test_RandM(VDPPS_256_3of4,
+ "vdpps $0x0F, %%ymm6, %%ymm8, %%ymm7",
+ "vdpps $0x37, (%%rax), %%ymm9, %%ymm6")
+GEN_test_RandM(VDPPS_256_4of4,
+ "vdpps $0xF0, %%ymm6, %%ymm8, %%ymm7",
+ "vdpps $0x73, (%%rax), %%ymm9, %%ymm6")
+
GEN_test_Monly(VBROADCASTSS_256,
"vbroadcastss (%%rax), %%ymm8")
@@ -1700,6 +1742,61 @@
"vblendvpd %%ymm9, (%%rax), %%ymm8, %%ymm7")
+GEN_test_RandM(VHADDPS_128,
+ "vhaddps %%xmm6, %%xmm8, %%xmm7",
+ "vhaddps (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VHADDPS_256,
+ "vhaddps %%ymm6, %%ymm8, %%ymm7",
+ "vhaddps (%%rax), %%ymm8, %%ymm7")
+
+GEN_test_RandM(VHADDPD_128,
+ "vhaddpd %%xmm6, %%xmm8, %%xmm7",
+ "vhaddpd (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VHADDPD_256,
+ "vhaddpd %%ymm6, %%ymm8, %%ymm7",
+ "vhaddpd (%%rax), %%ymm8, %%ymm7")
+
+GEN_test_RandM(VHSUBPS_128,
+ "vhsubps %%xmm6, %%xmm8, %%xmm7",
+ "vhsubps (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VHSUBPS_256,
+ "vhsubps %%ymm6, %%ymm8, %%ymm7",
+ "vhsubps (%%rax), %%ymm8, %%ymm7")
+
+GEN_test_RandM(VHSUBPD_128,
+ "vhsubpd %%xmm6, %%xmm8, %%xmm7",
+ "vhsubpd (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VHSUBPD_256,
+ "vhsubpd %%ymm6, %%ymm8, %%ymm7",
+ "vhsubpd (%%rax), %%ymm8, %%ymm7")
+
+GEN_test_RandM(VEXTRACTPS_0x0,
+ "vextractps $0, %%xmm8, %%r14d",
+ "vextractps $0, %%xmm8, (%%rax)")
+
+GEN_test_RandM(VEXTRACTPS_0x1,
+ "vextractps $1, %%xmm8, %%r14d",
+ "vextractps $1, %%xmm8, (%%rax)")
+
+GEN_test_RandM(VEXTRACTPS_0x2,
+ "vextractps $2, %%xmm8, %%r14d",
+ "vextractps $2, %%xmm8, (%%rax)")
+
+GEN_test_RandM(VEXTRACTPS_0x3,
+ "vextractps $3, %%xmm8, %%r14d",
+ "vextractps $3, %%xmm8, (%%rax)")
+
+GEN_test_Monly(VLDDQU_128,
+ "vlddqu 1(%%rax), %%xmm8")
+
+GEN_test_Monly(VLDDQU_256,
+ "vlddqu 1(%%rax), %%ymm8")
+
+
/* Comment duplicated above, for convenient reference:
Allowed operands in test insns:
Reg form: %ymm6, %ymm7, %ymm8, %ymm9 and %r14.
@@ -2113,5 +2210,31 @@
DO_D( VADDSUBPS_256 );
DO_D( VADDSUBPD_128 );
DO_D( VADDSUBPD_256 );
+ DO_D( VCVTSS2SI_64 );
+ DO_D( VCVTSS2SI_32 );
+ DO_D( VCVTSD2SI_32 );
+ DO_D( VCVTSD2SI_64 );
+ DO_D( VDPPS_128_1of4 );
+ DO_D( VDPPS_128_2of4 );
+ DO_D( VDPPS_128_3of4 );
+ DO_D( VDPPS_128_4of4 );
+ DO_D( VDPPS_256_1of4 );
+ DO_D( VDPPS_256_2of4 );
+ DO_D( VDPPS_256_3of4 );
+ DO_D( VDPPS_256_4of4 );
+ DO_D( VHADDPS_128 );
+ DO_D( VHADDPS_256 );
+ DO_D( VHADDPD_128 );
+ DO_D( VHADDPD_256 );
+ DO_D( VHSUBPS_128 );
+ DO_D( VHSUBPS_256 );
+ DO_D( VHSUBPD_128 );
+ DO_D( VHSUBPD_256 );
+ DO_D( VEXTRACTPS_0x0 );
+ DO_D( VEXTRACTPS_0x1 );
+ DO_D( VEXTRACTPS_0x2 );
+ DO_D( VEXTRACTPS_0x3 );
+ DO_D( VLDDQU_128 );
+ DO_D( VLDDQU_256 );
return 0;
}
From: <sv...@va...> - 2012-06-24 13:44:29
sewardj 2012-06-24 14:44:17 +0100 (Sun, 24 Jun 2012)
New Revision: 2406
Log:
Even more AVX insns:
VCVTSD2SI xmm1/m32, r32 = VEX.LIG.F2.0F.W0 2D /r
VCVTSD2SI xmm1/m64, r64 = VEX.LIG.F2.0F.W1 2D /r
VCVTSS2SI xmm1/m32, r32 = VEX.LIG.F3.0F.W0 2D /r
VCVTSS2SI xmm1/m64, r64 = VEX.LIG.F3.0F.W1 2D /r
VHADDPS xmm3/m128, xmm2, xmm1 = VEX.NDS.128.F2.0F.WIG 7C /r
VHSUBPS xmm3/m128, xmm2, xmm1 = VEX.NDS.128.F2.0F.WIG 7D /r
VHADDPS ymm3/m256, ymm2, ymm1 = VEX.NDS.256.F2.0F.WIG 7C /r
VHSUBPS ymm3/m256, ymm2, ymm1 = VEX.NDS.256.F2.0F.WIG 7D /r
VHADDPD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG 7C /r
VHSUBPD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG 7D /r
VHADDPD ymm3/m256, ymm2, ymm1 = VEX.NDS.256.66.0F.WIG 7C /r
VHSUBPD ymm3/m256, ymm2, ymm1 = VEX.NDS.256.66.0F.WIG 7D /r
VLDDQU m256, ymm1 = VEX.256.F2.0F.WIG F0 /r
VLDDQU m128, xmm1 = VEX.128.F2.0F.WIG F0 /r
VEXTRACTPS imm8, xmm1, r32/m32 = VEX.128.66.0F3A.WIG 17 /r ib
VDPPS imm8, xmm3/m128,xmm2,xmm1 = VEX.NDS.128.66.0F3A.WIG 40 /r ib
VDPPS imm8, ymm3/m256,ymm2,ymm1 = VEX.NDS.256.66.0F3A.WIG 40 /r ib
(Jakub Jelinek, ja...@re...), #273475 comment 134.
Modified files:
trunk/priv/guest_amd64_toIR.c
Modified: trunk/priv/guest_amd64_toIR.c (+415 -134)
===================================================================
--- trunk/priv/guest_amd64_toIR.c 2012-06-24 14:27:46 +01:00 (rev 2405)
+++ trunk/priv/guest_amd64_toIR.c 2012-06-24 14:44:17 +01:00 (rev 2406)
@@ -13964,6 +13964,46 @@
}
+static IRTemp math_HADDPS_128 ( IRTemp dV, IRTemp sV, Bool isAdd )
+{
+ IRTemp s3, s2, s1, s0, d3, d2, d1, d0;
+ IRTemp leftV = newTemp(Ity_V128);
+ IRTemp rightV = newTemp(Ity_V128);
+ s3 = s2 = s1 = s0 = d3 = d2 = d1 = d0 = IRTemp_INVALID;
+
+ breakupV128to32s( sV, &s3, &s2, &s1, &s0 );
+ breakupV128to32s( dV, &d3, &d2, &d1, &d0 );
+
+ assign( leftV, mkV128from32s( s2, s0, d2, d0 ) );
+ assign( rightV, mkV128from32s( s3, s1, d3, d1 ) );
+
+ IRTemp res = newTemp(Ity_V128);
+ assign( res, binop(isAdd ? Iop_Add32Fx4 : Iop_Sub32Fx4,
+ mkexpr(leftV), mkexpr(rightV) ) );
+ return res;
+}
+
+
+static IRTemp math_HADDPD_128 ( IRTemp dV, IRTemp sV, Bool isAdd )
+{
+ IRTemp s1, s0, d1, d0;
+ IRTemp leftV = newTemp(Ity_V128);
+ IRTemp rightV = newTemp(Ity_V128);
+ s1 = s0 = d1 = d0 = IRTemp_INVALID;
+
+ breakupV128to64s( sV, &s1, &s0 );
+ breakupV128to64s( dV, &d1, &d0 );
+
+ assign( leftV, binop(Iop_64HLtoV128, mkexpr(s0), mkexpr(d0)) );
+ assign( rightV, binop(Iop_64HLtoV128, mkexpr(s1), mkexpr(d1)) );
+
+ IRTemp res = newTemp(Ity_V128);
+ assign( res, binop(isAdd ? Iop_Add64Fx2 : Iop_Sub64Fx2,
+ mkexpr(leftV), mkexpr(rightV) ) );
+ return res;
+}
+
+
__attribute__((noinline))
static
Long dis_ESC_0F__SSE3 ( Bool* decode_OK,
@@ -14014,83 +14054,51 @@
/* F2 0F 7C = HADDPS -- 32x4 add across from E (mem or xmm) to G (xmm). */
/* F2 0F 7D = HSUBPS -- 32x4 sub across from E (mem or xmm) to G (xmm). */
if (haveF2no66noF3(pfx) && sz == 4) {
- IRTemp e3, e2, e1, e0, g3, g2, g1, g0;
IRTemp eV = newTemp(Ity_V128);
IRTemp gV = newTemp(Ity_V128);
- IRTemp leftV = newTemp(Ity_V128);
- IRTemp rightV = newTemp(Ity_V128);
Bool isAdd = opc == 0x7C;
HChar* str = isAdd ? "add" : "sub";
- e3 = e2 = e1 = e0 = g3 = g2 = g1 = g0 = IRTemp_INVALID;
-
- modrm = getUChar(delta);
+ modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
if (epartIsReg(modrm)) {
- assign( eV, getXMMReg( eregOfRexRM(pfx,modrm)) );
- DIP("h%sps %s,%s\n", str, nameXMMReg(eregOfRexRM(pfx,modrm)),
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( eV, getXMMReg(rE) );
+ DIP("h%sps %s,%s\n", str, nameXMMReg(rE), nameXMMReg(rG));
delta += 1;
} else {
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
assign( eV, loadLE(Ity_V128, mkexpr(addr)) );
- DIP("h%sps %s,%s\n", str, dis_buf,
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("h%sps %s,%s\n", str, dis_buf, nameXMMReg(rG));
delta += alen;
}
- assign( gV, getXMMReg(gregOfRexRM(pfx,modrm)) );
-
- breakupV128to32s( eV, &e3, &e2, &e1, &e0 );
- breakupV128to32s( gV, &g3, &g2, &g1, &g0 );
-
- assign( leftV, mkV128from32s( e2, e0, g2, g0 ) );
- assign( rightV, mkV128from32s( e3, e1, g3, g1 ) );
-
- putXMMReg( gregOfRexRM(pfx,modrm),
- binop(isAdd ? Iop_Add32Fx4 : Iop_Sub32Fx4,
- mkexpr(leftV), mkexpr(rightV) ) );
+ assign( gV, getXMMReg(rG) );
+ putXMMReg( rG, mkexpr( math_HADDPS_128 ( gV, eV, isAdd ) ) );
goto decode_success;
}
/* 66 0F 7C = HADDPD -- 64x2 add across from E (mem or xmm) to G (xmm). */
/* 66 0F 7D = HSUBPD -- 64x2 sub across from E (mem or xmm) to G (xmm). */
if (have66noF2noF3(pfx) && sz == 2) {
- IRTemp e1 = newTemp(Ity_I64);
- IRTemp e0 = newTemp(Ity_I64);
- IRTemp g1 = newTemp(Ity_I64);
- IRTemp g0 = newTemp(Ity_I64);
IRTemp eV = newTemp(Ity_V128);
IRTemp gV = newTemp(Ity_V128);
- IRTemp leftV = newTemp(Ity_V128);
- IRTemp rightV = newTemp(Ity_V128);
Bool isAdd = opc == 0x7C;
HChar* str = isAdd ? "add" : "sub";
-
- modrm = getUChar(delta);
+ modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
if (epartIsReg(modrm)) {
- assign( eV, getXMMReg( eregOfRexRM(pfx,modrm)) );
- DIP("h%spd %s,%s\n", str, nameXMMReg(eregOfRexRM(pfx,modrm)),
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( eV, getXMMReg(rE) );
+ DIP("h%spd %s,%s\n", str, nameXMMReg(rE), nameXMMReg(rG));
delta += 1;
} else {
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
assign( eV, loadLE(Ity_V128, mkexpr(addr)) );
- DIP("h%spd %s,%s\n", str, dis_buf,
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("h%spd %s,%s\n", str, dis_buf, nameXMMReg(rG));
delta += alen;
}
- assign( gV, getXMMReg(gregOfRexRM(pfx,modrm)) );
-
- assign( e1, unop(Iop_V128HIto64, mkexpr(eV) ));
- assign( e0, unop(Iop_V128to64, mkexpr(eV) ));
- assign( g1, unop(Iop_V128HIto64, mkexpr(gV) ));
- assign( g0, unop(Iop_V128to64, mkexpr(gV) ));
-
- assign( leftV, binop(Iop_64HLtoV128, mkexpr(e0),mkexpr(g0)) );
- assign( rightV, binop(Iop_64HLtoV128, mkexpr(e1),mkexpr(g1)) );
-
- putXMMReg( gregOfRexRM(pfx,modrm),
- binop(isAdd ? Iop_Add64Fx2 : Iop_Sub64Fx2,
- mkexpr(leftV), mkexpr(rightV) ) );
+ assign( gV, getXMMReg(rG) );
+ putXMMReg( rG, mkexpr( math_HADDPD_128 ( gV, eV, isAdd ) ) );
goto decode_success;
}
break;
@@ -16804,6 +16812,94 @@
}
+static IRTemp math_DPPS_128 ( IRTemp src_vec, IRTemp dst_vec, UInt imm8 )
+{
+ vassert(imm8 < 256);
+ IRTemp tmp_prod_vec = newTemp(Ity_V128);
+ IRTemp prod_vec = newTemp(Ity_V128);
+ IRTemp sum_vec = newTemp(Ity_V128);
+ IRTemp v3, v2, v1, v0;
+ v3 = v2 = v1 = v0 = IRTemp_INVALID;
+ UShort imm8_perms[16] = { 0x0000, 0x000F, 0x00F0, 0x00FF, 0x0F00,
+ 0x0F0F, 0x0FF0, 0x0FFF, 0xF000, 0xF00F,
+ 0xF0F0, 0xF0FF, 0xFF00, 0xFF0F, 0xFFF0,
+ 0xFFFF };
+
+ assign( tmp_prod_vec,
+ binop( Iop_AndV128,
+ binop( Iop_Mul32Fx4, mkexpr(dst_vec),
+ mkexpr(src_vec) ),
+ mkV128( imm8_perms[((imm8 >> 4)& 15)] ) ) );
+ breakupV128to32s( tmp_prod_vec, &v3, &v2, &v1, &v0 );
+ assign( prod_vec, mkV128from32s( v3, v1, v2, v0 ) );
+
+ assign( sum_vec, binop( Iop_Add32Fx4,
+ binop( Iop_InterleaveHI32x4,
+ mkexpr(prod_vec), mkexpr(prod_vec) ),
+ binop( Iop_InterleaveLO32x4,
+ mkexpr(prod_vec), mkexpr(prod_vec) ) ) );
+
+ IRTemp res = newTemp(Ity_V128);
+ assign( res, binop( Iop_AndV128,
+ binop( Iop_Add32Fx4,
+ binop( Iop_InterleaveHI32x4,
+ mkexpr(sum_vec), mkexpr(sum_vec) ),
+ binop( Iop_InterleaveLO32x4,
+ mkexpr(sum_vec), mkexpr(sum_vec) ) ),
+ mkV128( imm8_perms[ (imm8 & 15) ] ) ) );
+ return res;
+}
+
+
+static Long dis_EXTRACTPS ( VexAbiInfo* vbi, Prefix pfx,
+ Long delta, Bool isAvx )
+{
+ IRTemp addr = IRTemp_INVALID;
+ Int alen = 0;
+ HChar dis_buf[50];
+ UChar modrm = getUChar(delta);
+ Int imm8_10;
+ IRTemp xmm_vec = newTemp(Ity_V128);
+ IRTemp src_dword = newTemp(Ity_I32);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ IRTemp t3, t2, t1, t0;
+ t3 = t2 = t1 = t0 = IRTemp_INVALID;
+
+ assign( xmm_vec, getXMMReg( rG ) );
+ breakupV128to32s( xmm_vec, &t3, &t2, &t1, &t0 );
+
+ if ( epartIsReg( modrm ) ) {
+ imm8_10 = (Int)(getUChar(delta+1) & 3);
+ } else {
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 1 );
+ imm8_10 = (Int)(getUChar(delta+alen) & 3);
+ }
+
+ switch ( imm8_10 ) {
+ case 0: assign( src_dword, mkexpr(t0) ); break;
+ case 1: assign( src_dword, mkexpr(t1) ); break;
+ case 2: assign( src_dword, mkexpr(t2) ); break;
+ case 3: assign( src_dword, mkexpr(t3) ); break;
+ default: vassert(0);
+ }
+
+ if ( epartIsReg( modrm ) ) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ putIReg32( rE, mkexpr(src_dword) );
+ delta += 1+1;
+ DIP( "%sextractps $%d, %s,%s\n", isAvx ? "v" : "", imm8_10,
+ nameXMMReg( rG ), nameIReg32( rE ) );
+ } else {
+ storeLE( mkexpr(addr), mkexpr(src_dword) );
+ delta += alen+1;
+ DIP( "%sextractps $%d, %s,%s\n", isAvx ? "v" : "", imm8_10,
+ nameXMMReg( rG ), dis_buf );
+ }
+
+ return delta;
+}
+
+
__attribute__((noinline))
static
Long dis_ESC_0F3A__SSE4 ( Bool* decode_OK,
@@ -17203,43 +17299,7 @@
*/
if (have66noF2noF3(pfx)
&& (sz == 2 || /* ignore redundant REX.W */ sz == 8)) {
-
- Int imm8_10;
- IRTemp xmm_vec = newTemp(Ity_V128);
- IRTemp src_dword = newTemp(Ity_I32);
-
- modrm = getUChar(delta);
- assign( xmm_vec, getXMMReg( gregOfRexRM(pfx,modrm) ) );
- breakupV128to32s( xmm_vec, &t3, &t2, &t1, &t0 );
-
- if ( epartIsReg( modrm ) ) {
- imm8_10 = (Int)(getUChar(delta+1) & 3);
- } else {
- addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 1 );
- imm8_10 = (Int)(getUChar(delta+alen) & 3);
- }
-
- switch ( imm8_10 ) {
- case 0: assign( src_dword, mkexpr(t0) ); break;
- case 1: assign( src_dword, mkexpr(t1) ); break;
- case 2: assign( src_dword, mkexpr(t2) ); break;
- case 3: assign( src_dword, mkexpr(t3) ); break;
- default: vassert(0);
- }
-
- if ( epartIsReg( modrm ) ) {
- putIReg32( eregOfRexRM(pfx,modrm), mkexpr(src_dword) );
- delta += 1+1;
- DIP( "extractps $%d, %s,%s\n", imm8_10,
- nameXMMReg( gregOfRexRM(pfx, modrm) ),
- nameIReg32( eregOfRexRM(pfx, modrm) ) );
- } else {
- storeLE( mkexpr(addr), mkexpr(src_dword) );
- delta += alen+1;
- DIP( "extractps $%d, %s,%s\n",
- imm8_10, nameXMMReg( gregOfRexRM(pfx, modrm) ), dis_buf );
- }
-
+ delta = dis_EXTRACTPS( vbi, pfx, delta, False/*!isAvx*/ );
goto decode_success;
}
break;
@@ -17383,66 +17443,31 @@
/* 66 0F 3A 40 /r ib = DPPS xmm1, xmm2/m128, imm8
Dot Product of Packed Single Precision Floating-Point Values (XMM) */
if (have66noF2noF3(pfx) && sz == 2) {
-
- Int imm8;
- IRTemp xmm1_vec = newTemp(Ity_V128);
- IRTemp xmm2_vec = newTemp(Ity_V128);
- IRTemp tmp_prod_vec = newTemp(Ity_V128);
- IRTemp prod_vec = newTemp(Ity_V128);
- IRTemp sum_vec = newTemp(Ity_V128);
- IRTemp v3, v2, v1, v0;
- v3 = v2 = v1 = v0 = IRTemp_INVALID;
-
modrm = getUChar(delta);
-
- assign( xmm1_vec, getXMMReg( gregOfRexRM(pfx, modrm) ) );
-
+ Int imm8;
+ IRTemp src_vec = newTemp(Ity_V128);
+ IRTemp dst_vec = newTemp(Ity_V128);
+ UInt rG = gregOfRexRM(pfx, modrm);
+ assign( dst_vec, getXMMReg( rG ) );
if ( epartIsReg( modrm ) ) {
+ UInt rE = eregOfRexRM(pfx, modrm);
imm8 = (Int)getUChar(delta+1);
- assign( xmm2_vec, getXMMReg( eregOfRexRM(pfx, modrm) ) );
+ assign( src_vec, getXMMReg(rE) );
delta += 1+1;
- DIP( "dpps $%d, %s,%s\n", imm8,
- nameXMMReg( eregOfRexRM(pfx, modrm) ),
- nameXMMReg( gregOfRexRM(pfx, modrm) ) );
+ DIP( "dpps $%d, %s,%s\n",
+ imm8, nameXMMReg(rE), nameXMMReg(rG) );
} else {
addr = disAMode( &alen, vbi, pfx, delta, dis_buf,
1/* imm8 is 1 byte after the amode */ );
gen_SEGV_if_not_16_aligned( addr );
- assign( xmm2_vec, loadLE( Ity_V128, mkexpr(addr) ) );
+ assign( src_vec, loadLE( Ity_V128, mkexpr(addr) ) );
imm8 = (Int)getUChar(delta+alen);
delta += alen+1;
DIP( "dpps $%d, %s,%s\n",
- imm8, dis_buf, nameXMMReg( gregOfRexRM(pfx, modrm) ) );
+ imm8, dis_buf, nameXMMReg(rG) );
}
-
- UShort imm8_perms[16] = { 0x0000, 0x000F, 0x00F0, 0x00FF, 0x0F00,
- 0x0F0F, 0x0FF0, 0x0FFF, 0xF000, 0xF00F,
- 0xF0F0, 0xF0FF, 0xFF00, 0xFF0F, 0xFFF0,
- 0xFFFF };
-
- assign( tmp_prod_vec,
- binop( Iop_AndV128,
- binop( Iop_Mul32Fx4, mkexpr(xmm1_vec),
- mkexpr(xmm2_vec) ),
- mkV128( imm8_perms[((imm8 >> 4)& 15)] ) ) );
- breakupV128to32s( tmp_prod_vec, &v3, &v2, &v1, &v0 );
- assign( prod_vec, mkV128from32s( v3, v1, v2, v0 ) );
-
- assign( sum_vec, binop( Iop_Add32Fx4,
- binop( Iop_InterleaveHI32x4,
- mkexpr(prod_vec), mkexpr(prod_vec) ),
- binop( Iop_InterleaveLO32x4,
- mkexpr(prod_vec), mkexpr(prod_vec) ) ) );
-
- putXMMReg( gregOfRexRM(pfx, modrm),
- binop( Iop_AndV128,
- binop( Iop_Add32Fx4,
- binop( Iop_InterleaveHI32x4,
- mkexpr(sum_vec), mkexpr(sum_vec) ),
- binop( Iop_InterleaveLO32x4,
- mkexpr(sum_vec), mkexpr(sum_vec) ) ),
- mkV128( imm8_perms[ (imm8 & 15) ] ) ) );
-
+ IRTemp res = math_DPPS_128( src_vec, dst_vec, imm8 );
+ putXMMReg( rG, mkexpr(res) );
goto decode_success;
}
break;
@@ -21894,6 +21919,29 @@
}
break;
+ case 0x2D:
+ /* VCVTSD2SI xmm1/m32, r32 = VEX.LIG.F2.0F.W0 2D /r */
+ if (haveF2no66noF3(pfx) && 0==getRexW(pfx)/*W0*/) {
+ delta = dis_CVTxSD2SI( vbi, pfx, delta, True/*isAvx*/, opc, 4);
+ goto decode_success;
+ }
+ /* VCVTSD2SI xmm1/m64, r64 = VEX.LIG.F2.0F.W1 2D /r */
+ if (haveF2no66noF3(pfx) && 1==getRexW(pfx)/*W1*/) {
+ delta = dis_CVTxSD2SI( vbi, pfx, delta, True/*isAvx*/, opc, 8);
+ goto decode_success;
+ }
+ /* VCVTSS2SI xmm1/m32, r32 = VEX.LIG.F3.0F.W0 2D /r */
+ if (haveF3no66noF2(pfx) && 0==getRexW(pfx)/*W0*/) {
+ delta = dis_CVTxSS2SI( vbi, pfx, delta, True/*isAvx*/, opc, 4);
+ goto decode_success;
+ }
+ /* VCVTSS2SI xmm1/m64, r64 = VEX.LIG.F3.0F.W1 2D /r */
+ if (haveF3no66noF2(pfx) && 1==getRexW(pfx)/*W1*/) {
+ delta = dis_CVTxSS2SI( vbi, pfx, delta, True/*isAvx*/, opc, 8);
+ goto decode_success;
+ }
+ break;
+
case 0x2E:
case 0x2F:
/* VUCOMISD xmm2/m64, xmm1 = VEX.LIG.66.0F.WIG 2E /r */
@@ -22840,6 +22888,134 @@
}
break;
+ case 0x7C:
+ case 0x7D:
+ /* VHADDPS xmm3/m128, xmm2, xmm1 = VEX.NDS.128.F2.0F.WIG 7C /r */
+ /* VHSUBPS xmm3/m128, xmm2, xmm1 = VEX.NDS.128.F2.0F.WIG 7D /r */
+ if (haveF2no66noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ IRTemp sV = newTemp(Ity_V128);
+ IRTemp dV = newTemp(Ity_V128);
+ Bool isAdd = opc == 0x7C;
+ HChar* str = isAdd ? "add" : "sub";
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rV = getVexNvvvv(pfx);
+ if (epartIsReg(modrm)) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getXMMReg(rE) );
+         DIP("vh%sps %s,%s,%s\n", str, nameXMMReg(rE),
+ nameXMMReg(rV), nameXMMReg(rG));
+ delta += 1;
+ } else {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( sV, loadLE(Ity_V128, mkexpr(addr)) );
+ DIP("vh%sps %s,%s,%s\n", str, dis_buf,
+ nameXMMReg(rV), nameXMMReg(rG));
+ delta += alen;
+ }
+ assign( dV, getXMMReg(rV) );
+ putYMMRegLoAndZU( rG, mkexpr( math_HADDPS_128 ( dV, sV, isAdd ) ) );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ /* VHADDPS ymm3/m256, ymm2, ymm1 = VEX.NDS.256.F2.0F.WIG 7C /r */
+ /* VHSUBPS ymm3/m256, ymm2, ymm1 = VEX.NDS.256.F2.0F.WIG 7D /r */
+ if (haveF2no66noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ IRTemp sV = newTemp(Ity_V256);
+ IRTemp dV = newTemp(Ity_V256);
+ IRTemp s1, s0, d1, d0;
+ Bool isAdd = opc == 0x7C;
+ HChar* str = isAdd ? "add" : "sub";
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rV = getVexNvvvv(pfx);
+ s1 = s0 = d1 = d0 = IRTemp_INVALID;
+ if (epartIsReg(modrm)) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getYMMReg(rE) );
+ DIP("vh%sps %s,%s,%s\n", str, nameYMMReg(rE),
+ nameYMMReg(rV), nameYMMReg(rG));
+ delta += 1;
+ } else {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( sV, loadLE(Ity_V256, mkexpr(addr)) );
+ DIP("vh%sps %s,%s,%s\n", str, dis_buf,
+ nameYMMReg(rV), nameYMMReg(rG));
+ delta += alen;
+ }
+ assign( dV, getYMMReg(rV) );
+ breakupV256toV128s( dV, &d1, &d0 );
+ breakupV256toV128s( sV, &s1, &s0 );
+ putYMMReg( rG, binop(Iop_V128HLtoV256,
+ mkexpr( math_HADDPS_128 ( d1, s1, isAdd ) ),
+ mkexpr( math_HADDPS_128 ( d0, s0, isAdd ) ) ) );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ /* VHADDPD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG 7C /r */
+ /* VHSUBPD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG 7D /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ IRTemp sV = newTemp(Ity_V128);
+ IRTemp dV = newTemp(Ity_V128);
+ Bool isAdd = opc == 0x7C;
+ HChar* str = isAdd ? "add" : "sub";
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rV = getVexNvvvv(pfx);
+ if (epartIsReg(modrm)) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getXMMReg(rE) );
+ DIP("vh%spd %s,%s,%s\n", str, nameXMMReg(rE),
+ nameXMMReg(rV), nameXMMReg(rG));
+ delta += 1;
+ } else {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( sV, loadLE(Ity_V128, mkexpr(addr)) );
+ DIP("vh%spd %s,%s,%s\n", str, dis_buf,
+ nameXMMReg(rV), nameXMMReg(rG));
+ delta += alen;
+ }
+ assign( dV, getXMMReg(rV) );
+ putYMMRegLoAndZU( rG, mkexpr( math_HADDPD_128 ( dV, sV, isAdd ) ) );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ /* VHADDPD ymm3/m256, ymm2, ymm1 = VEX.NDS.256.66.0F.WIG 7C /r */
+ /* VHSUBPD ymm3/m256, ymm2, ymm1 = VEX.NDS.256.66.0F.WIG 7D /r */
+ if (have66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ IRTemp sV = newTemp(Ity_V256);
+ IRTemp dV = newTemp(Ity_V256);
+ IRTemp s1, s0, d1, d0;
+ Bool isAdd = opc == 0x7C;
+ HChar* str = isAdd ? "add" : "sub";
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
+ UInt rV = getVexNvvvv(pfx);
+ s1 = s0 = d1 = d0 = IRTemp_INVALID;
+ if (epartIsReg(modrm)) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getYMMReg(rE) );
+ DIP("vh%spd %s,%s,%s\n", str, nameYMMReg(rE),
+ nameYMMReg(rV), nameYMMReg(rG));
+ delta += 1;
+ } else {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
+ assign( sV, loadLE(Ity_V256, mkexpr(addr)) );
+ DIP("vh%spd %s,%s,%s\n", str, dis_buf,
+ nameYMMReg(rV), nameYMMReg(rG));
+ delta += alen;
+ }
+ assign( dV, getYMMReg(rV) );
+ breakupV256toV128s( dV, &d1, &d0 );
+ breakupV256toV128s( sV, &s1, &s0 );
+ putYMMReg( rG, binop(Iop_V128HLtoV256,
+ mkexpr( math_HADDPD_128 ( d1, s1, isAdd ) ),
+ mkexpr( math_HADDPD_128 ( d0, s0, isAdd ) ) ) );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ break;
+
case 0x7E:
/* Note the Intel docs don't make sense for this. I think they
are wrong. They seem to imply it is a store when in fact I
@@ -23540,6 +23716,35 @@
}
break;
+ case 0xF0:
+ /* VLDDQU m256, ymm1 = VEX.256.F2.0F.WIG F0 /r */
+ if (haveF2no66noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ UChar modrm = getUChar(delta);
+ UInt rD = gregOfRexRM(pfx, modrm);
+ IRTemp tD = newTemp(Ity_V256);
+ if (epartIsReg(modrm)) break;
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ delta += alen;
+ assign(tD, loadLE(Ity_V256, mkexpr(addr)));
+ DIP("vlddqu %s,%s\n", dis_buf, nameYMMReg(rD));
+ putYMMReg(rD, mkexpr(tD));
+ goto decode_success;
+ }
+ /* VLDDQU m128, xmm1 = VEX.128.F2.0F.WIG F0 /r */
+ if (haveF2no66noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ UChar modrm = getUChar(delta);
+ UInt rD = gregOfRexRM(pfx, modrm);
+ IRTemp tD = newTemp(Ity_V128);
+ if (epartIsReg(modrm)) break;
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 0 );
+ delta += alen;
+ assign(tD, loadLE(Ity_V128, mkexpr(addr)));
+ DIP("vlddqu %s,%s\n", dis_buf, nameXMMReg(rD));
+ putYMMRegLoAndZU(rD, mkexpr(tD));
+ goto decode_success;
+ }
+ break;
+
case 0xF1:
/* VPSLLW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG F1 /r */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
@@ -24904,6 +25109,14 @@
}
break;
+ case 0x17:
+ /* VEXTRACTPS imm8, xmm1, r32/m32 = VEX.128.66.0F3A.WIG 17 /r ib */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_EXTRACTPS( vbi, pfx, delta, True/*isAvx*/ );
+ goto decode_success;
+ }
+ break;
+
case 0x18:
/* VINSERTF128 r/m, rV, rD
::: rD = insertinto(a lane in rV, 128 bits from r/m) */
@@ -25114,8 +25327,76 @@
}
break;
+ case 0x40:
+ /* VDPPS imm8, xmm3/m128,xmm2,xmm1 = VEX.NDS.128.66.0F3A.WIG 40 /r ib */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx, modrm);
+ UInt rV = getVexNvvvv(pfx);
+ IRTemp dst_vec = newTemp(Ity_V128);
+ Int imm8;
+ if (epartIsReg( modrm )) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ imm8 = (Int)getUChar(delta+1);
+ assign( dst_vec, getXMMReg( rE ) );
+ delta += 1+1;
+ DIP( "vdpps $%d,%s,%s,%s\n",
+ imm8, nameXMMReg(rE), nameXMMReg(rV), nameXMMReg(rG) );
+ } else {
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 1 );
+ imm8 = (Int)getUChar(delta+alen);
+ assign( dst_vec, loadLE( Ity_V128, mkexpr(addr) ) );
+ delta += alen+1;
+ DIP( "vdpps $%d,%s,%s,%s\n",
+ imm8, dis_buf, nameXMMReg(rV), nameXMMReg(rG) );
+ }
+
+ IRTemp src_vec = newTemp(Ity_V128);
+ assign(src_vec, getXMMReg( rV ));
+ IRTemp res_vec = math_DPPS_128( src_vec, dst_vec, imm8 );
+ putYMMRegLoAndZU( rG, mkexpr(res_vec) );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ /* VDPPS imm8, ymm3/m256,ymm2,ymm1 = VEX.NDS.256.66.0F3A.WIG 40 /r ib */
+ if (have66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ UChar modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx, modrm);
+ UInt rV = getVexNvvvv(pfx);
+ IRTemp dst_vec = newTemp(Ity_V256);
+ Int imm8;
+ if (epartIsReg( modrm )) {
+ UInt rE = eregOfRexRM(pfx,modrm);
+ imm8 = (Int)getUChar(delta+1);
+ assign( dst_vec, getYMMReg( rE ) );
+ delta += 1+1;
+ DIP( "vdpps $%d,%s,%s,%s\n",
+ imm8, nameYMMReg(rE), nameYMMReg(rV), nameYMMReg(rG) );
+ } else {
+ addr = disAMode( &alen, vbi, pfx, delta, dis_buf, 1 );
+ imm8 = (Int)getUChar(delta+alen);
+ assign( dst_vec, loadLE( Ity_V256, mkexpr(addr) ) );
+ delta += alen+1;
+ DIP( "vdpps $%d,%s,%s,%s\n",
+ imm8, dis_buf, nameYMMReg(rV), nameYMMReg(rG) );
+ }
+
+ IRTemp src_vec = newTemp(Ity_V256);
+ assign(src_vec, getYMMReg( rV ));
+ IRTemp s0, s1, d0, d1;
+ s0 = s1 = d0 = d1 = IRTemp_INVALID;
+ breakupV256toV128s( dst_vec, &d1, &d0 );
+ breakupV256toV128s( src_vec, &s1, &s0 );
+ putYMMReg( rG, binop( Iop_V128HLtoV256,
+ mkexpr( math_DPPS_128(s1, d1, imm8) ),
+ mkexpr( math_DPPS_128(s0, d0, imm8) ) ) );
+ *uses_vvvv = True;
+ goto decode_success;
+ }
+ break;
+
case 0x41:
- /* VDPPD xmm3/m128,xmm2,xmm1 = VEX.NDS.128.66.0F3A.WIG 41 /r ib */
+ /* VDPPD imm8, xmm3/m128,xmm2,xmm1 = VEX.NDS.128.66.0F3A.WIG 41 /r ib */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
UChar modrm = getUChar(delta);
UInt rG = gregOfRexRM(pfx, modrm);
|
|
From: <sv...@va...> - 2012-06-24 13:28:12
|
sewardj 2012-06-24 14:28:04 +0100 (Sun, 24 Jun 2012)
New Revision: 12667
Log:
Update.
Modified files:
trunk/none/tests/amd64/avx-1.c
Modified: trunk/none/tests/amd64/avx-1.c (+54 -0)
===================================================================
--- trunk/none/tests/amd64/avx-1.c 2012-06-24 12:04:08 +01:00 (rev 12666)
+++ trunk/none/tests/amd64/avx-1.c 2012-06-24 14:28:04 +01:00 (rev 12667)
@@ -756,6 +756,10 @@
"vpmuludq %%xmm6, %%xmm8, %%xmm7",
"vpmuludq (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VPMULDQ_128,
+ "vpmuldq %%xmm6, %%xmm8, %%xmm7",
+ "vpmuldq (%%rax), %%xmm8, %%xmm7")
+
GEN_test_Ronly(VPSLLQ_0x05_128,
"vpsllq $0x5, %%xmm9, %%xmm7")
@@ -921,6 +925,18 @@
"vcmppd $4, %%xmm6, %%xmm8, %%xmm7",
"vcmppd $4, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VCMPPD_256_0x4,
+ "vcmppd $4, %%ymm6, %%ymm8, %%ymm7",
+ "vcmppd $4, (%%rax), %%ymm8, %%ymm7")
+
+GEN_test_RandM(VCMPPS_128_0x4,
+ "vcmpps $4, %%xmm6, %%xmm8, %%xmm7",
+ "vcmpps $4, (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VCMPPS_256_0x4,
+ "vcmpps $4, %%ymm6, %%ymm8, %%ymm7",
+ "vcmpps $4, (%%rax), %%ymm8, %%ymm7")
+
GEN_test_RandM(VCVTDQ2PD_128,
"vcvtdq2pd %%xmm6, %%xmm8",
"vcvtdq2pd (%%rax), %%xmm8")
@@ -1036,6 +1052,14 @@
GEN_test_Ronly(VPSRAW_0x05_128,
"vpsraw $0x5, %%xmm9, %%xmm7")
+GEN_test_RandM(VPCMPGTB_128,
+ "vpcmpgtb %%xmm6, %%xmm8, %%xmm7",
+ "vpcmpgtb (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VPCMPGTW_128,
+ "vpcmpgtw %%xmm6, %%xmm8, %%xmm7",
+ "vpcmpgtw (%%rax), %%xmm8, %%xmm7")
+
GEN_test_RandM(VPCMPGTD_128,
"vpcmpgtd %%xmm6, %%xmm8, %%xmm7",
"vpcmpgtd (%%rax), %%xmm8, %%xmm7")
@@ -1436,7 +1460,26 @@
"vroundpd $0x4, %%ymm8, %%ymm9",
"vroundpd $0x4, (%%rax), %%ymm9")
+GEN_test_RandM(VPMADDWD_128,
+ "vpmaddwd %%xmm6, %%xmm8, %%xmm7",
+ "vpmaddwd (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VADDSUBPS_128,
+ "vaddsubps %%xmm6, %%xmm8, %%xmm7",
+ "vaddsubps (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VADDSUBPS_256,
+ "vaddsubps %%ymm6, %%ymm8, %%ymm7",
+ "vaddsubps (%%rax), %%ymm8, %%ymm7")
+
+GEN_test_RandM(VADDSUBPD_128,
+ "vaddsubpd %%xmm6, %%xmm8, %%xmm7",
+ "vaddsubpd (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VADDSUBPD_256,
+ "vaddsubpd %%ymm6, %%ymm8, %%ymm7",
+ "vaddsubpd (%%rax), %%ymm8, %%ymm7")
+
GEN_test_RandM(VROUNDSS_0x0,
"vroundss $0x0, %%xmm8, %%xmm6, %%xmm9",
"vroundss $0x0, (%%rax), %%xmm6, %%xmm9")
@@ -2059,5 +2102,16 @@
DO_D( VBLENDVPS_256 );
DO_D( VBLENDVPD_128 );
DO_D( VBLENDVPD_256 );
+ DO_D( VPMULDQ_128 );
+ DO_D( VCMPPD_256_0x4 );
+ DO_D( VCMPPS_128_0x4 );
+ DO_D( VCMPPS_256_0x4 );
+ DO_D( VPCMPGTB_128 );
+ DO_D( VPCMPGTW_128 );
+ DO_D( VPMADDWD_128 );
+ DO_D( VADDSUBPS_128 );
+ DO_D( VADDSUBPS_256 );
+ DO_D( VADDSUBPD_128 );
+ DO_D( VADDSUBPD_256 );
return 0;
}
|
|
From: <sv...@va...> - 2012-06-24 13:27:53
|
sewardj 2012-06-24 14:27:46 +0100 (Sun, 24 Jun 2012)
New Revision: 2405
Log:
VCMPPD and VCMPPS incremental fix
(Jakub Jelinek, ja...@re...), #273475 comment 133.
Modified files:
trunk/priv/guest_amd64_toIR.c
Modified: trunk/priv/guest_amd64_toIR.c (+2 -2)
===================================================================
--- trunk/priv/guest_amd64_toIR.c 2012-06-24 13:12:20 +01:00 (rev 2404)
+++ trunk/priv/guest_amd64_toIR.c 2012-06-24 14:27:46 +01:00 (rev 2405)
@@ -20770,8 +20770,8 @@
opname, (Int)imm8, dis_buf, nameYMMReg(rV), nameYMMReg(rG));
}
- breakupV256toV128s( preSwap ? argL : argR, &argLhi, &argLlo );
- breakupV256toV128s( preSwap ? argR : argL, &argRhi, &argRlo );
+ breakupV256toV128s( preSwap ? argR : argL, &argLhi, &argLlo );
+ breakupV256toV128s( preSwap ? argL : argR, &argRhi, &argRlo );
assign(plain, binop( Iop_V128HLtoV256,
binop(op, mkexpr(argLhi), mkexpr(argRhi)),
binop(op, mkexpr(argLlo), mkexpr(argRlo)) ) );
|
|
From: <sv...@va...> - 2012-06-24 12:12:30
|
sewardj 2012-06-24 13:12:20 +0100 (Sun, 24 Jun 2012)
New Revision: 2404
Log:
Implement more AVX insns:
VPCMPGTB = VEX.NDS.128.66.0F.WIG 64 /r
VPCMPGTW = VEX.NDS.128.66.0F.WIG 65 /r
VCMPPD ymm3/m256(E=argL), ymm2(V=argR), ymm1(G)
VCMPPS xmm3/m128(E=argL), xmm2(V=argR), xmm1(G)
VCMPPS ymm3/m256(E=argL), ymm2(V=argR), ymm1(G)
VADDSUBPD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG D0 /r
VADDSUBPD ymm3/m256, ymm2, ymm1 = VEX.NDS.256.66.0F.WIG D0 /r
VADDSUBPS xmm3/m128, xmm2, xmm1 = VEX.NDS.128.F2.0F.WIG D0 /r
VADDSUBPS ymm3/m256, ymm2, ymm1 = VEX.NDS.256.F2.0F.WIG D0 /r
VPMADDWD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG F5 /r
VPMULDQ xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 28 /r
(Jakub Jelinek, ja...@re...), #273475 comment 132.
Modified files:
trunk/priv/guest_amd64_toIR.c
Modified: trunk/priv/guest_amd64_toIR.c (+373 -99)
===================================================================
--- trunk/priv/guest_amd64_toIR.c 2012-06-24 12:09:37 +01:00 (rev 2403)
+++ trunk/priv/guest_amd64_toIR.c 2012-06-24 13:12:20 +01:00 (rev 2404)
@@ -9016,6 +9016,36 @@
assign( *t1, unop(Iop_V128HIto64, mkexpr(t128)) );
}
+/* Construct a V256-bit value from eight 32-bit ints. */
+
+static IRExpr* mkV256from32s ( IRTemp t7, IRTemp t6,
+ IRTemp t5, IRTemp t4,
+ IRTemp t3, IRTemp t2,
+ IRTemp t1, IRTemp t0 )
+{
+ return
+ binop( Iop_V128HLtoV256,
+ binop( Iop_64HLtoV128,
+ binop(Iop_32HLto64, mkexpr(t7), mkexpr(t6)),
+ binop(Iop_32HLto64, mkexpr(t5), mkexpr(t4)) ),
+ binop( Iop_64HLtoV128,
+ binop(Iop_32HLto64, mkexpr(t3), mkexpr(t2)),
+ binop(Iop_32HLto64, mkexpr(t1), mkexpr(t0)) )
+ );
+}
+
+/* Construct a V256-bit value from four 64-bit ints. */
+
+static IRExpr* mkV256from64s ( IRTemp t3, IRTemp t2,
+ IRTemp t1, IRTemp t0 )
+{
+ return
+ binop( Iop_V128HLtoV256,
+ binop(Iop_64HLtoV128, mkexpr(t3), mkexpr(t2)),
+ binop(Iop_64HLtoV128, mkexpr(t1), mkexpr(t0))
+ );
+}
+
/* Helper for the SSSE3 (not SSE3) PMULHRSW insns. Given two 64-bit
values (aa,bb), computes, for each of the 4 16-bit lanes:
@@ -10466,6 +10496,122 @@
}
+static IRTemp math_PMULDQ_128 ( IRTemp dV, IRTemp sV )
+{
+ /* This is a really poor translation -- could be improved if
+ performance critical */
+ IRTemp s3, s2, s1, s0, d3, d2, d1, d0;
+ s3 = s2 = s1 = s0 = d3 = d2 = d1 = d0 = IRTemp_INVALID;
+ breakupV128to32s( dV, &d3, &d2, &d1, &d0 );
+ breakupV128to32s( sV, &s3, &s2, &s1, &s0 );
+ IRTemp res = newTemp(Ity_V128);
+ assign(res, binop(Iop_64HLtoV128,
+ binop( Iop_MullS32, mkexpr(d2), mkexpr(s2)),
+ binop( Iop_MullS32, mkexpr(d0), mkexpr(s0)) ));
+ return res;
+}
+
+
+static IRTemp math_PMADDWD_128 ( IRTemp dV, IRTemp sV )
+{
+ IRTemp sVhi, sVlo, dVhi, dVlo;
+ IRTemp resHi = newTemp(Ity_I64);
+ IRTemp resLo = newTemp(Ity_I64);
+ sVhi = sVlo = dVhi = dVlo = IRTemp_INVALID;
+ breakupV128to64s( sV, &sVhi, &sVlo );
+ breakupV128to64s( dV, &dVhi, &dVlo );
+ assign( resHi, mkIRExprCCall(Ity_I64, 0/*regparms*/,
+ "amd64g_calculate_mmx_pmaddwd",
+ &amd64g_calculate_mmx_pmaddwd,
+ mkIRExprVec_2( mkexpr(sVhi), mkexpr(dVhi))));
+ assign( resLo, mkIRExprCCall(Ity_I64, 0/*regparms*/,
+ "amd64g_calculate_mmx_pmaddwd",
+ &amd64g_calculate_mmx_pmaddwd,
+ mkIRExprVec_2( mkexpr(sVlo), mkexpr(dVlo))));
+ IRTemp res = newTemp(Ity_V128);
+ assign( res, binop(Iop_64HLtoV128, mkexpr(resHi), mkexpr(resLo))) ;
+ return res;
+}
+
+
+static IRTemp math_ADDSUBPD_128 ( IRTemp dV, IRTemp sV )
+{
+ IRTemp addV = newTemp(Ity_V128);
+ IRTemp subV = newTemp(Ity_V128);
+ IRTemp a1 = newTemp(Ity_I64);
+ IRTemp s0 = newTemp(Ity_I64);
+
+ assign( addV, binop(Iop_Add64Fx2, mkexpr(dV), mkexpr(sV)) );
+ assign( subV, binop(Iop_Sub64Fx2, mkexpr(dV), mkexpr(sV)) );
+
+ assign( a1, unop(Iop_V128HIto64, mkexpr(addV) ));
+ assign( s0, unop(Iop_V128to64, mkexpr(subV) ));
+
+ IRTemp res = newTemp(Ity_V128);
+ assign( res, binop(Iop_64HLtoV128, mkexpr(a1), mkexpr(s0)) );
+ return res;
+}
+
+
+static IRTemp math_ADDSUBPD_256 ( IRTemp dV, IRTemp sV )
+{
+ IRTemp a3, a2, a1, a0, s3, s2, s1, s0;
+ IRTemp addV = newTemp(Ity_V256);
+ IRTemp subV = newTemp(Ity_V256);
+ a3 = a2 = a1 = a0 = s3 = s2 = s1 = s0 = IRTemp_INVALID;
+
+ assign( addV, binop(Iop_Add64Fx4, mkexpr(dV), mkexpr(sV)) );
+ assign( subV, binop(Iop_Sub64Fx4, mkexpr(dV), mkexpr(sV)) );
+
+ breakupV256to64s( addV, &a3, &a2, &a1, &a0 );
+ breakupV256to64s( subV, &s3, &s2, &s1, &s0 );
+
+ IRTemp res = newTemp(Ity_V256);
+ assign( res, mkV256from64s( a3, s2, a1, s0 ) );
+ return res;
+}
+
+
+static IRTemp math_ADDSUBPS_128 ( IRTemp dV, IRTemp sV )
+{
+ IRTemp a3, a2, a1, a0, s3, s2, s1, s0;
+ IRTemp addV = newTemp(Ity_V128);
+ IRTemp subV = newTemp(Ity_V128);
+ a3 = a2 = a1 = a0 = s3 = s2 = s1 = s0 = IRTemp_INVALID;
+
+ assign( addV, binop(Iop_Add32Fx4, mkexpr(dV), mkexpr(sV)) );
+ assign( subV, binop(Iop_Sub32Fx4, mkexpr(dV), mkexpr(sV)) );
+
+ breakupV128to32s( addV, &a3, &a2, &a1, &a0 );
+ breakupV128to32s( subV, &s3, &s2, &s1, &s0 );
+
+ IRTemp res = newTemp(Ity_V128);
+ assign( res, mkV128from32s( a3, s2, a1, s0 ) );
+ return res;
+}
+
+
+static IRTemp math_ADDSUBPS_256 ( IRTemp dV, IRTemp sV )
+{
+ IRTemp a7, a6, a5, a4, a3, a2, a1, a0;
+ IRTemp s7, s6, s5, s4, s3, s2, s1, s0;
+ IRTemp addV = newTemp(Ity_V256);
+ IRTemp subV = newTemp(Ity_V256);
+ a7 = a6 = a5 = a4 = a3 = a2 = a1 = a0 = IRTemp_INVALID;
+ s7 = s6 = s5 = s4 = s3 = s2 = s1 = s0 = IRTemp_INVALID;
+
+ assign( addV, binop(Iop_Add32Fx8, mkexpr(dV), mkexpr(sV)) );
+ assign( subV, binop(Iop_Sub32Fx8, mkexpr(dV), mkexpr(sV)) );
+
+ breakupV256to32s( addV, &a7, &a6, &a5, &a4, &a3, &a2, &a1, &a0 );
+ breakupV256to32s( subV, &s7, &s6, &s5, &s4, &s3, &s2, &s1, &s0 );
+
+ IRTemp res = newTemp(Ity_V256);
+ assign( res, mkV256from32s( a7, s6, a5, s4, a3, s2, a1, s0 ) );
+ return res;
+}
+
+
/* Handle 128 bit PSHUFLW and PSHUFHW. */
static Long dis_PSHUFxW_128 ( VexAbiInfo* vbi, Prefix pfx,
Long delta, Bool isAvx, Bool xIsH )
@@ -13498,47 +13644,23 @@
/* 66 0F F5 = PMADDWD -- Multiply and add packed integers from
E(xmm or mem) to G(xmm) */
if (have66noF2noF3(pfx) && sz == 2) {
- IRTemp s1V = newTemp(Ity_V128);
- IRTemp s2V = newTemp(Ity_V128);
- IRTemp dV = newTemp(Ity_V128);
- IRTemp s1Hi = newTemp(Ity_I64);
- IRTemp s1Lo = newTemp(Ity_I64);
- IRTemp s2Hi = newTemp(Ity_I64);
- IRTemp s2Lo = newTemp(Ity_I64);
- IRTemp dHi = newTemp(Ity_I64);
- IRTemp dLo = newTemp(Ity_I64);
- modrm = getUChar(delta);
+ IRTemp sV = newTemp(Ity_V128);
+ IRTemp dV = newTemp(Ity_V128);
+ modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
if (epartIsReg(modrm)) {
- assign( s1V, getXMMReg(eregOfRexRM(pfx,modrm)) );
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getXMMReg(rE) );
delta += 1;
- DIP("pmaddwd %s,%s\n", nameXMMReg(eregOfRexRM(pfx,modrm)),
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("pmaddwd %s,%s\n", nameXMMReg(rE), nameXMMReg(rG));
} else {
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
- assign( s1V, loadLE(Ity_V128, mkexpr(addr)) );
+ assign( sV, loadLE(Ity_V128, mkexpr(addr)) );
delta += alen;
- DIP("pmaddwd %s,%s\n", dis_buf,
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("pmaddwd %s,%s\n", dis_buf, nameXMMReg(rG));
}
- assign( s2V, getXMMReg(gregOfRexRM(pfx,modrm)) );
- assign( s1Hi, unop(Iop_V128HIto64, mkexpr(s1V)) );
- assign( s1Lo, unop(Iop_V128to64, mkexpr(s1V)) );
- assign( s2Hi, unop(Iop_V128HIto64, mkexpr(s2V)) );
- assign( s2Lo, unop(Iop_V128to64, mkexpr(s2V)) );
- assign( dHi, mkIRExprCCall(
- Ity_I64, 0/*regparms*/,
- "amd64g_calculate_mmx_pmaddwd",
- &amd64g_calculate_mmx_pmaddwd,
- mkIRExprVec_2( mkexpr(s1Hi), mkexpr(s2Hi))
- ));
- assign( dLo, mkIRExprCCall(
- Ity_I64, 0/*regparms*/,
- "amd64g_calculate_mmx_pmaddwd",
- &amd64g_calculate_mmx_pmaddwd,
- mkIRExprVec_2( mkexpr(s1Lo), mkexpr(s2Lo))
- ));
- assign( dV, binop(Iop_64HLtoV128, mkexpr(dHi), mkexpr(dLo))) ;
- putXMMReg(gregOfRexRM(pfx,modrm), mkexpr(dV));
+ assign( dV, getXMMReg(rG) );
+ putXMMReg( rG, mkexpr(math_PMADDWD_128(dV, sV)) );
goto decode_success;
}
break;
@@ -13978,69 +14100,46 @@
if (have66noF2noF3(pfx) && sz == 2) {
IRTemp eV = newTemp(Ity_V128);
IRTemp gV = newTemp(Ity_V128);
- IRTemp addV = newTemp(Ity_V128);
- IRTemp subV = newTemp(Ity_V128);
- IRTemp a1 = newTemp(Ity_I64);
- IRTemp s0 = newTemp(Ity_I64);
-
- modrm = getUChar(delta);
+ modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
if (epartIsReg(modrm)) {
- assign( eV, getXMMReg( eregOfRexRM(pfx,modrm)) );
- DIP("addsubpd %s,%s\n", nameXMMReg(eregOfRexRM(pfx,modrm)),
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( eV, getXMMReg(rE) );
+ DIP("addsubpd %s,%s\n", nameXMMReg(rE), nameXMMReg(rG));
delta += 1;
} else {
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
assign( eV, loadLE(Ity_V128, mkexpr(addr)) );
- DIP("addsubpd %s,%s\n", dis_buf,
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("addsubpd %s,%s\n", dis_buf, nameXMMReg(rG));
delta += alen;
}
- assign( gV, getXMMReg(gregOfRexRM(pfx,modrm)) );
-
- assign( addV, binop(Iop_Add64Fx2, mkexpr(gV), mkexpr(eV)) );
- assign( subV, binop(Iop_Sub64Fx2, mkexpr(gV), mkexpr(eV)) );
-
- assign( a1, unop(Iop_V128HIto64, mkexpr(addV) ));
- assign( s0, unop(Iop_V128to64, mkexpr(subV) ));
-
- putXMMReg( gregOfRexRM(pfx,modrm),
- binop(Iop_64HLtoV128, mkexpr(a1), mkexpr(s0)) );
+ assign( gV, getXMMReg(rG) );
+ putXMMReg( rG, mkexpr( math_ADDSUBPD_128 ( gV, eV ) ) );
goto decode_success;
}
/* F2 0F D0 = ADDSUBPS -- 32x4 +/-/+/- from E (mem or xmm) to G (xmm). */
if (haveF2no66noF3(pfx) && sz == 4) {
- IRTemp a3, a2, a1, a0, s3, s2, s1, s0;
IRTemp eV = newTemp(Ity_V128);
IRTemp gV = newTemp(Ity_V128);
- IRTemp addV = newTemp(Ity_V128);
- IRTemp subV = newTemp(Ity_V128);
- a3 = a2 = a1 = a0 = s3 = s2 = s1 = s0 = IRTemp_INVALID;
+ modrm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx,modrm);
modrm = getUChar(delta);
if (epartIsReg(modrm)) {
- assign( eV, getXMMReg( eregOfRexRM(pfx,modrm)) );
- DIP("addsubps %s,%s\n", nameXMMReg(eregOfRexRM(pfx,modrm)),
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( eV, getXMMReg(rE) );
+ DIP("addsubps %s,%s\n", nameXMMReg(rE), nameXMMReg(rG));
delta += 1;
} else {
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
assign( eV, loadLE(Ity_V128, mkexpr(addr)) );
- DIP("addsubps %s,%s\n", dis_buf,
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("addsubps %s,%s\n", dis_buf, nameXMMReg(rG));
delta += alen;
}
- assign( gV, getXMMReg(gregOfRexRM(pfx,modrm)) );
-
- assign( addV, binop(Iop_Add32Fx4, mkexpr(gV), mkexpr(eV)) );
- assign( subV, binop(Iop_Sub32Fx4, mkexpr(gV), mkexpr(eV)) );
-
- breakupV128to32s( addV, &a3, &a2, &a1, &a0 );
- breakupV128to32s( subV, &s3, &s2, &s1, &s0 );
-
- putXMMReg( gregOfRexRM(pfx,modrm), mkV128from32s( a3, s2, a1, s0 ));
+ assign( gV, getXMMReg(rG) );
+ putXMMReg( rG, mkexpr( math_ADDSUBPS_128 ( gV, eV ) ) );
goto decode_success;
}
break;
@@ -15798,42 +15897,30 @@
break;
case 0x28:
- /* 66 0F 38 28 = PMULUDQ -- signed widening multiply of 32-lanes
+ /* 66 0F 38 28 = PMULDQ -- signed widening multiply of 32-lanes
0 x 0 to form lower 64-bit half and lanes 2 x 2 to form upper
64-bit half */
/* This is a really poor translation -- could be improved if
- performance critical. It's a copy-paste of PMULDQ, too. */
+ performance critical. It's a copy-paste of PMULUDQ, too. */
if (have66noF2noF3(pfx) && sz == 2) {
- IRTemp sV, dV, t0, t1;
- IRTemp s3, s2, s1, s0, d3, d2, d1, d0;
- sV = newTemp(Ity_V128);
- dV = newTemp(Ity_V128);
- s3 = s2 = s1 = s0 = d3 = d2 = d1 = d0 = IRTemp_INVALID;
- t1 = newTemp(Ity_I64);
- t0 = newTemp(Ity_I64);
+ IRTemp sV = newTemp(Ity_V128);
+ IRTemp dV = newTemp(Ity_V128);
modrm = getUChar(delta);
- assign( dV, getXMMReg(gregOfRexRM(pfx,modrm)) );
-
+ UInt rG = gregOfRexRM(pfx,modrm);
+ assign( dV, getXMMReg(rG) );
if (epartIsReg(modrm)) {
- assign( sV, getXMMReg(eregOfRexRM(pfx,modrm)) );
+ UInt rE = eregOfRexRM(pfx,modrm);
+ assign( sV, getXMMReg(rE) );
delta += 1;
- DIP("pmuldq %s,%s\n", nameXMMReg(eregOfRexRM(pfx,modrm)),
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("pmuldq %s,%s\n", nameXMMReg(rE), nameXMMReg(rG));
} else {
addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
assign( sV, loadLE(Ity_V128, mkexpr(addr)) );
delta += alen;
- DIP("pmuldq %s,%s\n", dis_buf,
- nameXMMReg(gregOfRexRM(pfx,modrm)));
+ DIP("pmuldq %s,%s\n", dis_buf, nameXMMReg(rG));
}
- breakupV128to32s( dV, &d3, &d2, &d1, &d0 );
- breakupV128to32s( sV, &s3, &s2, &s1, &s0 );
-
- assign( t0, binop( Iop_MullS32, mkexpr(d0), mkexpr(s0)) );
- putXMMRegLane64( gregOfRexRM(pfx,modrm), 0, mkexpr(t0) );
- assign( t1, binop( Iop_MullS32, mkexpr(d2), mkexpr(s2)) );
- putXMMRegLane64( gregOfRexRM(pfx,modrm), 1, mkexpr(t1) );
+ putXMMReg( rG, mkexpr(math_PMULDQ_128( dV, sV )) );
goto decode_success;
}
break;
@@ -20630,6 +20717,78 @@
}
+/* Handles AVX256 32F/64F comparisons. A derivative of
+ dis_SSEcmp_E_to_G. It can fail, in which case it returns the
+ original delta to indicate failure. */
+static
+Long dis_AVX256_cmp_V_E_to_G ( /*OUT*/Bool* uses_vvvv,
+ VexAbiInfo* vbi,
+ Prefix pfx, Long delta,
+ HChar* opname, Int sz )
+{
+ vassert(sz == 4 || sz == 8);
+ Long deltaIN = delta;
+ HChar dis_buf[50];
+ Int alen;
+ UInt imm8;
+ IRTemp addr;
+ Bool preSwap = False;
+ IROp op = Iop_INVALID;
+ Bool postNot = False;
+ IRTemp plain = newTemp(Ity_V256);
+ UChar rm = getUChar(delta);
+ UInt rG = gregOfRexRM(pfx, rm);
+ UInt rV = getVexNvvvv(pfx);
+ IRTemp argL = newTemp(Ity_V256);
+ IRTemp argR = newTemp(Ity_V256);
+ IRTemp argLhi = IRTemp_INVALID;
+ IRTemp argLlo = IRTemp_INVALID;
+ IRTemp argRhi = IRTemp_INVALID;
+ IRTemp argRlo = IRTemp_INVALID;
+
+ assign(argL, getYMMReg(rV));
+ if (epartIsReg(rm)) {
+ imm8 = getUChar(delta+1);
+ Bool ok = findSSECmpOp(&preSwap, &op, &postNot, imm8,
+ True/*all_lanes*/, sz);
+ if (!ok) return deltaIN; /* FAIL */
+ UInt rE = eregOfRexRM(pfx,rm);
+ assign(argR, getYMMReg(rE));
+ delta += 1+1;
+ DIP("%s $%d,%s,%s,%s\n",
+ opname, (Int)imm8,
+ nameYMMReg(rE), nameYMMReg(rV), nameYMMReg(rG));
+ } else {
+ addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 1 );
+ imm8 = getUChar(delta+alen);
+ Bool ok = findSSECmpOp(&preSwap, &op, &postNot, imm8,
+ True/*all_lanes*/, sz);
+ if (!ok) return deltaIN; /* FAIL */
+ assign(argR, loadLE(Ity_V256, mkexpr(addr)) );
+ delta += alen+1;
+ DIP("%s $%d,%s,%s,%s\n",
+ opname, (Int)imm8, dis_buf, nameYMMReg(rV), nameYMMReg(rG));
+ }
+
+ breakupV256toV128s( preSwap ? argL : argR, &argLhi, &argLlo );
+ breakupV256toV128s( preSwap ? argR : argL, &argRhi, &argRlo );
+ assign(plain, binop( Iop_V128HLtoV256,
+ binop(op, mkexpr(argLhi), mkexpr(argRhi)),
+ binop(op, mkexpr(argLlo), mkexpr(argRlo)) ) );
+
+ /* This is simple: just invert the result, if necessary, and
+ have done. */
+ if (postNot) {
+ putYMMReg( rG, unop(Iop_NotV256, mkexpr(plain)) );
+ } else {
+ putYMMReg( rG, mkexpr(plain) );
+ }
+
+ *uses_vvvv = True;
+ return delta;
+}
+
+
/* Handles AVX128 unary E-to-G all-lanes operations. */
static
Long dis_AVX128_E_to_G_unary ( /*OUT*/Bool* uses_vvvv,
@@ -20765,6 +20924,22 @@
}
+/* Handle a VEX_NDS_256_66_0F_WIG (3-addr) insn, using the given IR
+ generator to compute the result, no inversion of the left
+ arg, and no swapping of args. */
+static
+Long dis_VEX_NDS_256_AnySimdPfx_0F_WIG_complex (
+ /*OUT*/Bool* uses_vvvv, VexAbiInfo* vbi,
+ Prefix pfx, Long delta, HChar* name,
+ IRTemp(*opFn)(IRTemp,IRTemp)
+ )
+{
+ return dis_VEX_NDS_256_AnySimdPfx_0F_WIG(
+ uses_vvvv, vbi, pfx, delta, name,
+ Iop_INVALID, opFn, False, False );
+}
+
+
/* Handles AVX256 unary E-to-G all-lanes operations. */
static
Long dis_AVX256_E_to_G_unary_all ( /*OUT*/Bool* uses_vvvv,
@@ -22268,6 +22443,26 @@
}
break;
+ case 0x64:
+ /* VPCMPGTB r/m, rV, r ::: r = rV `>s-by-8s` r/m */
+ /* VPCMPGTB = VEX.NDS.128.66.0F.WIG 64 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG_simple(
+ uses_vvvv, vbi, pfx, delta, "vpcmpgtb", Iop_CmpGT8Sx16 );
+ goto decode_success;
+ }
+ break;
+
+ case 0x65:
+ /* VPCMPGTW r/m, rV, r ::: r = rV `>s-by-16s` r/m */
+ /* VPCMPGTW = VEX.NDS.128.66.0F.WIG 65 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG_simple(
+ uses_vvvv, vbi, pfx, delta, "vpcmpgtw", Iop_CmpGT16Sx8 );
+ goto decode_success;
+ }
+ break;
+
case 0x66:
/* VPCMPGTD r/m, rV, r ::: r = rV `>s-by-32s` r/m */
/* VPCMPGTD = VEX.NDS.128.66.0F.WIG 66 /r */
@@ -22802,7 +22997,7 @@
if (delta > delta0) goto decode_success;
/* else fall through -- decoding has failed */
}
- /* VCMPPD xmm3/m64(E=argL), xmm2(V=argR), xmm1(G) */
+ /* VCMPPD xmm3/m128(E=argL), xmm2(V=argR), xmm1(G) */
/* = VEX.NDS.128.66.0F.WIG C2 /r ib */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
Long delta0 = delta;
@@ -22812,6 +23007,34 @@
if (delta > delta0) goto decode_success;
/* else fall through -- decoding has failed */
}
+ /* VCMPPD ymm3/m256(E=argL), ymm2(V=argR), ymm1(G) */
+ /* = VEX.NDS.256.66.0F.WIG C2 /r ib */
+ if (have66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ Long delta0 = delta;
+ delta = dis_AVX256_cmp_V_E_to_G( uses_vvvv, vbi, pfx, delta,
+ "vcmppd", 8/*sz*/);
+ if (delta > delta0) goto decode_success;
+ /* else fall through -- decoding has failed */
+ }
+ /* VCMPPS xmm3/m128(E=argL), xmm2(V=argR), xmm1(G) */
+ /* = VEX.NDS.128.0F.WIG C2 /r ib */
+ if (haveNo66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ Long delta0 = delta;
+ delta = dis_AVX128_cmp_V_E_to_G( uses_vvvv, vbi, pfx, delta,
+ "vcmpps", True/*all_lanes*/,
+ 4/*sz*/);
+ if (delta > delta0) goto decode_success;
+ /* else fall through -- decoding has failed */
+ }
+ /* VCMPPS ymm3/m256(E=argL), ymm2(V=argR), ymm1(G) */
+ /* = VEX.NDS.256.0F.WIG C2 /r ib */
+ if (haveNo66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ Long delta0 = delta;
+ delta = dis_AVX256_cmp_V_E_to_G( uses_vvvv, vbi, pfx, delta,
+ "vcmpps", 4/*sz*/);
+ if (delta > delta0) goto decode_success;
+ /* else fall through -- decoding has failed */
+ }
break;
case 0xC4:
@@ -22983,6 +23206,37 @@
}
break;
+ case 0xD0:
+ /* VADDSUBPD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG D0 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG_complex(
+ uses_vvvv, vbi, pfx, delta,
+ "vaddsubpd", math_ADDSUBPD_128 );
+ goto decode_success;
+ }
+ /* VADDSUBPD ymm3/m256, ymm2, ymm1 = VEX.NDS.256.66.0F.WIG D0 /r */
+ if (have66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ delta = dis_VEX_NDS_256_AnySimdPfx_0F_WIG_complex(
+ uses_vvvv, vbi, pfx, delta,
+ "vaddsubpd", math_ADDSUBPD_256 );
+ goto decode_success;
+ }
+ /* VADDSUBPS xmm3/m128, xmm2, xmm1 = VEX.NDS.128.F2.0F.WIG D0 /r */
+ if (haveF2no66noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG_complex(
+ uses_vvvv, vbi, pfx, delta,
+ "vaddsubps", math_ADDSUBPS_128 );
+ goto decode_success;
+ }
+ /* VADDSUBPS ymm3/m256, ymm2, ymm1 = VEX.NDS.256.F2.0F.WIG D0 /r */
+ if (haveF2no66noF3(pfx) && 1==getVexL(pfx)/*256*/) {
+ delta = dis_VEX_NDS_256_AnySimdPfx_0F_WIG_complex(
+ uses_vvvv, vbi, pfx, delta,
+ "vaddsubps", math_ADDSUBPS_256 );
+ goto decode_success;
+ }
+ break;
+
case 0xD1:
/* VPSRLW xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG D1 /r */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
@@ -23323,10 +23577,20 @@
delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG_complex(
uses_vvvv, vbi, pfx, delta,
"vpmuludq", math_PMULUDQ_128 );
- goto decode_success;
+ goto decode_success;
}
break;
+ case 0xF5:
+ /* VPMADDWD xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F.WIG F5 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG_complex(
+ uses_vvvv, vbi, pfx, delta,
+ "vpmaddwd", math_PMADDWD_128 );
+ goto decode_success;
+ }
+ break;
+
case 0xF8:
/* VPSUBB r/m, rV, r ::: r = rV - r/m */
/* VPSUBB = VEX.NDS.128.66.0F.WIG F8 /r */
@@ -23767,6 +24031,16 @@
}
break;
+ case 0x28:
+ /* VPMULDQ xmm3/m128, xmm2, xmm1 = VEX.NDS.128.66.0F38.WIG 28 /r */
+ if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
+ delta = dis_VEX_NDS_128_AnySimdPfx_0F_WIG_complex(
+ uses_vvvv, vbi, pfx, delta,
+ "vpmuldq", math_PMULDQ_128 );
+ goto decode_success;
+ }
+ break;
+
case 0x29:
/* VPCMPEQQ r/m, rV, r ::: r = rV `eq-by-64s` r/m */
/* VPCMPEQQ = VEX.NDS.128.66.0F38.WIG 29 /r */
|
|
From: <sv...@va...> - 2012-06-24 11:09:47
|
sewardj 2012-06-24 12:09:37 +0100 (Sun, 24 Jun 2012)
New Revision: 2403
Log:
VROUND{PS,PD}: fix incorrect comments.
Modified files:
trunk/priv/guest_amd64_toIR.c
Modified: trunk/priv/guest_amd64_toIR.c (+4 -4)
===================================================================
--- trunk/priv/guest_amd64_toIR.c 2012-06-22 13:36:19 +01:00 (rev 2402)
+++ trunk/priv/guest_amd64_toIR.c 2012-06-24 12:09:37 +01:00 (rev 2403)
@@ -24149,7 +24149,7 @@
break;
case 0x08:
- /* VROUNDPS imm8, xmm3/m128, xmm2, xmm1 */
+ /* VROUNDPS imm8, xmm2/m128, xmm1 */
/* VROUNDPS = VEX.NDS.128.66.0F3A.WIG 08 ib */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
UChar modrm = getUChar(delta);
@@ -24197,7 +24197,7 @@
# undef CVT
goto decode_success;
}
- /* VROUNDPS imm8, ymm3/m256, ymm2, ymm1 */
+ /* VROUNDPS imm8, ymm2/m256, ymm1 */
/* VROUNDPS = VEX.NDS.256.66.0F3A.WIG 08 ib */
if (have66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
UChar modrm = getUChar(delta);
@@ -24254,7 +24254,7 @@
}
case 0x09:
- /* VROUNDPD imm8, xmm3/m128, xmm2, xmm1 */
+ /* VROUNDPD imm8, xmm2/m128, xmm1 */
/* VROUNDPD = VEX.NDS.128.66.0F3A.WIG 09 ib */
if (have66noF2noF3(pfx) && 0==getVexL(pfx)/*128*/) {
UChar modrm = getUChar(delta);
@@ -24298,7 +24298,7 @@
# undef CVT
goto decode_success;
}
- /* VROUNDPD imm8, ymm3/m256, ymm2, ymm1 */
+ /* VROUNDPD imm8, ymm2/m256, ymm1 */
/* VROUNDPD = VEX.NDS.256.66.0F3A.WIG 09 ib */
if (have66noF2noF3(pfx) && 1==getVexL(pfx)/*256*/) {
UChar modrm = getUChar(delta);
From: <sv...@va...> - 2012-06-24 11:04:20
sewardj 2012-06-24 12:04:08 +0100 (Sun, 24 Jun 2012)
New Revision: 12666
Log:
Update.
Modified files:
trunk/none/tests/amd64/avx-1.c
Modified: trunk/none/tests/amd64/avx-1.c (+125 -0)
===================================================================
--- trunk/none/tests/amd64/avx-1.c 2012-06-24 11:30:53 +01:00 (rev 12665)
+++ trunk/none/tests/amd64/avx-1.c 2012-06-24 12:04:08 +01:00 (rev 12666)
@@ -1372,29 +1372,108 @@
"andq $63, 128(%%rax);"
"vpsrlq 128(%%rax), %%xmm8, %%xmm9")
+GEN_test_RandM(VROUNDPS_128_0x0,
+ "vroundps $0x0, %%xmm8, %%xmm9",
+ "vroundps $0x0, (%%rax), %%xmm9")
+GEN_test_RandM(VROUNDPS_128_0x1,
+ "vroundps $0x1, %%xmm8, %%xmm9",
+ "vroundps $0x1, (%%rax), %%xmm9")
GEN_test_RandM(VROUNDPS_128_0x2,
"vroundps $0x2, %%xmm8, %%xmm9",
"vroundps $0x2, (%%rax), %%xmm9")
+GEN_test_RandM(VROUNDPS_128_0x3,
+ "vroundps $0x3, %%xmm8, %%xmm9",
+ "vroundps $0x3, (%%rax), %%xmm9")
+GEN_test_RandM(VROUNDPS_128_0x4,
+ "vroundps $0x4, %%xmm8, %%xmm9",
+ "vroundps $0x4, (%%rax), %%xmm9")
+GEN_test_RandM(VROUNDPS_256_0x0,
+ "vroundps $0x0, %%ymm8, %%ymm9",
+ "vroundps $0x0, (%%rax), %%ymm9")
+GEN_test_RandM(VROUNDPS_256_0x1,
+ "vroundps $0x1, %%ymm8, %%ymm9",
+ "vroundps $0x1, (%%rax), %%ymm9")
GEN_test_RandM(VROUNDPS_256_0x2,
"vroundps $0x2, %%ymm8, %%ymm9",
"vroundps $0x2, (%%rax), %%ymm9")
+GEN_test_RandM(VROUNDPS_256_0x3,
+ "vroundps $0x3, %%ymm8, %%ymm9",
+ "vroundps $0x3, (%%rax), %%ymm9")
+GEN_test_RandM(VROUNDPS_256_0x4,
+ "vroundps $0x4, %%ymm8, %%ymm9",
+ "vroundps $0x4, (%%rax), %%ymm9")
+GEN_test_RandM(VROUNDPD_128_0x0,
+ "vroundpd $0x0, %%xmm8, %%xmm9",
+ "vroundpd $0x0, (%%rax), %%xmm9")
+GEN_test_RandM(VROUNDPD_128_0x1,
+ "vroundpd $0x1, %%xmm8, %%xmm9",
+ "vroundpd $0x1, (%%rax), %%xmm9")
GEN_test_RandM(VROUNDPD_128_0x2,
"vroundpd $0x2, %%xmm8, %%xmm9",
"vroundpd $0x2, (%%rax), %%xmm9")
+GEN_test_RandM(VROUNDPD_128_0x3,
+ "vroundpd $0x3, %%xmm8, %%xmm9",
+ "vroundpd $0x3, (%%rax), %%xmm9")
+GEN_test_RandM(VROUNDPD_128_0x4,
+ "vroundpd $0x4, %%xmm8, %%xmm9",
+ "vroundpd $0x4, (%%rax), %%xmm9")
+GEN_test_RandM(VROUNDPD_256_0x0,
+ "vroundpd $0x0, %%ymm8, %%ymm9",
+ "vroundpd $0x0, (%%rax), %%ymm9")
+GEN_test_RandM(VROUNDPD_256_0x1,
+ "vroundpd $0x1, %%ymm8, %%ymm9",
+ "vroundpd $0x1, (%%rax), %%ymm9")
GEN_test_RandM(VROUNDPD_256_0x2,
"vroundpd $0x2, %%ymm8, %%ymm9",
"vroundpd $0x2, (%%rax), %%ymm9")
+GEN_test_RandM(VROUNDPD_256_0x3,
+ "vroundpd $0x3, %%ymm8, %%ymm9",
+ "vroundpd $0x3, (%%rax), %%ymm9")
+GEN_test_RandM(VROUNDPD_256_0x4,
+ "vroundpd $0x4, %%ymm8, %%ymm9",
+ "vroundpd $0x4, (%%rax), %%ymm9")
+
+GEN_test_RandM(VROUNDSS_0x0,
+ "vroundss $0x0, %%xmm8, %%xmm6, %%xmm9",
+ "vroundss $0x0, (%%rax), %%xmm6, %%xmm9")
+GEN_test_RandM(VROUNDSS_0x1,
+ "vroundss $0x1, %%xmm8, %%xmm6, %%xmm9",
+ "vroundss $0x1, (%%rax), %%xmm6, %%xmm9")
+GEN_test_RandM(VROUNDSS_0x2,
+ "vroundss $0x2, %%xmm8, %%xmm6, %%xmm9",
+ "vroundss $0x2, (%%rax), %%xmm6, %%xmm9")
+GEN_test_RandM(VROUNDSS_0x3,
+ "vroundss $0x3, %%xmm8, %%xmm6, %%xmm9",
+ "vroundss $0x3, (%%rax), %%xmm6, %%xmm9")
+GEN_test_RandM(VROUNDSS_0x4,
+ "vroundss $0x4, %%xmm8, %%xmm6, %%xmm9",
+ "vroundss $0x4, (%%rax), %%xmm6, %%xmm9")
GEN_test_RandM(VROUNDSS_0x5,
"vroundss $0x5, %%xmm8, %%xmm6, %%xmm9",
"vroundss $0x5, (%%rax), %%xmm6, %%xmm9")
+GEN_test_RandM(VROUNDSD_0x0,
+ "vroundsd $0x0, %%xmm8, %%xmm6, %%xmm9",
+ "vroundsd $0x0, (%%rax), %%xmm6, %%xmm9")
+GEN_test_RandM(VROUNDSD_0x1,
+ "vroundsd $0x1, %%xmm8, %%xmm6, %%xmm9",
+ "vroundsd $0x1, (%%rax), %%xmm6, %%xmm9")
+GEN_test_RandM(VROUNDSD_0x2,
+ "vroundsd $0x2, %%xmm8, %%xmm6, %%xmm9",
+ "vroundsd $0x2, (%%rax), %%xmm6, %%xmm9")
GEN_test_RandM(VROUNDSD_0x3,
"vroundsd $0x3, %%xmm8, %%xmm6, %%xmm9",
"vroundsd $0x3, (%%rax), %%xmm6, %%xmm9")
+GEN_test_RandM(VROUNDSD_0x4,
+ "vroundsd $0x4, %%xmm8, %%xmm6, %%xmm9",
+ "vroundsd $0x4, (%%rax), %%xmm6, %%xmm9")
+GEN_test_RandM(VROUNDSD_0x5,
+ "vroundsd $0x5, %%xmm8, %%xmm6, %%xmm9",
+ "vroundsd $0x5, (%%rax), %%xmm6, %%xmm9")
GEN_test_RandM(VPTEST_128_1,
"vptest %%xmm6, %%xmm8; "
@@ -1561,7 +1640,23 @@
"vtestpd (%%rax), %%ymm9; "
"pushfq; popq %%r14; andq $0x8D5, %%r14")
+GEN_test_RandM(VBLENDVPS_128,
+ "vblendvps %%xmm9, %%xmm6, %%xmm8, %%xmm7",
+ "vblendvps %%xmm9, (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VBLENDVPS_256,
+ "vblendvps %%ymm9, %%ymm6, %%ymm8, %%ymm7",
+ "vblendvps %%ymm9, (%%rax), %%ymm8, %%ymm7")
+
+GEN_test_RandM(VBLENDVPD_128,
+ "vblendvpd %%xmm9, %%xmm6, %%xmm8, %%xmm7",
+ "vblendvpd %%xmm9, (%%rax), %%xmm8, %%xmm7")
+
+GEN_test_RandM(VBLENDVPD_256,
+ "vblendvpd %%ymm9, %%ymm6, %%ymm8, %%ymm7",
+ "vblendvpd %%ymm9, (%%rax), %%ymm8, %%ymm7")
+
+
/* Comment duplicated above, for convenient reference:
Allowed operands in test insns:
Reg form: %ymm6, %ymm7, %ymm8, %ymm9 and %r14.
@@ -1912,12 +2007,38 @@
DO_D( VPSRAD_128 );
DO_D( VPSLLQ_128 );
DO_D( VPSRLQ_128 );
+ DO_D( VROUNDPS_128_0x0 );
+ DO_D( VROUNDPS_128_0x1 );
DO_D( VROUNDPS_128_0x2 );
+ DO_D( VROUNDPS_128_0x3 );
+ DO_D( VROUNDPS_128_0x4 );
+ DO_D( VROUNDPS_256_0x0 );
+ DO_D( VROUNDPS_256_0x1 );
DO_D( VROUNDPS_256_0x2 );
+ DO_D( VROUNDPS_256_0x3 );
+ DO_D( VROUNDPS_256_0x4 );
+ DO_D( VROUNDPD_128_0x0 );
+ DO_D( VROUNDPD_128_0x1 );
DO_D( VROUNDPD_128_0x2 );
+ DO_D( VROUNDPD_128_0x3 );
+ DO_D( VROUNDPD_128_0x4 );
+ DO_D( VROUNDPD_256_0x0 );
+ DO_D( VROUNDPD_256_0x1 );
DO_D( VROUNDPD_256_0x2 );
+ DO_D( VROUNDPD_256_0x3 );
+ DO_D( VROUNDPD_256_0x4 );
+ DO_D( VROUNDSS_0x0 );
+ DO_D( VROUNDSS_0x1 );
+ DO_D( VROUNDSS_0x2 );
+ DO_D( VROUNDSS_0x3 );
+ DO_D( VROUNDSS_0x4 );
DO_D( VROUNDSS_0x5 );
+ DO_D( VROUNDSD_0x0 );
+ DO_D( VROUNDSD_0x1 );
+ DO_D( VROUNDSD_0x2 );
DO_D( VROUNDSD_0x3 );
+ DO_D( VROUNDSD_0x4 );
+ DO_D( VROUNDSD_0x5 );
DO_D( VPTEST_128_1 );
DO_D( VPTEST_128_2 );
DO_D( VPTEST_256_1 );
@@ -1934,5 +2055,9 @@
DO_D( VTESTPD_256_1 );
DO_D( VTESTPD_256_2 );
DO_N( 10, VTESTPD_256_3 );
+ DO_D( VBLENDVPS_128 );
+ DO_D( VBLENDVPS_256 );
+ DO_D( VBLENDVPD_128 );
+ DO_D( VBLENDVPD_256 );
return 0;
}
From: <sv...@va...> - 2012-06-24 10:31:04
sewardj 2012-06-24 11:30:53 +0100 (Sun, 24 Jun 2012)
New Revision: 12665
Log:
Allow each test to be run multiple times (default is 3), rather than
just once.
Modified files:
trunk/none/tests/amd64/avx-1.c
Modified: trunk/none/tests/amd64/avx-1.c (+361 -351)
===================================================================
--- trunk/none/tests/amd64/avx-1.c 2012-06-24 10:10:38 +01:00 (rev 12664)
+++ trunk/none/tests/amd64/avx-1.c 2012-06-24 11:30:53 +01:00 (rev 12665)
@@ -1569,360 +1569,370 @@
Imm8 etc fields are also allowed, where they make sense.
*/
+#define N_DEFAULT_ITERS 3
+
+// Do the specified test some number of times
+#define DO_N(_iters, _testfn) \
+ do { int i; for (i = 0; i < (_iters); i++) { test_##_testfn(); } } while (0)
+
+// Do the specified test the default number of times
+#define DO_D(_testfn) DO_N(N_DEFAULT_ITERS, _testfn)
+
+
int main ( void )
{
- test_VMOVUPD_EtoG_256();
- test_VMOVUPD_GtoE_256();
- test_VPSUBW_128();
- test_VPSUBQ_128();
- test_VPADDQ_128();
- test_VPINSRQ_128();
- test_VUCOMISS_128();
- test_VUCOMISD_128();
- test_VCVTPS2PD_128();
- test_VANDNPD_128();
- test_VORPD_128();
- test_VXORPD_128();
- test_VXORPS_128();
- test_VMULSD_128();
- test_VADDSD_128();
- test_VMINSD_128();
- test_VSUBSD_128();
- test_VDIVSD_128();
- test_VMAXSD_128();
- test_VPSHUFD_0x39_128();
- test_VPCMPEQD_128();
- test_VPEXTRD_128_0x3();
- test_VPEXTRD_128_0x0();
- test_VINSERTF128_0x0();
- test_VINSERTF128_0x1();
- test_VEXTRACTF128_0x0();
- test_VEXTRACTF128_0x1();
- test_VCVTPD2PS_128(); // see comment on the test
+ DO_D( VMOVUPD_EtoG_256 );
+ DO_D( VMOVUPD_GtoE_256 );
+ DO_D( VPSUBW_128 );
+ DO_D( VPSUBQ_128 );
+ DO_D( VPADDQ_128 );
+ DO_D( VPINSRQ_128 );
+ DO_D( VUCOMISS_128 );
+ DO_D( VUCOMISD_128 );
+ DO_D( VCVTPS2PD_128 );
+ DO_D( VANDNPD_128 );
+ DO_D( VORPD_128 );
+ DO_D( VXORPD_128 );
+ DO_D( VXORPS_128 );
+ DO_D( VMULSD_128 );
+ DO_D( VADDSD_128 );
+ DO_D( VMINSD_128 );
+ DO_D( VSUBSD_128 );
+ DO_D( VDIVSD_128 );
+ DO_D( VMAXSD_128 );
+ DO_D( VPSHUFD_0x39_128 );
+ DO_D( VPCMPEQD_128 );
+ DO_D( VPEXTRD_128_0x3 );
+ DO_D( VPEXTRD_128_0x0 );
+ DO_D( VINSERTF128_0x0 );
+ DO_D( VINSERTF128_0x1 );
+ DO_D( VEXTRACTF128_0x0 );
+ DO_D( VEXTRACTF128_0x1 );
+ DO_D( VCVTPD2PS_128 );
/* Test all CMPSS variants; this code is tricky. */
- test_VCMPSS_128_0x0();
- test_VCMPSS_128_0x1();
- test_VCMPSS_128_0x2();
- test_VCMPSS_128_0x3();
- test_VCMPSS_128_0x4();
- test_VCMPSS_128_0x5();
- test_VCMPSS_128_0x6();
- test_VCMPSS_128_0x7();
- test_VCMPSS_128_0xA();
+ DO_D( VCMPSS_128_0x0 );
+ DO_D( VCMPSS_128_0x1 );
+ DO_D( VCMPSS_128_0x2 );
+ DO_D( VCMPSS_128_0x3 );
+ DO_D( VCMPSS_128_0x4 );
+ DO_D( VCMPSS_128_0x5 );
+ DO_D( VCMPSS_128_0x6 );
+ DO_D( VCMPSS_128_0x7 );
+ DO_D( VCMPSS_128_0xA );
/* no 0xB case yet observed */
- test_VCMPSS_128_0xC();
- test_VCMPSS_128_0xD();
- test_VCMPSS_128_0xE();
- test_VMOVDDUP_XMMorMEM64_to_XMM();
- test_VMOVD_IREGorMEM32_to_XMM();
- test_VMOVQ_XMM_MEM64();
- test_VMOVDQA_GtoE_256();
- test_VMOVDQA_GtoE_128();
- test_VMOVDQU_GtoE_128();
- test_VMOVDQA_EtoG_256();
- test_VMOVDQA_EtoG_128();
- test_VMOVDQU_EtoG_128();
- test_VMOVAPD_GtoE_128();
- test_VMOVAPD_GtoE_256();
- test_VMOVAPS_GtoE_128();
- test_VMOVAPS_GtoE_256();
- test_VMOVAPS_EtoG_128();
- test_VMOVAPD_EtoG_256();
- test_VMOVAPD_EtoG_128();
- test_VMOVUPD_GtoE_128();
- test_VMOVSS_XMM_M32();
- test_VMOVSD_XMM_M64();
- test_VMOVSS_M64_XMM();
- test_VMOVSD_M64_XMM();
- test_VINSERTPS_0x39_128();
- test_VPUNPCKLDQ_128();
- test_VPACKSSDW_128();
- test_VPADDW_128();
- test_VPSRLW_0x05_128();
- test_VPSLLW_0x05_128();
- test_VPUNPCKLQDQ_128();
- test_VPINSRD_128();
- test_VMOVD_XMM_to_MEM32();
- test_VPANDN_128();
- test_VPSLLDQ_0x05_128();
- test_VPSRLDQ_0x05_128();
- test_VPSUBUSB_128();
- test_VPSUBSB_128();
- test_VPSLLD_0x05_128();
- test_VPSRLD_0x05_128();
- test_VPSRAD_0x05_128();
- test_VPUNPCKLWD_128();
- test_VPUNPCKHWD_128();
- test_VPADDUSB_128();
- test_VPMULHUW_128();
- test_VPADDUSW_128();
- test_VPMULLW_128();
- test_VPSHUFHW_0x39_128();
- test_VPSHUFLW_0x39_128();
- test_VCVTPS2DQ_128();
- test_VSUBPS_128();
- test_VADDPS_128();
- test_VMULPS_128();
- test_VMAXPS_128();
- test_VMINPS_128();
- test_VSHUFPS_0x39_128();
- test_VPCMPEQB_128();
- test_VMOVHPD_128_StoreForm();
- test_VPAND_128();
- test_VPMOVMSKB_128();
- test_VCVTTSS2SI_64();
- test_VPACKUSWB_128();
- test_VCVTSS2SD_128();
- test_VCVTSD2SS_128();
- test_VMOVD_XMM_to_IREG32();
- test_VPCMPESTRM_0x45_128();
- test_VMOVQ_IREGorMEM64_to_XMM();
- test_VMOVUPS_XMM_to_XMMorMEM();
- test_VMOVNTDQ_128();
- test_VMOVLHPS_128();
- test_VPABSD_128();
- test_VMOVHLPS_128();
- test_VMOVQ_XMM_to_IREG64();
- test_VMOVQ_XMMorMEM64_to_XMM();
- test_VCVTTSS2SI_32();
- test_VPUNPCKLBW_128();
- test_VPUNPCKHBW_128();
- test_VMULSS_128();
- test_VSUBSS_128();
- test_VADDSS_128();
- test_VDIVSS_128();
- test_VUNPCKLPS_128();
- test_VCVTSI2SS_128();
- test_VANDPS_128();
- test_VMINSS_128();
- test_VMAXSS_128();
- test_VANDNPS_128();
- test_VORPS_128();
- test_VSQRTSD_128();
- test_VCMPSD_128_0xD();
- test_VCMPSD_128_0x0();
- test_VPSHUFB_128();
- test_VCVTTSD2SI_32();
- test_VCVTTSD2SI_64();
- test_VCVTSI2SS_64();
- test_VCVTSI2SD_64();
- test_VCVTSI2SD_32();
- test_VPOR_128();
- test_VPXOR_128();
- test_VPSUBB_128();
- test_VPSUBD_128();
- test_VPADDD_128();
- test_VPMOVZXBW_128();
- test_VPMOVZXWD_128();
- test_VPBLENDVB_128();
- test_VPMINSD_128();
- test_VPMAXSD_128();
- test_VANDPD_128();
- test_VMULPD_256();
- test_VMOVUPD_EtoG_128();
- test_VADDPD_256();
- test_VSUBPD_256();
- test_VDIVPD_256();
- test_VPCMPEQQ_128();
- test_VSUBPD_128();
- test_VADDPD_128();
- test_VUNPCKLPD_128();
- test_VUNPCKHPD_128();
- test_VUNPCKHPS_128();
- test_VMOVUPS_EtoG_128();
- test_VADDPS_256();
- test_VSUBPS_256();
- test_VMULPS_256();
- test_VDIVPS_256();
- test_VPCMPGTQ_128();
- test_VPEXTRQ_128_0x0();
- test_VPEXTRQ_128_0x1();
- test_VPSRLQ_0x05_128();
- test_VPMULUDQ_128();
- test_VPSLLQ_0x05_128();
- test_VPMAXUD_128();
- test_VPMINUD_128();
- test_VPMULLD_128();
- test_VPMAXUW_128();
- test_VPEXTRW_128_EregOnly_toG_0x0();
- test_VPEXTRW_128_EregOnly_toG_0x7();
- test_VPMINUW_128();
- test_VPHMINPOSUW_128();
- test_VPMAXSW_128();
- test_VPMINSW_128();
- test_VPMAXUB_128();
- test_VPEXTRB_GtoE_128_0x0();
- test_VPEXTRB_GtoE_128_0x1();
- test_VPEXTRB_GtoE_128_0x2();
- test_VPEXTRB_GtoE_128_0x3();
- test_VPEXTRB_GtoE_128_0x4();
- test_VPEXTRB_GtoE_128_0x9();
- test_VPEXTRB_GtoE_128_0xE();
- test_VPEXTRB_GtoE_128_0xF();
- test_VPMINUB_128();
- test_VPMAXSB_128();
- test_VPMINSB_128();
- test_VPERM2F128_0x00();
- test_VPERM2F128_0xFF();
- test_VPERM2F128_0x30();
- test_VPERM2F128_0x21();
- test_VPERM2F128_0x12();
- test_VPERM2F128_0x03();
- test_VPERM2F128_0x85();
- test_VPERM2F128_0x5A();
- test_VPERMILPD_256_0x0();
- test_VPERMILPD_256_0xF();
- test_VPERMILPD_256_0xA();
- test_VPERMILPD_256_0x5();
- test_VPERMILPD_128_0x0();
- test_VPERMILPD_128_0x3();
- test_VUNPCKLPD_256();
- test_VUNPCKHPD_256();
- test_VSHUFPS_0x39_256();
- test_VUNPCKLPS_256();
- test_VUNPCKHPS_256();
- test_VXORPD_256();
- test_VBROADCASTSD_256();
- test_VCMPPD_128_0x4();
- test_VCVTDQ2PD_128();
- test_VDIVPD_128();
- test_VANDPD_256();
- test_VPMOVSXBW_128();
- test_VPSUBUSW_128();
- test_VPSUBSW_128();
- test_VPCMPEQW_128();
- test_VPADDB_128();
- test_VMOVAPS_EtoG_256();
- test_VCVTDQ2PD_256();
- test_VMOVHPD_128_LoadForm();
- test_VCVTPD2PS_256();
- test_VPUNPCKHDQ_128();
- test_VBROADCASTSS_128();
- test_VPMOVSXDQ_128();
- test_VPMOVSXWD_128();
- test_VDIVPS_128();
- test_VANDPS_256();
- test_VXORPS_256();
- test_VORPS_256();
- test_VANDNPD_256();
- test_VANDNPS_256();
- test_VORPD_256();
- test_VPERMILPS_256_0x0F();
- test_VPERMILPS_256_0xFA();
- test_VPERMILPS_256_0xA3();
- test_VPERMILPS_256_0x5A();
- test_VPMULHW_128();
- test_VPUNPCKHQDQ_128();
- test_VPSRAW_0x05_128();
- test_VPCMPGTD_128();
- test_VPMOVZXBD_128();
- test_VPMOVSXBD_128();
- test_VPINSRB_128_1of3();
- test_VPINSRB_128_2of3();
- test_VPINSRB_128_3of3();
- test_VCOMISD_128();
- test_VCOMISS_128();
- test_VMOVUPS_YMM_to_YMMorMEM();
- test_VDPPD_128_1of4();
- test_VDPPD_128_2of4();
- test_VDPPD_128_3of4();
- test_VDPPD_128_4of4();
- test_VPINSRW_128_1of4();
- test_VPINSRW_128_2of4();
- test_VPINSRW_128_3of4();
- test_VPINSRW_128_4of4();
- test_VBROADCASTSS_256();
- test_VPALIGNR_128_1of3();
- test_VPALIGNR_128_2of3();
- test_VPALIGNR_128_3of3();
- test_VMOVSD_REG_XMM();
- test_VMOVSS_REG_XMM();
- test_VMOVLPD_128_M64_XMM_XMM();
- test_VMOVLPD_128_XMM_M64();
- test_VSHUFPD_128_1of2();
- test_VSHUFPD_128_2of2();
- test_VSHUFPD_256_1of2();
- test_VSHUFPD_256_2of2();
- test_VPERMILPS_128_0x00();
- test_VPERMILPS_128_0xFE();
- test_VPERMILPS_128_0x30();
- test_VPERMILPS_128_0x21();
- test_VPERMILPS_128_0xD7();
- test_VPERMILPS_128_0xB5();
- test_VPERMILPS_128_0x85();
- test_VPERMILPS_128_0x29();
- test_VBLENDPS_128_1of3();
- test_VBLENDPS_128_2of3();
- test_VBLENDPS_128_3of3();
- test_VBLENDPD_128_1of2();
- test_VBLENDPD_128_2of2();
- test_VBLENDPD_256_1of3();
- test_VBLENDPD_256_2of3();
- test_VBLENDPD_256_3of3();
- test_VPBLENDW_128_0x00();
- test_VPBLENDW_128_0xFE();
- test_VPBLENDW_128_0x30();
- test_VPBLENDW_128_0x21();
- test_VPBLENDW_128_0xD7();
- test_VPBLENDW_128_0xB5();
- test_VPBLENDW_128_0x85();
- test_VPBLENDW_128_0x29();
- test_VMOVUPS_EtoG_256();
- test_VSQRTSS_128();
- test_VSQRTPS_128();
- test_VSQRTPS_256();
- test_VSQRTPD_128();
- test_VSQRTPD_256();
- test_VRSQRTSS_128();
- test_VRSQRTPS_128();
- test_VRSQRTPS_256();
- test_VMOVDQU_GtoE_256();
- test_VCVTPS2PD_256();
- test_VCVTTPS2DQ_128();
- test_VCVTTPS2DQ_256();
- test_VCVTDQ2PS_128();
- test_VCVTDQ2PS_256();
- test_VCVTTPD2DQ_128();
- test_VCVTTPD2DQ_256();
- test_VCVTPD2DQ_128();
- test_VCVTPD2DQ_256();
- test_VMOVSLDUP_128();
- test_VMOVSLDUP_256();
- test_VMOVSHDUP_128();
- test_VMOVSHDUP_256();
- test_VPERMILPS_VAR_128();
- test_VPERMILPD_VAR_128();
- test_VPERMILPS_VAR_256();
- test_VPERMILPD_VAR_256();
- test_VPSLLW_128();
- test_VPSRLW_128();
- test_VPSRAW_128();
- test_VPSLLD_128();
- test_VPSRLD_128();
- test_VPSRAD_128();
- test_VPSLLQ_128();
- test_VPSRLQ_128();
- test_VROUNDPS_128_0x2();
- test_VROUNDPS_256_0x2();
- test_VROUNDPD_128_0x2();
- test_VROUNDPD_256_0x2();
- test_VROUNDSS_0x5();
- test_VROUNDSD_0x3();
- test_VPTEST_128_1();
- test_VPTEST_128_2();
- test_VPTEST_256_1();
- test_VPTEST_256_2();
- test_VTESTPS_128_1();
- test_VTESTPS_128_2();
- test_VTESTPS_128_3(); // 10x
- test_VTESTPS_256_1();
- test_VTESTPS_256_2();
- test_VTESTPS_256_3(); // 10x
- test_VTESTPD_128_1();
- test_VTESTPD_128_2();
- test_VTESTPD_128_3(); // 10x
- test_VTESTPD_256_1();
- test_VTESTPD_256_2();
- test_VTESTPD_256_3(); // 10x
+ DO_D( VCMPSS_128_0xC );
+ DO_D( VCMPSS_128_0xD );
+ DO_D( VCMPSS_128_0xE );
+ DO_D( VMOVDDUP_XMMorMEM64_to_XMM );
+ DO_D( VMOVD_IREGorMEM32_to_XMM );
+ DO_D( VMOVQ_XMM_MEM64 );
+ DO_D( VMOVDQA_GtoE_256 );
+ DO_D( VMOVDQA_GtoE_128 );
+ DO_D( VMOVDQU_GtoE_128 );
+ DO_D( VMOVDQA_EtoG_256 );
+ DO_D( VMOVDQA_EtoG_128 );
+ DO_D( VMOVDQU_EtoG_128 );
+ DO_D( VMOVAPD_GtoE_128 );
+ DO_D( VMOVAPD_GtoE_256 );
+ DO_D( VMOVAPS_GtoE_128 );
+ DO_D( VMOVAPS_GtoE_256 );
+ DO_D( VMOVAPS_EtoG_128 );
+ DO_D( VMOVAPD_EtoG_256 );
+ DO_D( VMOVAPD_EtoG_128 );
+ DO_D( VMOVUPD_GtoE_128 );
+ DO_D( VMOVSS_XMM_M32 );
+ DO_D( VMOVSD_XMM_M64 );
+ DO_D( VMOVSS_M64_XMM );
+ DO_D( VMOVSD_M64_XMM );
+ DO_D( VINSERTPS_0x39_128 );
+ DO_D( VPUNPCKLDQ_128 );
+ DO_D( VPACKSSDW_128 );
+ DO_D( VPADDW_128 );
+ DO_D( VPSRLW_0x05_128 );
+ DO_D( VPSLLW_0x05_128 );
+ DO_D( VPUNPCKLQDQ_128 );
+ DO_D( VPINSRD_128 );
+ DO_D( VMOVD_XMM_to_MEM32 );
+ DO_D( VPANDN_128 );
+ DO_D( VPSLLDQ_0x05_128 );
+ DO_D( VPSRLDQ_0x05_128 );
+ DO_D( VPSUBUSB_128 );
+ DO_D( VPSUBSB_128 );
+ DO_D( VPSLLD_0x05_128 );
+ DO_D( VPSRLD_0x05_128 );
+ DO_D( VPSRAD_0x05_128 );
+ DO_D( VPUNPCKLWD_128 );
+ DO_D( VPUNPCKHWD_128 );
+ DO_D( VPADDUSB_128 );
+ DO_D( VPMULHUW_128 );
+ DO_D( VPADDUSW_128 );
+ DO_D( VPMULLW_128 );
+ DO_D( VPSHUFHW_0x39_128 );
+ DO_D( VPSHUFLW_0x39_128 );
+ DO_D( VCVTPS2DQ_128 );
+ DO_D( VSUBPS_128 );
+ DO_D( VADDPS_128 );
+ DO_D( VMULPS_128 );
+ DO_D( VMAXPS_128 );
+ DO_D( VMINPS_128 );
+ DO_D( VSHUFPS_0x39_128 );
+ DO_D( VPCMPEQB_128 );
+ DO_D( VMOVHPD_128_StoreForm );
+ DO_D( VPAND_128 );
+ DO_D( VPMOVMSKB_128 );
+ DO_D( VCVTTSS2SI_64 );
+ DO_D( VPACKUSWB_128 );
+ DO_D( VCVTSS2SD_128 );
+ DO_D( VCVTSD2SS_128 );
+ DO_D( VMOVD_XMM_to_IREG32 );
+ DO_D( VPCMPESTRM_0x45_128 );
+ DO_D( VMOVQ_IREGorMEM64_to_XMM );
+ DO_D( VMOVUPS_XMM_to_XMMorMEM );
+ DO_D( VMOVNTDQ_128 );
+ DO_D( VMOVLHPS_128 );
+ DO_D( VPABSD_128 );
+ DO_D( VMOVHLPS_128 );
+ DO_D( VMOVQ_XMM_to_IREG64 );
+ DO_D( VMOVQ_XMMorMEM64_to_XMM );
+ DO_D( VCVTTSS2SI_32 );
+ DO_D( VPUNPCKLBW_128 );
+ DO_D( VPUNPCKHBW_128 );
+ DO_D( VMULSS_128 );
+ DO_D( VSUBSS_128 );
+ DO_D( VADDSS_128 );
+ DO_D( VDIVSS_128 );
+ DO_D( VUNPCKLPS_128 );
+ DO_D( VCVTSI2SS_128 );
+ DO_D( VANDPS_128 );
+ DO_D( VMINSS_128 );
+ DO_D( VMAXSS_128 );
+ DO_D( VANDNPS_128 );
+ DO_D( VORPS_128 );
+ DO_D( VSQRTSD_128 );
+ DO_D( VCMPSD_128_0xD );
+ DO_D( VCMPSD_128_0x0 );
+ DO_D( VPSHUFB_128 );
+ DO_D( VCVTTSD2SI_32 );
+ DO_D( VCVTTSD2SI_64 );
+ DO_D( VCVTSI2SS_64 );
+ DO_D( VCVTSI2SD_64 );
+ DO_D( VCVTSI2SD_32 );
+ DO_D( VPOR_128 );
+ DO_D( VPXOR_128 );
+ DO_D( VPSUBB_128 );
+ DO_D( VPSUBD_128 );
+ DO_D( VPADDD_128 );
+ DO_D( VPMOVZXBW_128 );
+ DO_D( VPMOVZXWD_128 );
+ DO_D( VPBLENDVB_128 );
+ DO_D( VPMINSD_128 );
+ DO_D( VPMAXSD_128 );
+ DO_D( VANDPD_128 );
+ DO_D( VMULPD_256 );
+ DO_D( VMOVUPD_EtoG_128 );
+ DO_D( VADDPD_256 );
+ DO_D( VSUBPD_256 );
+ DO_D( VDIVPD_256 );
+ DO_D( VPCMPEQQ_128 );
+ DO_D( VSUBPD_128 );
+ DO_D( VADDPD_128 );
+ DO_D( VUNPCKLPD_128 );
+ DO_D( VUNPCKHPD_128 );
+ DO_D( VUNPCKHPS_128 );
+ DO_D( VMOVUPS_EtoG_128 );
+ DO_D( VADDPS_256 );
+ DO_D( VSUBPS_256 );
+ DO_D( VMULPS_256 );
+ DO_D( VDIVPS_256 );
+ DO_D( VPCMPGTQ_128 );
+ DO_D( VPEXTRQ_128_0x0 );
+ DO_D( VPEXTRQ_128_0x1 );
+ DO_D( VPSRLQ_0x05_128 );
+ DO_D( VPMULUDQ_128 );
+ DO_D( VPSLLQ_0x05_128 );
+ DO_D( VPMAXUD_128 );
+ DO_D( VPMINUD_128 );
+ DO_D( VPMULLD_128 );
+ DO_D( VPMAXUW_128 );
+ DO_D( VPEXTRW_128_EregOnly_toG_0x0 );
+ DO_D( VPEXTRW_128_EregOnly_toG_0x7 );
+ DO_D( VPMINUW_128 );
+ DO_D( VPHMINPOSUW_128 );
+ DO_D( VPMAXSW_128 );
+ DO_D( VPMINSW_128 );
+ DO_D( VPMAXUB_128 );
+ DO_D( VPEXTRB_GtoE_128_0x0 );
+ DO_D( VPEXTRB_GtoE_128_0x1 );
+ DO_D( VPEXTRB_GtoE_128_0x2 );
+ DO_D( VPEXTRB_GtoE_128_0x3 );
+ DO_D( VPEXTRB_GtoE_128_0x4 );
+ DO_D( VPEXTRB_GtoE_128_0x9 );
+ DO_D( VPEXTRB_GtoE_128_0xE );
+ DO_D( VPEXTRB_GtoE_128_0xF );
+ DO_D( VPMINUB_128 );
+ DO_D( VPMAXSB_128 );
+ DO_D( VPMINSB_128 );
+ DO_D( VPERM2F128_0x00 );
+ DO_D( VPERM2F128_0xFF );
+ DO_D( VPERM2F128_0x30 );
+ DO_D( VPERM2F128_0x21 );
+ DO_D( VPERM2F128_0x12 );
+ DO_D( VPERM2F128_0x03 );
+ DO_D( VPERM2F128_0x85 );
+ DO_D( VPERM2F128_0x5A );
+ DO_D( VPERMILPD_256_0x0 );
+ DO_D( VPERMILPD_256_0xF );
+ DO_D( VPERMILPD_256_0xA );
+ DO_D( VPERMILPD_256_0x5 );
+ DO_D( VPERMILPD_128_0x0 );
+ DO_D( VPERMILPD_128_0x3 );
+ DO_D( VUNPCKLPD_256 );
+ DO_D( VUNPCKHPD_256 );
+ DO_D( VSHUFPS_0x39_256 );
+ DO_D( VUNPCKLPS_256 );
+ DO_D( VUNPCKHPS_256 );
+ DO_D( VXORPD_256 );
+ DO_D( VBROADCASTSD_256 );
+ DO_D( VCMPPD_128_0x4 );
+ DO_D( VCVTDQ2PD_128 );
+ DO_D( VDIVPD_128 );
+ DO_D( VANDPD_256 );
+ DO_D( VPMOVSXBW_128 );
+ DO_D( VPSUBUSW_128 );
+ DO_D( VPSUBSW_128 );
+ DO_D( VPCMPEQW_128 );
+ DO_D( VPADDB_128 );
+ DO_D( VMOVAPS_EtoG_256 );
+ DO_D( VCVTDQ2PD_256 );
+ DO_D( VMOVHPD_128_LoadForm );
+ DO_D( VCVTPD2PS_256 );
+ DO_D( VPUNPCKHDQ_128 );
+ DO_D( VBROADCASTSS_128 );
+ DO_D( VPMOVSXDQ_128 );
+ DO_D( VPMOVSXWD_128 );
+ DO_D( VDIVPS_128 );
+ DO_D( VANDPS_256 );
+ DO_D( VXORPS_256 );
+ DO_D( VORPS_256 );
+ DO_D( VANDNPD_256 );
+ DO_D( VANDNPS_256 );
+ DO_D( VORPD_256 );
+ DO_D( VPERMILPS_256_0x0F );
+ DO_D( VPERMILPS_256_0xFA );
+ DO_D( VPERMILPS_256_0xA3 );
+ DO_D( VPERMILPS_256_0x5A );
+ DO_D( VPMULHW_128 );
+ DO_D( VPUNPCKHQDQ_128 );
+ DO_D( VPSRAW_0x05_128 );
+ DO_D( VPCMPGTD_128 );
+ DO_D( VPMOVZXBD_128 );
+ DO_D( VPMOVSXBD_128 );
+ DO_D( VPINSRB_128_1of3 );
+ DO_D( VPINSRB_128_2of3 );
+ DO_D( VPINSRB_128_3of3 );
+ DO_D( VCOMISD_128 );
+ DO_D( VCOMISS_128 );
+ DO_D( VMOVUPS_YMM_to_YMMorMEM );
+ DO_D( VDPPD_128_1of4 );
+ DO_D( VDPPD_128_2of4 );
+ DO_D( VDPPD_128_3of4 );
+ DO_D( VDPPD_128_4of4 );
+ DO_D( VPINSRW_128_1of4 );
+ DO_D( VPINSRW_128_2of4 );
+ DO_D( VPINSRW_128_3of4 );
+ DO_D( VPINSRW_128_4of4 );
+ DO_D( VBROADCASTSS_256 );
+ DO_D( VPALIGNR_128_1of3 );
+ DO_D( VPALIGNR_128_2of3 );
+ DO_D( VPALIGNR_128_3of3 );
+ DO_D( VMOVSD_REG_XMM );
+ DO_D( VMOVSS_REG_XMM );
+ DO_D( VMOVLPD_128_M64_XMM_XMM );
+ DO_D( VMOVLPD_128_XMM_M64 );
+ DO_D( VSHUFPD_128_1of2 );
+ DO_D( VSHUFPD_128_2of2 );
+ DO_D( VSHUFPD_256_1of2 );
+ DO_D( VSHUFPD_256_2of2 );
+ DO_D( VPERMILPS_128_0x00 );
+ DO_D( VPERMILPS_128_0xFE );
+ DO_D( VPERMILPS_128_0x30 );
+ DO_D( VPERMILPS_128_0x21 );
+ DO_D( VPERMILPS_128_0xD7 );
+ DO_D( VPERMILPS_128_0xB5 );
+ DO_D( VPERMILPS_128_0x85 );
+ DO_D( VPERMILPS_128_0x29 );
+ DO_D( VBLENDPS_128_1of3 );
+ DO_D( VBLENDPS_128_2of3 );
+ DO_D( VBLENDPS_128_3of3 );
+ DO_D( VBLENDPD_128_1of2 );
+ DO_D( VBLENDPD_128_2of2 );
+ DO_D( VBLENDPD_256_1of3 );
+ DO_D( VBLENDPD_256_2of3 );
+ DO_D( VBLENDPD_256_3of3 );
+ DO_D( VPBLENDW_128_0x00 );
+ DO_D( VPBLENDW_128_0xFE );
+ DO_D( VPBLENDW_128_0x30 );
+ DO_D( VPBLENDW_128_0x21 );
+ DO_D( VPBLENDW_128_0xD7 );
+ DO_D( VPBLENDW_128_0xB5 );
+ DO_D( VPBLENDW_128_0x85 );
+ DO_D( VPBLENDW_128_0x29 );
+ DO_D( VMOVUPS_EtoG_256 );
+ DO_D( VSQRTSS_128 );
+ DO_D( VSQRTPS_128 );
+ DO_D( VSQRTPS_256 );
+ DO_D( VSQRTPD_128 );
+ DO_D( VSQRTPD_256 );
+ DO_D( VRSQRTSS_128 );
+ DO_D( VRSQRTPS_128 );
+ DO_D( VRSQRTPS_256 );
+ DO_D( VMOVDQU_GtoE_256 );
+ DO_D( VCVTPS2PD_256 );
+ DO_D( VCVTTPS2DQ_128 );
+ DO_D( VCVTTPS2DQ_256 );
+ DO_D( VCVTDQ2PS_128 );
+ DO_D( VCVTDQ2PS_256 );
+ DO_D( VCVTTPD2DQ_128 );
+ DO_D( VCVTTPD2DQ_256 );
+ DO_D( VCVTPD2DQ_128 );
+ DO_D( VCVTPD2DQ_256 );
+ DO_D( VMOVSLDUP_128 );
+ DO_D( VMOVSLDUP_256 );
+ DO_D( VMOVSHDUP_128 );
+ DO_D( VMOVSHDUP_256 );
+ DO_D( VPERMILPS_VAR_128 );
+ DO_D( VPERMILPD_VAR_128 );
+ DO_D( VPERMILPS_VAR_256 );
+ DO_D( VPERMILPD_VAR_256 );
+ DO_D( VPSLLW_128 );
+ DO_D( VPSRLW_128 );
+ DO_D( VPSRAW_128 );
+ DO_D( VPSLLD_128 );
+ DO_D( VPSRLD_128 );
+ DO_D( VPSRAD_128 );
+ DO_D( VPSLLQ_128 );
+ DO_D( VPSRLQ_128 );
+ DO_D( VROUNDPS_128_0x2 );
+ DO_D( VROUNDPS_256_0x2 );
+ DO_D( VROUNDPD_128_0x2 );
+ DO_D( VROUNDPD_256_0x2 );
+ DO_D( VROUNDSS_0x5 );
+ DO_D( VROUNDSD_0x3 );
+ DO_D( VPTEST_128_1 );
+ DO_D( VPTEST_128_2 );
+ DO_D( VPTEST_256_1 );
+ DO_D( VPTEST_256_2 );
+ DO_D( VTESTPS_128_1 );
+ DO_D( VTESTPS_128_2 );
+ DO_N( 10, VTESTPS_128_3 );
+ DO_D( VTESTPS_256_1 );
+ DO_D( VTESTPS_256_2 );
+ DO_N( 10, VTESTPS_256_3 );
+ DO_D( VTESTPD_128_1 );
+ DO_D( VTESTPD_128_2 );
+ DO_N( 10, VTESTPD_128_3 );
+ DO_D( VTESTPD_256_1 );
+ DO_D( VTESTPD_256_2 );
+ DO_N( 10, VTESTPD_256_3 );
return 0;
}
From: <sv...@va...> - 2012-06-24 09:10:51
sewardj 2012-06-24 10:10:38 +0100 (Sun, 24 Jun 2012)
New Revision: 12664
Log:
Update.
Modified files:
trunk/none/tests/amd64/avx-1.c
Modified: trunk/none/tests/amd64/avx-1.c (+285 -0)
===================================================================
--- trunk/none/tests/amd64/avx-1.c 2012-06-23 12:04:01 +01:00 (rev 12663)
+++ trunk/none/tests/amd64/avx-1.c 2012-06-24 10:10:38 +01:00 (rev 12664)
@@ -390,6 +390,10 @@
"vpsubusb %%xmm9, %%xmm8, %%xmm7",
"vpsubusb (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VPSUBSB_128,
+ "vpsubsb %%xmm9, %%xmm8, %%xmm7",
+ "vpsubsb (%%rax), %%xmm8, %%xmm7")
+
GEN_test_Ronly(VPSRLDQ_0x05_128,
"vpsrldq $0x5, %%xmm9, %%xmm7")
@@ -937,6 +941,10 @@
"vpsubusw %%xmm9, %%xmm8, %%xmm7",
"vpsubusw (%%rax), %%xmm8, %%xmm7")
+GEN_test_RandM(VPSUBSW_128,
+ "vpsubsw %%xmm9, %%xmm8, %%xmm7",
+ "vpsubsw (%%rax), %%xmm8, %%xmm7")
+
GEN_test_RandM(VPCMPEQW_128,
"vpcmpeqw %%xmm6, %%xmm8, %%xmm7",
"vpcmpeqw (%%rax), %%xmm8, %%xmm7")
@@ -1308,7 +1316,252 @@
"vpermilpd %%ymm6, %%ymm8, %%ymm7",
"vpermilpd (%%rax), %%ymm8, %%ymm7")
+GEN_test_RandM(VPSLLW_128,
+ "andl $15, %%r14d;"
+ "vmovd %%r14d, %%xmm6;"
+ "vpsllw %%xmm6, %%xmm8, %%xmm9",
+ "andq $15, 128(%%rax);"
+ "vpsllw 128(%%rax), %%xmm8, %%xmm9")
+GEN_test_RandM(VPSRLW_128,
+ "andl $15, %%r14d;"
+ "vmovd %%r14d, %%xmm6;"
+ "vpsrlw %%xmm6, %%xmm8, %%xmm9",
+ "andq $15, 128(%%rax);"
+ "vpsrlw 128(%%rax), %%xmm8, %%xmm9")
+
+GEN_test_RandM(VPSRAW_128,
+ "andl $31, %%r14d;"
+ "vmovd %%r14d, %%xmm6;"
+ "vpsraw %%xmm6, %%xmm8, %%xmm9",
+ "andq $15, 128(%%rax);"
+ "vpsraw 128(%%rax), %%xmm8, %%xmm9")
+
+GEN_test_RandM(VPSLLD_128,
+ "andl $31, %%r14d;"
+ "vmovd %%r14d, %%xmm6;"
+ "vpslld %%xmm6, %%xmm8, %%xmm9",
+ "andq $31, 128(%%rax);"
+ "vpslld 128(%%rax), %%xmm8, %%xmm9")
+
+GEN_test_RandM(VPSRLD_128,
+ "andl $31, %%r14d;"
+ "vmovd %%r14d, %%xmm6;"
+ "vpsrld %%xmm6, %%xmm8, %%xmm9",
+ "andq $31, 128(%%rax);"
+ "vpsrld 128(%%rax), %%xmm8, %%xmm9")
+
+GEN_test_RandM(VPSRAD_128,
+ "andl $31, %%r14d;"
+ "vmovd %%r14d, %%xmm6;"
+ "vpsrad %%xmm6, %%xmm8, %%xmm9",
+ "andq $31, 128(%%rax);"
+ "vpsrad 128(%%rax), %%xmm8, %%xmm9")
+
+GEN_test_RandM(VPSLLQ_128,
+ "andl $63, %%r14d;"
+ "vmovd %%r14d, %%xmm6;"
+ "vpsllq %%xmm6, %%xmm8, %%xmm9",
+ "andq $63, 128(%%rax);"
+ "vpsllq 128(%%rax), %%xmm8, %%xmm9")
+
+GEN_test_RandM(VPSRLQ_128,
+ "andl $63, %%r14d;"
+ "vmovd %%r14d, %%xmm6;"
+ "vpsrlq %%xmm6, %%xmm8, %%xmm9",
+ "andq $63, 128(%%rax);"
+ "vpsrlq 128(%%rax), %%xmm8, %%xmm9")
+
+GEN_test_RandM(VROUNDPS_128_0x2,
+ "vroundps $0x2, %%xmm8, %%xmm9",
+ "vroundps $0x2, (%%rax), %%xmm9")
+
+GEN_test_RandM(VROUNDPS_256_0x2,
+ "vroundps $0x2, %%ymm8, %%ymm9",
+ "vroundps $0x2, (%%rax), %%ymm9")
+
+GEN_test_RandM(VROUNDPD_128_0x2,
+ "vroundpd $0x2, %%xmm8, %%xmm9",
+ "vroundpd $0x2, (%%rax), %%xmm9")
+
+GEN_test_RandM(VROUNDPD_256_0x2,
+ "vroundpd $0x2, %%ymm8, %%ymm9",
+ "vroundpd $0x2, (%%rax), %%ymm9")
+
+GEN_test_RandM(VROUNDSS_0x5,
+ "vroundss $0x5, %%xmm8, %%xmm6, %%xmm9",
+ "vroundss $0x5, (%%rax), %%xmm6, %%xmm9")
+
+GEN_test_RandM(VROUNDSD_0x3,
+ "vroundsd $0x3, %%xmm8, %%xmm6, %%xmm9",
+ "vroundsd $0x3, (%%rax), %%xmm6, %%xmm9")
+
+GEN_test_RandM(VPTEST_128_1,
+ "vptest %%xmm6, %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vptest (%%rax), %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+/* Here we ignore the boilerplate-supplied data and try to do
+ x AND x and x AND NOT x. Not a great test but better
+ than nothing. */
+GEN_test_RandM(VPTEST_128_2,
+ "vmovups %%xmm6, %%xmm8;"
+ "vptest %%xmm6, %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vmovups (%%rax), %%xmm8;"
+ "vcmpeqpd %%xmm8,%%xmm8,%%xmm7;"
+ "vxorpd %%xmm8,%%xmm7,%%xmm8;"
+ "vptest (%%rax), %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+GEN_test_RandM(VPTEST_256_1,
+ "vptest %%ymm6, %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vptest (%%rax), %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+/* Here we ignore the boilerplate-supplied data and try to do
+ x AND x and x AND NOT x. Not a great test but better
+ than nothing. */
+GEN_test_RandM(VPTEST_256_2,
+ "vmovups %%ymm6, %%ymm8;"
+ "vptest %%ymm6, %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vmovups (%%rax), %%ymm8;"
+ "vcmpeqpd %%xmm8,%%xmm8,%%xmm7;"
+ "subq $1024, %%rsp;"
+ "vmovups %%xmm7,512(%%rsp);"
+ "vmovups %%xmm7,528(%%rsp);"
+ "vmovups 512(%%rsp), %%ymm7;"
+ "addq $1024, %%rsp;"
+ "vxorpd %%ymm8,%%ymm7,%%ymm8;"
+ "vptest (%%rax), %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+
+/* VTESTPS/VTESTPD: test once with all-0 operands, once with
+ one all-0s and one all 1s, and once with random data. */
+
+GEN_test_RandM(VTESTPS_128_1,
+ "vtestps %%xmm6, %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vtestps (%%rax), %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+/* Here we ignore the boilerplate-supplied data and try to do
+ x AND x and x AND NOT x. Not a great test but better
+ than nothing. */
+GEN_test_RandM(VTESTPS_128_2,
+ "vmovups %%xmm6, %%xmm8;"
+ "vtestps %%xmm6, %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vmovups (%%rax), %%xmm8;"
+ "vcmpeqpd %%xmm8,%%xmm8,%%xmm7;"
+ "vxorpd %%xmm8,%%xmm7,%%xmm8;"
+ "vtestps (%%rax), %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+GEN_test_RandM(VTESTPS_128_3,
+ "vtestps %%xmm8, %%xmm9; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vtestps (%%rax), %%xmm9; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+
+
+
+GEN_test_RandM(VTESTPS_256_1,
+ "vtestps %%ymm6, %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vtestps (%%rax), %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+/* Here we ignore the boilerplate-supplied data and try to do
+ x AND x and x AND NOT x. Not a great test but better
+ than nothing. */
+GEN_test_RandM(VTESTPS_256_2,
+ "vmovups %%ymm6, %%ymm8;"
+ "vtestps %%ymm6, %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vmovups (%%rax), %%ymm8;"
+ "vcmpeqpd %%xmm8,%%xmm8,%%xmm7;"
+ "subq $1024, %%rsp;"
+ "vmovups %%xmm7,512(%%rsp);"
+ "vmovups %%xmm7,528(%%rsp);"
+ "vmovups 512(%%rsp), %%ymm7;"
+ "addq $1024, %%rsp;"
+ "vxorpd %%ymm8,%%ymm7,%%ymm8;"
+ "vtestps (%%rax), %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+GEN_test_RandM(VTESTPS_256_3,
+ "vtestps %%ymm8, %%ymm9; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vtestps (%%rax), %%ymm9; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+
+
+GEN_test_RandM(VTESTPD_128_1,
+ "vtestpd %%xmm6, %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vtestpd (%%rax), %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+/* Here we ignore the boilerplate-supplied data and try to do
+ x AND x and x AND NOT x. Not a great test but better
+ than nothing. */
+GEN_test_RandM(VTESTPD_128_2,
+ "vmovups %%xmm6, %%xmm8;"
+ "vtestpd %%xmm6, %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vmovups (%%rax), %%xmm8;"
+ "vcmpeqpd %%xmm8,%%xmm8,%%xmm7;"
+ "vxorpd %%xmm8,%%xmm7,%%xmm8;"
+ "vtestpd (%%rax), %%xmm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+GEN_test_RandM(VTESTPD_128_3,
+ "vtestpd %%xmm8, %%xmm9; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vtestpd (%%rax), %%xmm9; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+
+
+
+GEN_test_RandM(VTESTPD_256_1,
+ "vtestpd %%ymm6, %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vtestpd (%%rax), %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+/* Here we ignore the boilerplate-supplied data and try to do
+ x AND x and x AND NOT x. Not a great test but better
+ than nothing. */
+GEN_test_RandM(VTESTPD_256_2,
+ "vmovups %%ymm6, %%ymm8;"
+ "vtestpd %%ymm6, %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vmovups (%%rax), %%ymm8;"
+ "vcmpeqpd %%xmm8,%%xmm8,%%xmm7;"
+ "subq $1024, %%rsp;"
+ "vmovups %%xmm7,512(%%rsp);"
+ "vmovups %%xmm7,528(%%rsp);"
+ "vmovups 512(%%rsp), %%ymm7;"
+ "addq $1024, %%rsp;"
+ "vxorpd %%ymm8,%%ymm7,%%ymm8;"
+ "vtestpd (%%rax), %%ymm8; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+GEN_test_RandM(VTESTPD_256_3,
+ "vtestpd %%ymm8, %%ymm9; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14",
+ "vtestpd (%%rax), %%ymm9; "
+ "pushfq; popq %%r14; andq $0x8D5, %%r14")
+
+
/* Comment duplicated above, for convenient reference:
Allowed operands in test insns:
Reg form: %ymm6, %ymm7, %ymm8, %ymm9 and %r14.
@@ -1394,6 +1647,7 @@
test_VPSLLDQ_0x05_128();
test_VPSRLDQ_0x05_128();
test_VPSUBUSB_128();
+ test_VPSUBSB_128();
test_VPSLLD_0x05_128();
test_VPSRLD_0x05_128();
test_VPSRAD_0x05_128();
@@ -1535,6 +1789,7 @@
test_VANDPD_256();
test_VPMOVSXBW_128();
test_VPSUBUSW_128();
+ test_VPSUBSW_128();
test_VPCMPEQW_128();
test_VPADDB_128();
test_VMOVAPS_EtoG_256();
@@ -1639,5 +1894,35 @@
test_VPERMILPD_VAR_128();
test_VPERMILPS_VAR_256();
test_VPERMILPD_VAR_256();
+ test_VPSLLW_128();
+ test_VPSRLW_128();
+ test_VPSRAW_128();
+ test_VPSLLD_128();
+ test_VPSRLD_128();
+ test_VPSRAD_128();
+ test_VPSLLQ_128();
+ test_VPSRLQ_128();
+ test_VROUNDPS_128_0x2();
+ test_VROUNDPS_256_0x2();
+ test_VROUNDPD_128_0x2();
+ test_VROUNDPD_256_0x2();
+ test_VROUNDSS_0x5();
+ test_VROUNDSD_0x3();
+ test_VPTEST_128_1();
+ test_VPTEST_128_2();
+ test_VPTEST_256_1();
+ test_VPTEST_256_2();
+ test_VTESTPS_128_1();
+ test_VTESTPS_128_2();
+ test_VTESTPS_128_3(); // 10x
+ test_VTESTPS_256_1();
+ test_VTESTPS_256_2();
+ test_VTESTPS_256_3(); // 10x
+ test_VTESTPD_128_1();
+ test_VTESTPD_128_2();
+ test_VTESTPD_128_3(); // 10x
+ test_VTESTPD_256_1();
+ test_VTESTPD_256_2();
+ test_VTESTPD_256_3(); // 10x
return 0;
}
From: Rich C. <rc...@wi...> - 2012-06-24 04:33:51
valgrind revision: 12663
VEX revision: 2402
C compiler: i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5646)
Assembler:
C library: unknown
uname -mrs: Darwin 10.8.0 i386
Vendor version: unknown

Nightly build on macbook ( Darwin 10.8.0 i386 )
Started at 2012-06-23 23:05:01 CDT
Ended at 2012-06-23 23:33:22 CDT
Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 493 tests, 478 stderr failures, 130 stdout failures, 3 stderrB failures, 3 stdoutB failures, 32 post failures ==

gdbserver_tests/mchelp (stdoutB) gdbserver_tests/mchelp (stderrB) gdbserver_tests/mcinvokeRU (stdoutB) gdbserver_tests/mcinvokeRU (stderrB) gdbserver_tests/mcinvokeWS (stdoutB) gdbserver_tests/mcinvokeWS (stderrB) gdbserver_tests/nlfork_chain (stdout) gdbserver_tests/nlfork_chain (stderr) memcheck/tests/accounting (stderr) memcheck/tests/addressable (stdout) memcheck/tests/addressable (stderr) memcheck/tests/atomic_incs (stdout) memcheck/tests/atomic_incs (stderr) memcheck/tests/badaddrvalue (stdout) memcheck/tests/badaddrvalue (stderr) memcheck/tests/badfree-2trace (stderr) memcheck/tests/badfree (stderr) memcheck/tests/badfree3 (stderr) memcheck/tests/badjump (stderr) memcheck/tests/badjump2 (stderr) memcheck/tests/badloop (stderr) memcheck/tests/badpoll (stderr) memcheck/tests/badrw (stderr) memcheck/tests/big_blocks_freed_list (stderr) memcheck/tests/brk2 (stderr) memcheck/tests/buflen_check (stderr) memcheck/tests/bug287260 (stderr) memcheck/tests/calloc-overflow (stderr) memcheck/tests/clientperm (stdout) memcheck/tests/clientperm (stderr) memcheck/tests/clireq_nofill (stdout) memcheck/tests/clireq_nofill (stderr) memcheck/tests/custom-overlap (stderr) memcheck/tests/custom_alloc (stderr) memcheck/tests/darwin/aio (stderr) memcheck/tests/darwin/env (stderr) memcheck/tests/darwin/pth-supp (stderr) memcheck/tests/darwin/scalar
(stderr) memcheck/tests/darwin/scalar_fork (stderr) memcheck/tests/darwin/scalar_nocancel (stderr) memcheck/tests/darwin/scalar_vfork (stderr) memcheck/tests/deep_templates (stdout) memcheck/tests/deep_templates (stderr) memcheck/tests/describe-block (stderr) memcheck/tests/doublefree (stderr) memcheck/tests/err_disable1 (stderr) memcheck/tests/err_disable2 (stderr) memcheck/tests/err_disable3 (stderr) memcheck/tests/err_disable4 (stderr) memcheck/tests/erringfds (stdout) memcheck/tests/erringfds (stderr) memcheck/tests/error_counts (stderr) memcheck/tests/errs1 (stderr) memcheck/tests/execve1 (stderr) memcheck/tests/execve2 (stderr) memcheck/tests/exitprog (stderr) memcheck/tests/file_locking (stderr) memcheck/tests/fprw (stderr) memcheck/tests/fwrite (stderr) memcheck/tests/holey_buffer_too_small (stderr) memcheck/tests/inits (stderr) memcheck/tests/inline (stdout) memcheck/tests/inline (stderr) memcheck/tests/leak-0 (stderr) memcheck/tests/leak-cases-full (stderr) memcheck/tests/leak-cases-possible (stderr) memcheck/tests/leak-cases-summary (stderr) memcheck/tests/leak-cycle (stderr) memcheck/tests/leak-delta (stderr) memcheck/tests/leak-pool-0 (stderr) memcheck/tests/leak-pool-1 (stderr) memcheck/tests/leak-pool-2 (stderr) memcheck/tests/leak-pool-3 (stderr) memcheck/tests/leak-pool-4 (stderr) memcheck/tests/leak-pool-5 (stderr) memcheck/tests/leak-tree (stderr) memcheck/tests/long-supps (stderr) memcheck/tests/long_namespace_xml (stdout) memcheck/tests/long_namespace_xml (stderr) memcheck/tests/mallinfo (stderr) memcheck/tests/malloc1 (stderr) memcheck/tests/malloc2 (stderr) memcheck/tests/malloc3 (stdout) memcheck/tests/malloc3 (stderr) memcheck/tests/malloc_free_fill (stderr) memcheck/tests/malloc_usable (stderr) memcheck/tests/manuel1 (stdout) memcheck/tests/manuel1 (stderr) memcheck/tests/manuel2 (stdout) memcheck/tests/manuel2 (stderr) memcheck/tests/manuel3 (stderr) memcheck/tests/match-overrun (stderr) memcheck/tests/memalign2 (stderr) 
memcheck/tests/memalign_test (stderr) memcheck/tests/memcmptest (stdout) memcheck/tests/memcmptest (stderr) memcheck/tests/mempool (stderr) memcheck/tests/mempool2 (stderr) memcheck/tests/metadata (stdout) memcheck/tests/metadata (stderr) memcheck/tests/mismatches (stderr) memcheck/tests/mmaptest (stderr) memcheck/tests/nanoleak2 (stderr) memcheck/tests/nanoleak_supp (stderr) memcheck/tests/new_nothrow (stderr) memcheck/tests/new_override (stdout) memcheck/tests/new_override (stderr) memcheck/tests/noisy_child (stderr) memcheck/tests/null_socket (stderr) memcheck/tests/origin1-yes (stderr) memcheck/tests/origin2-not-quite (stderr) memcheck/tests/origin3-no (stderr) memcheck/tests/origin4-many (stderr) memcheck/tests/origin5-bz2 (stdout) memcheck/tests/origin5-bz2 (stderr) memcheck/tests/origin6-fp (stderr) memcheck/tests/overlap (stdout) memcheck/tests/overlap (stderr) memcheck/tests/partial_load_dflt (stderr) memcheck/tests/partial_load_ok (stderr) memcheck/tests/partiallydefinedeq (stdout) memcheck/tests/partiallydefinedeq (stderr) memcheck/tests/pdb-realloc (stderr) memcheck/tests/pdb-realloc2 (stdout) memcheck/tests/pdb-realloc2 (stderr) memcheck/tests/pipe (stderr) memcheck/tests/pointer-trace (stderr) memcheck/tests/post-syscall (stderr) memcheck/tests/realloc1 (stderr) memcheck/tests/realloc2 (stderr) memcheck/tests/realloc3 (stderr) memcheck/tests/sbfragment (stdout) memcheck/tests/sbfragment (stderr) memcheck/tests/sh-mem-random (stdout) memcheck/tests/sh-mem-random (stderr) memcheck/tests/sh-mem (stderr) memcheck/tests/sigaltstack (stderr) memcheck/tests/sigkill (stderr) memcheck/tests/signal2 (stdout) memcheck/tests/signal2 (stderr) memcheck/tests/sigprocmask (stderr) memcheck/tests/static_malloc (stderr) memcheck/tests/str_tester (stderr) memcheck/tests/strchr (stderr) memcheck/tests/supp1 (stderr) memcheck/tests/supp2 (stderr) memcheck/tests/supp_unknown (stderr) memcheck/tests/suppfree (stderr) memcheck/tests/test-plo-no (stderr) 
memcheck/tests/test-plo-yes (stderr) memcheck/tests/trivialleak (stderr) memcheck/tests/unit_libcbase (stderr) memcheck/tests/unit_oset (stdout) memcheck/tests/unit_oset (stderr) memcheck/tests/varinfo1 (stderr) memcheck/tests/varinfo2 (stderr) memcheck/tests/varinfo3 (stderr) memcheck/tests/varinfo4 (stdout) memcheck/tests/varinfo4 (stderr) memcheck/tests/varinfo5 (stderr) memcheck/tests/varinfo6 (stdout) memcheck/tests/varinfo6 (stderr) memcheck/tests/vcpu_bz2 (stdout) memcheck/tests/vcpu_bz2 (stderr) memcheck/tests/vcpu_fbench (stdout) memcheck/tests/vcpu_fbench (stderr) memcheck/tests/vcpu_fnfns (stdout) memcheck/tests/vcpu_fnfns (stderr) memcheck/tests/wrap1 (stdout) memcheck/tests/wrap1 (stderr) memcheck/tests/wrap2 (stdout) memcheck/tests/wrap2 (stderr) memcheck/tests/wrap3 (stdout) memcheck/tests/wrap3 (stderr) memcheck/tests/wrap4 (stdout) memcheck/tests/wrap4 (stderr) memcheck/tests/wrap5 (stdout) memcheck/tests/wrap5 (stderr) memcheck/tests/wrap6 (stdout) memcheck/tests/wrap6 (stderr) memcheck/tests/wrap7 (stdout) memcheck/tests/wrap7 (stderr) memcheck/tests/wrap8 (stdout) memcheck/tests/wrap8 (stderr) memcheck/tests/writev1 (stderr) memcheck/tests/x86/bug152022 (stderr) memcheck/tests/x86/espindola2 (stderr) memcheck/tests/x86/fpeflags (stderr) memcheck/tests/x86/fprem (stdout) memcheck/tests/x86/fprem (stderr) memcheck/tests/x86/fxsave (stdout) memcheck/tests/x86/fxsave (stderr) memcheck/tests/x86/insn_basic (stdout) memcheck/tests/x86/insn_basic (stderr) memcheck/tests/x86/insn_cmov (stdout) memcheck/tests/x86/insn_cmov (stderr) memcheck/tests/x86/insn_fpu (stdout) memcheck/tests/x86/insn_fpu (stderr) memcheck/tests/x86/insn_mmx (stdout) memcheck/tests/x86/insn_mmx (stderr) memcheck/tests/x86/insn_sse (stdout) memcheck/tests/x86/insn_sse (stderr) memcheck/tests/x86/insn_sse2 (stdout) memcheck/tests/x86/insn_sse2 (stderr) memcheck/tests/x86/more_x86_fp (stdout) memcheck/tests/x86/more_x86_fp (stderr) memcheck/tests/x86/pushfpopf (stdout) 
memcheck/tests/x86/pushfpopf (stderr) memcheck/tests/x86/pushfw_x86 (stdout) memcheck/tests/x86/pushfw_x86 (stderr) memcheck/tests/x86/pushpopmem (stdout) memcheck/tests/x86/pushpopmem (stderr) memcheck/tests/x86/sse1_memory (stdout) memcheck/tests/x86/sse1_memory (stderr) memcheck/tests/x86/sse2_memory (stdout) memcheck/tests/x86/sse2_memory (stderr) memcheck/tests/x86/tronical (stderr) memcheck/tests/x86/xor-undef-x86 (stdout) memcheck/tests/x86/xor-undef-x86 (stderr) memcheck/tests/xml1 (stdout) memcheck/tests/xml1 (stderr) cachegrind/tests/chdir (stderr) cachegrind/tests/clreq (stderr) cachegrind/tests/dlclose (stdout) cachegrind/tests/dlclose (stderr) cachegrind/tests/notpower2 (stderr) cachegrind/tests/wrap5 (stdout) cachegrind/tests/wrap5 (stderr) cachegrind/tests/x86/fpu-28-108 (stderr) callgrind/tests/clreq (stderr) callgrind/tests/notpower2-hwpref (stderr) callgrind/tests/notpower2-use (stderr) callgrind/tests/notpower2-wb (stderr) callgrind/tests/notpower2 (stderr) callgrind/tests/simwork-both (stdout) callgrind/tests/simwork-both (stderr) callgrind/tests/simwork-branch (stdout) callgrind/tests/simwork-branch (stderr) callgrind/tests/simwork-cache (stdout) callgrind/tests/simwork-cache (stderr) callgrind/tests/simwork1 (stdout) callgrind/tests/simwork1 (stderr) callgrind/tests/simwork2 (stdout) callgrind/tests/simwork2 (stderr) callgrind/tests/simwork3 (stdout) callgrind/tests/simwork3 (stderr) callgrind/tests/threads-use (stderr) callgrind/tests/threads (stderr) massif/tests/alloc-fns-A (stderr) massif/tests/alloc-fns-A (post) massif/tests/alloc-fns-B (stderr) massif/tests/alloc-fns-B (post) massif/tests/basic (stderr) massif/tests/basic (post) massif/tests/basic2 (stderr) massif/tests/basic2 (post) massif/tests/big-alloc (stderr) massif/tests/big-alloc (post) massif/tests/culling1 (stderr) massif/tests/culling2 (stderr) massif/tests/custom_alloc (stderr) massif/tests/custom_alloc (post) massif/tests/deep-A (stderr) massif/tests/deep-A (post) 
massif/tests/deep-B (stderr) massif/tests/deep-B (post) massif/tests/deep-C (stderr) massif/tests/deep-C (post) massif/tests/deep-D (stderr) massif/tests/deep-D (post) massif/tests/ignored (stderr) massif/tests/ignored (post) massif/tests/ignoring (stderr) massif/tests/ignoring (post) massif/tests/insig (stderr) massif/tests/insig (post) massif/tests/long-names (stderr) massif/tests/long-names (post) massif/tests/long-time (stderr) massif/tests/long-time (post) massif/tests/malloc_usable (stderr) massif/tests/new-cpp (stderr) massif/tests/new-cpp (post) massif/tests/no-stack-no-heap (stderr) massif/tests/no-stack-no-heap (post) massif/tests/null (stderr) massif/tests/null (post) massif/tests/one (stderr) massif/tests/one (post) massif/tests/overloaded-new (stderr) massif/tests/overloaded-new (post) massif/tests/pages_as_heap (stderr) massif/tests/peak (stderr) massif/tests/peak (post) massif/tests/peak2 (stderr) massif/tests/peak2 (post) massif/tests/realloc (stderr) massif/tests/realloc (post) massif/tests/thresholds_0_0 (stderr) massif/tests/thresholds_0_0 (post) massif/tests/thresholds_0_10 (stderr) massif/tests/thresholds_0_10 (post) massif/tests/thresholds_10_0 (stderr) massif/tests/thresholds_10_0 (post) massif/tests/thresholds_10_10 (stderr) massif/tests/thresholds_10_10 (post) massif/tests/thresholds_5_0 (stderr) massif/tests/thresholds_5_0 (post) massif/tests/thresholds_5_10 (stderr) massif/tests/thresholds_5_10 (post) massif/tests/zero1 (stderr) massif/tests/zero1 (post) massif/tests/zero2 (stderr) massif/tests/zero2 (post) lackey/tests/true (stderr) none/tests/allexec32 (stdout) none/tests/allexec32 (stderr) none/tests/allexec64 (stdout) none/tests/allexec64 (stderr) none/tests/ansi (stderr) none/tests/args (stdout) none/tests/args (stderr) none/tests/async-sigs (stderr) none/tests/bitfield1 (stderr) none/tests/bug129866 (stdout) none/tests/bug129866 (stderr) none/tests/closeall (stderr) none/tests/cmd-with-special (stderr) none/tests/cmdline5 (stderr) 
none/tests/coolo_sigaction (stdout) none/tests/coolo_sigaction (stderr) none/tests/coolo_strlen (stderr) none/tests/darwin/access_extended (stderr) none/tests/darwin/apple-main-arg (stderr) none/tests/darwin/rlimit (stderr) none/tests/discard (stdout) none/tests/discard (stderr) none/tests/empty-exe (stderr) none/tests/exec-sigmask (stderr) none/tests/execve (stderr) none/tests/faultstatus (stderr) none/tests/fcntl_setown (stderr) none/tests/fdleak_cmsg (stderr) none/tests/fdleak_creat (stderr) none/tests/fdleak_dup (stderr) none/tests/fdleak_dup2 (stderr) none/tests/fdleak_fcntl (stderr) none/tests/fdleak_ipv4 (stdout) none/tests/fdleak_ipv4 (stderr) none/tests/fdleak_open (stderr) none/tests/fdleak_pipe (stderr) none/tests/fdleak_socketpair (stderr) none/tests/floored (stdout) none/tests/floored (stderr) none/tests/fork (stdout) none/tests/fork (stderr) none/tests/fucomip (stderr) none/tests/gxx304 (stderr) none/tests/manythreads (stdout) none/tests/manythreads (stderr) none/tests/map_unaligned (stderr) none/tests/map_unmap (stdout) none/tests/map_unmap (stderr) none/tests/mmap_fcntl_bug (stderr) none/tests/mq (stderr) none/tests/munmap_exe (stderr) none/tests/nestedfns (stdout) none/tests/nestedfns (stderr) none/tests/nodir (stderr) none/tests/pending (stdout) none/tests/pending (stderr) none/tests/process_vm_readv_writev (stderr) none/tests/procfs-non-linux (stderr) none/tests/pth_atfork1 (stdout) none/tests/pth_atfork1 (stderr) none/tests/pth_blockedsig (stdout) none/tests/pth_blockedsig (stderr) none/tests/pth_cancel1 (stdout) none/tests/pth_cancel1 (stderr) none/tests/pth_cancel2 (stderr) none/tests/pth_cvsimple (stdout) none/tests/pth_cvsimple (stderr) none/tests/pth_empty (stderr) none/tests/pth_exit (stderr) none/tests/pth_exit2 (stderr) none/tests/pth_mutexspeed (stdout) none/tests/pth_mutexspeed (stderr) none/tests/pth_once (stdout) none/tests/pth_once (stderr) none/tests/pth_rwlock (stderr) none/tests/pth_stackalign (stdout) none/tests/pth_stackalign 
(stderr) none/tests/rcrl (stdout) none/tests/rcrl (stderr) none/tests/readline1 (stdout) none/tests/readline1 (stderr) none/tests/require-text-symbol-1 (stderr) none/tests/require-text-symbol-2 (stderr) none/tests/res_search (stdout) none/tests/res_search (stderr) none/tests/resolv (stdout) none/tests/resolv (stderr) none/tests/rlimit64_nofile (stderr) none/tests/rlimit_nofile (stderr) none/tests/sem (stderr) none/tests/semlimit (stderr) none/tests/sha1_test (stderr) none/tests/shell (stdout) none/tests/shell (stderr) none/tests/shell_nosuchfile (stderr) none/tests/shell_valid1 (stderr) none/tests/shell_valid2 (stderr) none/tests/shell_valid3 (stderr) none/tests/shell_zerolength (stderr) none/tests/shortpush (stderr) none/tests/shorts (stderr) none/tests/sigstackgrowth (stdout) none/tests/sigstackgrowth (stderr) none/tests/stackgrowth (stdout) none/tests/stackgrowth (stderr) none/tests/syscall-restart1 (stderr) none/tests/syscall-restart2 (stderr) none/tests/syslog (stderr) none/tests/system (stderr) none/tests/thread-exits (stdout) none/tests/thread-exits (stderr) none/tests/threaded-fork (stdout) none/tests/threaded-fork (stderr) none/tests/threadederrno (stdout) none/tests/threadederrno (stderr) none/tests/timestamp (stderr) none/tests/vgprintf (stderr) none/tests/x86/aad_aam (stdout) none/tests/x86/aad_aam (stderr) none/tests/x86/badseg (stdout) none/tests/x86/badseg (stderr) none/tests/x86/bt_everything (stdout) none/tests/x86/bt_everything (stderr) none/tests/x86/bt_literal (stdout) none/tests/x86/bt_literal (stderr) none/tests/x86/bug125959-x86 (stdout) none/tests/x86/bug125959-x86 (stderr) none/tests/x86/bug126147-x86 (stdout) none/tests/x86/bug126147-x86 (stderr) none/tests/x86/bug132813-x86 (stdout) none/tests/x86/bug132813-x86 (stderr) none/tests/x86/bug135421-x86 (stdout) none/tests/x86/bug135421-x86 (stderr) none/tests/x86/bug137714-x86 (stdout) none/tests/x86/bug137714-x86 (stderr) none/tests/x86/bug152818-x86 (stdout) none/tests/x86/bug152818-x86 
(stderr) none/tests/x86/cmpxchg8b (stdout) none/tests/x86/cmpxchg8b (stderr) none/tests/x86/cpuid (stdout) none/tests/x86/cpuid (stderr) none/tests/x86/cse_fail (stdout) none/tests/x86/fcmovnu (stdout) none/tests/x86/fcmovnu (stderr) none/tests/x86/fpu_lazy_eflags (stdout) none/tests/x86/fpu_lazy_eflags (stderr) none/tests/x86/fxtract (stdout) none/tests/x86/fxtract (stderr) none/tests/x86/getseg (stdout) none/tests/x86/getseg (stderr) none/tests/x86/incdec_alt (stdout) none/tests/x86/incdec_alt (stderr) none/tests/x86/insn_basic (stdout) none/tests/x86/insn_basic (stderr) none/tests/x86/insn_cmov (stdout) none/tests/x86/insn_cmov (stderr) none/tests/x86/insn_fpu (stdout) none/tests/x86/insn_fpu (stderr) none/tests/x86/insn_mmx (stdout) none/tests/x86/insn_mmx (stderr) none/tests/x86/insn_sse (stdout) none/tests/x86/insn_sse (stderr) none/tests/x86/insn_sse2 (stdout) none/tests/x86/insn_sse2 (stderr) none/tests/x86/insn_sse3 (stdout) none/tests/x86/insn_sse3 (stderr) none/tests/x86/jcxz (stdout) none/tests/x86/jcxz (stderr) none/tests/x86/lahf (stdout) none/tests/x86/lahf (stderr) none/tests/x86/looper (stdout) none/tests/x86/looper (stderr) none/tests/x86/movx (stdout) none/tests/x86/movx (stderr) none/tests/x86/pushpopseg (stdout) none/tests/x86/pushpopseg (stderr) none/tests/x86/sbbmisc (stdout) none/tests/x86/sbbmisc (stderr) none/tests/x86/shift_ndep (stdout) none/tests/x86/shift_ndep (stderr) none/tests/x86/smc1 (stdout) none/tests/x86/smc1 (stderr) none/tests/x86/x86locked (stdout) none/tests/x86/x86locked (stderr) none/tests/x86/xadd (stdout) none/tests/x86/xadd (stderr) helgrind/tests/annotate_hbefore (stderr) helgrind/tests/annotate_rwlock (stderr) helgrind/tests/annotate_smart_pointer (stderr) helgrind/tests/cond_timedwait_invalid (stderr) helgrind/tests/free_is_write (stderr) helgrind/tests/hg01_all_ok (stderr) helgrind/tests/hg02_deadlock (stderr) helgrind/tests/hg03_inherit (stderr) helgrind/tests/hg04_race (stderr) helgrind/tests/hg05_race2 (stderr) 
helgrind/tests/hg06_readshared (stderr) helgrind/tests/locked_vs_unlocked1_fwd (stderr) helgrind/tests/locked_vs_unlocked1_rev (stderr) helgrind/tests/locked_vs_unlocked2 (stderr) helgrind/tests/locked_vs_unlocked3 (stderr) helgrind/tests/rwlock_race (stderr) helgrind/tests/rwlock_test (stderr) helgrind/tests/t2t_laog (stderr) helgrind/tests/tc01_simple_race (stderr) helgrind/tests/tc02_simple_tls (stderr) helgrind/tests/tc03_re_excl (stderr) helgrind/tests/tc04_free_lock (stderr) helgrind/tests/tc05_simple_race (stderr) helgrind/tests/tc06_two_races (stderr) helgrind/tests/tc06_two_races_xml (stderr) helgrind/tests/tc07_hbl1 (stderr) helgrind/tests/tc08_hbl2 (stderr) helgrind/tests/tc09_bad_unlock (stderr) helgrind/tests/tc10_rec_lock (stderr) helgrind/tests/tc11_XCHG (stderr) helgrind/tests/tc12_rwl_trivial (stderr) helgrind/tests/tc13_laog1 (stderr) helgrind/tests/tc14_laog_dinphils (stderr) helgrind/tests/tc15_laog_lockdel (stderr) helgrind/tests/tc16_byterace (stderr) helgrind/tests/tc17_sembar (stderr) helgrind/tests/tc18_semabuse (stderr) helgrind/tests/tc19_shadowmem (stderr) helgrind/tests/tc21_pthonce (stderr) helgrind/tests/tc23_bogus_condwait (stderr) helgrind/tests/tc24_nonzero_sem (stderr) drd/tests/annotate_barrier (stderr) drd/tests/annotate_barrier_xml (stderr) drd/tests/annotate_hb_err (stderr) drd/tests/annotate_hb_race (stderr) drd/tests/annotate_hbefore (stderr) drd/tests/annotate_ignore_read (stderr) drd/tests/annotate_ignore_rw (stderr) drd/tests/annotate_ignore_rw2 (stderr) drd/tests/annotate_ignore_write (stderr) drd/tests/annotate_ignore_write2 (stderr) drd/tests/annotate_order_1 (stderr) drd/tests/annotate_order_2 (stderr) drd/tests/annotate_order_3 (stderr) drd/tests/annotate_publish_hg (stderr) drd/tests/annotate_rwlock (stderr) drd/tests/annotate_rwlock_hg (stderr) drd/tests/annotate_smart_pointer (stderr) drd/tests/annotate_smart_pointer2 (stderr) drd/tests/annotate_spinlock (stderr) drd/tests/annotate_static (stderr) 
drd/tests/annotate_trace_memory (stderr) drd/tests/annotate_trace_memory_xml (stderr) drd/tests/atomic_var (stderr) drd/tests/bug-235681 (stderr) drd/tests/circular_buffer (stderr) drd/tests/custom_alloc (stderr) drd/tests/custom_alloc_fiw (stderr) drd/tests/fp_race (stderr) drd/tests/fp_race2 (stderr) drd/tests/fp_race_xml (stderr) drd/tests/free_is_write (stderr) drd/tests/free_is_write2 (stderr) drd/tests/hg01_all_ok (stderr) drd/tests/hg02_deadlock (stderr) drd/tests/hg03_inherit (stderr) drd/tests/hg04_race (stderr) drd/tests/hg05_race2 (stderr) drd/tests/hg06_readshared (stderr) drd/tests/hold_lock_1 (stderr) drd/tests/hold_lock_2 (stderr) drd/tests/linuxthreads_det (stderr) drd/tests/memory_allocation (stderr) drd/tests/monitor_example (stderr) drd/tests/new_delete (stderr) drd/tests/pth_broadcast (stderr) drd/tests/pth_cancel_locked (stderr) drd/tests/pth_cleanup_handler (stderr) drd/tests/pth_cond_race (stderr) drd/tests/pth_cond_race2 (stderr) drd/tests/pth_cond_race3 (stderr) drd/tests/pth_create_chain (stderr) drd/tests/pth_detached (stderr) drd/tests/pth_detached2 (stderr) drd/tests/pth_detached3 (stderr) drd/tests/pth_inconsistent_cond_wait (stderr) drd/tests/pth_mutex_reinit (stderr) drd/tests/pth_once (stderr) drd/tests/pth_process_shared_mutex (stderr) drd/tests/pth_uninitialized_cond (stderr) drd/tests/read_and_free_race (stderr) drd/tests/recursive_mutex (stderr) drd/tests/rwlock_race (stderr) drd/tests/rwlock_test (stderr) drd/tests/rwlock_type_checking (stderr) drd/tests/sem_open (stderr) drd/tests/sem_open2 (stderr) drd/tests/sem_open3 (stderr) drd/tests/sem_open_traced (stderr) drd/tests/sigalrm (stderr) drd/tests/sigaltstack (stderr) drd/tests/tc01_simple_race (stderr) drd/tests/tc02_simple_tls (stderr) drd/tests/tc03_re_excl (stderr) drd/tests/tc04_free_lock (stderr) drd/tests/tc05_simple_race (stderr) drd/tests/tc06_two_races (stderr) drd/tests/tc07_hbl1 (stdout) drd/tests/tc07_hbl1 (stderr) drd/tests/tc08_hbl2 (stdout) drd/tests/tc08_hbl2 
(stderr) drd/tests/tc09_bad_unlock (stderr) drd/tests/tc10_rec_lock (stderr) drd/tests/tc11_XCHG (stdout) drd/tests/tc11_XCHG (stderr) drd/tests/tc12_rwl_trivial (stderr) drd/tests/tc13_laog1 (stderr) drd/tests/tc15_laog_lockdel (stderr) drd/tests/tc16_byterace (stderr) drd/tests/tc17_sembar (stderr) drd/tests/tc19_shadowmem (stderr) drd/tests/tc21_pthonce (stdout) drd/tests/tc21_pthonce (stderr) drd/tests/tc23_bogus_condwait (stderr) drd/tests/thread_name (stderr) drd/tests/thread_name_xml (stderr) drd/tests/threaded-fork (stderr) drd/tests/trylock (stderr) drd/tests/unit_bitmap (stderr) drd/tests/unit_vc (stderr) exp-bbv/tests/x86/complex_rep (stderr) exp-bbv/tests/x86/fldcw_check (stderr) exp-bbv/tests/x86/million (stderr) exp-bbv/tests/x86/rep_prefix (stderr) ================================================= ./valgrind-new/cachegrind/tests/chdir.stderr.diff ================================================= --- chdir.stderr.exp 2012-06-23 23:18:32.000000000 -0500 +++ chdir.stderr.out 2012-06-23 23:30:34.000000000 -0500 @@ -1,17 +1,28 @@ -I refs: -I1 misses: -LLi misses: -I1 miss rate: -LLi miss rate: - -D refs: -D1 misses: -LLd misses: -D1 miss rate: -LLd miss rate: - -LL refs: -LL misses: -LL miss rate: +valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed. + at 0x3800D1C5: ??? + by 0x3800D388: ??? + by 0x38054647: ??? + by 0x380564D7: ??? + by 0x3807B938: ??? + +sched status: + running_tid=1 + +Thread 1: status = VgTs_Runnable + at 0x8FE01030: _dyld_start (in /usr/lib/dyld) + + +Note: see also the FAQ in the source distribution. +It contains workarounds to several common problems. +In particular, if Valgrind aborted or crashed after +identifying problems in your program, there's a good chance +that fixing those problems will prevent Valgrind aborting or +crashing, especially if it happened in m_mallocfree.c. 
+ +If that doesn't help, please report this bug to: www.valgrind.org + +In the bug report, send all the above text, the valgrind +version, and what OS and version you are using. Thanks. + ================================================= ./valgrind-new/cachegrind/tests/clreq.stderr.diff ================================================= --- clreq.stderr.exp 2012-06-23 23:18:32.000000000 -0500 +++ clreq.stderr.out 2012-06-23 23:30:34.000000000 -0500 @@ -0,0 +1,27 @@ + +valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed. + at 0x3800D1C5: ??? + by 0x3800D388: ??? + by 0x38054647: ??? + by 0x380564D7: ??? + by 0x3807B938: ??? + +sched status: + running_tid=1 + +Thread 1: status = VgTs_Runnable + at 0x8FE01030: _dyld_start (in /usr/lib/dyld) + + +Note: see also the FAQ in the source distribution. +It contains workarounds to several common problems. +In particular, if Valgrind aborted or crashed after +identifying problems in your program, there's a good chance +that fixing those problems will prevent Valgrind aborting or +crashing, especially if it happened in m_mallocfree.c. + +If that doesn't help, please report this bug to: www.valgrind.org + +In the bug report, send all the above text, the valgrind +version, and what OS and version you are using. Thanks. + ================================================= ./valgrind-new/cachegrind/tests/dlclose.stderr.diff ================================================= --- dlclose.stderr.exp 2012-06-23 23:18:32.000000000 -0500 +++ dlclose.stderr.out 2012-06-23 23:30:35.000000000 -0500 @@ -1,17 +1,28 @@ -I refs: -I1 misses: -LLi misses: -I1 miss rate: -LLi miss rate: - -D refs: -D1 misses: -LLd misses: -D1 miss rate: -LLd miss rate: - -LL refs: -LL misses: -LL miss rate: +valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed. + at 0x3800D1C5: ??? + by 0x3800D388: ??? + by 0x38054647: ??? + by 0x380564D7: ??? + by 0x3807B938: ??? 
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
+
=================================================
./valgrind-new/cachegrind/tests/dlclose.stdout.diff
=================================================
--- dlclose.stdout.exp 2012-06-23 23:18:32.000000000 -0500
+++ dlclose.stdout.out 2012-06-23 23:30:34.000000000 -0500
@@ -1 +0,0 @@
-This is myprint!
=================================================
./valgrind-new/cachegrind/tests/notpower2.stderr.diff
=================================================
--- notpower2.stderr.exp 2012-06-23 23:18:32.000000000 -0500
+++ notpower2.stderr.out 2012-06-23 23:30:35.000000000 -0500
@@ -1,17 +1,28 @@
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3800D1C5: ???
+ by 0x3800D388: ???
+ by 0x38054647: ???
+ by 0x380564D7: ???
+ by 0x3807B938: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
+
=================================================
./valgrind-new/cachegrind/tests/wrap5.stderr.diff
=================================================
--- wrap5.stderr.exp 2012-06-23 23:18:32.000000000 -0500
+++ wrap5.stderr.out 2012-06-23 23:30:35.000000000 -0500
@@ -1,17 +1,28 @@
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3800D1C5: ???
+ by 0x3800D388: ???
+ by 0x38054647: ???
+ by 0x380564D7: ???
+ by 0x3807B938: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
+
=================================================
./valgrind-new/cachegrind/tests/wrap5.stdout.diff
=================================================
--- wrap5.stdout.exp 2012-06-23 23:18:32.000000000 -0500
+++ wrap5.stdout.out 2012-06-23 23:30:35.000000000 -0500
@@ -1,37 +0,0 @@
-computing fact1(7)
-in wrapper1-pre: fact(7)
-in wrapper2-pre: fact(6)
-in wrapper1-pre: fact(5)
-in wrapper2-pre: fact(4)
-in wrapper1-pre: fact(3)
-in wrapper2-pre: fact(2)
-in wrapper1-pre: fact(1)
-in wrapper2-pre: fact(0)
-in wrapper2-post: fact(0) = 1
-in wrapper1-post: fact(1) = 1
-in wrapper2-post: fact(2) = 2
-in wrapper1-post: fact(3) = 6
-in wrapper2-pre: fact(2)
-in wrapper1-pre: fact(1)
-in wrapper2-pre: fact(0)
-in wrapper2-post: fact(0) = 1
-in wrapper1-post: fact(1) = 1
-in wrapper2-post: fact(2) = 2
-in wrapper2-post: fact(4) = 32
-in wrapper1-post: fact(5) = 160
-in wrapper2-pre: fact(2)
-in wrapper1-pre: fact(1)
-in wrapper2-pre: fact(0)
-in wrapper2-post: fact(0) = 1
-in wrapper1-post: fact(1) = 1
-in wrapper2-post: fact(2) = 2
-in wrapper2-post: fact(6) = 972
-in wrapper1-post: fact(7) = 6804
-in wrapper2-pre: fact(2)
-in wrapper1-pre: fact(1)
-in wrapper2-pre: fact(0)
-in wrapper2-post: fact(0) = 1
-in wrapper1-post: fact(1) = 1
-in wrapper2-post: fact(2) = 2
-fact1(7) = 6806
-allocated 51 Lards
=================================================
./valgrind-new/cachegrind/tests/x86/fpu-28-108.stderr.diff
=================================================
--- fpu-28-108.stderr.exp 2012-06-23 23:18:32.000000000 -0500
+++ fpu-28-108.stderr.out 2012-06-23 23:30:35.000000000 -0500
@@ -1,17 +1,28 @@
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3800D1C5: ???
+ by 0x3800D388: ???
+ by 0x38054647: ???
+ by 0x380564D7: ???
+ by 0x3807B938: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
+
=================================================
./valgrind-new/callgrind/tests/clreq.stderr.diff
=================================================
--- clreq.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ clreq.stderr.out 2012-06-23 23:30:35.000000000 -0500
@@ -1,6 +1,28 @@
-Events : Ir
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-I refs:
=================================================
./valgrind-new/callgrind/tests/notpower2-hwpref.stderr.diff
=================================================
--- notpower2-hwpref.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ notpower2-hwpref.stderr.out 2012-06-23 23:30:35.000000000 -0500
@@ -1,20 +1,28 @@
-Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
=================================================
./valgrind-new/callgrind/tests/notpower2-use.stderr.diff
=================================================
--- notpower2-use.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ notpower2-use.stderr.out 2012-06-23 23:30:36.000000000 -0500
@@ -1,20 +1,28 @@
-Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw AcCost1 SpLoss1 AcCost2 SpLoss2
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
=================================================
./valgrind-new/callgrind/tests/notpower2-wb.stderr.diff
=================================================
--- notpower2-wb.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ notpower2-wb.stderr.out 2012-06-23 23:30:36.000000000 -0500
@@ -1,20 +1,28 @@
-Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw ILdmr DLdmr DLdmw
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
=================================================
./valgrind-new/callgrind/tests/notpower2.stderr.diff
=================================================
--- notpower2.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ notpower2.stderr.out 2012-06-23 23:30:36.000000000 -0500
@@ -1,20 +1,28 @@
-Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
=================================================
./valgrind-new/callgrind/tests/simwork-both.stderr.diff
=================================================
--- simwork-both.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork-both.stderr.out 2012-06-23 23:30:36.000000000 -0500
@@ -1,24 +1,28 @@
-Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw Bc Bcm Bi Bim
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
-
-Branches:
-Mispredicts:
-Mispred rate:
=================================================
./valgrind-new/callgrind/tests/simwork-both.stdout.diff
=================================================
--- simwork-both.stdout.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork-both.stdout.out 2012-06-23 23:30:36.000000000 -0500
@@ -1 +0,0 @@
-Sum: 1000000
=================================================
./valgrind-new/callgrind/tests/simwork-branch.stderr.diff
=================================================
--- simwork-branch.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork-branch.stderr.out 2012-06-23 23:30:36.000000000 -0500
@@ -1,10 +1,28 @@
-Events : Ir Bc Bcm Bi Bim
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
-I refs:
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-Branches:
-Mispredicts:
-Mispred rate:
=================================================
./valgrind-new/callgrind/tests/simwork-branch.stdout.diff
=================================================
--- simwork-branch.stdout.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork-branch.stdout.out 2012-06-23 23:30:36.000000000 -0500
@@ -1 +0,0 @@
-Sum: 1000000
=================================================
./valgrind-new/callgrind/tests/simwork-cache.stderr.diff
=================================================
--- simwork-cache.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork-cache.stderr.out 2012-06-23 23:30:36.000000000 -0500
@@ -1,20 +1,28 @@
-Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
=================================================
./valgrind-new/callgrind/tests/simwork-cache.stdout.diff
=================================================
--- simwork-cache.stdout.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork-cache.stdout.out 2012-06-23 23:30:36.000000000 -0500
@@ -1 +0,0 @@
-Sum: 1000000
=================================================
./valgrind-new/callgrind/tests/simwork1.stderr.diff
=================================================
--- simwork1.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork1.stderr.out 2012-06-23 23:30:36.000000000 -0500
@@ -1,20 +1,28 @@
-Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
=================================================
./valgrind-new/callgrind/tests/simwork1.stdout.diff
=================================================
--- simwork1.stdout.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork1.stdout.out 2012-06-23 23:30:36.000000000 -0500
@@ -1 +0,0 @@
-Sum: 1000000
=================================================
./valgrind-new/callgrind/tests/simwork2.stderr.diff
=================================================
--- simwork2.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork2.stderr.out 2012-06-23 23:30:37.000000000 -0500
@@ -1,20 +1,28 @@
-Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw ILdmr DLdmr DLdmw
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
=================================================
./valgrind-new/callgrind/tests/simwork2.stdout.diff
=================================================
--- simwork2.stdout.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork2.stdout.out 2012-06-23 23:30:37.000000000 -0500
@@ -1 +0,0 @@
-Sum: 1000000
=================================================
./valgrind-new/callgrind/tests/simwork3.stderr.diff
=================================================
--- simwork3.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork3.stderr.out 2012-06-23 23:30:37.000000000 -0500
@@ -1,20 +1,28 @@
-Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw AcCost1 SpLoss1 AcCost2 SpLoss2
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happened in m_mallocfree.c.
+
+If that doesn't help, please report this bug to: www.valgrind.org
+
+In the bug report, send all the above text, the valgrind
+version, and what OS and version you are using. Thanks.
-I refs:
-I1 misses:
-LLi misses:
-I1 miss rate:
-LLi miss rate:
-
-D refs:
-D1 misses:
-LLd misses:
-D1 miss rate:
-LLd miss rate:
-
-LL refs:
-LL misses:
-LL miss rate:
=================================================
./valgrind-new/callgrind/tests/simwork3.stdout.diff
=================================================
--- simwork3.stdout.exp 2012-06-23 23:18:27.000000000 -0500
+++ simwork3.stdout.out 2012-06-23 23:30:37.000000000 -0500
@@ -1 +0,0 @@
-Sum: 1000000
=================================================
./valgrind-new/callgrind/tests/threads-use.stderr.diff
=================================================
--- threads-use.stderr.exp 2012-06-23 23:18:27.000000000 -0500
+++ threads-use.stderr.out 2012-06-23 23:30:37.000000000 -0500
@@ -1,20 +1,28 @@
-Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw AcCost1 SpLoss1 AcCost2 SpLoss2 Ge sysCount sysTime
-Collected :
+valgrind: m_scheduler/scheduler.c:707 (do_pre_run_checks): Assertion 'VG_IS_32_ALIGNED(a_vex)' failed.
+ at 0x3801ECC5: ???
+ by 0x3801EE88: ???
+ by 0x38064837: ???
+ by 0x380666C7: ???
+ by 0x3808BB28: ???
+
+sched status:
+ running_tid=1
+
+Thread 1: status = VgTs_Runnable
+ at 0x8FE01030: _dyld_start (in /usr/lib/dyld)
+
+
+Note: see also the FAQ in the source distribution.
+It contains workarounds to several common problems.
+In particular, if Valgrind aborted or crashed after
+identifying problems in your program, there's a good chance
+that fixing those problems will prevent Valgrind aborting or
+crashing, especially if it happe... [truncated message content]
From: Philippe W. <phi...@sk...> - 2012-06-24 03:46:14
valgrind revision: 12663
VEX revision: 2402
C compiler: gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2)
Assembler: GNU assembler version 2.21.53.0.1-6.fc16 20110716
C library: GNU C Library development release version 2.14.90
uname -mrs: Linux 3.3.1-3.fc16.ppc64 ppc64
Vendor version: Fedora release 16 (Verne)
Nightly build on gcc110 ( Fedora release 16 (Verne), ppc64 )
Started at 2012-06-23 20:00:11 PDT
Ended at 2012-06-23 20:45:04 PDT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 526 tests, 15 stderr failures, 8 stdout failures, 1 stderrB failure, 1 stdoutB failure, 2 post failures ==
gdbserver_tests/mcmain_pic (stdout)
gdbserver_tests/mcmain_pic (stderr)
gdbserver_tests/mcmain_pic (stdoutB)
gdbserver_tests/mcmain_pic (stderrB)
memcheck/tests/ppc32/power_ISA2_05 (stdout)
memcheck/tests/ppc32/power_ISA2_05 (stderr)
memcheck/tests/ppc64/power_ISA2_05 (stdout)
memcheck/tests/ppc64/power_ISA2_05 (stderr)
memcheck/tests/supp_unknown (stderr)
memcheck/tests/trivialleak (stderr)
memcheck/tests/varinfo6 (stderr)
memcheck/tests/wrap8 (stdout)
memcheck/tests/wrap8 (stderr)
massif/tests/big-alloc (post)
massif/tests/deep-D (post)
none/tests/empty-exe (stderr)
none/tests/ppc32/jm-fp (stdout)
none/tests/ppc32/jm-vmx (stdout)
none/tests/ppc64/jm-fp (stdout)
none/tests/ppc64/jm-vmx (stdout)
none/tests/shell (stderr)
none/tests/shell_valid1 (stderr)
none/tests/shell_valid2 (stderr)
none/tests/shell_valid3 (stderr)
none/tests/shell_zerolength (stderr)
helgrind/tests/tc18_semabuse (stderr)
helgrind/tests/tc20_verifywrap (stderr)
From: Tom H. <to...@co...> - 2012-06-24 03:14:35
valgrind revision: 12663
VEX revision: 2402
C compiler: gcc (GCC) 4.3.0 20080428 (Red Hat 4.3.0-8)
Assembler: GNU assembler version 2.18.50.0.6-2 20080403
C library: GNU C Library stable release version 2.8
uname -mrs: Linux 3.4.0-1.fc17.x86_64 x86_64
Vendor version: Fedora release 9 (Sulphur)
Nightly build on bristol ( x86_64, Fedora 9 )
Started at 2012-06-24 03:42:19 BST
Ended at 2012-06-24 04:14:17 BST
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 600 tests, 0 stderr failures, 1 stdout failure, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
none/tests/amd64/sse4-64 (stdout)
From: Tom H. <to...@co...> - 2012-06-24 03:01:52
valgrind revision: 12663
VEX revision: 2402
C compiler: gcc (GCC) 4.4.1 20090725 (Red Hat 4.4.1-2)
Assembler: GNU assembler version 2.19.51.0.14-3.fc11 20090722
C library: GNU C Library stable release version 2.10.2
uname -mrs: Linux 3.4.0-1.fc17.x86_64 x86_64
Vendor version: Fedora release 11 (Leonidas)
Nightly build on bristol ( x86_64, Fedora 11 )
Started at 2012-06-24 03:31:26 BST
Ended at 2012-06-24 04:01:32 BST
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 602 tests, 1 stderr failure, 1 stdout failure, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
memcheck/tests/long_namespace_xml (stderr)
none/tests/amd64/sse4-64 (stdout)
From: Rich C. <rc...@wi...> - 2012-06-24 02:57:21
valgrind revision: 12663
VEX revision: 2402
C compiler: gcc (SUSE Linux) 4.5.1 20101208 [gcc-4_5-branch revision 167585]
Assembler: GNU assembler (GNU Binutils; openSUSE 11.4) 2.21
C library: GNU C Library stable release version 2.11.3 (20110203)
uname -mrs: Linux 2.6.37.6-0.7-desktop x86_64
Vendor version: Welcome to openSUSE 11.4 "Celadon" - Kernel %r (%t).
Nightly build on ultra ( gcc 4.5.1 Linux 2.6.37.6-0.7-desktop x86_64 )
Started at 2012-06-23 21:30:01 CDT
Ended at 2012-06-23 21:57:12 CDT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 610 tests, 1 stderr failure, 0 stdout failures, 6 stderrB failures, 0 stdoutB failures, 0 post failures ==
gdbserver_tests/mcbreak (stderrB)
gdbserver_tests/mcclean_after_fork (stderrB)
gdbserver_tests/mcleak (stderrB)
gdbserver_tests/mcmain_pic (stderrB)
gdbserver_tests/mcvabits (stderrB)
gdbserver_tests/mssnapshot (stderrB)
memcheck/tests/origin5-bz2 (stderr)
=================================================
./valgrind-new/gdbserver_tests/mcbreak.stderrB.diff
=================================================
--- mcbreak.stderrB.exp 2012-06-23 21:45:49.065736553 -0500
+++ mcbreak.stderrB.out 2012-06-23 21:48:25.071800116 -0500
@@ -1,5 +1,7 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
vgdb-error value changed from 999999 to 0
n_errs_found 1 n_errs_shown 1 (vgdb-error 0)
vgdb-error value changed from 0 to 0
=================================================
./valgrind-new/gdbserver_tests/mcclean_after_fork.stderrB.diff
=================================================
--- mcclean_after_fork.stderrB.exp 2012-06-23 21:45:49.065736553 -0500
+++ mcclean_after_fork.stderrB.out 2012-06-23 21:48:26.747994058 -0500
@@ -1,4 +1,6 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
monitor command request to kill this process
Remote connection closed
=================================================
./valgrind-new/gdbserver_tests/mcleak.stderrB.diff
=================================================
--- mcleak.stderrB.exp 2012-06-23 21:45:49.061736089 -0500
+++ mcleak.stderrB.out 2012-06-23 21:48:44.765078699 -0500
@@ -1,5 +1,7 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
10 bytes in 1 blocks are still reachable in loss record ... of ...
at 0x........: malloc (vg_replace_malloc.c:...)
by 0x........: f (leak-delta.c:14)
=================================================
./valgrind-new/gdbserver_tests/mcmain_pic.stderrB.diff
=================================================
--- mcmain_pic.stderrB.exp 2012-06-23 21:45:49.068736901 -0500
+++ mcmain_pic.stderrB.out 2012-06-23 21:48:46.356262805 -0500
@@ -1,3 +1,5 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
Remote connection closed
=================================================
./valgrind-new/gdbserver_tests/mcvabits.stderrB.diff
=================================================
--- mcvabits.stderrB.exp 2012-06-23 21:45:49.069737017 -0500
+++ mcvabits.stderrB.out 2012-06-23 21:48:51.239827852 -0500
@@ -1,5 +1,7 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
Address 0x........ len 10 addressable
Address 0x........ is 0 bytes inside data symbol "undefined"
Address 0x........ len 10 defined
=================================================
./valgrind-new/gdbserver_tests/mssnapshot.stderrB.diff
=================================================
--- mssnapshot.stderrB.exp 2012-06-23 21:45:49.068736901 -0500
+++ mssnapshot.stderrB.out 2012-06-23 21:48:54.371190162 -0500
@@ -1,5 +1,9 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
+Missing separate debuginfo for /lib64/libc.so.6
+Try: zypper install -C "debuginfo(build-id)=92ec8fe859846a62345f74696ab349721415587a"
general valgrind monitor commands:
help [debug] : monitor command help. With debug: + debugging commands
v.wait [<ms>] : sleep <ms> (default 0) then continue
=================================================
./valgrind-new/memcheck/tests/origin5-bz2.stderr.diff-glibc212-s390x
=================================================
--- origin5-bz2.stderr.exp-glibc212-s390x 2012-06-23 21:45:57.188677163 -0500
+++ origin5-bz2.stderr.out 2012-06-23 21:50:15.048524872 -0500
@@ -75,17 +75,6 @@
at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
- at 0x........: mainSort (origin5-bz2.c:2859)
- by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
- by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
- by 0x........: handle_compress (origin5-bz2.c:4753)
- by 0x........: BZ2_bzCompress (origin5-bz2.c:4822)
- by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
- by 0x........: main (origin5-bz2.c:6484)
- Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6479)
-
-Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2963)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -131,6 +120,12 @@
Conditional jump or move depends on uninitialised value(s)
at 0x........: main (origin5-bz2.c:6512)
- Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6479)
+ Uninitialised value was created by a heap allocation
+ at 0x........: malloc (vg_replace_malloc.c:...)
+ by 0x........: g_serviceFn (origin5-bz2.c:6429)
+ by 0x........: default_bzalloc (origin5-bz2.c:4470)
+ by 0x........: BZ2_decompress (origin5-bz2.c:1578)
+ by 0x........: BZ2_bzDecompress (origin5-bz2.c:5192)
+ by 0x........: BZ2_bzBuffToBuffDecompress (origin5-bz2.c:5678)
+ by 0x........: main (origin5-bz2.c:6498)
=================================================
./valgrind-new/memcheck/tests/origin5-bz2.stderr.diff-glibc234-s390x
=================================================
--- origin5-bz2.stderr.exp-glibc234-s390x 2012-06-23 21:45:57.171675193 -0500
+++ origin5-bz2.stderr.out 2012-06-23 21:50:15.048524872 -0500
@@ -120,6 +120,12 @@
Conditional jump or move depends on uninitialised value(s)
at 0x........: main (origin5-bz2.c:6512)
- Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6479)
+ Uninitialised value was created by a heap allocation
+ at 0x........: malloc (vg_replace_malloc.c:...)
+ by 0x........: g_serviceFn (origin5-bz2.c:6429)
+ by 0x........: default_bzalloc (origin5-bz2.c:4470)
+ by 0x........: BZ2_decompress (origin5-bz2.c:1578)
+ by 0x........: BZ2_bzDecompress (origin5-bz2.c:5192)
+ by 0x........: BZ2_bzBuffToBuffDecompress (origin5-bz2.c:5678)
+ by 0x........: main (origin5-bz2.c:6498)
=================================================
./valgrind-new/memcheck/tests/origin5-bz2.stderr.diff-glibc25-amd64
=================================================
--- origin5-bz2.stderr.exp-glibc25-amd64 2012-06-23 21:45:57.116668825 -0500
+++ origin5-bz2.stderr.out 2012-06-23 21:50:15.048524872 -0500
@@ -120,6 +120,12 @@
Conditional jump or move depends on uninitialised value(s)
at 0x........: main (origin5-bz2.c:6512)
- Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6479)
+ Uninitialised value was created by a heap allocation
+ at 0x........: malloc (vg_replace_malloc.c:...)
+ by 0x........: g_serviceFn (origin5-bz2.c:6429)
+ by 0x........: default_bzalloc (origin5-bz2.c:4470)
+ by 0x........: BZ2_decompress (origin5-bz2.c:1578)
+ by 0x........: BZ2_bzDecompress (origin5-bz2.c:5192)
+ by 0x........: BZ2_bzBuffToBuffDecompress (origin5-bz2.c:5678)
+ by 0x........: main (origin5-bz2.c:6498)
=================================================
./valgrind-new/memcheck/tests/origin5-bz2.stderr.diff-glibc25-x86
=================================================
--- origin5-bz2.stderr.exp-glibc25-x86 2012-06-23 21:45:57.145672183 -0500
+++ origin5-bz2.stderr.out 2012-06-23 21:50:15.048524872 -0500
@@ -12,7 +12,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: copy_input_until_stop (origin5-bz2.c:4686)
by 0x........: handle_compress (origin5-bz2.c:4750)
by 0x........: BZ2_bzCompress (origin5-bz2.c:4822)
@@ -21,7 +21,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: copy_input_until_stop (origin5-bz2.c:4686)
by 0x........: handle_compress (origin5-bz2.c:4750)
by 0x........: BZ2_bzCompress (origin5-bz2.c:4822)
@@ -30,7 +30,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2820)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -41,7 +41,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2823)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -52,7 +52,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2854)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -63,7 +63,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2858)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -74,7 +74,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2963)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -85,7 +85,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2964)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -96,7 +96,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: fallbackSort (origin5-bz2.c:2269)
by 0x........: BZ2_blockSort (origin5-bz2.c:3116)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -107,7 +107,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: fallbackSort (origin5-bz2.c:2275)
by 0x........: BZ2_blockSort (origin5-bz2.c:3116)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -120,6 +120,12 @@
Conditional jump or move depends on uninitialised value(s)
at 0x........: main (origin5-bz2.c:6512)
- Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6479)
+ Uninitialised value was created by a heap allocation
+ at 0x........: malloc (vg_replace_malloc.c:...)
<truncated beyond 100 lines>
=================================================
./valgrind-new/memcheck/tests/origin5-bz2.stderr.diff-glibc27-ppc64
=================================================
--- origin5-bz2.stderr.exp-glibc27-ppc64 2012-06-23 21:45:57.158673689 -0500
+++ origin5-bz2.stderr.out 2012-06-23 21:50:15.048524872 -0500
@@ -1,7 +1,7 @@
Conditional jump or move depends on uninitialised value(s)
at 0x........: main (origin5-bz2.c:6481)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Conditional jump or move depends on uninitialised value(s)
at 0x........: copy_input_until_stop (origin5-bz2.c:4686)
@@ -10,7 +10,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: copy_input_until_stop (origin5-bz2.c:4686)
@@ -19,7 +19,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: copy_input_until_stop (origin5-bz2.c:4686)
@@ -28,7 +28,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2820)
@@ -39,7 +39,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2823)
@@ -50,7 +50,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2854)
@@ -61,7 +61,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2858)
@@ -72,7 +72,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2963)
@@ -83,7 +83,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2964)
@@ -94,7 +94,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: fallbackSort (origin5-bz2.c:2269)
@@ -105,7 +105,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
<truncated beyond 100 lines>
=================================================
./valgrind-old/gdbserver_tests/mcbreak.stderrB.diff
=================================================
--- mcbreak.stderrB.exp 2012-06-23 21:31:00.249809556 -0500
+++ mcbreak.stderrB.out 2012-06-23 21:36:50.934420916 -0500
@@ -1,5 +1,7 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
vgdb-error value changed from 999999 to 0
n_errs_found 1 n_errs_shown 1 (vgdb-error 0)
vgdb-error value changed from 0 to 0
=================================================
./valgrind-old/gdbserver_tests/mcclean_after_fork.stderrB.diff
=================================================
--- mcclean_after_fork.stderrB.exp 2012-06-23 21:31:00.249809556 -0500
+++ mcclean_after_fork.stderrB.out 2012-06-23 21:36:52.604614329 -0500
@@ -1,4 +1,6 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
monitor command request to kill this process
Remote connection closed
=================================================
./valgrind-old/gdbserver_tests/mcleak.stderrB.diff
=================================================
--- mcleak.stderrB.exp 2012-06-23 21:31:00.246809209 -0500
+++ mcleak.stderrB.out 2012-06-23 21:37:12.720943869 -0500
@@ -1,5 +1,7 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
10 bytes in 1 blocks are still reachable in loss record ... of ...
at 0x........: malloc (vg_replace_malloc.c:...)
by 0x........: f (leak-delta.c:14)
=================================================
./valgrind-old/gdbserver_tests/mcmain_pic.stderrB.diff
=================================================
--- mcmain_pic.stderrB.exp 2012-06-23 21:31:00.252809903 -0500
+++ mcmain_pic.stderrB.out 2012-06-23 21:37:14.277124078 -0500
@@ -1,3 +1,5 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
Remote connection closed
=================================================
./valgrind-old/gdbserver_tests/mcvabits.stderrB.diff
=================================================
--- mcvabits.stderrB.exp 2012-06-23 21:31:00.254810135 -0500
+++ mcvabits.stderrB.out 2012-06-23 21:37:19.170690769 -0500
@@ -1,5 +1,7 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
Address 0x........ len 10 addressable
Address 0x........ is 0 bytes inside data symbol "undefined"
Address 0x........ len 10 defined
=================================================
./valgrind-old/gdbserver_tests/mssnapshot.stderrB.diff
=================================================
--- mssnapshot.stderrB.exp 2012-06-23 21:31:00.253810019 -0500
+++ mssnapshot.stderrB.out 2012-06-23 21:37:22.292052233 -0500
@@ -1,5 +1,9 @@
relaying data between gdb and process ....
vgdb-error value changed from 0 to 999999
+
+
+Missing separate debuginfo for /lib64/libc.so.6
+Try: zypper install -C "debuginfo(build-id)=92ec8fe859846a62345f74696ab349721415587a"
general valgrind monitor commands:
help [debug] : monitor command help. With debug: + debugging commands
v.wait [<ms>] : sleep <ms> (default 0) then continue
=================================================
./valgrind-old/memcheck/tests/origin5-bz2.stderr.diff-glibc212-s390x
=================================================
--- origin5-bz2.stderr.exp-glibc212-s390x 2012-06-23 21:32:06.136439752 -0500
+++ origin5-bz2.stderr.out 2012-06-23 21:38:42.664359552 -0500
@@ -75,17 +75,6 @@
at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
- at 0x........: mainSort (origin5-bz2.c:2859)
- by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
- by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
- by 0x........: handle_compress (origin5-bz2.c:4753)
- by 0x........: BZ2_bzCompress (origin5-bz2.c:4822)
- by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
- by 0x........: main (origin5-bz2.c:6484)
- Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6479)
-
-Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2963)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -131,6 +120,12 @@
Conditional jump or move depends on uninitialised value(s)
at 0x........: main (origin5-bz2.c:6512)
- Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6479)
+ Uninitialised value was created by a heap allocation
+ at 0x........: malloc (vg_replace_malloc.c:...)
+ by 0x........: g_serviceFn (origin5-bz2.c:6429)
+ by 0x........: default_bzalloc (origin5-bz2.c:4470)
+ by 0x........: BZ2_decompress (origin5-bz2.c:1578)
+ by 0x........: BZ2_bzDecompress (origin5-bz2.c:5192)
+ by 0x........: BZ2_bzBuffToBuffDecompress (origin5-bz2.c:5678)
+ by 0x........: main (origin5-bz2.c:6498)
=================================================
./valgrind-old/memcheck/tests/origin5-bz2.stderr.diff-glibc234-s390x
=================================================
--- origin5-bz2.stderr.exp-glibc234-s390x 2012-06-23 21:32:06.120437898 -0500
+++ origin5-bz2.stderr.out 2012-06-23 21:38:42.664359552 -0500
@@ -120,6 +120,12 @@
Conditional jump or move depends on uninitialised value(s)
at 0x........: main (origin5-bz2.c:6512)
- Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6479)
+ Uninitialised value was created by a heap allocation
+ at 0x........: malloc (vg_replace_malloc.c:...)
+ by 0x........: g_serviceFn (origin5-bz2.c:6429)
+ by 0x........: default_bzalloc (origin5-bz2.c:4470)
+ by 0x........: BZ2_decompress (origin5-bz2.c:1578)
+ by 0x........: BZ2_bzDecompress (origin5-bz2.c:5192)
+ by 0x........: BZ2_bzBuffToBuffDecompress (origin5-bz2.c:5678)
+ by 0x........: main (origin5-bz2.c:6498)
=================================================
./valgrind-old/memcheck/tests/origin5-bz2.stderr.diff-glibc25-amd64
=================================================
--- origin5-bz2.stderr.exp-glibc25-amd64 2012-06-23 21:32:06.065431287 -0500
+++ origin5-bz2.stderr.out 2012-06-23 21:38:42.664359552 -0500
@@ -120,6 +120,12 @@
Conditional jump or move depends on uninitialised value(s)
at 0x........: main (origin5-bz2.c:6512)
- Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6479)
+ Uninitialised value was created by a heap allocation
+ at 0x........: malloc (vg_replace_malloc.c:...)
+ by 0x........: g_serviceFn (origin5-bz2.c:6429)
+ by 0x........: default_bzalloc (origin5-bz2.c:4470)
+ by 0x........: BZ2_decompress (origin5-bz2.c:1578)
+ by 0x........: BZ2_bzDecompress (origin5-bz2.c:5192)
+ by 0x........: BZ2_bzBuffToBuffDecompress (origin5-bz2.c:5678)
+ by 0x........: main (origin5-bz2.c:6498)
=================================================
./valgrind-old/memcheck/tests/origin5-bz2.stderr.diff-glibc25-x86
=================================================
--- origin5-bz2.stderr.exp-glibc25-x86 2012-06-23 21:32:06.094434886 -0500
+++ origin5-bz2.stderr.out 2012-06-23 21:38:42.664359552 -0500
@@ -12,7 +12,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: copy_input_until_stop (origin5-bz2.c:4686)
by 0x........: handle_compress (origin5-bz2.c:4750)
by 0x........: BZ2_bzCompress (origin5-bz2.c:4822)
@@ -21,7 +21,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: copy_input_until_stop (origin5-bz2.c:4686)
by 0x........: handle_compress (origin5-bz2.c:4750)
by 0x........: BZ2_bzCompress (origin5-bz2.c:4822)
@@ -30,7 +30,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2820)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -41,7 +41,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2823)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -52,7 +52,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2854)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -63,7 +63,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2858)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -74,7 +74,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2963)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -85,7 +85,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2964)
by 0x........: BZ2_blockSort (origin5-bz2.c:3105)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -96,7 +96,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: fallbackSort (origin5-bz2.c:2269)
by 0x........: BZ2_blockSort (origin5-bz2.c:3116)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -107,7 +107,7 @@
Uninitialised value was created by a client request
at 0x........: main (origin5-bz2.c:6479)
-Use of uninitialised value of size 4
+Use of uninitialised value of size 8
at 0x........: fallbackSort (origin5-bz2.c:2275)
by 0x........: BZ2_blockSort (origin5-bz2.c:3116)
by 0x........: BZ2_compressBlock (origin5-bz2.c:4034)
@@ -120,6 +120,12 @@
Conditional jump or move depends on uninitialised value(s)
at 0x........: main (origin5-bz2.c:6512)
- Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6479)
+ Uninitialised value was created by a heap allocation
+ at 0x........: malloc (vg_replace_malloc.c:...)
<truncated beyond 100 lines>
=================================================
./valgrind-old/memcheck/tests/origin5-bz2.stderr.diff-glibc27-ppc64
=================================================
--- origin5-bz2.stderr.exp-glibc27-ppc64 2012-06-23 21:32:06.107436392 -0500
+++ origin5-bz2.stderr.out 2012-06-23 21:38:42.664359552 -0500
@@ -1,7 +1,7 @@
Conditional jump or move depends on uninitialised value(s)
at 0x........: main (origin5-bz2.c:6481)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Conditional jump or move depends on uninitialised value(s)
at 0x........: copy_input_until_stop (origin5-bz2.c:4686)
@@ -10,7 +10,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: copy_input_until_stop (origin5-bz2.c:4686)
@@ -19,7 +19,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: copy_input_until_stop (origin5-bz2.c:4686)
@@ -28,7 +28,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2820)
@@ -39,7 +39,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2823)
@@ -50,7 +50,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2854)
@@ -61,7 +61,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2858)
@@ -72,7 +72,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2963)
@@ -83,7 +83,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: mainSort (origin5-bz2.c:2964)
@@ -94,7 +94,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
at 0x........: fallbackSort (origin5-bz2.c:2269)
@@ -105,7 +105,7 @@
by 0x........: BZ2_bzBuffToBuffCompress (origin5-bz2.c:5630)
by 0x........: main (origin5-bz2.c:6484)
Uninitialised value was created by a client request
- at 0x........: main (origin5-bz2.c:6481)
+ at 0x........: main (origin5-bz2.c:6479)
Use of uninitialised value of size 8
<truncated beyond 100 lines>
From: <br...@ac...> - 2012-06-24 02:56:14
valgrind revision: 12663
VEX revision: 2402
C compiler: gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)
Assembler: GNU assembler 2.15.92.0.2 20040927
C library: GNU C Library stable release version 2.3.4
uname -mrs: Linux 2.6.9-42.EL s390x
Vendor version: Red Hat Enterprise Linux AS release 4 (Nahant Update 4)
Nightly build on z10-ec ( s390x build on z10-EC )
Started at 2012-06-23 22:20:10 EDT
Ended at 2012-06-23 22:56:02 EDT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 508 tests, 6 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
memcheck/tests/manuel3 (stderr)
memcheck/tests/partial_load_ok (stderr)
memcheck/tests/varinfo6 (stderr)
helgrind/tests/tc09_bad_unlock (stderr)
helgrind/tests/tc18_semabuse (stderr)
helgrind/tests/tc20_verifywrap (stderr)
=================================================
./valgrind-new/helgrind/tests/tc09_bad_unlock.stderr.diff
=================================================
--- tc09_bad_unlock.stderr.exp 2012-06-23 22:38:23.000000000 -0400
+++ tc09_bad_unlock.stderr.out 2012-06-23 22:51:29.000000000 -0400
@@ -42,14 +42,6 @@
by 0x........: nearly_main (tc09_bad_unlock.c:41)
by 0x........: main (tc09_bad_unlock.c:49)
-----------------------------------------------------------------
-
-Thread #x's call to pthread_mutex_unlock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_unlock (hg_intercepts.c:...)
- by 0x........: nearly_main (tc09_bad_unlock.c:41)
- by 0x........: main (tc09_bad_unlock.c:49)
-
---------------------
----------------------------------------------------------------
@@ -110,16 +102,8 @@
----------------------------------------------------------------
-Thread #x's call to pthread_mutex_unlock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_unlock (hg_intercepts.c:...)
- by 0x........: nearly_main (tc09_bad_unlock.c:41)
- by 0x........: main (tc09_bad_unlock.c:50)
-
-----------------------------------------------------------------
-
Thread #x: Exiting thread still holds 1 lock
...
-ERROR SUMMARY: 11 errors from 11 contexts (suppressed: 0 from 0)
+ERROR SUMMARY: 9 errors from 9 contexts (suppressed: 0 from 0)
=================================================
./valgrind-new/helgrind/tests/tc18_semabuse.stderr.diff
=================================================
--- tc18_semabuse.stderr.exp 2012-06-23 22:38:23.000000000 -0400
+++ tc18_semabuse.stderr.out 2012-06-23 22:51:37.000000000 -0400
@@ -18,13 +18,5 @@
by 0x........: sem_wait (hg_intercepts.c:...)
by 0x........: main (tc18_semabuse.c:34)
-----------------------------------------------------------------
-Thread #x's call to sem_post failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: sem_post_WRK (hg_intercepts.c:...)
- by 0x........: sem_post (hg_intercepts.c:...)
- by 0x........: main (tc18_semabuse.c:37)
-
-
-ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
+ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
=================================================
./valgrind-new/helgrind/tests/tc20_verifywrap.stderr.diff
=================================================
--- tc20_verifywrap.stderr.exp 2012-06-23 22:38:23.000000000 -0400
+++ tc20_verifywrap.stderr.out 2012-06-23 22:51:47.000000000 -0400
@@ -1,7 +1,7 @@
------- This is output for >= glibc 2.4 ------
+------ This is output for < glibc 2.4 ------
---------------- pthread_create/join ----------------
@@ -45,13 +45,6 @@
----------------------------------------------------------------
-Thread #x's call to pthread_mutex_init failed
- with error code 95 (EOPNOTSUPP: Operation not supported on transport endpoint)
- at 0x........: pthread_mutex_init (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:92)
-
-----------------------------------------------------------------
-
Thread #x: pthread_mutex_destroy of a locked mutex
at 0x........: pthread_mutex_destroy (hg_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:102)
@@ -63,26 +56,8 @@
at 0x........: pthread_mutex_destroy (hg_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:102)
-----------------------------------------------------------------
-
-Thread #x's call to pthread_mutex_lock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_lock (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:108)
-
-----------------------------------------------------------------
-Thread #x's call to pthread_mutex_trylock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_trylock (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:116)
-
-----------------------------------------------------------------
-
-Thread #x's call to pthread_mutex_timedlock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_timedlock (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:121)
+make pthread_mutex_lock fail: skipped on glibc < 2.4
----------------------------------------------------------------
@@ -90,13 +65,6 @@
at 0x........: pthread_mutex_unlock (hg_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:125)
-----------------------------------------------------------------
-
-Thread #x's call to pthread_mutex_unlock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_unlock (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:125)
-
---------------- pthread_cond_wait et al ----------------
@@ -215,14 +183,6 @@
by 0x........: sem_wait (hg_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:242)
-----------------------------------------------------------------
-
-Thread #x's call to sem_post failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: sem_post_WRK (hg_intercepts.c:...)
- by 0x........: sem_post (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:245)
-
FIXME: can't figure out how to verify wrap of sem_post
@@ -235,4 +195,4 @@
...
-ERROR SUMMARY: 23 errors from 23 contexts (suppressed: 0 from 0)
+ERROR SUMMARY: 17 errors from 17 contexts (suppressed: 0 from 0)
=================================================
./valgrind-new/memcheck/tests/manuel3.stderr.diff
=================================================
--- manuel3.stderr.exp 2012-06-23 22:39:10.000000000 -0400
+++ manuel3.stderr.out 2012-06-23 22:46:29.000000000 -0400
@@ -1,4 +1,3 @@
Conditional jump or move depends on uninitialised value(s)
- at 0x........: gcc_cant_inline_me (manuel3.c:22)
- by 0x........: main (manuel3.c:14)
+ at 0x........: main (manuel3.c:12)
=================================================
./valgrind-new/memcheck/tests/partial_load_ok.stderr.diff
=================================================
--- partial_load_ok.stderr.exp 2012-06-23 22:39:09.000000000 -0400
+++ partial_load_ok.stderr.out 2012-06-23 22:46:59.000000000 -0400
@@ -1,7 +1,13 @@
-Invalid read of size 4
+Invalid read of size 1
+ at 0x........: main (partial_load.c:16)
+ Address 0x........ is 0 bytes after a block of size 7 alloc'd
+ at 0x........: calloc (vg_replace_malloc.c:...)
+ by 0x........: main (partial_load.c:14)
+
+Invalid read of size 8
at 0x........: main (partial_load.c:23)
- Address 0x........ is 1 bytes inside a block of size 4 alloc'd
+ Address 0x........ is 1 bytes inside a block of size 8 alloc'd
at 0x........: calloc (vg_replace_malloc.c:...)
by 0x........: main (partial_load.c:20)
@@ -11,9 +17,9 @@
at 0x........: calloc (vg_replace_malloc.c:...)
by 0x........: main (partial_load.c:28)
-Invalid read of size 4
+Invalid read of size 8
at 0x........: main (partial_load.c:37)
- Address 0x........ is 0 bytes inside a block of size 4 free'd
+ Address 0x........ is 0 bytes inside a block of size 8 free'd
at 0x........: free (vg_replace_malloc.c:...)
by 0x........: main (partial_load.c:36)
@@ -25,4 +31,4 @@
For a detailed leak analysis, rerun with: --leak-check=full
For counts of detected and suppressed errors, rerun with: -v
-ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
+ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)
=================================================
./valgrind-new/memcheck/tests/partial_load_ok.stderr.diff64
=================================================
--- partial_load_ok.stderr.exp64 2012-06-23 22:39:09.000000000 -0400
+++ partial_load_ok.stderr.out 2012-06-23 22:46:59.000000000 -0400
@@ -1,4 +1,10 @@
+Invalid read of size 1
+ at 0x........: main (partial_load.c:16)
+ Address 0x........ is 0 bytes after a block of size 7 alloc'd
+ at 0x........: calloc (vg_replace_malloc.c:...)
+ by 0x........: main (partial_load.c:14)
+
Invalid read of size 8
at 0x........: main (partial_load.c:23)
Address 0x........ is 1 bytes inside a block of size 8 alloc'd
@@ -25,4 +31,4 @@
For a detailed leak analysis, rerun with: --leak-check=full
For counts of detected and suppressed errors, rerun with: -v
-ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
+ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)
=================================================
./valgrind-new/memcheck/tests/varinfo6.stderr.diff
=================================================
--- varinfo6.stderr.exp 2012-06-23 22:39:09.000000000 -0400
+++ varinfo6.stderr.out 2012-06-23 22:47:57.000000000 -0400
@@ -7,8 +7,7 @@
by 0x........: BZ2_bzCompress (varinfo6.c:4860)
by 0x........: BZ2_bzBuffToBuffCompress (varinfo6.c:5667)
by 0x........: main (varinfo6.c:6517)
- Location 0x........ is 2 bytes inside local var "budget"
- declared at varinfo6.c:3115, in frame #2 of thread 1
+ Address 0x........ is on thread 1's stack
Uninitialised byte(s) found during client check request
at 0x........: croak (varinfo6.c:34)
=================================================
./valgrind-new/memcheck/tests/varinfo6.stderr.diff-ppc64
=================================================
--- varinfo6.stderr.exp-ppc64 2012-06-23 22:39:09.000000000 -0400
+++ varinfo6.stderr.out 2012-06-23 22:47:57.000000000 -0400
@@ -1,5 +1,5 @@
Uninitialised byte(s) found during client check request
- at 0x........: croak (varinfo6.c:35)
+ at 0x........: croak (varinfo6.c:34)
by 0x........: mainSort (varinfo6.c:2999)
by 0x........: BZ2_blockSort (varinfo6.c:3143)
by 0x........: BZ2_compressBlock (varinfo6.c:4072)
@@ -10,7 +10,7 @@
Address 0x........ is on thread 1's stack
Uninitialised byte(s) found during client check request
- at 0x........: croak (varinfo6.c:35)
+ at 0x........: croak (varinfo6.c:34)
by 0x........: BZ2_decompress (varinfo6.c:1699)
by 0x........: BZ2_bzDecompress (varinfo6.c:5230)
by 0x........: BZ2_bzBuffToBuffDecompress (varinfo6.c:5715)
=================================================
./valgrind-old/helgrind/tests/tc09_bad_unlock.stderr.diff
=================================================
--- tc09_bad_unlock.stderr.exp 2012-06-23 22:21:06.000000000 -0400
+++ tc09_bad_unlock.stderr.out 2012-06-23 22:33:24.000000000 -0400
@@ -42,14 +42,6 @@
by 0x........: nearly_main (tc09_bad_unlock.c:41)
by 0x........: main (tc09_bad_unlock.c:49)
-----------------------------------------------------------------
-
-Thread #x's call to pthread_mutex_unlock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_unlock (hg_intercepts.c:...)
- by 0x........: nearly_main (tc09_bad_unlock.c:41)
- by 0x........: main (tc09_bad_unlock.c:49)
-
---------------------
----------------------------------------------------------------
@@ -110,16 +102,8 @@
----------------------------------------------------------------
-Thread #x's call to pthread_mutex_unlock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_unlock (hg_intercepts.c:...)
- by 0x........: nearly_main (tc09_bad_unlock.c:41)
- by 0x........: main (tc09_bad_unlock.c:50)
-
-----------------------------------------------------------------
-
Thread #x: Exiting thread still holds 1 lock
...
-ERROR SUMMARY: 11 errors from 11 contexts (suppressed: 0 from 0)
+ERROR SUMMARY: 9 errors from 9 contexts (suppressed: 0 from 0)
=================================================
./valgrind-old/helgrind/tests/tc18_semabuse.stderr.diff
=================================================
--- tc18_semabuse.stderr.exp 2012-06-23 22:21:06.000000000 -0400
+++ tc18_semabuse.stderr.out 2012-06-23 22:33:32.000000000 -0400
@@ -18,13 +18,5 @@
by 0x........: sem_wait (hg_intercepts.c:...)
by 0x........: main (tc18_semabuse.c:34)
-----------------------------------------------------------------
-Thread #x's call to sem_post failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: sem_post_WRK (hg_intercepts.c:...)
- by 0x........: sem_post (hg_intercepts.c:...)
- by 0x........: main (tc18_semabuse.c:37)
-
-
-ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
+ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
=================================================
./valgrind-old/helgrind/tests/tc20_verifywrap.stderr.diff
=================================================
--- tc20_verifywrap.stderr.exp 2012-06-23 22:21:06.000000000 -0400
+++ tc20_verifywrap.stderr.out 2012-06-23 22:33:41.000000000 -0400
@@ -1,7 +1,7 @@
------- This is output for >= glibc 2.4 ------
+------ This is output for < glibc 2.4 ------
---------------- pthread_create/join ----------------
@@ -45,13 +45,6 @@
----------------------------------------------------------------
-Thread #x's call to pthread_mutex_init failed
- with error code 95 (EOPNOTSUPP: Operation not supported on transport endpoint)
- at 0x........: pthread_mutex_init (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:92)
-
-----------------------------------------------------------------
-
Thread #x: pthread_mutex_destroy of a locked mutex
at 0x........: pthread_mutex_destroy (hg_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:102)
@@ -63,26 +56,8 @@
at 0x........: pthread_mutex_destroy (hg_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:102)
-----------------------------------------------------------------
-
-Thread #x's call to pthread_mutex_lock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_lock (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:108)
-
-----------------------------------------------------------------
-Thread #x's call to pthread_mutex_trylock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_trylock (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:116)
-
-----------------------------------------------------------------
-
-Thread #x's call to pthread_mutex_timedlock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_timedlock (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:121)
+make pthread_mutex_lock fail: skipped on glibc < 2.4
----------------------------------------------------------------
@@ -90,13 +65,6 @@
at 0x........: pthread_mutex_unlock (hg_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:125)
-----------------------------------------------------------------
-
-Thread #x's call to pthread_mutex_unlock failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: pthread_mutex_unlock (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:125)
-
---------------- pthread_cond_wait et al ----------------
@@ -215,14 +183,6 @@
by 0x........: sem_wait (hg_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:242)
-----------------------------------------------------------------
-
-Thread #x's call to sem_post failed
- with error code 22 (EINVAL: Invalid argument)
- at 0x........: sem_post_WRK (hg_intercepts.c:...)
- by 0x........: sem_post (hg_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:245)
-
FIXME: can't figure out how to verify wrap of sem_post
@@ -235,4 +195,4 @@
...
-ERROR SUMMARY: 23 errors from 23 contexts (suppressed: 0 from 0)
+ERROR SUMMARY: 17 errors from 17 contexts (suppressed: 0 from 0)
=================================================
./valgrind-old/memcheck/tests/manuel3.stderr.diff
=================================================
--- manuel3.stderr.exp 2012-06-23 22:22:03.000000000 -0400
+++ manuel3.stderr.out 2012-06-23 22:28:24.000000000 -0400
@@ -1,4 +1,3 @@
Conditional jump or move depends on uninitialised value(s)
- at 0x........: gcc_cant_inline_me (manuel3.c:22)
- by 0x........: main (manuel3.c:14)
+ at 0x........: main (manuel3.c:12)
=================================================
./valgrind-old/memcheck/tests/partial_load_ok.stderr.diff
=================================================
--- partial_load_ok.stderr.exp 2012-06-23 22:22:03.000000000 -0400
+++ partial_load_ok.stderr.out 2012-06-23 22:28:54.000000000 -0400
@@ -1,7 +1,13 @@
-Invalid read of size 4
+Invalid read of size 1
+ at 0x........: main (partial_load.c:16)
+ Address 0x........ is 0 bytes after a block of size 7 alloc'd
+ at 0x........: calloc (vg_replace_malloc.c:...)
+ by 0x........: main (partial_load.c:14)
+
+Invalid read of size 8
at 0x........: main (partial_load.c:23)
- Address 0x........ is 1 bytes inside a block of size 4 alloc'd
+ Address 0x........ is 1 bytes inside a block of size 8 alloc'd
at 0x........: calloc (vg_replace_malloc.c:...)
by 0x........: main (partial_load.c:20)
@@ -11,9 +17,9 @@
at 0x........: calloc (vg_replace_malloc.c:...)
by 0x........: main (partial_load.c:28)
-Invalid read of size 4
+Invalid read of size 8
at 0x........: main (partial_load.c:37)
- Address 0x........ is 0 bytes inside a block of size 4 free'd
+ Address 0x........ is 0 bytes inside a block of size 8 free'd
at 0x........: free (vg_replace_malloc.c:...)
by 0x........: main (partial_load.c:36)
@@ -25,4 +31,4 @@
For a detailed leak analysis, rerun with: --leak-check=full
For counts of detected and suppressed errors, rerun with: -v
-ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
+ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)
=================================================
./valgrind-old/memcheck/tests/partial_load_ok.stderr.diff64
=================================================
--- partial_load_ok.stderr.exp64 2012-06-23 22:22:03.000000000 -0400
+++ partial_load_ok.stderr.out 2012-06-23 22:28:54.000000000 -0400
@@ -1,4 +1,10 @@
+Invalid read of size 1
+ at 0x........: main (partial_load.c:16)
+ Address 0x........ is 0 bytes after a block of size 7 alloc'd
+ at 0x........: calloc (vg_replace_malloc.c:...)
+ by 0x........: main (partial_load.c:14)
+
Invalid read of size 8
at 0x........: main (partial_load.c:23)
Address 0x........ is 1 bytes inside a block of size 8 alloc'd
@@ -25,4 +31,4 @@
For a detailed leak analysis, rerun with: --leak-check=full
For counts of detected and suppressed errors, rerun with: -v
-ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
+ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)
=================================================
./valgrind-old/memcheck/tests/varinfo6.stderr.diff
=================================================
--- varinfo6.stderr.exp 2012-06-23 22:22:03.000000000 -0400
+++ varinfo6.stderr.out 2012-06-23 22:29:52.000000000 -0400
@@ -7,8 +7,7 @@
by 0x........: BZ2_bzCompress (varinfo6.c:4860)
by 0x........: BZ2_bzBuffToBuffCompress (varinfo6.c:5667)
by 0x........: main (varinfo6.c:6517)
- Location 0x........ is 2 bytes inside local var "budget"
- declared at varinfo6.c:3115, in frame #2 of thread 1
+ Address 0x........ is on thread 1's stack
Uninitialised byte(s) found during client check request
at 0x........: croak (varinfo6.c:34)
=================================================
./valgrind-old/memcheck/tests/varinfo6.stderr.diff-ppc64
=================================================
--- varinfo6.stderr.exp-ppc64 2012-06-23 22:22:02.000000000 -0400
+++ varinfo6.stderr.out 2012-06-23 22:29:52.000000000 -0400
@@ -1,5 +1,5 @@
Uninitialised byte(s) found during client check request
- at 0x........: croak (varinfo6.c:35)
+ at 0x........: croak (varinfo6.c:34)
by 0x........: mainSort (varinfo6.c:2999)
by 0x........: BZ2_blockSort (varinfo6.c:3143)
by 0x........: BZ2_compressBlock (varinfo6.c:4072)
@@ -10,7 +10,7 @@
Address 0x........ is on thread 1's stack
Uninitialised byte(s) found during client check request
- at 0x........: croak (varinfo6.c:35)
+ at 0x........: croak (varinfo6.c:34)
by 0x........: BZ2_decompress (varinfo6.c:1699)
by 0x........: BZ2_bzDecompress (varinfo6.c:5230)
by 0x........: BZ2_bzBuffToBuffDecompress (varinfo6.c:5715)
From: Tom H. <to...@co...> - 2012-06-24 02:51:03
valgrind revision: 12663
VEX revision: 2402
C compiler: gcc (GCC) 4.5.1 20100924 (Red Hat 4.5.1-4)
Assembler: GNU assembler version 2.20.51.0.7-8.fc14 20100318
C library: GNU C Library stable release version 2.13
uname -mrs: Linux 3.4.0-1.fc17.x86_64 x86_64
Vendor version: Fedora release 14 (Laughlin)

Nightly build on bristol ( x86_64, Fedora 14 )
Started at 2012-06-24 03:12:12 BST
Ended at 2012-06-24 03:50:45 BST
Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 617 tests, 1 stderr failure, 0 stdout failures, 1 stderrB failure, 2 stdoutB failures, 0 post failures ==
gdbserver_tests/mcinfcallWSRU (stderrB)
gdbserver_tests/nlcontrolc (stdoutB)
gdbserver_tests/nlpasssigalrm (stdoutB)
memcheck/tests/origin5-bz2 (stderr)
From: Tom H. <to...@co...> - 2012-06-24 02:50:39
valgrind revision: 12663
VEX revision: 2402
C compiler: gcc (GCC) 4.4.5 20101112 (Red Hat 4.4.5-2)
Assembler: GNU assembler version 2.20.51.0.2-20.fc13 20091009
C library: GNU C Library stable release version 2.12.2
uname -mrs: Linux 3.4.0-1.fc17.x86_64 x86_64
Vendor version: Fedora release 13 (Goddard)

Nightly build on bristol ( x86_64, Fedora 13 )
Started at 2012-06-24 03:22:05 BST
Ended at 2012-06-24 03:50:24 BST
Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 602 tests, 1 stderr failure, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
helgrind/tests/pth_barrier3 (stderr)