From: <sv...@va...> - 2014-02-05 14:00:27
Author: sewardj
Date: Wed Feb 5 14:00:16 2014
New Revision: 13789
Log:
Create a list of all bugs reported after the 3.9.0 release
(I think).
Added:
trunk/docs/internals/3_9_BUGSTATUS.txt
Modified:
trunk/docs/Makefile.am
Modified: trunk/docs/Makefile.am
==============================================================================
--- trunk/docs/Makefile.am (original)
+++ trunk/docs/Makefile.am Wed Feb 5 14:00:16 2014
@@ -27,6 +27,7 @@
internals/3_5_BUGSTATUS.txt \
internals/3_7_BUGSTATUS.txt \
internals/3_8_BUGSTATUS.txt \
+ internals/3_9_BUGSTATUS.txt \
internals/arm_thumb_notes_gdbserver.txt \
internals/avx-notes.txt \
internals/BIG_APP_NOTES.txt \
Added: trunk/docs/internals/3_9_BUGSTATUS.txt
==============================================================================
--- trunk/docs/internals/3_9_BUGSTATUS.txt (added)
+++ trunk/docs/internals/3_9_BUGSTATUS.txt Wed Feb 5 14:00:16 2014
@@ -0,0 +1,105 @@
+
+Bugs reported after Thu Sep 19 10:34:49 CEST 2013
+
+For bugs reported before this time, see 3_8_BUGSTATUS.txt
+
+324894 Phase 3 support for IBM Power ISA 2.07
+325110 Add test-cases for Power ISA 2.06 insns: divdo/divdo. and divduo/divduo.
+325124 [MIPSEL] Compilation error
+325222 eight bad if statements ?
+325266 unhandled instruction bytes: 0xC4 0xC2 0x79 0xF7 0xC9 0x89 0x45 0x80
+325328 __float128 loses precision under memcheck
+325333 VALGRIND_HG_DISABLE_CHECKING does not seem to work locally
+325477 Phase 4 support for IBM Power ISA 2.07
+325538 cavim octeon mips64: valgrind reported "dumping core" and "Assertion 'sizeof(*regs) == sizeof(prs->pr_reg)' failed.
+325628 Phase 5 support for IBM Power ISA 2.07
+325714 Empty vgcore but RLIMIT_CORE is big enough (too big)
+325751 Missing the two privileged Power PC Transactional Memory Instructions
+325816 Phase 6 support for IBM Power ISA 2.07
+325856 sgcheck generates internal Valgrind error on IBM Power
+325874 Crash KCachegrind while load big file
+326026 Iop names for count leading zeros/sign bits incorrectly imply a "signedness" in incoming lanes
+326091 False positive in libstdc++ std::string::_S_construct (gcc 4.7.2)
+326113 valgrind libvex hwcaps error on AMD64
+326436 False positive in libstdc++ std::list::push_back
+326444 Cavium MIPS Octeon Specific Load Indexed Instructions
+326462 Refactor vgdb module to isolate ptrace stuff into separate module
+326469 unhandled instruction bytes: 0x66 0xF 0x3A 0x63 0xC1 0xE 0x89 0xC8
+326487 child of debugged process exits without becoming zombie
+326623 A false positive conflict report in a field assignment in a constructor
+326724 Valgrind does not compile on OSX 1.9 Mavericks
+326797 Assertion 'sizeof(UWord) == sizeof(UInt)' failed.
+326816 Intercept for __strncpy_sse2_unaligned missing?
+326821 Double overflow/underflow handling broken (after exp())
+326839 Don't see a writing into a none allocated memory
+326921 coregrind fails to compile m_trampoline.S with MIPS/Linux port of Valgrind
+326955 64 bit false positive move depends on uninitialised value wcscpy
+326983 insn_basic test might crash because of setting and not clearing DF flag
+327138 valgrind.h __VALGRIND_MINOR__ says 8, in 3.9.0 tarball
+327151 valgrind appears to stop compiling when it enters the drd directory
+327155 Valgrind compilation hang on MIPS
+327212 expand_file_name prepends current directory when expansion starts with /
+327223 Support for Cavium MIPS Octeon Atomic and Count Instructions
+327238 assertion failure in Callgrind: bbcc.c:585 (vgCallgrind_setup_bbcc): Assertion 'passed <= last_bb->cjmp_count' failed
+327284 s390x VEX miscompilation of -march=z10 binary
+327285 vex amd64->IR: unhandled instruction bytes: 0x8F 0xEA 0xF8 0x10 0xCE 0x3 0x1D 0x0
+327427 ifunc wrapper crashes when symbols are discarded because of false mmap overlaps
+327548 false positive while destroying mutex
+327583 libpixman error on ARM system
+327639 vex amd64->IR pcmpestri SSE4.2 instruction is unsupported 0x34
+327665 out of memory error
+327745 valgrind 3.9.0 build fails on Mac OS X 10.6.8
+327837 dwz compressed alternate .debug_info and debug_str not read correctly.
+327859 Support for android devices
+327881 False Positive Warning on std::atomic_bool
+327916 DW_TAG_typedef may have no name
+327943 s390x missing index/strchr suppression for ld.so (bad backtrace?)
+327945 valgrind_3.9.0 failed to compile in ppc 32
+328011 3.9.0 segfaults running any program, on any valgrind tool
+328081 embedded gdbserver and non-stop mode
+328089 unhandled instruction bytes: 0xF0 0xF 0xC0 0x10
+328100 XABORT not implemented
+328147 vex mips->IR: unhandled instruction bytes: 0x0 0x0 0x0 0xE
+328205 Implement additional Xen hypercalls
+328357 vex amd64->IR: unhandled instruction bytes: 0x8F 0xEA 0xF8 0x10 0xEF 0x3 0x5 0x0
+328423 Unrecognised instructions: _fips_armv7_tick and _armv7_tick
+328441 valgrind_3.9.0 failed to compile in mips32 "Error: illegal operands `cfc1 $t0,$31'"
+328454 add support Backtraces with ARM unwind tables (EXIDX)
+328455 s390x: valgrind is gettting SIGILL after emitting wrong register pair for ldxbr
+328468 unwind x86/amd64 gcc <= 4.4 compiled code does not unwind properly at "ret" instruction
+328490 drd reports false positive for concurrent __atomic_base access
+328549 Valgrind crashes on Android 4.4 / x86 on most programs
+328559 Some back trace generation (from mmap function) problem on ARM
+328563 make track-fds support xml output
+328711 valgrind.1 manpage "memcheck options" section is badly generated
+328721 MSVC 2008 compiler warns about while(0) in warning level 4
+328730 Unimplemented system call #531 in FreeBSD: SYS_posix_fadvise
+328747 Valgrind memcheck exits with SIGTRAP on PPC
+328878 vex amd64->IR pcmpestri SSE4.2 instruction is unsupported 0x14
+329104 kcachegrind crashs when on loading some of my cachegrind traces (SIGFPE).
+329245 unhandled instruction bytes: 0x48 0xF 0x5A 0x7 0x48 0xF 0x5A 0x4F
+329612 Incorrect handling of AT_BASE for image execution
+329619 leak-check gets assertion failure when nesting VALGRIND_MALLOCLIKE_BLOCK
+329694 clang warns about using uninitialized variable
+329726 Mozilla
+329737 KCachegrind stores translated messages to config file
+329956 valgrind crashes when lmw/stmw instructions are used on ppc64
+329963 Half precision floating point conversion on ARM is not supported
+330147 libmpiwrap: byte count from PMPI_Get_count should be made defined
+330152 vex amd64->IR: unhandled instruction bytes: 0xC5 0xFB 0x10 0x4 0x25 0xB0 0xCA 0x41
+330180 False positive in v4l2?
+330228 mmap must align to VKI_SHMLBA on mips32
+330254 Exit code of original app should be accessible
+330257 LLVM does not support `-mno-dynamic-no-pic` option
+330293 Please add a AppData application description
+330319 unhandled instruction bytes: 0xF 0x1 0xD5 0x31 0xC0 0xC3 0x48 0x8D
+330321 Serious error when reading debug info - DW_AT_signature 9b d0 55 13 bb 1e e9 37
+330349 Endless loop happen when using lackey with --trace-mem=yes on ARM
+330459 --track-fds=yes doesn't track eventfds
+330469 Add clock_adjtime syscall support
+330590 Missing support for multiple VEX CMP instruction Opcodes (Causes SIGILL)
+330594 Missing sysalls on PowerPC / uClibc
+330617 ppc false positive conditional jump depends on uninitialised value
+330622 Add test to regression suite for POWER instruction: dcbzl
+
+Wed Feb 5 14:58:25 CET 2014
From: <sv...@va...> - 2014-02-05 13:21:12
Author: sewardj
Date: Wed Feb 5 13:20:58 2014
New Revision: 13788
Log:
Show a line in the output log when the client connects but the
requested file is not found by the server. This makes it easier to
diagnose client--server communications problems.
Modified:
trunk/auxprogs/valgrind-di-server.c
Modified: trunk/auxprogs/valgrind-di-server.c
==============================================================================
--- trunk/auxprogs/valgrind-di-server.c (original)
+++ trunk/auxprogs/valgrind-di-server.c Wed Feb 5 13:20:58 2014
@@ -743,6 +743,8 @@
fd = open((char*)filename, O_RDONLY);
if (fd == -1) {
res = mk_Frame_asciiz("FAIL", "OPEN: cannot open file");
+ printf("(%d) SessionID %llu: open failed for \"%s\"\n",
+ conn_count, conn_state[conn_no].session_id, filename );
ok = False;
} else {
assert(fd > 2);
From: <sv...@va...> - 2014-02-05 11:02:41
Author: sewardj
Date: Wed Feb 5 11:02:34 2014
New Revision: 13787
Log:
Update.
Modified:
trunk/README.aarch64
Modified: trunk/README.aarch64
==============================================================================
--- trunk/README.aarch64 (original)
+++ trunk/README.aarch64 Wed Feb 5 11:02:34 2014
@@ -206,3 +206,20 @@
ubfm/sbfm etc: special case cases that are simple shifts, as iropt
can't always simplify the general-case IR to a shift in such cases.
+
+
+LDP,STP (immediate, simm7) (FP&VEC)
+should zero out hi parts of dst registers in the LDP case
+
+
+DUP insns: use Iop_Dup8x16, Iop_Dup16x8, Iop_Dup32x4
+rather than doing it "by hand"
+
+
+Any place where ZeroHI64ofV128 is used in conjunction with
+FP vector IROps: find a way to make sure that arithmetic on
+the upper half of the values is "harmless."
+
+
+math_MINMAXV: use real Iop_Cat{Odd,Even}Lanes ops rather than
+inline scalar code
Author: sewardj
Date: Wed Feb 5 11:01:19 2014
New Revision: 2812
Log:
Implement a few more vector aarch64 insns:
LD1 {vT.4s}, [xN|SP], #16
ADD Dd, Dn, Dm
SUB Dd, Dn, Dm
SMIN Vd.T, Vn.T, Vm.T
UMIN Vd.T, Vn.T, Vm.T
SMAX Vd.T, Vn.T, Vm.T
UMAX Vd.T, Vn.T, Vm.T
SMINV Vd.T, Vn.T, Vm.T
UMINV Vd.T, Vn.T, Vm.T
SMAXV Vd.T, Vn.T, Vm.T
UMAXV Vd.T, Vn.T, Vm.T
DUP Vd.T, Rn
FADD/FSUB/FMUL/FDIV32x4
Modified:
trunk/priv/guest_arm64_toIR.c
trunk/priv/host_arm64_defs.c
trunk/priv/host_arm64_defs.h
trunk/priv/host_arm64_isel.c
trunk/priv/ir_defs.c
trunk/pub/libvex_ir.h
Modified: trunk/priv/guest_arm64_toIR.c
==============================================================================
--- trunk/priv/guest_arm64_toIR.c (original)
+++ trunk/priv/guest_arm64_toIR.c Wed Feb 5 11:01:19 2014
@@ -3993,6 +3993,7 @@
0100 1100 1001 1111 0111 11 N T ST1 {vT.2d}, [xN|SP], #16
0100 1100 1101 1111 0111 11 N T LD1 {vT.2d}, [xN|SP], #16
0100 1100 1001 1111 0111 10 N T ST1 {vT.4s}, [xN|SP], #16
+ 0100 1100 1101 1111 0111 10 N T LD1 {vT.4s}, [xN|SP], #16
0100 1100 1001 1111 0111 01 N T ST1 {vT.8h}, [xN|SP], #16
Note that #16 is implied and cannot be any other value.
FIXME does this assume that the host is little endian?
@@ -4000,6 +4001,7 @@
if ( (insn & 0xFFFFFC00) == 0x4C9F7C00 // ST1 {vT.2d}, [xN|SP], #16
|| (insn & 0xFFFFFC00) == 0x4CDF7C00 // LD1 {vT.2d}, [xN|SP], #16
|| (insn & 0xFFFFFC00) == 0x4C9F7800 // ST1 {vT.4s}, [xN|SP], #16
+ || (insn & 0xFFFFFC00) == 0x4CDF7800 // LD1 {vT.4s}, [xN|SP], #16
|| (insn & 0xFFFFFC00) == 0x4C9F7400 // ST1 {vT.8h}, [xN|SP], #16
) {
Bool isLD = INSN(22,22) == 1;
@@ -4359,6 +4361,15 @@
/*--- SIMD and FP instructions ---*/
/*------------------------------------------------------------*/
+/* begin FIXME -- rm temp scaffolding */
+static IRExpr* mk_CatEvenLanes64x2 ( IRTemp, IRTemp );
+static IRExpr* mk_CatOddLanes64x2 ( IRTemp, IRTemp );
+static IRExpr* mk_CatEvenLanes32x4 ( IRTemp, IRTemp );
+static IRExpr* mk_CatOddLanes32x4 ( IRTemp, IRTemp );
+static IRExpr* mk_CatEvenLanes16x8 ( IRTemp, IRTemp );
+static IRExpr* mk_CatOddLanes16x8 ( IRTemp, IRTemp );
+/* end FIXME -- rm temp scaffolding */
+
/* Generate N copies of |bit| in the bottom of a ULong. */
static ULong Replicate ( ULong bit, Int N )
{
@@ -4458,6 +4469,100 @@
}
+/* Generate IR to fold all lanes of the V128 value in 'src' as
+ characterised by the operator 'op', and return the result in the
+ bottom bits of a V128, with all other bits set to zero. */
+static IRTemp math_MINMAXV ( IRTemp src, IROp op )
+{
+ /* The basic idea is to use repeated applications of Iop_CatEven*
+ and Iop_CatOdd* operators to 'src' so as to clone each lane into
+ a complete vector. Then fold all those vectors with 'op' and
+ zero out all but the least significant lane. */
+ switch (op) {
+ case Iop_Min8Sx16: case Iop_Min8Ux16:
+ case Iop_Max8Sx16: case Iop_Max8Ux16: {
+ return IRTemp_INVALID; // ATC
+ }
+ case Iop_Min16Sx8: case Iop_Min16Ux8:
+ case Iop_Max16Sx8: case Iop_Max16Ux8: {
+ IRTemp x76543210 = src;
+ IRTemp x76547654 = newTemp(Ity_V128);
+ IRTemp x32103210 = newTemp(Ity_V128);
+ assign(x76547654, mk_CatOddLanes64x2 (x76543210, x76543210));
+ assign(x32103210, mk_CatEvenLanes64x2(x76543210, x76543210));
+ IRTemp x76767676 = newTemp(Ity_V128);
+ IRTemp x54545454 = newTemp(Ity_V128);
+ IRTemp x32323232 = newTemp(Ity_V128);
+ IRTemp x10101010 = newTemp(Ity_V128);
+ assign(x76767676, mk_CatOddLanes32x4 (x76547654, x76547654));
+ assign(x54545454, mk_CatEvenLanes32x4(x76547654, x76547654));
+ assign(x32323232, mk_CatOddLanes32x4 (x32103210, x32103210));
+ assign(x10101010, mk_CatEvenLanes32x4(x32103210, x32103210));
+ IRTemp x77777777 = newTemp(Ity_V128);
+ IRTemp x66666666 = newTemp(Ity_V128);
+ IRTemp x55555555 = newTemp(Ity_V128);
+ IRTemp x44444444 = newTemp(Ity_V128);
+ IRTemp x33333333 = newTemp(Ity_V128);
+ IRTemp x22222222 = newTemp(Ity_V128);
+ IRTemp x11111111 = newTemp(Ity_V128);
+ IRTemp x00000000 = newTemp(Ity_V128);
+ assign(x77777777, mk_CatOddLanes16x8 (x76767676, x76767676));
+ assign(x66666666, mk_CatEvenLanes16x8(x76767676, x76767676));
+ assign(x55555555, mk_CatOddLanes16x8 (x54545454, x54545454));
+ assign(x44444444, mk_CatEvenLanes16x8(x54545454, x54545454));
+ assign(x33333333, mk_CatOddLanes16x8 (x32323232, x32323232));
+ assign(x22222222, mk_CatEvenLanes16x8(x32323232, x32323232));
+ assign(x11111111, mk_CatOddLanes16x8 (x10101010, x10101010));
+ assign(x00000000, mk_CatEvenLanes16x8(x10101010, x10101010));
+ IRTemp max76 = newTemp(Ity_V128);
+ IRTemp max54 = newTemp(Ity_V128);
+ IRTemp max32 = newTemp(Ity_V128);
+ IRTemp max10 = newTemp(Ity_V128);
+ assign(max76, binop(op, mkexpr(x77777777), mkexpr(x66666666)));
+ assign(max54, binop(op, mkexpr(x55555555), mkexpr(x44444444)));
+ assign(max32, binop(op, mkexpr(x33333333), mkexpr(x22222222)));
+ assign(max10, binop(op, mkexpr(x11111111), mkexpr(x00000000)));
+ IRTemp max7654 = newTemp(Ity_V128);
+ IRTemp max3210 = newTemp(Ity_V128);
+ assign(max7654, binop(op, mkexpr(max76), mkexpr(max54)));
+ assign(max3210, binop(op, mkexpr(max32), mkexpr(max10)));
+ IRTemp max76543210 = newTemp(Ity_V128);
+ assign(max76543210, binop(op, mkexpr(max7654), mkexpr(max3210)));
+ IRTemp res = newTemp(Ity_V128);
+ assign(res, unop(Iop_ZeroHI112ofV128, mkexpr(max76543210)));
+ return res;
+ }
+ case Iop_Min32Sx4: case Iop_Min32Ux4:
+ case Iop_Max32Sx4: case Iop_Max32Ux4: {
+ IRTemp x3210 = src;
+ IRTemp x3232 = newTemp(Ity_V128);
+ IRTemp x1010 = newTemp(Ity_V128);
+ assign(x3232, mk_CatOddLanes64x2 (x3210, x3210));
+ assign(x1010, mk_CatEvenLanes64x2(x3210, x3210));
+ IRTemp x3333 = newTemp(Ity_V128);
+ IRTemp x2222 = newTemp(Ity_V128);
+ IRTemp x1111 = newTemp(Ity_V128);
+ IRTemp x0000 = newTemp(Ity_V128);
+ assign(x3333, mk_CatOddLanes32x4 (x3232, x3232));
+ assign(x2222, mk_CatEvenLanes32x4(x3232, x3232));
+ assign(x1111, mk_CatOddLanes32x4 (x1010, x1010));
+ assign(x0000, mk_CatEvenLanes32x4(x1010, x1010));
+ IRTemp max32 = newTemp(Ity_V128);
+ IRTemp max10 = newTemp(Ity_V128);
+ assign(max32, binop(op, mkexpr(x3333), mkexpr(x2222)));
+ assign(max10, binop(op, mkexpr(x1111), mkexpr(x0000)));
+ IRTemp max3210 = newTemp(Ity_V128);
+ assign(max3210, binop(op, mkexpr(max32), mkexpr(max10)));
+ IRTemp res = newTemp(Ity_V128);
+ assign(res, unop(Iop_ZeroHI96ofV128, mkexpr(max3210)));
+ return res;
+ }
+ default:
+ vassert(0);
+ }
+}
+
+
static
Bool dis_ARM64_simd_and_fp(/*MB_OUT*/DisResult* dres, UInt insn)
{
@@ -5059,7 +5164,8 @@
IRTemp t1 = newTemp(Ity_V128);
IRTemp t2 = newTemp(Ity_V128);
assign(t1, triop(op, mkexpr(rm), getQReg128(nn), getQReg128(mm)));
- assign(t2, zeroHI ? unop(Iop_ZeroHI64, mkexpr(t1)) : mkexpr(t1));
+ assign(t2, zeroHI ? unop(Iop_ZeroHI64ofV128, mkexpr(t1))
+ : mkexpr(t1));
putQReg128(dd, mkexpr(t2));
DIP("%s %s.%s, %s.%s, %s.%s\n", names[ix-1],
nameQReg128(dd), arr, nameQReg128(nn), arr, nameQReg128(mm), arr);
@@ -5092,7 +5198,8 @@
IROp op = isSUB ? opSUB[szBlg2] : opADD[szBlg2];
IRTemp t = newTemp(Ity_V128);
assign(t, binop(op, getQReg128(nn), getQReg128(mm)));
- putQReg128(dd, zeroHI ? unop(Iop_ZeroHI64, mkexpr(t)) : mkexpr(t));
+ putQReg128(dd, zeroHI ? unop(Iop_ZeroHI64ofV128, mkexpr(t))
+ : mkexpr(t));
const HChar* nm = isSUB ? "sub" : "add";
DIP("%s %s.%s, %s.%s, %s.%s\n", nm,
nameQReg128(dd), arrSpec,
@@ -5102,8 +5209,136 @@
/* else fall through */
}
+ /* ---------------- ADD/SUB (scalar) ---------------- */
+ /* 31 28 23 21 20 15 9 4
+ 010 11110 11 1 m 100001 n d ADD Dd, Dn, Dm
+ 011 11110 11 1 m 100001 n d SUB Dd, Dn, Dm
+ */
+ if (INSN(31,30) == BITS2(0,1) && INSN(28,21) == BITS8(1,1,1,1,0,1,1,1)
+ && INSN(15,10) == BITS6(1,0,0,0,0,1)) {
+ Bool isSUB = INSN(29,29) == 1;
+ UInt mm = INSN(20,16);
+ UInt nn = INSN(9,5);
+ UInt dd = INSN(4,0);
+ IRTemp res = newTemp(Ity_I64);
+ assign(res, binop(isSUB ? Iop_Sub64 : Iop_Add64,
+ getQRegLane(nn, 0, Ity_I64),
+ getQRegLane(mm, 0, Ity_I64)));
+ putQRegLane(dd, 0, mkexpr(res));
+ putQRegLane(dd, 1, mkU64(0));
+ DIP("%s %s, %s, %s\n", isSUB ? "sub" : "add",
+ nameQRegLO(dd, Ity_I64),
+ nameQRegLO(nn, Ity_I64), nameQRegLO(mm, Ity_I64));
+ return True;
+ }
+
+ /* ---------------- {S,U}{MIN,MAX} (vector) ---------------- */
+ /* 31 28 23 21 20 15 9 4
+ 0q0 01110 size 1 m 011011 n d SMIN Vd.T, Vn.T, Vm.T
+ 0q1 01110 size 1 m 011011 n d UMIN Vd.T, Vn.T, Vm.T
+ 0q0 01110 size 1 m 011001 n d SMAX Vd.T, Vn.T, Vm.T
+ 0q1 01110 size 1 m 011001 n d UMAX Vd.T, Vn.T, Vm.T
+ */
+ if (INSN(31,31) == 0 && INSN(28,24) == BITS5(0,1,1,1,0)
+ && INSN(21,21) == 1
+ && ((INSN(15,10) & BITS6(1,1,1,1,0,1)) == BITS6(0,1,1,0,0,1))) {
+ Bool isQ = INSN(30,30) == 1;
+ Bool isU = INSN(29,29) == 1;
+ UInt szBlg2 = INSN(23,22);
+ Bool isMAX = INSN(12,12) == 0;
+ UInt mm = INSN(20,16);
+ UInt nn = INSN(9,5);
+ UInt dd = INSN(4,0);
+ Bool zeroHI = False;
+ const HChar* arrSpec = "";
+ Bool ok = getLaneInfo_SIMPLE(&zeroHI, &arrSpec, isQ, szBlg2 );
+ if (ok) {
+ const IROp opMINS[4]
+ = { Iop_Min8Sx16, Iop_Min16Sx8, Iop_Min32Sx4, Iop_Min64Sx2 };
+ const IROp opMINU[4]
+ = { Iop_Min8Ux16, Iop_Min16Ux8, Iop_Min32Ux4, Iop_Min64Ux2 };
+ const IROp opMAXS[4]
+ = { Iop_Max8Sx16, Iop_Max16Sx8, Iop_Max32Sx4, Iop_Max64Sx2 };
+ const IROp opMAXU[4]
+ = { Iop_Max8Ux16, Iop_Max16Ux8, Iop_Max32Ux4, Iop_Max64Ux2 };
+ vassert(szBlg2 < 4);
+ IROp op = isMAX ? (isU ? opMAXU[szBlg2] : opMAXS[szBlg2])
+ : (isU ? opMINU[szBlg2] : opMINS[szBlg2]);
+ IRTemp t = newTemp(Ity_V128);
+ assign(t, binop(op, getQReg128(nn), getQReg128(mm)));
+ putQReg128(dd, zeroHI ? unop(Iop_ZeroHI64ofV128, mkexpr(t))
+ : mkexpr(t));
+ const HChar* nm = isMAX ? (isU ? "umax" : "smax")
+ : (isU ? "umin" : "smin");
+ DIP("%s %s.%s, %s.%s, %s.%s\n", nm,
+ nameQReg128(dd), arrSpec,
+ nameQReg128(nn), arrSpec, nameQReg128(mm), arrSpec);
+ return True;
+ }
+ /* else fall through */
+ }
+
+ /* -------------------- {S,U}{MIN,MAX}V -------------------- */
+ /* 31 28 23 21 16 15 9 4
+ 0q0 01110 size 11000 1 101010 n d SMINV Vd, Vn.T
+ 0q1 01110 size 11000 1 101010 n d UMINV Vd, Vn.T
+ 0q0 01110 size 11000 0 101010 n d SMAXV Vd, Vn.T
+ 0q1 01110 size 11000 0 101010 n d UMAXV Vd, Vn.T
+ */
+ if (INSN(31,31) == 0 && INSN(28,24) == BITS5(0,1,1,1,0)
+ && INSN(21,17) == BITS5(1,1,0,0,0)
+ && INSN(15,10) == BITS6(1,0,1,0,1,0)) {
+ Bool isQ = INSN(30,30) == 1;
+ Bool isU = INSN(29,29) == 1;
+ UInt szBlg2 = INSN(23,22);
+ Bool isMAX = INSN(16,16) == 0;
+ UInt nn = INSN(9,5);
+ UInt dd = INSN(4,0);
+ Bool zeroHI = False;
+ const HChar* arrSpec = "";
+ Bool ok = getLaneInfo_SIMPLE(&zeroHI, &arrSpec, isQ, szBlg2);
+ if (ok) {
+ if (szBlg2 == 3) ok = False;
+ if (szBlg2 == 2 && !isQ) ok = False;
+ }
+ if (ok) {
+ const IROp opMINS[3]
+ = { Iop_Min8Sx16, Iop_Min16Sx8, Iop_Min32Sx4 };
+ const IROp opMINU[3]
+ = { Iop_Min8Ux16, Iop_Min16Ux8, Iop_Min32Ux4 };
+ const IROp opMAXS[3]
+ = { Iop_Max8Sx16, Iop_Max16Sx8, Iop_Max32Sx4 };
+ const IROp opMAXU[3]
+ = { Iop_Max8Ux16, Iop_Max16Ux8, Iop_Max32Ux4 };
+ vassert(szBlg2 < 3);
+ IROp op = isMAX ? (isU ? opMAXU[szBlg2] : opMAXS[szBlg2])
+ : (isU ? opMINU[szBlg2] : opMINS[szBlg2]);
+ IRTemp tN1 = newTemp(Ity_V128);
+ assign(tN1, getQReg128(nn));
+ /* If Q == 0, we're just folding lanes in the lower half of
+ the value. In which case, copy the lower half of the
+ source into the upper half, so we can then treat it the
+ same as the full width case. */
+ IRTemp tN2 = newTemp(Ity_V128);
+ assign(tN2, zeroHI ? mk_CatOddLanes64x2(tN1,tN1) : mkexpr(tN1));
+ IRTemp res = math_MINMAXV(tN2, op);
+ if (res == IRTemp_INVALID)
+ return False; /* means math_MINMAXV
+ doesn't handle this case yet */
+ putQReg128(dd, mkexpr(res));
+ const HChar* nm = isMAX ? (isU ? "umaxv" : "smaxv")
+ : (isU ? "uminv" : "sminv");
+ const IRType tys[3] = { Ity_I8, Ity_I16, Ity_I32 };
+ IRType laneTy = tys[szBlg2];
+ DIP("%s %s, %s.%s\n", nm,
+ nameQRegLO(dd, laneTy), nameQReg128(nn), arrSpec);
+ return True;
+ }
+ /* else fall through */
+ }
+
/* -------------------- XTN{,2} -------------------- */
- /* 31 28 23 21 15 9 4
+ /* 31 28 23 21 15 9 4 XTN{,2} Vd.Tb, Vn.Ta
0q0 01110 size 100001 001010 n d
*/
if (INSN(31,31) == 0 && INSN(29,24) == BITS6(0,0,1,1,1,0)
@@ -5198,6 +5433,60 @@
/* else fall through */
}
+ /* ---------------- DUP (general, vector) ---------------- */
+ /* 31 28 23 20 15 9 4
+ 0q0 01110 000 imm5 000011 n d DUP Vd.T, Rn
+ Q=0 writes 64, Q=1 writes 128
+ imm5: xxxx1 8B(q=0) or 16b(q=1), R=W
+ xxx10 4H(q=0) or 8H(q=1), R=W
+ xx100 2S(q=0) or 4S(q=1), R=W
+ x1000 Invalid(q=0) or 2D(q=1), R=X
+ x0000 Invalid(q=0) or Invalid(q=1)
+ */
+ if (INSN(31,31) == 0 && INSN(29,21) == BITS9(0,0,1,1,1,0,0,0,0)
+ && INSN(15,10) == BITS6(0,0,0,0,1,1)) {
+ Bool isQ = INSN(30,30) == 1;
+ UInt imm5 = INSN(20,16);
+ UInt nn = INSN(9,5);
+ UInt dd = INSN(4,0);
+ IRTemp w0 = newTemp(Ity_I64);
+ const HChar* arT = "??";
+ IRType laneTy = Ity_INVALID;
+ if (imm5 & 1) {
+ arT = isQ ? "16b" : "8b";
+ laneTy = Ity_I8;
+ assign(w0, unop(Iop_8Uto64, unop(Iop_64to8, getIReg64orZR(nn))));
+ }
+ else if (imm5 & 2) {
+ arT = isQ ? "8h" : "4h";
+ laneTy = Ity_I16;
+ assign(w0, unop(Iop_16Uto64, unop(Iop_64to16, getIReg64orZR(nn))));
+ }
+ else if (imm5 & 4) {
+ arT = isQ ? "4s" : "2s";
+ laneTy = Ity_I32;
+ assign(w0, unop(Iop_32Uto64, unop(Iop_64to32, getIReg64orZR(nn))));
+ }
+ else if ((imm5 & 8) && isQ) {
+ arT = "2d";
+ laneTy = Ity_I64;
+ assign(w0, getIReg64orZR(nn));
+ }
+ else {
+ /* invalid; leave laneTy unchanged. */
+ }
+ /* */
+ if (laneTy != Ity_INVALID) {
+ IRTemp w1 = math_DUP_TO_64(w0, laneTy);
+ putQReg128(dd, binop(Iop_64HLtoV128,
+ isQ ? mkexpr(w1) : mkU64(0), mkexpr(w1)));
+ DIP("dup %s.%s, %s\n",
+ nameQReg128(dd), arT, nameIRegOrZR(laneTy == Ity_I64, nn));
+ return True;
+ }
+ /* else fall through */
+ }
+
/* FIXME Temporary hacks to get through ld.so FIXME */
/* ------------------ movi vD.4s, #0x0 ------------------ */
@@ -5209,20 +5498,6 @@
return True;
}
- /* ------------------ dup vD.2d, xN ------------------ */
- /* 0x4E 0x08 0000 11 xN(5) vD(5) */
- if ((insn & 0xFFFFFC00) == 0x4E080C00) {
- UInt nn = INSN(9,5);
- UInt dd = INSN(4,0);
- IRTemp src64 = newTemp(Ity_I64);
- assign(src64, getIReg64orZR(nn));
- IRTemp res = newTemp(Ity_V128);
- assign(res, binop(Iop_64HLtoV128, mkexpr(src64), mkexpr(src64)));
- putQReg128(dd, mkexpr(res));
- DIP("dup v%u.2d, x%u\n", dd, nn);
- return True;
- }
-
/* ---------------- MOV vD.16b, vN.16b ---------------- */
/* 31 23 20 15 9 4
010 01110 101 m 000111 n d ORR vD.16b, vN.16b, vM.16b
@@ -5517,6 +5792,201 @@
return dres;
}
+////////////////////////////////////////////////////////////////////////
+////////////////////////////////////////////////////////////////////////
+
+/* Spare code for doing reference implementations of various 128-bit
+ SIMD interleaves/deinterleaves/concatenation ops. For 64-bit
+ equivalents see the end of guest_arm_toIR.c. */
+
+////////////////////////////////////////////////////////////////
+// 64x2 operations
+//
+static IRExpr* mk_CatEvenLanes64x2 ( IRTemp a10, IRTemp b10 )
+{
+ // returns a0 b0
+ return binop(Iop_64HLtoV128, unop(Iop_V128to64, mkexpr(a10)),
+ unop(Iop_V128to64, mkexpr(b10)));
+}
+
+static IRExpr* mk_CatOddLanes64x2 ( IRTemp a10, IRTemp b10 )
+{
+ // returns a1 b1
+ return binop(Iop_64HLtoV128, unop(Iop_V128HIto64, mkexpr(a10)),
+ unop(Iop_V128HIto64, mkexpr(b10)));
+}
+
+
+////////////////////////////////////////////////////////////////
+// 32x4 operations
+//
+
+// Split a 128 bit value into 4 32 bit ones, in 64-bit IRTemps with
+// the top halves guaranteed to be zero.
+static void breakV128to32s ( IRTemp* out3, IRTemp* out2, IRTemp* out1,
+ IRTemp* out0, IRTemp v128 )
+{
+ if (out3) *out3 = newTemp(Ity_I64);
+ if (out2) *out2 = newTemp(Ity_I64);
+ if (out1) *out1 = newTemp(Ity_I64);
+ if (out0) *out0 = newTemp(Ity_I64);
+ IRTemp hi64 = newTemp(Ity_I64);
+ IRTemp lo64 = newTemp(Ity_I64);
+ assign(hi64, unop(Iop_V128HIto64, mkexpr(v128)) );
+ assign(lo64, unop(Iop_V128to64, mkexpr(v128)) );
+ if (out3) assign(*out3, binop(Iop_Shr64, mkexpr(hi64), mkU8(32)));
+ if (out2) assign(*out2, binop(Iop_And64, mkexpr(hi64), mkU64(0xFFFFFFFF)));
+ if (out1) assign(*out1, binop(Iop_Shr64, mkexpr(lo64), mkU8(32)));
+ if (out0) assign(*out0, binop(Iop_And64, mkexpr(lo64), mkU64(0xFFFFFFFF)));
+}
+
+// Make a V128 bit value from 4 32 bit ones, each of which is in a 64 bit
+// IRTemp.
+static IRTemp mkV128from32s ( IRTemp in3, IRTemp in2, IRTemp in1, IRTemp in0 )
+{
+ IRTemp hi64 = newTemp(Ity_I64);
+ IRTemp lo64 = newTemp(Ity_I64);
+ assign(hi64,
+ binop(Iop_Or64,
+ binop(Iop_Shl64, mkexpr(in3), mkU8(32)),
+ binop(Iop_And64, mkexpr(in2), mkU64(0xFFFFFFFF))));
+ assign(lo64,
+ binop(Iop_Or64,
+ binop(Iop_Shl64, mkexpr(in1), mkU8(32)),
+ binop(Iop_And64, mkexpr(in0), mkU64(0xFFFFFFFF))));
+ IRTemp res = newTemp(Ity_V128);
+ assign(res, binop(Iop_64HLtoV128, mkexpr(hi64), mkexpr(lo64)));
+ return res;
+}
+
+static IRExpr* mk_CatEvenLanes32x4 ( IRTemp a3210, IRTemp b3210 )
+{
+ // returns a2 a0 b2 b0
+ IRTemp a2, a0, b2, b0;
+ breakV128to32s(NULL, &a2, NULL, &a0, a3210);
+ breakV128to32s(NULL, &b2, NULL, &b0, b3210);
+ return mkexpr(mkV128from32s(a2, a0, b2, b0));
+}
+
+static IRExpr* mk_CatOddLanes32x4 ( IRTemp a3210, IRTemp b3210 )
+{
+ // returns a3 a1 b3 b1
+ IRTemp a3, a1, b3, b1;
+ breakV128to32s(&a3, NULL, &a1, NULL, a3210);
+ breakV128to32s(&b3, NULL, &b1, NULL, b3210);
+ return mkexpr(mkV128from32s(a3, a1, b3, b1));
+}
+
+
+////////////////////////////////////////////////////////////////
+// 16x8 operations
+//
+
+static void breakV128to16s ( IRTemp* out7, IRTemp* out6, IRTemp* out5,
+ IRTemp* out4, IRTemp* out3, IRTemp* out2,
+ IRTemp* out1,IRTemp* out0, IRTemp v128 )
+{
+ if (out7) *out7 = newTemp(Ity_I64);
+ if (out6) *out6 = newTemp(Ity_I64);
+ if (out5) *out5 = newTemp(Ity_I64);
+ if (out4) *out4 = newTemp(Ity_I64);
+ if (out3) *out3 = newTemp(Ity_I64);
+ if (out2) *out2 = newTemp(Ity_I64);
+ if (out1) *out1 = newTemp(Ity_I64);
+ if (out0) *out0 = newTemp(Ity_I64);
+ IRTemp hi64 = newTemp(Ity_I64);
+ IRTemp lo64 = newTemp(Ity_I64);
+ assign(hi64, unop(Iop_V128HIto64, mkexpr(v128)) );
+ assign(lo64, unop(Iop_V128to64, mkexpr(v128)) );
+ if (out7)
+ assign(*out7, binop(Iop_And64,
+ binop(Iop_Shr64, mkexpr(hi64), mkU8(48)),
+ mkU64(0xFFFF)));
+ if (out6)
+ assign(*out6, binop(Iop_And64,
+ binop(Iop_Shr64, mkexpr(hi64), mkU8(32)),
+ mkU64(0xFFFF)));
+ if (out5)
+ assign(*out5, binop(Iop_And64,
+ binop(Iop_Shr64, mkexpr(hi64), mkU8(16)),
+ mkU64(0xFFFF)));
+ if (out4)
+ assign(*out4, binop(Iop_And64, mkexpr(hi64), mkU64(0xFFFF)));
+ if (out3)
+ assign(*out3, binop(Iop_And64,
+ binop(Iop_Shr64, mkexpr(lo64), mkU8(48)),
+ mkU64(0xFFFF)));
+ if (out2)
+ assign(*out2, binop(Iop_And64,
+ binop(Iop_Shr64, mkexpr(lo64), mkU8(32)),
+ mkU64(0xFFFF)));
+ if (out1)
+ assign(*out1, binop(Iop_And64,
+ binop(Iop_Shr64, mkexpr(lo64), mkU8(16)),
+ mkU64(0xFFFF)));
+ if (out0)
+ assign(*out0, binop(Iop_And64, mkexpr(lo64), mkU64(0xFFFF)));
+}
+
+static IRTemp mkV128from16s ( IRTemp in7, IRTemp in6, IRTemp in5, IRTemp in4,
+ IRTemp in3, IRTemp in2, IRTemp in1, IRTemp in0 )
+{
+ IRTemp hi64 = newTemp(Ity_I64);
+ IRTemp lo64 = newTemp(Ity_I64);
+ assign(hi64,
+ binop(Iop_Or64,
+ binop(Iop_Or64,
+ binop(Iop_Shl64,
+ binop(Iop_And64, mkexpr(in7), mkU64(0xFFFF)),
+ mkU8(48)),
+ binop(Iop_Shl64,
+ binop(Iop_And64, mkexpr(in6), mkU64(0xFFFF)),
+ mkU8(32))),
+ binop(Iop_Or64,
+ binop(Iop_Shl64,
+ binop(Iop_And64, mkexpr(in5), mkU64(0xFFFF)),
+ mkU8(16)),
+ binop(Iop_And64,
+ mkexpr(in4), mkU64(0xFFFF)))));
+ assign(lo64,
+ binop(Iop_Or64,
+ binop(Iop_Or64,
+ binop(Iop_Shl64,
+ binop(Iop_And64, mkexpr(in3), mkU64(0xFFFF)),
+ mkU8(48)),
+ binop(Iop_Shl64,
+ binop(Iop_And64, mkexpr(in2), mkU64(0xFFFF)),
+ mkU8(32))),
+ binop(Iop_Or64,
+ binop(Iop_Shl64,
+ binop(Iop_And64, mkexpr(in1), mkU64(0xFFFF)),
+ mkU8(16)),
+ binop(Iop_And64,
+ mkexpr(in0), mkU64(0xFFFF)))));
+ IRTemp res = newTemp(Ity_V128);
+ assign(res, binop(Iop_64HLtoV128, mkexpr(hi64), mkexpr(lo64)));
+ return res;
+}
+
+static IRExpr* mk_CatEvenLanes16x8 ( IRTemp a76543210, IRTemp b76543210 )
+{
+ // returns a6 a4 a2 a0 b6 b4 b2 b0
+ IRTemp a6, a4, a2, a0, b6, b4, b2, b0;
+ breakV128to16s(NULL, &a6, NULL, &a4, NULL, &a2, NULL, &a0, a76543210);
+ breakV128to16s(NULL, &b6, NULL, &b4, NULL, &b2, NULL, &b0, b76543210);
+ return mkexpr(mkV128from16s(a6, a4, a2, a0, b6, b4, b2, b0));
+}
+
+static IRExpr* mk_CatOddLanes16x8 ( IRTemp a76543210, IRTemp b76543210 )
+{
+ // returns a7 a5 a3 a1 b7 b5 b3 b1
+ IRTemp a7, a5, a3, a1, b7, b5, b3, b1;
+ breakV128to16s(&a7, NULL, &a5, NULL, &a3, NULL, &a1, NULL, a76543210);
+ breakV128to16s(&b7, NULL, &b5, NULL, &b3, NULL, &b1, NULL, b76543210);
+ return mkexpr(mkV128from16s(a7, a5, a3, a1, b7, b5, b3, b1));
+}
+
+
/*--------------------------------------------------------------------*/
/*--- end guest_arm64_toIR.c ---*/
/*--------------------------------------------------------------------*/
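[Editor's note] The lane arithmetic that the new 16x8 helpers implement can be sketched on plain 128-bit Python ints (the function names mirror the patch; the Python itself is illustrative, not VEX code):

```python
# Model a V128 value as a 128-bit Python int and mimic the helpers added
# to guest_arm64_toIR.c: breakV128to16s / mkV128from16s / mk_CatEvenLanes16x8.

def break_v128_to_16s(v128):
    """Split a 128-bit value into eight 16-bit lanes, lane 7 (most
    significant) first -- the order the C helper's out7..out0 use."""
    return [(v128 >> (16 * i)) & 0xFFFF for i in range(7, -1, -1)]

def mk_v128_from_16s(lanes):
    """Reassemble eight 16-bit lanes (lane 7 first) into a 128-bit value."""
    v = 0
    for lane in lanes:
        v = (v << 16) | (lane & 0xFFFF)
    return v

def cat_even_lanes_16x8(a, b):
    """mk_CatEvenLanes16x8: result lanes are a6 a4 a2 a0 b6 b4 b2 b0."""
    a7, a6, a5, a4, a3, a2, a1, a0 = break_v128_to_16s(a)
    b7, b6, b5, b4, b3, b2, b1, b0 = break_v128_to_16s(b)
    return mk_v128_from_16s([a6, a4, a2, a0, b6, b4, b2, b0])
```

With lane values 7..0 in `a` and 15..8 in `b`, the even lanes of `a` land in the high half of the result and the even lanes of `b` in the low half, matching the `// returns a6 a4 a2 a0 b6 b4 b2 b0` comment in the patch.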
Modified: trunk/priv/host_arm64_defs.c
==============================================================================
--- trunk/priv/host_arm64_defs.c (original)
+++ trunk/priv/host_arm64_defs.c Wed Feb 5 11:01:19 2014
@@ -861,6 +861,16 @@
case ARM64vecb_FSUB64x2: *nm = "fsub"; *ar = "2d"; return;
case ARM64vecb_FMUL64x2: *nm = "fmul"; *ar = "2d"; return;
case ARM64vecb_FDIV64x2: *nm = "fdiv"; *ar = "2d"; return;
+ case ARM64vecb_FADD32x4: *nm = "fadd"; *ar = "4s"; return;
+ case ARM64vecb_FSUB32x4: *nm = "fsub"; *ar = "4s"; return;
+ case ARM64vecb_FMUL32x4: *nm = "fmul"; *ar = "4s"; return;
+ case ARM64vecb_FDIV32x4: *nm = "fdiv"; *ar = "4s"; return;
+ case ARM64vecb_UMAX32x4: *nm = "umax"; *ar = "4s"; return;
+ case ARM64vecb_UMAX16x8: *nm = "umax"; *ar = "8h"; return;
+ case ARM64vecb_UMIN32x4: *nm = "umin"; *ar = "4s"; return;
+ case ARM64vecb_UMIN16x8: *nm = "umin"; *ar = "8h"; return;
+ case ARM64vecb_AND: *nm = "and "; *ar = "all"; return;
+ case ARM64vecb_ORR: *nm = "orr "; *ar = "all"; return;
default: vpanic("showARM64VecBinOp");
}
}
@@ -3214,13 +3224,16 @@
#define X000000 BITS8(0,0, 0,0,0,0,0,0)
#define X000100 BITS8(0,0, 0,0,0,1,0,0)
+#define X000111 BITS8(0,0, 0,0,0,1,1,1)
#define X001000 BITS8(0,0, 0,0,1,0,0,0)
#define X001001 BITS8(0,0, 0,0,1,0,0,1)
#define X001010 BITS8(0,0, 0,0,1,0,1,0)
#define X001111 BITS8(0,0, 0,0,1,1,1,1)
#define X010000 BITS8(0,0, 0,1,0,0,0,0)
#define X010001 BITS8(0,0, 0,1,0,0,0,1)
+#define X011001 BITS8(0,0, 0,1,1,0,0,1)
#define X011010 BITS8(0,0, 0,1,1,0,1,0)
+#define X011011 BITS8(0,0, 0,1,1,0,1,1)
#define X011111 BITS8(0,0, 0,1,1,1,1,1)
#define X100001 BITS8(0,0, 1,0,0,0,0,1)
#define X100100 BITS8(0,0, 1,0,0,1,0,0)
@@ -4748,9 +4761,19 @@
011 01110 10 1 m 100001 n d SUB Vd.4s, Vn.4s, Vm.4s
011 01110 01 1 m 100001 n d SUB Vd.8h, Vn.8h, Vm.8h
010 01110 01 1 m 110101 n d FADD Vd.2d, Vn.2d, Vm.2d
+ 010 01110 00 1 m 110101 n d FADD Vd.4s, Vn.4s, Vm.4s
010 01110 11 1 m 110101 n d FSUB Vd.2d, Vn.2d, Vm.2d
+ 010 01110 10 1 m 110101 n d FSUB Vd.4s, Vn.4s, Vm.4s
011 01110 01 1 m 110111 n d FMUL Vd.2d, Vn.2d, Vm.2d
+ 011 01110 00 1 m 110111 n d FMUL Vd.4s, Vn.4s, Vm.4s
011 01110 01 1 m 111111 n d FDIV Vd.2d, Vn.2d, Vm.2d
+ 011 01110 00 1 m 111111 n d FDIV Vd.4s, Vn.4s, Vm.4s
+ 011 01110 10 1 m 011001 n d UMAX Vd.4s, Vn.4s, Vm.4s
+ 011 01110 01 1 m 011001 n d UMAX Vd.8h, Vn.8h, Vm.8h
+ 011 01110 10 1 m 011011 n d UMIN Vd.4s, Vn.4s, Vm.4s
+ 011 01110 01 1 m 011011 n d UMIN Vd.8h, Vn.8h, Vm.8h
+ 010 01110 00 1 m 000111 n d AND Vd, Vn, Vm
+ 010 01110 10 1 m 000111 n d ORR Vd, Vn, Vm
*/
UInt vD = qregNo(i->ARM64in.VBinV.dst);
UInt vN = qregNo(i->ARM64in.VBinV.argL);
@@ -4759,6 +4782,7 @@
case ARM64vecb_ADD64x2:
*p++ = X_3_8_5_6_5_5(X010, X01110111, vM, X100001, vN, vD);
break;
+ // ADD32x4
case ARM64vecb_SUB64x2:
*p++ = X_3_8_5_6_5_5(X011, X01110111, vM, X100001, vN, vD);
break;
@@ -4771,15 +4795,46 @@
case ARM64vecb_FADD64x2:
*p++ = X_3_8_5_6_5_5(X010, X01110011, vM, X110101, vN, vD);
break;
+ case ARM64vecb_FADD32x4:
+ *p++ = X_3_8_5_6_5_5(X010, X01110001, vM, X110101, vN, vD);
+ break;
case ARM64vecb_FSUB64x2:
*p++ = X_3_8_5_6_5_5(X010, X01110111, vM, X110101, vN, vD);
break;
+ case ARM64vecb_FSUB32x4:
+ *p++ = X_3_8_5_6_5_5(X010, X01110101, vM, X110101, vN, vD);
+ break;
case ARM64vecb_FMUL64x2:
*p++ = X_3_8_5_6_5_5(X011, X01110011, vM, X110111, vN, vD);
break;
+ case ARM64vecb_FMUL32x4:
+ *p++ = X_3_8_5_6_5_5(X011, X01110001, vM, X110111, vN, vD);
+ break;
case ARM64vecb_FDIV64x2:
*p++ = X_3_8_5_6_5_5(X011, X01110011, vM, X111111, vN, vD);
break;
+ case ARM64vecb_FDIV32x4:
+ *p++ = X_3_8_5_6_5_5(X011, X01110001, vM, X111111, vN, vD);
+ break;
+ case ARM64vecb_UMAX32x4:
+ *p++ = X_3_8_5_6_5_5(X011, X01110101, vM, X011001, vN, vD);
+ break;
+ case ARM64vecb_UMAX16x8:
+ *p++ = X_3_8_5_6_5_5(X011, X01110011, vM, X011001, vN, vD);
+ break;
+ case ARM64vecb_UMIN32x4:
+ *p++ = X_3_8_5_6_5_5(X011, X01110101, vM, X011011, vN, vD);
+ break;
+ case ARM64vecb_UMIN16x8:
+ *p++ = X_3_8_5_6_5_5(X011, X01110011, vM, X011011, vN, vD);
+ break;
+ case ARM64vecb_ORR:
+ goto bad; //ATC
+ *p++ = X_3_8_5_6_5_5(X010, X01110101, vM, X000111, vN, vD);
+ break;
+ case ARM64vecb_AND:
+ *p++ = X_3_8_5_6_5_5(X010, X01110001, vM, X000111, vN, vD);
+ break;
default:
goto bad;
}
@@ -5690,13 +5745,25 @@
case ARM64in_VImmQ: {
UInt rQ = qregNo(i->ARM64in.VImmQ.rQ);
UShort imm = i->ARM64in.VImmQ.imm;
- if (imm == 0) {
+ if (imm == 0x0000) {
/* movi rQ.4s, #0x0 == 0x4F 0x00 0x04 000 rQ */
vassert(rQ < 32);
*p++ = 0x4F000400 | rQ;
goto done;
}
- goto bad; /* zero is the only handled case right now */
+ if (imm == 0x0003) {
+ /* movi rD, #0xFFFF == 0x2F 0x00 0xE4 011 rD */
+ vassert(rQ < 32);
+ *p++ = 0x2F00E460 | rQ;
+ goto done;
+ }
+ if (imm == 0x000F) {
+ /* movi rD, #0xFFFFFFFF == 0x2F 0x00 0xE5 111 rD */
+ vassert(rQ < 32);
+ *p++ = 0x2F00E5E0 | rQ;
+ goto done;
+ }
+ goto bad; /* no other handled cases right now */
}
case ARM64in_VDfromX: {
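[Editor's note] All of the new encodings above go through the emitter's X_3_8_5_6_5_5 helper, which packs six fields of widths 3+8+5+6+5+5 = 32 bits into one AArch64 instruction word. A quick sketch of that packing, checked against the UMAX Vd.4s case from the patch (field values taken from the comment block above; the Python is illustrative):

```python
def X_3_8_5_6_5_5(f1, f2, f3, f4, f5, f6):
    """Pack six bit-fields of widths 3, 8, 5, 6, 5, 5 (MSB first) into a
    32-bit AArch64 instruction word."""
    assert f1 < (1 << 3) and f2 < (1 << 8) and f3 < (1 << 5)
    assert f4 < (1 << 6) and f5 < (1 << 5) and f6 < (1 << 5)
    return (f1 << 29) | (f2 << 21) | (f3 << 16) | (f4 << 10) | (f5 << 5) | f6

# UMAX Vd.4s, Vn.4s, Vm.4s with d = n = m = 0:
#   011 01110101 m 011001 n d   (per the encoding table in the patch)
insn = X_3_8_5_6_5_5(0b011, 0b01110101, 0, 0b011001, 0, 0)
```

For the zero registers this yields 0x6EA06400, the architectural encoding of `umax v0.4s, v0.4s, v0.4s`.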
Modified: trunk/priv/host_arm64_defs.h
==============================================================================
--- trunk/priv/host_arm64_defs.h (original)
+++ trunk/priv/host_arm64_defs.h Wed Feb 5 11:01:19 2014
@@ -317,6 +317,16 @@
ARM64vecb_FSUB64x2,
ARM64vecb_FMUL64x2,
ARM64vecb_FDIV64x2,
+ ARM64vecb_FADD32x4,
+ ARM64vecb_FSUB32x4,
+ ARM64vecb_FMUL32x4,
+ ARM64vecb_FDIV32x4,
+ ARM64vecb_UMAX32x4,
+ ARM64vecb_UMAX16x8,
+ ARM64vecb_UMIN32x4,
+ ARM64vecb_UMIN16x8,
+ ARM64vecb_AND,
+ ARM64vecb_ORR,
ARM64vecb_INVALID
}
ARM64VecBinOp;
Modified: trunk/priv/host_arm64_isel.c
==============================================================================
--- trunk/priv/host_arm64_isel.c (original)
+++ trunk/priv/host_arm64_isel.c Wed Feb 5 11:01:19 2014
@@ -4340,8 +4340,26 @@
goto v128_expr_bad;
}
-//ZZ if (e->tag == Iex_Unop) {
-//ZZ switch (e->Iex.Unop.op) {
+ if (e->tag == Iex_Unop) {
+
+ /* Iop_ZeroHIXXofV128 cases */
+ UShort imm16 = 0;
+ switch (e->Iex.Unop.op) {
+ case Iop_ZeroHI96ofV128: imm16 = 0x000F; break;
+ case Iop_ZeroHI112ofV128: imm16 = 0x0003; break;
+ default: break;
+ }
+ if (imm16 != 0) {
+ HReg src = iselV128Expr(env, e->Iex.Unop.arg);
+ HReg imm = newVRegV(env);
+ HReg res = newVRegV(env);
+ addInstr(env, ARM64Instr_VImmQ(imm, imm16));
+ addInstr(env, ARM64Instr_VBinV(ARM64vecb_AND, res, src, imm));
+ return res;
+ }
+
+ /* Other cases */
+ switch (e->Iex.Unop.op) {
//ZZ case Iop_NotV128: {
//ZZ DECLARE_PATTERN(p_veqz_8x16);
//ZZ DECLARE_PATTERN(p_veqz_16x8);
@@ -4807,11 +4825,11 @@
//ZZ res, arg, 0, True));
//ZZ return res;
//ZZ }
-//ZZ /* ... */
-//ZZ default:
-//ZZ break;
-//ZZ }
-//ZZ }
+ /* ... */
+ default:
+ break;
+ } /* switch on the unop */
+ } /* if (e->tag == Iex_Unop) */
if (e->tag == Iex_Binop) {
switch (e->Iex.Binop.op) {
@@ -4849,6 +4867,10 @@
//ZZ case Iop_Add8x16:
//ZZ case Iop_Add16x8:
//ZZ case Iop_Add32x4:
+ case Iop_Max32Ux4:
+ case Iop_Max16Ux8:
+ case Iop_Min32Ux4:
+ case Iop_Min16Ux8:
case Iop_Add64x2:
case Iop_Sub64x2:
case Iop_Sub32x4:
@@ -4858,10 +4880,14 @@
HReg argR = iselV128Expr(env, e->Iex.Binop.arg2);
ARM64VecBinOp op = ARM64vecb_INVALID;
switch (e->Iex.Binop.op) {
- case Iop_Add64x2: op = ARM64vecb_ADD64x2; break;
- case Iop_Sub64x2: op = ARM64vecb_SUB64x2; break;
- case Iop_Sub32x4: op = ARM64vecb_SUB32x4; break;
- case Iop_Sub16x8: op = ARM64vecb_SUB16x8; break;
+ case Iop_Max32Ux4: op = ARM64vecb_UMAX32x4; break;
+ case Iop_Max16Ux8: op = ARM64vecb_UMAX16x8; break;
+ case Iop_Min32Ux4: op = ARM64vecb_UMIN32x4; break;
+ case Iop_Min16Ux8: op = ARM64vecb_UMIN16x8; break;
+ case Iop_Add64x2: op = ARM64vecb_ADD64x2; break;
+ case Iop_Sub64x2: op = ARM64vecb_SUB64x2; break;
+ case Iop_Sub32x4: op = ARM64vecb_SUB32x4; break;
+ case Iop_Sub16x8: op = ARM64vecb_SUB16x8; break;
default: vassert(0);
}
addInstr(env, ARM64Instr_VBinV(op, res, argL, argR));
@@ -5747,6 +5773,10 @@
case Iop_Sub64Fx2: vecbop = ARM64vecb_FSUB64x2; break;
case Iop_Mul64Fx2: vecbop = ARM64vecb_FMUL64x2; break;
case Iop_Div64Fx2: vecbop = ARM64vecb_FDIV64x2; break;
+ case Iop_Add32Fx4: vecbop = ARM64vecb_FADD32x4; break;
+ case Iop_Sub32Fx4: vecbop = ARM64vecb_FSUB32x4; break;
+ case Iop_Mul32Fx4: vecbop = ARM64vecb_FMUL32x4; break;
+ case Iop_Div32Fx4: vecbop = ARM64vecb_FDIV32x4; break;
default: break;
}
if (vecbop != ARM64vecb_INVALID) {
Modified: trunk/priv/ir_defs.c
==============================================================================
--- trunk/priv/ir_defs.c (original)
+++ trunk/priv/ir_defs.c Wed Feb 5 11:01:19 2014
@@ -694,7 +694,11 @@
case Iop_64UtoV128: vex_printf("64UtoV128"); return;
case Iop_SetV128lo64: vex_printf("SetV128lo64"); return;
- case Iop_ZeroHI64: vex_printf("ZeroHI64"); return;
+
+ case Iop_ZeroHI64ofV128: vex_printf("ZeroHI64ofV128"); return;
+ case Iop_ZeroHI96ofV128: vex_printf("ZeroHI96ofV128"); return;
+ case Iop_ZeroHI112ofV128: vex_printf("ZeroHI112ofV128"); return;
+ case Iop_ZeroHI120ofV128: vex_printf("ZeroHI120ofV128"); return;
case Iop_32UtoV128: vex_printf("32UtoV128"); return;
case Iop_V128to32: vex_printf("V128to32"); return;
@@ -2905,6 +2909,8 @@
case Iop_Abs8x16: case Iop_Abs16x8: case Iop_Abs32x4:
case Iop_CipherSV128:
case Iop_PwBitMtxXpose64x2:
+ case Iop_ZeroHI64ofV128: case Iop_ZeroHI96ofV128:
+ case Iop_ZeroHI112ofV128: case Iop_ZeroHI120ofV128:
UNARY(Ity_V128, Ity_V128);
case Iop_ShlV128: case Iop_ShrV128:
Modified: trunk/pub/libvex_ir.h
==============================================================================
--- trunk/pub/libvex_ir.h (original)
+++ trunk/pub/libvex_ir.h Wed Feb 5 11:01:19 2014
@@ -1363,8 +1363,11 @@
Iop_64UtoV128,
Iop_SetV128lo64,
- /* Copies lower 64 bits, zeroes out upper 64 bits. */
- Iop_ZeroHI64, // :: V128 -> V128
+ /* Copies lower 64/32/16/8 bits, zeroes out the rest. */
+ Iop_ZeroHI64ofV128, // :: V128 -> V128
+ Iop_ZeroHI96ofV128, // :: V128 -> V128
+ Iop_ZeroHI112ofV128, // :: V128 -> V128
+ Iop_ZeroHI120ofV128, // :: V128 -> V128
/* 32 <-> 128 bit vector */
Iop_32UtoV128,
|