From: Tom H. <to...@co...> - 2013-10-18 02:08:38
valgrind revision: 13653
VEX revision: 2790
C compiler: gcc (GCC) 4.8.1 20130603 (Red Hat 4.8.1-1)
GDB: GNU gdb (GDB) Fedora 7.6.1-41.fc19
Assembler: GNU assembler version 2.23.52.0.1-9.fc19 20130226
C library: GNU C Library (GNU libc) stable release version 2.17
uname -mrs: Linux 3.9.5-301.fc19.x86_64 x86_64
Vendor version: Fedora release 19 (Schrödinger's Cat)

Nightly build on bristol ( x86_64, Fedora 19 (Schrödinger's Cat) )
Started at 2013-10-18 02:32:18 BST
Ended at 2013-10-18 03:08:23 BST
Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 668 tests, 3 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
memcheck/tests/dw4 (stderr)
memcheck/tests/origin5-bz2 (stderr)
exp-sgcheck/tests/hackedbz2 (stderr)
Author: carll
Date: Fri Oct 18 01:20:11 2013
New Revision: 13653
Log:
This commit adds testing support for the following instructions:
vaddcuq, vadduqm, vaddecuq, vaddeuqm,
vsubcuq, vsubuqm, vsubecuq, vsubeuqm,
vbpermq and vgbbd.
This completes adding the Power ISA 2.07 support.
Bugzilla 325816
VEX commit id 2790
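
For readers checking the new .stdout.exp lines by hand, the quadword add/subtract results can be modelled in plain C. This is only a sketch of the arithmetic, not the VEX implementation: the type u128_t and the helpers add128, not128 and quad_model_self_check are hypothetical names introduced here. It assumes subtraction is computed as A + ~B + cin, with cin fixed at 1 for the non-extended subtract forms, and that the "carry" instructions return the carry out of bit 127. The operand and result values are copied from the expected output in this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a 128-bit value held as two 64-bit halves. */
typedef struct { uint64_t hi, lo; } u128_t;

/* Returns a + b + cin modulo 2^128 (cin must be 0 or 1) and stores the
 * carry out of the most significant bit in *carry_out. */
static u128_t add128(u128_t a, u128_t b, unsigned cin, unsigned *carry_out)
{
    u128_t r;
    uint64_t t = a.lo + b.lo;
    unsigned c_lo = t < a.lo;        /* wrapped while adding the low halves */
    r.lo = t + cin;
    c_lo |= r.lo < t;                /* wrapped while adding the carry-in */

    uint64_t u = a.hi + b.hi;
    unsigned c_hi = u < a.hi;        /* wrapped while adding the high halves */
    r.hi = u + c_lo;
    c_hi |= r.hi < u;                /* wrapped while propagating the low carry */

    *carry_out = c_hi;
    return r;
}

/* One's complement of a 128-bit value. */
static u128_t not128(u128_t x)
{
    u128_t r;
    r.hi = ~x.hi;
    r.lo = ~x.lo;
    return r;
}

/* Checks the model against lines from jm_vec_isa_2_07.stdout.exp. */
void quad_model_self_check(void)
{
    const u128_t a = { 0x0102030405060708ULL, 0x090a0b0c0e0d0e0fULL };
    const u128_t b = { 0xf1f2f3f4f5f6f7f8ULL, 0xf9fafbfcfefdfeffULL };
    unsigned c;
    u128_t r;

    /* vadduqm / vaddcuq: a + b with carry-in 0 */
    r = add128(a, b, 0, &c);
    assert(r.hi == 0xf2f4f6f8fafcff01ULL && r.lo == 0x030507090d0b0d0eULL);
    assert(c == 0);

    /* vsubuqm / vsubcuq: a - b modelled as a + ~b + 1 */
    r = add128(a, not128(b), 1, &c);
    assert(r.hi == 0x0f0f0f0f0f0f0f0fULL && r.lo == 0x0f0f0f0f0f0f0f10ULL);
    assert(c == 0);

    /* vsubeuqm / vsubecuq with carry-in 0: a + ~a + 0 is all ones */
    r = add128(a, not128(a), 0, &c);
    assert(r.hi == ~0ULL && r.lo == ~0ULL && c == 0);

    /* vaddeuqm / vaddecuq with carry-in 1: b + b + 1 */
    r = add128(b, b, 1, &c);
    assert(r.hi == 0xe3e5e7e9ebedeff1ULL && r.lo == 0xf3f5f7f9fdfbfdffULL);
    assert(c == 1);
}
```

Calling quad_model_self_check() from a test program exercises one line from each of the add, subtract and extended groups. Folding the "+ 1" of two's-complement subtraction into the carry-in is what lets a single adder cover all eight instructions.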
Modified:
trunk/memcheck/mc_translate.c
trunk/memcheck/tests/vbit-test/irops.c
trunk/none/tests/ppc32/jm_vec_isa_2_07.stdout.exp
trunk/none/tests/ppc64/jm_vec_isa_2_07.stdout.exp
trunk/none/tests/ppc64/test_isa_2_07_part1.c
Modified: trunk/memcheck/mc_translate.c
==============================================================================
--- trunk/memcheck/mc_translate.c (original)
+++ trunk/memcheck/mc_translate.c Fri Oct 18 01:20:11 2013
@@ -4152,6 +4152,9 @@
case Iop_Clz64x2:
return mkPCast64x2(mce, vatom);
+ case Iop_PwBitMtxXpose64x2:
+ return assignNew('V', mce, Ity_V128, unop(op, vatom));
+
case Iop_NarrowUn16to8x8:
case Iop_NarrowUn32to16x4:
case Iop_NarrowUn64to32x2:
Modified: trunk/memcheck/tests/vbit-test/irops.c
==============================================================================
--- trunk/memcheck/tests/vbit-test/irops.c (original)
+++ trunk/memcheck/tests/vbit-test/irops.c Fri Oct 18 01:20:11 2013
@@ -974,6 +974,7 @@
{ DEFOP(Iop_NCipherLV128, UNDEF_UNKNOWN), },
{ DEFOP(Iop_SHA512, UNDEF_UNKNOWN), },
{ DEFOP(Iop_SHA256, UNDEF_UNKNOWN), },
+ { DEFOP(Iop_PwBitMtxXpose64x2, UNDEF_UNKNOWN), },
};
Modified: trunk/none/tests/ppc32/jm_vec_isa_2_07.stdout.exp
==============================================================================
--- trunk/none/tests/ppc32/jm_vec_isa_2_07.stdout.exp (original)
+++ trunk/none/tests/ppc32/jm_vec_isa_2_07.stdout.exp Fri Oct 18 01:20:11 2013
@@ -336,6 +336,9 @@
vsbox: 0102030405060708 @@ 090a0b0c0e0d0e0f ==> 7c777bf26b6fc53001672bfeabd7ab76
vsbox: f1f2f3f4f5f6f7f8 @@ f9fafbfcfefdfeff ==> a1890dbfe6426841992d0fb0bb54bb16
+vgbbd: 0102030405060708 @@ 090a0b0c0e0d0e0f ==> 00000000011e66aa00000000ff1f6ba5
+vgbbd: f1f2f3f4f5f6f7f8 @@ f9fafbfcfefdfeff ==> ffffffff011e66aaffffffffff1f6ba5
+
vshasigmad: 0102030405060708 @@ 090a0b0c0e0d0e0f ==> 088207870e8c098d || 8b9e1b9b13149015
vshasigmad: 0102030405060708 @@ 090a0b0c0e0d0e0f ==> c8f5100c7844a0fc || e9b5916d0131c581
vshasigmad: 0102030405060708 @@ 090a0b0c0e0d0e0f ==> 592bfd4c0062b487 || fb4fb96f4cf02615
@@ -420,4 +423,65 @@
bcdsub.: 0000000000000000 || 0000000000000000 @@ 0000000000000000 || 0000000000000000 ==> 0000000000000000 || 000000000000000c
bcdsub.: 0000000000000000 || 0000000000000000 @@ 0000000000000000 || 0000000000000000 ==> 0000000000000000 || 000000000000000f
-All done. Tested 56 different instructions
+vaddcuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f ==> 00000000000000000000000000000000
+vaddcuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000000
+vaddcuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f ==> 00000000000000000000000000000000
+vaddcuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000001
+
+vadduqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f ==> 020406080a0c0e10121416181c1a1c1e
+vadduqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> f2f4f6f8fafcff01030507090d0b0d0e
+vadduqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f ==> f2f4f6f8fafcff01030507090d0b0d0e
+vadduqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> e3e5e7e9ebedeff1f3f5f7f9fdfbfdfe
+
+vsubcuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f ==> 00000000000000000000000000000001
+vsubcuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000000
+vsubcuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f ==> 00000000000000000000000000000001
+vsubcuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000001
+
+vsubuqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f ==> 00000000000000000000000000000000
+vsubuqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f10
+vsubuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f ==> f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0
+vsubuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000000
+
+vbpermq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f ==> 0000000000000000000000000000020a
+vbpermq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000000
+vbpermq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f ==> 0000000000000000000000000000e3ea
+vbpermq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000000
+
+vaddecuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> 00000000000000000000000000000000
+vaddecuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 00000000000000000000000000000000
+vaddecuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> 00000000000000000000000000000000
+vaddecuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 00000000000000000000000000000000
+vaddecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> 00000000000000000000000000000000
+vaddecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 00000000000000000000000000000000
+vaddecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> 00000000000000000000000000000001
+vaddecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 00000000000000000000000000000001
+
+vaddeuqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> 020406080a0c0e10121416181c1a1c1e
+vaddeuqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 020406080a0c0e10121416181c1a1c1f
+vaddeuqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> f2f4f6f8fafcff01030507090d0b0d0e
+vaddeuqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> f2f4f6f8fafcff01030507090d0b0d0f
+vaddeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> f2f4f6f8fafcff01030507090d0b0d0e
+vaddeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> f2f4f6f8fafcff01030507090d0b0d0f
+vaddeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> e3e5e7e9ebedeff1f3f5f7f9fdfbfdfe
+vaddeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> e3e5e7e9ebedeff1f3f5f7f9fdfbfdff
+
+vsubecuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> 00000000000000000000000000000000
+vsubecuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 00000000000000000000000000000001
+vsubecuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> 00000000000000000000000000000000
+vsubecuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 00000000000000000000000000000000
+vsubecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> 00000000000000000000000000000001
+vsubecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 00000000000000000000000000000001
+vsubecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> 00000000000000000000000000000000
+vsubecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 00000000000000000000000000000001
+
+vsubeuqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> ffffffffffffffffffffffffffffffff
+vsubeuqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 00000000000000000000000000000000
+vsubeuqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> 0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f
+vsubeuqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f10
+vsubeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0ef
+vsubeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0
+vsubeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> ffffffffffffffffffffffffffffffff
+vsubeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 00000000000000000000000000000000
+
+All done. Tested 66 different instructions
Modified: trunk/none/tests/ppc64/jm_vec_isa_2_07.stdout.exp
==============================================================================
--- trunk/none/tests/ppc64/jm_vec_isa_2_07.stdout.exp (original)
+++ trunk/none/tests/ppc64/jm_vec_isa_2_07.stdout.exp Fri Oct 18 01:20:11 2013
@@ -336,6 +336,9 @@
vsbox: 0102030405060708 @@ 090a0b0c0e0d0e0f ==> 7c777bf26b6fc53001672bfeabd7ab76
vsbox: f1f2f3f4f5f6f7f8 @@ f9fafbfcfefdfeff ==> a1890dbfe6426841992d0fb0bb54bb16
+vgbbd: 0102030405060708 @@ 090a0b0c0e0d0e0f ==> 00000000011e66aa00000000ff1f6ba5
+vgbbd: f1f2f3f4f5f6f7f8 @@ f9fafbfcfefdfeff ==> ffffffff011e66aaffffffffff1f6ba5
+
vshasigmad: 0102030405060708 @@ 090a0b0c0e0d0e0f ==> 088207870e8c098d || 8b9e1b9b13149015
vshasigmad: 0102030405060708 @@ 090a0b0c0e0d0e0f ==> c8f5100c7844a0fc || e9b5916d0131c581
vshasigmad: 0102030405060708 @@ 090a0b0c0e0d0e0f ==> 592bfd4c0062b487 || fb4fb96f4cf02615
@@ -420,4 +423,65 @@
bcdsub.: 0000000000000000 || 0000000000000000 @@ 0000000000000000 || 0000000000000000 ==> 0000000000000000 || 000000000000000c
bcdsub.: 0000000000000000 || 0000000000000000 @@ 0000000000000000 || 0000000000000000 ==> 0000000000000000 || 000000000000000f
-All done. Tested 56 different instructions
+vaddcuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f ==> 00000000000000000000000000000000
+vaddcuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000000
+vaddcuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f ==> 00000000000000000000000000000000
+vaddcuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000001
+
+vadduqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f ==> 020406080a0c0e10121416181c1a1c1e
+vadduqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> f2f4f6f8fafcff01030507090d0b0d0e
+vadduqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f ==> f2f4f6f8fafcff01030507090d0b0d0e
+vadduqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> e3e5e7e9ebedeff1f3f5f7f9fdfbfdfe
+
+vsubcuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f ==> 00000000000000000000000000000001
+vsubcuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000000
+vsubcuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f ==> 00000000000000000000000000000001
+vsubcuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000001
+
+vsubuqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f ==> 00000000000000000000000000000000
+vsubuqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f10
+vsubuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f ==> f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0
+vsubuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000000
+
+vbpermq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f ==> 0000000000000000000000000000020a
+vbpermq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000000
+vbpermq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f ==> 0000000000000000000000000000e3ea
+vbpermq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff ==> 00000000000000000000000000000000
+
+vaddecuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> 00000000000000000000000000000000
+vaddecuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 00000000000000000000000000000000
+vaddecuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> 00000000000000000000000000000000
+vaddecuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 00000000000000000000000000000000
+vaddecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> 00000000000000000000000000000000
+vaddecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 00000000000000000000000000000000
+vaddecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> 00000000000000000000000000000001
+vaddecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 00000000000000000000000000000001
+
+vaddeuqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> 020406080a0c0e10121416181c1a1c1e
+vaddeuqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 020406080a0c0e10121416181c1a1c1f
+vaddeuqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> f2f4f6f8fafcff01030507090d0b0d0e
+vaddeuqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> f2f4f6f8fafcff01030507090d0b0d0f
+vaddeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> f2f4f6f8fafcff01030507090d0b0d0e
+vaddeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> f2f4f6f8fafcff01030507090d0b0d0f
+vaddeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> e3e5e7e9ebedeff1f3f5f7f9fdfbfdfe
+vaddeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> e3e5e7e9ebedeff1f3f5f7f9fdfbfdff
+
+vsubecuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> 00000000000000000000000000000000
+vsubecuq: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 00000000000000000000000000000001
+vsubecuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> 00000000000000000000000000000000
+vsubecuq: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 00000000000000000000000000000000
+vsubecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> 00000000000000000000000000000001
+vsubecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 00000000000000000000000000000001
+vsubecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> 00000000000000000000000000000000
+vsubecuq: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 00000000000000000000000000000001
+
+vsubeuqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> ffffffffffffffffffffffffffffffff
+vsubeuqm: 0102030405060708090a0b0c0e0d0e0f @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> 00000000000000000000000000000000
+vsubeuqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> 0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f
+vsubeuqm: 0102030405060708090a0b0c0e0d0e0f @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f10
+vsubeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000000 ==> f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0ef
+vsubeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ 0102030405060708090a0b0c0e0d0e0f @@ f000000000000001 ==> f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0
+vsubeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000000 ==> ffffffffffffffffffffffffffffffff
+vsubeuqm: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f1f2f3f4f5f6f7f8f9fafbfcfefdfeff @@ f000000000000001 ==> 00000000000000000000000000000000
+
+All done. Tested 66 different instructions
Modified: trunk/none/tests/ppc64/test_isa_2_07_part1.c
==============================================================================
--- trunk/none/tests/ppc64/test_isa_2_07_part1.c (original)
+++ trunk/none/tests/ppc64/test_isa_2_07_part1.c Fri Oct 18 01:20:11 2013
@@ -242,6 +242,7 @@
PPC_ALTIVEC = 0x00040000,
PPC_FALTIVEC = 0x00050000,
PPC_ALTIVECD = 0x00060000, /* double word Altivec tests */
+ PPC_ALTIVECQ = 0x00070000,
PPC_FAMILY = 0x000F0000,
/* Flags: these may be combined, so use separate bitfields. */
PPC_CR = 0x01000000,
@@ -670,6 +671,74 @@
__asm__ __volatile__ ("bcdsub. %0, %1, %2, 0" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB));
}
+static void test_vaddcuq (void)
+{
+ __asm__ __volatile__ ("vaddcuq %0, %1, %2" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB));
+}
+
+static void test_vadduqm (void)
+{
+ __asm__ __volatile__ ("vadduqm %0, %1, %2" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB));
+}
+
+static void test_vaddecuq (void)
+{
+ __asm__ __volatile__ ("vaddecuq %0, %1, %2, %3" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB),"v" (vec_inC));
+}
+
+static void test_vaddeuqm (void)
+{
+ __asm__ __volatile__ ("vaddeuqm %0, %1, %2, %3" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB),"v" (vec_inC));
+}
+
+static void test_vsubcuq (void)
+{
+ __asm__ __volatile__ ("vsubcuq %0, %1, %2" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB));
+}
+
+static void test_vsubuqm (void)
+{
+ __asm__ __volatile__ ("vsubuqm %0, %1, %2" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB));
+}
+
+static void test_vsubecuq (void)
+{
+ __asm__ __volatile__ ("vsubecuq %0, %1, %2, %3" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB),"v" (vec_inC));
+}
+
+static void test_vsubeuqm (void)
+{
+ __asm__ __volatile__ ("vsubeuqm %0, %1, %2, %3" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB),"v" (vec_inC));
+}
+
+static void test_vbpermq (void)
+{
+ __asm__ __volatile__ ("vbpermq %0, %1, %2" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB));
+}
+
+static void test_vgbbd (void)
+{
+ __asm__ __volatile__ ("vgbbd %0, %1" : "=v" (vec_out): "v" (vec_inB));
+}
+
+
+static test_t tests_aa_quadword_two_args[] = {
+ { &test_vaddcuq , "vaddcuq" },
+ { &test_vadduqm , "vadduqm" },
+ { &test_vsubcuq , "vsubcuq" },
+ { &test_vsubuqm , "vsubuqm" },
+ { &test_vbpermq , "vbpermq" },
+ { NULL , NULL },
+};
+
+static test_t tests_aa_quadword_three_args[] = {
+ { &test_vaddecuq , "vaddecuq" },
+ { &test_vaddeuqm , "vaddeuqm" },
+ { &test_vsubecuq , "vsubecuq" },
+ { &test_vsubeuqm , "vsubeuqm" },
+ { NULL , NULL },
+};
+
static test_t tests_aa_bcd_ops[] = {
{ &test_bcdadd , "bcdadd." },
{ &test_bcdsub , "bcdsub." },
@@ -743,6 +812,7 @@
{ &test_vpopcntw , "vpopcntw" },
{ &test_vpopcntd , "vpopcntd" },
{ &test_vsbox , "vsbox" },
+ { &test_vgbbd , "vgbbd" },
{ NULL , NULL, }
};
@@ -1151,6 +1221,7 @@
unsigned long long * dst;
unsigned int * dst_int;
int i,j;
+ int family = test_flags & PPC_FAMILY;
int is_vpkudum;
if (strcmp(name, "vpkudum") == 0)
is_vpkudum = 1;
@@ -1175,6 +1246,10 @@
vdargs[j+1] & 0x00000000ffffffffULL);
printf(" Output: %08x %08x %08x %08x\n", dst_int[0], dst_int[1],
dst_int[2], dst_int[3]);
+ } else if (family == PPC_ALTIVECQ) {
+ printf("%016llx%016llx @@ %016llx%016llx ==> %016llx%016llx\n",
+ vdargs[i], vdargs[i+1], vdargs[j], vdargs[j+1],
+ dst[0], dst[1]);
} else {
printf("%016llx @@ %016llx ", vdargs[i], vdargs[j]);
printf(" ==> %016llx\n", dst[0]);
@@ -1467,28 +1542,43 @@
}
-static void test_av_int_three_args (const char* name, test_func_t func,
- unused uint32_t test_flags)
+static void test_av_dint_three_args (const char* name, test_func_t func,
+ unused uint32_t test_flags)
{
unsigned long long * dst;
int i,j, k;
+ int family = test_flags & PPC_FAMILY;
+ unsigned long long cin_vals[] = {
+ // First pair of ULLs have LSB=0, so cin is '0'.
+ // Second pair of ULLs have LSB=1, so cin is '1'.
+ 0xf000000000000000ULL, 0xf000000000000000ULL,
+ 0xf000000000000000ULL, 0xf000000000000001ULL
+ };
for (i = 0; i < NB_VDARGS; i+=2) {
vec_inA = (vector unsigned long long){ vdargs[i], vdargs[i+1] };
for (j = 0; j < NB_VDARGS; j+=2) {
vec_inB = (vector unsigned long long){ vdargs[j], vdargs[j+1] };
- for (k = 0; k < NB_VDARGS; k+=2) {
- vec_inC = (vector unsigned long long){ vdargs[k], vdargs[k+1] };
+ for (k = 0; k < 4; k+=2) {
+ if (family == PPC_ALTIVECQ)
+ vec_inC = (vector unsigned long long){ cin_vals[k], cin_vals[k+1] };
+ else
+ vec_inC = (vector unsigned long long){ vdargs[k], vdargs[k+1] };
vec_out = (vector unsigned long long){ 0,0 };
(*func)();
dst = (unsigned long long*)&vec_out;
-
printf("%s: ", name);
- printf("%016llx @@ %016llx @@ %016llx ", vdargs[i], vdargs[j], vdargs[k]);
- printf(" ==> %016llx\n", dst[0]);
- printf("\t%016llx @@ %016llx @@ %016llx ", vdargs[i+1], vdargs[j+1], vdargs[k+1]);
- printf(" ==> %016llx\n", dst[1]);
+ if (family == PPC_ALTIVECQ) {
+ printf("%016llx%016llx @@ %016llx%016llx @@ %llx ==> %016llx%016llx\n",
+ vdargs[i], vdargs[i+1], vdargs[j], vdargs[j+1], cin_vals[k+1],
+ dst[0], dst[1]);
+ } else {
+ printf("%016llx @@ %016llx @@ %016llx ", vdargs[i], vdargs[j], vdargs[k]);
+ printf(" ==> %016llx\n", dst[0]);
+ printf("\t%016llx @@ %016llx @@ %016llx ", vdargs[i+1], vdargs[j+1], vdargs[k+1]);
+ printf(" ==> %016llx\n", dst[1]);
+ }
}
}
}
@@ -1517,7 +1607,7 @@
&test_av_wint_two_args_dres,
&test_av_dint_to_int_two_args,
&test_av_wint_one_arg_dres,
- &test_av_int_three_args,
+ &test_av_dint_three_args,
&test_av_dint_one_arg,
&test_av_dint_one_arg_SHA,
&test_av_bcd,
@@ -1636,6 +1726,16 @@
"PPC altivec BCD insns",
0x00040B02,
},
+ {
+ tests_aa_quadword_two_args,
+ "PPC altivec quadword insns, two input args",
+ 0x00070102,
+ },
+ {
+ tests_aa_quadword_three_args,
+ "PPC altivec quadword insns, three input args",
+ 0x00070103
+ },
{ NULL, NULL, 0x00000000, },
};
@@ -1676,6 +1776,7 @@
(family == PPC_FLOAT && !seln_flags.floats) ||
(family == PPC_ALTIVEC && !seln_flags.altivec) ||
(family == PPC_ALTIVECD && !seln_flags.altivec) ||
+ (family == PPC_ALTIVECQ && !seln_flags.altivec) ||
(family == PPC_FALTIVEC && !seln_flags.faltivec)) {
continue;
}
@@ -1700,6 +1801,12 @@
loop = &float_loops[nb_args - 1];
break;
+ case PPC_ALTIVECQ:
+ if (nb_args == 2)
+ loop = &altivec_loops[ALTV_DINT];
+ else if (nb_args == 3)
+ loop = &altivec_loops[ALTV_DINT_THREE_ARGS];
+ break;
case PPC_ALTIVECD:
switch (type) {
case PPC_MOV:
From: <sv...@va...> - 2013-10-18 01:19:20
Author: carll
Date: Fri Oct 18 01:19:06 2013
New Revision: 2790
Log:
This commit adds support for the following instructions:
vaddcuq, vadduqm, vaddecuq, vaddeuqm,
vsubcuq, vsubuqm, vsubecuq, vsubeuqm,
vbpermq and vgbbd.
The vgbbd instruction required a new Iop -- Iop_PwBitMtxXpose64x2.
All other instructions were emulated using existing Iops.
This completes adding the Power ISA 2.07 support.
Bugzilla 325816
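
As an illustration of what a per-doubleword bit matrix transpose computes, here is a minimal C model. It is a sketch under stated assumptions, not the code behind Iop_PwBitMtxXpose64x2: the function names bit_matrix_transpose64 and vgbbd_model_self_check are hypothetical, and it assumes big-endian numbering (byte 0 is the most significant byte, bit 0 the most significant bit of a byte), so that bit i of result byte j is bit j of source byte i. The check values are the doublewords shown in the vgbbd lines of the expected test output.

```c
#include <assert.h>
#include <stdint.h>

/* Treats the 8 bytes of x as an 8x8 bit matrix (one byte per row, most
 * significant byte and bit first) and returns the transposed matrix. */
static uint64_t bit_matrix_transpose64(uint64_t x)
{
    uint64_t out = 0;
    for (int i = 0; i < 8; i++) {        /* source byte (row) */
        for (int j = 0; j < 8; j++) {    /* source bit within byte (column) */
            uint64_t bit = (x >> (63 - (8 * i + j))) & 1u;
            out |= bit << (63 - (8 * j + i));   /* lands at row j, column i */
        }
    }
    return out;
}

/* Checks the model against the vgbbd lines in jm_vec_isa_2_07.stdout.exp. */
void vgbbd_model_self_check(void)
{
    assert(bit_matrix_transpose64(0x0102030405060708ULL) == 0x00000000011e66aaULL);
    assert(bit_matrix_transpose64(0x090a0b0c0e0d0e0fULL) == 0x00000000ff1f6ba5ULL);
    assert(bit_matrix_transpose64(0xf1f2f3f4f5f6f7f8ULL) == 0xffffffff011e66aaULL);
    assert(bit_matrix_transpose64(0xf9fafbfcfefdfeffULL) == 0xffffffffff1f6ba5ULL);
}
```

Applying the function twice returns the original value, which is a quick sanity check that the index mapping is a true transpose. A 128-bit vgbbd result in this model is the transform applied independently to each of the two doublewords.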
Modified:
trunk/priv/guest_ppc_toIR.c
trunk/priv/host_ppc_defs.c
trunk/priv/host_ppc_defs.h
trunk/priv/host_ppc_isel.c
trunk/priv/ir_defs.c
trunk/pub/libvex_ir.h
Modified: trunk/priv/guest_ppc_toIR.c
==============================================================================
--- trunk/priv/guest_ppc_toIR.c (original)
+++ trunk/priv/guest_ppc_toIR.c Fri Oct 18 01:19:06 2013
@@ -13271,10 +13271,10 @@
}
/*
- * VSX vector Population Count
+ * Vector Population Count/bit matrix transpose
*/
static Bool
-dis_vxv_population_count ( UInt theInstr, UInt opc2 )
+dis_av_count_bitTranspose ( UInt theInstr, UInt opc2 )
{
UChar vRB_addr = ifieldRegB(theInstr);
UChar vRT_addr = ifieldRegDS(theInstr);
@@ -13283,7 +13283,7 @@
assign( vB, getVReg(vRB_addr));
if (opc1 != 0x4) {
- vex_printf( "dis_vxv_population_count(ppc)(instr)\n" );
+ vex_printf( "dis_av_count_bitTranspose(ppc)(instr)\n" );
return False;
}
@@ -13423,8 +13423,13 @@
break;
}
+ case 0x50C: // vgbbd Vector Gather Bits by Bytes by Doubleword
+ DIP("vgbbd v%d,v%d\n", vRT_addr, vRB_addr);
+ putVReg( vRT_addr, unop( Iop_PwBitMtxXpose64x2, mkexpr( vB ) ) );
+ break;
+
default:
- vex_printf("dis_vxv_population_count(ppc)(opc2)\n");
+ vex_printf("dis_av_count_bitTranspose(ppc)(opc2)\n");
return False;
break;
}
@@ -17416,6 +17421,285 @@
}
/*
+ * This function is used by the Vector add/subtract [extended] modulo/carry
+ * instructions.
+ * - For the non-extended add instructions, the cin arg is set to zero.
+ * - For the extended add instructions, cin is the integer value of
+ * src3.bit[127].
+ * - For the non-extended subtract instructions, src1 is added to the one's
+ * complement of src2 + 1. We re-use the cin argument to hold the '1'
+ * value for this operation.
+ * - For the extended subtract instructions, cin is the integer value of src3.bit[127].
+ */
+static IRTemp _get_quad_modulo_or_carry(IRExpr * vecA, IRExpr * vecB,
+ IRExpr * cin, Bool modulo)
+{
+ IRTemp _vecA_32 = IRTemp_INVALID;
+ IRTemp _vecB_32 = IRTemp_INVALID;
+ IRTemp res_32 = IRTemp_INVALID;
+ IRTemp result = IRTemp_INVALID;
+ IRTemp tmp_result = IRTemp_INVALID;
+ IRTemp carry = IRTemp_INVALID;
+ Int i;
+ IRExpr * _vecA_low64 = unop( Iop_V128to64, vecA );
+ IRExpr * _vecB_low64 = unop( Iop_V128to64, vecB );
+ IRExpr * _vecA_high64 = unop( Iop_V128HIto64, vecA );
+ IRExpr * _vecB_high64 = unop( Iop_V128HIto64, vecB );
+
+ for (i = 0; i < 4; i++) {
+ _vecA_32 = newTemp(Ity_I32);
+ _vecB_32 = newTemp(Ity_I32);
+ res_32 = newTemp(Ity_I32);
+ switch (i) {
+ case 0:
+ assign(_vecA_32, unop( Iop_64to32, _vecA_low64 ) );
+ assign(_vecB_32, unop( Iop_64to32, _vecB_low64 ) );
+ break;
+ case 1:
+ assign(_vecA_32, unop( Iop_64HIto32, _vecA_low64 ) );
+ assign(_vecB_32, unop( Iop_64HIto32, _vecB_low64 ) );
+ break;
+ case 2:
+ assign(_vecA_32, unop( Iop_64to32, _vecA_high64 ) );
+ assign(_vecB_32, unop( Iop_64to32, _vecB_high64 ) );
+ break;
+ case 3:
+ assign(_vecA_32, unop( Iop_64HIto32, _vecA_high64 ) );
+ assign(_vecB_32, unop( Iop_64HIto32, _vecB_high64 ) );
+ break;
+ }
+
+ assign(res_32, binop( Iop_Add32,
+ binop( Iop_Add32,
+ binop ( Iop_Add32,
+ mkexpr(_vecA_32),
+ mkexpr(_vecB_32) ),
+ (i == 0) ? mkU32(0) : mkexpr(carry) ),
+ (i == 0) ? cin : mkU32(0) ) );
+ if (modulo) {
+ result = newTemp(Ity_V128);
+ assign(result, binop( Iop_OrV128,
+ (i == 0) ? binop( Iop_64HLtoV128,
+ mkU64(0),
+ mkU64(0) ) : mkexpr(tmp_result),
+ binop( Iop_ShlV128,
+ binop( Iop_64HLtoV128,
+ mkU64(0),
+ binop( Iop_32HLto64,
+ mkU32(0),
+ mkexpr(res_32) ) ),
+ mkU8(i * 32) ) ) );
+ tmp_result = newTemp(Ity_V128);
+ assign(tmp_result, mkexpr(result));
+ }
+ carry = newTemp(Ity_I32);
+ assign(carry, unop(Iop_1Uto32, binop( Iop_CmpLT32U,
+ mkexpr(res_32),
+ mkexpr(_vecA_32 ) ) ) );
+ }
+ if (modulo)
+ return result;
+ else
+ return carry;
+}
+
+
+static Bool dis_av_quad ( UInt theInstr )
+{
+ /* VX-Form */
+ UChar opc1 = ifieldOPC(theInstr);
+ UChar vRT_addr = ifieldRegDS(theInstr);
+ UChar vRA_addr = ifieldRegA(theInstr);
+ UChar vRB_addr = ifieldRegB(theInstr);
+ UChar vRC_addr;
+ UInt opc2 = IFIELD( theInstr, 0, 11 );
+
+ IRTemp vA = newTemp(Ity_V128);
+ IRTemp vB = newTemp(Ity_V128);
+ IRTemp vC = IRTemp_INVALID;
+ IRTemp cin = IRTemp_INVALID;
+ assign( vA, getVReg(vRA_addr));
+ assign( vB, getVReg(vRB_addr));
+
+ if (opc1 != 0x4) {
+ vex_printf("dis_av_quad(ppc)(instr)\n");
+ return False;
+ }
+
+ switch (opc2) {
+ case 0x140: // vaddcuq
+ DIP("vaddcuq v%d,v%d,v%d\n", vRT_addr, vRA_addr, vRB_addr);
+ putVReg( vRT_addr, unop( Iop_32UtoV128,
+ mkexpr(_get_quad_modulo_or_carry(mkexpr(vA),
+ mkexpr(vB),
+ mkU32(0), False) ) ) );
+ return True;
+ case 0x100: // vadduqm
+ DIP("vadduqm v%d,v%d,v%d\n", vRT_addr, vRA_addr, vRB_addr);
+ putVReg( vRT_addr, mkexpr(_get_quad_modulo_or_carry(mkexpr(vA),
+ mkexpr(vB), mkU32(0), True) ) );
+ return True;
+ case 0x540: // vsubcuq
+ DIP("vsubcuq v%d,v%d,v%d\n", vRT_addr, vRA_addr, vRB_addr);
+ putVReg( vRT_addr,
+ unop( Iop_32UtoV128,
+ mkexpr(_get_quad_modulo_or_carry(mkexpr(vA),
+ unop( Iop_NotV128,
+ mkexpr(vB) ),
+ mkU32(1), False) ) ) );
+ return True;
+ case 0x500: // vsubuqm
+ DIP("vsubuqm v%d,v%d,v%d\n", vRT_addr, vRA_addr, vRB_addr);
+ putVReg( vRT_addr,
+ mkexpr(_get_quad_modulo_or_carry(mkexpr(vA),
+ unop( Iop_NotV128, mkexpr(vB) ),
+ mkU32(1), True) ) );
+ return True;
+ case 0x054C: // vbpermq
+ {
+#define BPERMD_IDX_MASK 0x00000000000000FFULL
+#define BPERMD_BIT_MASK 0x8000000000000000ULL
+ int i;
+ IRExpr * vB_expr = mkexpr(vB);
+ IRExpr * res = binop(Iop_AndV128, mkV128(0), mkV128(0));
+ DIP("vbpermq v%d,v%d,v%d\n", vRT_addr, vRA_addr, vRB_addr);
+ for (i = 0; i < 16; i++) {
+ IRTemp idx_tmp = newTemp( Ity_V128 );
+ IRTemp perm_bit = newTemp( Ity_V128 );
+ IRTemp idx = newTemp( Ity_I8 );
+ IRTemp idx_LT127 = newTemp( Ity_I1 );
+ IRTemp idx_LT127_ity128 = newTemp( Ity_V128 );
+
+ assign( idx_tmp,
+ binop( Iop_AndV128,
+ binop( Iop_64HLtoV128,
+ mkU64(0),
+ mkU64(BPERMD_IDX_MASK) ),
+ vB_expr ) );
+ assign( idx_LT127,
+ binop( Iop_CmpEQ32,
+ unop ( Iop_64to32,
+ unop( Iop_V128to64, binop( Iop_ShrV128,
+ mkexpr(idx_tmp),
+ mkU8(7) ) ) ),
+ mkU32(0) ) );
+
+ /* Below, we set idx to determine which bit of vA to use for the
+ * perm bit. If idx_LT127 is 0, the perm bit is forced to '0'.
+ */
+ assign( idx,
+ binop( Iop_And8,
+ unop( Iop_1Sto8,
+ mkexpr(idx_LT127) ),
+ unop( Iop_32to8,
+ unop( Iop_V128to32, mkexpr( idx_tmp ) ) ) ) );
+
+ assign( idx_LT127_ity128,
+ binop( Iop_64HLtoV128,
+ mkU64(0),
+ unop( Iop_32Uto64,
+ unop( Iop_1Uto32, mkexpr(idx_LT127 ) ) ) ) );
+ assign( perm_bit,
+ binop( Iop_AndV128,
+ mkexpr( idx_LT127_ity128 ),
+ binop( Iop_ShrV128,
+ binop( Iop_AndV128,
+ binop (Iop_64HLtoV128,
+ mkU64( BPERMD_BIT_MASK ),
+ mkU64(0)),
+ binop( Iop_ShlV128,
+ mkexpr( vA ),
+ mkexpr( idx ) ) ),
+ mkU8( 127 ) ) ) );
+ res = binop( Iop_OrV128,
+ res,
+ binop( Iop_ShlV128,
+ mkexpr( perm_bit ),
+ mkU8( i ) ) );
+ vB_expr = binop( Iop_ShrV128, vB_expr, mkU8( 8 ) );
+ }
+ putVReg( vRT_addr, res);
+ return True;
+#undef BPERMD_IDX_MASK
+#undef BPERMD_BIT_MASK
+ }
+
+ default:
+ break; // fall through
+ }
+
+ opc2 = IFIELD( theInstr, 0, 6 );
+ vRC_addr = ifieldRegC(theInstr);
+ vC = newTemp(Ity_V128);
+ cin = newTemp(Ity_I32);
+ switch (opc2) {
+ case 0x3D: // vaddecuq
+ assign( vC, getVReg(vRC_addr));
+ DIP("vaddecuq v%d,v%d,v%d,v%d\n", vRT_addr, vRA_addr, vRB_addr,
+ vRC_addr);
+ assign(cin, binop( Iop_And32,
+ unop( Iop_64to32,
+ unop( Iop_V128to64, mkexpr(vC) ) ),
+ mkU32(1) ) );
+ putVReg( vRT_addr,
+ unop( Iop_32UtoV128,
+ mkexpr(_get_quad_modulo_or_carry(mkexpr(vA), mkexpr(vB),
+ mkexpr(cin),
+ False) ) ) );
+ return True;
+ case 0x3C: // vaddeuqm
+ assign( vC, getVReg(vRC_addr));
+ DIP("vaddeuqm v%d,v%d,v%d,v%d\n", vRT_addr, vRA_addr, vRB_addr,
+ vRC_addr);
+ assign(cin, binop( Iop_And32,
+ unop( Iop_64to32,
+ unop( Iop_V128to64, mkexpr(vC) ) ),
+ mkU32(1) ) );
+ putVReg( vRT_addr,
+ mkexpr(_get_quad_modulo_or_carry(mkexpr(vA), mkexpr(vB),
+ mkexpr(cin),
+ True) ) );
+ return True;
+ case 0x3F: // vsubecuq
+ assign( vC, getVReg(vRC_addr));
+ DIP("vsubecuq v%d,v%d,v%d,v%d\n", vRT_addr, vRA_addr, vRB_addr,
+ vRC_addr);
+ assign(cin, binop( Iop_And32,
+ unop( Iop_64to32,
+ unop( Iop_V128to64, mkexpr(vC) ) ),
+ mkU32(1) ) );
+ putVReg( vRT_addr,
+ unop( Iop_32UtoV128,
+ mkexpr(_get_quad_modulo_or_carry(mkexpr(vA),
+ unop( Iop_NotV128,
+ mkexpr(vB) ),
+ mkexpr(cin),
+ False) ) ) );
+ return True;
+ case 0x3E: // vsubeuqm
+ assign( vC, getVReg(vRC_addr));
+ DIP("vsubeuqm v%d,v%d,v%d,v%d\n", vRT_addr, vRA_addr, vRB_addr,
+ vRC_addr);
+ assign(cin, binop( Iop_And32,
+ unop( Iop_64to32,
+ unop( Iop_V128to64, mkexpr(vC) ) ),
+ mkU32(1) ) );
+ putVReg( vRT_addr,
+ mkexpr(_get_quad_modulo_or_carry(mkexpr(vA),
+ unop( Iop_NotV128, mkexpr(vB) ),
+ mkexpr(cin),
+ True) ) );
+ return True;
+ default:
+ vex_printf("dis_av_quad(ppc)(opc2.2)\n");
+ return False;
+ }
+
+ return True;
+}
+
+
+/*
AltiVec BCD Arithmetic instructions.
These instructions modify CR6 for various conditions in the result,
including when an overflow occurs. We could easily detect all conditions
@@ -19308,6 +19592,12 @@
if (dis_av_fp_arith( theInstr )) goto decode_success;
goto decode_failure;
+ case 0x3D: case 0x3C: // vaddecuq, vaddeuqm
+ case 0x3F: case 0x3E: // vsubecuq, vsubeuqm
+ if (!allow_V) goto decode_noV;
+ if (dis_av_quad( theInstr)) goto decode_success;
+ goto decode_failure;
+
default:
break; // Fall through...
}
@@ -19468,13 +19758,25 @@
case 0x702: case 0x742: // vclzb, vclzh
case 0x782: case 0x7c2: // vclzw, vclzd
if (!allow_isa_2_07) goto decode_noP8;
- if (dis_vxv_population_count( theInstr, opc2 )) goto decode_success;
+ if (dis_av_count_bitTranspose( theInstr, opc2 )) goto decode_success;
goto decode_failure;
case 0x703: case 0x743: // vpopcntb, vpopcnth
case 0x783: case 0x7c3: // vpopcntw, vpopcntd
if (!allow_isa_2_07) goto decode_noP8;
- if (dis_vxv_population_count( theInstr, opc2 )) goto decode_success;
+ if (dis_av_count_bitTranspose( theInstr, opc2 )) goto decode_success;
+ goto decode_failure;
+
+ case 0x50c: // vgbbd
+ if (!allow_isa_2_07) goto decode_noP8;
+ if (dis_av_count_bitTranspose( theInstr, opc2 )) goto decode_success;
+ goto decode_failure;
+
+ case 0x140: case 0x100: // vaddcuq, vadduqm
+ case 0x540: case 0x500: // vsubcuq, vsubuqm
+ case 0x54C: // vbpermq
+ if (!allow_V) goto decode_noV;
+ if (dis_av_quad( theInstr)) goto decode_success;
goto decode_failure;
default:
Modified: trunk/priv/host_ppc_defs.c
==============================================================================
--- trunk/priv/host_ppc_defs.c (original)
+++ trunk/priv/host_ppc_defs.c Fri Oct 18 01:19:06 2013
@@ -744,6 +744,10 @@
case Pav_ZEROCNTHALF: case Pav_ZEROCNTDBL:
return "vclz_"; // b, h, w, d
+ /* vector gather (byte-by-byte bit matrix transpose) */
+ case Pav_BITMTXXPOSE:
+ return "vgbbd";
+
default: vpanic("showPPCAvOp");
}
}
@@ -4773,6 +4777,7 @@
case Pav_ZEROCNTHALF: opc2 = 1858; break; // vclzh
case Pav_ZEROCNTWORD: opc2 = 1922; break; // vclzw
case Pav_ZEROCNTDBL: opc2 = 1986; break; // vclzd
+ case Pav_BITMTXXPOSE: opc2 = 1292; break; // vgbbd
default:
goto bad;
}
Modified: trunk/priv/host_ppc_defs.h
==============================================================================
--- trunk/priv/host_ppc_defs.h (original)
+++ trunk/priv/host_ppc_defs.h Fri Oct 18 01:19:06 2013
@@ -442,6 +442,9 @@
/* zero count */
Pav_ZEROCNTBYTE, Pav_ZEROCNTWORD, Pav_ZEROCNTHALF, Pav_ZEROCNTDBL,
+
+ /* Vector bit matrix transpose by byte */
+ Pav_BITMTXXPOSE,
}
PPCAvOp;
Modified: trunk/priv/host_ppc_isel.c
==============================================================================
--- trunk/priv/host_ppc_isel.c (original)
+++ trunk/priv/host_ppc_isel.c Fri Oct 18 01:19:06 2013
@@ -4857,6 +4857,7 @@
case Iop_Clz16Sx8: fpop = Pav_ZEROCNTHALF; goto do_zerocnt;
case Iop_Clz32Sx4: fpop = Pav_ZEROCNTWORD; goto do_zerocnt;
case Iop_Clz64x2: fpop = Pav_ZEROCNTDBL; goto do_zerocnt;
+ case Iop_PwBitMtxXpose64x2: fpop = Pav_BITMTXXPOSE; goto do_zerocnt;
do_zerocnt:
{
HReg arg = iselVecExpr(env, e->Iex.Unop.arg);
Modified: trunk/priv/ir_defs.c
==============================================================================
--- trunk/priv/ir_defs.c (original)
+++ trunk/priv/ir_defs.c Fri Oct 18 01:19:06 2013
@@ -1148,6 +1148,8 @@
case Iop_BCDAdd: vex_printf("BCDAdd"); return;
case Iop_BCDSub: vex_printf("BCDSub"); return;
+ case Iop_PwBitMtxXpose64x2: vex_printf("BitMatrixTranspose64x2"); return;
+
default: vpanic("ppIROp(1)");
}
@@ -2901,6 +2903,7 @@
case Iop_Neg32Fx4:
case Iop_Abs8x16: case Iop_Abs16x8: case Iop_Abs32x4:
case Iop_CipherSV128:
+ case Iop_PwBitMtxXpose64x2:
UNARY(Ity_V128, Ity_V128);
case Iop_ShlV128: case Iop_ShrV128:
Modified: trunk/pub/libvex_ir.h
==============================================================================
--- trunk/pub/libvex_ir.h (original)
+++ trunk/pub/libvex_ir.h Fri Oct 18 01:19:06 2013
@@ -1456,6 +1456,13 @@
Iop_PwAddL8Ux16, Iop_PwAddL16Ux8, Iop_PwAddL32Ux4,
Iop_PwAddL8Sx16, Iop_PwAddL16Sx8, Iop_PwAddL32Sx4,
+ /* Other unary pairwise ops */
+
+ /* Vector bit matrix transpose. (V128) -> V128 */
+ /* For each doubleword element of the source vector, an 8-bit x 8-bit
+ * matrix transpose is performed. */
+ Iop_PwBitMtxXpose64x2,
+
/* ABSOLUTE VALUE */
Iop_Abs8x16, Iop_Abs16x8, Iop_Abs32x4,
From: <sv...@va...> - 2013-10-18 00:08:40
Author: philippe
Date: Fri Oct 18 00:08:20 2013
New Revision: 13652
Log:
Allow the user to dimension the translation cache
A previous commit decreased the number of sectors in the translation
cache to 6 on Android and increased it to 16 on other platforms.
This patch adds a command line option that lets the user specify
the number of sectors: 16 sectors might be too many and cause
an out-of-memory condition for some workloads, or too few for
a huge executable or one that uses a lot of shared libraries.
Modified:
trunk/NEWS
trunk/coregrind/m_main.c
trunk/coregrind/m_transtab.c
trunk/coregrind/pub_core_options.h
trunk/coregrind/pub_core_transtab.h
trunk/docs/xml/manual-core.xml
trunk/none/tests/cmdline1.stdout.exp
trunk/none/tests/cmdline2.stdout.exp
trunk/perf/bigcode.c
Modified: trunk/NEWS
==============================================================================
--- trunk/NEWS (original)
+++ trunk/NEWS Fri Oct 18 00:08:20 2013
@@ -40,6 +40,13 @@
* ==================== OTHER CHANGES ====================
+ - The default nr of sectors in the translation cache has been
+ decreased to 6 on android platforms, and increased to 16
+ on all other platforms. A sector (lazily allocated) uses several
+ MB depending on the tool (about 40MB for memcheck).
+ The option --num-transtab-sectors allows to specify how
+ many sectors Valgrind can allocate.
+
- Option --merge-recursive-frames=<number> tells Valgrind to
detect and merge (collapse) recursive calls when recording stack traces.
When your program has recursive algorithms, this limits
Modified: trunk/coregrind/m_main.c
==============================================================================
--- trunk/coregrind/m_main.c (original)
+++ trunk/coregrind/m_main.c Fri Oct 18 00:08:20 2013
@@ -200,6 +200,8 @@
" handle non-standard kernel variants\n"
" --merge-recursive-frames=<number> merge frames between identical\n"
" program counters in max <number> frames) [0]\n"
+" --num-transtab-sectors=<number> size of translated code cache [%d]\n"
+" more sectors may increase the performance, but use more memory.\n"
" --show-emwarns=no|yes show warnings about emulation limits? [no]\n"
" --require-text-symbol=:sonamepattern:symbolpattern abort run if the\n"
" stated shared object doesn't have the stated\n"
@@ -306,7 +308,8 @@
default_alignment /* char* */,
default_redzone_size /* char* */,
VG_(clo_vgdb_poll) /* int */,
- VG_(vgdb_prefix_default)() /* char* */
+ VG_(vgdb_prefix_default)() /* char* */,
+ N_SECTORS_DEFAULT /* int */
);
if (VG_(details).name) {
VG_(printf)(" user options for %s:\n", VG_(details).name);
@@ -606,6 +609,9 @@
else if VG_INT_CLO (arg, "--sanity-level", VG_(clo_sanity_level)) {}
else if VG_BINT_CLO(arg, "--num-callers", VG_(clo_backtrace_size), 1,
VG_DEEPEST_BACKTRACE) {}
+ else if VG_BINT_CLO(arg, "--num-transtab-sectors",
+ VG_(clo_num_transtab_sectors),
+ MIN_N_SECTORS, MAX_N_SECTORS) {}
else if VG_BINT_CLO(arg, "--merge-recursive-frames",
VG_(clo_merge_recursive_frames), 0,
VG_DEEPEST_BACKTRACE) {}
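The m_main.c hunk above accepts --num-transtab-sectors only within MIN_N_SECTORS..MAX_N_SECTORS. The following is a rough, stand-alone C sketch of bounded integer option parsing. It is not the VG_BINT_CLO macro: parse_bounded_opt is a hypothetical helper built on the standard strtol, and the choice to return -1 for a malformed or out-of-range value is an assumption of this sketch (see the macro's definition for Valgrind's actual behaviour).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Limits copied from the patch (pub_core_transtab.h). */
#define MIN_N_SECTORS 2
#define MAX_N_SECTORS 32

/* Hypothetical helper, not Valgrind's VG_BINT_CLO.
   Returns  1 and stores the value if arg is "<name>=<int>" with
              lo <= int <= hi,
            0 if arg is some other option,
           -1 if the value is malformed or out of range. */
static int parse_bounded_opt(const char *arg, const char *name,
                             long lo, long hi, long *out)
{
    assert(arg != NULL && name != NULL && out != NULL);
    size_t n = strlen(name);
    if (strncmp(arg, name, n) != 0 || arg[n] != '=')
        return 0;

    const char *val = arg + n + 1;
    char *end = NULL;
    errno = 0;
    long v = strtol(val, &end, 10);
    /* Reject empty values, trailing junk and overflow. */
    if (errno != 0 || end == val || *end != '\0')
        return -1;
    if (v < lo || v > hi)
        return -1;

    *out = v;
    return 1;
}
```

Checking the bounds at parse time is what allows VG_(init_tt_tc) to assert them later rather than handle an invalid count itself.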
Modified: trunk/coregrind/m_transtab.c
==============================================================================
--- trunk/coregrind/m_transtab.c (original)
+++ trunk/coregrind/m_transtab.c Fri Oct 18 00:08:20 2013
@@ -53,20 +53,13 @@
/*--- Management of the FIFO-based translation table+cache. ---*/
/*-------------------------------------------------------------*/
-/*------------------ CONSTANTS ------------------*/
-
-/* Number of sectors the TC is divided into. If you need a larger
- overall translation cache, increase this value. On Android, space
- is limited, so try to get by with fewer sectors. On other
- platforms we can go to town. 16 sectors gives theoretical capacity
- of about 440MB of JITted code in 1.05 million translations
- (realistically, about 2/3 of that) for Memcheck. */
-#if defined(VGPV_arm_linux_android) || defined(VGPV_x86_linux_android)
-# define N_SECTORS 6
-#else
-# define N_SECTORS 16
-#endif
+/* Nr of sectors provided via command line parameter. */
+UInt VG_(clo_num_transtab_sectors) = N_SECTORS_DEFAULT;
+/* Nr of sectors.
+ Will be set by VG_(init_tt_tc) to VG_(clo_num_transtab_sectors). */
+static int n_sectors;
+/*------------------ CONSTANTS ------------------*/
/* Number of TC entries in each sector. This needs to be a prime
number to work properly, it must be <= 65535 (so that a TT index
fits in a UShort, leaving room for 0xFFFF(EC2TTE_DELETED) to denote
@@ -356,7 +349,7 @@
N_TC_SECTORS. The initial -1 value indicates the TT/TC system is
not yet initialised.
*/
-static Sector sectors[N_SECTORS];
+static Sector sectors[MAX_N_SECTORS];
static Int youngest_sector = -1;
/* The number of ULongs in each TCEntry area. This is computed once
@@ -368,7 +361,7 @@
searched to find translations. This is an optimisation to be used
when searching for translations and should not affect
correctness. -1 denotes "no entry". */
-static Int sector_search_order[N_SECTORS];
+static Int sector_search_order[MAX_N_SECTORS];
/* Fast helper for the TC. A direct-mapped cache which holds a set of
@@ -447,7 +440,7 @@
static inline TTEntry* index_tte ( UInt sNo, UInt tteNo )
{
- vg_assert(sNo < N_SECTORS);
+ vg_assert(sNo < n_sectors);
vg_assert(tteNo < N_TTES_PER_SECTOR);
Sector* s = &sectors[sNo];
vg_assert(s->tt);
@@ -682,7 +675,7 @@
Int i;
/* Search order logic copied from VG_(search_transtab). */
- for (i = 0; i < N_SECTORS; i++) {
+ for (i = 0; i < n_sectors; i++) {
Int sno = sector_search_order[i];
if (UNLIKELY(sno == -1))
return False; /* run out of sectors to search */
@@ -732,7 +725,7 @@
static Bool is_in_the_main_TC ( void* hcode )
{
Int i, sno;
- for (i = 0; i < N_SECTORS; i++) {
+ for (i = 0; i < n_sectors; i++) {
sno = sector_search_order[i];
if (sno == -1)
break; /* run out of sectors to search */
@@ -1222,32 +1215,32 @@
{
Int i, j, nListed;
/* assert the array is the right size */
- vg_assert(N_SECTORS == (sizeof(sector_search_order)
- / sizeof(sector_search_order[0])));
+ vg_assert(MAX_N_SECTORS == (sizeof(sector_search_order)
+ / sizeof(sector_search_order[0])));
/* Check it's of the form valid_sector_numbers ++ [-1, -1, ..] */
- for (i = 0; i < N_SECTORS; i++) {
- if (sector_search_order[i] < 0 || sector_search_order[i] >= N_SECTORS)
+ for (i = 0; i < n_sectors; i++) {
+ if (sector_search_order[i] < 0 || sector_search_order[i] >= n_sectors)
break;
}
nListed = i;
- for (/* */; i < N_SECTORS; i++) {
+ for (/* */; i < n_sectors; i++) {
if (sector_search_order[i] != -1)
break;
}
- if (i != N_SECTORS)
+ if (i != n_sectors)
return False;
/* Check each sector number only appears once */
- for (i = 0; i < N_SECTORS; i++) {
+ for (i = 0; i < n_sectors; i++) {
if (sector_search_order[i] == -1)
continue;
- for (j = i+1; j < N_SECTORS; j++) {
+ for (j = i+1; j < n_sectors; j++) {
if (sector_search_order[j] == sector_search_order[i])
return False;
}
}
/* Check that the number of listed sectors equals the number
in use, by counting nListed back down. */
- for (i = 0; i < N_SECTORS; i++) {
+ for (i = 0; i < n_sectors; i++) {
if (sectors[i].tc != NULL)
nListed--;
}
@@ -1261,7 +1254,7 @@
Int sno;
Bool sane;
Sector* sec;
- for (sno = 0; sno < N_SECTORS; sno++) {
+ for (sno = 0; sno < n_sectors; sno++) {
Int i;
Int nr_not_dead_hx = 0;
Int szhxa;
@@ -1308,7 +1301,7 @@
static Bool isValidSector ( Int sector )
{
- if (sector < 0 || sector >= N_SECTORS)
+ if (sector < 0 || sector >= n_sectors)
return False;
return True;
}
@@ -1413,11 +1406,11 @@
sizeof(HostExtent));
/* Add an entry in the sector_search_order */
- for (i = 0; i < N_SECTORS; i++) {
+ for (i = 0; i < n_sectors; i++) {
if (sector_search_order[i] == -1)
break;
}
- vg_assert(i >= 0 && i < N_SECTORS);
+ vg_assert(i >= 0 && i < n_sectors);
sector_search_order[i] = sno;
if (VG_(clo_verbosity) > 2)
@@ -1482,11 +1475,11 @@
/* Sanity check: ensure it is already in
sector_search_order[]. */
- for (i = 0; i < N_SECTORS; i++) {
+ for (i = 0; i < n_sectors; i++) {
if (sector_search_order[i] == sno)
break;
}
- vg_assert(i >= 0 && i < N_SECTORS);
+ vg_assert(i >= 0 && i < n_sectors);
if (VG_(clo_verbosity) > 2)
VG_(message)(Vg_DebugMsg, "TT/TC: recycle sector %d\n", sno);
@@ -1579,7 +1572,7 @@
y, tt_loading_pct, tc_loading_pct);
}
youngest_sector++;
- if (youngest_sector >= N_SECTORS)
+ if (youngest_sector >= n_sectors)
youngest_sector = 0;
y = youngest_sector;
initialiseSector(y);
@@ -1693,7 +1686,7 @@
/* Search in all the sectors,using sector_search_order[] as a
heuristic guide as to what order to visit the sectors. */
- for (i = 0; i < N_SECTORS; i++) {
+ for (i = 0; i < n_sectors; i++) {
sno = sector_search_order[i];
if (UNLIKELY(sno == -1))
@@ -1951,7 +1944,7 @@
/* Fast scheme */
vg_assert(ec >= 0 && ec < ECLASS_MISC);
- for (sno = 0; sno < N_SECTORS; sno++) {
+ for (sno = 0; sno < n_sectors; sno++) {
sec = &sectors[sno];
if (sec->tc == NULL)
continue;
@@ -1972,7 +1965,7 @@
VG_(debugLog)(2, "transtab",
" SLOW, ec = %d\n", ec);
- for (sno = 0; sno < N_SECTORS; sno++) {
+ for (sno = 0; sno < n_sectors; sno++) {
sec = &sectors[sno];
if (sec->tc == NULL)
continue;
@@ -1996,7 +1989,7 @@
vg_assert(sane);
/* But now, also check the requested address range isn't
present anywhere. */
- for (sno = 0; sno < N_SECTORS; sno++) {
+ for (sno = 0; sno < n_sectors; sno++) {
sec = &sectors[sno];
if (sec->tc == NULL)
continue;
@@ -2211,9 +2204,13 @@
vg_assert(tc_sector_szQ >= 2 * N_TTES_PER_SECTOR_USABLE);
vg_assert(tc_sector_szQ <= 100 * N_TTES_PER_SECTOR_USABLE);
+ n_sectors = VG_(clo_num_transtab_sectors);
+ vg_assert(n_sectors >= MIN_N_SECTORS);
+ vg_assert(n_sectors <= MAX_N_SECTORS);
+
/* Initialise the sectors */
youngest_sector = 0;
- for (i = 0; i < N_SECTORS; i++) {
+ for (i = 0; i < n_sectors; i++) {
sectors[i].tc = NULL;
sectors[i].tt = NULL;
sectors[i].tc_next = NULL;
@@ -2227,7 +2224,7 @@
}
/* Initialise the sector_search_order hint table. */
- for (i = 0; i < N_SECTORS; i++)
+ for (i = 0; i < n_sectors; i++)
sector_search_order[i] = -1;
/* Initialise the fast cache. */
@@ -2236,27 +2233,24 @@
/* and the unredir tt/tc */
init_unredir_tt_tc();
- if (VG_(clo_verbosity) > 2 || VG_(clo_stats)) {
+ if (VG_(clo_verbosity) > 2 || VG_(clo_stats)
+ || VG_(debugLog_getLevel) () >= 2) {
VG_(message)(Vg_DebugMsg,
"TT/TC: cache: %d sectors of %d bytes each = %d total\n",
- N_SECTORS, 8 * tc_sector_szQ,
- N_SECTORS * 8 * tc_sector_szQ );
+ n_sectors, 8 * tc_sector_szQ,
+ n_sectors * 8 * tc_sector_szQ );
VG_(message)(Vg_DebugMsg,
- "TT/TC: table: %d total entries, max occupancy %d (%d%%)\n",
- N_SECTORS * N_TTES_PER_SECTOR,
- N_SECTORS * N_TTES_PER_SECTOR_USABLE,
+ "TT/TC: table: %d tables of %d bytes each = %d total\n",
+ n_sectors, (int)(N_TTES_PER_SECTOR * sizeof(TTEntry)),
+ (int)(n_sectors * N_TTES_PER_SECTOR * sizeof(TTEntry)));
+ VG_(message)(Vg_DebugMsg,
+ "TT/TC: table: %d entries each = %d total entries"
+ " max occupancy %d (%d%%)\n",
+ N_TTES_PER_SECTOR,
+ n_sectors * N_TTES_PER_SECTOR,
+ n_sectors * N_TTES_PER_SECTOR_USABLE,
SECTOR_TT_LIMIT_PERCENT );
}
-
- VG_(debugLog)(2, "transtab",
- "cache: %d sectors of %d bytes each = %d total\n",
- N_SECTORS, 8 * tc_sector_szQ,
- N_SECTORS * 8 * tc_sector_szQ );
- VG_(debugLog)(2, "transtab",
- "table: %d total entries, max occupancy %d (%d%%)\n",
- N_SECTORS * N_TTES_PER_SECTOR,
- N_SECTORS * N_TTES_PER_SECTOR_USABLE,
- SECTOR_TT_LIMIT_PERCENT );
}
@@ -2332,7 +2326,7 @@
score_total = 0;
- for (sno = 0; sno < N_SECTORS; sno++) {
+ for (sno = 0; sno < n_sectors; sno++) {
if (sectors[sno].tc == NULL)
continue;
for (i = 0; i < N_TTES_PER_SECTOR; i++) {
@@ -2370,7 +2364,7 @@
/* Now zero out all the counter fields, so that we can make
multiple calls here and just get the values since the last call,
each time, rather than values accumulated for the whole run. */
- for (sno = 0; sno < N_SECTORS; sno++) {
+ for (sno = 0; sno < n_sectors; sno++) {
if (sectors[sno].tc == NULL)
continue;
for (i = 0; i < N_TTES_PER_SECTOR; i++) {
Modified: trunk/coregrind/pub_core_options.h
==============================================================================
--- trunk/coregrind/pub_core_options.h (original)
+++ trunk/coregrind/pub_core_options.h Fri Oct 18 00:08:20 2013
@@ -272,6 +272,9 @@
Note that the value is changeable by a gdbsrv command. */
extern Int VG_(clo_merge_recursive_frames);
+/* Max number of sectors that will be used by the translation code cache. */
+extern UInt VG_(clo_num_transtab_sectors);
+
/* Delay startup to allow GDB to be attached? Default: NO */
extern Bool VG_(clo_wait_for_gdb);
Modified: trunk/coregrind/pub_core_transtab.h
==============================================================================
--- trunk/coregrind/pub_core_transtab.h (original)
+++ trunk/coregrind/pub_core_transtab.h Fri Oct 18 00:08:20 2013
@@ -53,8 +53,29 @@
#define TRANSTAB_BOGUS_GUEST_ADDR ((Addr)1)
+
+/* Initialises the TC, using VG_(clo_num_transtab_sectors).
+ VG_(clo_num_transtab_sectors) must be >= MIN_N_SECTORS
+ and <= MAX_N_SECTORS. */
extern void VG_(init_tt_tc) ( void );
+
+/* Limits for number of sectors the TC is divided into. If you need a larger
+ overall translation cache, increase MAX_N_SECTORS. */
+#define MIN_N_SECTORS 2
+#define MAX_N_SECTORS 32
+
+/* Default for the nr of sectors, if not overridden by command line.
+ On Android, space is limited, so try to get by with fewer sectors.
+ On other platforms we can go to town. 16 sectors gives theoretical
+ capacity of about 440MB of JITted code in 1.05 million translations
+ (realistically, about 2/3 of that) for Memcheck. */
+#if defined(VGPV_arm_linux_android) || defined(VGPV_x86_linux_android)
+# define N_SECTORS_DEFAULT 6
+#else
+# define N_SECTORS_DEFAULT 16
+#endif
+
extern
void VG_(add_to_transtab)( VexGuestExtents* vge,
Addr64 entry,
Modified: trunk/docs/xml/manual-core.xml
==============================================================================
--- trunk/docs/xml/manual-core.xml (original)
+++ trunk/docs/xml/manual-core.xml Fri Oct 18 00:08:20 2013
@@ -1852,6 +1852,28 @@
</listitem>
</varlistentry>
+ <varlistentry id="opt.num-transtab-sectors" xreflabel="--num-transtab-sectors">
+ <term>
+ <option><![CDATA[--num-transtab-sectors=<number> [default: 6 or 16] ]]></option>
+ </term>
+ <listitem>
+ <para>Valgrind translates and instruments your program code. The
+ translations are stored in a translation cache organized in
+ sectors. If the cache is full, the sector containing the older
+ translations is emptied and recycled. If these old translations
+ are needed again, Valgrind must re-translate and re-instrument
+ the corresponding program code. If the "executed instructions"
+ working set of a program is big, increasing the number of
+ sectors may improve the performance by reducing the number of
+ re-translations needed. A sector is lazily allocated but once
+ allocated, it permanently uses several MB depending
+ on the tool (about 40 MB per sector for memcheck).
+ Use the option <option>--stats=yes</option> to obtain precise
+ information about the memory used by a sector and the allocation
+ and recycling of sectors.</para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="opt.show-emwarns" xreflabel="--show-emwarns">
<term>
<option><![CDATA[--show-emwarns=<yes|no> [default: no] ]]></option>
Modified: trunk/none/tests/cmdline1.stdout.exp
==============================================================================
--- trunk/none/tests/cmdline1.stdout.exp (original)
+++ trunk/none/tests/cmdline1.stdout.exp Fri Oct 18 00:08:20 2013
@@ -88,6 +88,8 @@
handle non-standard kernel variants
--merge-recursive-frames=<number> merge frames between identical
program counters in max <number> frames) [0]
+ --num-transtab-sectors=<number> size of translated code cache [16]
+ more sectors may increase the performance, but use more memory.
--show-emwarns=no|yes show warnings about emulation limits? [no]
--require-text-symbol=:sonamepattern:symbolpattern abort run if the
stated shared object doesn't have the stated
Modified: trunk/none/tests/cmdline2.stdout.exp
==============================================================================
--- trunk/none/tests/cmdline2.stdout.exp (original)
+++ trunk/none/tests/cmdline2.stdout.exp Fri Oct 18 00:08:20 2013
@@ -88,6 +88,8 @@
handle non-standard kernel variants
--merge-recursive-frames=<number> merge frames between identical
program counters in max <number> frames) [0]
+ --num-transtab-sectors=<number> size of translated code cache [16]
+ more sectors may increase the performance, but use more memory.
--show-emwarns=no|yes show warnings about emulation limits? [no]
--require-text-symbol=:sonamepattern:symbolpattern abort run if the
stated shared object doesn't have the stated
Modified: trunk/perf/bigcode.c
==============================================================================
--- trunk/perf/bigcode.c (original)
+++ trunk/perf/bigcode.c Fri Oct 18 00:08:20 2013
@@ -1,6 +1,7 @@
// This artificial program runs a lot of code. The exact amount depends on
-// the command line -- if any command line args are given, it does exactly
+// the command line -- if an arg "0" is given, it does exactly
// the same amount of work, but using four times as much code.
+// If an arg >= 1 is given, the amount of code is multiplied by this arg.
//
// It's a stress test for Valgrind's translation speed; natively the two
// modes run in about the same time (the I-cache effects aren't big enough
@@ -9,6 +10,7 @@
#include <stdio.h>
#include <string.h>
+#include <stdlib.h>
#include <assert.h>
#if defined(__mips__)
#include <asm/cachectl.h>
@@ -39,11 +41,6 @@
int h, i, sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0;
int n_fns, n_reps;
- char* a = mmap(0, FN_SIZE * N_LOOPS,
- PROT_EXEC|PROT_WRITE,
- MAP_PRIVATE|MAP_ANONYMOUS, -1,0);
- assert(a != (char*)MAP_FAILED);
-
if (argc <= 1) {
// Mode 1: not so much code
n_fns = N_LOOPS / RATIO;
@@ -51,12 +48,21 @@
printf("mode 1: ");
} else {
// Mode 2: lots of code
- n_fns = N_LOOPS;
+ const int mul = atoi(argv[1]);
+ if (mul == 0)
+ n_fns = N_LOOPS;
+ else
+ n_fns = N_LOOPS * mul;
n_reps = 1;
printf("mode 1: ");
}
printf("%d copies of f(), %d reps\n", n_fns, n_reps);
+ char* a = mmap(0, FN_SIZE * n_fns,
+ PROT_EXEC|PROT_WRITE,
+ MAP_PRIVATE|MAP_ANONYMOUS, -1,0);
+ assert(a != (char*)MAP_FAILED);
+
// Make a whole lot of copies of f(). FN_SIZE is much bigger than f()
// will ever be (we hope).
for (i = 0; i < n_fns; i++) {
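To summarise the m_transtab.c change in this commit: the sector arrays are now sized for MAX_N_SECTORS, while all loops and the wrap-around of youngest_sector use a run-time n_sectors. The toy model below sketches that arrangement in plain C. ToySector, ToyCache, toy_init and toy_next_sector are hypothetical names; nothing here is Valgrind code. As a simplification, sector 0 is marked in use at initialisation instead of being set up on the first insertion.

```c
#include <assert.h>
#include <string.h>

/* Limits copied from the patch (pub_core_transtab.h). */
#define MIN_N_SECTORS 2
#define MAX_N_SECTORS 32

/* Hypothetical stand-in for a sector: only tracks whether it has been
   allocated and how many times it has been (re)initialised. */
typedef struct {
    int      in_use;
    unsigned generation;
} ToySector;

typedef struct {
    ToySector sectors[MAX_N_SECTORS]; /* statically sized for the maximum */
    int       n_sectors;              /* run-time count actually used     */
    int       youngest;               /* sector currently being filled    */
} ToyCache;

/* Sets the run-time sector count, which must lie within the limits. */
static void toy_init(ToyCache *c, int requested)
{
    assert(requested >= MIN_N_SECTORS && requested <= MAX_N_SECTORS);
    memset(c, 0, sizeof *c);
    c->n_sectors = requested;
    c->youngest = 0;
    c->sectors[0].in_use = 1;      /* simplification: eager first sector */
    c->sectors[0].generation = 1;
}

/* Called when the youngest sector is full: advance to the next sector,
   wrapping at n_sectors (not MAX_N_SECTORS).  Returns 1 if that sector
   held older translations that are now discarded (recycled), 0 if it
   is being used for the first time. */
static int toy_next_sector(ToyCache *c)
{
    c->youngest++;
    if (c->youngest >= c->n_sectors)
        c->youngest = 0;

    ToySector *s = &c->sectors[c->youngest];
    int recycled = s->in_use;
    s->in_use = 1;
    s->generation++;
    return recycled;
}
```

With three sectors, the third call to toy_next_sector wraps back to sector 0 and reports a recycle. A larger n_sectors postpones that point at the cost of more memory per allocated sector.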