From: <sv...@va...> - 2015-04-13 11:39:57
Author: sewardj
Date: Mon Apr 13 12:39:50 2015
New Revision: 15087
Log:
Un-break the Darwin build after r15078. Patch from Mark Pauley
(pa...@un...).
Modified:
trunk/coregrind/m_initimg/initimg-darwin.c
Modified: trunk/coregrind/m_initimg/initimg-darwin.c
==============================================================================
--- trunk/coregrind/m_initimg/initimg-darwin.c (original)
+++ trunk/coregrind/m_initimg/initimg-darwin.c Mon Apr 13 12:39:50 2015
@@ -312,7 +312,8 @@
HChar** orig_envp,
const ExeInfo* info,
Addr clstack_end,
- SizeT clstack_max_size )
+ SizeT clstack_max_size,
+ const VexArchInfo* vex_archinfo )
{
HChar **cpp;
HChar *strtab; /* string table */
@@ -508,7 +509,8 @@
/*====================================================================*/
/* Create the client's initial memory image. */
-IIFinaliseImageInfo VG_(ii_create_image)( IICreateImageInfo iicii )
+IIFinaliseImageInfo VG_(ii_create_image)( IICreateImageInfo iicii,
+ const VexArchInfo* vex_archinfo )
{
ExeInfo info;
VG_(memset)( &info, 0, sizeof(info) );
@@ -548,7 +550,8 @@
iifii.initial_client_SP =
setup_client_stack( iicii.argv - 1, env, &info,
- iicii.clstack_end, iifii.clstack_max_size );
+ iicii.clstack_end, iifii.clstack_max_size,
+ vex_archinfo );
VG_(free)(env);
From: <sv...@va...> - 2015-04-13 11:33:36
Author: sewardj
Date: Mon Apr 13 12:33:29 2015
New Revision: 3128
Log:
Remove unused function "lshift".
Modified:
trunk/priv/guest_tilegx_helpers.c
Modified: trunk/priv/guest_tilegx_helpers.c
==============================================================================
--- trunk/priv/guest_tilegx_helpers.c (original)
+++ trunk/priv/guest_tilegx_helpers.c Mon Apr 13 12:33:29 2015
@@ -46,15 +46,6 @@
{ offsetof(VexGuestTILEGXState, field), \
(sizeof ((VexGuestTILEGXState*)0)->field) }
-/* generalised left-shifter */
-static inline UInt lshift ( UInt x, Int n )
-{
- if (n >= 0)
- return x << n;
- else
- return x >> (-n);
-}
-
IRExpr *guest_tilegx_spechelper ( const HChar * function_name, IRExpr ** args,
IRStmt ** precedingStmts, Int n_precedingStmts)
{
From: <sv...@va...> - 2015-04-13 10:49:23
Author: sewardj
Date: Mon Apr 13 11:49:15 2015
New Revision: 15086
Log:
Add an NCode template for 8 bit loads on 64 bit targets.
Requires vex r3127.
Modified:
branches/NCODE/memcheck/mc_include.h
branches/NCODE/memcheck/mc_main.c
branches/NCODE/memcheck/mc_translate.c
Modified: branches/NCODE/memcheck/mc_include.h
==============================================================================
--- branches/NCODE/memcheck/mc_include.h (original)
+++ branches/NCODE/memcheck/mc_include.h Mon Apr 13 11:49:15 2015
@@ -602,6 +602,7 @@
extern NCodeTemplate* MC_(tmpl__LOADV64le_on_64);
extern NCodeTemplate* MC_(tmpl__LOADV32le_on_64);
+extern NCodeTemplate* MC_(tmpl__LOADV8_on_64);
/* Helper functions defined in mc_main.c */
Modified: branches/NCODE/memcheck/mc_main.c
==============================================================================
--- branches/NCODE/memcheck/mc_main.c (original)
+++ branches/NCODE/memcheck/mc_main.c Mon Apr 13 11:49:15 2015
@@ -4276,6 +4276,7 @@
VG_REGPARM(1) static ULong mc_LOADV64le_on_64_slow ( Addr a );
VG_REGPARM(1) static ULong mc_LOADV32le_on_64_slow ( Addr a );
+VG_REGPARM(1) static ULong mc_LOADV8_on_64_slow ( Addr a );
static void* ncode_alloc ( UInt n ) {
return VG_(malloc)("mc.ncode_alloc (NCode, permanent)", n);
@@ -4294,6 +4295,7 @@
NCodeTemplate* MC_(tmpl__LOADV64le_on_64) = NULL;
NCodeTemplate* MC_(tmpl__LOADV32le_on_64) = NULL;
+NCodeTemplate* MC_(tmpl__LOADV8_on_64) = NULL;
static NCodeTemplate* mk_tmpl__LOADV64le_on_64 ( NAlloc na )
{
@@ -4306,27 +4308,49 @@
NReg a0 = mkNReg(Nrr_Argument, 0);
NReg s0 = mkNReg(Nrr_Scratch, 0);
- hot[0] = NInstr_SetFlagsWri (na, Nsf_TEST, a0, MASK(8));
- hot[1] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 4));
- hot[2] = NInstr_ShiftWri (na, Nsh_SHR, s0, a0, 16);
- hot[3] = NInstr_LoadU (na, 8, s0, NEA_IRS(na, (HWord)&primary_map[0],
- s0, 3));
- hot[4] = NInstr_AluWri (na, Nalu_AND, r0, a0, 0xFFFF);
- hot[5] = NInstr_ShiftWri (na, Nsh_SHR, r0, r0, 3);
- hot[6] = NInstr_LoadU (na, 2, r0, NEA_RRS(na, s0, r0, 1));
- hot[7] = NInstr_SetFlagsWri (na, Nsf_CMP, r0, VA_BITS16_DEFINED);
- hot[8] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 0));
- hot[9] = NInstr_ImmW (na, r0, V_BITS64_DEFINED);
+ /* NCode [r0] = "LOADV64le_on_64" [a0] s0 {
+ hot:
+ 0 tst.w a0, #0xFFFFFFF000000007 misaligned-or-high?
+ 1 bnz cold.4 yes, goto slow path
+ 2 shr.w s0, a0, #16 s0 = pri-map-ix
+ 3 ld.64 s0, [&pri_map[0] + s0 << #3] s0 = sec-map
+ 4 and.w r0, a0, #0xFFFF r0 = sec-map-offB
+ 5 shr.w r0, r0, #3 r0 = sec-map-ix
+ 6 ld.16 r0, [s0 + r0 << #1] r0 = sec-map-VABITS16
+ 7 cmp.w r0, #0xAAAA r0 == VABITS16_DEFINED?
+ 8 bnz cold.0 no, goto cold.0
+ 9 imm.w r0, #0x0 VBITS64_DEFINED
+ 10 nop continue
+ cold:
+ 0 mov.w s0, r0 s0 = sec-map-VABITS16
+ 1 imm.w r0, #0xFFFFFFFFFFFFFFFF VBITS64_UNDEFINED
+ 2 cmp.w s0, #0x5555 s0 == VABITS16_UNDEFINED?
+ 3 bz hot.10 yes, continue
+ 4 call r0 = mc_LOADV64le_on_64_slow[..](a0) call helper
+ 5 b hot.10 continue
+ }
+ */
+ hot[0] = NInstr_SetFlagsWri (na, Nsf_TEST, a0, MASK(8));
+ hot[1] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 4));
+ hot[2] = NInstr_ShiftWri (na, Nsh_SHR, s0, a0, 16);
+ hot[3] = NInstr_LoadU (na, 8, s0, NEA_IRS(na, (HWord)&primary_map[0],
+ s0, 3));
+ hot[4] = NInstr_AluWri (na, Nalu_AND, r0, a0, 0xFFFF);
+ hot[5] = NInstr_ShiftWri (na, Nsh_SHR, r0, r0, 3);
+ hot[6] = NInstr_LoadU (na, 2, r0, NEA_RRS(na, s0, r0, 1));
+ hot[7] = NInstr_SetFlagsWri (na, Nsf_CMP, r0, VA_BITS16_DEFINED);
+ hot[8] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 0));
+ hot[9] = NInstr_ImmW (na, r0, V_BITS64_DEFINED);
hot[10] = NInstr_Nop (na);
- cold[0] = NInstr_MovW (na, s0, r0);
- cold[1] = NInstr_ImmW (na, r0, V_BITS64_UNDEFINED);
- cold[2] = NInstr_SetFlagsWri (na, Nsf_CMP, s0, VA_BITS16_UNDEFINED);
- cold[3] = NInstr_Branch (na, Ncc_Z, mkNLabel(Nlz_Hot, 10));
- cold[4] = NInstr_Call (na, rINVALID, r0, mkNRegVec1(na, a0),
- (void*)& mc_LOADV64le_on_64_slow,
- "mc_LOADV64le_on_64_slow");
- cold[5] = NInstr_Branch(na, Ncc_ALWAYS, mkNLabel(Nlz_Hot, 10));
+ cold[0] = NInstr_MovW (na, s0, r0);
+ cold[1] = NInstr_ImmW (na, r0, V_BITS64_UNDEFINED);
+ cold[2] = NInstr_SetFlagsWri (na, Nsf_CMP, s0, VA_BITS16_UNDEFINED);
+ cold[3] = NInstr_Branch (na, Ncc_Z, mkNLabel(Nlz_Hot, 10));
+ cold[4] = NInstr_Call (na, rINVALID, r0, mkNRegVec1(na, a0),
+ (void*)& mc_LOADV64le_on_64_slow,
+ "mc_LOADV64le_on_64_slow");
+ cold[5] = NInstr_Branch (na, Ncc_ALWAYS, mkNLabel(Nlz_Hot, 10));
hot[11] = cold[6] = NULL;
NCodeTemplate* tmpl
@@ -4346,29 +4370,51 @@
NReg a0 = mkNReg(Nrr_Argument, 0);
NReg s0 = mkNReg(Nrr_Scratch, 0);
- hot[0] = NInstr_SetFlagsWri (na, Nsf_TEST, a0, MASK(4));
- hot[1] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 4));
- hot[2] = NInstr_ShiftWri (na, Nsh_SHR, s0, a0, 16);
- hot[3] = NInstr_LoadU (na, 8, s0, NEA_IRS(na, (HWord)&primary_map[0],
- s0, 3));
- hot[4] = NInstr_AluWri (na, Nalu_AND, r0, a0, 0xFFFF);
- hot[5] = NInstr_ShiftWri (na, Nsh_SHR, r0, r0, 2);
- hot[6] = NInstr_LoadU (na, 1, r0, NEA_RRS(na, s0, r0, 0));
- hot[7] = NInstr_SetFlagsWri (na, Nsf_CMP, r0, VA_BITS8_DEFINED);
- hot[8] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 0));
- hot[9] = NInstr_ImmW (na, r0, 0xFFFFFFFF00000000ULL
- | (ULong)V_BITS32_DEFINED);
+ /* NCode [r0] = "LOADV32le_on_64" [a0] s0 {
+ hot:
+ 0 tst.w a0, #0xFFFFFFF000000003 misaligned-or-high?
+ 1 bnz cold.4 yes, goto slow path
+ 2 shr.w s0, a0, #16 pri-map-ix
+ 3 ld.64 s0, [&pri_map[0] + s0 << #3] sec-map
+ 4 and.w r0, a0, #0xFFFF sec-map-offB
+ 5 shr.w r0, r0, #2 sec-map-ix
+ 6 ld.8 r0, [s0 + r0 << #0] sec-map-VABITS8
+ 7 cmp.w r0, #0xAA == VABITS8_DEFINED ?
+ 8 bnz cold.0 no, goto cold.0
+ 9 imm.w r0, #0xFFFFFFFF00000000 VBITS32_DEFINED (sort of)
+ 10 nop continue
+ cold:
+ 0 mov.w s0, r0 sec-map-VABITS8
+ 1 imm.w r0, #0xFFFFFFFFFFFFFFFF VBITS32_UNDEFINED
+ 2 cmp.w s0, #0x55 s-m-VABITS8 == VABITS8_UNDEF?
+ 3 bz hot.10 yes, continue
+ 4 call r0 = mc_LOADV32le_on_64_slow[..](a0) call helper
+ 5 b hot.10 continue
+ }
+ */
+ hot[0] = NInstr_SetFlagsWri (na, Nsf_TEST, a0, MASK(4));
+ hot[1] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 4));
+ hot[2] = NInstr_ShiftWri (na, Nsh_SHR, s0, a0, 16);
+ hot[3] = NInstr_LoadU (na, 8, s0, NEA_IRS(na, (HWord)&primary_map[0],
+ s0, 3));
+ hot[4] = NInstr_AluWri (na, Nalu_AND, r0, a0, 0xFFFF);
+ hot[5] = NInstr_ShiftWri (na, Nsh_SHR, r0, r0, 2);
+ hot[6] = NInstr_LoadU (na, 1, r0, NEA_RRS(na, s0, r0, 0));
+ hot[7] = NInstr_SetFlagsWri (na, Nsf_CMP, r0, VA_BITS8_DEFINED);
+ hot[8] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 0));
+ hot[9] = NInstr_ImmW (na, r0, 0xFFFFFFFF00000000ULL
+ | (ULong)V_BITS32_DEFINED);
hot[10] = NInstr_Nop (na);
- cold[0] = NInstr_MovW (na, s0, r0);
- cold[1] = NInstr_ImmW (na, r0, 0xFFFFFFFF00000000ULL
- | (ULong)V_BITS32_UNDEFINED);
- cold[2] = NInstr_SetFlagsWri (na, Nsf_CMP, s0, VA_BITS8_UNDEFINED);
- cold[3] = NInstr_Branch (na, Ncc_Z, mkNLabel(Nlz_Hot, 10));
- cold[4] = NInstr_Call (na, rINVALID, r0, mkNRegVec1(na, a0),
- (void*)& mc_LOADV32le_on_64_slow,
- "mc_LOADV32le_on_64_slow");
- cold[5] = NInstr_Branch(na, Ncc_ALWAYS, mkNLabel(Nlz_Hot, 10));
+ cold[0] = NInstr_MovW (na, s0, r0);
+ cold[1] = NInstr_ImmW (na, r0, 0xFFFFFFFF00000000ULL
+ | (ULong)V_BITS32_UNDEFINED);
+ cold[2] = NInstr_SetFlagsWri (na, Nsf_CMP, s0, VA_BITS8_UNDEFINED);
+ cold[3] = NInstr_Branch (na, Ncc_Z, mkNLabel(Nlz_Hot, 10));
+ cold[4] = NInstr_Call (na, rINVALID, r0, mkNRegVec1(na, a0),
+ (void*)& mc_LOADV32le_on_64_slow,
+ "mc_LOADV32le_on_64_slow");
+ cold[5] = NInstr_Branch (na, Ncc_ALWAYS, mkNLabel(Nlz_Hot, 10));
hot[11] = cold[6] = NULL;
NCodeTemplate* tmpl
@@ -4377,12 +4423,107 @@
return tmpl;
}
+static NCodeTemplate* mk_tmpl__LOADV8_on_64 ( NAlloc na )
+{
+ NInstr** hot = na((11+1) * sizeof(NInstr*));
+ NInstr** cold = na((14+1) * sizeof(NInstr*));
+
+ NReg rINVALID = mkNRegINVALID();
+
+ NReg r0 = mkNReg(Nrr_Result, 0);
+ NReg a0 = mkNReg(Nrr_Argument, 0);
+ NReg s0 = mkNReg(Nrr_Scratch, 0);
+
+ /*
+ h0 tst.w a0, #0xFFFFFFF000000000 high?
+ h1 bnz cold.12 yes, goto slow path
+ h2 shr.w s0, a0, #16 s0 = pri-map-ix
+ h3 ld.64 s0, [&pri_map[0] + s0 << #3] s0 = sec-map
+ h4 and.w r0, a0, #0xFFFF r0 = sec-map-offB
+ h5 shr.w r0, r0, #2 r0 = sec-map-ix
+ h6 ld.8 r0, [s0 + r0 << #0] r0 = sec-map-VABITS8
+ h7 cmp.w r0, #0xAA r0 == VABITS8_DEFINED?
+ h8 bnz cold.0 no, goto cold.0
+ h9 imm.w r0, #0xFFFFFFFFFFFFFF00 VBITS8_DEFINED | top56safe
+ h10 nop continue
+
+ c0 cmp.w r0, #0x55 VABITS8_UNDEFINED
+ c1 bnz cold.4
+
+ c2 imm.w r0, #0xFFFFFFFFFFFFFFFF VBITS8_UNDEFINED | top56safe
+ c3 b hot.10
+
+ // r0 holds sec-map-VABITS8
+ // a0 holds the address. Extract the relevant 2 bits and inspect.
+ c4 and.w s0, a0, #3 // addr & 3
+ c5 add.w s0, s0, s0 // 2 * (addr & 3)
+ c6 shr.w r0, r0, s0 // sec-map-VABITS8 >> (2 * (addr & 3))
+ c7 and.w r0, r0, #3 // (sec-map-VABITS8 >> (2 * (addr & 3))) & 3
+
+ c8 cmp.w r0, #2 // VABITS2_DEFINED
+ c9 jz hot.9
+
+ c10 cmp.w r0, #1 // VABITS2_UNDEFINED
+ c11 jz cold.2
+
+ c12 call r0 = mc_LOADV8_on_64_slow(a0)
+ c13 b hot.10
+ */
+ hot[0] = NInstr_SetFlagsWri (na, Nsf_TEST, a0, MASK(1));
+ hot[1] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 12));
+ hot[2] = NInstr_ShiftWri (na, Nsh_SHR, s0, a0, 16);
+ hot[3] = NInstr_LoadU (na, 8, s0, NEA_IRS(na, (HWord)&primary_map[0],
+ s0, 3));
+ hot[4] = NInstr_AluWri (na, Nalu_AND, r0, a0, 0xFFFF);
+ hot[5] = NInstr_ShiftWri (na, Nsh_SHR, r0, r0, 2);
+ hot[6] = NInstr_LoadU (na, 1, r0, NEA_RRS(na, s0, r0, 0));
+ hot[7] = NInstr_SetFlagsWri (na, Nsf_CMP, r0, VA_BITS8_DEFINED);
+ hot[8] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 0));
+ hot[9] = NInstr_ImmW (na, r0, 0xFFFFFFFFFFFFFF00ULL
+ | (ULong)V_BITS8_DEFINED);
+ hot[10] = NInstr_Nop (na);
+
+ cold[0] = NInstr_SetFlagsWri (na, Nsf_CMP, r0, VA_BITS8_UNDEFINED);
+ cold[1] = NInstr_Branch (na, Ncc_NZ, mkNLabel(Nlz_Cold, 4));
+
+ cold[2] = NInstr_ImmW (na, r0, 0xFFFFFFFFFFFFFF00ULL
+ | (ULong)V_BITS8_UNDEFINED);
+ cold[3] = NInstr_Branch (na, Ncc_ALWAYS, mkNLabel(Nlz_Hot, 10));
+
+ // r0 holds sec-map-VABITS8
+ // a0 holds the address. Extract the relevant 2 bits and inspect.
+ cold[4] = NInstr_AluWri (na, Nalu_AND, s0, a0, 3);
+ cold[5] = NInstr_AluWrr (na, Nalu_ADD, s0, s0, s0);
+ cold[6] = NInstr_ShiftWrr (na, Nsh_SHR, r0, r0, s0);
+ cold[7] = NInstr_AluWri (na, Nalu_AND, r0, r0, 3);
+
+ cold[8] = NInstr_SetFlagsWri (na, Nsf_CMP, r0, 2);
+ cold[9] = NInstr_Branch (na, Ncc_Z, mkNLabel(Nlz_Hot, 9));
+
+ cold[10]= NInstr_SetFlagsWri (na, Nsf_CMP, r0, 1);
+ cold[11]= NInstr_Branch (na, Ncc_Z, mkNLabel(Nlz_Cold, 2));
+
+ cold[12]= NInstr_Call (na, rINVALID, r0, mkNRegVec1(na, a0),
+ (void*)& mc_LOADV8_on_64_slow,
+ "mc_LOADV8_on_64_slow");
+ cold[13]= NInstr_Branch (na, Ncc_ALWAYS, mkNLabel(Nlz_Hot, 10));
+
+ hot[11] = cold[14] = NULL;
+ NCodeTemplate* tmpl
+ = mkNCodeTemplate(na,"LOADV8_on_64",
+ /*res, parms, scratch*/1, 1, 1, hot, cold);
+ return tmpl;
+}
+
+
void MC_(create_ncode_templates) ( void )
{
tl_assert(MC_(tmpl__LOADV64le_on_64) == NULL);
tl_assert(MC_(tmpl__LOADV32le_on_64) == NULL);
+ tl_assert(MC_(tmpl__LOADV8_on_64) == NULL);
MC_(tmpl__LOADV64le_on_64) = mk_tmpl__LOADV64le_on_64(ncode_alloc);
MC_(tmpl__LOADV32le_on_64) = mk_tmpl__LOADV32le_on_64(ncode_alloc);
+ MC_(tmpl__LOADV8_on_64) = mk_tmpl__LOADV8_on_64(ncode_alloc);
}
@@ -4886,6 +5027,11 @@
#endif
}
+VG_REGPARM(1) static ULong mc_LOADV8_on_64_slow ( Addr a )
+{
+ return mc_LOADVn_slow( a, 8, False/*irrelevant*/ );
+}
+
VG_REGPARM(2)
void MC_(helperc_STOREV8) ( Addr a, UWord vbits8 )
Modified: branches/NCODE/memcheck/mc_translate.c
==============================================================================
--- branches/NCODE/memcheck/mc_translate.c (original)
+++ branches/NCODE/memcheck/mc_translate.c Mon Apr 13 11:49:15 2015
@@ -4648,6 +4648,19 @@
= assignNew('V', mce, Ity_I32, unop(Iop_64to32, mkexpr(datavbits64)));
return datavbits32;
}
+ if (ty == Ity_I8) {
+ /* Unconditional LOAD8 on 64 bit host. Generate inline code. */
+ IRTemp datavbits64 = newTemp(mce, Ity_I64, VSh);
+ NCodeTemplate* tmpl = MC_(tmpl__LOADV8_on_64);
+ IRAtom** args = mkIRExprVec_1( addrAct );
+ IRTemp* ress = mkIRTempVec_1( datavbits64 );
+ /* The NCode block produces a 64 bit value, but we need to
+ truncate it to 8 bits. */
+ stmt( 'V', mce, IRStmt_NCode(tmpl, args, ress) );
+ IRAtom* datavbits8
+ = assignNew('V', mce, Ity_I8, unop(Iop_64to8, mkexpr(datavbits64)));
+ return datavbits8;
+ }
/* else fall through */
}
/* ------ END inline NCode ? ------ */
From: <sv...@va...> - 2015-04-13 10:47:22
Author: sewardj
Date: Mon Apr 13 11:47:13 2015
New Revision: 3127
Log:
Add new NInstrs: NIn_ShiftWrr, NIn_AluWrr for shifts and add/and where
both operands are registers.
amd64 back end: emit code for the above 2 new NInstrs
Modified:
branches/NCODE/priv/host_amd64_defs.c
branches/NCODE/priv/host_amd64_defs.h
branches/NCODE/priv/ir_defs.c
branches/NCODE/pub/libvex_ir.h
Modified: branches/NCODE/priv/host_amd64_defs.c
==============================================================================
--- branches/NCODE/priv/host_amd64_defs.c (original)
+++ branches/NCODE/priv/host_amd64_defs.c Mon Apr 13 11:47:13 2015
@@ -4218,23 +4218,69 @@
break;
}
+ case Nin_ShiftWrr: {
+ NShift how = ni->Nin.ShiftWrr.how;
+ HReg amt = mapNReg(nregMap, ni->Nin.ShiftWrr.amt);
+ HReg src = mapNReg(nregMap, ni->Nin.ShiftWrr.srcL);
+ HReg dst = mapNReg(nregMap, ni->Nin.ShiftWrr.dst);
+
+ AMD64ShiftOp shOp = Ash_INVALID;
+ switch (how) {
+ case Nsh_SHR: shOp = Ash_SHR; break;
+ default: break;
+ }
+ vassert(shOp != Ash_INVALID);
+
+ if (!sameHReg(src, dst)) {
+ HI( mk_iMOVsd_RR(src, dst) );
+ }
+ /* Now, we have the shift amount in register |amt|. Problem
+ is that it needs to be in %rcx, but we don't know whether
+ or not that is live. Rather than do this nicely, we can
+ take advantage of the fact that r11 is a guaranteed
+ available scratch temp, and temporarily store rcx in it.
+ Note that rcx could be live even though it isn't
+ allocatable, since the insn selector uses it to put
+ variable shift amounts in. So we can't safely trash it
+ here. */
+ HI( mk_iMOVsd_RR(hregAMD64_RCX(), hregAMD64_R11()) ); // save rcx
+ HI( mk_iMOVsd_RR(amt, hregAMD64_RCX()) ); // amt->rcx
+ HI( AMD64Instr_Sh64(shOp, 0/*meaning %cl*/, dst) );
+ HI( mk_iMOVsd_RR(hregAMD64_R11(), hregAMD64_RCX()) ); // restore rcx
+ break;
+ }
+
case Nin_AluWri: {
NAlu how = ni->Nin.AluWri.how;
HReg dstR = mapNReg(nregMap, ni->Nin.AluWri.dst);
HReg srcLR = mapNReg(nregMap, ni->Nin.AluWri.srcL);
HWord imm = ni->Nin.AluWri.srcR;
- // Verified correct, but currently unused
- //if (how == Nalu_AND && fitsIn32Bits((ULong)imm)) {
- // if (!sameHReg(srcLR, dstR)) {
- // HI( mk_iMOVsd_RR(srcLR, dstR) );
- // }
- // HI( AMD64Instr_Alu64R(Aalu_AND, AMD64RMI_Imm(imm), dstR) );
- // break;
- //}
if (how == Nalu_AND && imm == 0xFFFFULL) {
HI( AMD64Instr_MovxWQ(False/*!syned*/, srcLR, dstR) );
break;
}
+ if (how == Nalu_AND && fitsIn32Bits((ULong)imm)) {
+ if (!sameHReg(srcLR, dstR)) {
+ HI( mk_iMOVsd_RR(srcLR, dstR) );
+ }
+ HI( AMD64Instr_Alu64R(Aalu_AND, AMD64RMI_Imm(imm), dstR) );
+ break;
+ }
+ goto unhandled;
+ }
+
+ case Nin_AluWrr: {
+ NAlu how = ni->Nin.AluWrr.how;
+ HReg dstR = mapNReg(nregMap, ni->Nin.AluWrr.dst);
+ HReg srcLR = mapNReg(nregMap, ni->Nin.AluWrr.srcL);
+ HReg srcRR = mapNReg(nregMap, ni->Nin.AluWrr.srcR);
+ if (how == Nalu_ADD) {
+ if (!sameHReg(srcLR, dstR)) {
+ HI( mk_iMOVsd_RR(srcLR, dstR) );
+ }
+ HI( AMD64Instr_Alu64R(Aalu_ADD, AMD64RMI_Reg(srcRR), dstR) );
+ break;
+ }
goto unhandled;
}
@@ -4345,7 +4391,6 @@
const AMD64InstrNCode* hi_details = hi->Ain.NCode.details;
const NCodeTemplate* tmpl = hi_details->tmpl;
const RRegSet* rregsLiveAfter = hi_details->rrLiveAfter;
- const RRegUniverse* univ = RRegSet__getUniverse(rregsLiveAfter);
NRegMap nregMap;
nregMap.regsR = hi_details->regsR;
Modified: branches/NCODE/priv/host_amd64_defs.h
==============================================================================
--- branches/NCODE/priv/host_amd64_defs.h (original)
+++ branches/NCODE/priv/host_amd64_defs.h Mon Apr 13 11:47:13 2015
@@ -358,7 +358,7 @@
Ain_Imm64, /* Generate 64-bit literal to register */
Ain_Alu64R, /* 64-bit mov/arith/logical, dst=REG */
Ain_Alu64M, /* 64-bit mov/arith/logical, dst=MEM */
- Ain_Sh64, /* 64-bit shift/rotate, dst=REG or MEM */
+ Ain_Sh64, /* 64-bit shift, dst=REG, by imm or %cl */
Ain_Test64, /* 64-bit test (AND, set flags, discard result) */
Ain_Unary64, /* 64-bit not and neg */
Ain_Lea64, /* 64-bit compute EA into a reg */
Modified: branches/NCODE/priv/ir_defs.c
==============================================================================
--- branches/NCODE/priv/ir_defs.c (original)
+++ branches/NCODE/priv/ir_defs.c Mon Apr 13 11:47:13 2015
@@ -77,6 +77,7 @@
static const HChar* nameNAlu ( NAlu nal ) {
switch (nal) {
case Nalu_AND: return "and";
+ case Nalu_ADD: return "add";
default: return "nameNAlu???";
}
}
@@ -182,6 +183,14 @@
ppNReg(ni->Nin.ShiftWri.srcL);
vex_printf(", #%u", (UInt)ni->Nin.ShiftWri.amt);
break;
+ case Nin_ShiftWrr:
+ vex_printf("%s.w ", nameNShift(ni->Nin.ShiftWrr.how));
+ ppNReg(ni->Nin.ShiftWrr.dst);
+ vex_printf(", ");
+ ppNReg(ni->Nin.ShiftWrr.srcL);
+ vex_printf(", ");
+ ppNReg(ni->Nin.ShiftWrr.amt);
+ break;
case Nin_AluWri:
vex_printf("%s.w ", nameNAlu(ni->Nin.AluWri.how));
ppNReg(ni->Nin.AluWri.dst);
@@ -189,6 +198,14 @@
ppNReg(ni->Nin.AluWri.srcL);
vex_printf(", #0x%llx", (ULong)ni->Nin.AluWri.srcR);
break;
+ case Nin_AluWrr:
+ vex_printf("%s.w ", nameNAlu(ni->Nin.AluWrr.how));
+ ppNReg(ni->Nin.AluWrr.dst);
+ vex_printf(", ");
+ ppNReg(ni->Nin.AluWrr.srcL);
+ vex_printf(", ");
+ ppNReg(ni->Nin.AluWrr.srcR);
+ break;
case Nin_SetFlagsWri:
vex_printf("%s.w ", nameNSetFlags(ni->Nin.SetFlagsWri.how));
ppNReg(ni->Nin.SetFlagsWri.srcL);
@@ -2064,6 +2081,17 @@
in->Nin.ShiftWri.amt = amt;
return in;
}
+NInstr* NInstr_ShiftWrr ( NAlloc na,
+ NShift how, NReg dst, NReg srcL, NReg amt )
+{
+ NInstr* in = na(sizeof(NInstr));
+ in->tag = Nin_ShiftWrr;
+ in->Nin.ShiftWrr.how = how;
+ in->Nin.ShiftWrr.dst = dst;
+ in->Nin.ShiftWrr.srcL = srcL;
+ in->Nin.ShiftWrr.amt = amt;
+ return in;
+}
NInstr* NInstr_AluWri ( NAlloc na, NAlu how, NReg dst, NReg srcL, HWord srcR )
{
NInstr* in = na(sizeof(NInstr));
@@ -2074,6 +2102,16 @@
in->Nin.AluWri.srcR = srcR;
return in;
}
+NInstr* NInstr_AluWrr ( NAlloc na, NAlu how, NReg dst, NReg srcL, NReg srcR )
+{
+ NInstr* in = na(sizeof(NInstr));
+ in->tag = Nin_AluWrr;
+ in->Nin.AluWrr.how = how;
+ in->Nin.AluWrr.dst = dst;
+ in->Nin.AluWrr.srcL = srcL;
+ in->Nin.AluWrr.srcR = srcR;
+ return in;
+}
NInstr* NInstr_SetFlagsWri ( NAlloc na, NSetFlags how, NReg srcL, HWord srcR )
{
NInstr* in = na(sizeof(NInstr));
Modified: branches/NCODE/pub/libvex_ir.h
==============================================================================
--- branches/NCODE/pub/libvex_ir.h (original)
+++ branches/NCODE/pub/libvex_ir.h Mon Apr 13 11:47:13 2015
@@ -2673,7 +2673,8 @@
typedef
enum {
- Nalu_AND=0x1D30
+ Nalu_AND=0x1D30,
+ Nalu_ADD
}
NAlu;
@@ -2766,7 +2767,9 @@
Nin_Call,
Nin_ImmW,
Nin_ShiftWri,
+ Nin_ShiftWrr,
Nin_AluWri,
+ Nin_AluWrr,
Nin_SetFlagsWri,
Nin_MovW,
Nin_LoadU,
@@ -2802,12 +2805,24 @@
UInt amt; /* 1 .. host-word-size-1 only */
} ShiftWri;
struct {
+ NShift how;
+ NReg dst;
+ NReg srcL;
+ NReg amt; /* 0 .. host-word-size-1 only */
+ } ShiftWrr;
+ struct {
NAlu how;
NReg dst;
NReg srcL;
HWord srcR;
} AluWri;
struct {
+ NAlu how;
+ NReg dst;
+ NReg srcL;
+ NReg srcR;
+ } AluWrr;
+ struct {
NSetFlags how;
NReg srcL;
HWord srcR;
@@ -2839,8 +2854,12 @@
extern NInstr* NInstr_ImmW ( NAlloc na, NReg dst, HWord imm );
extern NInstr* NInstr_ShiftWri ( NAlloc na,
NShift how, NReg dst, NReg srcL, UInt amt );
+extern NInstr* NInstr_ShiftWrr ( NAlloc na,
+ NShift how, NReg dst, NReg srcL, NReg amt );
extern NInstr* NInstr_AluWri ( NAlloc na,
NAlu how, NReg dst, NReg srcL, HWord srcR );
+extern NInstr* NInstr_AluWrr ( NAlloc na,
+ NAlu how, NReg dst, NReg srcL, NReg srcR );
extern NInstr* NInstr_SetFlagsWri ( NAlloc na,
NSetFlags how, NReg srcL, HWord srcR );
extern NInstr* NInstr_MovW ( NAlloc na, NReg dst, NReg src );
From: Mark P. <pa...@un...> - 2015-04-12 23:19:15
Hi there,
It looks like valgrind HEAD is busted for those of us living on bleeding-edge 10.10.
The two main issues are that the configure.ac file is a bit too strict in its Apple LLVM
version check, and someone made a breaking change to the initimg interfaces without
making sure to un-bust initimg-darwin.c.
The following patch worked like a charm for me:
Index: configure.ac
===================================================================
--- configure.ac (revision 15085)
+++ configure.ac (working copy)
@@ -154,7 +154,7 @@
# Note: m4 arguments are quoted with [ and ] so square brackets in shell
# statements have to be quoted.
case "${is_clang}-${gcc_version}" in
- applellvm-5.1|applellvm-6.0*)
+ applellvm-5.1|applellvm-6.*)
AC_MSG_RESULT([ok (Apple LLVM version ${gcc_version})])
;;
icc-1[[3-9]].*)
Index: coregrind/m_initimg/initimg-darwin.c
===================================================================
--- coregrind/m_initimg/initimg-darwin.c (revision 15085)
+++ coregrind/m_initimg/initimg-darwin.c (working copy)
@@ -312,7 +312,8 @@
HChar** orig_envp,
const ExeInfo* info,
Addr clstack_end,
- SizeT clstack_max_size )
+ SizeT clstack_max_size,
+ const VexArchInfo* vex_archinfo )
{
HChar **cpp;
HChar *strtab; /* string table */
@@ -508,7 +509,8 @@
/*====================================================================*/
/* Create the client's initial memory image. */
-IIFinaliseImageInfo VG_(ii_create_image)( IICreateImageInfo iicii )
+IIFinaliseImageInfo VG_(ii_create_image)( IICreateImageInfo iicii,
+ const VexArchInfo* vex_archinfo )
{
ExeInfo info;
VG_(memset)( &info, 0, sizeof(info) );
@@ -548,7 +550,8 @@
iifii.initial_client_SP =
setup_client_stack( iicii.argv - 1, env, &info,
- iicii.clstack_end, iifii.clstack_max_size );
+ iicii.clstack_end, iifii.clstack_max_size,
+ vex_archinfo );
VG_(free)(env);
thanks,
-Pauley
From: Ivo R. <iv...@iv...> - 2015-04-12 19:45:03
2015-04-09 22:03 GMT+02:00 Philippe Waroquiers <phi...@sk...>:
> On Thu, 2015-04-09 at 21:58 +0200, Florian Krohm wrote:
> > If you want to beat me at it feel free to do so :)
> This crashes in a very special case, no urgency :)

I have a patch including test cases ready for review in
https://bugs.kde.org/show_bug.cgi?id=345887
After all, it was not so difficult.
I.
From: Matthias S. <zz...@ge...> - 2015-04-12 19:01:15
On 10.04.2015 20:56, Philippe Waroquiers wrote:
> On Fri, 2015-04-10 at 06:47 +0200, Matthias Schwarzott wrote:
>
>> The check "if (fp_min + 256 >= fp_max)" in coregrind/m_stacktrace.c:501
>> is triggered here.
>>
>> By changing it to "if (fp_min + 128 >= fp_max)" it can be fixed.
>>
>> I think amd64 is having problems here because some functions do not need
>> additional local variables but can use the redzone, so the stackframes
>> are small.
> Thanks for this analysis.
> Some days ago, I looked at the history of this check.
> I saw that the value was already decreased in the past for amd64.
> This check is also disabled for Darwin.
>
> For x86/ppc32/ppc64/arm/arm64, the value is still 512.
>
> s390x/mips32/mips64/tilegx have no such condition.
>
> It is not clear to me what is the purpose of this check.
> I did not find an explanation in the svn history.
>
> I am wondering if that had not as objective to avoid SEGV for bogus
> stack pointers.
> There was some recent work done to avoid SEGV (e.g. by obtaining
> better stack limits for unwinding).
>
> So, I am not sure that these checks are still useful.
> Rather, they might only harm in case we have small but valid
> stack frames.

Maybe the number should be dropped in total. Then only the check
"fp_min >= fp_max" would remain.

Even if I tried, I did not get the whole idea of how stack registering
and (main) thread extension works.

Regards
Matthias
From: Matthias S. <zz...@ge...> - 2015-04-12 18:54:01
Hi there!
When executing valgrind automatically on a server, I sometimes wonder if
a process did finish successfully or did call abort (or was killed in
some other way).
When running valgrind with the option "-v", it prints this:
==10481== Process terminating with default action of signal 6 (SIGABRT)
==10481== at 0x5085137: kill (syscall-template.S:81)
==10481== by 0x40081B: main (gone.c:26)
But "-v" is too verbose for normal runs :)
So I suggest always writing this, and additionally also emitting it in the
XML output.
I did some experiments. What do you think about something like this in
the XML file:
<fatal_signal>
<tid>1</tid>
<signo>6</signo>
<signame>SIGABRT</signame>
<stack>
<frame>
<ip>0x5084137</ip>
<obj>/lib64/libc-2.20.so</obj>
<fn>kill</fn>
<dir>/var/tmp/portage/sys-libs/glibc-2.20-r2/work/glibc-2.20/signal/../sysdeps/unix</dir>
<file>syscall-template.S</file>
<line>81</line>
</frame>
<frame>
<ip>0x40081B</ip>
<obj>/home/matze/development/valgrind.git/gdbserver_tests/gone</obj>
<fn>main</fn>
<dir>/home/matze/development/valgrind.git/gdbserver_tests</dir>
<file>gone.c</file>
<line>26</line>
</frame>
</stack>
</fatal_signal>
Regards
Matthias
Author: sewardj
Date: Sun Apr 12 10:23:58 2015
New Revision: 3126
Log:
Tidyups, no functional change:
* Create RRegSets for caller-saved and callee-saved registers on
amd64, so as to create a single point of reference for that info.
Plumb to use sites.
* Pull out and abstractify logic to compute the set of registers
to spill around NCode calls (calcRegistersToPreserveAroundNCodeCall)
so it becomes arch neutral and move it to host_generic_regs.c.
* fix stupid error in RRegSet__fromVec
Modified:
branches/NCODE/priv/host_amd64_defs.c
branches/NCODE/priv/host_amd64_defs.h
branches/NCODE/priv/host_amd64_isel.c
branches/NCODE/priv/host_generic_regs.c
branches/NCODE/priv/host_generic_regs.h
branches/NCODE/priv/main_main.c
branches/NCODE/priv/main_util.h
Modified: branches/NCODE/priv/host_amd64_defs.c
==============================================================================
--- branches/NCODE/priv/host_amd64_defs.c (original)
+++ branches/NCODE/priv/host_amd64_defs.c Sun Apr 12 10:23:58 2015
@@ -101,6 +101,75 @@
}
+/* Returns the registers in the AMD64 universe that are caller saved.
+ This is really ABI dependent, but we ignore that detail here. */
+static const RRegSet* getRRegsCallerSaved_AMD64 ( void )
+{
+ /* In theory gcc should be able to fold this into a single 64 bit
+ constant (bitset). But that's a bit risky, so instead do
+ thread-unsafe lazy initialisation (sigh). */
+ static RRegSet callerSavedRegs;
+ static Bool callerSavedRegs_initted = False;
+
+ if (LIKELY(callerSavedRegs_initted))
+ return &callerSavedRegs;
+
+ RRegSet__init(&callerSavedRegs, getRRegUniverse_AMD64());
+
+ RRegSet__add(&callerSavedRegs, hregAMD64_RAX());
+ RRegSet__add(&callerSavedRegs, hregAMD64_RCX());
+ RRegSet__add(&callerSavedRegs, hregAMD64_RDX());
+ RRegSet__add(&callerSavedRegs, hregAMD64_RSI());
+ RRegSet__add(&callerSavedRegs, hregAMD64_RDI());
+ RRegSet__add(&callerSavedRegs, hregAMD64_R8());
+ RRegSet__add(&callerSavedRegs, hregAMD64_R9());
+ RRegSet__add(&callerSavedRegs, hregAMD64_R10());
+ RRegSet__add(&callerSavedRegs, hregAMD64_R11());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM0());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM1());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM3());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM4());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM5());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM6());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM7());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM8());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM9());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM10());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM11());
+ RRegSet__add(&callerSavedRegs, hregAMD64_XMM12());
+
+ callerSavedRegs_initted = True;
+ return &callerSavedRegs;
+}
+
+
+/* Returns the registers in the AMD64 universe that are callee saved.
+ This is really ABI dependent, but we ignore that detail here. */
+static const RRegSet* getRRegsCalleeSaved_AMD64 ( void )
+{
+ /* In theory gcc should be able to fold this into a single 64 bit
+ constant (bitset). But that's a bit risky, so instead do
+ thread-unsafe lazy initialisation (sigh). */
+ static RRegSet calleeSavedRegs;
+ static Bool calleeSavedRegs_initted = False;
+
+ if (LIKELY(calleeSavedRegs_initted))
+ return &calleeSavedRegs;
+
+ RRegSet__init(&calleeSavedRegs, getRRegUniverse_AMD64());
+
+ RRegSet__add(&calleeSavedRegs, hregAMD64_RBX());
+ RRegSet__add(&calleeSavedRegs, hregAMD64_RBP());
+ RRegSet__add(&calleeSavedRegs, hregAMD64_R12());
+ RRegSet__add(&calleeSavedRegs, hregAMD64_R13());
+ RRegSet__add(&calleeSavedRegs, hregAMD64_R14());
+ RRegSet__add(&calleeSavedRegs, hregAMD64_R15());
+
+ calleeSavedRegs_initted = True;
+ return &calleeSavedRegs;
+}
+
+
void ppHRegAMD64 ( HReg reg )
{
Int r;
@@ -1548,31 +1617,9 @@
/* This is a bit subtle. */
/* First off, claim it trashes all the caller-saved regs
which fall within the register allocator's jurisdiction.
- These I believe to be: rax rcx rdx rsi rdi r8 r9 r10 r11
- and all the xmm registers.
+ These I believe to be: rsi rdi r8 r9 r10 xmm3..xmm12.
*/
- addHRegUse(u, HRmWrite, hregAMD64_RAX());
- addHRegUse(u, HRmWrite, hregAMD64_RCX());
- addHRegUse(u, HRmWrite, hregAMD64_RDX());
- addHRegUse(u, HRmWrite, hregAMD64_RSI());
- addHRegUse(u, HRmWrite, hregAMD64_RDI());
- addHRegUse(u, HRmWrite, hregAMD64_R8());
- addHRegUse(u, HRmWrite, hregAMD64_R9());
- addHRegUse(u, HRmWrite, hregAMD64_R10());
- addHRegUse(u, HRmWrite, hregAMD64_R11());
- addHRegUse(u, HRmWrite, hregAMD64_XMM0());
- addHRegUse(u, HRmWrite, hregAMD64_XMM1());
- addHRegUse(u, HRmWrite, hregAMD64_XMM3());
- addHRegUse(u, HRmWrite, hregAMD64_XMM4());
- addHRegUse(u, HRmWrite, hregAMD64_XMM5());
- addHRegUse(u, HRmWrite, hregAMD64_XMM6());
- addHRegUse(u, HRmWrite, hregAMD64_XMM7());
- addHRegUse(u, HRmWrite, hregAMD64_XMM8());
- addHRegUse(u, HRmWrite, hregAMD64_XMM9());
- addHRegUse(u, HRmWrite, hregAMD64_XMM10());
- addHRegUse(u, HRmWrite, hregAMD64_XMM11());
- addHRegUse(u, HRmWrite, hregAMD64_XMM12());
-
+ addHRegUse_from_RRegSet(u, HRmWrite, getRRegsCallerSaved_AMD64());
/* Now we have to state any parameter-carrying registers
which might be read. This depends on the regparmness. */
switch (i->Ain.Call.regparms) {
@@ -3981,251 +4028,9 @@
so it's already out of commission as far as regalloc is concerned.
So we can safely use it here, when needed. */
-/* A handy structure to hold the register environment. */
-typedef
- struct {
- UInt nRegsR;
- const HReg* regsR;
- UInt nRegsA;
- const HReg* regsA;
- UInt nRegsS;
- const HReg* regsS;
- }
- NRegMap;
-
-/* fwds */
-static void emit_AMD64NInstr ( /*MOD*/AssemblyBuffer* ab,
- /*MOD*/RelocationBuffer* rb,
- const NInstr* ni,
- const NRegMap* nregMap,
- const RRegSet* rrLiveAfter,
- /* for debug printing only */
- Bool verbose, NLabel niLabel );
-
-static UInt hregVecLen ( const HReg* vec )
-{
- UInt i;
- for (i = 0; !hregIsInvalid(vec[i]); i++)
- ;
- return i;
-}
-
-/* Generate the AMD64 NCode instruction |hi| into |ab_hot| and
- |ab_cold|. This can only handle NCode blocks. All other AMD64
- instructions are to be handled by emit_AMD64Instr. This is
- required to generate <= 1024 bytes of code. Returns True if OK,
- False if not enough buffer space. */
-
-Bool emit_AMD64NCode ( /*MOD*/AssemblyBuffer* ab_hot,
- /*MOD*/AssemblyBuffer* ab_cold,
- /*MOD*/RelocationBuffer* rb,
- const AMD64Instr* hi,
- Bool mode64, VexEndness endness_host,
- Bool verbose )
-{
- vassert(mode64 == True);
- vassert(endness_host == VexEndnessLE);
- vassert(hi->tag == Ain_NCode);
-
- const AMD64InstrNCode* hi_details = hi->Ain.NCode.details;
- const NCodeTemplate* tmpl = hi_details->tmpl;
- const RRegSet* rregsLiveAfter = hi_details->rrLiveAfter;
- const RRegUniverse* univ = RRegSet__getUniverse(rregsLiveAfter);
-
- NRegMap nregMap;
- nregMap.regsR = hi_details->regsR;
- nregMap.regsA = hi_details->regsA;
- nregMap.regsS = hi_details->regsS;
- nregMap.nRegsR = tmpl->nres;
- nregMap.nRegsA = tmpl->narg;
- nregMap.nRegsS = tmpl->nscr;
-
- vassert(hregVecLen(nregMap.regsR) == nregMap.nRegsR);
- vassert(hregVecLen(nregMap.regsA) == nregMap.nRegsA);
- vassert(hregVecLen(nregMap.regsS) == nregMap.nRegsS);
-
- if (AssemblyBuffer__getRemainingSize(ab_hot) < 1024)
- return False;
- if (AssemblyBuffer__getRemainingSize(ab_cold) < 1024)
- return False;
- if (RelocationBuffer__getRemainingSize(rb) < 128)
- return False;
-
- /* Count how many hot and cold instructions (NInstrs) the template
- has, since we'll need to allocate temporary arrays to keep track
- of the label offsets. */
- UInt nHot, nCold;
- for (nHot = 0; tmpl->hot[nHot]; nHot++)
- ;
- for (nCold = 0; tmpl->cold[nCold]; nCold++)
- ;
-
- /* Here are our two arrays for tracking the AssemblyBuffer offsets
- of the NCode instructions. */
- UInt i;
- UInt offsetsHot[nHot];
- UInt offsetsCold[nCold];
- for (i = 0; i < nHot; i++) offsetsHot[i] = 0;
- for (i = 0; i < nCold; i++) offsetsCold[i] = 0;
-
- /* We'll be adding entries to the relocation buffer, |rb|, and will
- need to adjust their |dst| fields after generation of the hot
- and cold code. Record therefore where we are in the buffer now,
- so that we can iterate over the new entries later. */
- UInt rb_first = RelocationBuffer__getNext(rb);
-
- /* Generate the hot code */
- for (i = 0; i < nHot; i++) {
- offsetsHot[i] = AssemblyBuffer__getNext(ab_hot);
- NLabel lbl = mkNLabel(Nlz_Hot, i);
- emit_AMD64NInstr(ab_hot, rb, tmpl->hot[i], &nregMap,
- rregsLiveAfter, verbose, lbl);
- }
-
- /* And the cold code */
- for (i = 0; i < nCold; i++) {
- offsetsCold[i] = AssemblyBuffer__getNext(ab_cold);
- NLabel lbl = mkNLabel(Nlz_Cold, i);
- emit_AMD64NInstr(ab_cold, rb, tmpl->cold[i], &nregMap,
- rregsLiveAfter, verbose, lbl);
- }
-
- /* Now visit the new relocation entries. */
- UInt rb_last1 = RelocationBuffer__getNext(rb);
-
- for (i = rb_first; i < rb_last1; i++) {
- Relocation* reloc = &rb->buf[i];
-
- /* Show the reloc before the label-to-offset transformation. */
- if (verbose) {
- vex_printf(" reloc: ");
- ppRelocation(reloc);
- vex_printf("\n");
- }
-
- /* Transform the destination component of |reloc| so that it no
- longer refers to a label but rather to an offset in the hot
- or cold assembly buffer. */
- vassert(!reloc->dst.isOffset);
- reloc->dst.isOffset = True;
-
- if (reloc->dst.zone == Nlz_Hot) {
- vassert(reloc->dst.num < nHot);
- reloc->dst.num = offsetsHot[reloc->dst.num];
- } else {
- vassert(reloc->dst.zone == Nlz_Cold);
- vassert(reloc->dst.num < nCold);
- reloc->dst.num = offsetsCold[reloc->dst.num];
- }
-
- /* Show the reloc after the label-to-offset transformation. */
- if (verbose) {
- vex_printf(" reloc: ");
- ppRelocation(reloc);
- vex_printf("\n");
- }
- }
-
- if (0) {
- HReg r10 = hregAMD64_R10();
- HReg rax = hregAMD64_RAX();
- HReg rbx = hregAMD64_RBX();
- HReg rcx = hregAMD64_RCX();
- HReg rdx = hregAMD64_RDX();
-
- RRegSet* rs = RRegSet__new(univ);
- vex_printf("\n__new\n");
- vex_printf("1: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- vex_printf("\n__add\n");
- RRegSet__add(rs, rbx);
- vex_printf("2: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__add(rs, rdx);
- vex_printf("3: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__add(rs, rcx);
- vex_printf("4: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__add(rs, rcx);
- vex_printf("5: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__add(rs, r10);
- vex_printf("6: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__add(rs, rax);
- vex_printf("7: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- vex_printf("\n__fromVec\n");
- const HReg vec[4] = { rdx, rcx, rbx, rax };
- RRegSet__fromVec(rs, vec, 0);
- vex_printf("8: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__fromVec(rs, vec, 4);
- vex_printf("9: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- vex_printf("\n__del\n");
- RRegSet__del(rs, rcx);
- vex_printf("10: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__del(rs, rcx);
- vex_printf("11: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__del(rs, rbx);
- vex_printf("12: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__del(rs, rax);
- vex_printf("13: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__del(rs, rdx);
- vex_printf("14: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- RRegSet__del(rs, rdx);
- vex_printf("15: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
-
- vex_printf("\n__plus\n");
- RRegSet* rs2 = RRegSet__new(univ);
- RRegSet__add(rs, r10); RRegSet__add(rs, rax);
- RRegSet__add(rs2, rbx); RRegSet__add(rs2, rcx); RRegSet__add(rs2, rax);
-
- RRegSet__plus(rs2, rs);
- vex_printf("16a: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
- vex_printf("16b: "); RRegSet__pp(rs2, ppHRegAMD64); vex_printf("\n");
-
- vex_printf("\n__minus\n");
- RRegSet__minus(rs, rs2);
- vex_printf("17: "); RRegSet__pp(rs, ppHRegAMD64); vex_printf("\n");
-
- }
-
- return True;
-}
-
-/* Find the real (hard) register for |r| by looking up in |map|. */
-static HReg mapNReg ( const NRegMap* map, NReg r )
-{
- UInt limit = 0;
- const HReg* arr = NULL;
- switch (r.role) {
- case Nrr_Result: limit = map->nRegsR; arr = map->regsR; break;
- case Nrr_Argument: limit = map->nRegsA; arr = map->regsA; break;
- case Nrr_Scratch: limit = map->nRegsS; arr = map->regsS; break;
- default: vpanic("mapNReg: invalid reg role");
- }
- vassert(r.num < limit);
- return arr[r.num];
-}
-
-/* ***FIXME*** this is an exact copy of the same in host_amd64_isel.c. */
-static AMD64Instr* mk_iMOVsd_RR ( HReg src, HReg dst )
-{
- vassert(hregClass(src) == HRcInt64);
- vassert(hregClass(dst) == HRcInt64);
- return AMD64Instr_Alu64R(Aalu_MOV, AMD64RMI_Reg(src), dst);
-}
-
-
+/* Emits AMD64 code for a single NInstr |ni| into |ab|, possibly
+ adding relocation information into |rb| too.
+*/
static
void emit_AMD64NInstr ( /*MOD*/AssemblyBuffer* ab,
/*MOD*/RelocationBuffer* rb,
@@ -4298,78 +4103,15 @@
}
case Nin_Call: {
- /* The main difficulty here is to figure out the minimal set
- of registers to save across the call. As far as I can see, the
- set is:
-
- (1) registers live after this NCode block
- (2) + the Arg, Res and Scratch registers for this block
- (3) - Abi_Callee_Saved registers
- (4) - the Arg/Res/Scratch register(s) into which this call
- will place its results
-
- (1) because that's the set of regs that reg-alloc expects to
- not be trashed by the NCode block
- (2) because Arg/Res/Scratch regs can be used freely within the
- NCode block, so we have to keep them alive
- (3) because preserving Callee saved regs is obviously pointless
- (4) because preserving the call's result reg(s) will result in
- the restore sequence overwriting the result of the call
-
- Figuring out (1) is tricky and is something that reg-alloc
- needs to tell us. I think it's safe to start with an
- overestimate of (1) -- for example, all regs available to
- reg-alloc -- and refine it later.
- */
- const RRegUniverse* univ = RRegSet__getUniverse(hregsLiveAfter);
- const RRegSet* set_1 = hregsLiveAfter;
-
- RRegSet* set_2 = RRegSet__new(univ);
- { UInt i;
- for (i = 0; i < nregMap->nRegsR; i++)
- RRegSet__add(set_2, nregMap->regsR[i]);
- for (i = 0; i < nregMap->nRegsA; i++)
- RRegSet__add(set_2, nregMap->regsA[i]);
- for (i = 0; i < nregMap->nRegsS; i++)
- RRegSet__add(set_2, nregMap->regsS[i]);
- }
-
- RRegSet* set_3 = RRegSet__new(univ);
- // callee-saves: rbx rbp r12 r13 r14 r15
- { HReg vec[6];
- vec[0] = hregAMD64_RBX(); vec[1] = hregAMD64_RBP();
- vec[2] = hregAMD64_R12(); vec[3] = hregAMD64_R13();
- vec[4] = hregAMD64_R14(); vec[5] = hregAMD64_R15();
- RRegSet__fromVec(set_3, vec, sizeof(vec)/sizeof(vec[0]));
- }
-
- RRegSet* set_4 = RRegSet__new(univ);
- if (!isNRegINVALID(ni->Nin.Call.resHi))
- RRegSet__add(set_4, mapNReg(nregMap, ni->Nin.Call.resHi));
- if (!isNRegINVALID(ni->Nin.Call.resLo))
- RRegSet__add(set_4, mapNReg(nregMap, ni->Nin.Call.resLo));
-
- RRegSet* to_preserve = RRegSet__new(univ);
- RRegSet__copy(to_preserve, set_1);
- RRegSet__plus(to_preserve, set_2);
- RRegSet__minus(to_preserve, set_3);
- RRegSet__minus(to_preserve, set_4);
-
- if (verbose) {
- vex_printf(" # set1: ");
- RRegSet__pp(set_1, ppHRegAMD64); vex_printf("\n");
- vex_printf(" # set2: ");
- RRegSet__pp(set_2, ppHRegAMD64); vex_printf("\n");
- vex_printf(" # set3: ");
- RRegSet__pp(set_3, ppHRegAMD64); vex_printf("\n");
- vex_printf(" # set4: ");
- RRegSet__pp(set_4, ppHRegAMD64); vex_printf("\n");
- vex_printf(" # pres: ");
- RRegSet__pp(to_preserve, ppHRegAMD64); vex_printf("\n");
- }
+ RRegSet to_preserve;
+ calcRegistersToPreserveAroundNCodeCall(
+ &to_preserve,
+ hregsLiveAfter, getRRegsCalleeSaved_AMD64(), nregMap,
+ ni->Nin.Call.resHi, ni->Nin.Call.resLo
+ );
/* Save live regs */
- UInt n_to_preserve = RRegSet__card(to_preserve);
+ UInt n_to_preserve = RRegSet__card(&to_preserve);
vassert(n_to_preserve < 25); /* stay sane */
/* Figure out how much to move the stack, ensuring any alignment up
@@ -4382,7 +4124,7 @@
}
RRegSetIterator* iter = RRegSetIterator__new();
- RRegSetIterator__init(iter, to_preserve);
+ RRegSetIterator__init(iter, &to_preserve);
UInt slotNo = 0;
while (True) {
HReg r = RRegSetIterator__next(iter);
@@ -4426,7 +4168,7 @@
}
/* Restore live regs */
- RRegSetIterator__init(iter, to_preserve);
+ RRegSetIterator__init(iter, &to_preserve);
slotNo = 0;
while (True) {
HReg r = RRegSetIterator__next(iter);
@@ -4582,6 +4324,127 @@
}
+/* Emits AMD64 code for the complete NCode block |hi| into |ab_hot|
+ and |ab_cold|, possibly adding relocation information to |rb| too.
+ This function can only handle NCode blocks. All other AMD64
+ instructions are to be handled by emit_AMD64Instr. This function
+ is required to generate <= 1024 bytes of code. Returns True if OK,
+ False if not enough buffer space.
+*/
+Bool emit_AMD64NCodeBlock ( /*MOD*/AssemblyBuffer* ab_hot,
+ /*MOD*/AssemblyBuffer* ab_cold,
+ /*MOD*/RelocationBuffer* rb,
+ const AMD64Instr* hi,
+ Bool mode64, VexEndness endness_host,
+ Bool verbose )
+{
+ vassert(mode64 == True);
+ vassert(endness_host == VexEndnessLE);
+ vassert(hi->tag == Ain_NCode);
+
+ const AMD64InstrNCode* hi_details = hi->Ain.NCode.details;
+ const NCodeTemplate* tmpl = hi_details->tmpl;
+ const RRegSet* rregsLiveAfter = hi_details->rrLiveAfter;
+ const RRegUniverse* univ = RRegSet__getUniverse(rregsLiveAfter);
+
+ NRegMap nregMap;
+ nregMap.regsR = hi_details->regsR;
+ nregMap.regsA = hi_details->regsA;
+ nregMap.regsS = hi_details->regsS;
+ nregMap.nRegsR = tmpl->nres;
+ nregMap.nRegsA = tmpl->narg;
+ nregMap.nRegsS = tmpl->nscr;
+
+ vassert(hregVecLen(nregMap.regsR) == nregMap.nRegsR);
+ vassert(hregVecLen(nregMap.regsA) == nregMap.nRegsA);
+ vassert(hregVecLen(nregMap.regsS) == nregMap.nRegsS);
+
+ if (AssemblyBuffer__getRemainingSize(ab_hot) < 1024)
+ return False;
+ if (AssemblyBuffer__getRemainingSize(ab_cold) < 1024)
+ return False;
+ if (RelocationBuffer__getRemainingSize(rb) < 128)
+ return False;
+
+ /* Count how many hot and cold instructions (NInstrs) the template
+ has, since we'll need to allocate temporary arrays to keep track
+ of the label offsets. */
+ UInt nHot, nCold;
+ for (nHot = 0; tmpl->hot[nHot]; nHot++)
+ ;
+ for (nCold = 0; tmpl->cold[nCold]; nCold++)
+ ;
+
+ /* Here are our two arrays for tracking the AssemblyBuffer offsets
+ of the NCode instructions. */
+ UInt i;
+ UInt offsetsHot[nHot];
+ UInt offsetsCold[nCold];
+ for (i = 0; i < nHot; i++) offsetsHot[i] = 0;
+ for (i = 0; i < nCold; i++) offsetsCold[i] = 0;
+
+ /* We'll be adding entries to the relocation buffer, |rb|, and will
+ need to adjust their |dst| fields after generation of the hot
+ and cold code. Record therefore where we are in the buffer now,
+ so that we can iterate over the new entries later. */
+ UInt rb_first = RelocationBuffer__getNext(rb);
+
+ /* Generate the hot code */
+ for (i = 0; i < nHot; i++) {
+ offsetsHot[i] = AssemblyBuffer__getNext(ab_hot);
+ NLabel lbl = mkNLabel(Nlz_Hot, i);
+ emit_AMD64NInstr(ab_hot, rb, tmpl->hot[i], &nregMap,
+ rregsLiveAfter, verbose, lbl);
+ }
+
+ /* And the cold code */
+ for (i = 0; i < nCold; i++) {
+ offsetsCold[i] = AssemblyBuffer__getNext(ab_cold);
+ NLabel lbl = mkNLabel(Nlz_Cold, i);
+ emit_AMD64NInstr(ab_cold, rb, tmpl->cold[i], &nregMap,
+ rregsLiveAfter, verbose, lbl);
+ }
+
+ /* Now visit the new relocation entries. */
+ UInt rb_last1 = RelocationBuffer__getNext(rb);
+
+ for (i = rb_first; i < rb_last1; i++) {
+ Relocation* reloc = &rb->buf[i];
+
+ /* Show the reloc before the label-to-offset transformation. */
+ if (verbose) {
+ vex_printf(" reloc: ");
+ ppRelocation(reloc);
+ vex_printf("\n");
+ }
+
+ /* Transform the destination component of |reloc| so that it no
+ longer refers to a label but rather to an offset in the hot
+ or cold assembly buffer. */
+ vassert(!reloc->dst.isOffset);
+ reloc->dst.isOffset = True;
+
+ if (reloc->dst.zone == Nlz_Hot) {
+ vassert(reloc->dst.num < nHot);
+ reloc->dst.num = offsetsHot[reloc->dst.num];
+ } else {
+ vassert(reloc->dst.zone == Nlz_Cold);
+ vassert(reloc->dst.num < nCold);
+ reloc->dst.num = offsetsCold[reloc->dst.num];
+ }
+
+ /* Show the reloc after the label-to-offset transformation. */
+ if (verbose) {
+ vex_printf(" reloc: ");
+ ppRelocation(reloc);
+ vex_printf("\n");
+ }
+ }
+
+ return True;
+}
+
+
/* --------- Helpers for translation chaining. --------- */
/* How big is an event check? See case for Ain_EvCheck in
Modified: branches/NCODE/priv/host_amd64_defs.h
==============================================================================
--- branches/NCODE/priv/host_amd64_defs.h (original)
+++ branches/NCODE/priv/host_amd64_defs.h Sun Apr 12 10:23:58 2015
@@ -830,6 +830,9 @@
extern void ppAMD64Instr ( const AMD64Instr*, Bool );
+/* Handy helper, for generating integer reg-reg moves. */
+extern AMD64Instr* mk_iMOVsd_RR ( HReg src, HReg dst );
+
/* Some functions that insulate the register allocator from details
of the underlying instruction set. */
extern void getRegUsage_AMD64Instr ( HRegUsage*, const AMD64Instr*, Bool );
@@ -839,12 +842,12 @@
const AMD64Instr*, Bool, VexEndness,
const VexDispatcherAddresses* );
-extern Bool emit_AMD64NCode ( /*MOD*/AssemblyBuffer* ab_hot,
- /*MOD*/AssemblyBuffer* ab_cold,
- /*MOD*/RelocationBuffer* rb,
- const AMD64Instr* hi,
- Bool mode64, VexEndness endness_host,
- Bool verbose );
+extern Bool emit_AMD64NCodeBlock ( /*MOD*/AssemblyBuffer* ab_hot,
+ /*MOD*/AssemblyBuffer* ab_cold,
+ /*MOD*/RelocationBuffer* rb,
+ const AMD64Instr* hi,
+ Bool mode64, VexEndness endness_host,
+ Bool verbose );
extern void genSpill_AMD64 ( /*OUT*/HInstr** i1, /*OUT*/HInstr** i2,
HReg rreg, Bool spRel, Int offset, Bool );
Modified: branches/NCODE/priv/host_amd64_isel.c
==============================================================================
--- branches/NCODE/priv/host_amd64_isel.c (original)
+++ branches/NCODE/priv/host_amd64_isel.c Sun Apr 12 10:23:58 2015
@@ -309,9 +309,9 @@
&& e->Iex.Const.con->Ico.U32 == 0;
}
-/* Make a int reg-reg move. */
+/* Make an int reg-reg move. */
-static AMD64Instr* mk_iMOVsd_RR ( HReg src, HReg dst )
+/*notstatic*/ AMD64Instr* mk_iMOVsd_RR ( HReg src, HReg dst )
{
vassert(hregClass(src) == HRcInt64);
vassert(hregClass(dst) == HRcInt64);
Modified: branches/NCODE/priv/host_generic_regs.c
==============================================================================
--- branches/NCODE/priv/host_generic_regs.c (original)
+++ branches/NCODE/priv/host_generic_regs.c Sun Apr 12 10:23:58 2015
@@ -120,16 +120,6 @@
/*--- Real register sets ---*/
/*---------------------------------------------------------*/
-/* Represents sets of real registers. |bits| is interpreted in the
- context of |univ|. That is, each bit index |i| in |bits|
- corresponds to the register |univ->regs[i]|. This relies
- entirely on the fact that N_RREGUNIVERSE_REGS <= 64.
-*/
-struct _RRegSet {
- ULong bits;
- const RRegUniverse* univ;
-};
-
STATIC_ASSERT(N_RREGUNIVERSE_REGS <= 8 * sizeof(ULong));
/* Print a register set, using the arch-specific register printing
@@ -153,13 +143,19 @@
vex_printf("}");
}
-/* Create a new, empty, set. */
+/* Initialise an RRegSet, making it empty. */
+inline void RRegSet__init ( /*OUT*/RRegSet* set, const RRegUniverse* univ )
+{
+ set->bits = 0;
+ set->univ = univ;
+}
+
+/* Create a new, empty, set, in the normal (transient) heap. */
RRegSet* RRegSet__new ( const RRegUniverse* univ )
{
vassert(univ);
RRegSet* set = LibVEX_Alloc_inline(sizeof(RRegSet));
- set->bits = 0;
- set->univ = univ;
+ RRegSet__init(set, univ);
return set;
}
@@ -174,6 +170,7 @@
duplicates. */
void RRegSet__fromVec ( /*MOD*/RRegSet* dst, const HReg* vec, UInt nVec )
{
+ dst->bits = 0;
for (UInt i = 0; i < nVec; i++) {
HReg r = vec[i];
vassert(!hregIsInvalid(r) && !hregIsVirtual(r));
@@ -229,6 +226,22 @@
return __builtin_popcountll(set->bits);
}
+/* Remove non-allocatable registers from this set. Because the set
+ carries its register universe, we can consult that to find the
+ non-allocatable registers, so no other parameters are needed. */
+void RRegSet__deleteNonAllocatable ( /*MOD*/RRegSet* set )
+{
+ const RRegUniverse* univ = set->univ;
+ UInt allocable = univ->allocable;
+ if (UNLIKELY(allocable == N_RREGUNIVERSE_REGS)) {
+ return;
+ /* otherwise we'd get an out-of-range shift below */
+ }
+ vassert(allocable > 0 && allocable < N_RREGUNIVERSE_REGS);
+ ULong mask = (1ULL << allocable) - 1;
+ set->bits &= mask;
+}
+
struct _RRegSetIterator {
const RRegSet* set;
@@ -398,6 +411,20 @@
/*NOTREACHED*/
}
+void addHRegUse_from_RRegSet ( HRegUsage* tab,
+ HRegMode mode, const RRegSet* set )
+{
+ STATIC_ASSERT(sizeof(tab->rRead) == sizeof(tab->rWritten));
+ STATIC_ASSERT(sizeof(tab->rRead) == sizeof(set->bits));
+ switch (mode) {
+ case HRmRead: tab->rRead |= set->bits; break;
+ case HRmWrite: tab->rWritten |= set->bits; break;
+ case HRmModify: tab->rRead |= set->bits;
+ tab->rWritten |= set->bits; break;
+ default: vassert(0);
+ }
+}
+
/*---------------------------------------------------------*/
/*--- Indicating register remappings (for reg-alloc) ---*/
@@ -531,6 +558,128 @@
}
+/*---------------------------------------------------------*/
+/*--- NCode generation helpers ---*/
+/*---------------------------------------------------------*/
+
+/* Find the length of a vector of HRegs that is terminated by
+ an HReg_INVALID. */
+UInt hregVecLen ( const HReg* vec )
+{
+ UInt i;
+ for (i = 0; !hregIsInvalid(vec[i]); i++)
+ ;
+ return i;
+}
+
+
+/* Find the real (hard) register for |r| by looking up in |map|. */
+HReg mapNReg ( const NRegMap* map, NReg r )
+{
+ UInt limit = 0;
+ const HReg* arr = NULL;
+ switch (r.role) {
+ case Nrr_Result: limit = map->nRegsR; arr = map->regsR; break;
+ case Nrr_Argument: limit = map->nRegsA; arr = map->regsA; break;
+ case Nrr_Scratch: limit = map->nRegsS; arr = map->regsS; break;
+ default: vpanic("mapNReg: invalid reg role");
+ }
+ vassert(r.num < limit);
+ return arr[r.num];
+}
+
+
+/* Compute the minimal set of registers to preserve around calls
+ embedded within NCode blocks. */
+void calcRegistersToPreserveAroundNCodeCall (
+ /*OUT*/RRegSet* result,
+ const RRegSet* hregsLiveAfterTheNCodeBlock,
+ const RRegSet* abiCallerSavedRegs,
+ const NRegMap* nregMap,
+ NReg nregResHi,
+ NReg nregResLo
+ )
+{
+ /* This function deals with one of the main difficulties of NCode
+ templates, which is that of figuring out the minimal set of
+ registers to save across calls embedded inside NCode blocks. As
+ far as I can see, the set is:
+
+ (1) registers live after the NCode block
+ (2) + the Arg, Res and Scratch registers for the block
+ (3) - Abi_Callee_Saved registers
+ (4) - the Arg/Res/Scratch register(s) into which the call
+ will place its results
+
+ (1) because that's the set of regs that reg-alloc expects to
+ not be trashed by the NCode block
+ (2) because Arg/Res/Scratch regs can be used freely within the
+ NCode block, so we have to keep them alive
+ (3) because preserving Callee saved regs is obviously pointless
+ (4) because preserving the call's result reg(s) will result in
+ the restore sequence overwriting the result of the call
+
+ (2) (3) (4) are either constants or something we can find from
+ inspection of the relevant NInstr (call) alone. (1) is
+ something that depends on instructions after the NCode block
+ and so is something that the register allocator has to tell us.
+
+ Another detail is that we remove from the set, all registers not
+ available to the register allocator. That is, we save across
+ the call, only registers available to the allocator. That
+ assumes that all fixed-use or otherwise-not-allocatable
+ registers, that we care about, are callee-saved. AFAIK the only
+ important register is the baseblock register, and that is indeed
+ callee-saved on all targets.
+ */
+ const RRegUniverse* univ
+ = RRegSet__getUniverse(hregsLiveAfterTheNCodeBlock);
+
+ const RRegSet* set_1 = hregsLiveAfterTheNCodeBlock;
+
+ RRegSet set_2;
+ RRegSet__init(&set_2, univ);
+ for (UInt i = 0; i < nregMap->nRegsR; i++)
+ RRegSet__add(&set_2, nregMap->regsR[i]);
+ for (UInt i = 0; i < nregMap->nRegsA; i++)
+ RRegSet__add(&set_2, nregMap->regsA[i]);
+ for (UInt i = 0; i < nregMap->nRegsS; i++)
+ RRegSet__add(&set_2, nregMap->regsS[i]);
+
+ const RRegSet* set_3 = abiCallerSavedRegs;
+ vassert(univ == RRegSet__getUniverse(set_3));
+
+ RRegSet set_4;
+ RRegSet__init(&set_4, univ);
+ if (!isNRegINVALID(nregResHi))
+ RRegSet__add(&set_4, mapNReg(nregMap, nregResHi));
+ if (!isNRegINVALID(nregResLo))
+ RRegSet__add(&set_4, mapNReg(nregMap, nregResLo));
+
+ RRegSet__init(result, univ);
+ RRegSet__copy(result, set_1);
+ RRegSet__plus(result, &set_2);
+ RRegSet__minus(result, set_3);
+ RRegSet__minus(result, &set_4);
+
+ if (0) {
+ vex_printf(" # set1: ");
+ RRegSet__pp(set_1, ppHReg); vex_printf("\n");
+ vex_printf(" # set2: ");
+ RRegSet__pp(&set_2, ppHReg); vex_printf("\n");
+ vex_printf(" # set3: ");
+ RRegSet__pp(set_3, ppHReg); vex_printf("\n");
+ vex_printf(" # set4: ");
+ RRegSet__pp(&set_4, ppHReg); vex_printf("\n");
+ vex_printf(" # pres: ");
+ RRegSet__pp(result, ppHReg); vex_printf("\n");
+ }
+
+ /* Remove any non allocatable registers (see big comment above) */
+ RRegSet__deleteNonAllocatable(result);
+}
+
+
/*---------------------------------------------------------------*/
/*--- end host_generic_regs.c ---*/
/*---------------------------------------------------------------*/
Modified: branches/NCODE/priv/host_generic_regs.h
==============================================================================
--- branches/NCODE/priv/host_generic_regs.h (original)
+++ branches/NCODE/priv/host_generic_regs.h Sun Apr 12 10:23:58 2015
@@ -238,14 +238,34 @@
/*--- Real Register Sets ---*/
/*---------------------------------------------------------*/
-/* ABSTYPE */
-typedef struct _RRegSet RRegSet;
+/* Represents sets of real registers. |bits| is interpreted in the
+ context of |univ|. That is, each bit index |i| in |bits|
+ corresponds to the register |univ->regs[i]|. This relies
+ entirely on the fact that N_RREGUNIVERSE_REGS <= 64.
+
+ It would have been nice to have been able to make this abstract,
+ but it is necessary to declare globals of this type. Hence the
+ size has to be known to the users of the type and so it can't be
+ abstract.
+*/
+typedef
+ struct {
+ ULong bits;
+ const RRegUniverse* univ;
+ }
+ RRegSet;
+
+STATIC_ASSERT(N_RREGUNIVERSE_REGS <= 8 * sizeof(ULong));
+
/* Print a register set, using the arch-specific register printing
function |regPrinter| supplied. */
extern void RRegSet__pp ( const RRegSet* set, void (*regPrinter)(HReg) );
-/* Create a new, empty, set. */
+/* Initialise an RRegSet, making it empty. */
+extern void RRegSet__init ( /*OUT*/RRegSet* set, const RRegUniverse* univ );
+
+/* Create a new, empty, set, in the normal (transient) heap. */
extern RRegSet* RRegSet__new ( const RRegUniverse* univ );
/* Return the RRegUniverse for a given RRegSet. */
@@ -275,6 +295,11 @@
/* Returns the number of elements in |set|. */
extern UInt RRegSet__card ( const RRegSet* set );
+/* Remove non-allocatable registers from this set. Because the set
+ carries its register universe, we can consult that to find the
+ non-allocatable registers, so no other parameters are needed. */
+extern void RRegSet__deleteNonAllocatable ( /*MOD*/RRegSet* set );
+
/* Iterating over RRegSets. */
/* ABSTYPE */
@@ -344,6 +369,9 @@
extern Bool HRegUsage__contains ( const HRegUsage*, HReg );
+extern void addHRegUse_from_RRegSet ( HRegUsage*, HRegMode, const RRegSet* );
+
+
/*---------------------------------------------------------*/
/*--- Indicating register remappings (for reg-alloc) ---*/
/*---------------------------------------------------------*/
@@ -702,6 +730,46 @@
);
+/*---------------------------------------------------------*/
+/*--- NCode generation helpers ---*/
+/*---------------------------------------------------------*/
+
+/* Find the length of a vector of HRegs that is terminated by
+ an HReg_INVALID. */
+extern UInt hregVecLen ( const HReg* vec );
+
+
+/* A handy structure to hold the register environment for an NCode
+ block -- that is, the NReg to HReg mapping. */
+typedef
+ struct {
+ UInt nRegsR;
+ const HReg* regsR;
+ UInt nRegsA;
+ const HReg* regsA;
+ UInt nRegsS;
+ const HReg* regsS;
+ }
+ NRegMap;
+
+/* Find the real (hard) register for |r| by looking up in |map|. */
+extern HReg mapNReg ( const NRegMap* map, NReg r );
+
+
+/* Compute the minimal set of registers to preserve around calls
+ embedded within NCode blocks. See implementation for a detailed
+ comment. */
+extern
+void calcRegistersToPreserveAroundNCodeCall (
+ /*OUT*/RRegSet* result,
+ const RRegSet* hregsLiveAfterTheNCodeBlock,
+ const RRegSet* abiCallerSavedRegs,
+ const NRegMap* nregMap,
+ NReg nregResHi,
+ NReg nregResLo
+ );
+
+
#endif /* ndef __VEX_HOST_GENERIC_REGS_H */
/*---------------------------------------------------------------*/
Modified: branches/NCODE/priv/main_main.c
==============================================================================
--- branches/NCODE/priv/main_main.c (original)
+++ branches/NCODE/priv/main_main.c Sun Apr 12 10:23:58 2015
@@ -1128,9 +1128,9 @@
if (UNLIKELY( AssemblyBuffer__getRemainingSize(&ab_hot) < 1024 )
|| UNLIKELY( AssemblyBuffer__getRemainingSize(&ab_cold) < 1024 ))
goto outputBufferFull;
- Bool ok = emit_AMD64NCode ( &ab_hot, &ab_cold, &rb, hi,
- mode64, vta->archinfo_host.endness,
- !!(vex_traceflags & VEX_TRACE_ASM));
+ Bool ok = emit_AMD64NCodeBlock ( &ab_hot, &ab_cold, &rb, hi,
+ mode64, vta->archinfo_host.endness,
+ !!(vex_traceflags & VEX_TRACE_ASM));
if (!ok)
goto outputBufferFull;
}
Modified: branches/NCODE/priv/main_util.h
==============================================================================
--- branches/NCODE/priv/main_util.h (original)
+++ branches/NCODE/priv/main_util.h Sun Apr 12 10:23:58 2015
@@ -51,7 +51,8 @@
#endif
// Poor man's static assert
-#define STATIC_ASSERT(x) extern int vex__unused_array[(x) ? 1 : -1]
+#define STATIC_ASSERT(x) extern int vex__unused_array[(x) ? 1 : -1] \
+ __attribute__((unused))
/* Stuff for panicking and assertion. */
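As an aside, the negative-array-size trick the STATIC_ASSERT macro above relies on can be shown in a standalone sketch (the macro and helper names here are hypothetical, not the VEX ones):

```c
#include <assert.h>

/* Poor man's static assert: the extern array has size 1 when the
   condition holds and size -1 -- a compile-time error -- when it does
   not.  The __attribute__((unused)) the patch above adds suppresses
   unused-declaration warnings about the dummy array. */
#define STATIC_ASSERT_SKETCH(x) \
   extern int static_assert_dummy[(x) ? 1 : -1] __attribute__((unused))

STATIC_ASSERT_SKETCH(sizeof(long) >= sizeof(int));    /* compiles */
/* STATIC_ASSERT_SKETCH(sizeof(char) == 2);  -- would fail to compile */

/* Runtime twin of the compile-time check, for demonstration only. */
static int check_long_holds_int(void)
{
   return sizeof(long) >= sizeof(int);
}
```

Since C89/C99 have no `_Static_assert`, this idiom was the portable way to get compile-time checks; the attribute only matters for warning hygiene.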
|
|
From: Zhigang L. <zl...@ez...> - 2015-04-12 05:24:51
|
________________________________________
From: Philippe Waroquiers <phi...@sk...>
Sent: Saturday, April 11, 2015 7:30 AM
To: Zhigang Liu
Cc: Valgrind Developers
Subject: Re: [Valgrind-developers] I need access to a TILEGX :) : libvexmultiarch_test failing with TILEGX host
On Sat, 2015-04-11 at 01:12 +0200, Philippe Waroquiers wrote:
> On Sat, 2015-04-11 at 01:02 +0200, Philippe Waroquiers wrote:
> > Julian,
> > do you agree that the offB_HOST_* offsets are depending on the host
> > architecture, and not on the guest architecture ?
> Moving the offB_HOST_* to the arch_host switch makes
> guest amd64/host tilegx
> work ok.
>
> It looks to me that this is the good thing to do
After an IRC discussion with Julian, it became clear that this
is not the right thing to do, and that I had misunderstood
the somewhat misleading names offB_HOST_EvC_COUNTER and
offB_HOST_EvC_FAILADDR.
Here is what I understand now:
These offB_HOST_* are really offsets into the guest state,
giving the locations in the guest state that are used by the
(generated) host code.
Basically, a translation entry (generated host code) is doing
if (--guest_state->COUNTER == 0) goto guest_state->FAILADDR
So, COUNTER and FAILADDR are in the guest state.
FAILADDR must be a host address.
(This is in fact wrongly declared in all 32-bit guest states.
E.g. libvex_guest_x86.h and libvex_guest_ppc32.h define
UInt host_EvC_FAILADDR;
while it should be the size of a host address, or at least
big enough to hold a 64-bit host address, in case the host is
64-bit in a multiarch setup.)
So, now I think the problem of guest amd64/host tilegx
is better solved in the host tilegx code, which should ensure that it
always generates the same number of bytes for the evCheck instructions
(this was suggested by Zhigang), or maybe dynamically compute
the needed number of instructions for an event check, depending
on the offsets of the host_EvC_* fields, which change the size of the
instructions.
Zhigang, does the above look reasonable to do in tilegx ?
Yes, thank you for finding this issue. I have a simple patch for this; would you mind giving it a try?
Thanks
--- ZhiGang
******* Begin of the patch ******
Index: priv/host_tilegx_defs.c
===================================================================
--- priv/host_tilegx_defs.c (revision 3125)
+++ priv/host_tilegx_defs.c (working copy)
@@ -1348,11 +1348,10 @@
static UChar *doAMode_IR ( UChar * p, UInt opc1, UInt rSD, TILEGXAMode * am )
{
- UInt rA; //, idx;
+ UInt rA;
vassert(am->tag == GXam_IR);
rA = iregNo(am->GXam.IR.base);
- //idx = am->GXam.IR.index;
if (opc1 == TILEGX_OPC_ST1 || opc1 == TILEGX_OPC_ST2 ||
opc1 == TILEGX_OPC_ST4 || opc1 == TILEGX_OPC_ST) {
@@ -1381,19 +1380,29 @@
return p;
}
-/* Generate a machine-word sized load or store. Simplified version of
- the GXin_Load and GXin_Store cases below. */
+/* Generate a machine-word sized load or store using exactly 2 bundles.
+ Simplified version of the GXin_Load and GXin_Store cases below. */
static UChar* do_load_or_store_machine_word ( UChar* p, Bool isLoad, UInt reg,
TILEGXAMode* am )
{
+ UInt rA = iregNo(am->GXam.IR.base);
+
if (am->tag != GXam_IR)
vpanic(__func__);
- if (isLoad) /* load */
- p = doAMode_IR(p, TILEGX_OPC_LD, reg, am);
- else /* store */
- p = doAMode_IR(p, TILEGX_OPC_ST, reg, am);
-
+ if (isLoad) /* load */ {
+ /* r51 is a reserved scratch register. */
+ p = mkInsnBin(p, mkTileGxInsn(TILEGX_OPC_ADDLI, 3,
+ 51, rA, am->GXam.IR.index));
+ /* load from address in r51 to rSD. */
+ p = mkInsnBin(p, mkTileGxInsn(TILEGX_OPC_LD, 2, reg, 51));
+ } else /* store */ {
+ /* r51 is a reserved scratch register. */
+ p = mkInsnBin(p, mkTileGxInsn(TILEGX_OPC_ADDLI, 3,
+ 51, rA, am->GXam.IR.index));
+ /* store rSD to address in r51 */
+ p = mkInsnBin(p, mkTileGxInsn(TILEGX_OPC_ST, 2, 51, reg));
+ }
return p;
}
******* END of the patch ******
(While waiting for this to be done, I can always disable using
tilegx as a host in the test.)
Thanks
Philippe
|
|
From: John R. <jr...@bi...> - 2015-04-11 23:36:12
|
> On MIPS (ASUS RT-N16):
> $ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
> ### empty output from grep: no AT_HWCAP at all

... because the C library is uClibc, not glibc. Some digging shows that the AUX vector is:

0x00000010 AT_HWCAP   0x00000000
0x00000006 AT_PAGESZ  0x00001000
0x00000011 AT_CLKTCK  0x00000064
0x00000003 AT_PHDR    0x00400034
0x00000004 AT_PHENT   0x00000020
0x00000005 AT_PHNUM   0x00000008
0x00000007 AT_BASE    0x2aaa8000
0x00000008 AT_FLAGS   0x00000000
0x00000009 AT_ENTRY   0x00401740
0x0000000b AT_UID     0x00000000
0x0000000c AT_EUID    0x00000000
0x0000000d AT_GID     0x00000000
0x0000000e AT_EGID    0x00000000
0x00000017 AT_SECURE  0x00000000
0x00000000 AT_NULL    0x00000000

Still, AT_HWCAP is 0, which omits information such as support for mips16 and dsp that is shown in /proc/cpuinfo below. The Linux kernel is 2.6.24 (dd-wrt + optware).

> $ cat /proc/cpuinfo ### abbreviated :)
> system type : Broadcom BCM4716 chip rev 1
> cpu model : MIPS 74K V4.0
> ASEs implemented : mips16 dsp

From the viewpoint of the end user, a command-line override such as --cpu=... has an advantage because it allows working around bugs in AT_HWCAP and/or /proc/cpuinfo. |
|
From: <sv...@va...> - 2015-04-11 18:50:54
|
Author: florian
Date: Sat Apr 11 19:50:47 2015
New Revision: 15085
Log:
Update list of ignored files.
Modified:
trunk/none/tests/ (props changed)
trunk/none/tests/tilegx/ (props changed)
|
|
From: <sv...@va...> - 2015-04-11 14:33:01
|
Author: philippe
Date: Sat Apr 11 15:32:53 2015
New Revision: 3125
Log:
VEX side for revision 15084 (multi arch testing)
Modified:
trunk/priv/host_arm64_isel.c
trunk/priv/host_mips_defs.c
trunk/pub/libvex_guest_amd64.h
Modified: trunk/priv/host_arm64_isel.c
==============================================================================
--- trunk/priv/host_arm64_isel.c (original)
+++ trunk/priv/host_arm64_isel.c Sat Apr 11 15:32:53 2015
@@ -3849,6 +3849,12 @@
case Ist_IMark:
return;
+ /* --------- ABI HINT --------- */
+ /* These have no meaning (denotation in the IR) and so we ignore
+ them ... if any actually made it this far. */
+ case Ist_AbiHint:
+ return;
+
/* --------- NO-OP --------- */
case Ist_NoOp:
return;
Modified: trunk/priv/host_mips_defs.c
==============================================================================
--- trunk/priv/host_mips_defs.c (original)
+++ trunk/priv/host_mips_defs.c Sat Apr 11 15:32:53 2015
@@ -2070,7 +2070,18 @@
*p++ = toUChar((w32 >> 8) & 0x000000FF);
*p++ = toUChar((w32 >> 16) & 0x000000FF);
*p++ = toUChar((w32 >> 24) & 0x000000FF);
-#elif defined (_MIPSEB)
+/* HACK !!!!
+ MIPS endianness is decided at compile time using gcc-defined
+ symbols _MIPSEL or _MIPSEB. When compiling libvex in a cross-arch
+ setup, then none of these is defined. We just choose here by default
+ mips Big Endian to allow libvexmultiarch_test to work when using
+ a mips host architecture.
+ A cleaner way would be to either have mips using 'dynamic endness'
+ (like ppc64be or le, decided at runtime) or at least defining
+ by default _MIPSEB when compiling on a non mips system.
+#elif defined (_MIPSEB).
+*/
+#else
*p++ = toUChar((w32 >> 24) & 0x000000FF);
*p++ = toUChar((w32 >> 16) & 0x000000FF);
*p++ = toUChar((w32 >> 8) & 0x000000FF);
Modified: trunk/pub/libvex_guest_amd64.h
==============================================================================
--- trunk/pub/libvex_guest_amd64.h (original)
+++ trunk/pub/libvex_guest_amd64.h Sat Apr 11 15:32:53 2015
@@ -124,6 +124,7 @@
delicately-balanced PutI/GetI optimisation machinery.
Therefore best to leave it as a UInt. */
UInt guest_FTOP;
+ UInt pad1;
ULong guest_FPREG[8];
UChar guest_FPTAG[8];
ULong guest_FPROUND;
@@ -131,6 +132,7 @@
/* Emulation notes */
UInt guest_EMNOTE;
+ UInt pad2;
/* Translation-invalidation area description. Not used on amd64
(there is no invalidate-icache insn), but needed so as to
@@ -167,7 +169,7 @@
ULong guest_IP_AT_SYSCALL;
/* Padding to make it have an 16-aligned size */
- ULong pad1;
+ ULong pad3;
}
VexGuestAMD64State;
|
|
From: <sv...@va...> - 2015-04-11 11:42:31
|
Author: philippe
Date: Sat Apr 11 12:42:22 2015
New Revision: 15083
Log:
Remove useless arguments in sparsewa that were inherited from WordFM.
These arguments are not needed for sparsewa, as they can only
return the key given as input.
Modified:
trunk/coregrind/m_sparsewa.c
trunk/helgrind/libhb_core.c
trunk/include/pub_tool_sparsewa.h
Modified: trunk/coregrind/m_sparsewa.c
==============================================================================
--- trunk/coregrind/m_sparsewa.c (original)
+++ trunk/coregrind/m_sparsewa.c Sat Apr 11 12:42:22 2015
@@ -274,7 +274,7 @@
Bool VG_(lookupSWA) ( const SparseWA* swa,
- /*OUT*/UWord* keyP, /*OUT*/UWord* valP,
+ /*OUT*/UWord* valP,
UWord key )
{
Int i;
@@ -302,7 +302,6 @@
vg_assert(level0->nInUse > 0);
ix = key & 0xFF;
if (swa_bitarray_read(level0->inUse, ix) == 0) return False;
- *keyP = key; /* this is stupid. only here to make it look like WordFM */
*valP = level0->words[ix];
return True;
}
@@ -366,7 +365,7 @@
Bool VG_(delFromSWA) ( SparseWA* swa,
- /*OUT*/UWord* oldK, /*OUT*/UWord* oldV, UWord key )
+ /*OUT*/UWord* oldV, UWord key )
{
Int i;
UWord ix;
@@ -403,7 +402,6 @@
if (swa_bitarray_read_then_clear(level0->inUse, ix) == 0)
return False;
- *oldK = key; /* this is silly */
*oldV = level0->words[ix];
level0->nInUse--;
Modified: trunk/helgrind/libhb_core.c
==============================================================================
--- trunk/helgrind/libhb_core.c (original)
+++ trunk/helgrind/libhb_core.c Sat Apr 11 12:42:22 2015
@@ -4149,7 +4149,7 @@
OldRef* ref;
RCEC* rcec;
Word i, j;
- UWord keyW, valW;
+ UWord valW;
Bool b;
tl_assert(thr);
@@ -4173,14 +4173,13 @@
/* Look in the map to see if we already have a record for this
address. */
- b = VG_(lookupSWA)( oldrefTree, &keyW, &valW, a );
+ b = VG_(lookupSWA)( oldrefTree, &valW, a );
if (b) {
/* We already have a record for this address. We now need to
see if we have a stack trace pertaining to this (thrid, R/W,
size) triple. */
- tl_assert(keyW == a);
ref = (OldRef*)valW;
tl_assert(ref->magic == OldRef_MAGIC);
@@ -4287,7 +4286,7 @@
{
Word i, j;
OldRef* ref;
- UWord keyW, valW;
+ UWord valW;
Bool b;
ThrID cand_thrid;
@@ -4319,12 +4318,11 @@
cand_a = toCheck[j];
// VG_(printf)("test %ld %p\n", j, cand_a);
- b = VG_(lookupSWA)( oldrefTree, &keyW, &valW, cand_a );
+ b = VG_(lookupSWA)( oldrefTree, &valW, cand_a );
if (!b)
continue;
ref = (OldRef*)valW;
- tl_assert(keyW == cand_a);
tl_assert(ref->magic == OldRef_MAGIC);
tl_assert(ref->accs[0].thrid != 0); /* first slot must always be used */
@@ -4705,9 +4703,8 @@
for (i = 0; i < n2del; i++) {
Bool b;
Addr ga2del = *(Addr*)VG_(indexXA)( refs2del, i );
- b = VG_(delFromSWA)( oldrefTree, &keyW, &valW, ga2del );
+ b = VG_(delFromSWA)( oldrefTree, &valW, ga2del );
tl_assert(b);
- tl_assert(keyW == ga2del);
oldref = (OldRef*)valW;
for (j = 0; j < N_OLDREF_ACCS; j++) {
ThrID aThrID = oldref->accs[j].thrid;
Modified: trunk/include/pub_tool_sparsewa.h
==============================================================================
--- trunk/include/pub_tool_sparsewa.h (original)
+++ trunk/include/pub_tool_sparsewa.h Sat Apr 11 12:42:22 2015
@@ -63,21 +63,17 @@
// overwritten. Returned Bool is True iff a previous binding existed.
Bool VG_(addToSWA) ( SparseWA* swa, UWord key, UWord val );
-// Delete key from swa, returning associated key and val if found.
-// Note: returning associated key is stupid (it can only be the
-// key you just specified). This behaviour is retained to make it
-// easier to migrate from WordFM. Returned Bool is True iff
-// the key was actually bound in the mapping.
+// Delete key from swa, returning val if found.
+// Returned Bool is True iff the key was actually bound in the mapping.
Bool VG_(delFromSWA) ( SparseWA* swa,
- /*OUT*/UWord* oldK, /*OUT*/UWord* oldV,
+ /*OUT*/UWord* oldV,
UWord key );
// Indexes swa at 'key' (or, if you like, looks up 'key' in the
-// mapping), and returns the associated value, if any, in *valP. For
-// compatibility with WordFM, 'key' is also returned in *keyP. Returned
-// Bool is True iff a binding for 'key' actually existed.
+// mapping), and returns the associated value, if any, in *valP.
+// Returned Bool is True iff a binding for 'key' actually existed.
Bool VG_(lookupSWA) ( const SparseWA* swa,
- /*OUT*/UWord* keyP, /*OUT*/UWord* valP,
+ /*OUT*/UWord* valP,
UWord key );
// Set up 'swa' for iteration.
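The simplified lookup contract can be illustrated with a toy single-level version of the sparse array (types and names here are illustrative, not the real m_sparsewa code, which uses several levels):

```c
#include <assert.h>

/* Toy bottom level of a sparse word array: a 256-slot bitmap-style
   occupancy array plus the stored values. */
#define N_SLOTS 256

typedef struct {
   unsigned char inUse[N_SLOTS];
   unsigned long words[N_SLOTS];
} ToyLevel0;

/* The simplified interface: the caller passes the key and receives only
   the value.  Returning the key as well (the old WordFM-style contract)
   was pointless, since it can only equal the key passed in.
   Returns 1 (True) iff |key| is bound, writing its value to *valP. */
static int toy_lookup(const ToyLevel0* l0, unsigned long* valP,
                      unsigned long key)
{
   unsigned long ix = key & 0xFF;
   if (!l0->inUse[ix])
      return 0;
   *valP = l0->words[ix];
   return 1;
}

static void toy_bind(ToyLevel0* l0, unsigned long key, unsigned long val)
{
   unsigned long ix = key & 0xFF;
   l0->inUse[ix] = 1;
   l0->words[ix] = val;
}
```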
|
|
From: Philippe W. <phi...@sk...> - 2015-04-11 11:29:14
|
On Sat, 2015-04-11 at 01:12 +0200, Philippe Waroquiers wrote:
> On Sat, 2015-04-11 at 01:02 +0200, Philippe Waroquiers wrote:
> > Julian,
> > do you agree that the offB_HOST_* offsets are depending on the host
> > architecture, and not on the guest architecture ?
> Moving the offB_HOST_* to the arch_host switch makes
> guest amd64/host tilegx
> work ok.
>
> It looks to me that this is the good thing to do
After an IRC discussion with Julian, it became clear that this
is not the right thing to do, and that I had misunderstood
the somewhat misleading names offB_HOST_EvC_COUNTER and
offB_HOST_EvC_FAILADDR.
Here is what I understand now:
These offB_HOST_* are really offsets into the guest state,
giving the locations in the guest state that are used by the
(generated) host code.
Basically, a translation entry (generated host code) is doing
if (--guest_state->COUNTER == 0) goto guest_state->FAILADDR
So, COUNTER and FAILADDR are in the guest state.
FAILADDR must be a host address.
(This is in fact wrongly declared in all 32-bit guest states.
E.g. libvex_guest_x86.h and libvex_guest_ppc32.h define
UInt host_EvC_FAILADDR;
while it should be the size of a host address, or at least
big enough to hold a 64-bit host address, in case the host is
64-bit in a multiarch setup.)
So, now I think the problem of guest amd64/host tilegx
is better solved in the host tilegx code, which should ensure that it
always generates the same number of bytes for the evCheck instructions
(this was suggested by Zhigang), or maybe dynamically compute
the needed number of instructions for an event check, depending
on the offsets of the host_EvC_* fields, which change the size of the
instructions.
Zhigang, does the above look reasonable to do in tilegx ?
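The event-check scheme described above can be sketched in C (the struct and field names are illustrative, not VEX's real guest-state layout):

```c
#include <assert.h>

/* Sketch of the event check every translation performs on entry:
   generated host code decrements a counter held in the guest state and,
   when it reaches zero, jumps to a fail address -- a *host* code
   address -- also held in the guest state.  As noted above, the fail
   address field must be pointer-sized on the host, even for 32-bit
   guests in a multiarch setup. */
typedef struct {
   long          host_EvC_COUNTER;    /* events left before yielding  */
   unsigned long host_EvC_FAILADDR;   /* host address of fail handler */
} ToyGuestState;

/* Returns 1 if the fail path must be taken, 0 to keep executing. */
static int toy_ev_check(ToyGuestState* gs)
{
   return --gs->host_EvC_COUNTER == 0;
}
```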
(While waiting for this to be done, I can always disable using
tilegx as a host in the test.)
Thanks
Philippe
|
|
From: James D. <J.H...@ba...> - 2015-04-11 08:10:33
|
On Fri, 10 Apr 2015, Josef Weidendorfer <Jos...@gm...> wrote:
Subject: Re: [Valgrind-developers] Characteristics of VG simulated CPU

My student Stavros Kaparelos (ska...@gm...) has a modification for Cachegrind that tracks L1/L2/L3 separately, and that also tracks the TLB as well.

James Davenport

On 10.04.2015 at 10:44, Alex wrote:
> Can someone provide a quick explanation what are the characteristics
> of VG simulated CPU (cache, cores, core speed, threads)?

Cachegrind/Callgrind simulate one 2-level cache hierarchy with separate L1 data and L1 instruction caches, and a unified L2. L1 and L2 are inclusive (not strictly inclusive) with write-allocate and LRU replacement. Cache parameters (associativity/sizes) are taken by default from the CPU you run VG on. For newer Intel CPUs with an L3, the real L3 parameters are used for the L2 in the cache model. As events, you get the number of instructions executed (= fetched from L1), data reads and writes from/to L1, and L1D/L1I and L2 misses.

James Davenport
National Teaching Fellow 2014
Hebron & Medlock Professor of Information Technology, University of Bath
OpenMath Content Dictionary Editor
Director of Studies, EPSRC Doctoral Taught Course Centre for HPC
Chair, IMU Committee on Electronic Information and Communication
Vice-President and Academy Trustee, British Computer Society
SW Coordinator, Computing at School Network of Excellence |
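The LRU replacement Josef mentions can be sketched with a toy set-associative lookup (sizes and names are illustrative, not Cachegrind's actual simulator code):

```c
#include <assert.h>

/* One cache set with WAYS tags kept in LRU order: a hit moves the tag
   to the front; a miss inserts at the front, evicting the last (least
   recently used) tag if the set is full. */
#define WAYS 2

typedef struct {
   unsigned long tag[WAYS];
   int used;                 /* how many ways currently hold a tag */
} CacheSet;

/* Returns 1 on hit, 0 on miss; updates the LRU order either way. */
static int cache_access(CacheSet* s, unsigned long tag)
{
   int i, j;
   for (i = 0; i < s->used; i++) {
      if (s->tag[i] == tag) {               /* hit: move to front */
         for (j = i; j > 0; j--)
            s->tag[j] = s->tag[j-1];
         s->tag[0] = tag;
         return 1;
      }
   }
   /* miss: insert at front, dropping the LRU way if the set is full */
   if (s->used < WAYS)
      s->used++;
   for (j = s->used - 1; j > 0; j--)
      s->tag[j] = s->tag[j-1];
   s->tag[0] = tag;
   return 0;
}
```

A real simulator indexes one such set per (address / line size) modulo the number of sets; counting the 0 returns per cache level gives the miss events reported above.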
|
From: Zhigang L. <zl...@ez...> - 2015-04-11 05:10:51
|
You are right, I will add them. ZhiGang ________________________________________ From: Philippe Waroquiers <phi...@sk...> Sent: Friday, April 10, 2015 7:39 PM To: Valgrind Developers Subject: [Valgrind-developers] TileGX : gdbserver xml files missing ? I see that valgrind-low-tilegx.c references the files tilegx-linux-valgrind.xml and tilegx-linux.xml (see target_xml function) but these files are not present in coregrind/m_gdbserver directory and are not in the GDBSERVER_XML_FILES in coregrind/Makefile.am I guess these files were missing in the patch ? Philippe ------------------------------------------------------------------------------ BPM Camp - Free Virtual Workshop May 6th at 10am PDT/1PM EDT Develop your own process in accordance with the BPMN 2 standard Learn Process modeling best practices with Bonita BPM through live exercises http://www.bonitasoft.com/be-part-of-it/events/bpm-camp-virtual- event?utm_ source=Sourceforge_BPM_Camp_5_6_15&utm_medium=email&utm_campaign=VA_SF _______________________________________________ Valgrind-developers mailing list Val...@li... https://lists.sourceforge.net/lists/listinfo/valgrind-developers |
|
From: Petar J. <mip...@gm...> - 2015-04-11 02:37:27
|
> /tmp/ccpgn2sW.s:3257: Error: opcode not supported on this processor: mips2 (mips2) `clo $t0,$t1'
> /tmp/ccpgn2sW.s:3345: Error: opcode not supported on this processor: mips2 (mips2) `clz $t0,$t1'
> /tmp/ccpgn2sW.s:8311: Error: opcode not supported on this processor: mips2 (mips2) `madd $t0,$t1'
>
> I did a bit of research on 'clo' (count leading ones). It looks like it
> should be supported. I think I'm missing the correct arch specification
> to gcc.

@Rich

The errors you see come from the fact that Debian GCC (and Debian MIPS in general) is still set to the ancient mips2 variant. If you want to configure Valgrind and the tests for your MIPS32-capable system, pass "CFLAGS=-mips32" (or -mips32r2 for a more optimal Valgrind, if you run on a MIPS32r2-capable CPU/emulator) to your configure line.

Regards,
Petar

On Sat, Apr 11, 2015 at 1:16 AM, Rich Coe <rc...@wi...> wrote:
> On Fri, 10 Apr 2015 10:41:10 -0500
> Rich Coe <rc...@wi...> wrote:
> > On Wed, 8 Apr 2015 08:35:09 -0500
> > Rich Coe <rc...@wi...> wrote:
> > > For mips or ppc, without JeOS, there should be a way to mount a
> > > distribution dvd of linux and install another platform. I'd have to
> > > look into it.
> >
> > I worked on creating a mips and a ppc qemu installation. I started
> > with ppc because I know that platform better.
>
> I created a qemu installation for mips from
> wget http://ftp.de.debian.org/debian/dists/wheezy/main/installer-mipsel/current/images/malta/netboot/vmlinux-3.2.0-4-4kc-malta
> wget http://ftp.de.debian.org/debian/dists/wheezy/main/installer-mipsel/current/images/malta/netboot/initrd.gz
>
> by running:
> qemu-system-mipsel -m 256 -hda deb-mips.qcow2 -kernel vmlinux-3.2.0-4-4kc-malta -initrd initrd.gz -append "root=/dev/ram console=ttyS0" -nographic
>
> I installed: gcc g++ make automake autoconf subversion
>
> I built valgrind, V builds ok. When I try to build the tests, it fails
> building MIPS32int.c with many errors (with duplicates removed) like
> /tmp/ccpgn2sW.s:3257: Error: opcode not supported on this processor: mips2 (mips2) `clo $t0,$t1'
> /tmp/ccpgn2sW.s:3345: Error: opcode not supported on this processor: mips2 (mips2) `clz $t0,$t1'
> /tmp/ccpgn2sW.s:8311: Error: opcode not supported on this processor: mips2 (mips2) `madd $t0,$t1'
>
> I did a bit of research on 'clo' (count leading ones). It looks like it
> should be supported. I think I'm missing the correct arch specification
> to gcc.
>
> Rich |
|
From: Petar J. <mip...@gm...> - 2015-04-11 02:22:48
|
@Julian *> due to lack of nightly test machines for some architectures, especially mips32/64 and ...* While it may not be important for this thread, there is actually a set of Valgrind nightly build slaves [1] for selected MIPS variants (MIPS32r1-LE, MIPS32r2-LE, MIPS64-LE, MIPS64-BE). The buildbot has been in place since 2012, but the results have not been sent to the Valgrind mailing list. If anyone wants to take a look now, note that the large number of reported failures today is related to r15060. Prior to that, we had a dozen failures in average. E.g. [2]. As of MIPS and QEMU, I suggest you to take QEMU Debian images from Imagination Debian repository [3]. QEMU itself can be taken from the trunk [4]. Last, I believe the situation with MIPS machines on GCC Farm will be sorted out soon. Regards, Petar [1] http://www.rt-rk.com/mips-buildbot/builders [2] http://www.rt-rk.com/mips-buildbot/builders/XLP316/builds/500/steps/shell_1/logs/stdio [3] http://mipsdebian.imgtec.com/ [4] http://wiki.qemu.org/Download On Tue, Apr 7, 2015 at 4:44 PM, Julian Seward <js...@ac...> wrote: > > Hi Torbjörn, > > Once again in Valgrind land we are having problems due to lack of > nightly test machines for some architectures, especially mips32/64 > and arm32/64. > > In the context of the conversation below, I seem to remember you said > something to the effect that you use QEMU to solve this problem for > GMP. Do I remember correctly? > > If so, do you have any information that you can share, regarding > configurations of QEMU and Linux distros for these targets? I am wondering > if we can set up QEMU VMs for at least some of them, so I am writing to ask > if you know which QEMU+distro combinations work well enough to actually be > useful. > > Thanks, > > J > > > > -------- Forwarded Message -------- > Subject: Re: [Gcc-cfarm-users] Is there a mips(64)el box? > Date: Fri, 10 Oct 2014 11:28:39 +0200 > From: Julian Seward <js...@ac...> > Reply-To: js...@ac... 
> To: Torbjörn Granlund <tg...@gm...> > CC: Sergio Durigan Junior <ser...@re...>, gcc...@gn..., > Philippe Waroquiers <phi...@sk...> > > On 10/10/2014 09:38 AM, Torbjörn Granlund wrote: > > The only machines which are actually alive which are useful TO ME are > > the two power7 machines. They allowed me to improve the performance of > > my code for the chips, and let me do regular testing for ppc64 and AIX. > > The Power7 machine is also useful for Valgrind support. Without it it > would be more difficult to maintain the ppc64 Valgrind port. > > There are (or were, at one time) a lot of machines in the farm, but most > of them were x86 variants, which IMO are the least valuable because that > hardware is most widely available. What the farm is really useful > for is the more obscure stuff, viz, MIPS, PPC, ARM, which are harder to > get hold of. For sure if there were fast, solid MIPS32/64 and AArch32/64 > machines, they would be useful for Valgrind testing and development. > > My impression is that it would be preferable to have fewer machines in > the farm, but concentrate on providing at least one reliable, fast > implementation of each of MIPS, PPC and ARM (32 and 64 bit in all > cases). That is, to try and emphasise quality (breadth and reliability > of supported targets) over quantity (numbers of machines). > > J > > > > > > ------------------------------------------------------------------------------ > BPM Camp - Free Virtual Workshop May 6th at 10am PDT/1PM EDT > Develop your own process in accordance with the BPMN 2 standard > Learn Process modeling best practices with Bonita BPM through live > exercises > http://www.bonitasoft.com/be-part-of-it/events/bpm-camp-virtual- > event?utm_ > source=Sourceforge_BPM_Camp_5_6_15&utm_medium=email&utm_campaign=VA_SF > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers > |
|
From: Philippe W. <phi...@sk...> - 2015-04-10 23:38:29
|
I see that valgrind-low-tilegx.c references the files tilegx-linux-valgrind.xml and tilegx-linux.xml (see target_xml function) but these files are not present in coregrind/m_gdbserver directory and are not in the GDBSERVER_XML_FILES in coregrind/Makefile.am I guess these files were missing in the patch ? Philippe |
|
From: Rich C. <rc...@wi...> - 2015-04-10 23:16:19
|
On Fri, 10 Apr 2015 10:41:10 -0500 Rich Coe <rc...@wi...> wrote:
> On Wed, 8 Apr 2015 08:35:09 -0500
> Rich Coe <rc...@wi...> wrote:
> > For mips or ppc, without JeOS, there should be a way to mount a
> > distribution dvd of linux and install another platform. I'd have to
> > look into it.
>
> I worked on creating a mips and a ppc qemu installation. I started with
> ppc because I know that platform better.

I created a qemu installation for mips from
wget http://ftp.de.debian.org/debian/dists/wheezy/main/installer-mipsel/current/images/malta/netboot/vmlinux-3.2.0-4-4kc-malta
wget http://ftp.de.debian.org/debian/dists/wheezy/main/installer-mipsel/current/images/malta/netboot/initrd.gz

by running:
qemu-system-mipsel -m 256 -hda deb-mips.qcow2 -kernel vmlinux-3.2.0-4-4kc-malta -initrd initrd.gz -append "root=/dev/ram console=ttyS0" -nographic

I installed: gcc g++ make automake autoconf subversion

I built valgrind with the following config parameters:
  Maximum build arch: mips32
  Primary build arch: mips32
  Secondary build arch:
  Build OS: linux
  Primary build target: MIPS32_LINUX
  Secondary build target:
  Platform variant: vanilla
  Primary -DVGPV string: -DVGPV_mips32_linux_vanilla=1
  Default supp files: exp-sgcheck.supp xfree-3.supp xfree-4.supp glibc-2.X-drd.supp glibc-2.34567-NPTL-helgrind.supp glibc-2.X.supp

V builds ok. When I try to build the tests, it fails building MIPS32int.c with many errors (with duplicates removed) like
/tmp/ccpgn2sW.s:3257: Error: opcode not supported on this processor: mips2 (mips2) `clo $t0,$t1'
/tmp/ccpgn2sW.s:3345: Error: opcode not supported on this processor: mips2 (mips2) `clz $t0,$t1'
/tmp/ccpgn2sW.s:8311: Error: opcode not supported on this processor: mips2 (mips2) `madd $t0,$t1'

I did a bit of research on 'clo' (count leading ones). It looks like it should be supported. I think I'm missing the correct arch specification to gcc.

Rich |
|
From: Philippe W. <phi...@sk...> - 2015-04-10 23:11:39
|
On Sat, 2015-04-11 at 01:02 +0200, Philippe Waroquiers wrote:
> Julian,
> do you agree that the offB_HOST_* offsets are depending on the host
> architecture, and not on the guest architecture ?

Moving the offB_HOST_* to the arch_host switch makes guest amd64/host tilegx work ok. It looks to me that this is the good thing to do.

Note: in any case, moving these initialisations from 'guest switch' to 'host switch' can only change the values when guest != host.

Philippe |
|
From: Philippe W. <phi...@sk...> - 2015-04-10 23:01:29
|
On Fri, 2015-04-10 at 22:19 +0000, Zhigang Liu wrote:
> Philippe
>
> I can look this carefully for you tonight.
>
> A quick look: the doAMode_IR( ... ) has an optimized case when the am->GXam.IR.index = 0.
>
> In this condition, the load or store will be mapped as 1 bundle, instead of 2 normally. That could be the reason.
That is a good hint.
When looking in main_main.c:594, we see:
switch (vta->arch_guest) {
and then depending on the guest architecture,
offB_HOST_EvC_COUNTER and offB_HOST_EvC_FAILADDR are initialised.
It sounds somewhat suspicious to initialise offB_HOST_* variables
using the guest architecture.
When we have a guest amd64, the host_EvC_FAILADDR is at offset 0
(while for TILEGX, the offset of FAILADDR is 592).
And so, the TILEGX instruction changes, due to the offset 0 being
handled specially.
We in fact use the guest offsets, while we should instead use the
host offsets for that.
And so, the initialisations of all offB_HOST_EvC_COUNTER and
offB_HOST_EvC_FAILADDR should in fact rather be done in
the switch at main_main.c:407
switch (vta->arch_host) {
Julian,
do you agree that the offB_HOST_* offsets are depending on the host
architecture, and not on the guest architecture ?
Philippe
|
|
From: John R. <jr...@bi...> - 2015-04-10 22:37:29
|
> Parsing /proc/cpuinfo is a bad idea.

That might be true, but please say why you believe it.

> Julian again mentioned the correct
> method and that is to query the AT_HWCAP and AT_HWCAP2 flags in the AUXV.
>
> [bergner@makalu ~]$ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
> AT_HWCAP: vsx arch_2_06 dfp ic_snoop smt mmu fpu altivec ppc64 ppc32
> AT_HWCAP2: tar isel ebb dscr htm arch_2_07
>
> That shows all the instruction categories that are supported in the
> cpu you're running on.

In my experience, relying on AT_HWCAP* (and especially its decoded representation) is a bad idea. /proc/cpuinfo is *much* better. Here are three current examples:

On x86_64:
$ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
AT_HWCAP: bfebfbf ### meaning ??
$ cat /proc/cpuinfo ### abbreviated :)
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid

On ARM (raspberry pi2):
$ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
AT_HWCAP: half thumb fastmult vfp edsp neon vfpv3 ### armv6 or armv7 ?? etc.
$ cat /proc/cpuinfo ### abbreviated :)
model name : ARMv7 Processor rev 5 (v7l)
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xc07
CPU revision : 5

On MIPS (ASUS RT-N16):
$ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
### empty output from grep: no AT_HWCAP at all
$ cat /proc/cpuinfo ### abbreviated :)
system type : Broadcom BCM4716 chip rev 1
cpu model : MIPS 74K V4.0
ASEs implemented : mips16 dsp

In those 3 cases /proc/cpuinfo is vastly superior to AT_HWCAP*. Given that the vendor of e500v2 is stingy with expected software, I expect that /proc/cpuinfo contains more and better information. |
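For completeness: glibc (2.16 and later) exposes the auxiliary vector programmatically via getauxval, which is absent on uClibc. A minimal sketch of the HWCAP-or-fallback decision discussed above (the helper names are hypothetical):

```c
#include <assert.h>
#include <sys/auxv.h>   /* getauxval: glibc >= 2.16; not on uClibc */

/* Hypothetical helper: treat a missing or zero AT_HWCAP (as on the
   uClibc MIPS box above) as unusable, forcing a fall back to parsing
   /proc/cpuinfo instead. */
static int hwcap_is_usable(unsigned long hwcap)
{
   return hwcap != 0;
}

/* getauxval returns 0 when the entry is absent from the auxiliary
   vector, so "absent" and "present but zero" both end up on the
   fallback path -- the safe choice for feature detection. */
static unsigned long read_hwcap(void)
{
   return getauxval(AT_HWCAP);
}
```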
From: Zhigang L. <zl...@ez...> - 2015-04-10 22:19:15
Philippe,

I can look at this carefully for you tonight. A quick look: doAMode_IR(...)
has an optimized case when am->GXam.IR.index == 0. In that condition, the
load or store will be mapped as 1 bundle, instead of the normal 2. That
could be the reason.

---Zhigang

-----Original Message-----
From: Philippe Waroquiers [mailto:phi...@sk...]
Sent: Friday, April 10, 2015 5:25 PM
To: Valgrind Developers
Subject: [Valgrind-developers] I need access to a TILEGX :) : libvexmultiarch_test failing with TILEGX host

Hello,

I am busy preparing a test that ensures that libVEX (somewhat) works when
guest != host. I was going to commit the test soon, but now that TILEGX
has been committed, the test fails miserably :(.

Find attached the patch that adds the new tests and a few changes/fixes
in VEX needed for this guest != host setup.

When guest = amd64 and host = TILEGX, libvexmultiarch_test asserts in
TILEGX code:

vex: priv/host_tilegx_defs.c:2361 (emit_TILEGXInstr): Assertion `evCheckSzB_TILEGX() == (UChar*)p - (UChar*)p0' failed.

I have added some traces to see how the 'p' pointer advances.
evCheckSzB_TILEGX is 80 bytes, but the 'p' pointer is only advanced by
72 bytes:

------------------------ Assembly ------------------------
EvCheck (evCheck)
ld r11, 8(r50); addli r11, r11, -1; st r11, 8(r50); bgez r11, nofail; jalr *(r50); nofail:
A 16 0x7fff53188880 0x7fff53188890
B 24 0x7fff53188880 0x7fff53188898
C 40 0x7fff53188880 0x7fff531888a8
D 48 0x7fff53188880 0x7fff531888b0 p1 0x7fff531888a8
E 56 0x7fff53188880 0x7fff531888b8 p1 0x7fff531888a8
F 56 0x7fff53188880 0x7fff531888b8 p1 0x7fff531888a8 p2 0x7fff531888b0
G 64 0x7fff53188880 0x7fff531888c0
H 72 0x7fff53188880 0x7fff531888c8
vex: priv/host_tilegx_defs.c:2361 (emit_TILEGXInstr): Assertion `evCheckSzB_TILEGX() == (UChar*)p - (UChar*)p0' failed.
//// failure exit called by libVEX

It looks like the evcheck can have a variable number of instructions, and
there is some code aiming at coping with that.
But at least on guest amd64 / host tilegx, that gives a problem.

To reproduce, once V has been built with the patch:

cd none/tests
./libvexmultiarch_test 1 0 0

Any idea what is going wrong?

Thanks

Philippe

NB: Having access to a TILEGX system would have allowed me to investigate
the normal expected behaviour of the TILEGX evcheck. But I guess in the
short term we will have to debug via mail :).