From: Duncan S. <bal...@fr...> - 2005-12-16 17:27:15

Hi Yu,

> Right now, we are developing some multithreaded application with
> pthread. We tried to use Helgrind to detect potential data races, but
> it gives out many false warnings. According to the documentation,
> Helgrind is based on lockset algorithm, the same as Eraser etc. We
> want to compare these race detectors and decide which one we shall
> use. Is there any comparison data between Helgrind and Eraser, or
> other race-detectors? What can we do to help improve Helgrind?

I am also interested in helgrind (and also saw many false positives - but also many true positives). Unfortunately helgrind does not work in the latest versions of valgrind, and it looks like no-one has the time to fix it. So one way you could improve helgrind would be to get it working again!

Best wishes,

Duncan.

From: Nicholas N. <nj...@cs...> - 2005-12-16 17:15:52

Hi,

In this message:

    ==22516== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 13 from 3)
    ==22516== malloc/free: in use at exit: 0 bytes in 0 blocks.
    ==22516== malloc/free: 5,000,000 allocs, 5,000,000 frees, 80,000,000 bytes allocated.
    ==22516== For counts of detected errors, rerun with: -v
    ==22516== No malloc'd blocks -- no leaks are possible.

the last line always confuses me -- it makes it sound like the program did no heap allocation. Would anyone object to changing it to: "All malloc'd blocks were freed -- no leaks are possible"?

And I wonder if we should still output this stuff:

    ==22614== LEAK SUMMARY:
    ==22614==    definitely lost: 0 bytes in 0 blocks.
    ==22614==      possibly lost: 0 bytes in 0 blocks.
    ==22614==    still reachable: 0 bytes in 0 blocks.
    ==22614==         suppressed: 0 bytes in 0 blocks.

in this case, to be more consistent? (Talking about malloc/free is also a little confusing here since the user might have used new/delete...)

Nick

From: <sv...@va...> - 2005-12-16 17:06:44

Author: njn
Date: 2005-12-16 17:06:37 +0000 (Fri, 16 Dec 2005)
New Revision: 5361

Log:
Add info about overhead in heap blocks and OSet nodes.

Modified:
   trunk/coregrind/m_mallocfree.c
   trunk/coregrind/m_oset.c

Modified: trunk/coregrind/m_mallocfree.c
===================================================================
--- trunk/coregrind/m_mallocfree.c	2005-12-16 16:10:48 UTC (rev 5360)
+++ trunk/coregrind/m_mallocfree.c	2005-12-16 17:06:37 UTC (rev 5361)
@@ -80,6 +80,15 @@
 
       bszB == pszB + 2*sizeof(SizeT) + 2*a->rz_szB
 
+   The minimum overhead per heap block for arenas used by
+   the core is:
+
+      32-bit platforms:  2*4 + 2*4 == 16 bytes
+      64-bit platforms:  2*8 + 2*8 == 32 bytes
+
+   In both cases extra overhead may be incurred when rounding the payload
+   size up to VG_MIN_MALLOC_SZB.
+
    Furthermore, both size fields in the block have their least-significant
    bit set if the block is not in use, and unset if it is in use.
    (The bottom 3 or so bits are always free for this because of alignment.)

Modified: trunk/coregrind/m_oset.c
===================================================================
--- trunk/coregrind/m_oset.c	2005-12-16 16:10:48 UTC (rev 5360)
+++ trunk/coregrind/m_oset.c	2005-12-16 17:06:37 UTC (rev 5361)
@@ -42,7 +42,8 @@
 // - First is the AVL metadata, which is three words: a left pointer, a
 //   right pointer, and a word containing balancing information and a
 //   "magic" value which provides some checking that the user has not
-//   corrupted the metadata.
+//   corrupted the metadata.  So the overhead is 12 bytes on 32-bit
+//   platforms and 24 bytes on 64-bit platforms.
 // - Second is the user's data.  This can be anything.  Note that because it
 //   comes after the metadata, it will only be word-aligned, even if the
 //   user data is a struct that would normally be doubleword-aligned.

From: Nicholas N. <nj...@cs...> - 2005-12-16 16:17:18

On Fri, 16 Dec 2005, James Begley wrote:

> I've run the performance tests on my 4 year old laptop (running an
> up-to-date Fedora Core 4, gcc 4.0.2), which has a 1.1GHz Pentium III
> processor - cat /proc/cpuinfo gives
>
>     model name : Intel(R) Pentium(R) III Mobile CPU 1133MHz
>
> After SVN updating to r5348 (vex r1497), the results of make perf are
>
>     -- Running  tests in perf -----------------------------------------
>     bigcode1 valgrind : 0.5s  nl:16.6s (33.1x)  mc:23.0s (45.9x)
>     bigcode2 valgrind : 0.5s  nl:26.6s (52.2x)  mc:42.7s (83.8x)
>     bz2      valgrind : 2.5s  nl:14.9s ( 6.0x)  mc:55.9s (22.5x)
>     fbench   valgrind : 1.8s  nl: 5.7s ( 3.2x)  mc:26.7s (15.1x)
>     ffbench  valgrind : 4.3s  nl: 8.6s ( 2.0x)  mc:24.6s ( 5.7x)
>     sarp     valgrind : 0.1s  nl: 1.2s ( 9.1x)  mc:30.2s (232.5x)
>     -- Finished tests in perf -----------------------------------------
>
> (snipped slightly to avoid line wrapping)

Thanks for the info. It would be useful to see a comparison between r5348 and 3.1.0 if you have that. You can do it with this command:

    perl perf/vg_perf --vg=<dir1> --vg=<dir2> perf/

where <dir1> is the directory holding 3.1.0 and <dir2> is the directory holding r5348.

It would also be useful if you could compare the COMPVBITS branch; you can check it out with:

    svn co svn://www.valgrind.org/valgrind/branches/COMPVBITS

and just add a third --vg= option to the command line above.

Nick

From: Nicholas N. <nj...@cs...> - 2005-12-16 16:11:51

On Fri, 16 Dec 2005, Julian Seward wrote:

>>> +#define BYTES_PER_SEC_VBIT_NODE 4
>>> +
>>
>> this is a very 32 bit assumption. i'd define it to sizeof(Addr). just
>> being consequent about the "for free" principle ... unless i missed
>> something about v's internals. :)
>
> In this particular case the value is completely unrelated to the
> word size, so it has no bad effect on 64-bit systems.

It doesn't affect correctness, but Oswald is right that changing it to sizeof(UWord) will get us an extra 4 bytes of coverage per node for free on 64-bit systems. I've made the change.

Nick

From: <sv...@va...> - 2005-12-16 16:10:55

Author: njn
Date: 2005-12-16 16:10:48 +0000 (Fri, 16 Dec 2005)
New Revision: 5360

Log:
Add missing declaration (merged from trunk).

Modified:
   branches/COMPVBITS/coregrind/pub_core_translate.h

Modified: branches/COMPVBITS/coregrind/pub_core_translate.h
===================================================================
--- branches/COMPVBITS/coregrind/pub_core_translate.h	2005-12-16 16:10:34 UTC (rev 5359)
+++ branches/COMPVBITS/coregrind/pub_core_translate.h	2005-12-16 16:10:48 UTC (rev 5360)
@@ -43,6 +43,8 @@
                            Int debugging_verbosity,
                            ULong bbs_done );
 
+extern void VG_(print_translation_stats) ( void );
+
 #endif   // __PUB_CORE_TRANSLATE_H
 
 /*--------------------------------------------------------------------*/

From: <sv...@va...> - 2005-12-16 16:10:42

Author: njn
Date: 2005-12-16 16:10:34 +0000 (Fri, 16 Dec 2005)
New Revision: 5359

Log:
Make BYTES_PER_SEC_VBIT_NODE less 32-bit specific.

Modified:
   branches/COMPVBITS/memcheck/mc_main.c

Modified: branches/COMPVBITS/memcheck/mc_main.c
===================================================================
--- branches/COMPVBITS/memcheck/mc_main.c	2005-12-16 01:08:22 UTC (rev 5358)
+++ branches/COMPVBITS/memcheck/mc_main.c	2005-12-16 16:10:34 UTC (rev 5359)
@@ -469,13 +469,13 @@
 static ULong sec_vbits_bytes_curr = 0;
 static ULong sec_vbits_bytes_peak = 0;
 
-// 4 is the best value here.  We can go from 1 to 4 for free -- it doesn't
-// change the size of the SecVBitNode because of padding.  If we make it
-// larger, we have bigger nodes, but can possibly fit more partially defined
-// bytes in each node.  In practice it seems that partially defined bytes
-// are not clustered close to each other, so going bigger than 4 does not
-// save space.
-#define BYTES_PER_SEC_VBIT_NODE 4
+// sizeof(Addr) is the best value here.  We can go from 1 to sizeof(Addr)
+// for free -- it doesn't change the size of the SecVBitNode because of
+// padding.  If we make it larger, we have bigger nodes, but can possibly
+// fit more partially defined bytes in each node.  In practice it seems that
+// partially defined bytes are rarely clustered close to each other, so
+// going bigger than sizeof(Addr) does not save space.
+#define BYTES_PER_SEC_VBIT_NODE sizeof(Addr)
 
 typedef
    struct {

From: Dirk M. <dm...@gm...> - 2005-12-16 14:02:33

On Thursday 15 December 2005 20:55, Nicholas Nethercote wrote:

> Julian's commits r5345 and r5346 (avoiding the profiling in the
> dispatcher, and using jumps instead of call/return) have the following
> effect on my 3.0 GHz P4 Prescott.

What are the feelings about backporting this to the 3.1 branch?

Dirk

From: <sv...@va...> - 2005-12-16 13:49:24

Author: cerion
Date: 2005-12-16 13:49:00 +0000 (Fri, 16 Dec 2005)
New Revision: 1499

Log:
Fix switchback.c to reflect changes to call of LibVEX_Translate()
Fix test_ppc_jm1.c to reflect direct linking
 - main -> __main etc
 - vex_printf -> vexxx_printf etc

Modified:
   trunk/switchback/
   trunk/switchback/switchback.c
   trunk/switchback/test_ppc_jm1.c

Property changes on: trunk/switchback
___________________________________________________________________
Name: svn:ignore
   + switchback

Modified: trunk/switchback/switchback.c
===================================================================
--- trunk/switchback/switchback.c	2005-12-16 13:40:18 UTC (rev 1498)
+++ trunk/switchback/switchback.c	2005-12-16 13:49:00 UTC (rev 1499)
@@ -4,8 +4,7 @@
 13 Dec '05 - Linker no longer used (apart from mymalloc)
 Simply compile and link switchback.c with test_xxx.c,
 e.g. for ppc64:
-$ (cd .. && make EXTRA_CFLAGS="-m64" libvex_ppc64_linux.a)
-$ gcc -m64 -Wall -O -g -o switchback switchback.c linker.c ../libvex_ppc64_linux.a test_xxx.c
+$ (cd .. && make EXTRA_CFLAGS="-m64" libvex_ppc64_linux.a) && gcc -m64 -mregnames -Wall -Wshadow -Wno-long-long -Winline -O -g -o switchback switchback.c linker.c ../libvex_ppc64_linux.a test_xxx.c
 
 Test file test_xxx.c must have an entry point called "entry",
 which expects to take a single argument which is a function pointer
@@ -112,7 +111,7 @@
 Int trans_cache_used = 0;
 Int trans_table_used = 0;
 
-static Bool chase_into_not_ok ( Addr64 dst ) { return False; }
+static Bool chase_into_ok ( Addr64 dst ) { return False; }
 
 #if 0
    // local_sys_write_stderr(&c,1);
@@ -796,7 +795,7 @@
 HWord find_translation ( Addr64 guest_addr )
 {
    Int   i;
-   HWord res;
+   HWord __res;
    if (0)
       printf("find translation %p ... ", ULong_to_Ptr(guest_addr));
    for (i = 0; i < trans_table_used; i++)
@@ -819,17 +818,18 @@
       i--;
    }
 
-   res = (HWord)trans_tableP[i];
-   if (0) printf("%p\n", (void*)res);
-   return res;
+   __res = (HWord)trans_tableP[i];
+   if (0) printf("%p\n", (void*)__res);
+   return __res;
 }
 
 #define N_TRANSBUF 5000
 static UChar transbuf[N_TRANSBUF];
 void make_translation ( Addr64 guest_addr, Bool verbose )
 {
+   VexTranslateArgs   vta;
    VexTranslateResult tres;
-   VexArchInfo vai;
+   VexArchInfo vex_archinfo;
    Int trans_used, i, ws_needed;
 
    if (trans_table_used >= N_TRANS_TABLE
@@ -844,25 +844,32 @@
    if (0)
       printf("make translation %p\n", ULong_to_Ptr(guest_addr));
 
-   LibVEX_default_VexArchInfo(&vai);
-   vai.subarch = VexSubArch;
-   vai.ppc32_cache_line_szB = CacheLineSize;
+   LibVEX_default_VexArchInfo(&vex_archinfo);
+   vex_archinfo.subarch = VexSubArch;
+   vex_archinfo.ppc32_cache_line_szB = CacheLineSize;
 
-   tres
-      = LibVEX_Translate (
-           VexArch, &vai,
-           VexArch, &vai,
-           ULong_to_Ptr(guest_addr), guest_addr, guest_addr,
-           chase_into_not_ok,
-           &trans_table[trans_table_used],
-           transbuf, N_TRANSBUF, &trans_used,
-           NULL, /* instrument1 */
-           NULL, /* instrument2 */
-           False, /* cleanup after instrument */
-           False, /* self-checking translation? */
-           NULL, /* access checker */
-           verbose ? TEST_FLAGS : DEBUG_TRACE_FLAGS
-        );
+   /* */
+   vta.arch_guest      = VexArch;
+   vta.archinfo_guest  = vex_archinfo;
+   vta.arch_host       = VexArch;
+   vta.archinfo_host   = vex_archinfo;
+   vta.guest_bytes     = (UChar*)ULong_to_Ptr(guest_addr);
+   vta.guest_bytes_addr = (Addr64)guest_addr;
+   vta.guest_bytes_addr_noredir = (Addr64)guest_addr;
+   vta.chase_into_ok   = chase_into_ok;
+// vta.guest_extents   = &vge;
+   vta.guest_extents   = &trans_table[trans_table_used];
+   vta.host_bytes      = transbuf;
+   vta.host_bytes_size = N_TRANSBUF;
+   vta.host_bytes_used = &trans_used;
+   vta.instrument1     = NULL;
+   vta.instrument2     = NULL;
+   vta.do_self_check   = False;
+   vta.traceflags      = verbose ? TEST_FLAGS : DEBUG_TRACE_FLAGS;
+   vta.dispatch        = NULL;
+
+   tres = LibVEX_Translate ( &vta );
+
    assert(tres == VexTransOK);
    ws_needed = (trans_used+7) / 8;
    assert(ws_needed > 0);
@@ -1129,19 +1136,19 @@
    get_R2();
 
 #if !defined(__powerpc64__)   // ppc32
-   gst.guest_CIA  = (UInt)entryP;
-   gst.guest_GPR1 = (UInt)&gstack[25000]; /* stack pointer */
-   gst.guest_GPR3 = (UInt)serviceFn;      /* param to entry */
-   gst.guest_GPR2 = saved_R2;
-   gst.guest_LR   = 0x12345678;           /* bogus return address */
+   gst.guest_CIA   = (UInt)entryP;
+   gst.guest_GPR1  = (UInt)&gstack[25000]; /* stack pointer */
+   gst.guest_GPR3  = (UInt)serviceFn;      /* param to entry */
+   gst.guest_GPR2  = saved_R2;
+   gst.guest_LR    = 0x12345678;           /* bogus return address */
 #else // ppc64
    get_R13();
-   gst.guest_CIA   = * (ULong*)entryP;
-   gst.guest_GPR1  = (ULong)&gstack[25000]; /* stack pointer */
-   gst.guest_GPR3  = (ULong)serviceFn;      /* param to entry */
-   gst.guest_GPR2  = saved_R2;
+   gst.guest_CIA   = * (ULong*)entryP;
+   gst.guest_GPR1  = (ULong)&gstack[25000]; /* stack pointer */
+   gst.guest_GPR3  = (ULong)serviceFn;      /* param to entry */
+   gst.guest_GPR2  = saved_R2;
    gst.guest_GPR13 = saved_R13;
-   gst.guest_LR    = 0x1234567812345678ULL; /* bogus return address */
+   gst.guest_LR    = 0x1234567812345678ULL; /* bogus return address */
 // printf("setting CIA to %p\n", (void*)gst.guest_CIA);
 #endif
 
Modified: trunk/switchback/test_ppc_jm1.c
===================================================================
--- trunk/switchback/test_ppc_jm1.c	2005-12-16 13:40:18 UTC (rev 1498)
+++ trunk/switchback/test_ppc_jm1.c	2005-12-16 13:49:00 UTC (rev 1499)
@@ -119,14 +119,14 @@
 
 //#define DEBUG_ARGS_BUILD
 #if defined (DEBUG_ARGS_BUILD)
-#define AB_DPRINTF(fmt, args...) do { vex_printf(fmt , ##args); } while (0)
+#define AB_DPRINTF(fmt, args...) do { vexxx_printf(fmt , ##args); } while (0)
 #else
 #define AB_DPRINTF(fmt, args...) do { } while (0)
 #endif
 
 //#define DEBUG_FILTER
 #if defined (DEBUG_FILTER)
-#define FDPRINTF(fmt, args...) do { vex_printf(fmt , ##args); } while (0)
+#define FDPRINTF(fmt, args...) do { vexxx_printf(fmt , ##args); } while (0)
 #else
 #define FDPRINTF(fmt, args...) do { } while (0)
 #endif
@@ -304,7 +304,7 @@
 
 /////////////////////////////////////////////////////////////////////
 
-static void vex_log_bytes ( char* p, int n )
+static void vexxx_log_bytes ( char* p, int n )
 {
    int i;
    for (i = 0; i < n; i++)
@@ -312,14 +312,14 @@
 }
 
 /*---------------------------------------------------------*/
-/*--- vex_printf                                        ---*/
+/*--- vexxx_printf                                      ---*/
 /*---------------------------------------------------------*/
 
 /* This should be the only <...> include in the entire VEX library.
    New code for vex_util.c should go above this point. */
 #include <stdarg.h>
 
-static HChar vex_toupper ( HChar c )
+static HChar vexxx_toupper ( HChar c )
 {
    if (c >= 'a' && c <= 'z')
       return c + ('A' - 'a');
@@ -327,14 +327,14 @@
       return c;
 }
 
-static Int vex_strlen ( const HChar* str )
+static Int vexxx_strlen ( const HChar* str )
 {
    Int i = 0;
    while (str[i] != 0) i++;
    return i;
 }
 
-Bool vex_streq ( const HChar* s1, const HChar* s2 )
+Bool vexxx_streq ( const HChar* s1, const HChar* s2 )
 {
    while (True) {
       if (*s1 == 0 && *s2 == 0)
@@ -358,10 +358,10 @@
 myvprintf_str ( void(*send)(HChar), Int flags, Int width, HChar* str,
                 Bool capitalise )
 {
-#  define MAYBE_TOUPPER(ch) (capitalise ? vex_toupper(ch) : (ch))
+#  define MAYBE_TOUPPER(ch) (capitalise ? vexxx_toupper(ch) : (ch))
    UInt ret = 0;
    Int i, extra;
-   Int len = vex_strlen(str);
+   Int len = vexxx_strlen(str);
 
    if (width == 0) {
       ret += len;
@@ -606,7 +606,7 @@
 static void add_to_myprintf_buf ( HChar c )
 {
    if (c == '\n' || n_myprintf_buf >= 1000-10 /*paranoia*/ ) {
-      (*vex_log_bytes)( myprintf_buf, vex_strlen(myprintf_buf) );
+      (*vexxx_log_bytes)( myprintf_buf, vexxx_strlen(myprintf_buf) );
       n_myprintf_buf = 0;
       myprintf_buf[n_myprintf_buf] = 0;
    }
@@ -614,7 +614,7 @@
    myprintf_buf[n_myprintf_buf] = 0;
 }
 
-static UInt vex_printf ( const char *format, ... )
+static UInt vexxx_printf ( const char *format, ... )
 {
    UInt ret;
    va_list vargs;
@@ -625,7 +625,7 @@
    ret = vprintf_wrk ( add_to_myprintf_buf, format, vargs );
 
    if (n_myprintf_buf > 0) {
-      (*vex_log_bytes)( myprintf_buf, n_myprintf_buf );
+      (*vexxx_log_bytes)( myprintf_buf, n_myprintf_buf );
   }
 
   va_end(vargs);
@@ -3576,14 +3576,14 @@
 static int nb_ii16;
 
 static inline void register_farg (void *farg,
-                                  int s, uint16_t exp, uint64_t mant)
+                                  int s, uint16_t _exp, uint64_t mant)
 {
    uint64_t tmp;
 
-   tmp = ((uint64_t)s << 63) | ((uint64_t)exp << 52) | mant;
+   tmp = ((uint64_t)s << 63) | ((uint64_t)_exp << 52) | mant;
    *(uint64_t *)farg = tmp;
    AB_DPRINTF("%d %03x %013llx => %016llx %0e\n",
-              s, exp, mant, *(uint64_t *)farg, *(double *)farg);
+              s, _exp, mant, *(uint64_t *)farg, *(double *)farg);
 }
 
 static void build_fargs_table (void)
@@ -3603,7 +3603,7 @@
     * (8 values)
     */
    uint64_t mant;
-   uint16_t exp, e0, e1;
+   uint16_t _exp, e0, e1;
    int s;
    int i;
 
@@ -3614,11 +3614,11 @@
       for (e1 = 0x000; ; e1 = ((e1 + 1) << 2) + 6) {
          if (e1 >= 0x400)
             e1 = 0x3fe;
-         exp = (e0 << 10) | e1;
+         _exp = (e0 << 10) | e1;
          for (mant = 0x0000000000001ULL; mant < (1ULL << 52);
               /* Add 'random' bits */
               mant = ((mant + 0x4A6) << 13) + 0x359) {
-            register_farg(&fargs[i++], s, exp, mant);
+            register_farg(&fargs[i++], s, _exp, mant);
          }
          if (e1 == 0x3fe)
            break;
@@ -3628,44 +3628,44 @@
    /* Special values */
    /* +0.0      : 0 0x000 0x0000000000000 */
    s = 0;
-   exp = 0x000;
+   _exp = 0x000;
    mant = 0x0000000000000ULL;
-   register_farg(&fargs[i++], s, exp, mant);
+   register_farg(&fargs[i++], s, _exp, mant);
    /* -0.0      : 1 0x000 0x0000000000000 */
    s = 1;
-   exp = 0x000;
+   _exp = 0x000;
    mant = 0x0000000000000ULL;
-   register_farg(&fargs[i++], s, exp, mant);
+   register_farg(&fargs[i++], s, _exp, mant);
    /* +infinity : 0 0x7FF 0x0000000000000 */
   s = 0;
-   exp = 0x7FF;
+   _exp = 0x7FF;
   mant = 0x0000000000000ULL;
-   register_farg(&fargs[i++], s, exp, mant);
   /* -infinity : 1 0x7FF 0x0000000000000 */
+   register_farg(&fargs[i++], s, _exp, mant);
   s = 1;
-   exp = 0x7FF;
+   _exp = 0x7FF;
   mant = 0x0000000000000ULL;
-   register_farg(&fargs[i++], s, exp, mant);
+   register_farg(&fargs[i++], s, _exp, mant);
   /* +SNaN     : 0 0x7FF 0x7FFFFFFFFFFFF */
   s = 0;
-   exp = 0x7FF;
+   _exp = 0x7FF;
   mant = 0x7FFFFFFFFFFFFULL;
-   register_farg(&fargs[i++], s, exp, mant);
+   register_farg(&fargs[i++], s, _exp, mant);
   /* -SNaN     : 1 0x7FF 0x7FFFFFFFFFFFF */
   s = 1;
-   exp = 0x7FF;
+   _exp = 0x7FF;
   mant = 0x7FFFFFFFFFFFFULL;
-   register_farg(&fargs[i++], s, exp, mant);
+   register_farg(&fargs[i++], s, _exp, mant);
   /* +QNaN     : 0 0x7FF 0x8000000000000 */
   s = 0;
-   exp = 0x7FF;
+   _exp = 0x7FF;
   mant = 0x8000000000000ULL;
-   register_farg(&fargs[i++], s, exp, mant);
+   register_farg(&fargs[i++], s, _exp, mant);
   /* -QNaN     : 1 0x7FF 0x8000000000000 */
   s = 1;
-   exp = 0x7FF;
+   _exp = 0x7FF;
   mant = 0x8000000000000ULL;
-   register_farg(&fargs[i++], s, exp, mant);
+   register_farg(&fargs[i++], s, _exp, mant);
   AB_DPRINTF("Registered %d floats values\n", i);
   nb_fargs = i;
 }
@@ -3714,7 +3714,7 @@
    int i, j, k;
 
    if (verbose > 1)
-      vex_printf( "Test instruction %s\n", name);
+      vexxx_printf( "Test instruction %s\n", name);
    for (i = 0; i < nb_iargs; i++) {
       for (j = 0; j < nb_iargs; j++) {
          for (k = 0;k < nb_iargs; k++) {
@@ -3730,14 +3730,14 @@
             __asm__ __volatile__ ("mfxer 18");
             xer = r18;
             res = r17;
-            vex_printf("%s %08x, %08x, %08x => %08x (%08x %08x)\n",
+            vexxx_printf("%s %08x, %08x, %08x => %08x (%08x %08x)\n",
                        name, iargs[i], iargs[j], iargs[k], res, flags, xer);
          }
-         vex_printf("\n");
+         vexxx_printf("\n");
      }
-      vex_printf("\n");
+      vexxx_printf("\n");
   }
-   vex_printf("\n");
+   vexxx_printf("\n");
 }
 
 static void test_int_two_args (const unsigned char *name, test_func_t func)
@@ -3746,7 +3746,7 @@
    int i, j;
 
    if (verbose > 1)
-      vex_printf( "Test instruction %s\n", name);
+      vexxx_printf( "Test instruction %s\n", name);
    for (i = 0; i < nb_iargs; i++) {
       for (j = 0; j < nb_iargs; j++) {
          r14 = iargs[i];
@@ -3760,12 +3760,12 @@
          __asm__ __volatile__ ("mfxer 18");
         xer = r18;
         res = r17;
-         vex_printf("%s %08x, %08x => %08x (%08x %08x)\n",
+         vexxx_printf("%s %08x, %08x => %08x (%08x %08x)\n",
                    name, iargs[i], iargs[j], res, flags, xer);
      }
-      vex_printf("\n");
+      vexxx_printf("\n");
   }
-   vex_printf("\n");
+   vexxx_printf("\n");
 }
 
 static void test_int_one_arg (const unsigned char *name, test_func_t func)
@@ -3774,7 +3774,7 @@
    int i;
 
    if (verbose > 1)
-      vex_printf( "Test instruction %s\n", name);
+      vexxx_printf( "Test instruction %s\n", name);
    for (i = 0; i < nb_iargs; i++) {
      r14 = iargs[i];
      r18 = 0;
@@ -3787,10 +3787,10 @@
      flags = r18;
      __asm__ __volatile__ ("mfxer 18");
      xer = r18;
-      vex_printf("%s %08x => %08x (%08x %08x)\n",
+      vexxx_printf("%s %08x => %08x (%08x %08x)\n",
                name, iargs[i], res, flags, xer);
   }
-   vex_printf("\n");
+   vexxx_printf("\n");
 }
 
 static inline void _patch_op_imm (void *out, void *in,
@@ -3826,19 +3826,19 @@
    int i, j;
 
    if (verbose > 1)
-      vex_printf( "Test instruction %s\n", name);
+      vexxx_printf( "Test instruction %s\n", name);
    for (i = 0; i < nb_iargs; i++) {
       for (j = 0; j < nb_ii16; j++) {
         p = (void *)func;
 #if 0
-         vex_printf("copy func %s from %p to %p (%08x %08x)\n",
+         vexxx_printf("copy func %s from %p to %p (%08x %08x)\n",
                   name, func, func_buf, p[0], p[1]);
 #endif
        func_buf[1] = p[1];
        patch_op_imm16(func_buf, p, ii16[j]);
        func = (void *)func_buf;
 #if 0
-         vex_printf(" => func %s from %p to %p (%08x %08x)\n",
+         vexxx_printf(" => func %s from %p to %p (%08x %08x)\n",
                   name, func, func_buf, func_buf[0], func_buf[1]);
 #endif
        r14 = iargs[i];
@@ -3851,12 +3851,12 @@
        __asm__ __volatile__ ("mfxer 18");
        xer = r18;
        res = r17;
-        vex_printf("%s %08x, %08x => %08x (%08x %08x)\n",
+        vexxx_printf("%s %08x, %08x => %08x (%08x %08x)\n",
                  name, iargs[i], ii16[j], res, flags, xer);
     }
-     vex_printf("\n");
+     vexxx_printf("\n");
  }
-  vex_printf("\n");
+  vexxx_printf("\n");
 }
 
 /* Special test cases for:
@@ -3878,7 +3878,7 @@
    int i, j, k, l;
 
    if (verbose > 1)
-      vex_printf( "Test instruction %s\n", name);
+      vexxx_printf( "Test instruction %s\n", name);
    for (i = 0;;) {
      if (i >= nb_iargs)
         i = nb_iargs - 1;
@@ -3901,14 +3901,14 @@
               __asm__ __volatile__ ("mfxer 18");
               xer = r18;
               res = r17;
-               vex_printf("%s %08x, %d, %d, %d => %08x (%08x %08x)\n",
+               vexxx_printf("%s %08x, %d, %d, %d => %08x (%08x %08x)\n",
                         name, iargs[i], j, k, l, res, flags, xer);
           }
-           vex_printf("\n");
+           vexxx_printf("\n");
        }
-        vex_printf("\n");
+        vexxx_printf("\n");
    }
-     vex_printf("\n");
+     vexxx_printf("\n");
    if (i == 0)
       i = 1;
    else if (i == nb_iargs - 1)
@@ -3916,7 +3916,7 @@
    else
       i += 3;
  }
-  vex_printf("\n");
+  vexxx_printf("\n");
 }
 
 static void rlwnm_cb (const unsigned char *name, test_func_t func)
@@ -3926,7 +3926,7 @@
    int i, j, k, l;
 
    if (verbose > 1)
-      vex_printf( "Test instruction %s\n", name);
+      vexxx_printf( "Test instruction %s\n", name);
    for (i = 0; i < nb_iargs; i++) {
      for (j = 0; j < 64; j++) {
        for (k = 0; k < 32; k++) {
@@ -3947,16 +3947,16 @@
             __asm__ __volatile__ ("mfxer 18");
             xer = r18;
             res = r17;
-             vex_printf("%s %08x, %08x, %d, %d => %08x (%08x %08x)\n",
+             vexxx_printf("%s %08x, %08x, %d, %d => %08x (%08x %08x)\n",
                       name, iargs[i], j, k, l, res, flags, xer);
         }
-          vex_printf("\n");
+          vexxx_printf("\n");
      }
-       vex_printf("\n");
+       vexxx_printf("\n");
   }
-    vex_printf("\n");
+    vexxx_printf("\n");
 }
-  vex_printf("\n");
+  vexxx_printf("\n");
 }
 
 static void srawi_cb (const unsigned char *name, test_func_t func)
@@ -3966,7 +3966,7 @@
    int i, j;
 
    if (verbose > 1)
-      vex_printf( "Test instruction %s\n", name);
+      vexxx_printf( "Test instruction %s\n", name);
    for (i = 0; i < nb_iargs; i++) {
      for (j = 0; j < 32; j++) {
        p = (void *)func;
@@ -3983,12 +3983,12 @@
        __asm__ __volatile__ ("mfxer 18");
        xer = r18;
        res = r17;
-        vex_printf("%s %08x, %d => %08x (%08x %08x)\n",
+        vexxx_printf("%s %08x, %d => %08x (%08x %08x)\n",
                  name, iargs[i], j, res, flags, xer);
     }
-      vex_printf("\n");
+      vexxx_printf("\n");
  }
-   vex_printf("\n");
+   vexxx_printf("\n");
 }
 
 typedef struct special_t special_t;
@@ -4007,7 +4007,7 @@
        continue;
     for (i = 0; table[i].name != NULL; i++) {
 #if 0
-       vex_printf( "look for handler for '%s' (%s)\n", name,
+       vexxx_printf( "look for handler for '%s' (%s)\n", name,
                  table[i].name);
 #endif
       if (my_strcmp(table[i].name, tmp) == 0) {
@@ -4015,7 +4015,7 @@
          return;
      }
   }
-   vex_printf( "ERROR: no test found for op '%s'\n", name);
+   vexxx_printf( "ERROR: no test found for op '%s'\n", name);
 }
 
 static special_t special_int_ops[] = {
@@ -4093,7 +4093,7 @@
    int i, j, k;
 
    if (verbose > 1)
-      vex_printf( "Test instruction %s\n", name);
+      vexxx_printf( "Test instruction %s\n", name);
    for (i = 0; i < nb_fargs; i++) {
      for (j = 0; j < nb_fargs; j++) {
        for (k = 0;k < nb_fargs; k++) {
@@ -4113,14 +4113,14 @@
           flags = r18;
           res = f17;
           ur = *(uint64_t *)(&res);
-           vex_printf("%s %016llx, %016llx, %016llx => %016llx (%08x)\n",
+           vexxx_printf("%s %016llx, %016llx, %016llx => %016llx (%08x)\n",
                     name, u0, u1, u2, ur, flags);
       }
-        vex_printf("\n");
+        vexxx_printf("\n");
    }
-     vex_printf("\n");
+     vexxx_printf("\n");
  }
-   vex_printf("\n");
+   vexxx_printf("\n");
 }
 
 static void test_float_two_args (const unsigned char *name, test_func_t func)
@@ -4131,7 +4131,7 @@
    int i, j;
 
    if (verbose > 1)
-      vex_printf( "Test instruction %s\n", name);
+      vexxx_printf( "Test instruction %s\n", name);
    for (i = 0; i < nb_fargs; i++) {
      for (j = 0; j < nb_fargs; j++) {
        u0 = *(uint64_t *)(&fargs[i]);
@@ -4148,12 +4148,12 @@
        flags = r18;
        res = f17;
        ur = *(uint64_t *)(&res);
-        vex_printf("%s %016llx, %016llx => %016llx (%08x)\n",
+        vexxx_printf("%s %016llx, %016llx => %016llx (%08x)\n",
                  name, u0, u1, ur, flags);
     }
-      vex_printf("\n");
+      vexxx_printf("\n");
  }
-   vex_printf("\n");
+   vexxx_printf("\n");
 }
 
 static void test_float_one_arg (const unsigned char *name, test_func_t func)
@@ -4164,7 +4164,7 @@
    int i;
 
    if (verbose > 1)
-      vex_printf( "Test instruction %s\n", name);
+      vexxx_printf( "Test instruction %s\n", name);
    for (i = 0; i < nb_fargs; i++) {
      u0 = *(uint64_t *)(&fargs[i]);
      f14 = fargs[i];
@@ -4178,9 +4178,9 @@
      flags = r18;
      res = f17;
      ur = *(uint64_t *)(&res);
-      vex_printf("%s %016llx => %016llx (%08x)\n", name, u0, ur, flags);
+      vexxx_printf("%s %016llx => %016llx (%08x)\n", name, u0, ur, flags);
  }
-   vex_printf("\n");
+   vexxx_printf("\n");
 }
=20
static special_t special_float_ops[] =3D {
@@ -4259,7 +4259,7 @@
int i, j, k;
=20
if (verbose > 1)
- vex_printf( "Test instruction %s\n", name);
+ vexxx_printf( "Test instruction %s\n", name);
for (i =3D 0; i < nb_iargs; i++) {
for (j =3D 0; j < nb_iargs; j++) {
for (k =3D 0;k < nb_iargs; k++) {
@@ -4278,14 +4278,14 @@
__asm__ __volatile__ ("mfxer 18");
xer =3D r18;
res =3D r17;
- vex_printf("%s %08x, %08x, %08x => %08x (%08x %08x)\n",
+ vexxx_printf("%s %08x, %08x, %08x => %08x (%08x %08x)\n",
name, iargs[i], iargs[j], iargs[k], res, flags, xer);
}
- vex_printf("\n");
+ vexxx_printf("\n");
}
- vex_printf("\n");
+ vexxx_printf("\n");
}
- vex_printf("\n");
+ vexxx_printf("\n");
}
#endif /* defined (IS_PPC405) */
=20
@@ -4316,8 +4316,8 @@
continue;
FDPRINTF("Check '%s' against '%s' (%s match)\n",
name, filter, exact ? "exact" : "starting");
- nlen =3D vex_strlen(name);
- flen =3D vex_strlen(filter);
+ nlen =3D vexxx_strlen(name);
+ flen =3D vexxx_strlen(filter);
if (exact) {
if (nlen == flen && my_memcmp(name, filter, flen) == 0)
ret =3D 1;
@@ -4386,7 +4386,7 @@
loop =3D &float_loops[nb_args - 1];
break;
#else
- vex_printf( "Sorry. "
+ vexxx_printf( "Sorry. "
"PPC floating point instructions tests "
"are disabled on your host\n");
#endif /* !defined (NO_FLOAT) */
@@ -4397,7 +4397,7 @@
loop =3D &tmpl;
break;
#else
- vex_printf( "Sorry. "
+ vexxx_printf( "Sorry. "
"PPC405 instructions tests are disabled on your host\n");
continue;
#endif /* defined (IS_PPC405) */
@@ -4407,12 +4407,12 @@
loop =3D &altivec_int_loops[nb_args - 1];
break;
#else
- vex_printf( "Sorry. "
+ vexxx_printf( "Sorry. "
"Altivec instructions tests are not yet implemented\n");
continue;
#endif
#else
- vex_printf( "Sorry. "
+ vexxx_printf( "Sorry. "
"Altivec instructions tests are disabled on your host\n");
continue;
#endif
@@ -4422,36 +4422,36 @@
loop =3D &altivec_float_loops[nb_args - 1];
break;
#else
- vex_printf( "Sorry. "
+ vexxx_printf( "Sorry. "
"Altivec instructions tests are not yet implemented\n");
continue;
#endif
#else
- vex_printf( "Sorry. "
+ vexxx_printf( "Sorry. "
"Altivec float instructions tests "
"are disabled on your host\n");
#endif
continue;
default:
- vex_printf("ERROR: unknown insn family %08x\n", family);
+ vexxx_printf("ERROR: unknown insn family %08x\n", family);
continue;
}
if (verbose > 0)
- vex_printf( "%s:\n", all_tests[i].name);
+ vexxx_printf( "%s:\n", all_tests[i].name);
for (j =3D 0; tests[j].name !=3D NULL; j++) {
if (check_name(tests[j].name, filter, exact))
(*loop)(tests[j].name, tests[j].func);
n++;
}
- vex_printf("\n");
+ vexxx_printf("\n");
}
- vex_printf( "All done. Tested %d different instructions\n", n);
+ vexxx_printf( "All done. Tested %d different instructions\n", n);
}
=20
#if 0 // unused
static void usage (void)
{
- vex_printf(
+ vexxx_printf(
"test-ppc [-1] [-2] [-3] [-*] [-t <type>] [-f <family>] [-u]"
"[-n <filter>] [-x] [-h]\n"
"\t-1: test opcodes with one argument\n"
@@ -4479,7 +4479,7 @@
}
#endif
=20
-int main (int argc, char **argv)
+int _main (int argc, char **argv)
{
unsigned char /* *tmp, */ *filter =3D NULL;
int one_arg =3D 0, two_args =3D 0, three_args =3D 0;
@@ -4540,17 +4540,17 @@
// break;
// default:
// usage();
- // vex_printf( "Unknown argument: '%c'\n", c);
+ // vexxx_printf( "Unknown argument: '%c'\n", c);
// return 1;
// bad_arg:
// usage();
- // vex_printf( "Bad argument for '%c': '%s'\n", c, tmp);
+ // vexxx_printf( "Bad argument for '%c': '%s'\n", c, tmp);
// return 1;
// }
// }
// if (argc !=3D optind) {
// usage();
- // vex_printf( "Bad number of arguments\n");
+ // vexxx_printf( "Bad number of arguments\n");
// return 1;
// }
=20
@@ -4608,6 +4608,6 @@
{
char* argv[2] =3D { NULL, NULL };
serviceFn =3D service;
- main(0, argv);
+ _main(0, argv);
(*service)(0,0);
}
From: <sv...@va...> - 2005-12-16 13:40:23
Author: cerion
Date: 2005-12-16 13:40:18 +0000 (Fri, 16 Dec 2005)
New Revision: 1498
Log:
Fixed up front and backend for 32bit mul,div,cmp,shift in mode64
Backend:
- separated shifts from other alu ops
- gave {shift, mul, div, cmp} ops a bool to indicate 32|64bit insn
- fixed and implemented more mode64 cases
Also improved some IR by moving imm's to right arg of binop - backend assumes this.
All integer ppc32 insns now pass switchback tests in 64bit mode.
(ppc64-only insns not yet fully tested)
Modified:
trunk/priv/guest-ppc32/toIR.c
trunk/priv/host-ppc32/hdefs.c
trunk/priv/host-ppc32/hdefs.h
trunk/priv/host-ppc32/isel.c
Modified: trunk/priv/guest-ppc32/toIR.c
===================================================================
--- trunk/priv/guest-ppc32/toIR.c 2005-12-16 01:06:42 UTC (rev 1497)
+++ trunk/priv/guest-ppc32/toIR.c 2005-12-16 13:40:18 UTC (rev 1498)
@@ -749,13 +749,18 @@
binop(Iop_ShrV128, expr_vA, mkU8(16)), \
binop(Iop_ShrV128, expr_vB, mkU8(16)))
=20
-/* */
-static IRExpr* /* :: Ity_I64 */ mkExtendLoS32 ( IRExpr* src )
+static IRExpr* /* :: Ity_I32/64 */ mk64lo32Sto64 ( IRExpr* src )
{
vassert(typeOfIRExpr(irbb->tyenv, src) =3D=3D Ity_I64);
return unop(Iop_32Sto64, unop(Iop_64to32, src));
}
=20
+static IRExpr* /* :: Ity_I32/64 */ mk64lo32Uto64 ( IRExpr* src )
+{
+ vassert(typeOfIRExpr(irbb->tyenv, src) =3D=3D Ity_I64);
+ return unop(Iop_32Uto64, unop(Iop_64to32, src));
+}
+
static IROp mkSzOp ( IRType ty, IROp op8 )
{
Int adj;
@@ -1551,28 +1556,28 @@
static void putXER_SO ( IRExpr* e )
{
vassert(typeOfIRExpr(irbb->tyenv, e) =3D=3D Ity_I8);
- IRExpr* so =3D binop(Iop_And8, mkU8(1), e);
+ IRExpr* so =3D binop(Iop_And8, e, mkU8(1));
stmt( IRStmt_Put( (mode64 ? OFFB64_XER_SO : OFFB32_XER_SO), so) );
}
=20
static void putXER_OV ( IRExpr* e )
{
vassert(typeOfIRExpr(irbb->tyenv, e) =3D=3D Ity_I8);
- IRExpr* ov =3D binop(Iop_And8, mkU8(1), e);
+ IRExpr* ov =3D binop(Iop_And8, e, mkU8(1));
stmt( IRStmt_Put( (mode64 ? OFFB64_XER_OV : OFFB32_XER_OV), ov) );
}
=20
static void putXER_CA ( IRExpr* e )
{
vassert(typeOfIRExpr(irbb->tyenv, e) =3D=3D Ity_I8);
- IRExpr* ca =3D binop(Iop_And8, mkU8(1), e);
+ IRExpr* ca =3D binop(Iop_And8, e, mkU8(1));
stmt( IRStmt_Put( (mode64 ? OFFB64_XER_CA : OFFB32_XER_CA), ca) );
}
=20
static void putXER_BC ( IRExpr* e )
{
vassert(typeOfIRExpr(irbb->tyenv, e) =3D=3D Ity_I8);
- IRExpr* bc =3D binop(Iop_And8, mkU8(0x7F), e);
+ IRExpr* bc =3D binop(Iop_And8, e, mkU8(0x7F));
stmt( IRStmt_Put( (mode64 ? OFFB64_XER_BC : OFFB32_XER_BC), bc) );
}
=20
@@ -1792,14 +1797,13 @@
case /* 4 */ PPC32G_FLAG_OP_MULLW: {
/* OV true if result can't be represented in 64 bits
i.e sHi !=3D sign extension of sLo */
- IRTemp t128 =3D newTemp(Ity_I128);
- assign( t128, binop(Iop_MullS64, argL, argR) );
xer_ov=20
- =3D binop( Iop_CmpNE64,
- unop(Iop_128HIto64, mkexpr(t128)),
- binop( Iop_Sar64,=20
- unop(Iop_128to64, mkexpr(t128)),=20
- mkU8(63)) );
+ =3D binop( Iop_CmpNE32,
+ unop(Iop_64HIto32, res),
+ binop( Iop_Sar32,=20
+ unop(Iop_64to32, res),=20
+ mkU8(31))
+ );
break;
}
=20
@@ -2647,13 +2651,11 @@
overflow of the low-order 32bit result
CR0[LT|GT|EQ] are undefined if flag_rC && mode64
*/
- IRExpr* dividend =3D unop(Iop_32Sto64,
- unop(Iop_64to32, mkexpr(rA)));
- IRExpr* divisor =3D unop(Iop_32Sto64,
- unop(Iop_64to32, mkexpr(rB)));
- assign( rD, unop(Iop_32Uto64,
- unop(Iop_64to32,
- binop(Iop_DivS64, dividend, divisor))));
+ /* rD[hi32] are undefined: setting them to sign of lo32
+ - makes set_CR0 happy */
+ IRExpr* dividend =3D mk64lo32Sto64( mkexpr(rA) );
+ IRExpr* divisor =3D mk64lo32Sto64( mkexpr(rB) );
+ assign( rD, mk64lo32Uto64( binop(Iop_DivS64, dividend, divisor) ) );
if (flag_OE) {
set_XER_OV( ty, PPC32G_FLAG_OP_DIVW,=20
mkexpr(rD), dividend, divisor );
@@ -2681,13 +2683,9 @@
overflow of the low-order 32bit result
CR0[LT|GT|EQ] are undefined if flag_rC && mode64
*/
- IRExpr* dividend =3D unop(Iop_32Uto64,
- unop(Iop_64to32, mkexpr(rA)));
- IRExpr* divisor =3D unop(Iop_32Uto64,
- unop(Iop_64to32, mkexpr(rB)));
- assign( rD, unop(Iop_32Uto64,
- unop(Iop_64to32,
- binop(Iop_DivU64, dividend, divisor))));
+ IRExpr* dividend =3D mk64lo32Uto64( mkexpr(rA) );
+ IRExpr* divisor =3D mk64lo32Uto64( mkexpr(rB) );
+ assign( rD, mk64lo32Uto64( binop(Iop_DivU64, dividend, divisor) ) );
if (flag_OE) {
set_XER_OV( ty, PPC32G_FLAG_OP_DIVWU,=20
mkexpr(rD), dividend, divisor );
@@ -2710,9 +2708,13 @@
DIP("mulhw%s r%u,r%u,r%u\n", flag_rC ? "." : "",
rD_addr, rA_addr, rB_addr);
if (mode64) {
- assign( rD, unop(Iop_128HIto64,
- binop(Iop_MullS64,
- mkexpr(rA), mkexpr(rB))) );
+ /* rD[hi32] are undefined: setting them to sign of lo32
+ - makes set_CR0 happy */
+ assign( rD, binop(Iop_Sar64,
+ binop(Iop_Mul64,
+ mk64lo32Sto64( mkexpr(rA) ),
+ mk64lo32Sto64( mkexpr(rB) )),
+ mkU8(32)) );
} else {
assign( rD, unop(Iop_64HIto32,
binop(Iop_MullS32,
@@ -2728,9 +2730,13 @@
DIP("mulhwu%s r%u,r%u,r%u\n", flag_rC ? "." : "",
rD_addr, rA_addr, rB_addr);
if (mode64) {
- assign( rD, unop(Iop_128HIto64,
- binop(Iop_MullU64,
- mkexpr(rA), mkexpr(rB))) );
+ /* rD[hi32] are undefined: setting them to sign of lo32
+ - makes set_CR0 happy */
+ assign( rD, binop(Iop_Sar64,
+ binop(Iop_Mul64,
+ mk64lo32Uto64( mkexpr(rA) ),
+ mk64lo32Uto64( mkexpr(rB) ) ),
+ mkU8(32)) );
} else {
assign( rD, unop(Iop_64HIto32,=20
binop(Iop_MullU32,
@@ -2743,18 +2749,25 @@
flag_OE ? "o" : "", flag_rC ? "." : "",
rD_addr, rA_addr, rB_addr);
if (mode64) {
- assign( rD, unop(Iop_128to64,
- binop(Iop_MullU64,
- mkexpr(rA), mkexpr(rB))) );
+ /* rD[hi32] are undefined: setting them to sign of lo32
+ - set_XER_OV() and set_CR0() depend on this */
+ IRExpr *a =3D unop(Iop_64to32, mkexpr(rA) );
+ IRExpr *b =3D unop(Iop_64to32, mkexpr(rB) );
+ assign( rD, binop(Iop_MullS32, a, b) );
+ if (flag_OE) {
+ set_XER_OV( ty, PPC32G_FLAG_OP_MULLW,=20
+ mkexpr(rD),
+ unop(Iop_32Uto64, a), unop(Iop_32Uto64, b) );
+ }
} else {
assign( rD, unop(Iop_64to32,
binop(Iop_MullU32,
mkexpr(rA), mkexpr(rB))) );
+ if (flag_OE) {
+ set_XER_OV( ty, PPC32G_FLAG_OP_MULLW,=20
+ mkexpr(rD), mkexpr(rA), mkexpr(rB) );
+ }
}
- if (flag_OE) {
- set_XER_OV( ty, PPC32G_FLAG_OP_MULLW,=20
- mkexpr(rD), mkexpr(rA), mkexpr(rB) );
- }
break;
=20
case 0x068: // neg (Negate, PPC32 p493)
@@ -2999,14 +3012,10 @@
UInt opc2 =3D ifieldOPClo10(theInstr);
UChar b0 =3D ifieldBIT0(theInstr);
=20
- IRType ty =3D mode64 ? Ity_I64 : Ity_I32;
- IRTemp rA =3D newTemp(ty);
- IRTemp rB =3D newTemp(ty);
- IRExpr *a, *b;
+ IRType ty =3D mode64 ? Ity_I64 : Ity_I32;
+ IRExpr *a =3D getIReg(rA_addr);
+ IRExpr *b;
=20
- assign(rA, getIReg(rA_addr));
- a =3D mkexpr(rA);
- =20
if (!mode64 && flag_L=3D=3D1) { // L=3D=3D1 invalid for 32 bit.
vex_printf("dis_int_cmp(PPC32)(flag_L)\n");
return False;
@@ -3022,10 +3031,11 @@
DIP("cmpi cr%u,%u,r%u,%d\n", crfD, flag_L, rA_addr,
(Int)extend_s_16to32(uimm16));
b =3D mkSzExtendS16( ty, uimm16 );
- if (mode64) {
- if (flag_L =3D=3D 0) a =3D mkExtendLoS32( mkexpr(rA) );
+ if (flag_L =3D=3D 1) {
putCR321(crfD, unop(Iop_64to8, binop(Iop_CmpORD64S, a, b)));
} else {
+ a =3D mkSzNarrow32( ty, a );
+ b =3D mkSzNarrow32( ty, b );
putCR321(crfD, unop(Iop_32to8, binop(Iop_CmpORD32S, a, b)));
}
putCR0( crfD, getXER_SO() );
@@ -3034,10 +3044,11 @@
case 0x0A: // cmpli (Compare Logical Immediate, PPC32 p370)
DIP("cmpli cr%u,%u,r%u,0x%x\n", crfD, flag_L, rA_addr, uimm16);
b =3D mkSzImm( ty, uimm16 );
- if (mode64) {
- if (flag_L =3D=3D 0) a =3D mkExtendLoS32( mkexpr(rA) );
+ if (flag_L =3D=3D 1) {
putCR321(crfD, unop(Iop_64to8, binop(Iop_CmpORD64U, a, b)));
} else {
+ a =3D mkSzNarrow32( ty, a );
+ b =3D mkSzNarrow32( ty, b );
putCR321(crfD, unop(Iop_32to8, binop(Iop_CmpORD32U, a, b)));
}
putCR0( crfD, getXER_SO() );
@@ -3049,34 +3060,29 @@
vex_printf("dis_int_cmp(PPC32)(0x1F,b0)\n");
return False;
}
- assign(rB, getIReg(rB_addr));
- b =3D mkexpr(rB);
- if (mode64 && flag_L =3D=3D 0) {
- a =3D mkExtendLoS32( mkexpr(rA) );
- b =3D mkExtendLoS32( mkexpr(rB) );
- }
+ b =3D getIReg(rB_addr);
=20
switch (opc2) {
case 0x000: // cmp (Compare, PPC32 p367)
DIP("cmp cr%u,%u,r%u,r%u\n", crfD, flag_L, rA_addr, rB_addr);
- if (mode64) {
- putCR321( crfD, unop(Iop_64to8,
- binop(Iop_CmpORD64S, a, b)) );
+ if (flag_L =3D=3D 1) {
+ putCR321(crfD, unop(Iop_64to8, binop(Iop_CmpORD64S, a, b)));
} else {
- putCR321( crfD, unop(Iop_32to8,
- binop(Iop_CmpORD32S, a, b)) );
+ a =3D mkSzNarrow32( ty, a );
+ b =3D mkSzNarrow32( ty, b );
+ putCR321(crfD, unop(Iop_32to8,binop(Iop_CmpORD32S, a, b)));
}
putCR0( crfD, getXER_SO() );
break;
=20
case 0x020: // cmpl (Compare Logical, PPC32 p369)
DIP("cmpl cr%u,%u,r%u,r%u\n", crfD, flag_L, rA_addr, rB_addr);
- if (mode64) {
- putCR321( crfD, unop(Iop_64to8,
- binop(Iop_CmpORD64U, a, b)) );
+ if (flag_L =3D=3D 1) {
+ putCR321(crfD, unop(Iop_64to8, binop(Iop_CmpORD64U, a, b)));
} else {
- putCR321( crfD, unop(Iop_32to8,
- binop(Iop_CmpORD32U, a, b)) );
+ a =3D mkSzNarrow32( ty, a );
+ b =3D mkSzNarrow32( ty, b );
+ putCR321(crfD, unop(Iop_32to8, binop(Iop_CmpORD32U, a, b)));
}
putCR0( crfD, getXER_SO() );
break;
@@ -3196,8 +3202,8 @@
// Iop_Clz32 undefined for arg=3D=3D0, so deal with that case:
irx =3D binop(Iop_CmpNE32, lo32, mkU32(0));
assign(rA, IRExpr_Mux0X( unop(Iop_1Uto8, irx),
- mkU32(32),
- unop(Iop_Clz32, lo32) ));
+ mkSzImm(ty, 32),
+ mkSzWiden32(ty, unop(Iop_Clz32, lo32), False) ));
// TODO: alternatively: assign(rA, verbose_Clz32(rS));
break;
}
@@ -3348,6 +3354,7 @@
IRTemp rS =3D newTemp(ty);
IRTemp rA =3D newTemp(ty);
IRTemp rB =3D newTemp(ty);
+ IRTemp rot =3D newTemp(ty);
IRExpr *r;
UInt mask32;
ULong mask64;
@@ -3366,10 +3373,10 @@
mask64 =3D MASK64(31-MaskEnd, 31-MaskBeg);
r =3D ROTL( unop(Iop_64to32, mkexpr(rS) ), mkU8(sh_imm) );
r =3D unop(Iop_32Uto64, r);
- r =3D binop(Iop_Or64, r, binop(Iop_Shl64, r, mkU8(32)));
+ assign( rot, binop(Iop_Or64, r, binop(Iop_Shl64, r, mkU8(32))) );
assign( rA,
binop(Iop_Or64,
- binop(Iop_And64, r, mkU64(mask64)),
+ binop(Iop_And64, mkexpr(rot), mkU64(mask64)),
binop(Iop_And64, getIReg(rA_addr), mkU64(~mask64))) );
}
else {
@@ -3398,8 +3405,8 @@
// rA =3D ((tmp32 || tmp32) & mask64)
r =3D ROTL( unop(Iop_64to32, mkexpr(rS) ), mkU8(sh_imm) );
r =3D unop(Iop_32Uto64, r);
- r =3D binop(Iop_Or64, r, binop(Iop_Shl64, r, mkU8(32)));
- assign( rA, binop(Iop_And64, r, mkU64(mask64)) );
+ assign( rot, binop(Iop_Or64, r, binop(Iop_Shl64, r, mkU8(32))) );
+ assign( rA, binop(Iop_And64, mkexpr(rot), mkU64(mask64)) );
}
else {
if (MaskBeg =3D=3D 0 && sh_imm+MaskEnd =3D=3D 31) {
@@ -3435,14 +3442,16 @@
rA_addr, rS_addr, rB_addr, MaskBeg, MaskEnd);
if (mode64) {
mask64 =3D MASK64(31-MaskEnd, 31-MaskBeg);
- // tmp32 =3D (ROTL(rS_Lo32, rB[0-4])
- // rA =3D ((tmp32 || tmp32) & mask64)
+ /* weird insn alert!
+ tmp32 =3D (ROTL(rS_Lo32, rB[0-4])
+ rA =3D ((tmp32 || tmp32) & mask64)
+ */
// note, ROTL does the masking, so we don't do it here
r =3D ROTL( unop(Iop_64to32, mkexpr(rS)),
- unop(Iop_32to8, mkexpr(rB)) );
+ unop(Iop_64to8, mkexpr(rB)) );
r =3D unop(Iop_32Uto64, r);
- r =3D binop(Iop_Or64, r, binop(Iop_Shl64, r, mkU8(32)));
- assign( rA, binop(Iop_And64, r, mkU64(mask64)) );
+ assign(rot, binop(Iop_Or64, r, binop(Iop_Shl64, r, mkU8(32))));
+ assign( rA, binop(Iop_And64, mkexpr(rot), mkU64(mask64)) );
} else {
mask32 =3D MASK32(31-MaskEnd, 31-MaskBeg);
// rA =3D ROTL(rS, rB[0-4]) & mask
Modified: trunk/priv/host-ppc32/hdefs.c
===================================================================
--- trunk/priv/host-ppc32/hdefs.c 2005-12-16 01:06:42 UTC (rev 1497)
+++ trunk/priv/host-ppc32/hdefs.c 2005-12-16 13:40:18 UTC (rev 1498)
@@ -573,23 +573,29 @@
}
}
=20
-HChar* showPPC32AluOp ( PPC32AluOp op, Bool immR, Bool is32Bit ) {
+HChar* showPPC32AluOp ( PPC32AluOp op, Bool immR ) {
switch (op) {
case Palu_ADD: return immR ? "addi" : "add";
case Palu_SUB: return immR ? "subi" : "sub";
case Palu_AND: return immR ? "andi." : "and";
case Palu_OR: return immR ? "ori" : "or";
case Palu_XOR: return immR ? "xori" : "xor";
- case Palu_SHL: return is32Bit ? (immR ? "slwi" : "slw") :=20
- (immR ? "sldi" : "sld");
- case Palu_SHR: return is32Bit ? (immR ? "srwi" : "srw") :
- (immR ? "srdi" : "srd");
- case Palu_SAR: return is32Bit ? (immR ? "srawi" : "sraw") :
- (immR ? "sradi" : "srad");
default: vpanic("showPPC32AluOp");
}
}
=20
+HChar* showPPC32ShftOp ( PPC32ShftOp op, Bool immR, Bool sz32 ) {
+ switch (op) {
+ case Pshft_SHL: return sz32 ? (immR ? "slwi" : "slw") :=20
+ (immR ? "sldi" : "sld");
+ case Pshft_SHR: return sz32 ? (immR ? "srwi" : "srw") :
+ (immR ? "srdi" : "srd");
+ case Pshft_SAR: return sz32 ? (immR ? "srawi" : "sraw") :
+ (immR ? "sradi" : "srad");
+ default: vpanic("showPPC32ShftOp");
+ }
+}
+
HChar* showPPC32FpOp ( PPC32FpOp op ) {
switch (op) {
case Pfp_ADD: return "fadd";
@@ -719,6 +725,17 @@
i->Pin.Alu.srcR =3D srcR;
return i;
}
+PPC32Instr* PPC32Instr_Shft ( PPC32ShftOp op, Bool sz32,=20
+ HReg dst, HReg srcL, PPC32RH* srcR ) {
+ PPC32Instr* i =3D LibVEX_Alloc(sizeof(PPC32Instr));
+ i->tag =3D Pin_Shft;
+ i->Pin.Shft.op =3D op;
+ i->Pin.Shft.sz32 =3D sz32;
+ i->Pin.Shft.dst =3D dst;
+ i->Pin.Shft.srcL =3D srcL;
+ i->Pin.Shft.srcR =3D srcR;
+ return i;
+}
PPC32Instr* PPC32Instr_AddSubC32 ( Bool isAdd, Bool setC,
HReg dst, HReg srcL, HReg srcR ) {
PPC32Instr* i =3D LibVEX_Alloc(sizeof(PPC32Instr));
@@ -730,11 +747,12 @@
i->Pin.AddSubC32.srcR =3D srcR;
return i;
}
-PPC32Instr* PPC32Instr_Cmp ( Bool syned, UInt crfD,=20
- HReg srcL, PPC32RH* srcR ) {
+PPC32Instr* PPC32Instr_Cmp ( Bool syned, Bool sz32,=20
+ UInt crfD, HReg srcL, PPC32RH* srcR ) {
PPC32Instr* i =3D LibVEX_Alloc(sizeof(PPC32Instr));
i->tag =3D Pin_Cmp;
i->Pin.Cmp.syned =3D syned;
+ i->Pin.Cmp.sz32 =3D sz32;
i->Pin.Cmp.crfD =3D crfD;
i->Pin.Cmp.srcL =3D srcL;
i->Pin.Cmp.srcR =3D srcR;
@@ -748,12 +766,13 @@
i->Pin.Unary32.src =3D src;
return i;
}
-PPC32Instr* PPC32Instr_MulL ( Bool syned, Bool hi,=20
+PPC32Instr* PPC32Instr_MulL ( Bool syned, Bool hi, Bool sz32,=20
HReg dst, HReg srcL, HReg srcR ) {
PPC32Instr* i =3D LibVEX_Alloc(sizeof(PPC32Instr));
i->tag =3D Pin_MulL;
i->Pin.MulL.syned =3D syned;
i->Pin.MulL.hi =3D hi;
+ i->Pin.MulL.sz32 =3D sz32;
i->Pin.MulL.dst =3D dst;
i->Pin.MulL.srcL =3D srcL;
i->Pin.MulL.srcR =3D srcR;
@@ -762,10 +781,12 @@
if (!hi) vassert(!syned);
return i;
}
-PPC32Instr* PPC32Instr_Div ( Bool syned, HReg dst, HReg srcL, HReg srcR ) {
+PPC32Instr* PPC32Instr_Div ( Bool syned, Bool sz32,
+ HReg dst, HReg srcL, HReg srcR ) {
PPC32Instr* i =3D LibVEX_Alloc(sizeof(PPC32Instr));
i->tag =3D Pin_Div;
i->Pin.Div.syned =3D syned;
+ i->Pin.Div.sz32 =3D sz32;
i->Pin.Div.dst =3D dst;
i->Pin.Div.srcL =3D srcL;
i->Pin.Div.srcR =3D srcR;
@@ -1137,19 +1158,41 @@
ppHRegPPC32(i->Pin.Alu.dst);
vex_printf(",");
ppHRegPPC32(r_srcL);
- } else {
- /* generic */
- vex_printf("%s ", showPPC32AluOp(i->Pin.Alu.op,
- toBool(rh_srcR->tag =3D=3D Prh_Imm),
- toBool(hregClass(r_srcL) == HRcInt32)));
+ return;
+ }
+ /* special-case "li" */
+ if (i->Pin.Alu.op == Palu_ADD &&   // addi Rd,0,imm == li Rd,imm
+ rh_srcR->tag =3D=3D Prh_Imm &&
+ hregNumber(r_srcL) =3D=3D 0) {
+ vex_printf("li ");
ppHRegPPC32(i->Pin.Alu.dst);
vex_printf(",");
- ppHRegPPC32(r_srcL);
- vex_printf(",");
ppPPC32RH(rh_srcR);
+ return;
}
+ /* generic */
+ vex_printf("%s ", showPPC32AluOp(i->Pin.Alu.op,
+ toBool(rh_srcR->tag == Prh_Imm)));
+ ppHRegPPC32(i->Pin.Alu.dst);
+ vex_printf(",");
+ ppHRegPPC32(r_srcL);
+ vex_printf(",");
+ ppPPC32RH(rh_srcR);
return;
}
+ case Pin_Shft: {
+ HReg r_srcL =3D i->Pin.Shft.srcL;
+ PPC32RH* rh_srcR =3D i->Pin.Shft.srcR;
+ vex_printf("%s ", showPPC32ShftOp(i->Pin.Shft.op,
+ toBool(rh_srcR->tag == Prh_Imm),
+ i->Pin.Shft.sz32));
+ ppHRegPPC32(i->Pin.Shft.dst);
+ vex_printf(",");
+ ppHRegPPC32(r_srcL);
+ vex_printf(",");
+ ppPPC32RH(rh_srcR);
+ return;
+ }
case Pin_AddSubC32:
vex_printf("%s%s ",
i->Pin.AddSubC32.isAdd ? "add" : "sub",
@@ -1161,8 +1204,9 @@
ppHRegPPC32(i->Pin.AddSubC32.srcR);
return;
case Pin_Cmp:
- vex_printf("%s%s %%cr%u,",
+ vex_printf("%s%c%s %%cr%u,",
i->Pin.Cmp.syned ? "cmp" : "cmpl",
+ i->Pin.Cmp.sz32 ? 'w' : 'd',
i->Pin.Cmp.srcR->tag =3D=3D Prh_Imm ? "i" : "",
i->Pin.Cmp.crfD);
ppHRegPPC32(i->Pin.Cmp.srcL);
@@ -1176,8 +1220,9 @@
ppHRegPPC32(i->Pin.Unary32.src);
return;
case Pin_MulL:
- vex_printf("mul%s%s ",
- i->Pin.MulL.hi ? "hw" : "lw",
+ vex_printf("mul%c%c%s ",
+ i->Pin.MulL.hi ? 'h' : 'l',
+ i->Pin.MulL.sz32 ? 'w' : 'd',
i->Pin.MulL.hi ? (i->Pin.MulL.syned ? "s" : "u") : "");
ppHRegPPC32(i->Pin.MulL.dst);
vex_printf(",");
@@ -1186,7 +1231,8 @@
ppHRegPPC32(i->Pin.MulL.srcR);
return;
case Pin_Div:
- vex_printf("divw%s ",
+ vex_printf("div%c%s ",
+ i->Pin.Div.sz32 ? 'w' : 'd',
i->Pin.Div.syned ? "" : "u");
ppHRegPPC32(i->Pin.Div.dst);
vex_printf(",");
@@ -1555,6 +1601,11 @@
addRegUsage_PPC32RH(u, i->Pin.Alu.srcR);
addHRegUse(u, HRmWrite, i->Pin.Alu.dst);
return;
+ case Pin_Shft:
+ addHRegUse(u, HRmRead, i->Pin.Shft.srcL);
+ addRegUsage_PPC32RH(u, i->Pin.Shft.srcR);
+ addHRegUse(u, HRmWrite, i->Pin.Shft.dst);
+ return;
case Pin_AddSubC32:
addHRegUse(u, HRmWrite, i->Pin.AddSubC32.dst);
addHRegUse(u, HRmRead, i->Pin.AddSubC32.srcL);
@@ -1800,6 +1851,11 @@
mapReg(m, &i->Pin.Alu.srcL);
mapRegs_PPC32RH(m, i->Pin.Alu.srcR);
return;
+ case Pin_Shft:
+ mapReg(m, &i->Pin.Shft.dst);
+ mapReg(m, &i->Pin.Shft.srcL);
+ mapRegs_PPC32RH(m, i->Pin.Shft.srcR);
+ return;
case Pin_AddSubC32:
mapReg(m, &i->Pin.AddSubC32.dst);
mapReg(m, &i->Pin.AddSubC32.srcL);
@@ -2429,10 +2485,8 @@
UInt r_srcL =3D iregNo(i->Pin.Alu.srcL, mode64);
UInt r_srcR =3D immR ? (-1)/*bogus*/ :
iregNo(srcR->Prh.Reg.reg, mode64);
- Bool is32BitOp = toBool(hregClass(i->Pin.Alu.srcL) == HRcInt32);
=20
switch (i->Pin.Alu.op) {
-
case Palu_ADD:
if (immR) {
/* addi (PPC32 p350) */
@@ -2490,9 +2544,26 @@
}
break;
=20
- case Palu_SHL:
- if (is32BitOp) {
- vassert(!mode64);
+ default:
+ goto bad;
+ }
+ goto done;
+ }
+
+ case Pin_Shft: {
+ PPC32RH* srcR =3D i->Pin.Shft.srcR;
+ Bool sz32 =3D i->Pin.Shft.sz32;
+ Bool immR =3D toBool(srcR->tag =3D=3D Prh_Imm);
+ UInt r_dst =3D iregNo(i->Pin.Shft.dst, mode64);
+ UInt r_srcL =3D iregNo(i->Pin.Shft.srcL, mode64);
+ UInt r_srcR =3D immR ? (-1)/*bogus*/ :
+ iregNo(srcR->Prh.Reg.reg, mode64);
+ if (!mode64)
+ vassert(sz32);
+
+ switch (i->Pin.Shft.op) {
+ case Pshft_SHL:
+ if (sz32) {
if (immR) {
/* rd =3D rs << n, 1 <=3D n <=3D 31
is
@@ -2507,7 +2578,6 @@
p =3D mkFormX(p, 31, r_srcL, r_dst, r_srcR, 24, 0);
}
} else {
- vassert(mode64);
if (immR) {
/* rd =3D rs << n, 1 <=3D n <=3D 63
is
@@ -2524,10 +2594,9 @@
}
break;
=20
- case Palu_SHR:
- if (is32BitOp) {
- vassert(!mode64);
- if (immR) {
+ case Pshft_SHR:
+ if (sz32) {
+ if (immR) {
/* rd =3D rs >>u n, 1 <=3D n <=3D 31
is
rlwinm rd,rs,32-n,n,31 (PPC32 p501)
@@ -2541,7 +2610,6 @@
p =3D mkFormX(p, 31, r_srcL, r_dst, r_srcR, 536, 0);
}
} else {
- vassert(mode64);
if (immR) {
/* rd =3D rs >>u n, 1 <=3D n <=3D 63
is
@@ -2558,9 +2626,8 @@
}
break;
=20
- case Palu_SAR:
- if (is32BitOp) {
- vassert(!mode64);
+ case Pshft_SAR:
+ if (sz32) {
if (immR) {
/* srawi (PPC32 p507) */
UInt n =3D srcR->Prh.Imm.imm16;
@@ -2572,7 +2639,6 @@
p =3D mkFormX(p, 31, r_srcL, r_dst, r_srcR, 792, 0);
}
} else {
- vassert(mode64);
if (immR) {
/* sradi (PPC64 p571) */
UInt n =3D srcR->Prh.Imm.imm16;
@@ -2616,29 +2682,34 @@
=20
case Pin_Cmp: {
Bool syned =3D i->Pin.Cmp.syned;
+ Bool sz32 =3D i->Pin.Cmp.sz32;
UInt fld1 =3D i->Pin.Cmp.crfD << 2;
UInt r_srcL =3D iregNo(i->Pin.Cmp.srcL, mode64);
UInt r_srcR, imm_srcR;
PPC32RH* srcR =3D i->Pin.Cmp.srcR;
=20
+ if (!mode64) // cmp double word invalid for mode32
+ vassert(sz32); =20
+ else if (!sz32) // mode64 && cmp64: set L=3D1
+ fld1 |=3D 1;
+=20
switch (srcR->tag) {
case Prh_Imm:
- /* cmpi (signed) (PPC32 p368) or=20
- cmpli (unsigned) (PPC32 p370) */
+ vassert(syned =3D=3D srcR->Prh.Imm.syned);
imm_srcR =3D srcR->Prh.Imm.imm16;
- if (syned) {
- vassert(srcR->Prh.Imm.syned);
+ if (syned) { // cmpw/di (signed) (PPC32 p368)
vassert(imm_srcR !=3D 0x8000);
- } else {
- vassert(!srcR->Prh.Imm.syned);
+ p =3D mkFormD(p, 11, fld1, r_srcL, imm_srcR);
+ } else { // cmplw/di (unsigned) (PPC32 p370)
+ p =3D mkFormD(p, 10, fld1, r_srcL, imm_srcR);
}
- p =3D mkFormD(p, syned ? 11 : 10, fld1, r_srcL, imm_srcR);
break;
case Prh_Reg:
- /* cmpi (signed) (PPC32 p367) or=20
- cmpli (unsigned) (PPC32 p379) */
r_srcR =3D iregNo(srcR->Prh.Reg.reg, mode64);
- p =3D mkFormX(p, 31, fld1, r_srcL, r_srcR, syned ? 0 : 32, 0);
+ if (syned) // cmpwi (signed) (PPC32 p367)
+ p =3D mkFormX(p, 31, fld1, r_srcL, r_srcR, 0, 0);
+ else // cmplwi (unsigned) (PPC32 p379)
+ p =3D mkFormX(p, 31, fld1, r_srcL, r_srcR, 32, 0);
break;
default:=20
goto bad;
@@ -2667,30 +2738,33 @@
=20
case Pin_MulL: {
Bool syned =3D i->Pin.MulL.syned;
+ Bool sz32 =3D i->Pin.MulL.sz32;
UInt r_dst =3D iregNo(i->Pin.MulL.dst, mode64);
UInt r_srcL =3D iregNo(i->Pin.MulL.srcL, mode64);
UInt r_srcR =3D iregNo(i->Pin.MulL.srcR, mode64);
- Bool is32BitOp = toBool(hregClass(i->Pin.MulL.dst) == HRcInt32);
=20
+ if (!mode64)
+ vassert(sz32);
+
if (i->Pin.MulL.hi) {
// mul hi words, must consider sign
- if (syned) {
- if (is32BitOp) // mulhw r_dst,r_srcL,r_srcR
+ if (sz32) {
+ if (syned) // mulhw r_dst,r_srcL,r_srcR
p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 75, 0);
- else // mulhd r_dst,r_srcL,r_srcR
+ else // mulhwu r_dst,r_srcL,r_srcR
+ p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 11, 0);
+ } else {
+ if (syned) // mulhd r_dst,r_srcL,r_srcR
p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 73, 0);
- } else {
- if (is32BitOp) // mulhwu r_dst,r_srcL,r_srcR
- p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 11, 0);
- else // mulhdu r_dst,r_srcL,r_srcR
+ else // mulhdu r_dst,r_srcL,r_srcR
p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 9, 0);
}
} else {
// mul low word, sign is irrelevant
vassert(!i->Pin.MulL.syned);
- if (is32BitOp) // mullw r_dst,r_srcL,r_srcR
+ if (sz32) // mullw r_dst,r_srcL,r_srcR
p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 235, 0);
- else // mulld r_dst,r_srcL,r_srcR
+ else // mulld r_dst,r_srcL,r_srcR
p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 233, 0);
}
goto done;
@@ -2698,20 +2772,23 @@
=20
case Pin_Div: {
Bool syned =3D i->Pin.Div.syned;
+ Bool sz32 =3D i->Pin.Div.sz32;
UInt r_dst =3D iregNo(i->Pin.Div.dst, mode64);
UInt r_srcL =3D iregNo(i->Pin.Div.srcL, mode64);
UInt r_srcR =3D iregNo(i->Pin.Div.srcR, mode64);
- Bool is32BitOp = toBool(hregClass(i->Pin.Div.dst) == HRcInt32);
=20
- if (syned =3D=3D True) {
- if (is32BitOp) // divw r_dst,r_srcL,r_srcR
+ if (!mode64)
+ vassert(sz32);
+
+ if (sz32) {
+ if (syned) // divw r_dst,r_srcL,r_srcR
p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 491, 0);
- else
+ else // divwu r_dst,r_srcL,r_srcR
+ p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 459, 0);
+ } else {
+ if (syned) // divd r_dst,r_srcL,r_srcR
p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 489, 0);
- } else {
- if (is32BitOp) // divwu r_dst,r_srcL,r_srcR
- p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 459, 0);
- else
+ else // divdu r_dst,r_srcL,r_srcR
p =3D mkFormXO(p, 31, r_dst, r_srcL, r_srcR, 0, 457, 0);
}
goto done;
Modified: trunk/priv/host-ppc32/hdefs.h
===================================================================
--- trunk/priv/host-ppc32/hdefs.h 2005-12-16 01:06:42 UTC (rev 1497)
+++ trunk/priv/host-ppc32/hdefs.h 2005-12-16 13:40:18 UTC (rev 1498)
@@ -339,17 +339,29 @@
Palu_INVALID,
Palu_ADD, Palu_SUB,
Palu_AND, Palu_OR, Palu_XOR,
- Palu_SHL, Palu_SHR, Palu_SAR,=20
}
PPC32AluOp;
=20
extern=20
HChar* showPPC32AluOp ( PPC32AluOp,=20
- Bool /* is the 2nd operand an immediate? */,
- Bool /* is this a 32bit or 64bit op? */ );
+ Bool /* is the 2nd operand an immediate? */);
=20
=20
/* --------- */
+typedef=20
+ enum {
+ Pshft_INVALID,
+ Pshft_SHL, Pshft_SHR, Pshft_SAR,=20
+ }
+ PPC32ShftOp;
+
+extern=20
+HChar* showPPC32ShftOp ( PPC32ShftOp,=20
+ Bool /* is the 2nd operand an immediate? */,
+ Bool /* is this a 32bit or 64bit op? */ );
+
+
+/* --------- */
typedef
enum {
Pfp_INVALID,
@@ -427,7 +439,8 @@
typedef
enum {
Pin_LI, /* load word (32/64-bit) immediate (fake insn) */
- Pin_Alu, /* word add/sub/and/or/xor/shl/shr/sar */
+ Pin_Alu, /* word add/sub/and/or/xor */
+ Pin_Shft, /* word shl/shr/sar */
Pin_AddSubC32, /* 32-bit add/sub with read/write carry */
Pin_Cmp, /* word compare */
Pin_Unary, /* not, neg, clz */
@@ -485,7 +498,7 @@
HReg dst;
ULong imm64;
} LI;
- /* Integer add/sub/and/or/xor/shl/shr/sar. Limitations:
+ /* Integer add/sub/and/or/xor. Limitations:
- For add, the immediate, if it exists, is a signed 16.
- For sub, the immediate, if it exists, is a signed 16
which may not be -32768, since no such instruction=20
@@ -493,8 +506,6 @@
that is not possible.
- For and/or/xor, the immediate, if it exists,=20
is an unsigned 16.
- - For shr/shr/sar, the immediate, if it exists,
- is a signed 5-bit value between 1 and 31 inclusive.
*/
struct {
PPC32AluOp op;
@@ -502,6 +513,17 @@
HReg srcL;
PPC32RH* srcR;
} Alu;
+ /* Integer shl/shr/sar.
+ Limitations: the immediate, if it exists,
+ is a signed 5-bit value between 1 and 31 inclusive.
+ */
+ struct {
+ PPC32ShftOp op;
+ Bool sz32; /* mode64 has both 32 and 64bit shft */
+ HReg dst;
+ HReg srcL;
+ PPC32RH* srcR;
+ } Shft;
/* */
struct {
Bool isAdd; /* else sub */
@@ -514,6 +536,7 @@
else it is an unsigned 16. */
struct {
Bool syned;
+ Bool sz32; /* mode64 has both 32 and 64bit cmp */
UInt crfD;
HReg srcL;
PPC32RH* srcR;
@@ -527,6 +550,7 @@
struct {
Bool syned; /* meaningless if hi32=3D=3DFalse */
Bool hi; /* False=3D>low, True=3D>high */
+ Bool sz32; /* mode64 has both 32 & 64bit mull */
HReg dst;
HReg srcL;
HReg srcR;
@@ -534,6 +558,7 @@
/* ppc32 div/divu instruction. */
struct {
Bool syned;
+ Bool sz32; /* mode64 has both 32 & 64bit div */
HReg dst;
HReg srcL;
HReg srcR;
@@ -564,14 +589,14 @@
} CMov;
/* Sign/Zero extending loads. Dst size is always 32 bits. */
struct {
- UChar sz; /* 1|2|4 */
+ UChar sz; /* 1|2|4|8 */
Bool syned;
HReg dst;
PPC32AMode* src;
} Load;
/* 32/16/8 bit stores */
struct {
- UChar sz; /* 1|2|4 */
+ UChar sz; /* 1|2|4|8 */
PPC32AMode* dst;
HReg src;
} Store;
@@ -734,11 +759,12 @@
=20
extern PPC32Instr* PPC32Instr_LI ( HReg, ULong, Bool );
extern PPC32Instr* PPC32Instr_Alu ( PPC32AluOp, HReg, HReg, PPC32RH* );
+extern PPC32Instr* PPC32Instr_Shft ( PPC32AluOp, Bool sz32, HReg, HReg, PPC32RH* );
extern PPC32Instr* PPC32Instr_AddSubC32 ( Bool, Bool, HReg, HReg, HReg );
-extern PPC32Instr* PPC32Instr_Cmp ( Bool, UInt, HReg, PPC32RH* );
+extern PPC32Instr* PPC32Instr_Cmp ( Bool, Bool, UInt, HReg, PPC32RH* );
extern PPC32Instr* PPC32Instr_Unary ( PPC32UnaryOp op, HReg dst, HReg src );
-extern PPC32Instr* PPC32Instr_MulL ( Bool syned, Bool hi32, HReg, HReg, HReg );
-extern PPC32Instr* PPC32Instr_Div ( Bool syned, HReg dst, HReg srcL, HReg srcR );
+extern PPC32Instr* PPC32Instr_MulL ( Bool syned, Bool hi32, Bool sz32, HReg, HReg, HReg );
+extern PPC32Instr* PPC32Instr_Div ( Bool syned, Bool sz32, HReg dst, HReg srcL, HReg srcR );
extern PPC32Instr* PPC32Instr_Call ( PPC32CondCode, Addr64, UInt );
extern PPC32Instr* PPC32Instr_Goto ( IRJumpKind, PPC32CondCode cond, PPC32RI* dst );
extern PPC32Instr* PPC32Instr_CMov ( PPC32CondCode, HReg dst, PPC32RI* src );
Modified: trunk/priv/host-ppc32/isel.c
===================================================================
--- trunk/priv/host-ppc32/isel.c 2005-12-16 01:06:42 UTC (rev 1497)
+++ trunk/priv/host-ppc32/isel.c 2005-12-16 13:40:18 UTC (rev 1498)
@@ -767,9 +767,9 @@
PPC32Instr_Alu(Palu_AND, r_rmIR, r_rmIR, PPC32RH_Imm(False,3)));
=20
// r_rmPPC32 =3D XOR( r_rmIR, (r_rmIR << 1) & 2)
+ addInstr(env, PPC32Instr_Shft(Pshft_SHL, True/*32bit shift*/,
+ r_tmp, r_rmIR, PPC32RH_Imm(False,1)));
addInstr(env,=20
- PPC32Instr_Alu(Palu_SHL, r_tmp, r_rmIR, PPC32RH_Imm(False,1)));
- addInstr(env,=20
PPC32Instr_Alu(Palu_AND, r_tmp, r_tmp, PPC32RH_Imm(False,2)));
addInstr(env,=20
PPC32Instr_Alu(Palu_XOR, r_rmPPC32, r_rmIR, PPC32RH_Reg(r_tmp)));
@@ -1021,6 +1021,7 @@
/* --------- BINARY OP --------- */
case Iex_Binop: {
PPC32AluOp aluOp;
+ PPC32ShftOp shftOp;
=20
//.. /* Pattern: Sub32(0,x) */
//.. if (e->Iex.Binop.op == Iop_Sub32 && isZero32(e->Iex.Binop.arg1)) {
@@ -1043,12 +1044,6 @@
aluOp =3D Palu_OR; break;
case Iop_Xor8: case Iop_Xor16: case Iop_Xor32: case Iop_Xor64:
aluOp =3D Palu_XOR; break;
- case Iop_Shl8: case Iop_Shl16: case Iop_Shl32: case Iop_Shl64:
- aluOp =3D Palu_SHL; break;
- case Iop_Shr8: case Iop_Shr16: case Iop_Shr32: case Iop_Shr64:
- aluOp =3D Palu_SHR; break;
- case Iop_Sar8: case Iop_Sar16: case Iop_Sar32: case Iop_Sar64:
- aluOp =3D Palu_SAR; break;
default:
aluOp =3D Palu_INVALID; break;
}
@@ -1068,49 +1063,86 @@
          ri_srcR = iselIntExpr_RH(env, False/*signed*/,
                                   e->Iex.Binop.arg2);
          break;
-      case Palu_SHL: case Palu_SHR: case Palu_SAR:
+      default:
+         vpanic("iselIntExpr_R_wrk-aluOp-arg2");
+      }
+      addInstr(env, PPC32Instr_Alu(aluOp, r_dst, r_srcL, ri_srcR));
+      return r_dst;
+   }
+
+   /* a shift? */
+   switch (e->Iex.Binop.op) {
+   case Iop_Shl8: case Iop_Shl16: case Iop_Shl32: case Iop_Shl64:
+      shftOp = Pshft_SHL; break;
+   case Iop_Shr8: case Iop_Shr16: case Iop_Shr32: case Iop_Shr64:
+      shftOp = Pshft_SHR; break;
+   case Iop_Sar8: case Iop_Sar16: case Iop_Sar32: case Iop_Sar64:
+      shftOp = Pshft_SAR; break;
+   default:
+      shftOp = Pshft_INVALID; break;
+   }
+   /* we assume any literal values are on the second operand. */
+   if (shftOp != Pshft_INVALID) {
+      HReg r_dst = newVRegI(env);
+      HReg r_srcL = iselIntExpr_R(env, e->Iex.Binop.arg1);
+      PPC32RH* ri_srcR = NULL;
+      /* get right arg into an RH, in the appropriate way */
+      switch (shftOp) {
+      case Pshft_SHL: case Pshft_SHR: case Pshft_SAR:
          if (!mode64)
            ri_srcR = iselIntExpr_RH5u(env, e->Iex.Binop.arg2);
         else
            ri_srcR = iselIntExpr_RH6u(env, e->Iex.Binop.arg2);
         break;
      default:
-         vpanic("iselIntExpr_R_wrk-aluOp-arg2");
+         vpanic("iselIntExpr_R_wrk-shftOp-arg2");
      }
      /* widen the left arg if needed */
-      if ((aluOp == Palu_SHR || aluOp == Palu_SAR)) {
-         if (!mode64 && (ty == Ity_I8 || ty == Ity_I16)) {
+      if (shftOp == Pshft_SHR || shftOp == Pshft_SAR) {
+         if (ty == Ity_I8 || ty == Ity_I16) {
            PPC32RH* amt = PPC32RH_Imm(False, toUShort(ty == Ity_I8 ? 24 : 16));
            HReg tmp = newVRegI(env);
-           addInstr(env, PPC32Instr_Alu(Palu_SHL, tmp, r_srcL, amt));
-           addInstr(env, PPC32Instr_Alu(aluOp, tmp, tmp, amt));
+           addInstr(env, PPC32Instr_Shft(Pshft_SHL, True/*32bit shift*/,
+                                         tmp, r_srcL, amt));
+           addInstr(env, PPC32Instr_Shft(shftOp, True/*32bit shift*/,
+                                         tmp, tmp, amt));
            r_srcL = tmp;
            vassert(0); /* AWAITING TEST CASE */
         }
-         if (mode64 && (ty == Ity_I8 || ty == Ity_I16 || ty == Ity_I32)) {
-           PPC32RH* amt = PPC32RH_Imm(False, toUShort(ty == Ity_I8 ? 56 :
-                                                      ty == Ity_I16 ? 48 : 32));
-           HReg tmp = newVRegI(env);
-           addInstr(env, PPC32Instr_Alu(Palu_SHL, tmp, r_srcL, amt));
-           addInstr(env, PPC32Instr_Alu(aluOp, tmp, tmp, amt));
-           r_srcL = tmp;
-         }
      }
-      addInstr(env, PPC32Instr_Alu(aluOp, r_dst, r_srcL, ri_srcR));
+      /* Only 64 expressions need 64bit shifts,
+         32bit shifts are fine for all others */
+      if (ty == Ity_I64) {
+         vassert(mode64);
+         addInstr(env, PPC32Instr_Shft(shftOp, False/*64bit shift*/,
+                                       r_dst, r_srcL, ri_srcR));
+      } else {
+         addInstr(env, PPC32Instr_Shft(shftOp, True/*32bit shift*/,
+                                       r_dst, r_srcL, ri_srcR));
+      }
      return r_dst;
   }
 
   /* How about a div? */
   if (e->Iex.Binop.op == Iop_DivS32 ||
-       e->Iex.Binop.op == Iop_DivU32 ||
-       e->Iex.Binop.op == Iop_DivS64 ||
+       e->Iex.Binop.op == Iop_DivU32) {
+      Bool syned  = toBool(e->Iex.Binop.op == Iop_DivS32);
+      HReg r_dst  = newVRegI(env);
+      HReg r_srcL = iselIntExpr_R(env, e->Iex.Binop.arg1);
+      HReg r_srcR = iselIntExpr_R(env, e->Iex.Binop.arg2);
+      addInstr(env, PPC32Instr_Div(syned, True/*32bit div*/,
+                                   r_dst, r_srcL, r_srcR));
+      return r_dst;
+   }
+   if (e->Iex.Binop.op == Iop_DivS64 ||
       e->Iex.Binop.op == Iop_DivU64) {
+      Bool syned  = toBool(e->Iex.Binop.op == Iop_DivS64);
      HReg r_dst  = newVRegI(env);
      HReg r_srcL = iselIntExpr_R(env, e->Iex.Binop.arg1);
      HReg r_srcR = iselIntExpr_R(env, e->Iex.Binop.arg2);
-      Bool syned = toBool(e->Iex.Binop.op == Iop_DivS32 ||
-                          e->Iex.Binop.op == Iop_DivS64);
-      addInstr(env, PPC32Instr_Div(syned, r_dst, r_srcL, r_srcR));
+      vassert(mode64);
+      addInstr(env, PPC32Instr_Div(syned, False/*64bit div*/,
+                                   r_dst, r_srcL, r_srcR));
      return r_dst;
   }
 
@@ -1119,25 +1151,61 @@
       e->Iex.Binop.op == Iop_Mul32 ||
       e->Iex.Binop.op == Iop_Mul64) {
      Bool syned  = False;
+      Bool sz32   = (e->Iex.Binop.op != Iop_Mul64);
      HReg r_dst  = newVRegI(env);
      HReg r_srcL = iselIntExpr_R(env, e->Iex.Binop.arg1);
      HReg r_srcR = iselIntExpr_R(env, e->Iex.Binop.arg2);
-      addInstr(env, PPC32Instr_MulL(syned, False/*lo32*/,
+      addInstr(env, PPC32Instr_MulL(syned, False/*lo32*/, sz32,
                                    r_dst, r_srcL, r_srcR));
      return r_dst;
   }
 
+   /* 32 x 32 -> 64 multiply */
+   if (e->Iex.Binop.op == Iop_MullU32 ||
+       e->Iex.Binop.op == Iop_MullS32) {
+      HReg tLo    = newVRegI(env);
+      HReg tHi    = newVRegI(env);
+      HReg r_dst  = newVRegI(env);
+      Bool syned  = toBool(e->Iex.Binop.op == Iop_MullS32);
+      HReg r_srcL = iselIntExpr_R(env, e->Iex.Binop.arg1);
+      HReg r_srcR = iselIntExpr_R(env, e->Iex.Binop.arg2);
+      vassert(mode64);
+      addInstr(env, PPC32Instr_MulL(False/*signedness irrelevant*/,
+                                    False/*lo32*/, True/*32bit mul*/,
+                                    tLo, r_srcL, r_srcR));
+      addInstr(env, PPC32Instr_MulL(syned,
+                                    True/*hi32*/, True/*32bit mul*/,
+                                    tHi, r_srcL, r_srcR));
+      addInstr(env, PPC32Instr_Shft(Pshft_SHL, False/*64bit shift*/,
+                                    r_dst, tHi, PPC32RH_Imm(False,32)));
+      addInstr(env, PPC32Instr_Alu(Palu_OR, r_dst, r_dst, PPC32RH_Reg(tLo)));
+      return r_dst;
+   }
+
   /* El-mutanto 3-way compare? */
   if (e->Iex.Binop.op == Iop_CmpORD32S ||
-       e->Iex.Binop.op == Iop_CmpORD32U ||
-       e->Iex.Binop.op == Iop_CmpORD64S ||
+       e->Iex.Binop.op == Iop_CmpORD32U) {
+      Bool     syned = toBool(e->Iex.Binop.op == Iop_CmpORD32S);
+      HReg     dst   = newVRegI(env);
+      HReg     srcL  = iselIntExpr_R(env, e->Iex.Binop.arg1);
+      PPC32RH* srcR  = iselIntExpr_RH(env, syned, e->Iex.Binop.arg2);
+      addInstr(env, PPC32Instr_Cmp(syned, True/*32bit cmp*/,
+                                   7/*cr*/, srcL, srcR));
+      addInstr(env, PPC32Instr_MfCR(dst));
+      addInstr(env, PPC32Instr_Alu(Palu_AND, dst, dst,
+                                   PPC32RH_Imm(False,7<<1)));
+      return dst;
+   }
+
+   if (e->Iex.Binop.op == Iop_CmpORD64S ||
       e->Iex.Binop.op == Iop_CmpORD64U) {
-      Bool     syned = toBool(e->Iex.Binop.op == Iop_CmpORD32S ||
-                              e->Iex.Binop.op == Iop_CmpORD64S);
+      Bool     syned = toBool(e->Iex.Binop.op == Iop_CmpORD64S);
      HReg     dst   = newVRegI(env);
      HReg     srcL  = iselIntExpr_R(env, e->Iex.Binop.arg1);
      PPC32RH* srcR  = iselIntExpr_RH(env, syned, e->Iex.Binop.arg2);
-      addInstr(env, PPC32Instr_Cmp(syned, /*cr*/7, srcL, srcR));
+      vassert(mode64);
+      addInstr(env, PPC32Instr_Cmp(syned, False/*64bit cmp*/,
+                                   7/*cr*/, srcL, srcR));
      addInstr(env, PPC32Instr_MfCR(dst));
      addInstr(env, PPC32Instr_Alu(Palu_AND, dst, dst,
                                   PPC32RH_Imm(False,7<<1)));
@@ -1217,18 +1285,22 @@
      */
 
      // r_ccIR_b0 = r_ccPPC32[0] | r_ccPPC32[3]
-      addInstr(env, PPC32Instr_Alu(Palu_SHR, r_ccIR_b0, r_ccPPC32, PPC32RH_Imm(False,0x3)));
+      addInstr(env, PPC32Instr_Shft(Pshft_SHR, True/*32bit shift*/,
+                                    r_ccIR_b0, r_ccPPC32, PPC32RH_Imm(False,0x3)));
      addInstr(env, PPC32Instr_Alu(Palu_OR,  r_ccIR_b0, r_ccPPC32, PPC32RH_Reg(r_ccIR_b0)));
      addInstr(env, PPC32Instr_Alu(Palu_AND, r_ccIR_b0, r_ccIR_b0, PPC32RH_Imm(False,0x1)));
 
      // r_ccIR_b2 = r_ccPPC32[0]
-      addInstr(env, PPC32Instr_Alu(Palu_SHL, r_ccIR_b2, r_ccPPC32, PPC32RH_Imm(False,0x2)));
+      addInstr(env, PPC32Instr_Shft(Pshft_SHL, True/*32bit shift*/,
+                                    r_ccIR_b2, r_ccPPC32, PPC32RH_Imm(False,0x2)));
      addInstr(env, PPC32Instr_Alu(Palu_AND, r_ccIR_b2, r_ccIR_b2, PPC32RH_Imm(False,0x4)));
 
      // r_ccIR_b6 = r_ccPPC32[0] | r_ccPPC32[1]
-      addInstr(env, PPC32Instr_Alu(Palu_SHR, r_ccIR_b6, r_ccPPC32, PPC32RH_Imm(False,0x1)));
+      addInstr(env, PPC32Instr_Shft(Pshft_SHR, True/*32bit shift*/,
+                                    r_ccIR_b6, r_ccPPC32, PPC32RH_Imm(False,0x1)));
      addInstr(env, PPC32Instr_Alu(Palu_OR,  r_ccIR_b6, r_ccPPC32, PPC32RH_Reg(r_ccIR_b6)));
-      addInstr(env, PPC32Instr_Alu(Palu_SHL, r_ccIR_b6, r_ccIR_b6, PPC32RH_Imm(False,0x6)));
+      addInstr(env, PPC32Instr_Shft(Pshft_SHL, True/*32bit shift*/,
+                                    r_ccIR_b6, r_ccIR_b6, PPC32RH_Imm(False,0x6)));
      addInstr(env, PPC32Instr_Alu(Palu_AND, r_ccIR_b6, r_ccIR_b6, PPC32RH_Imm(False,0x40)));
 
      // r_ccIR = r_ccIR_b0 | r_ccIR_b2 | r_ccIR_b6
@@ -1322,27 +1394,36 @@
            HReg r_dst = newVRegI(env);
            HReg r_src = iselIntExpr_R(env, e->Iex.Unop.arg);
            vassert(mode64);
-            addInstr(env, PPC32Instr_Alu(Palu_SHL, r_dst, r_src,
-                                         PPC32RH_Imm(False,32)));
-            addInstr(env, PPC32Instr_Alu(Palu_SHR, r_dst, r_dst,
-                                         PPC32RH_Imm(False,32)));
+            addInstr(env, PPC32Instr_Shft(Pshft_SHL, False/*64bit shift*/,
+                                          r_dst, r_src, PPC32RH_Imm(False,32)));
+            addInstr(env, PPC32Instr_Shft(Pshft_SHR, False/*64bit shift*/,
+                                          r_dst, r_dst, PPC32RH_Imm(False,32)));
            return r_dst;
         }
      case Iop_8Sto16:
      case Iop_8Sto32:
-      case Iop_16Sto32:
+      case Iop_16Sto32: {
+         HReg   r_dst = newVRegI(env);
+         HReg   r_src = iselIntExpr_R(env, e->Iex.Unop.arg);
+         UShort amt   = toUShort(e->Iex.Unop.op==Iop_16Sto32 ? 16 : 24);
+         addInstr(env, PPC32Instr_Shft(Pshft_SHL, True/*32bit shift*/,
+                                       r_dst, r_src, PPC32RH_Imm(False,amt)));
+         addInstr(env, PPC32Instr_Shft(Pshft_SAR, True/*32bit shift*/,
+                                       r_dst, r_dst, PPC32RH_Imm(False,amt)));
+         return r_dst;
+      }
+      case Iop_8Sto64:
      case Iop_16Sto64:
      case Iop_32Sto64: {
         HReg   r_dst = newVRegI(env);
         HReg   r_src = iselIntExpr_R(env, e->Iex.Unop.arg);
-         UShort amt   = toUShort(e->Iex.Unop.op==Iop_16Sto64 ? 48 :
-                                 e->Iex.Unop.op==Iop_32Sto64 ? 32 :
-                                 e->Iex.Unop.op==Iop_16Sto32 ? 16 : 24);
-         vassert(amt<32 || mode64);
-         addInstr(env, PPC32Instr_Alu(Palu_SHL, r_dst, r_src,
-                                      PPC32RH_Imm(False,amt)));
-         addInstr(env, PPC32Instr_Alu(Palu_SAR, r_dst, r_dst,
-                                      PPC32RH_Imm(False,amt)));
+         UShort amt   = toUShort(e->Iex.Unop.op==Iop_8Sto64  ? 56 :
+                                 e->Iex.Unop.op==Iop_16Sto64 ? 48 : 32);
+         vassert(mode64);
+         addInstr(env, PPC32Instr_Shft(Pshft_SHL, False/*64bit shift*/,
+                                       r_dst, r_src, PPC32RH_Imm(False,amt)));
+         addInstr(env, PPC32Instr_Shft(Pshft_SAR, False/*64bit shift*/,
+                                       r_dst, r_dst, PPC32RH_Imm(False,amt)));
         return r_dst;
      }
      case Iop_Not8:
@@ -1362,8 +1443,8 @@
         } else {
            HReg r_dst = newVRegI(env);
            HReg r_src = iselIntExpr_R(env, e->Iex.Unop.arg);
-            addInstr(env, PPC32Instr_Alu(Palu_SHR, r_dst, r_src,
-                                         PPC32RH_Imm(False,32)));
+            addInstr(env, PPC32Instr_Shft(Pshft_SHR, False/*64bit shift*/,
+                                          r_dst, r_src, PPC32RH_Imm(False,32)));
            return r_dst;
         }
      }
@@ -1421,10 +1502,16 @@
         HReg   r_dst = newVRegI(env);
         HReg   r_src = iselIntExpr_R(env, e->Iex.Unop.arg);
         UShort shift = toUShort(e->Iex.Unop.op == Iop_16HIto8 ? 8 : 16);
-         addInstr(env, PPC32Instr_Alu(Palu_SHR, r_dst, r_src,
-                                      PPC32RH_Imm(False,shift)));
+         addInstr(env, PPC32Instr_Shft(Pshft_SHR, True/*32bit shift*/,
+                                       r_dst, r_src, PPC32RH_Imm(False,shift)));
         return r_dst;
      }
+      case Iop_128HIto64: {
+         HReg rHi, rLo;
+         vassert(mode64);
+         iselInt128Expr(&rHi,&rLo, env, e->Iex.Unop.arg);
+         return rHi; /* and abandon rLo .. poor wee thing :-) */
+      }
      case Iop_128to64: {
         vassert(mode64);
         HReg rHi, rLo;
@@ -1445,10 +1532,10 @@
         HReg r_dst = newVRegI(env);
         PPC32CondCode cond = iselCondCode(env, e->Iex.Unop.arg);
         addInstr(env, PPC32Instr_Set32(cond,r_dst));
-         addInstr(env, PPC32Instr_Alu(Palu_SHL, r_dst, r_dst,
-                                      PPC32RH_Imm(False,31)));
-         addInstr(env, PPC32Instr_Alu(Palu_SAR, r_dst, r_dst,
-                                      PPC32RH_Imm(False,31)));
+         addInstr(env, PPC32Instr_Shft(Pshft_SHL, True/*32bit shift*/,
+                                       r_dst, r_dst, PPC32RH_Imm(False,31)));
+         addInstr(env, PPC32Instr_Shft(Pshft_SAR, True/*32bit shift*/,
+                                       r_dst, r_dst, PPC32RH_Imm(False,31)));
         return r_dst;
      }
 
@@ -1584,7 +1671,8 @@
      HReg r_tmp  = newVRegI(env);
      addInstr(env, mk_iMOVds_RR(r_dst,rX));
      addInstr(env, PPC32Instr_Alu(Palu_AND, r_tmp, r_cond, PPC32RH_Imm(False,0xFF)));
-      addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, 7/*cr*/, r_tmp, PPC32RH_Imm(False,0)));
+      addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, True/*32bit cmp*/,
+                                   7/*cr*/, r_tmp, PPC32RH_Imm(False,0)));
      addInstr(env, PPC32Instr_CMov(cc,r_dst,r0));
      return r_dst;
   }
@@ -1917,8 +2005,8 @@
   // Make a compare that will always be true:
   HReg r_zero = newVRegI(env);
   addInstr(env, PPC32Instr_LI(r_zero, 0, mode64));
-   addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, /*cr*/7,
-                                r_zero, PPC32RH_Reg(r_zero)));
+   addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, True/*32bit cmp*/,
+                                7/*cr*/, r_zero, PPC32RH_Reg(r_zero)));
   return mk_PPCCondCode( Pct_TRUE, Pcf_7EQ );
}
 
@@ -1949,15 +2037,14 @@
//..    }
 
   /* 32to1 */
-   if (e->tag == Iex_Unop && e->Iex.Unop.op == Iop_32to1) {
+   if (e->tag == Iex_Unop &&
+       (e->Iex.Unop.op == Iop_32to1 || e->Iex.Unop.op == Iop_64to1)) {
      HReg src = iselIntExpr_R(env, e->Iex.Unop.arg);
      HReg tmp = newVRegI(env);
      /* could do better, probably -- andi. */
-      addInstr(env, PPC32Instr_Alu(
-                       Palu_AND, tmp, src, PPC32RH_Imm(False,1)));
-      addInstr(env, PPC32Instr_Cmp(
-                       False/*unsigned*/, 7/*cr*/,
-                       tmp, PPC32RH_Imm(False,1)));
+      addInstr(env, PPC32Instr_Alu(Palu_AND, tmp, src, PPC32RH_Imm(False,1)));
+      addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, True/*32bit cmp*/,
+                                   7/*cr*/, tmp, PPC32RH_Imm(False,1)));
      return mk_PPCCondCode( Pct_TRUE, Pcf_7EQ );
   }
 
@@ -1969,9 +2056,10 @@
      && e->Iex.Unop.op == Iop_CmpNEZ8) {
      HReg r_32 = iselIntExpr_R(env, e->Iex.Unop.arg);
      HReg r_l  = newVRegI(env);
-      addInstr(env, PPC32Instr_Alu(Palu_AND, r_l, r_32, PPC32RH_Imm(False,0xFF)));
-      addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, 7/*cr*/,
-                                   r_l, PPC32RH_Imm(False,0)));
+      addInstr(env, PPC32Instr_Alu(Palu_AND, r_l, r_32,
+                                   PPC32RH_Imm(False,0xFF)));
+      addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, True/*32bit cmp*/,
+                                   7/*cr*/, r_l, PPC32RH_Imm(False,0)));
      return mk_PPCCondCode( Pct_FALSE, Pcf_7EQ );
   }
 
@@ -1981,7 +2069,8 @@
   if (e->tag == Iex_Unop
      && e->Iex.Unop.op == Iop_CmpNEZ32) {
      HReg r1 = iselIntExpr_R(env, e->Iex.Unop.arg);
-      addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, 7, r1, PPC32RH_Imm(False,0)));
+      addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, True/*32bit cmp*/,
+                                   7/*cr*/, r1, PPC32RH_Imm(False,0)));
      return mk_PPCCondCode( Pct_FALSE, Pcf_7EQ );
   }
 
@@ -2040,15 +2129,12 @@
          || e->Iex.Binop.op == Iop_CmpLT32U
          || e->Iex.Binop.op == Iop_CmpLE32S
          || e->Iex.Binop.op == Iop_CmpLE32U)) {
-      PPC32RH* ri2;
-      HReg r1 = iselIntExpr_R(env, e->Iex.Binop.arg1);
-      Bool syned = False;
-      if (e->Iex.Binop.op == Iop_CmpLT32S ||
-          e->Iex.Binop.op == Iop_CmpLE32S) {
-         syned = True;
-      }
-      ri2 = iselIntExpr_RH(env, syned, e->Iex.Binop.arg2);
-      addInstr(env, PPC32Instr_Cmp(syned,7,r1,ri2));
+      Bool syned = (e->Iex.Binop.op == Iop_CmpLT32S ||
+                    e->Iex.Binop.op == Iop_CmpLE32S);
+      HReg r1 = iselIntExpr_R(env, e->Iex.Binop.arg1);
+      PPC32RH* ri2 = iselIntExpr_RH(env, syned, e->Iex.Binop.arg2);
+      addInstr(env, PPC32Instr_Cmp(syned, True/*32bit cmp*/,
+                                   7/*cr*/, r1, ri2));
 
      switch (e->Iex.Binop.op) {
      case Iop_CmpEQ32:  return mk_PPCCondCode( Pct_TRUE,  Pcf_7EQ );
@@ -2069,15 +2155,13 @@
          || e->Iex.Binop.op == Iop_CmpLT64U
          || e->Iex.Binop.op == Iop_CmpLE64S
          || e->Iex.Binop.op == Iop_CmpLE64U)) {
-      PPC32RH* ri2;
-      HReg r1 = iselIntExpr_R(env, e->Iex.Binop.arg1);
-      Bool syned = False;
-      if (e->Iex.Binop.op == Iop_CmpLT64S ||
-          e->Iex.Binop.op == Iop_CmpLE64S) {
-         syned = True;
-      }
-      ri2 = iselIntExpr_RH(env, syned, e->Iex.Binop.arg2);
-      addInstr(env, PPC32Instr_Cmp(syned,7,r1,ri2));
+      Bool syned = (e->Iex.Binop.op == Iop_CmpLT64S ||
+                    e->Iex.Binop.op == Iop_CmpLE64S);
+      HReg r1 = iselIntExpr_R(env, e->Iex.Binop.arg1);
+      PPC32RH* ri2 = iselIntExpr_RH(env, syned, e->Iex.Binop.arg2);
+      vassert(mode64);
+      addInstr(env, PPC32Instr_Cmp(syned, False/*64bit cmp*/,
+                                   7/*cr*/, r1, ri2));
 
      switch (e->Iex.Binop.op) {
      case Iop_CmpEQ64:  return mk_PPCCondCode( Pct_TRUE, Pcf_7EQ );
@@ -2147,11 +2231,13 @@
         iselInt64Expr( &hi, &lo, env, e->Iex.Unop.arg );
         addInstr(env, mk_iMOVds_RR(tmp, lo));
         addInstr(env, PPC32Instr_Alu(Palu_OR, tmp, tmp, PPC32RH_Reg(hi)));
-         addInstr(env, PPC32Instr_Cmp(False/*sign*/,7/*cr*/,tmp,PPC32RH_Imm(False,0)));
+         addInstr(env, PPC32Instr_Cmp(False/*sign*/, True/*32bit cmp*/,
+                                      7/*cr*/, tmp,PPC32RH_Imm(False,0)));
         return mk_PPCCondCode( Pct_FALSE, Pcf_7EQ );
      } else {  // mode64
         HReg r_src = iselIntExpr_R(env, e->Iex.Binop.arg1);
-         addInstr(env, PPC32Instr_Cmp(False/*sign*/,7/*cr*/,r_src,PPC32RH_Imm(False,0)));
+         addInstr(env, PPC32Instr_Cmp(False/*sign*/, False/*64bit cmp*/,
+                                      7/*cr*/, r_src,PPC32RH_Imm(False,0)));
         return mk_PPCCondCode( Pct_FALSE, Pcf_7EQ );
      }
   }
@@ -2161,7 +2247,8 @@
      HReg r_src      = lookupIRTemp(env, e->Iex.Tmp.tmp);
      HReg src_masked = newVRegI(env);
      addInstr(env, PPC32Instr_Alu(Palu_AND, src_masked, r_src, PPC32RH_Imm(False,1)));
-      addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, 7/*cr*/, src_masked, PPC32RH_Imm(False,1)));
+      addInstr(env, PPC32Instr_Cmp(False/*unsigned*/, True/*32bit cmp*/,
+                                   7/*cr*/, src_masked, PPC32RH_Imm(False,1)));
      return mk_PPCCondCode( Pct_TRUE, Pcf_7EQ );
   }
 
@@ -2217,9 +2304,11 @@
      HReg r_srcL = iselIntExpr_R(env, e->Iex.Binop.arg1);
      HReg r_srcR = iselIntExpr_R(env, e->Iex.Binop.arg2);
      addInstr(env, PPC32Instr_MulL(False/*signedness irrelevant*/,
-                                    False/*lo64*/, tLo, r_srcL, r_srcR));
+                                    False/*lo64*/, False/*64bit mul*/,
+                                    tLo, r_srcL, r_srcR));
      addInstr(env, PPC32Instr_MulL(syned,
-                                    True/*hi64*/, tHi, r_srcL, r_srcR));
+                                    True/*hi64*/, False/*64bit mul*/,
+                                    tHi, r_srcL, r_srcR));
      *rHi = tHi;
      *rLo = tLo;
      return;
@@ -2357,7 +2446,7 @@
 
      addInstr(env, PPC32Instr_Alu(Palu_AND,
                                   r_tmp, r_cond, PPC32RH_Imm(False,0xFF)));
-      addInstr(env, PPC32Instr...
[truncated message content] |
|
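The isel.c diff above lowers Iop_MullS32/MullU32 to two 32-bit multiplies (the low half, where signedness is irrelevant, and the high half) and then recombines them with a 64-bit shift-left and an OR. Below is a small C model of that recombination; the helper names are hypothetical, chosen only to mirror the mullw/mulhw split, and this is a sketch of the idea rather than the VEX code itself.

```c
#include <stdint.h>

/* Low 32 bits of the product: signedness is irrelevant, an unsigned
   32x32 multiply yields the same bit pattern (what mullw computes). */
static uint32_t mull_lo32(int32_t a, int32_t b)
{
   return (uint32_t)a * (uint32_t)b;
}

/* High 32 bits of the signed 64-bit product (what mulhw computes). */
static int32_t mull_hi32s(int32_t a, int32_t b)
{
   return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* Recombine exactly as the selected instructions do:
   r_dst = (hi << 32) | lo. */
static int64_t mull_s32(int32_t a, int32_t b)
{
   return (int64_t)(((uint64_t)(uint32_t)mull_hi32s(a, b) << 32)
                    | (uint64_t)mull_lo32(a, b));
}
```

The low-half call deliberately ignores signedness, matching the `False/*signedness irrelevant*/` argument in the diff; only the high-half multiply needs to know whether the operands are signed.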
From: James B. <ja...@ha...> - 2005-12-16 10:28:37
|
Hi,

I've run the performance tests on my 4 year old laptop (running an
up-to-date Fedora Core 4, gcc 4.0.2), which has a 1.1GHz Pentium III
processor - cat /proc/cpuinfo gives

model name : Intel(R) Pentium(R) III Mobile CPU 1133MHz

After SVN updating to r5348 (vex r1497), the results of make perf are

-- Running  tests in perf -----------------------------------------
bigcode1 valgrind : 0.5s  nl:16.6s (33.1x)  mc:23.0s (45.9x)
bigcode2 valgrind : 0.5s  nl:26.6s (52.2x)  mc:42.7s (83.8x)
bz2      valgrind : 2.5s  nl:14.9s ( 6.0x)  mc:55.9s (22.5x)
fbench   valgrind : 1.8s  nl: 5.7s ( 3.2x)  mc:26.7s (15.1x)
ffbench  valgrind : 4.3s  nl: 8.6s ( 2.0x)  mc:24.6s ( 5.7x)
sarp     valgrind : 0.1s  nl: 1.2s ( 9.1x)  mc:30.2s (232.5x)
-- Finished tests in perf -----------------------------------------

(snipped slightly to avoid line wrapping)

Cheers,
James.

James Begley -- Telephone +354-575-2000
Marine Research Institute, Skulagata 4,
P.O. Box 1390, 121 Reykjavik, Iceland.
|
From: Julian S. <js...@ac...> - 2005-12-16 09:54:29
|
> > +#define BYTES_PER_SEC_VBIT_NODE 4
> > +
>
> this is a very 32 bit assumption. i'd define it to sizeof(Addr). just
> being consequent about the "for free" principle ... unless i missed
> something about v's internals. :)

In this particular case the value is completely unrelated to the word
size, so it has no bad effect on 64-bit systems.

J
|
From: Oswald B. <os...@kd...> - 2005-12-16 06:15:57
|
On Thu, Dec 15, 2005 at 09:18:34PM +0000, sv...@va... wrote:
> +// 4 is the best value here.  We can go from 1 to 4 for free -- it doesn't
> +// change the size of the SecVBitNode because of padding.  If we make it
> +// larger, we have bigger nodes, but can possibly fit more partially defined
> +// bytes in each node.  In practice it seems that partially defined bytes
> +// are not clustered close to each other, so going bigger than 4 does not
> +// save space.
> +#define BYTES_PER_SEC_VBIT_NODE 4
> +

this is a very 32 bit assumption. i'd define it to sizeof(Addr). just
being consequent about the "for free" principle ... unless i missed
something about v's internals. :)

--
Hi! I'm a .signature virus! Copy me into your ~/.signature, please!
--
Chaos, panic, and disorder - my work here is done.
|
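The padding argument quoted above can be checked directly. The following is a toy model, not the real SecVBitNode layout (which carries additional fields); the `Addr` typedef and struct names are assumptions for illustration. With a pointer-sized leading field, growing the trailing byte array from 1 to 4 entries does not change `sizeof`, which is the "for free" observation in the commit comment.

```c
#include <stddef.h>

typedef unsigned long Addr;   /* assumption: Addr is pointer-sized */

/* Hypothetical minimal nodes: an address plus 1 or 4 v-bit bytes.
   Alignment of the Addr field forces tail padding, so both structs
   end up the same size on common 32- and 64-bit ABIs. */
typedef struct { Addr a; unsigned char vbits[1]; } Node1;
typedef struct { Addr a; unsigned char vbits[4]; } Node4;
```

On a 64-bit ABI both structs are 16 bytes; on a 32-bit ABI both are 8. Going beyond 4 bytes (on 32-bit) would start costing real space, which is the trade-off the original comment describes.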
From: <js...@ac...> - 2005-12-16 03:43:17
|
Nightly build on g5 ( YDL 4.0, ppc970 ) started at 2005-12-16 04:40:00 CET Checking out vex source tree ... failed Last 20 lines of log.verbose follow Nightly build on g5 ( YDL 4.0, ppc970 ) started at 2005-12-16 04:40:00 CET svn: Can't connect to host 'svn.valgrind.org': Connection timed out |
|
From: <js...@ac...> - 2005-12-16 03:36:14
|
Nightly build on phoenix ( SuSE 10.0 ) started at 2005-12-16 03:30:01 GMT Checking out vex source tree ... failed Last 20 lines of verbose log follow echo Checking out vex source tree ... svn co svn://svn.valgrind.org/vex/trunk -r {2005-12-16T03:30:01} vex svn: Can't connect to host 'svn.valgrind.org': Connection timed out ================================================= == Results from 24 hours ago == ================================================= Checking out vex source tree ... failed Last 20 lines of verbose log follow echo Checking out vex source tree ... svn co svn://svn.valgrind.org/vex/trunk -r {2005-12-15T03:30:01} vex svn: Can't connect to host 'svn.valgrind.org': Connection timed out ================================================= == Difference between 24 hours ago and now == ================================================= *** old.short Fri Dec 16 03:33:21 2005 --- new.short Fri Dec 16 03:36:30 2005 *************** *** 5,7 **** ! Checking out vex source tree ... svn co svn://svn.valgrind.org/vex/trunk -r {2005-12-15T03:30:01} vex svn: Can't connect to host 'svn.valgrind.org': Connection timed out --- 5,7 ---- ! Checking out vex source tree ... svn co svn://svn.valgrind.org/vex/trunk -r {2005-12-16T03:30:01} vex svn: Can't connect to host 'svn.valgrind.org': Connection timed out |
|
From: Yu Y. <yuy...@gm...> - 2005-12-16 02:23:06
|
Hi, everyone, Sorry for spamming the developer maillist. As this email is not about how to use valgrind, we thought developers will give out better answers. Right now, we are developing some multithreaded application with pthread. We tried to use Helgrind to detect potential data races, but it gives out many false warnings. According to the documentation, Helgrind is based on lockset algorithm, the same as Eraser etc. We want to compare these race detectors and decide which one we shall use. Is there any comparison data between Helgrind and Eraser, or other race-detectors? What can we do to help improve Helgrind? Thank you very much! -Yu |
|
From: Julian S. <js...@ac...> - 2005-12-16 01:16:00
|
> Julian's commits r5345 and r5346 (avoiding the profiling in the
> dispatcher, and using jumps instead of call/return) have the following
> effect on my 3.0 GHz P4 Prescott.

There are similar (slightly more modest improvements) on amd64, and also
ppc32 now.  In fact ppc32 is shaping up to being, if anything, a slightly
more efficient target than x86 or amd64.

Here are before and after numbers for ppc32 on a 1.25GHz MPC7447.  The
machine was not as quiet as one would like, so take the numbers with a
bit of caution.  Nevertheless the direction is clear.

I should point out, all these speedups come from being able to do
self-hosting, and in particular from cachegrind pointing out performance
stupidities.

J

ppc32, trunk, before:

bigcode1 trunk : 0.5s  nl:14.3s (30.4x, -----)  mc:21.6s (45.9x, -----)
bigcode2 trunk : 0.5s  nl:22.7s (45.5x, -----)  mc:42.5s (84.9x, -----)
bz2      trunk : 2.1s  nl:17.2s ( 8.2x, -----)  mc:49.4s (23.5x, -----)
fbench   trunk : 1.6s  nl:16.6s (10.4x, -----)  mc:53.4s (33.6x, -----)
ffbench  trunk : 4.7s  nl: 6.8s ( 1.4x, -----)  mc:21.9s ( 4.7x, -----)
sarp     trunk : 0.1s  nl: 1.1s (12.6x, -----)  mc:16.7s (186.0x, -----)

ppc32, trunk, after:

bigcode1 trunk : 0.5s  nl:11.5s (23.9x, -----)  mc:18.9s (39.4x, -----)
bigcode2 trunk : 0.5s  nl:20.2s (40.5x, -----)  mc:39.0s (78.1x, -----)
bz2      trunk : 2.1s  nl:13.8s ( 6.6x, -----)  mc:45.5s (21.8x, -----)
fbench   trunk : 1.6s  nl:12.9s ( 8.1x, -----)  mc:49.7s (31.3x, -----)
ffbench  trunk : 4.6s  nl: 6.7s ( 1.5x, -----)  mc:22.0s ( 4.8x, -----)
sarp     trunk : 0.1s  nl: 0.9s (10.6x, -----)  mc:16.7s (185.3x, -----)
|
From: <sv...@va...> - 2005-12-16 01:08:26
|
Author: sewardj
Date: 2005-12-16 01:08:22 +0000 (Fri, 16 Dec 2005)
New Revision: 5358
Log:
Hold the event count in r29 rather than the count register, since the
former doesn't need to be spilled and reloaded for every bb run.
Modified:
trunk/coregrind/m_dispatch/dispatch-ppc32-linux.S
Modified: trunk/coregrind/m_dispatch/dispatch-ppc32-linux.S
===================================================================
--- trunk/coregrind/m_dispatch/dispatch-ppc32-linux.S	2005-12-16 01:07:11 UTC (rev 5357)
+++ trunk/coregrind/m_dispatch/dispatch-ppc32-linux.S	2005-12-16 01:08:22 UTC (rev 5358)
@@ -183,10 +183,9 @@
         /* CAB TODO: Use a caller-saved reg for orig guest_state ptr
            - rem to set non-allocateable in isel.c */
 
-        /* hold dispatch_ctr in ctr reg */
+        /* hold dispatch_ctr in r29 */
         lis     5,VG_(dispatch_ctr)@ha
-        lwz     5,VG_(dispatch_ctr)@l(5)
-        mtctr   5
+        lwz     29,VG_(dispatch_ctr)@l(5)
 
         /* set host FPU control word to the default mode expected
            by VEX-generated code.  See comments in libvex.h for
@@ -240,8 +239,8 @@
         /* At entry: Live regs:
                 r1  (=sp)
                 r3  (=CIA = next guest address)
+                r29 (=dispatch_ctr)
                 r31 (=guest_state)
-                ctr (=dispatch_ctr)
            Stack state:
                 44(r1) (=orig guest_state)
         */
@@ -255,7 +254,10 @@
         stw     3,OFFSET_ppc32_CIA(31)
 
         /* Are we out of timeslice?  If yes, defer to scheduler. */
-        bdz     counter_is_zero  /* decrements ctr reg */
+//      addic.  29,29,-1
+        addi    29,29,-1
+        cmplwi  29,0
+        beq     counter_is_zero
 
         /* try a fast lookup in the translation cache */
         /* r4=((r30<<2) & (VG_TT_FAST_MASK<<2)) */
@@ -271,17 +273,9 @@
         addi    8,5,8
         mtlr    8
 
-        /* stop ctr being clobbered */
-        mfctr   5
-        stw     5,40(1)    /* => 40-16 = 24(1) on our parent stack */
-
         /* run the translation */
         blrl
 
-        /* reinstate clobbered ctr */
-        lwz     5,40(1)
-        mtctr   5
-
         /* start over */
         b       VG_(run_innerloop__dispatch_unprofiled)
         /*NOTREACHED*/
@@ -295,8 +289,8 @@
         /* At entry: Live regs:
                 r1  (=sp)
                 r3  (=CIA = next guest address)
+                r29 (=dispatch_ctr)
                 r31 (=guest_state)
-                ctr (=dispatch_ctr)
            Stack state:
                 44(r1) (=orig guest_state)
         */
@@ -310,7 +304,8 @@
         stw     3,OFFSET_ppc32_CIA(31)
 
         /* Are we out of timeslice?  If yes, defer to scheduler. */
-        bdz     counter_is_zero  /* decrements ctr reg */
+        addic.  29,29,-1
+        beq     counter_is_zero
 
         /* try a fast lookup in the translation cache */
         /* r4=((r30<<2) & (VG_TT_FAST_MASK<<2)) */
@@ -333,17 +328,9 @@
         addi    8,5,8
         mtlr    8
 
-        /* stop ctr being clobbered */
-        mfctr   5
-        stw     5,40(1)    /* => 40-16 = 24(1) on our parent stack */
-
         /* run the translation */
         blrl
 
-        /* reinstate clobbered ctr */
-        lwz     5,40(1)
-        mtctr   5
-
         /* start over */
         b       VG_(run_innerloop__dispatch_profiled)
         /*NOTREACHED*/
@@ -369,18 +356,14 @@
 counter_is_zero:
         /* %CIA is up to date */
         /* back out decrement of the dispatch counter */
-        mfctr   5
-        addi    5,5,1
-        mtctr   5
+        addi    29,29,1
         li      3,VG_TRC_INNER_COUNTERZERO
         b       run_innerloop_exit
 
 fast_lookup_failed:
         /* %CIA is up to date */
         /* back out decrement of the dispatch counter */
-        mfctr   5
-        addi    5,5,1
-        mtctr   5
+        addi    29,29,1
         li      3,VG_TRC_INNER_FASTMISS
         b       run_innerloop_exit
 
@@ -447,9 +430,8 @@
         addi    1,1,16
 
         /* Write ctr to VG(dispatch_ctr) */
-        mfctr   17
-        lis     18,VG_(dispatch_ctr)@ha
-        stw     17,VG_(dispatch_ctr)@l(18)
+        lis     5,VG_(dispatch_ctr)@ha
+        stw     29,VG_(dispatch_ctr)@l(5)
 
         /* Restore cr */
         lwz     0,44(1)
|
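The counter logic the commit above moves into r29 is just decrement, test for zero, and back out the decrement on exit so the value written back to VG_(dispatch_ctr) stays consistent. A C model of that control flow, with hypothetical names rather than Valgrind's actual API, makes the back-out step easy to check:

```c
/* Toy model of the dispatcher inner loop: run one "translation" per
   iteration, decrementing a counter; when it reaches zero, undo the
   final decrement (the `addi 29,29,1` back-out) and return to the
   scheduler. */
static unsigned int dispatch_ctr;   /* models VG_(dispatch_ctr) */

static unsigned int run_innerloop_model(unsigned int budget)
{
   unsigned int blocks_run = 0;
   dispatch_ctr = budget;
   for (;;) {
      if (--dispatch_ctr == 0) {
         dispatch_ctr++;        /* back out decrement, as in the .S */
         return blocks_run;     /* cf. VG_TRC_INNER_COUNTERZERO */
      }
      blocks_run++;             /* "run the translation" */
   }
}
```

The appeal of r29 over the ctr register, as the log says, is that a callee-saved GPR survives the `blrl` into the translation, so the save/restore pair around every block run disappears.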
From: <sv...@va...> - 2005-12-16 01:07:14
|
Author: sewardj
Date: 2005-12-16 01:07:11 +0000 (Fri, 16 Dec 2005)
New Revision: 5357
Log:
Add missing cases in debug printing.
Modified:
trunk/coregrind/m_scheduler/scheduler.c
Modified: trunk/coregrind/m_scheduler/scheduler.c
===================================================================
--- trunk/coregrind/m_scheduler/scheduler.c	2005-12-15 23:07:45 UTC (rev 5356)
+++ trunk/coregrind/m_scheduler/scheduler.c	2005-12-16 01:07:11 UTC (rev 5357)
@@ -157,6 +157,10 @@
       case VEX_TRC_JMP_CLIENTREQ:    return "CLIENTREQ";
       case VEX_TRC_JMP_YIELD:        return "YIELD";
       case VEX_TRC_JMP_NODECODE:     return "NODECODE";
+      case VEX_TRC_JMP_MAPFAIL:      return "MAPFAIL";
+      case VEX_TRC_JMP_EMWARN:       return "EMWARN";
+      case VEX_TRC_JMP_TINVAL:       return "TINVAL";
+      case VG_TRC_INVARIANT_FAILED:  return "INVFAILED";
       case VG_TRC_INNER_COUNTERZERO: return "COUNTERZERO";
       case VG_TRC_INNER_FASTMISS:    return "FASTMISS";
       case VG_TRC_FAULT_SIGNAL:      return "FAULTSIGNAL";
|
|
From: <sv...@va...> - 2005-12-16 01:06:47
|
Author: sewardj
Date: 2005-12-16 01:06:42 +0000 (Fri, 16 Dec 2005)
New Revision: 1497
Log:
ppc32/64 backend: take r29 out of circulation so the Valgrind
dispatcher can use it.
Modified:
trunk/priv/host-ppc32/hdefs.c
Modified: trunk/priv/host-ppc32/hdefs.c
===================================================================
--- trunk/priv/host-ppc32/hdefs.c	2005-12-15 21:33:50 UTC (rev 1496)
+++ trunk/priv/host-ppc32/hdefs.c	2005-12-16 01:06:42 UTC (rev 1497)
@@ -204,9 +204,9 @@
 {
    UInt i=0;
    if (mode64)
-      *nregs = (32-8) + (32-24) + (32-24);
+      *nregs = (32-9) + (32-24) + (32-24);
    else
-      *nregs = (32-6) + (32-24) + (32-24);
+      *nregs = (32-7) + (32-24) + (32-24);
    *arr = LibVEX_Alloc(*nregs * sizeof(HReg));
    // GPR0 = scratch reg where possible - some ops interpret as value zero
    // GPR1 = stack pointer
@@ -227,7 +227,7 @@
       (*arr)[i++] = hregPPC_GPR12(mode64);
    }
    // GPR13 = thread specific pointer
-   // GPR 14 and above are callee save.  Yay.
+   // GPR14 and above are callee save.  Yay.
    (*arr)[i++] = hregPPC_GPR14(mode64);
    (*arr)[i++] = hregPPC_GPR15(mode64);
    (*arr)[i++] = hregPPC_GPR16(mode64);
@@ -243,7 +243,7 @@
    (*arr)[i++] = hregPPC_GPR26(mode64);
    (*arr)[i++] = hregPPC_GPR27(mode64);
    (*arr)[i++] = hregPPC_GPR28(mode64);
-   (*arr)[i++] = hregPPC_GPR29(mode64);
+   // GPR29 is reserved for the dispatcher
    // GPR30 is reserved as AltiVec spill reg temporary
    // GPR31 is reserved for the GuestStatePtr
 
@@ -260,6 +260,7 @@
    (*arr)[i++] = hregPPC32_FPR7();
 
    /* Same deal re Altivec */
+   /* NB, vr29 is used as a scratch temporary -- do not allocate */
    (*arr)[i++] = hregPPC32_VR0();
    (*arr)[i++] = hregPPC32_VR1();
    (*arr)[i++] = hregPPC32_VR2();
@@ -1287,7 +1288,7 @@
          ppHRegPPC32(i->Pin.Set32.dst);
          vex_printf(",");
          ppHRegPPC32(i->Pin.Set32.dst);
-         vex_printf("1");
+         vex_printf(",1");
       }
       vex_printf(" }");
    }
@@ -1741,7 +1742,7 @@
       addHRegUse(u, HRmRead,  i->Pin.AvBin32Fx4.srcL);
       addHRegUse(u, HRmRead,  i->Pin.AvBin32Fx4.srcR);
       if (i->Pin.AvBin32Fx4.op == Pavfp_MULF)
-         addHRegUse(u, HRmWrite, hregPPC_GPR29(mode64));
+         addHRegUse(u, HRmWrite, hregPPC32_VR29());
       return;
    case Pin_AvUn32Fx4:
       addHRegUse(u, HRmWrite, i->Pin.AvUn32Fx4.dst);
@@ -3374,7 +3375,9 @@
         load -0.0 (0x8000_0000) to each 32-bit word of vB
         this makes the add a noop.
      */
-      UInt vB = 29;  // XXX: Using r29 for temp
+      UInt vB = 29;  // XXX: Using v29 for temp do not change
+                     //      without also changing
+                     //      getRegUsage_PPC32Instr
      UInt konst = 0x1F;
 
      // Better way to load -0.0 (0x80000000) ?
// Better way to load -0.0 (0x80000000) ?
|