|
From: <jhr...@t-...> - 2003-10-19 13:38:16
|
Hello,
now that my basic tests work under valgrind when compiled with ICC 7.1
(excellent!), I've tightened the requirements. Another one of my tests,
compiled with -march=pentium4 and profile-guided optimizations under
ICC 7.1, fails with
----------
==1568== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
==1568== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==1568== Using valgrind-20031012, a program supervision framework for x86-linux.
==1568== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==1568== Estimated CPU clock rate is 1724 MHz
==1568== For more details, rerun with: -v
==1568==
float, 8
rrr
disInstr: unhandled instruction bytes: 0xF 0x14 0xF8 0xF3
----------
Again an easy one? ;-)
Thanks,
Joerg
|
From: <jhr...@t-...> - 2003-10-19 15:06:07
|
Hello,
I wrote:
> now that my basic tests work under valgrind when compiled with ICC 7.1
> (excellent!), I've tightened the requirements. Another one of my tests,
> compiled with -march=pentium4 and profile guided optimizations under
> ICC 7.1 fails with
>
> ----------
> ==1568== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
> ==1568== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
> ==1568== Using valgrind-20031012, a program supervision framework for x86-linux.
> ==1568== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
> ==1568== Estimated CPU clock rate is 1724 MHz
> ==1568== For more details, rerun with: -v
> ==1568==
> float, 8
> rrr
> disInstr: unhandled instruction bytes: 0xF 0x14 0xF8 0xF3
> ----------
>
> Again an easy one? ;-)
Hm, unsure. To get this test running, I had to revisit the implementation of
the following instructions:
1. unpcklps/hps/lpd/hpd
lps/hps were missing
2. movaps/ups/apd/upd
upd was missing; the implementations of ps and pd differed in how they
computed the store flag (the ps version, with its typo corrected, seems to
be the more appropriate one).
3. movlps/lpd/hps/hpd
lps/hps/hpd were missing, check against reg-reg move failed, lpd moved 16
instead of 8 bytes IIUC.
I didn't look into movhlps/movlhps. They still seem to be missing.
I trusted dis_SSE_reg_or_mem() and dis_SSE3_load_store_or_mov() to always do
The Right Thing (TM).
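To illustrate point 3, here is a minimal C sketch of what the 8-byte MOVLPS/MOVLPD and MOVHPS/MOVHPD loads and stores do to a 16-byte register image, which is why the size argument should be 8 rather than 16. The type xmm_t and all function names are made up for this illustration and are not Valgrind code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte image of an xmm register, lowest byte first. */
typedef struct { uint8_t b[16]; } xmm_t;

/* MOVLPS/MOVLPD load: 8 bytes from memory replace the low half only. */
void movlp_load(xmm_t *dst, const uint8_t *mem) { memcpy(dst->b, mem, 8); }

/* MOVHPS/MOVHPD load: 8 bytes from memory replace the high half only. */
void movhp_load(xmm_t *dst, const uint8_t *mem) { memcpy(dst->b + 8, mem, 8); }

/* The store forms write 8 bytes of the register out to memory. */
void movlp_store(uint8_t *mem, const xmm_t *src) { memcpy(mem, src->b, 8); }
void movhp_store(uint8_t *mem, const xmm_t *src) { memcpy(mem, src->b + 8, 8); }

/* Small self-check: a low-half load must leave the high half untouched. */
void check_movlp(void) {
    xmm_t x;
    uint8_t mem[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    uint8_t out[8];
    memset(x.b, 0xAA, sizeof x.b);
    movlp_load(&x, mem);
    assert(x.b[0] == 1 && x.b[7] == 8);
    assert(x.b[8] == 0xAA && x.b[15] == 0xAA);
    movhp_store(out, &x);
    assert(out[0] == 0xAA && out[7] == 0xAA);
}
```

As a side note on the reg-reg check: per the Intel opcode map, the register-to-register encodings of 0x0F 0x12 and 0x0F 0x16 are MOVHLPS and MOVLHPS, which may be what ICC 7.1 was emitting.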
Best,
Joerg
P.S.: here's the diff -u -w:
----------
--- vg_to_ucode.c.orig Sun Oct 19 14:01:03 2003
+++ vg_to_ucode.c Sun Oct 19 16:09:09 2003
@@ -4142,14 +4142,22 @@
goto decode_success;
}
- /* 0x14: UNPCKLPD (src)xmmreg-or-mem, (dst)xmmreg */
- /* 0x15: UNPCKHPD (src)xmmreg-or-mem, (dst)xmmreg */
- if (sz == 2
- && insn[0] == 0x0F
+ /* 0x0F 0x14: UNPCKLPS (src)xmmreg-or-mem, (dst)xmmreg */
+ /* 0x0F 0x15: UNPCKHPS (src)xmmreg-or-mem, (dst)xmmreg */
+ /* 0x66 0x0F 0x14: UNPCKLPD (src)xmmreg-or-mem, (dst)xmmreg */
+ /* 0x66 0x0F 0x15: UNPCKHPD (src)xmmreg-or-mem, (dst)xmmreg */
+ if (insn[0] == 0x0F
&& (insn[1] == 0x14 || insn[1] == 0x15)) {
+ vg_assert(sz == 4 || sz == 2);
+ if (sz == 4) {
+ eip = dis_SSE2_reg_or_mem ( cb, sorb, eip+2, 16,
+ "unpck{l,h}ps",
+ insn[0], insn[1] );
+ } else {
eip = dis_SSE3_reg_or_mem ( cb, sorb, eip+2, 16,
"unpck{l,h}pd",
0x66, insn[0], insn[1] );
+ }
goto decode_success;
}
@@ -4379,15 +4387,18 @@
goto decode_success;
}
- /* I don't understand how MOVAPD differs from MOVAPS. */
/* MOVAPD (28,29) -- aligned load/store of xmm reg, or xmm-xmm reg
move */
+ /* MOVUPD (10,11) -- unaligned load/store of xmm reg, or xmm-xmm
+ reg move */
if (sz == 2
- && insn[0] == 0x0F && insn[1] == 0x28) {
- UChar* name = "movapd";
- //(insn[1] == 0x10 || insn[1] == 0x11)
- // ? "movups" : "movaps";
- Bool store = False; //insn[1] == 0x29 || insn[1] == 11;
+ && insn[0] == 0x0F && (insn[1] == 0x28
+ || insn[1] == 0x29
+ || insn[1] == 0x10
+ || insn[1] == 0x11)) {
+ UChar* name = (insn[1] == 0x10 || insn[1] == 0x11)
+ ? "movupd" : "movapd";
+ Bool store = insn[1] == 0x29 || insn[1] == 0x11;
eip = dis_SSE3_load_store_or_mov
( cb, sorb, eip+2, 16, store, name,
0x66, insn[0], insn[1] );
@@ -4404,7 +4415,7 @@
|| insn[1] == 0x11)) {
UChar* name = (insn[1] == 0x10 || insn[1] == 0x11)
? "movups" : "movaps";
- Bool store = insn[1] == 0x29 || insn[1] == 11;
+ Bool store = insn[1] == 0x29 || insn[1] == 0x11;
vg_assert(sz == 4);
eip = dis_SSE2_load_store_or_mov
( cb, sorb, eip+2, 16, store, name,
@@ -4423,16 +4434,42 @@
goto decode_success;
}
- /* MOVLPD -- 8-byte load/store. */
- if (sz == 2
- && insn[0] == 0x0F
+ /* 0x0F 0x12/0x13: MOVLPS -- 8-byte load/store. */
+ /* 0x66 0x0F 0x12/0x13: MOVLPD -- 8-byte load/store. */
+ if (insn[0] == 0x0F
&& (insn[1] == 0x12 || insn[1] == 0x13)) {
+ vg_assert(sz == 4 || sz == 2);
Bool is_store = insn[1]==0x13;
/* Cannot be used for reg-reg moves, according to Intel docs. */
- vg_assert(!epartIsReg(insn[2]));
+ /* But ICC 7.1 tells us another story ;-( */
+ /* vg_assert(!epartIsReg(insn[2])); */
+ if (sz == 4) {
+ eip = dis_SSE2_load_store_or_mov
+ (cb, sorb, eip+2, 8, is_store, "movlps",
+ insn[0], insn[1] );
+ } else {
eip = dis_SSE3_load_store_or_mov
- (cb, sorb, eip+2, 16, is_store, "movlpd",
+ (cb, sorb, eip+2, 8, is_store, "movlpd",
0x66, insn[0], insn[1] );
+ }
+ goto decode_success;
+ }
+
+ /* 0x0F 0x16/0x17: MOVHPS -- 8-byte load/store. */
+ /* 0x66 0x0F 0x16/0x17: MOVHPD -- 8-byte load/store. */
+ if (insn[0] == 0x0F
+ && (insn[1] == 0x16 || insn[1] == 0x17)) {
+ vg_assert(sz == 4 || sz == 2);
+ Bool is_store = insn[1]==0x17;
+ if (sz == 4) {
+ eip = dis_SSE2_load_store_or_mov
+ (cb, sorb, eip+2, 8, is_store, "movhps",
+ insn[0], insn[1] );
+ } else {
+ eip = dis_SSE3_load_store_or_mov
+ (cb, sorb, eip+2, 8, is_store, "movhpd",
+ 0x66, insn[0], insn[1] );
+ }
goto decode_success;
}
----------
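The one-character store-flag fix in the diff above (11 versus 0x11) is easy to miss, so here is a small standalone C sketch of why it matters. The function names are hypothetical and only mirror the expression used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Second opcode byte (after 0x0F) of the MOVUPx/MOVAPx family:
   0x10 and 0x28 load into an xmm register, 0x11 and 0x29 store from one. */

/* Old expression: decimal 11 is 0x0B, so opcode 0x11 never matches. */
static bool is_store_old(uint8_t op) { return op == 0x29 || op == 11; }

/* Corrected expression, as in the patch. */
static bool is_store_new(uint8_t op) { return op == 0x29 || op == 0x11; }

/* Self-check covering all four opcode bytes. */
void check_store_flag(void) {
    assert(!is_store_old(0x11));  /* unaligned store was treated as a load */
    assert(is_store_new(0x11));
    assert(is_store_old(0x29) && is_store_new(0x29));
    assert(!is_store_new(0x10) && !is_store_new(0x28));
}
```

With the old expression an unaligned store would be translated as if it were a load, so the memory operand would be treated as read rather than written.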
|
|
From: Dirk M. <dm...@gm...> - 2003-10-19 15:12:53
Attachments:
toucode.diff
|
On Saturday 18 October 2003 20:03, Joerg Walter wrote:
> disInstr: unhandled instruction bytes: 0xF 0x14 0xF8 0xF3
> Again an easy one? ;-)
That one is more difficult, as it reads 64 bits from reg/mem but writes 128 bits.
The patch below is untested and not 100% accurate, but it might allow your
tests to go on to find more problems.
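For readers unfamiliar with the instruction, the 64-bit-in, 128-bit-out behaviour can be sketched in plain C as follows. This models UNPCKLPS as described in the Intel manual; the struct and function name are invented for the example and have nothing to do with the attached patch.

```c
#include <assert.h>

/* Hypothetical model of an xmm register holding four floats, lane 0 lowest. */
typedef struct { float f[4]; } xmm4f;

/* UNPCKLPS dst, src: interleave the two low floats of dst and src.
   Only the low 64 bits of each operand are read, but all 128 bits
   of the destination are written. */
xmm4f unpcklps(xmm4f dst, xmm4f src) {
    xmm4f r;
    r.f[0] = dst.f[0];
    r.f[1] = src.f[0];
    r.f[2] = dst.f[1];
    r.f[3] = src.f[1];
    return r;
}

/* Self-check: lanes 2 and 3 of both inputs must not influence the result. */
void check_unpcklps(void) {
    xmm4f a = { { 1.0f, 2.0f, 3.0f, 4.0f } };
    xmm4f b = { { 5.0f, 6.0f, 7.0f, 8.0f } };
    xmm4f r = unpcklps(a, b);
    assert(r.f[0] == 1.0f && r.f[1] == 5.0f);
    assert(r.f[2] == 2.0f && r.f[3] == 6.0f);
}
```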
|
From: Tom H. <th...@cy...> - 2003-10-19 15:13:33
|
In message <003701c395a2$327d0300$010...@ms...>
jhr...@t-... (Joerg Walter) wrote:
> now that my basic tests work under valgrind when compiled with ICC 7.1
> (excellent!), I've tightened the requirements. Another one of my tests,
> compiled with -march=pentium4 and profile guided optimizations under ICC 7.1
> fails with
[ snipped ]
> disInstr: unhandled instruction bytes: 0xF 0x14 0xF8 0xF3
That's UNPCKLPD which seems to be present in CVS now, probably
from Julian's commit this morning.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
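For reference, here is a rough C sketch of how the reported bytes split up, going by the Intel opcode map: without a 0x66 prefix, 0x0F 0x14 is the single-precision form (UNPCKLPS), and with the prefix it is UNPCKLPD. The helper names are hypothetical; the trailing 0xF3 presumably belongs to the following instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The three fields of an x86 ModRM byte. */
typedef struct { unsigned mod, reg, rm; } modrm_t;

modrm_t split_modrm(uint8_t b) {
    modrm_t m;
    m.mod = b >> 6;          /* 3 means the r/m field names a register */
    m.reg = (b >> 3) & 7;    /* destination xmm register */
    m.rm  = b & 7;           /* source xmm register when mod == 3 */
    return m;
}

/* Mnemonic for opcode 0x0F 0x14 / 0x0F 0x15, with or without a 0x66 prefix. */
const char *unpck_name(int has_66_prefix, uint8_t opcode) {
    if (opcode == 0x14) return has_66_prefix ? "unpcklpd" : "unpcklps";
    if (opcode == 0x15) return has_66_prefix ? "unpckhpd" : "unpckhps";
    return NULL;
}

/* Self-check against the bytes from the report: 0x0F 0x14 0xF8. */
void check_decode(void) {
    modrm_t m = split_modrm(0xF8);
    assert(m.mod == 3 && m.reg == 7 && m.rm == 0);
    assert(strcmp(unpck_name(0, 0x14), "unpcklps") == 0);
}
```

Read this way, the failing bytes would be unpcklps %xmm0, %xmm7 in AT&T syntax.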
|
|
From: <jhr...@t-...> - 2003-10-19 16:05:05
|
----- Original Message -----
From: "Tom Hughes" <th...@cy...>
To: <val...@li...>
Sent: Sunday, October 19, 2003 4:32 PM
Subject: Re: [Valgrind-users] disInstr: unhandled instruction bytes: 0xF 0x14 0xF8 0xF3
> In message <003701c395a2$327d0300$010...@ms...>
> jhr...@t-... (Joerg Walter) wrote:
>
> > now that my basic tests work under valgrind when compiled with ICC 7.1
> > (excellent!), I've tightened the requirements. Another one of my tests,
> > compiled with -march=pentium4 and profile guided optimizations under ICC 7.1
> > fails with
>
> [ snipped ]
>
> > disInstr: unhandled instruction bytes: 0xF 0x14 0xF8 0xF3
>
> That's UNPCKLPD which seems to be present in CVS now, probably
> from Julian's commit this morning.
OK, I'll check that next.
Thanks,
Joerg
|
From: <jhr...@t-...> - 2003-10-19 18:08:59
|
Hi Tom,
I wrote:
> > That's UNPCKLPD which seems to be present in CVS now, probably
> > from Julian's commit this morning.
>
> OK, I'll check that next.
Yes, Julian's commit fixed most problems already. Here's my diff -u -w
against CVS HEAD:
----------
--- vg_to_ucode.c.orig Sun Oct 19 17:53:48 2003
+++ vg_to_ucode.c Sun Oct 19 18:45:30 2003
@@ -4185,14 +4185,22 @@
goto decode_success;
}
- /* 0x14: UNPCKLPD (src)xmmreg-or-mem, (dst)xmmreg */
- /* 0x15: UNPCKHPD (src)xmmreg-or-mem, (dst)xmmreg */
- if (sz == 2
- && insn[0] == 0x0F
+ /* 0x0F 0x14: UNPCKLPS (src)xmmreg-or-mem, (dst)xmmreg */
+ /* 0x0F 0x15: UNPCKHPS (src)xmmreg-or-mem, (dst)xmmreg */
+ /* 0x66 0x0F 0x14: UNPCKLPD (src)xmmreg-or-mem, (dst)xmmreg */
+ /* 0x66 0x0F 0x15: UNPCKHPD (src)xmmreg-or-mem, (dst)xmmreg */
+ if (insn[0] == 0x0F
&& (insn[1] == 0x14 || insn[1] == 0x15)) {
+ vg_assert(sz == 4 || sz == 2);
+ if (sz == 4) {
+ eip = dis_SSE2_reg_or_mem ( cb, sorb, eip+2, 16,
+ "unpck{l,h}ps",
+ insn[0], insn[1] );
+ } else {
eip = dis_SSE3_reg_or_mem ( cb, sorb, eip+2, 16,
"unpck{l,h}pd",
0x66, insn[0], insn[1] );
+ }
goto decode_success;
}
@@ -4603,6 +4611,15 @@
goto decode_success;
}
+ /* SQRTSS: square root of scalar float. */
+ if (insn[0] == 0xF3 && insn[1] == 0x0F && insn[2] == 0x51) {
+ vg_assert(sz == 4);
+ eip = dis_SSE3_reg_or_mem ( cb, sorb, eip+3, 4,
+ "sqrtss",
+ insn[0], insn[1], insn[2] );
+ goto decode_success;
+ }
+
/* MOVLPS -- 8-byte load/store. How is this different from MOVLPS
? */
if (insn[0] == 0x0F
----------
Thanks,
Joerg
|