From: Ben Rudiak-G. <ben...@gm...> - 2013-02-20 19:11:46
|
Here are both patches as text/plain attachments. I think it may have been gmail on my end that screwed them up. |
From: Cyrill G. <gor...@op...> - 2013-02-20 18:19:20
|
On Tue, Feb 19, 2013 at 08:58:56PM -0800, Ben Rudiak-Gould wrote:
> In long mode relative offsets are always 32 bits sign-extended to 64
> bits and absolute near addresses are always 64 bits, regardless of the
> operand size.
>
> Signed-off-by: Ben Rudiak-Gould <benrudiak_at_gmail.com>
> ---
>
> diff --git a/disasm.c b/disasm.c
> index 46cec8a..50149d2 100644
> --- a/disasm.c
> +++ b/disasm.c
> @@ -532,22 +532,21 @@ static int matches(const struct itemplate *t,
> uint8_t *data,

It seems the SourceForge mailer has screwed up the patch body and I can't apply it :(
(hpa@ I believe it's time to set up our own Mailman?)

So could you please re-send both patches as attachments, with gor...@gm... CC'ed.
|
From: H. P. A. <hp...@zy...> - 2013-02-20 17:34:33
|
On 02/20/2013 09:23 AM, H. Peter Anvin wrote:
> On 02/19/2013 09:39 PM, Ben Rudiak-Gould wrote:
>> This adds "np" to a bunch of SSE-style instructions that should have
>> it, "norep" (which was implemented but unused) on quasi-SSE
>> instructions that use F2 and F3 as instruction extensions but 66 for
>> operand size, "nof3" (newly implemented) on a few instructions,
>> "norexw" on some instructions that have only 32-bit and 64-bit
>> versions, and one NOLONG. It also removes some incorrect "np"s,
>> changes some "f3"s to "f3i"s, and fixes the decoding of the
>> XCHG/NOP/PAUSE mess: F390 is always PAUSE even when rex.b=1 (at least
>> according to XED).
>
> It should have been REX.R not REX.B, to prevent:
>
> [rep] xchg r8,rax
>
> ... from being treated as NOP or PAUSE.
>

Ah, but despite the documentation it is REX.B, not REX.R.

And yes, I can confirm this applies to PAUSE but *NOT* NOP, at least on Sandy Bridge, i.e.:

F3 49 90 - PAUSE (no swap)
49 90 - XCHG R8,RAX (registers do swap)

Odd, but that's how it works.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
|
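The behaviour reported above can be summarised as a small decision function. This is only a sketch of the observation in this message (PAUSE wins whenever F3 is present, otherwise REX.B turns 90 into a real XCHG); the enum and the function name `classify_0x90` are made up for illustration and are not part of NASM's disassembler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not a NASM type. */
enum op90 { OP_NOP, OP_PAUSE, OP_XCHG };

/*
 * Classify opcode byte 0x90 from its prefix state.
 *   has_f3: an F3 prefix precedes the opcode
 *   rex:    the REX byte (0x40..0x4F), or 0 if there is none
 */
enum op90 classify_0x90(bool has_f3, uint8_t rex)
{
    if (has_f3)
        return OP_PAUSE;   /* F3 90 and F3 49 90: PAUSE, REX.B is ignored */
    if (rex & 0x01)
        return OP_XCHG;    /* REX.B set, e.g. 49 90: XCHG R8,RAX */
    return OP_NOP;         /* plain 90, or REX without the B bit */
}
```

With these assumptions, `classify_0x90(true, 0x49)` yields `OP_PAUSE` and `classify_0x90(false, 0x49)` yields `OP_XCHG`, matching the two byte sequences listed in the message.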
From: H. P. A. <hp...@zy...> - 2013-02-20 17:23:22
|
On 02/19/2013 09:39 PM, Ben Rudiak-Gould wrote:
> This adds "np" to a bunch of SSE-style instructions that should have
> it, "norep" (which was implemented but unused) on quasi-SSE
> instructions that use F2 and F3 as instruction extensions but 66 for
> operand size, "nof3" (newly implemented) on a few instructions,
> "norexw" on some instructions that have only 32-bit and 64-bit
> versions, and one NOLONG. It also removes some incorrect "np"s,
> changes some "f3"s to "f3i"s, and fixes the decoding of the
> XCHG/NOP/PAUSE mess: F390 is always PAUSE even when rex.b=1 (at least
> according to XED).

It should have been REX.R not REX.B, to prevent:

[rep] xchg r8,rax

... from being treated as NOP or PAUSE.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
|
From: Ben Rudiak-G. <ben...@gm...> - 2013-02-20 05:40:28
|
This adds "np" to a bunch of SSE-style instructions that should have it, "norep" (which was implemented but unused) on quasi-SSE instructions that use F2 and F3 as instruction extensions but 66 for operand size, "nof3" (newly implemented) on a few instructions, "norexw" on some instructions that have only 32-bit and 64-bit versions, and one NOLONG. It also removes some incorrect "np"s, changes some "f3"s to "f3i"s, and fixes the decoding of the XCHG/NOP/PAUSE mess: F390 is always PAUSE even when rex.b=1 (at least according to XED).

Signed-off-by: Ben Rudiak-Gould <benrudiak_at_gmail.com>

diff --git a/assemble.c b/assemble.c
index 4f791ec..7b33df9 100644
--- a/assemble.c
+++ b/assemble.c
@@ -118,6 +118,8 @@
  * \323 - indicates fixed 64-bit operand size, REX on extensions only.
  * \324 - indicates 64-bit operand size requiring REX prefix.
  * \325 - instruction which always uses spl/bpl/sil/dil
+ * \326 - instruction not valid with 0xF3 REP prefix. Hint for
+ disassembler only; for SSE instructions.
  * \330 - a literal byte follows in the code stream, to be added
  * to the condition code value of the instruction.
  * \331 - instruction not valid with REP prefix. Hint for
@@ -1061,6 +1063,9 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,
             ins->rex |= REX_NH;
             break;
 
+        case 0326:
+            break;
+
         case 0330:
             codes++, length++;
             break;
@@ -1709,6 +1714,9 @@ static void gencode(int32_t segment, int64_t offset, int bits,
         case 0325:
             break;
 
+        case 0326:
+            break;
+
         case 0330:
             *bytes = *codes++ ^ condval[ins->condition];
             out(offset, segment, bytes, OUT_RAWDATA, 1, NO_SEG, NO_SEG);
diff --git a/disasm.c b/disasm.c
index 46cec8a..c28ebe2 100644
--- a/disasm.c
+++ b/disasm.c
@@ -819,6 +819,11 @@ static int matches(const struct itemplate *t, uint8_t *data,
             break;
         }
 
+        case 0326:
+            if (prefix->rep == 0xF3)
+                return false;
+            break;
+
         case 0331:
             if (prefix->rep)
                 return false;
diff --git a/insns.dat b/insns.dat
index a039106..0c3828d 100644
--- a/insns.dat
+++ b/insns.dat
@@ -178,18 +178,18 @@ BB0_RESET void [ 0f 3a] PENT,CYRIX,ND
 BB1_RESET void [ 0f 3b] PENT,CYRIX,ND
 BOUND reg16,mem [rm: o16 62 /r] 186,NOLONG
 BOUND reg32,mem [rm: o32 62 /r] 386,NOLONG
-BSF reg16,mem [rm: o16 0f bc /r] 386,SM
-BSF reg16,reg16 [rm: o16 0f bc /r] 386
-BSF reg32,mem [rm: o32 0f bc /r] 386,SM
-BSF reg32,reg32 [rm: o32 0f bc /r] 386
-BSF reg64,mem [rm: o64 0f bc /r] X64,SM
-BSF reg64,reg64 [rm: o64 0f bc /r] X64
-BSR reg16,mem [rm: o16 0f bd /r] 386,SM
-BSR reg16,reg16 [rm: o16 0f bd /r] 386
-BSR reg32,mem [rm: o32 0f bd /r] 386,SM
-BSR reg32,reg32 [rm: o32 0f bd /r] 386
-BSR reg64,mem [rm: o64 0f bd /r] X64,SM
-BSR reg64,reg64 [rm: o64 0f bd /r] X64
+BSF reg16,mem [rm: o16 nof3 0f bc /r] 386,SM
+BSF reg16,reg16 [rm: o16 nof3 0f bc /r] 386
+BSF reg32,mem [rm: o32 nof3 0f bc /r] 386,SM
+BSF reg32,reg32 [rm: o32 nof3 0f bc /r] 386
+BSF reg64,mem [rm: o64 nof3 0f bc /r] X64,SM
+BSF reg64,reg64 [rm: o64 nof3 0f bc /r] X64
+BSR reg16,mem [rm: o16 nof3 0f bd /r] 386,SM
+BSR reg16,reg16 [rm: o16 nof3 0f bd /r] 386
+BSR reg32,mem [rm: o32 nof3 0f bd /r] 386,SM
+BSR reg32,reg32 [rm: o32 nof3 0f bd /r] 386
+BSR reg64,mem [rm: o64 nof3 0f bd /r] X64,SM
+BSR reg64,reg64 [rm: o64 nof3 0f bd /r] X64
 BSWAP reg32 [r: o32 0f c8+r] 486
 BSWAP reg64 [r: o64 0f c8+r] X64
 BT mem,reg16 [mr: o16 0f a3 /r] 386,SM
@@ -320,7 +320,7 @@ CMPXCHG486 mem,reg16 [mr: o16 0f a7 /r] 486,SM,UNDOC,ND,LOCK
 CMPXCHG486 reg16,reg16 [mr: o16 0f a7 /r] 486,UNDOC,ND
 CMPXCHG486 mem,reg32 [mr: o32 0f a7 /r] 486,SM,UNDOC,ND,LOCK
 CMPXCHG486 reg32,reg32 [mr: o32 0f a7 /r] 486,UNDOC,ND
-CMPXCHG8B mem [m: hle 0f c7 /1] PENT,LOCK
+CMPXCHG8B mem [m: hle norexw 0f c7 /1] PENT,LOCK
 CMPXCHG16B mem [m: o64 0f c7 /1] X64,LOCK
 CPUID void [ 0f a2] PENT
 CPU_READ void [ 0f 3d] PENT,CYRIX
@@ -715,7 +715,7 @@ LEA reg64,mem [rm: o64 8d /r] X64
 LEAVE void [ c9] 186
 LES reg16,mem [rm: o16 c4 /r] 8086,NOLONG
 LES reg32,mem [rm: o32 c4 /r] 386,NOLONG
-LFENCE void [ 0f ae e8] X64,AMD
+LFENCE void [ np 0f ae e8] X64,AMD
 LFS reg16,mem [rm: o16 0f b4 /r] 386
 LFS reg32,mem [rm: o32 0f b4 /r] 386
 LFS reg64,mem [rm: o64 0f b4 /r] X64
@@ -774,9 +774,9 @@ LSS reg64,mem [rm: o64 0f b2 /r] X64
 LTR mem [m: 0f 00 /3] 286,PROT,PRIV
 LTR mem16 [m: 0f 00 /3] 286,PROT,PRIV
 LTR reg16 [m: 0f 00 /3] 286,PROT,PRIV
-MFENCE void [ 0f ae f0] X64,AMD
+MFENCE void [ np 0f ae f0] X64,AMD
 MONITOR void [ 0f 01 c8] PRESCOTT
-MONITOR reg_eax,reg_ecx,reg_edx [---: 0f 01 c8] PRESCOTT,ND
+MONITOR reg_eax,reg_ecx,reg_edx [---: 0f 01 c8] PRESCOTT,NOLONG,ND
 MONITOR reg_rax,reg_ecx,reg_edx [---: 0f 01 c8] X64,ND
 MOV mem,reg_sreg [mr: 8c /r] 8086,SW
 MOV reg16,reg_sreg [mr: o16 8c /r] 8086
@@ -874,7 +874,7 @@ NEG rm8 [m: hle f6 /3] 8086,LOCK
 NEG rm16 [m: hle o16 f7 /3] 8086,LOCK
 NEG rm32 [m: hle o32 f7 /3] 386,LOCK
 NEG rm64 [m: hle o64 f7 /3] X64,LOCK
-NOP void [ norexb 90] 8086
+NOP void [ norexb nof3 90] 8086
 NOP rm16 [m: o16 0f 1f /0] P6
 NOP rm32 [m: o32 0f 1f /0] P6
 NOP rm64 [m: o64 0f 1f /0] X64
@@ -938,7 +938,7 @@ PADDUSW mmxreg,mmxrm [rm: np o64nw 0f dd /r] PENT,MMX,SQ
 PADDW mmxreg,mmxrm [rm: np o64nw 0f fd /r] PENT,MMX,SQ
 PAND mmxreg,mmxrm [rm: np o64nw 0f db /r] PENT,MMX,SQ
 PANDN mmxreg,mmxrm [rm: np o64nw 0f df /r] PENT,MMX,SQ
-PAUSE void [ norexb f3i 90] 8086
+PAUSE void [ f3i 90] 8086
 PAVEB mmxreg,mmxrm [rm: o64nw 0f 50 /r] PENT,MMX,SQ,CYRIX
 PAVGUSB mmxreg,mmxrm [rm: o64nw 0f 0f /r bf] PENT,3DNOW,SQ
 PCMPEQB mmxreg,mmxrm [rm: np o64nw 0f 74 /r] PENT,MMX,SQ
@@ -1177,7 +1177,7 @@ SCASB void [ repe ae] 8086
 SCASD void [ repe o32 af] 386
 SCASQ void [ repe o64 af] X64
 SCASW void [ repe o16 af] 8086
-SFENCE void [ 0f ae f8] X64,AMD
+SFENCE void [ np 0f ae f8] X64,AMD
 SGDT mem [m: 0f 01 /0] 286
 SHL rm8,unity [m-: d0 /4] 8086
 SHL rm8,reg_cl [m-: d2 /4] 8086
@@ -1480,7 +1480,7 @@ CVTTSS2SI reg32,xmmrm [rm: f3 0f 2c /r] KATMAI,SSE,SD,AR1
 CVTTSS2SI reg64,xmmrm [rm: o64 f3 0f 2c /r] X64,SSE,SD,AR1
 DIVPS xmmreg,xmmrm128 [rm: np 0f 5e /r] KATMAI,SSE
 DIVSS xmmreg,xmmrm32 [rm: f3 0f 5e /r] KATMAI,SSE
-LDMXCSR mem32 [m: 0f ae /2] KATMAI,SSE
+LDMXCSR mem32 [m: np 0f ae /2] KATMAI,SSE
 MAXPS xmmreg,xmmrm128 [rm: np 0f 5f /r] KATMAI,SSE
 MAXSS xmmreg,xmmrm32 [rm: f3 0f 5f /r] KATMAI,SSE
 MINPS xmmreg,xmmrm128 [rm: np 0f 5d /r] KATMAI,SSE
@@ -1511,7 +1511,7 @@ RSQRTSS xmmreg,xmmrm32 [rm: f3 0f 52 /r] KATMAI,SSE
 SHUFPS xmmreg,xmmrm128,imm8 [rmi: np 0f c6 /r ib,u] KATMAI,SSE
 SQRTPS xmmreg,xmmrm128 [rm: np 0f 51 /r] KATMAI,SSE
 SQRTSS xmmreg,xmmrm32 [rm: f3 0f 51 /r] KATMAI,SSE
-STMXCSR mem32 [m: 0f ae /3] KATMAI,SSE
+STMXCSR mem32 [m: np 0f ae /3] KATMAI,SSE
 SUBPS xmmreg,xmmrm128 [rm: np 0f 5c /r] KATMAI,SSE
 SUBSS xmmreg,xmmrm32 [rm: f3 0f 5c /r] KATMAI,SSE
 UCOMISS xmmreg,xmmrm32 [rm: np 0f 2e /r] KATMAI,SSE
@@ -1520,22 +1520,22 @@ UNPCKLPS xmmreg,xmmrm128 [rm: np 0f 14 /r] KATMAI,SSE
 XORPS xmmreg,xmmrm128 [rm: np 0f 57 /r] KATMAI,SSE
 
 ;# Introduced in Deschutes but necessary for SSE support
-FXRSTOR mem [m: 0f ae /1] P6,SSE,FPU
-FXRSTOR64 mem [m: o64 0f ae /1] X64,SSE,FPU
-FXSAVE mem [m: 0f ae /0] P6,SSE,FPU
-FXSAVE64 mem [m: o64 0f ae /0] X64,SSE,FPU
+FXRSTOR mem [m: np 0f ae /1] P6,SSE,FPU
+FXRSTOR64 mem [m: o64 np 0f ae /1] X64,SSE,FPU
+FXSAVE mem [m: np 0f ae /0] P6,SSE,FPU
+FXSAVE64 mem [m: o64 np 0f ae /0] X64,SSE,FPU
 
 ;# XSAVE group (AVX and extended state)
 ; Introduced in late Penryn ... we really need to clean up the handling
 ; of CPU feature bits.
-XGETBV void [ np 0f 01 d0] NEHALEM
-XSETBV void [ np 0f 01 d1] NEHALEM,PRIV
-XSAVE mem [m: 0f ae /4] NEHALEM
-XSAVE64 mem [m: o64 0f ae /4] LONG,NEHALEM
-XSAVEOPT mem [m: 0f ae /6] FUTURE
-XSAVEOPT64 mem [m: o64 0f ae /6] LONG,FUTURE
-XRSTOR mem [m: 0f ae /5] NEHALEM
-XRSTOR64 mem [m: o64 0f ae /5] LONG,NEHALEM
+XGETBV void [ 0f 01 d0] NEHALEM
+XSETBV void [ 0f 01 d1] NEHALEM,PRIV
+XSAVE mem [m: np 0f ae /4] NEHALEM
+XSAVE64 mem [m: o64 np 0f ae /4] LONG,NEHALEM
+XSAVEOPT mem [m: np 0f ae /6] FUTURE
+XSAVEOPT64 mem [m: o64 np 0f ae /6] LONG,FUTURE
+XRSTOR mem [m: np 0f ae /5] NEHALEM
+XRSTOR64 mem [m: o64 np 0f ae /5] LONG,NEHALEM
 
 ; These instructions are not SSE-specific; they are
 ;# Generic memory operations
@@ -1544,7 +1544,7 @@ PREFETCHNTA mem [m: 0f 18 /0] KATMAI
 PREFETCHT0 mem [m: 0f 18 /1] KATMAI
 PREFETCHT1 mem [m: 0f 18 /2] KATMAI
 PREFETCHT2 mem [m: 0f 18 /3] KATMAI
-SFENCE void [ 0f ae f8] KATMAI
+SFENCE void [ np 0f ae f8] KATMAI
 
 ;# New MMX instructions introduced in Katmai
 MASKMOVQ mmxreg,mmxreg [rm: np 0f f7 /r] KATMAI,MMX
@@ -1576,13 +1576,13 @@ PSWAPD mmxreg,mmxrm [rm: o64nw 0f 0f /r bb] PENT,3DNOW,SQ
 ;# Willamette SSE2 Cacheability Instructions
 MASKMOVDQU xmmreg,xmmreg [rm: 66 0f f7 /r] WILLAMETTE,SSE2
 ; CLFLUSH needs its own feature flag implemented one day
-CLFLUSH mem [m: 0f ae /7] WILLAMETTE,SSE2
+CLFLUSH mem [m: np 0f ae /7] WILLAMETTE,SSE2
 MOVNTDQ mem,xmmreg [mr: 66 0f e7 /r] WILLAMETTE,SSE2,SO
 MOVNTI mem,reg32 [mr: np 0f c3 /r] WILLAMETTE,SD
 MOVNTI mem,reg64 [mr: o64 np 0f c3 /r] X64,SQ
 MOVNTPD mem,xmmreg [mr: 66 0f 2b /r] WILLAMETTE,SSE2,SO
-LFENCE void [ 0f ae e8] WILLAMETTE,SSE2
-MFENCE void [ 0f ae f0] WILLAMETTE,SSE2
+LFENCE void [ np 0f ae e8] WILLAMETTE,SSE2
+MFENCE void [ np 0f ae f0] WILLAMETTE,SSE2
 
 ;# Willamette MMX instructions (SSE2 SIMD Integer Instructions)
 MOVD mem,xmmreg [mr: 66 norexw 0f 7e /r] WILLAMETTE,SSE2,SD
@@ -1722,20 +1722,20 @@ CVTPD2PS xmmreg,xmmrm [rm: 66 0f 5a /r] WILLAMETTE,SSE2,SO
 CVTPI2PD xmmreg,mmxrm [rm: 66 0f 2a /r] WILLAMETTE,SSE2,SQ
 CVTPS2DQ xmmreg,xmmrm [rm: 66 0f 5b /r] WILLAMETTE,SSE2,SO
 CVTPS2PD xmmreg,xmmrm [rm: np 0f 5a /r] WILLAMETTE,SSE2,SQ
-CVTSD2SI reg32,xmmreg [rm: f2 0f 2d /r] WILLAMETTE,SSE2,SQ,AR1
-CVTSD2SI reg32,mem [rm: f2 0f 2d /r] WILLAMETTE,SSE2,SQ,AR1
+CVTSD2SI reg32,xmmreg [rm: norexw f2 0f 2d /r] WILLAMETTE,SSE2,SQ,AR1
+CVTSD2SI reg32,mem [rm: norexw f2 0f 2d /r] WILLAMETTE,SSE2,SQ,AR1
 CVTSD2SI reg64,xmmreg [rm: o64 f2 0f 2d /r] X64,SSE2,SQ,AR1
 CVTSD2SI reg64,mem [rm: o64 f2 0f 2d /r] X64,SSE2,SQ,AR1
 CVTSD2SS xmmreg,xmmrm [rm: f2 0f 5a /r] WILLAMETTE,SSE2,SQ
 CVTSI2SD xmmreg,mem [rm: f2 0f 2a /r] WILLAMETTE,SSE2,SD,AR1,ND
-CVTSI2SD xmmreg,rm32 [rm: f2 0f 2a /r] WILLAMETTE,SSE2,SD,AR1
+CVTSI2SD xmmreg,rm32 [rm: norexw f2 0f 2a /r] WILLAMETTE,SSE2,SD,AR1
 CVTSI2SD xmmreg,rm64 [rm: o64 f2 0f 2a /r] X64,SSE2,SQ,AR1
 CVTSS2SD xmmreg,xmmrm [rm: f3 0f 5a /r] WILLAMETTE,SSE2,SD
 CVTTPD2PI mmxreg,xmmrm [rm: 66 0f 2c /r] WILLAMETTE,SSE2,SO
 CVTTPD2DQ xmmreg,xmmrm [rm: 66 0f e6 /r] WILLAMETTE,SSE2,SO
 CVTTPS2DQ xmmreg,xmmrm [rm: f3 0f 5b /r] WILLAMETTE,SSE2,SO
-CVTTSD2SI reg32,xmmreg [rm: f2 0f 2c /r] WILLAMETTE,SSE2,SQ,AR1
-CVTTSD2SI reg32,mem [rm: f2 0f 2c /r] WILLAMETTE,SSE2,SQ,AR1
+CVTTSD2SI reg32,xmmreg [rm: norexw f2 0f 2c /r] WILLAMETTE,SSE2,SQ,AR1
+CVTTSD2SI reg32,mem [rm: norexw f2 0f 2c /r] WILLAMETTE,SSE2,SQ,AR1
 CVTTSD2SI reg64,xmmreg [rm: o64 f2 0f 2c /r] X64,SSE2,SQ,AR1
 CVTTSD2SI reg64,mem [rm: o64 f2 0f 2c /r] X64,SSE2,SQ,AR1
 DIVPD xmmreg,xmmrm [rm: 66 0f 5e /r] WILLAMETTE,SSE2,SO
@@ -1795,8 +1795,8 @@ VMFUNC void [ 0f 01 d4] VMX
 VMLAUNCH void [ 0f 01 c2] VMX
 VMLOAD void [ 0f 01 da] X64,VMX
 VMMCALL void [ 0f 01 d9] X64,VMX
-VMPTRLD mem [m: 0f c7 /6] VMX
-VMPTRST mem [m: 0f c7 /7] VMX
+VMPTRLD mem [m: np 0f c7 /6] VMX
+VMPTRST mem [m: np 0f c7 /7] VMX
 VMREAD rm32,reg32 [mr: np 0f 78 /r] VMX,NOLONG,SD
 VMREAD rm64,reg64 [mr: o64nw np 0f 78 /r] X64,VMX,SQ
 VMRESUME void [ 0f 01 c3] VMX
@@ -1878,7 +1878,7 @@ PCMPEQQ xmmreg,xmmrm [rm: 66 0f 38 29 /r] SSE41
 PEXTRB reg32,xmmreg,imm [mri: 66 0f 3a 14 /r ib,u] SSE41
 PEXTRB mem8,xmmreg,imm [mri: 66 0f 3a 14 /r ib,u] SSE41
 PEXTRB reg64,xmmreg,imm [mri: o64 66 0f 3a 14 /r ib,u] SSE41,X64
-PEXTRD rm32,xmmreg,imm [mri: 66 0f 3a 16 /r ib,u] SSE41
+PEXTRD rm32,xmmreg,imm [mri: norexw 66 0f 3a 16 /r ib,u] SSE41
 PEXTRQ rm64,xmmreg,imm [mri: o64 66 0f 3a 16 /r ib,u] SSE41,X64
 PEXTRW reg32,xmmreg,imm [mri: 66 0f 3a 15 /r ib,u] SSE41
 PEXTRW mem16,xmmreg,imm [mri: 66 0f 3a 15 /r ib,u] SSE41
@@ -1887,8 +1887,8 @@ PHMINPOSUW xmmreg,xmmrm [rm: 66 0f 38 41 /r] SSE41
 PINSRB xmmreg,mem,imm [rmi: 66 0f 3a 20 /r ib,u] SSE41,SB,AR2
 PINSRB xmmreg,rm8,imm [rmi: nohi 66 0f 3a 20 /r ib,u] SSE41,SB,AR2
 PINSRB xmmreg,reg32,imm [rmi: 66 0f 3a 20 /r ib,u] SSE41,SB,AR2
-PINSRD xmmreg,mem,imm [rmi: 66 0f 3a 22 /r ib,u] SSE41,SB,AR2
-PINSRD xmmreg,rm32,imm [rmi: 66 0f 3a 22 /r ib,u] SSE41,SB,AR2
+PINSRD xmmreg,mem,imm [rmi: norexw 66 0f 3a 22 /r ib,u] SSE41,SB,AR2
+PINSRD xmmreg,rm32,imm [rmi: norexw 66 0f 3a 22 /r ib,u] SSE41,SB,AR2
 PINSRQ xmmreg,mem,imm [rmi: o64 66 0f 3a 22 /r ib,u] SSE41,X64,SB,AR2
 PINSRQ xmmreg,rm64,imm [rmi: o64 66 0f 3a 22 /r ib,u] SSE41,X64,SB,AR2
 PMAXSB xmmreg,xmmrm [rm: 66 0f 38 3c /r] SSE41
@@ -1943,12 +1943,12 @@ PFRSQRTV mmxreg,mmxrm [rm: o64nw 0f 0f /r 87] PENT,3DNOW,SQ,CYRIX
 
 ;# Intel new instructions in ???
 ; Is NEHALEM right here?
-MOVBE reg16,mem16 [rm: o16 0f 38 f0 /r] NEHALEM,SM
-MOVBE reg32,mem32 [rm: o32 0f 38 f0 /r] NEHALEM,SM
-MOVBE reg64,mem64 [rm: o64 0f 38 f0 /r] NEHALEM,SM
-MOVBE mem16,reg16 [mr: o16 0f 38 f1 /r] NEHALEM,SM
-MOVBE mem32,reg32 [mr: o32 0f 38 f1 /r] NEHALEM,SM
-MOVBE mem64,reg64 [mr: o64 0f 38 f1 /r] NEHALEM,SM
+MOVBE reg16,mem16 [rm: o16 norep 0f 38 f0 /r] NEHALEM,SM
+MOVBE reg32,mem32 [rm: o32 norep 0f 38 f0 /r] NEHALEM,SM
+MOVBE reg64,mem64 [rm: o64 norep 0f 38 f0 /r] NEHALEM,SM
+MOVBE mem16,reg16 [mr: o16 norep 0f 38 f1 /r] NEHALEM,SM
+MOVBE mem32,reg32 [mr: o32 norep 0f 38 f1 /r] NEHALEM,SM
+MOVBE mem64,reg64 [mr: o64 norep 0f 38 f1 /r] NEHALEM,SM
 
 ;# Intel AES instructions
 AESENC xmmreg,xmmrm128 [rm: 66 0f 38 dc /r] SSE,WESTMERE
@@ -3356,9 +3356,9 @@ XTEST void [ 0f 01 d6] FUTURE,HLE,RTM
 ;
 ; based on pub number 319433-011 dated July 2011
 ;
-TZCNT reg16,rm16 [rm: o16 f3 0f bc /r] FUTURE,BMI1
-TZCNT reg32,rm32 [rm: o32 f3 0f bc /r] FUTURE,BMI1
-TZCNT reg64,rm64 [rm: o64 f3 0f bc /r] LONG,FUTURE,BMI1
+TZCNT reg16,rm16 [rm: o16 f3i 0f bc /r] FUTURE,BMI1
+TZCNT reg32,rm32 [rm: o32 f3i 0f bc /r] FUTURE,BMI1
+TZCNT reg64,rm64 [rm: o64 f3i 0f bc /r] LONG,FUTURE,BMI1
 ANDN reg32,reg32,rm32 [rvm: vex.nds.lz.0f38.w0 f2 /r] FUTURE,BMI1
 ANDN reg64,reg64,rm64 [rvm: vex.nds.lz.0f38.w1 f2 /r] LONG,FUTURE,BMI1
 BEXTR reg32,rm32,reg32 [rmv: vex.nds.lz.0f38.w0 f7 /r] FUTURE,BMI1
diff --git a/insns.pl b/insns.pl
index b154dbd..1b9d980 100755
--- a/insns.pl
+++ b/insns.pl
@@ -721,6 +721,8 @@ sub byte_code_compile($$) {
         'norexw' => 0317,
         'repe' => 0335,
         'nohi' => 0325, # Use spl/bpl/sil/dil even without REX
+        'nof3' => 0326, # No REP 0xF3 prefix permitted
+        'norep' => 0331, # No REP prefix permitted
         'wait' => 0341, # Needs a wait prefix
         'resb' => 0340,
         'jcc8' => 0370, # Match only if Jcc possible with single byte
|
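The difference between the two disassembler hints in the patch above ("nof3", code \326, and "norep", code \331) can be shown in isolation. The following is a minimal sketch that mirrors the two `case` bodies added to or already present in `matches()`; the enum and the function name `rep_prefix_allows` are invented for this example and do not exist in NASM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the two template hints; not NASM identifiers. */
enum prefix_rule {
    RULE_NONE,   /* template carries neither hint */
    RULE_NOF3,   /* \326: template must not match after an F3 prefix */
    RULE_NOREP   /* \331: template must not match after F2 or F3 */
};

/*
 * rep is the REP-class prefix seen by the disassembler:
 * 0 (none), 0xF2 or 0xF3.  Returns true if a template with the
 * given rule is still allowed to match.
 */
bool rep_prefix_allows(enum prefix_rule rule, uint8_t rep)
{
    switch (rule) {
    case RULE_NOF3:
        return rep != 0xF3;
    case RULE_NOREP:
        return rep == 0;
    default:
        return true;
    }
}
```

Under this sketch a BSF template tagged "nof3" stops matching `F3 0F BC`, which leaves that byte sequence to the TZCNT template, while a MOVBE template tagged "norep" rejects both F2 and F3.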
From: Ben Rudiak-G. <ben...@gm...> - 2013-02-20 04:59:43
|
In long mode relative offsets are always 32 bits sign-extended to 64 bits and absolute near addresses are always 64 bits, regardless of the operand size.

Signed-off-by: Ben Rudiak-Gould <benrudiak_at_gmail.com>

diff --git a/disasm.c b/disasm.c
index 46cec8a..50149d2 100644
--- a/disasm.c
+++ b/disasm.c
@@ -532,22 +532,21 @@ static int matches(const struct itemplate *t, uint8_t *data,
             opx->segment &= ~SEG_32BIT;
             break;
 
-        case4(064):
+        case4(064): /* rel */
             opx->segment |= SEG_RELATIVE;
-            if (osize == 16) {
-                opx->offset = gets16(data);
-                data += 2;
-                opx->segment &= ~(SEG_32BIT|SEG_64BIT);
-            } else if (osize == 32) {
-                opx->offset = gets32(data);
-                data += 4;
-                opx->segment &= ~SEG_64BIT;
-                opx->segment |= SEG_32BIT;
-            }
-            if (segsize != osize) {
-                opx->type =
-                    (opx->type & ~SIZE_MASK)
-                    | ((osize == 16) ? BITS16 : BITS32);
+            /* In long mode rel is always 32 bits, sign extended. */
+            if (segsize == 64 || osize == 32) {
+                opx->offset = gets32(data);
+                data += 4;
+                if (segsize != 64)
+                    opx->segment |= SEG_32BIT;
+                opx->type = (opx->type & ~SIZE_MASK)
+                    | (segsize == 64 ? BITS64 : BITS32);
+            } else {
+                opx->offset = gets16(data);
+                data += 2;
+                opx->segment &= ~SEG_32BIT;
+                opx->type = (opx->type & ~SIZE_MASK) | BITS16;
             }
             break;
 
diff --git a/insns.dat b/insns.dat
index a039106..fe3b447 100644
--- a/insns.dat
+++ b/insns.dat
@@ -229,14 +229,17 @@ BTS rm16,imm [mi: hle o16 0f ba /5 ib,u] 386,SB,LOCK
 BTS rm32,imm [mi: hle o32 0f ba /5 ib,u] 386,SB,LOCK
 BTS rm64,imm [mi: hle o64 0f ba /5 ib,u] X64,SB,LOCK
 CALL imm [i: odf e8 rel] 8086
-CALL imm|near [i: odf e8 rel] 8086
+CALL imm|near [i: odf e8 rel] 8086,ND
 CALL imm|far [i: odf 9a iwd seg] 8086,ND,NOLONG
-CALL imm16 [i: o16 e8 rel] 8086
-CALL imm16|near [i: o16 e8 rel] 8086
+; Call/jmp near imm/reg/mem is always 64-bit in long mode.
+CALL imm16 [i: o16 e8 rel] 8086,NOLONG
+CALL imm16|near [i: o16 e8 rel] 8086,ND,NOLONG
 CALL imm16|far [i: o16 9a iwd seg] 8086,ND,NOLONG
-CALL imm32 [i: o32 e8 rel] 386
-CALL imm32|near [i: o32 e8 rel] 386
+CALL imm32 [i: o32 e8 rel] 386,NOLONG
+CALL imm32|near [i: o32 e8 rel] 386,ND,NOLONG
 CALL imm32|far [i: o32 9a iwd seg] 386,ND,NOLONG
+CALL imm64 [i: o64nw e8 rel] X64
+CALL imm64|near [i: o64nw e8 rel] X64,ND
 CALL imm:imm [ji: odf 9a iwd iw] 8086,NOLONG
 CALL imm16:imm [ji: o16 9a iw iw] 8086,NOLONG
 CALL imm:imm16 [ji: o16 9a iw iw] 8086,NOLONG
@@ -248,17 +251,13 @@ CALL mem16|far [m: o16 ff /3] 8086
 CALL mem32|far [m: o32 ff /3] 386
 CALL mem64|far [m: o64 ff /3] X64
 CALL mem|near [m: odf ff /2] 8086,ND
-CALL mem16|near [m: o16 ff /2] 8086,ND
-CALL mem32|near [m: o32 ff /2] 386,NOLONG,ND
-CALL mem64|near [m: o64nw ff /2] X64,ND
-CALL reg16 [m: o16 ff /2] 8086
-CALL reg32 [m: o32 ff /2] 386,NOLONG
-CALL reg64 [m: o64nw ff /2] X64
+CALL rm16|near [m: o16 ff /2] 8086,NOLONG,ND
+CALL rm32|near [m: o32 ff /2] 386,NOLONG,ND
+CALL rm64|near [m: o64nw ff /2] X64,ND
 CALL mem [m: odf ff /2] 8086
-CALL mem16 [m: o16 ff /2] 8086
-CALL mem32 [m: o32 ff /2] 386,NOLONG
-CALL mem [m: o64nw ff /2] X64
-CALL mem64 [m: o64nw ff /2] X64
+CALL rm16 [m: o16 ff /2] 8086,NOLONG
+CALL rm32 [m: o32 ff /2] 386,NOLONG
+CALL rm64 [m: o64nw ff /2] X64
 CBW void [ o16 98] 8086
 CDQ void [ o32 99] 386
 CDQE void [ o64 98] X64
@@ -661,12 +660,15 @@ JMP imm [i: jmp8 eb rel8] 8086,ND
 JMP imm [i: odf e9 rel] 8086
 JMP imm|near [i: odf e9 rel] 8086,ND
 JMP imm|far [i: odf ea iwd seg] 8086,ND,NOLONG
-JMP imm16 [i: o16 e9 rel] 8086
-JMP imm16|near [i: o16 e9 rel] 8086,ND
+; Call/jmp near imm/reg/mem is always 64-bit in long mode.
+JMP imm16 [i: o16 e9 rel] 8086,NOLONG
+JMP imm16|near [i: o16 e9 rel] 8086,ND,NOLONG
 JMP imm16|far [i: o16 ea iwd seg] 8086,ND,NOLONG
-JMP imm32 [i: o32 e9 rel] 386
-JMP imm32|near [i: o32 e9 rel] 386,ND
+JMP imm32 [i: o32 e9 rel] 386,NOLONG
+JMP imm32|near [i: o32 e9 rel] 386,ND,NOLONG
 JMP imm32|far [i: o32 ea iwd seg] 386,ND,NOLONG
+JMP imm64 [i: o64nw e9 rel] X64
+JMP imm64|near [i: o64nw e9 rel] X64,ND
 JMP imm:imm [ji: odf ea iwd iw] 8086,NOLONG
 JMP imm16:imm [ji: o16 ea iw iw] 8086,NOLONG
 JMP imm:imm16 [ji: o16 ea iw iw] 8086,NOLONG
@@ -678,17 +680,13 @@ JMP mem16|far [m: o16 ff /5] 8086
 JMP mem32|far [m: o32 ff /5] 386
 JMP mem64|far [m: o64 ff /5] X64
 JMP mem|near [m: odf ff /4] 8086,ND
-JMP mem16|near [m: o16 ff /4] 8086,ND
-JMP mem32|near [m: o32 ff /4] 386,NOLONG,ND
-JMP mem64|near [m: o64nw ff /4] X64,ND
-JMP reg16 [m: o16 ff /4] 8086
-JMP reg32 [m: o32 ff /4] 386,NOLONG
-JMP reg64 [m: o64nw ff /4] X64
+JMP rm16|near [m: o16 ff /4] 8086,NOLONG,ND
+JMP rm32|near [m: o32 ff /4] 386,NOLONG,ND
+JMP rm64|near [m: o64nw ff /4] X64,ND
 JMP mem [m: odf ff /4] 8086
-JMP mem16 [m: o16 ff /4] 8086
-JMP mem32 [m: o32 ff /4] 386,NOLONG
-JMP mem [m: o64nw ff /4] X64
-JMP mem64 [m: o64nw ff /4] X64
+JMP rm16 [m: o16 ff /4] 8086,NOLONG
+JMP rm32 [m: o32 ff /4] 386,NOLONG
+JMP rm64 [m: o64nw ff /4] X64
 JMPE imm [i: odf 0f b8 rel] IA64
 JMPE imm16 [i: o16 0f b8 rel] IA64
 JMPE imm32 [i: o32 0f b8 rel] IA64
@@ -1428,8 +1426,9 @@ CMOVcc reg32,reg32 [rm: o32 0f 40+c /r] P6
 CMOVcc reg64,mem [rm: o64 0f 40+c /r] X64,SM
 CMOVcc reg64,reg64 [rm: o64 0f 40+c /r] X64
 Jcc imm|near [i: odf 0f 80+c rel] 386
-Jcc imm16|near [i: o16 0f 80+c rel] 386
-Jcc imm32|near [i: o32 0f 80+c rel] 386
+Jcc imm16|near [i: o16 0f 80+c rel] 386,NOLONG
+Jcc imm32|near [i: o32 0f 80+c rel] 386,NOLONG
+Jcc imm64|near [i: o64nw 0f 80+c rel] X64
 Jcc imm|short [i: 70+c rel8] 8086,ND
 Jcc imm [i: jcc8 70+c rel8] 8086,ND
 Jcc imm [i: 0f 80+c rel] 386,ND
@@ -3344,11 +3343,13 @@ VPGATHERQQ ymmreg,mem64,ymmreg [rmv: vm64y vex.dds.256.66.0f38.w1 91 /r] FUTURE
 XABORT imm [i: c6 f8 ib] FUTURE,RTM
 XABORT imm8 [i: c6 f8 ib] FUTURE,RTM
 XBEGIN imm [i: odf c7 f8 rel] FUTURE,RTM
-XBEGIN imm|near [i: odf c7 f8 rel] FUTURE,RTM
-XBEGIN imm16 [i: o16 c7 f8 rel] FUTURE,RTM
-XBEGIN imm16|near [i: o16 c7 f8 rel] FUTURE,RTM
-XBEGIN imm32 [i: o32 c7 f8 rel] FUTURE,RTM
-XBEGIN imm32|near [i: o32 c7 f8 rel] FUTURE,RTM
+XBEGIN imm|near [i: odf c7 f8 rel] FUTURE,RTM,ND
+XBEGIN imm16 [i: o16 c7 f8 rel] FUTURE,RTM,NOLONG
+XBEGIN imm16|near [i: o16 c7 f8 rel] FUTURE,RTM,NOLONG,ND
+XBEGIN imm32 [i: o32 c7 f8 rel] FUTURE,RTM,NOLONG
+XBEGIN imm32|near [i: o32 c7 f8 rel] FUTURE,RTM,NOLONG,ND
+XBEGIN imm64 [i: o64nw c7 f8 rel] FUTURE,RTM,LONG
+XBEGIN imm64|near [i: o64nw c7 f8 rel] FUTURE,RTM,LONG,ND
 XEND void [ 0f 01 d5] FUTURE,RTM
 XTEST void [ 0f 01 d6] FUTURE,HLE,RTM
 
|
From: anonymous c. <nas...@us...> - 2012-12-02 02:21:42
|
>> However, support for parentheses around an
>> operand (as suggested by #327364-001 3.5)
>> is tricky: for reg ops the parentheses simply
>> evaluate, but for mem ops the parser has to
>> explicitly skip them -- however, that decision
>> hinges on the leading transform modifier and
>> there is no clear reg versus mem distinction,
>> because of the D(...) down converts: they do
>> use reg ops, but with mem transforms.
>>
>> I have not decided the most suitable course
>> yet -- add extra parse-ahead to find '[', parse
>> down convert instructions specially, or, well,
>> ignore Intel's suggested syntax (i.e. no (...),
>> and permit all modifiers before and after any
>> operand, reg or mem [with subsequent tests
>> for their sanity/validity, of course]).
>
> Cyrill and I looked at this last night, and we're not entirely sure what you
> mean. Our thinking has been to treat braced keywords simply as a separate
> keyword space, which would mean a slightly looser syntax but probably okay.
> I seriously did not follow the bit about parens, though...

The KNC ISA PDF suggests this operand syntax:

{transform} ( op {hint} ) {mask}

- {eh} can follow memory operand, i.e. [...] {eh}
- {transform} can precede specific source operand
- {mask} can follow destination operand
- the parentheses around op and {hint} are explicit

I have come to refer to that as the canonical form.

By contrast, a relaxed syntax would permit these:

- incorrect source operand has transform modifier
- non-destination operand has mask modifier
- transform modifier specified after operand
- eviction hint specified before operand
- mask modifier specified before operand
- transform modifier operand not preceded by "("
- transform modifier operand not followed by ")"
- eviction hint specified after ")", not before
- mask modifier specified before ")", not after
- transform modifier invalid for memory operand
- transform modifier invalid for register operand
- modifier specified as extra operand

And for that last "specified as extra operand" case:

- only operand --> ignored
- leading operand --> applied to next operand
- trailing operand --> applied to previous operand
- in between operands --> applied to previous operand

The problem with the "(" and ")" in the canonical form:

{transform} ( reg op ) --> parentheses will evaluate
{transform} ( mem op ) --> parentheses won't evaluate

finding ")" after mem op --> easy
finding ")" after reg op --> hard

So always skipping "(" is hard because finding ")" is.

And to avoid having to scan ahead for reg vs mem op, the skip decision must be made based on {transform}. Since D(...) down converts use reg ops, but have mem transform modifiers, those instructions require special treatment. (I did manage to make that work after all.)

That said...

I do have functional KNC support.
I am still working on KNF support.

KNF became obsolete after KNC arrived.
KNC might be obsolete after AVX3 arrives.

Before burdening the assembler with KNF and KNC, it might make sense to await the AVX3 spec, to find out what "pieces" will actually be required in the long term.

So far it looks like:

- support for {modifier} tokens
- support for one or more of them before any operand
- support for one or more of them after any operand
- support for one or more of them without any operand

versus

- constraints on which modifier token(s) can be used, based on KNC/KNF/AVX3, instruction, operand, etc.

I plan to spend more time on this during the holidays.
|
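The "relaxed syntax" half of the list above (accept any number of {modifier} tokens before or after an operand, and validate them later) can be sketched as a simple splitter. This is only an illustration of that idea, not the parser discussed in this thread: the struct, the limits and the function name `split_modifiers` are all hypothetical, and parentheses are deliberately not handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODS 8    /* arbitrary limits chosen for this sketch */
#define MOD_LEN  16

struct operand_parts {
    char mods[MAX_MODS][MOD_LEN]; /* braced modifier names, in source order */
    int  nmods;
    char operand[64];             /* operand text with the modifiers removed */
};

/*
 * Split "{mod} operand {mod}" into its modifiers and the bare operand.
 * Modifiers may appear before or after the operand; checking whether
 * they are valid there is left to a later pass.
 * Returns 0 on success, -1 on an unterminated '{' or an overflow.
 */
int split_modifiers(const char *src, struct operand_parts *out)
{
    size_t olen = 0, start = 0;

    out->nmods = 0;
    while (*src) {
        if (*src == '{') {
            const char *end = strchr(src, '}');
            size_t len;

            if (!end)
                return -1;
            len = (size_t)(end - src - 1);
            if (out->nmods == MAX_MODS || len >= MOD_LEN)
                return -1;
            memcpy(out->mods[out->nmods], src + 1, len);
            out->mods[out->nmods][len] = '\0';
            out->nmods++;
            src = end + 1;
        } else {
            if (olen + 1 >= sizeof(out->operand))
                return -1;
            out->operand[olen++] = *src++;
        }
    }

    /* Trim whitespace left behind around the operand text. */
    while (olen > 0 && isspace((unsigned char)out->operand[olen - 1]))
        olen--;
    out->operand[olen] = '\0';
    while (isspace((unsigned char)out->operand[start]))
        start++;
    memmove(out->operand, out->operand + start, olen - start + 1);
    return 0;
}
```

Given the placeholder input `{transform} [rax] {eh}`, this yields the modifiers `transform` and `eh` plus the operand `[rax]`; deciding whether an eviction hint or a transform is legal on that operand would be the job of the "constraints" step in the list.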
From: H. P. A. <hp...@zy...> - 2012-11-27 16:37:32
|
On 09/25/2012 03:40 AM, anonymous coward wrote:
>
> However, support for parentheses around an
> operand (as suggested by #327364-001 3.5)
> is tricky: for reg ops the parentheses simply
> evaluate, but for mem ops the parser has to
> explicitly skip them -- however, that decision
> hinges on the leading transform modifier and
> there is no clear reg versus mem distinction,
> because of the D(...) down converts: they do
> use reg ops, but with mem transforms.
>
> I have not decided the most suitable course
> yet -- add extra parse-ahead to find '[', parse
> down convert instructions specially, or, well,
> ignore Intel's suggested syntax (i.e. no (...),
> and permit all modifiers before and after any
> operand, reg or mem [with subsequent tests
> for their sanity/validity, of course]).
>

Hi...

Cyrill and I looked at this last night, and we're not entirely sure what you mean. Our thinking has been to treat braced keywords simply as a separate keyword space, which would mean a slightly looser syntax but probably okay. I seriously did not follow the bit about parens, though...

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
|
From: Brian R. <bre...@mu...> - 2012-11-27 02:39:56
|
> Third, covering all the cases and making them work with the matching
> engine could take a fair bit of time.

I think this is the biggest issue. I actually suggested something like this to Jules way back in 1998 or so. (Though my suggestion was to introduce just one new prefix keyword, e.g. "alt", that would cause nasm to use the less-common variant encoding for the following instruction. Of course that falls apart for cases that have three possible encodings.) He was vaguely sympathetic to the idea, but I think he felt (rightly) that it was a fair bit of work for the amount of payoff (particularly if future upkeep is taken into account).

b
|
From: H. P. A. <hp...@zy...> - 2012-11-27 01:55:00
|
On 11/20/2012 07:37 AM, Marat Dukhan wrote:
>
> Example for rex keyword:
>
> MOV ecx, [rsi] ; encoded without REX
> rex MOV ecx, [rsi] ; encoded with REX
>
>
> Example for vex3 keyword:
>
> VPADDD xmm0, xmm0, xmm0 ; encoded with 2-byte VEX prefix
> vex3 VPADDD xmm0, xmm0, xmm0 ; encoded with 3-byte VEX prefix
>
> Is there any chance to get these features in NASM?
>

Well, there are a few problems:

First, NASM is largely a volunteer project, so getting the resources to do it is always a problem.

Second, and this may be the bigger issue, is that it causes namespace issues; someone may be using e.g. "rex" as a variable. However, the Xeon Phi assembler syntax is already addressing this by adding new keywords recognized only if they are inside curly braces, so perhaps we could just make that a general keyword space and make the syntax {rex} and so on.

Third, covering all the cases and making them work with the matching engine could take a fair bit of time.

If you are interested in this as a project, it is certainly something that would be welcome.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf. |
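The "separate keyword space" idea can be sketched roughly as follows. This is only an illustration in Python, not NASM's parser: the function name `split_hints` and the keyword set are assumptions made up for the example. The point it shows is that a name is treated as an encoding hint only when it appears in curly braces, so a bare `rex` stays available as an ordinary symbol.

```python
import re

# Hypothetical set of braced encoding hints (an assumption for this sketch,
# not NASM's actual keyword table).
BRACED_KEYWORDS = {"rex", "vex3"}

def split_hints(line):
    """Split leading {keyword} tokens off a source line.

    Returns (hints, rest): the list of braced keywords found at the start
    of the line, and the remaining text. A bare word such as 'rex' is never
    treated as a hint, so it can still be used as a label or variable.
    """
    hints = []
    pos = 0
    while True:
        m = re.match(r"\s*\{(\w+)\}", line[pos:])
        if not m:
            break
        name = m.group(1).lower()
        if name not in BRACED_KEYWORDS:
            raise ValueError(f"unknown braced keyword: {{{name}}}")
        hints.append(name)
        pos += m.end()
    return hints, line[pos:].strip()
```

With this split, `{rex} mov ecx, [rsi]` yields the hint `rex`, while `mov eax, [rex]` yields no hints and leaves `rex` to be resolved as a symbol.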
From: Marat D. <ma...@gm...> - 2012-11-20 15:37:53
|
Dear NASM developers,

I work on a high-performance library of optimized functions, and use NASM to assemble the x86/x86-64 specific implementations.

For high-performance code on some x86 microarchitectures (e.g. Intel Atom, Intel Nehalem, AMD Bulldozer) it is essential to align groups of instructions on certain boundaries (8 or 16 bytes) to achieve full CPU front-end performance. There are three ways to align instruction groups on an 8- or 16-byte boundary: insert NOPs, make instructions longer by adding prefixes, or make instructions longer by using longer instruction forms.

1. Since NOPs consume decoder resources, they do not help to improve decoder performance.
2. Adding instruction prefixes helps to a certain degree, but CPU decoders are limited in the number of instruction prefixes they can decode per cycle, so this technique has limited use.
3. Using different (longer) encoding forms is the optimal solution, but it requires support from the assembler.

NASM already supports some specifications of instruction forms, e.g.

    MOV ecx, [esi] ; encoded without memory displacement
    MOV ecx, [byte esi] ; encoded with 8-bit memory displacement
    MOV ecx, [dword esi] ; encoded with 32-bit memory displacement

    AND ecx, 0F ; encoded with 8-bit immediate
    AND ecx, dword 0F ; encoded with 32-bit immediate

    MOV ecx, [eax * 2] ; encoded as [eax + eax*1] without offset
    MOV ecx, [nosplit eax * 2] ; encoded as [eax*2] with offset

I would like this functionality in NASM to be extended to more instruction forms, and suggest new keywords acc, modrm, sib, rex, vex3:

acc keyword forces NASM to use special rax/eax/ax/al encoding form.
Example for acc keyword:

    ADD eax, 32 ; encoded as ModR/M + imm8
    acc ADD eax, 32 ; encoded as special eax form + imm32

modrm keyword forces NASM to use ModR/M encoding
Example for modrm keyword:

    ADD al, 32 ; encoded as special eax form + imm8
    modrm ADD al, 32 ; encoded as ModR/M form + imm8 (1 byte longer than the above version)
    PUSH ecx ; encoded as 50+rd
    modrm PUSH ecx ; encoded as FF /6

sib keyword forces NASM to use SIB byte even if ModR/M would be enough
Example for sib keyword:

    MOV ecx, [esi + 4] ; encoded as ModR/M + imm8
    MOV ecx, [sib esi + 4] ; encoded as ModR/M + sib + imm8

Example for rex keyword:

    MOV ecx, [rsi] ; encoded without REX
    rex MOV ecx, [rsi] ; encoded with REX

Example for vex3 keyword:

    VPADDD xmm0, xmm0, xmm0 ; encoded with 2-byte VEX prefix
    vex3 VPADDD xmm0, xmm0, xmm0 ; encoded with 3-byte VEX prefix

Is there any chance to get these features in NASM?

Kind regards,
Marat Dukhan |
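To make the size differences between these forms concrete, here is a small Python sketch (not part of NASM, and not the proposed syntax) that lists the raw bytes of a few of the forms above in 32-bit mode and picks one form per instruction so that a group reaches a wanted total length. The helper name `pad_to` and the constant names are assumptions for illustration; the byte values follow the standard x86 opcode encodings (`83 /0 ib`, `05 id`, `50+rd`, `FF /6`).

```python
# Alternative encodings of the same instructions in 32-bit mode.
ADD_EAX_32_MODRM = bytes([0x83, 0xC0, 0x20])              # 83 /0 ib: ModR/M + imm8
ADD_EAX_32_ACC = bytes([0x05, 0x20, 0x00, 0x00, 0x00])    # 05 id: eax form + imm32
PUSH_ECX_SHORT = bytes([0x51])                            # 50+rd
PUSH_ECX_MODRM = bytes([0xFF, 0xF1])                      # FF /6, mod=11, rm=ecx
MOV_ECX_ESI4 = bytes([0x8B, 0x4E, 0x04])                  # ModR/M + 8-bit displacement
MOV_ECX_ESI4_SIB = bytes([0x8B, 0x4C, 0x26, 0x04])        # ModR/M + SIB + 8-bit displacement

def pad_to(target_len, alternatives):
    """Choose one encoding per instruction so the lengths sum to target_len.

    alternatives is a list with one entry per instruction; each entry is a
    list of candidate encodings (bytes objects) for that instruction.
    Returns the list of chosen encodings, or None if no combination fits.
    """
    def search(i, remaining):
        if i == len(alternatives):
            return [] if remaining == 0 else None
        for enc in alternatives[i]:
            if len(enc) <= remaining:
                rest = search(i + 1, remaining - len(enc))
                if rest is not None:
                    return [enc] + rest
        return None
    return search(0, target_len)
```

For example, asking for 7 bytes from an ADD followed by a PUSH selects the 5-byte accumulator form and the 2-byte ModR/M form, which is the kind of choice the proposed keywords would let the programmer express directly in the source.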
From: Cyrill G. <gor...@op...> - 2012-11-14 06:44:03
|
On Tue, Nov 13, 2012 at 07:58:29PM -0800, H. Peter Anvin wrote:
> On 11/05/2012 12:48 PM, nasm-bot for Cyrill Gorcunov wrote:
> >Commit-ID: 7ce86b500c792b782b7b076f50b220fc62234954
> >Gitweb: http://repo.or.cz/w/nasm.git?a=commitdiff;h=7ce86b500c792b782b7b076f50b220fc62234954
> >Author: Cyrill Gorcunov <gor...@gm...>
> >AuthorDate: Tue, 6 Nov 2012 00:47:20 +0400
> >Committer: Cyrill Gorcunov <gor...@gm...>
> >CommitDate: Tue, 6 Nov 2012 00:47:20 +0400
> >
> >BR3392231: Fix get_closest_section_symbol_by_offset
> >
> >This patch changes get_closest_section_symbol_by_offset
> >logic to lookup only the closest symbols which are at
> >or before the supplied offset.
>
> We could use an rbtree for this. That's actually exactly what we
> have the rbtree code in NASM for.

Sure. But this would require more code. I'll take a look once time permits; at the moment a quick fix should be enough. |
From: H. P. A. <hp...@zy...> - 2012-11-14 03:58:38
|
On 11/05/2012 12:48 PM, nasm-bot for Cyrill Gorcunov wrote:
> Commit-ID: 7ce86b500c792b782b7b076f50b220fc62234954
> Gitweb: http://repo.or.cz/w/nasm.git?a=commitdiff;h=7ce86b500c792b782b7b076f50b220fc62234954
> Author: Cyrill Gorcunov <gor...@gm...>
> AuthorDate: Tue, 6 Nov 2012 00:47:20 +0400
> Committer: Cyrill Gorcunov <gor...@gm...>
> CommitDate: Tue, 6 Nov 2012 00:47:20 +0400
>
> BR3392231: Fix get_closest_section_symbol_by_offset
>
> This patch changes get_closest_section_symbol_by_offset
> logic to lookup only the closest symbols which are at
> or before the supplied offset.
>

We could use an rbtree for this. That's actually exactly what we have the rbtree code in NASM for.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf. |
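The lookup described in the commit message ("the closest symbol at or before the supplied offset") is a predecessor query on an ordered set. The sketch below shows that query in Python using a sorted list and binary search as a stand-in for the rbtree; it is not NASM's code, and the function name `closest_symbol_at_or_before` and the `(offset, name)` tuple layout are assumptions for illustration.

```python
import bisect

def closest_symbol_at_or_before(symbols, offset):
    """Return the name of the last symbol whose offset is <= offset.

    symbols must be a list of (offset, name) tuples sorted by offset.
    Returns None when every symbol lies after the supplied offset.
    """
    offsets = [off for off, _ in symbols]
    # bisect_right gives the index of the first entry strictly greater
    # than offset, so the entry just before it is the predecessor.
    i = bisect.bisect_right(offsets, offset) - 1
    return symbols[i][1] if i >= 0 else None
```

A balanced tree answers the same question in logarithmic time while also allowing cheap insertion, which a sorted list does not; that is presumably the attraction of reusing the existing rbtree code.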
From: C. M. <pu...@38...> - 2012-10-27 20:57:35
|
Hi Frank, > I suspect that the license change would apply to this anyway, (I'm also not a lawyer...) It seems likely that this is the case due to the language you used in the relicensing, iff all relevant copyright holders also participated in the relicensing. Otherwise, it is of course possible for something that's still in the repo's history to be only available under the then-applicable licensing, eg 6c98ca4 removed a tool's source which seems to not be legally obtainable under the current (2-clause BSD) licence (at least not via the repo), http://repo.or.cz/w/nasm.git?a=commit;h=6c98ca4ddce52101fb06abff7e65352693a01137 > a notation of which instructions affect which flags would be > a big improvement, IMO. I'm currently using an older manual that still includes the reference (and thus was compiled with the previous licensing in effect). I want to look into compiling it as a document on its own, maybe use Halibut for that. I also want to either embed some additional information and use a viewer that lets me fold all uninteresting sections, or failing that re-order the sections to have the uninteresting ones at the end, or only conditionally compile them into the result. (Most of the time, all post-586 extensions and even all the FPU instructions are uninteresting to me.) Highlighting the 086-/186-/386-compatible instructions also seems like an interesting idea. As far as content changes are concerned, it's still lacking 64-bit mode information (one of the apparent reasons it became obsolete for NASM's manual). I don't use that as of yet though so it's not important to me. Affected flags are a good idea. I'm also interested in recording more detailed semantic descriptions. Regards, Chris |
From: Frank K. <fbk...@my...> - 2012-10-27 19:22:39
|
C. Masloch wrote: > Hello, > > Can the instruction reference be used according to the new NASM (2-clause > BSD) licence as well? > > NASM's relicensing was completed around ecfba9d (on 2009-07-06), available > here using the web interface: > http://repo.or.cz/w/nasm.git/commit/ecfba9d6abdda57383f61031ab3406efba2769b3 > > The instruction reference was removed from the sources with 03b9f94 (on > 2009-05-09), > http://repo.or.cz/w/nasm.git?a=commit;h=03b9f941336d901e32054efc8cda20a3cc3916d3 > > No changes were applied to the doc/insref.src file after its extraction > from doc/nasmdoc.src by 9b49e24, > http://repo.or.cz/w/nasm.git/commitdiff/9b49e24e1fe1a4afc021f6c3a01720fcabdc47ca > > So next the annotations of that part of doc/nasmdoc.src from 62cb606 (the > parent of the extraction, 9b49e24) are relevant, (the last part of) > http://repo.or.cz/w/nasm.git/blame/62cb606f6876b01c5d89ad00b6d3d4a3a2ffccf2:/doc/nasmdoc.src > > This indicates that all the relevant changes are recorded as checked in by > Peter, Keith, Debbie, and Frank. I don't know whether that means that only > the four of you would be relevant for the licensing, though. Hence, it > seems best to ask you here. > > Regards, > Chris Hi Chris, I believe the original document was written by Simon Tatham (possible input from Julian Hall?). I'm happy with any changes I'm responsible for to be under 2-clause BSD. (I suspect that the license change would apply to this anyway, but... TGIANAL) If you're anticipating any changes, a notation of which instructions affect which flags would be a big improvement, IMO. Best, Frank |
From: C. M. <pu...@38...> - 2012-10-27 16:55:24
|
Hello, Can the instruction reference be used according to the new NASM (2-clause BSD) licence as well? NASM's relicensing was completed around ecfba9d (on 2009-07-06), available here using the web interface: http://repo.or.cz/w/nasm.git/commit/ecfba9d6abdda57383f61031ab3406efba2769b3 The instruction reference was removed from the sources with 03b9f94 (on 2009-05-09), http://repo.or.cz/w/nasm.git?a=commit;h=03b9f941336d901e32054efc8cda20a3cc3916d3 No changes were applied to the doc/insref.src file after its extraction from doc/nasmdoc.src by 9b49e24, http://repo.or.cz/w/nasm.git/commitdiff/9b49e24e1fe1a4afc021f6c3a01720fcabdc47ca So next the annotations of that part of doc/nasmdoc.src from 62cb606 (the parent of the extraction, 9b49e24) are relevant, (the last part of) http://repo.or.cz/w/nasm.git/blame/62cb606f6876b01c5d89ad00b6d3d4a3a2ffccf2:/doc/nasmdoc.src This indicates that all the relevant changes are recorded as checked in by Peter, Keith, Debbie, and Frank. I don't know whether that means that only the four of you would be relevant for the licensing, though. Hence, it seems best to ask you here. Regards, Chris |
From: William C. <Wil...@gl...> - 2012-09-25 11:05:27
|
The manuals for the encoding are out; I have just printed them out.
________________________________________
From: Cyrill Gorcunov [gor...@gm...] on behalf of Cyrill Gorcunov [gor...@op...]
Sent: Tuesday, September 25, 2012 7:46 AM
To: William Cockshott
Cc: nas...@li...
Subject: Re: [Nasm-devel] Xeon Phi

On Mon, Sep 24, 2012 at 07:23:41PM +0000, William Cockshott wrote:
> Hi there I am the chief maintainer of the Vector Pascal compiler which uses Nasm as its preferred back end.
> I am keen to have a version of the compiler out for the Xeon Phi as soon as I can get hold of a Xeon Phi board.
> It would be a big help if the Nasm team plan to release a Xeon Phi upgrade since otherwise I will be
> forced into the undocumented purgatory of the Gnu Assembler.
>
> Are there any such plans?

As Peter mentioned, indeed the encoding is not yet well established. But we have a plan to support it somewhere in future. No dates though.

Cyrill |
From: anonymous c. <nas...@us...> - 2012-09-25 10:40:16
|
> But we have a plan to support it somewhere in future. No dates though.

I have been experimenting with a prototype implementation that supports the following:
- K1OM instructions
- support for K1OM to [CPU] and CPU
- support for __CPU_K1OM__
- MVEX.R and MVEX.V prefixes (to force unused bits to 1)
- ZMM0...ZMM31 registers (including VSIB)
- K0...K7 mask registers
- disp8*N displacements
- operand modifiers -- {transform} ( op {eviction hint} ) {mask}
- ZWORD operand size qualifier
- ZWORD segment alignment argument
- DZ and RESZ pseudo instructions
- DZ and __DZ__ standard macros
- XITEMZ optional standard macro

The sanity checking for the {...} modifiers is somewhat tedious, but doable.

However, support for parentheses around an operand (as suggested by #327364-001 3.5) is tricky: for reg ops the parentheses simply evaluate, but for mem ops the parser has to explicitly skip them -- however, that decision hinges on the leading transform modifier and there is no clear reg versus mem distinction, because of the D(...) down converts: they do use reg ops, but with mem transforms.

I have not decided the most suitable course yet -- add extra parse-ahead to find '[', parse down convert instructions specially, or, well, ignore Intel's suggested syntax (i.e. no (...), and permit all modifiers before and after any operand, reg or mem [with subsequent tests for their sanity/validity, of course]).

And yes, I share everyone's concerns about whether K1OM will persist, or simply be yet another one-off like L1OM. It's up to Intel to provide clarity on that front first. |
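As a rough illustration of the "disp8*N displacements" item: a displacement can use the one-byte form only when it is a multiple of the scaling factor N and the quotient fits in a signed byte. The Python sketch below shows that check; it is not taken from the prototype, the function name `compress_disp8` is an assumption, and N is simply passed in by the caller because its value depends on the instruction and memory transform.

```python
def compress_disp8(disp, n):
    """Return the encoded disp8 byte for a disp8*N displacement, or None.

    disp is the full byte displacement and n the scaling factor.
    The short form is usable only if disp is a multiple of n and
    disp / n fits in a signed 8-bit value; otherwise the caller has
    to fall back to a full 32-bit displacement.
    """
    if n <= 0:
        raise ValueError("scaling factor must be positive")
    if disp % n != 0:
        return None
    scaled = disp // n
    if -128 <= scaled <= 127:
        return scaled & 0xFF  # two's-complement byte as stored in the instruction
    return None
```

With N = 64, a displacement of 128 is stored as the byte 2, while a displacement of 100 cannot use the short form at all.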
From: Cyrill G. <gor...@op...> - 2012-09-25 06:46:50
|
On Mon, Sep 24, 2012 at 07:23:41PM +0000, William Cockshott wrote:
> Hi there I am the chief maintainer of the Vector Pascal compiler which uses Nasm as its preferred back end.
> I am keen to have a version of the compiler out for the Xeon Phi as soon as I can get hold of a Xeon Phi board.
> It would be a big help if the Nasm team plan to release a Xeon Phi upgrade since otherwise I will be
> forced into the undocumented purgatory of the Gnu Assembler.
>
> Are there any such plans?

As Peter mentioned, indeed the encoding is not yet well established. But we have a plan to support it somewhere in future. No dates though.

Cyrill |
From: Peter J. <pe...@to...> - 2012-09-25 05:32:56
|
On Mon, 24 Sep 2012, William Cockshott wrote:
> Hi there I am the chief maintainer of the Vector Pascal compiler which uses Nasm as its
> preferred back end. I am keen to have a version of the compiler out for the Xeon Phi as
> soon as I can get hold of a Xeon Phi board. It would be a big help if the Nasm team
> plan to release a Xeon Phi upgrade since otherwise I will be forced into the
> undocumented purgatory of the Gnu Assembler.

I've been looking at adding support for Knights Corner (KNC, Xeon Phi) to Yasm, but when I asked Intel about it, the response I received was that Intel has not committed to maintaining the instruction encodings. It's hard to know if they can't commit right now because that triggers cross-licensing arrangements, or whether they really plan on changing the encodings in the near future. As far as I can tell, GAS doesn't even have support for it yet (at least in official CVS).

-Peter |
From: William C. <Wil...@gl...> - 2012-09-24 19:23:53
|
Hi there I am the chief maintainer of the Vector Pascal compiler which uses Nasm as its preferred back end. I am keen to have a version of the compiler out for the Xeon Phi as soon as I can get hold of a Xeon Phi board. It would be a big help if the Nasm team plan to release a Xeon Phi upgrade since otherwise I will be forced into the undocumented purgatory of the Gnu Assembler. Are there any such plans? |
From: Frank K. <fbk...@my...> - 2012-07-02 19:39:33
|
Йордан Гигов wrote:
> Withdrawn.

Thanks for the attempt, anyway. I hope you'll stick around and discuss ideas with this list!

Best,
Frank |
From: Йордан Г. <col...@gm...> - 2012-07-02 19:23:57
|
Withdrawn. I was working under a wrong assumption that it would use 64-bit relative addressing by default, since I saw no reason to limit the address space by using absolute. |
From: H. P. A. <hp...@zy...> - 2012-07-02 14:50:15
|
On 07/02/2012 12:44 AM, Йордан Гигов wrote:
> The language addition can indeed be achieved with macros, but you
> should really test the one in process_ea(). I can't test it until I
> find out why all the 32-bit linkers I try are unable to find any of
> the symbols. I haven't tried alink yet.
> I have a feeling the else block after it won't work right.

That patch looks wrong, and I mean dangerously wrong. I think you don't quite understand how the CPU works.

The reason that code is there is that in 64-bit mode, a displacement without a SIB is a RIP-relative reference. There is no 64-bit displacement mode (except for one instruction, see the manual) at all; you have to get the address into a register.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf. |
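The distinction being made here can be shown with the two encodings of a 32-bit load in 64-bit mode. The Python sketch below is an illustration only (it is not NASM's process_ea(), and the function names are assumptions): with mod=0, rm=5 and no SIB byte the 32-bit displacement is added to the address of the next instruction, whereas an absolute 32-bit address needs a SIB byte with no base and no index.

```python
import struct

def modrm(mod, reg, rm):
    """Pack the three ModR/M fields into one byte."""
    return (mod << 6) | (reg << 3) | rm

def mov_eax_rip_relative(insn_addr, target):
    """Encode 'mov eax, [rel target]' (8B 05 disp32) in 64-bit mode.

    The displacement is relative to the end of the 6-byte instruction.
    """
    length = 6
    disp = target - (insn_addr + length)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is out of rel32 range")
    return bytes([0x8B, modrm(0, 0, 5)]) + struct.pack("<i", disp)

def mov_eax_absolute32(addr):
    """Encode 'mov eax, [abs addr]' (8B 04 25 disp32) in 64-bit mode.

    Uses a SIB byte with index=100b (none) and base=101b (none); the
    disp32 is sign-extended, so only 32-bit signed addresses fit.
    """
    if not -2**31 <= addr < 2**31:
        raise ValueError("address does not fit a sign-extended disp32")
    sib = (0 << 6) | (4 << 3) | 5
    return bytes([0x8B, modrm(0, 0, 4), sib]) + struct.pack("<i", addr)
```

Both forms carry only a 32-bit displacement, which is why reaching an arbitrary 64-bit address means loading it into a register first.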
From: Йордан Г. <col...@gm...> - 2012-07-02 07:45:28
|
The language addition can indeed be achieved with macros, but you should really test the one in process_ea(). I can't test it until I find out why all the 32-bit linkers I try are unable to find any of the symbols. I haven't tried alink yet.
I have a feeling the else block after it won't work right.

2012/7/2 H. Peter Anvin <hp...@zy...>:
> On 07/01/2012 08:32 AM, Cyrill Gorcunov wrote:
>>
>> On Sun, Jul 01, 2012 at 10:13:49AM +0300, Йордан Гигов wrote:
>>>
>>> The current version of Nasm never generates mod 0 rm 5 bytes to
>>> address memory or code, thus it can only be linked with
>>> /LARGEADDRESSAWARE:NO by the Microsoft linkers. Additionally you can't
>>> specify a base larger than 0x7FFFFFFF. My patch fixes that.
>>>
>>> Also I make the proposition that in addition to "db", "dw", "dd",
>>> "dq", etc. keywords we add "dp" (as in define pointer). It is to be
>>> the same size as the program's BITS mode. In 64-bit mode it would
>>> behave as dq, in 32-bit as dd, and in 16-bit as dw.
>>
>>
>> I think this should be done rather by a macro definition than
>> squashing into C source (and, btw don't address two problems in
>> one patch, it could be 2 patches -- one for sib and one for dp).
>>
>
> I think we could go either way on that... it's not a huge difference.
>
> However, to do an if tree is kind of silly...
>
> -hpa
>
> --
> H. Peter Anvin, Intel Open Source Technology Center
> I work for Intel. I don't speak on their behalf. |
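The proposed "dp" semantics (a data item as wide as the current BITS mode) amount to a lookup from mode to size rather than a chain of conditionals. A minimal Python sketch of that behaviour follows; the function name `dp` mirrors the proposed keyword, but the code is an assumption for illustration and not how NASM or a NASM macro would implement it.

```python
# Width in bytes of a pointer-sized item for each BITS mode:
# 16 behaves like dw, 32 like dd, 64 like dq.
POINTER_BYTES = {16: 2, 32: 4, 64: 8}

def dp(value, bits):
    """Emit value as a little-endian item sized for the given BITS mode."""
    try:
        size = POINTER_BYTES[bits]
    except KeyError:
        raise ValueError(f"unsupported BITS mode: {bits}") from None
    # Mask so that negative values wrap to the chosen width, as data
    # directives conventionally do.
    return (value & ((1 << bits) - 1)).to_bytes(size, "little")
```

The table form reflects the remark in the quoted reply that an if tree is unnecessary here: the size follows directly from the mode.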