From: Song, J. K. <jin...@in...> - 2013-08-28 10:00:09
This is weird. I will send this patch again.
I guess the smtp server clobbered first a few lines of this patch.

Thanks,
Jin

> -----Original Message-----
> From: Cyrill Gorcunov [mailto:gor...@gm...]
> Sent: Tuesday, August 27, 2013 10:42 PM
> To: Song, Jin Kyu
> Cc: nas...@li...
> Subject: Re: [Nasm-devel] [PATCH 4/7] AVX-512: Add a feature to generate a
> raw bytecode file
>
> On Mon, Aug 26, 2013 at 08:28:40PM -0700, Jin Kyu Song wrote:
> > From gas testsuite file, a text file containing raw bytecodes
> > is useful when verifying the output of NASM.
> >
> > Signed-off-by: Jin Kyu Song <jin...@in...>
> > ---
> >  test/gas2nasm.py |   11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/test/gas2nasm.py b/test/gas2nasm.py
> > index de16745..a00af92 100755
> > --- a/test/gas2nasm.py
> > +++ b/test/gas2nasm.py
> > @@ -21,6 +21,9 @@ def setup():
> >      parser.add_option('-b', dest='bits', action='store',
> >              default="",
> >              help='Bits for output ASM file.')
> > +    parser.add_option('-r', dest='raw_output', action='store',
> > +            default="",
> > +            help='Name for raw output bytes in text')
>
> This one doesn't apply. Please refresh and send again.
From: Song, J. K. <jin...@in...> - 2013-08-28 09:55:58
> -----Original Message-----
> From: Cyrill Gorcunov [mailto:gor...@gm...]
> Sent: Tuesday, August 27, 2013 10:44 PM
> To: Song, Jin Kyu
> Cc: nas...@li...
> Subject: Re: [Nasm-devel] [PATCH 6/7] AVX-512: Change the data type for
> instruction flags
>
> On Mon, Aug 26, 2013 at 08:28:42PM -0700, Jin Kyu Song wrote:
> > Increased the size of data type for instruction flags from 32bits to 64bits.
> > And a new type (iflags_t) is defined for better maintainability.
> >
> > Bigger data type is needed because more instruction set types are coming
> > but there were not enough space for them. Since they are not bit masks,
> > only one instruction set is allowed for each instruction.
> >
> > Signed-off-by: Jin Kyu Song <jin...@in...>
>
> > -CVTPI2PS xmmreg,mmxrm64 [rm: np 0f 2a /r] KATMAI,SSE,MMX
> > -CVTPS2PI mmxreg,xmmrm64 [rm: np 0f 2d /r] KATMAI,SSE,MMX
> > +CVTPI2PS xmmreg,mmxrm64 [rm: np 0f 2a /r] KATMAI,SSE
> > +CVTPS2PI mmxreg,xmmrm64 [rm: np 0f 2d /r] KATMAI,SSE
>
> Why you've changed flags here and a couple of other places?

The reason is actually written in the commit message.
"Since they are not bit masks, only one instruction set is allowed for each instruction."
So both SSE and MMX could not be set for one instruction.

As nasm64developer mentioned in his email, this may not be a proper way to
expand and define bits for instruction sets. And it needs a major
restructuring of instruction flags not a simple fix of increasing data type
size.

The original purpose of this change was merely that I needed a way to
distinguish EVEX instruction from VEX one which has exactly same operand
types. For example, "vmovq xmm30,xmm29" should be encoded with EVEX because
of high-16 registers but in the matches() function, I could not think of a
way to see the current template being matched is VEX or EVEX except checking
the first byte of bytecode (0240 or 0260). So I decided to enable instruction
set flags in IF_*.

static const struct itemplate instrux_VMOVQ[] = {
    {I_VMOVQ, 2, {XMMREG,RM_XMM|BITS64,0,0,0}, NO_DECORATOR, nasm_bytecodes+13891, IF_AVX|IF_SANDYBRIDGE|IF_SQ},
    {I_VMOVQ, 2, {XMMREG,RM_XMM|BITS64,0,0,0}, NO_DECORATOR, nasm_bytecodes+9496, IF_AVX512|IF_FUTURE},

Please let me know any better way to implement this part.
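The "not bit masks" constraint in the message above can be seen in a minimal sketch. The constants are copied from the insns.h hunk of patch 6/7, except that they use a ULL suffix here (an assumption made for portability; the patch uses UL). The helper name insn_set_of is hypothetical and is not part of NASM.

```c
#include <stdint.h>

typedef uint64_t iflags_t;

/* Values from the insns.h hunk in patch 6/7 (ULL suffix is this sketch's change). */
#define IF_MMX     0x0200000000ULL
#define IF_SSE     0x0400000000ULL
#define IF_SSE3    0x0600000000ULL
#define IF_AVX     0x0D00000000ULL
#define IF_AVX512  0x0F00000000ULL
#define IF_INSMASK 0xFF00000000ULL /* the mask for instruction set types */

/* Hypothetical helper: extract the instruction-set field of a template's flags. */
static iflags_t insn_set_of(iflags_t flags)
{
    return flags & IF_INSMASK;
}
```

Because the field holds an enumerated value rather than independent bits, OR-ing two sets does not mean "both": with these values, IF_SSE | IF_MMX yields the same field as IF_SSE3. That is the reason an entry such as CVTPI2PS can carry only one instruction-set tag under this scheme.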
From: Cyrill G. <gor...@gm...> - 2013-08-28 05:47:53
On Wed, Aug 21, 2013 at 07:29:07PM -0700, Jin Kyu Song wrote:
> Please review these patches and pull if they look good.
> git://repo.or.cz/nasm/avx512.git
>
> After running a test case, various issues were found. One major thing is
> curly brace already used for grouping multi-line macro parameters.
> An escape backward slash character '\' is added when braces are passed
> as a part of enclosed parameter. The test asm file used here is also included.
>
> Patch "AVX-512: Add a test case for EVEX encoded instructions" is
> relatively huge. So I did not attach that patch in this email. Please refer to
> http://repo.or.cz/w/nasm/avx512.git/commitdiff/a4a573c47f3d9ddfd5c2521804454327765f367e

Jin, I picked up the series, except patch 4/7 which doesn't apply. But I'm
deferring pushing it out until you explain why mmx IF flags are wiped.
From: Cyrill G. <gor...@gm...> - 2013-08-28 05:44:23
On Mon, Aug 26, 2013 at 08:28:42PM -0700, Jin Kyu Song wrote:
> Increased the size of data type for instruction flags from 32bits to 64bits.
> And a new type (iflags_t) is defined for better maintainability.
>
> Bigger data type is needed because more instruction set types are coming
> but there were not enough space for them. Since they are not bit masks,
> only one instruction set is allowed for each instruction.
>
> Signed-off-by: Jin Kyu Song <jin...@in...>

> -CVTPI2PS xmmreg,mmxrm64 [rm: np 0f 2a /r] KATMAI,SSE,MMX
> -CVTPS2PI mmxreg,xmmrm64 [rm: np 0f 2d /r] KATMAI,SSE,MMX
> +CVTPI2PS xmmreg,mmxrm64 [rm: np 0f 2a /r] KATMAI,SSE
> +CVTPS2PI mmxreg,xmmrm64 [rm: np 0f 2d /r] KATMAI,SSE

Why you've changed flags here and a couple of other places?
From: Cyrill G. <gor...@gm...> - 2013-08-28 05:42:25
On Mon, Aug 26, 2013 at 08:28:40PM -0700, Jin Kyu Song wrote:
> From gas testsuite file, a text file containing raw bytecodes
> is useful when verifying the output of NASM.
>
> Signed-off-by: Jin Kyu Song <jin...@in...>
> ---
>  test/gas2nasm.py |   11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/test/gas2nasm.py b/test/gas2nasm.py
> index de16745..a00af92 100755
> --- a/test/gas2nasm.py
> +++ b/test/gas2nasm.py
> @@ -21,6 +21,9 @@ def setup():
>      parser.add_option('-b', dest='bits', action='store',
>              default="",
>              help='Bits for output ASM file.')
> +    parser.add_option('-r', dest='raw_output', action='store',
> +            default="",
> +            help='Name for raw output bytes in text')

This one doesn't apply. Please refresh and send again.
From: anonymous c. <nas...@us...> - 2013-08-28 05:17:24
>> PTEST and ROUND[PS|PD|SS|SD] were part of SSE4.1 and SSE5A.
> Thanks for letting me know. NASM does not have a definition of SSE5A though.
> Is NASM missing SSE5A intentionally?

AMD decided not to ship SSE5A in the end. (I merely cited it as an
example for instructions that were part of more than one extension or
feature.)

> So do you suggest make them all bit masks to lift this kind of limitation
> (prohibiting two different types)? Since it became 64bit, there are enough
> space for now. But I worried the case that it runs out quickly.

You already need more than 64 bits to properly map all existing
instructions to their CPUID features (and in some case you will need
to make up a flag, since CPUID lacks a few of them, e.g. hinting NOPs,
hints for branches, PAUSE, etc.). So instead of just widening from 32
to 64 bits, you'll need a real bit vector, with more than 64 bits.

Also, this would be a change that you would want to make on the main
tree, not the AVX-512 branch.
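As an illustration of the "real bit vector" suggested above, here is a minimal sketch of a flag set wider than 64 bits. Everything in it (the type iflag_vec, the ifv_* functions, the 128-bit capacity) is hypothetical and chosen only for the example; it is not the layout NASM uses.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define IFV_BITS  128              /* hypothetical capacity, more than 64 */
#define IFV_WORDS (IFV_BITS / 32)

/* Hypothetical feature-flag vector: one bit per instruction-set feature. */
typedef struct {
    uint32_t w[IFV_WORDS];
} iflag_vec;

static void ifv_clear(iflag_vec *v)
{
    memset(v, 0, sizeof *v);
}

static void ifv_set(iflag_vec *v, unsigned bit)
{
    v->w[bit / 32] |= UINT32_C(1) << (bit % 32);
}

static bool ifv_test(const iflag_vec *v, unsigned bit)
{
    return (v->w[bit / 32] >> (bit % 32)) & 1u;
}

/* True if every feature required by `need` is also present in `have`. */
static bool ifv_subset(const iflag_vec *need, const iflag_vec *have)
{
    for (unsigned i = 0; i < IFV_WORDS; i++)
        if (need->w[i] & ~have->w[i])
            return false;
    return true;
}
```

With one bit per feature, an instruction that belongs to two extensions (the XTEST case mentioned earlier in the thread) simply has both bits set, and growing past 64 features only means enlarging the array.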
From: Song, J. K. <jin...@in...> - 2013-08-27 21:57:56
> PTEST and ROUND[PS|PD|SS|SD] were part of SSE4.1 and SSE5A.

Thanks for letting me know. NASM does not have a definition of SSE5A though.
Is NASM missing SSE5A intentionally?

> XTEST is part of RTM and HLE.

I made UNDOC and HLE bit masks for this reason while others are values.

#define IF_UNDOC 0x8000000000UL /* it's an undocumented instruction */
#define IF_HLE   0x4000000000UL /* HACK NEED TO REORGANIZE THESE BITS */

So do you suggest make them all bit masks to lift this kind of limitation
(prohibiting two different types)? Since it became 64bit, there are enough
space for now. But I worried the case that it runs out quickly.

Another question is that 0x00FFC000UL seemed not used anywhere now.
Are those bits reserved?
From: anonymous c. <nas...@us...> - 2013-08-27 04:22:19
> And please note that current code patch searches for two specific strings of
> "\{" and "\}", so it might not break any existing code that have used
> backslashes in macro parameters.

How do you specify a parameter enclosed in '{' and '}' that
expands to something containing '\{' and '\}'?
From: anonymous c. <nas...@us...> - 2013-08-27 04:02:32
> Increased the size of data type for instruction flags from 32bits to 64bits.
> And a new type (iflags_t) is defined for better maintainability.
>
> Bigger data type is needed because more instruction set types are coming
> but there were not enough space for them. Since they are not bit masks,
> only one instruction set is allowed for each instruction.

PTEST and ROUND[PS|PD|SS|SD] were part of SSE4.1 and SSE5A.

XTEST is part of RTM and HLE.
From: Jin K. S. <jin...@in...> - 2013-08-27 03:30:00
|
Increased the size of data type for instruction flags from 32bits to 64bits. And a new type (iflags_t) is defined for better maintainability. Bigger data type is needed because more instruction set types are coming but there were not enough space for them. Since they are not bit masks, only one instruction set is allowed for each instruction. Signed-off-by: Jin Kyu Song <jin...@in...> --- assemble.c | 6 +++--- assemble.h | 4 ++-- disasm.c | 4 ++-- disasm.h | 2 +- insns.dat | 46 +++++++++++++++++++++++----------------------- insns.h | 53 ++++++++++++++++++++++++++++------------------------- insns.pl | 15 +++++++++++++++ nasm.c | 8 ++++---- nasm.h | 2 ++ ndisasm.c | 2 +- 10 files changed, 81 insertions(+), 61 deletions(-) diff --git a/assemble.c b/assemble.c index baae15f..c22075d 100644 --- a/assemble.c +++ b/assemble.c @@ -213,7 +213,7 @@ typedef struct { #define GEN_MODRM(mod, reg, rm) \ (((mod) << 6) | (((reg) & 7) << 3) | ((rm) & 7)) -static uint32_t cpu; /* cpu level received from nasm.c */ +static iflags_t cpu; /* cpu level received from nasm.c */ static efunc errfunc; static struct ofmt *outfmt; static ListGen *list; @@ -377,7 +377,7 @@ static bool jmp_match(int32_t segment, int64_t offset, int bits, return (isize >= -128 && isize <= 127); /* is it byte size? 
*/ } -int64_t assemble(int32_t segment, int64_t offset, int bits, uint32_t cp, +int64_t assemble(int32_t segment, int64_t offset, int bits, iflags_t cp, insn * instruction, struct ofmt *output, efunc error, ListGen * listgen) { @@ -680,7 +680,7 @@ int64_t assemble(int32_t segment, int64_t offset, int bits, uint32_t cp, return 0; } -int64_t insn_size(int32_t segment, int64_t offset, int bits, uint32_t cp, +int64_t insn_size(int32_t segment, int64_t offset, int bits, iflags_t cp, insn * instruction, efunc error) { const struct itemplate *temp; diff --git a/assemble.h b/assemble.h index e5e5015..1197d59 100644 --- a/assemble.h +++ b/assemble.h @@ -38,9 +38,9 @@ #ifndef NASM_ASSEMBLE_H #define NASM_ASSEMBLE_H -int64_t insn_size(int32_t segment, int64_t offset, int bits, uint32_t cp, +int64_t insn_size(int32_t segment, int64_t offset, int bits, iflags_t cp, insn * instruction, efunc error); -int64_t assemble(int32_t segment, int64_t offset, int bits, uint32_t cp, +int64_t assemble(int32_t segment, int64_t offset, int bits, iflags_t cp, insn * instruction, struct ofmt *output, efunc error, ListGen * listgen); diff --git a/disasm.c b/disasm.c index cc55d2c..9a5f9ad 100644 --- a/disasm.c +++ b/disasm.c @@ -944,7 +944,7 @@ static const char * const condition_name[16] = { }; int32_t disasm(uint8_t *data, char *output, int outbufsize, int segsize, - int32_t offset, int autosync, uint32_t prefer) + int32_t offset, int autosync, iflags_t prefer) { const struct itemplate * const *p, * const *best_p; const struct disasm_index *ix; @@ -955,7 +955,7 @@ int32_t disasm(uint8_t *data, char *output, int outbufsize, int segsize, uint8_t *origdata; int works; insn tmp_ins, ins; - uint32_t goodness, best; + iflags_t goodness, best; int best_pref; struct prefix_info prefix; bool end_prefix; diff --git a/disasm.h b/disasm.h index 3edbfd5..70a9a7b 100644 --- a/disasm.h +++ b/disasm.h @@ -41,7 +41,7 @@ #define INSN_MAX 32 /* one instruction can't be longer than this */ int32_t disasm(uint8_t 
*data, char *output, int outbufsize, int segsize, - int32_t offset, int autosync, uint32_t prefer); + int32_t offset, int autosync, iflags_t prefer); int32_t eatbyte(uint8_t *data, char *output, int outbufsize, int segsize); #endif diff --git a/insns.dat b/insns.dat index 7a0ec60..772a3e9 100644 --- a/insns.dat +++ b/insns.dat @@ -1514,8 +1514,8 @@ CMPPS xmmreg,xmmreg,imm [rmi: np 0f c2 /r ib,u] KATMAI,SSE,SB,AR2 CMPSS xmmreg,mem,imm [rmi: f3 0f c2 /r ib,u] KATMAI,SSE,SB,AR2 CMPSS xmmreg,xmmreg,imm [rmi: f3 0f c2 /r ib,u] KATMAI,SSE,SB,AR2 COMISS xmmreg,xmmrm32 [rm: np 0f 2f /r] KATMAI,SSE -CVTPI2PS xmmreg,mmxrm64 [rm: np 0f 2a /r] KATMAI,SSE,MMX -CVTPS2PI mmxreg,xmmrm64 [rm: np 0f 2d /r] KATMAI,SSE,MMX +CVTPI2PS xmmreg,mmxrm64 [rm: np 0f 2a /r] KATMAI,SSE +CVTPS2PI mmxreg,xmmrm64 [rm: np 0f 2d /r] KATMAI,SSE CVTSI2SS xmmreg,mem [rm: f3 0f 2a /r] KATMAI,SSE,SD,AR1,ND CVTSI2SS xmmreg,rm32 [rm: f3 0f 2a /r] KATMAI,SSE,SD,AR1 CVTSI2SS xmmreg,rm64 [rm: o64 f3 0f 2a /r] X64,SSE,SQ,AR1 @@ -1523,7 +1523,7 @@ CVTSS2SI reg32,xmmreg [rm: f3 0f 2d /r] KATMAI,SSE,SD,AR1 CVTSS2SI reg32,mem [rm: f3 0f 2d /r] KATMAI,SSE,SD,AR1 CVTSS2SI reg64,xmmreg [rm: o64 f3 0f 2d /r] X64,SSE,SD,AR1 CVTSS2SI reg64,mem [rm: o64 f3 0f 2d /r] X64,SSE,SD,AR1 -CVTTPS2PI mmxreg,xmmrm [rm: np 0f 2c /r] KATMAI,SSE,MMX,SQ +CVTTPS2PI mmxreg,xmmrm [rm: np 0f 2c /r] KATMAI,SSE,SQ CVTTSS2SI reg32,xmmrm [rm: f3 0f 2c /r] KATMAI,SSE,SD,AR1 CVTTSS2SI reg64,xmmrm [rm: o64 f3 0f 2c /r] X64,SSE,SD,AR1 DIVPS xmmreg,xmmrm128 [rm: np 0f 5e /r] KATMAI,SSE @@ -1568,10 +1568,10 @@ UNPCKLPS xmmreg,xmmrm128 [rm: np 0f 14 /r] KATMAI,SSE XORPS xmmreg,xmmrm128 [rm: np 0f 57 /r] KATMAI,SSE ;# Introduced in Deschutes but necessary for SSE support -FXRSTOR mem [m: np 0f ae /1] P6,SSE,FPU -FXRSTOR64 mem [m: o64 np 0f ae /1] X64,SSE,FPU -FXSAVE mem [m: np 0f ae /0] P6,SSE,FPU -FXSAVE64 mem [m: o64 np 0f ae /0] X64,SSE,FPU +FXRSTOR mem [m: np 0f ae /1] P6,SSE +FXRSTOR64 mem [m: o64 np 0f ae /1] X64,SSE +FXSAVE mem [m: np 0f ae 
/0] P6,SSE +FXSAVE64 mem [m: o64 np 0f ae /0] X64,SSE ;# XSAVE group (AVX and extended state) ; Introduced in late Penryn ... we really need to clean up the handling @@ -1863,37 +1863,37 @@ INVVPID reg32,mem [rm: 66 0f 38 81 /r] VMX,SO,NOLONG INVVPID reg64,mem [rm: o64nw 66 0f 38 81 /r] VMX,SO,LONG ;# Tejas New Instructions (SSSE3) -PABSB mmxreg,mmxrm [rm: np 0f 38 1c /r] SSSE3,MMX,SQ +PABSB mmxreg,mmxrm [rm: np 0f 38 1c /r] SSSE3,SQ PABSB xmmreg,xmmrm [rm: 66 0f 38 1c /r] SSSE3 -PABSW mmxreg,mmxrm [rm: np 0f 38 1d /r] SSSE3,MMX,SQ +PABSW mmxreg,mmxrm [rm: np 0f 38 1d /r] SSSE3,SQ PABSW xmmreg,xmmrm [rm: 66 0f 38 1d /r] SSSE3 -PABSD mmxreg,mmxrm [rm: np 0f 38 1e /r] SSSE3,MMX,SQ +PABSD mmxreg,mmxrm [rm: np 0f 38 1e /r] SSSE3,SQ PABSD xmmreg,xmmrm [rm: 66 0f 38 1e /r] SSSE3 -PALIGNR mmxreg,mmxrm,imm [rmi: np 0f 3a 0f /r ib,u] SSSE3,MMX,SQ +PALIGNR mmxreg,mmxrm,imm [rmi: np 0f 3a 0f /r ib,u] SSSE3,SQ PALIGNR xmmreg,xmmrm,imm [rmi: 66 0f 3a 0f /r ib,u] SSSE3 -PHADDW mmxreg,mmxrm [rm: np 0f 38 01 /r] SSSE3,MMX,SQ +PHADDW mmxreg,mmxrm [rm: np 0f 38 01 /r] SSSE3,SQ PHADDW xmmreg,xmmrm [rm: 66 0f 38 01 /r] SSSE3 -PHADDD mmxreg,mmxrm [rm: np 0f 38 02 /r] SSSE3,MMX,SQ +PHADDD mmxreg,mmxrm [rm: np 0f 38 02 /r] SSSE3,SQ PHADDD xmmreg,xmmrm [rm: 66 0f 38 02 /r] SSSE3 -PHADDSW mmxreg,mmxrm [rm: np 0f 38 03 /r] SSSE3,MMX,SQ +PHADDSW mmxreg,mmxrm [rm: np 0f 38 03 /r] SSSE3,SQ PHADDSW xmmreg,xmmrm [rm: 66 0f 38 03 /r] SSSE3 -PHSUBW mmxreg,mmxrm [rm: np 0f 38 05 /r] SSSE3,MMX,SQ +PHSUBW mmxreg,mmxrm [rm: np 0f 38 05 /r] SSSE3,SQ PHSUBW xmmreg,xmmrm [rm: 66 0f 38 05 /r] SSSE3 -PHSUBD mmxreg,mmxrm [rm: np 0f 38 06 /r] SSSE3,MMX,SQ +PHSUBD mmxreg,mmxrm [rm: np 0f 38 06 /r] SSSE3,SQ PHSUBD xmmreg,xmmrm [rm: 66 0f 38 06 /r] SSSE3 -PHSUBSW mmxreg,mmxrm [rm: np 0f 38 07 /r] SSSE3,MMX,SQ +PHSUBSW mmxreg,mmxrm [rm: np 0f 38 07 /r] SSSE3,SQ PHSUBSW xmmreg,xmmrm [rm: 66 0f 38 07 /r] SSSE3 -PMADDUBSW mmxreg,mmxrm [rm: np 0f 38 04 /r] SSSE3,MMX,SQ +PMADDUBSW mmxreg,mmxrm [rm: np 0f 38 04 /r] 
SSSE3,SQ PMADDUBSW xmmreg,xmmrm [rm: 66 0f 38 04 /r] SSSE3 -PMULHRSW mmxreg,mmxrm [rm: np 0f 38 0b /r] SSSE3,MMX,SQ +PMULHRSW mmxreg,mmxrm [rm: np 0f 38 0b /r] SSSE3,SQ PMULHRSW xmmreg,xmmrm [rm: 66 0f 38 0b /r] SSSE3 -PSHUFB mmxreg,mmxrm [rm: np 0f 38 00 /r] SSSE3,MMX,SQ +PSHUFB mmxreg,mmxrm [rm: np 0f 38 00 /r] SSSE3,SQ PSHUFB xmmreg,xmmrm [rm: 66 0f 38 00 /r] SSSE3 -PSIGNB mmxreg,mmxrm [rm: np 0f 38 08 /r] SSSE3,MMX,SQ +PSIGNB mmxreg,mmxrm [rm: np 0f 38 08 /r] SSSE3,SQ PSIGNB xmmreg,xmmrm [rm: 66 0f 38 08 /r] SSSE3 -PSIGNW mmxreg,mmxrm [rm: np 0f 38 09 /r] SSSE3,MMX,SQ +PSIGNW mmxreg,mmxrm [rm: np 0f 38 09 /r] SSSE3,SQ PSIGNW xmmreg,xmmrm [rm: 66 0f 38 09 /r] SSSE3 -PSIGND mmxreg,mmxrm [rm: np 0f 38 0a /r] SSSE3,MMX,SQ +PSIGND mmxreg,mmxrm [rm: np 0f 38 0a /r] SSSE3,SQ PSIGND xmmreg,xmmrm [rm: 66 0f 38 0a /r] SSSE3 ;# AMD SSE4A diff --git a/insns.h b/insns.h index 58a4cd7..ad795e2 100644 --- a/insns.h +++ b/insns.h @@ -19,7 +19,7 @@ struct itemplate { opflags_t opd[MAX_OPERANDS]; /* bit flags for operand types */ decoflags_t deco[MAX_OPERANDS]; /* bit flags for operand decorators */ const uint8_t *code; /* the code it assembles to */ - uint32_t flags; /* some flags */ + iflags_t flags; /* some flags */ }; /* Disassembler table structure */ @@ -72,6 +72,8 @@ extern const uint8_t nasm_bytecodes[]; * (The default state if neither IF_SM nor IF_SM2 is specified is * that any operand with unspecified size in the template is * required to have unspecified size in the instruction too...) + * + * iflags_t is defined to store these flags. 
*/ #define IF_SM 0x00000001UL /* size match */ @@ -103,33 +105,34 @@ extern const uint8_t nasm_bytecodes[]; #define IF_LONG 0x00001000UL /* long mode instruction */ #define IF_NOHLE 0x00002000UL /* HLE prefixes forbidden */ /* These flags are currently not used for anything - intended for insn set */ -#define IF_UNDOC 0x00000000UL /* it's an undocumented instruction */ -#define IF_FPU 0x00000000UL /* it's an FPU instruction */ -#define IF_MMX 0x00000000UL /* it's an MMX instruction */ -#define IF_3DNOW 0x00000000UL /* it's a 3DNow! instruction */ -#define IF_SSE 0x00000000UL /* it's a SSE (KNI, MMX2) instruction */ -#define IF_SSE2 0x00000000UL /* it's a SSE2 instruction */ -#define IF_SSE3 0x00000000UL /* it's a SSE3 (PNI) instruction */ -#define IF_VMX 0x00000000UL /* it's a VMX instruction */ -#define IF_SSSE3 0x00000000UL /* it's an SSSE3 instruction */ -#define IF_SSE4A 0x00000000UL /* AMD SSE4a */ -#define IF_SSE41 0x00000000UL /* it's an SSE4.1 instruction */ -#define IF_SSE42 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_SSE5 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_AVX 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_AVX2 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_AVX512 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_FMA 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_BMI1 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_BMI2 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_TBM 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_HLE 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_RTM 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ -#define IF_INVPCID 0x00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ +#define IF_UNDOC 0x8000000000UL /* it's an undocumented instruction */ +#define IF_HLE 0x4000000000UL /* HACK NEED TO REORGANIZE THESE BITS */ +#define IF_FPU 
0x0100000000UL /* it's an FPU instruction */ +#define IF_MMX 0x0200000000UL /* it's an MMX instruction */ +#define IF_3DNOW 0x0300000000UL /* it's a 3DNow! instruction */ +#define IF_SSE 0x0400000000UL /* it's a SSE (KNI, MMX2) instruction */ +#define IF_SSE2 0x0500000000UL /* it's a SSE2 instruction */ +#define IF_SSE3 0x0600000000UL /* it's a SSE3 (PNI) instruction */ +#define IF_VMX 0x0700000000UL /* it's a VMX instruction */ +#define IF_SSSE3 0x0800000000UL /* it's an SSSE3 instruction */ +#define IF_SSE4A 0x0900000000UL /* AMD SSE4a */ +#define IF_SSE41 0x0A00000000UL /* it's an SSE4.1 instruction */ +#define IF_SSE42 0x0B00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ +#define IF_SSE5 0x0C00000000UL /* HACK NEED TO REORGANIZE THESE BITS */ +#define IF_AVX 0x0D00000000UL /* it's an AVX (128b) instruction */ +#define IF_AVX2 0x0E00000000UL /* it's an AVX2 (256b) instruction */ +#define IF_AVX512 0x0F00000000UL /* it's an AVX-512 (512b) instruction */ +#define IF_FMA 0x1000000000UL /* HACK NEED TO REORGANIZE THESE BITS */ +#define IF_BMI1 0x1100000000UL /* HACK NEED TO REORGANIZE THESE BITS */ +#define IF_BMI2 0x1200000000UL /* HACK NEED TO REORGANIZE THESE BITS */ +#define IF_TBM 0x1300000000UL /* HACK NEED TO REORGANIZE THESE BITS */ +#define IF_RTM 0x1400000000UL /* HACK NEED TO REORGANIZE THESE BITS */ +#define IF_INVPCID 0x1500000000UL /* HACK NEED TO REORGANIZE THESE BITS */ +#define IF_INSMASK 0xFF00000000UL /* the mask for instruction set types */ #define IF_PMASK 0xFF000000UL /* the mask for processor types */ #define IF_PLEVEL 0x0F000000UL /* the mask for processor instr. 
level */ /* also the highest possible processor */ -#define IF_PFMASK 0xF01FF800UL /* the mask for disassembly "prefer" */ +#define IF_PFMASK 0xFFF0000000UL /* the mask for disassembly "prefer" */ #define IF_8086 0x00000000UL /* 8086 instruction */ #define IF_186 0x01000000UL /* 186+ instruction */ #define IF_286 0x02000000UL /* 286+ instruction */ diff --git a/insns.pl b/insns.pl index eb99f6b..60f7dd3 100755 --- a/insns.pl +++ b/insns.pl @@ -427,6 +427,10 @@ sub format_insn($$$$$) { my $num, $nd = 0; my @bytecode; my $op, @ops, $opp, @opx, @oppx, @decos, @opevex; + my @iflags = ( "FPU", "MMX", "3DNOW", "SSE", "SSE2", + "SSE3", "VMX", "SSSE3", "SSE4A", "SSE41", + "SSE42", "SSE5", "AVX", "AVX2", "AVX512", + "FMA", "BMI1", "BMI2", "TBM", "RTM", "INVPCID"); return (undef, undef) if $operands eq "ignore"; @@ -476,6 +480,17 @@ sub format_insn($$$$$) { } $decorators =~ tr/a-z/A-Z/; + # check if two different insn set types are set + $cnt = 0; + foreach $fla (split(/,/, $flags)) { + if ($fla ~~ @iflags) { + $cnt++; + if ($cnt >= 2) { + die "Too many insn set flags in $flags\n"; + } + } + } + # format the flags $flags =~ s/,/|IF_/g; $flags =~ s/(\|IF_ND|IF_ND\|)//, $nd = 1 if $flags =~ /IF_ND/; diff --git a/nasm.c b/nasm.c index 126f271..3a0c050 100644 --- a/nasm.c +++ b/nasm.c @@ -74,7 +74,7 @@ struct forwrefinfo { /* info held on forward refs. 
*/ }; static int get_bits(char *value); -static uint32_t get_cpu(char *cpu_str); +static iflags_t get_cpu(char *cpu_str); static void parse_cmdline(int, char **); static void assemble_file(char *, StrList **); static void nasm_verror_gnu(int severity, const char *fmt, va_list args); @@ -106,8 +106,8 @@ static FILE *error_file; /* Where to write error messages */ FILE *ofile = NULL; int optimizing = MAX_OPTIMIZE; /* number of optimization passes to take */ static int sb, cmd_sb = 16; /* by default */ -static uint32_t cmd_cpu = IF_PLEVEL; /* highest level by default */ -static uint32_t cpu = IF_PLEVEL; /* passed to insn_size & assemble.c */ +static iflags_t cmd_cpu = IF_PLEVEL; /* highest level by default */ +static iflags_t cpu = IF_PLEVEL; /* passed to insn_size & assemble.c */ int64_t global_offset_changed; /* referenced in labels.c */ int64_t prev_offset_changed; int32_t stall_count; @@ -2006,7 +2006,7 @@ static void usage(void) fputs("type `nasm -h' for help\n", error_file); } -static uint32_t get_cpu(char *value) +static iflags_t get_cpu(char *value) { if (!strcmp(value, "8086")) return IF_8086; diff --git a/nasm.h b/nasm.h index fc5a18d..72986ee 100644 --- a/nasm.h +++ b/nasm.h @@ -694,6 +694,8 @@ typedef struct insn { /* an instruction itself */ enum geninfo { GI_SWITCH }; +typedef uint64_t iflags_t; + /* * The data structure defining an output format driver, and the * interfaces to the functions therein. diff --git a/ndisasm.c b/ndisasm.c index 710d1f0..638299f 100644 --- a/ndisasm.c +++ b/ndisasm.c @@ -88,7 +88,7 @@ int main(int argc, char **argv) bool autosync = false; int bits = 16, b; bool eof = false; - uint32_t prefer = 0; + iflags_t prefer = 0; bool rn_error; int32_t offset; FILE *fp; -- 1.7.9.5 |
From: Jin K. S. <jin...@in...> - 2013-08-27 03:29:59
High-16 registers of XMM and YMM need to be encoded with EVEX not VEX.
Even if all the operand types match with VEX instruction format,
it should use EVEX instead.

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 assemble.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/assemble.c b/assemble.c
index c22075d..b0d4571 100644
--- a/assemble.c
+++ b/assemble.c
@@ -191,6 +191,7 @@ enum match_result {
     MERR_BADCPU,
     MERR_BADMODE,
     MERR_BADHLE,
+    MERR_ENCMISMATCH,
     /*
      * Matching success; the conditional ones first
      */
@@ -1233,6 +1234,10 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,
         if (bits != 64 && ((ins->rex & bad32) || ins->vexreg > 7)) {
             errfunc(ERR_NONFATAL, "invalid operands in non-64-bit mode");
             return -1;
+        } else if (!(ins->rex & REX_EV) &&
+                   ((ins->vexreg > 15) || (ins->evex_p[0] & 0xf0))) {
+            errfunc(ERR_NONFATAL, "invalid high-16 register in non-AVX-512");
+            return -1;
         }
         if (ins->rex & REX_EV)
             length += 4;
@@ -2147,6 +2152,9 @@ static enum match_result matches(const struct itemplate *itemp,
                  */
                 opsizemissing = true;
             }
+        } else if (instruction->oprs[i].basereg >= 16 &&
+                   (itemp->flags & IF_INSMASK) != IF_AVX512) {
+            return MERR_ENCMISMATCH;
         }
     }
 
-- 
1.7.9.5
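The condition this patch adds to matches() can be isolated in a small sketch. The helper template_accepts_reg is hypothetical, and it treats the operand's register as a plain number (an assumption; the patch compares the operand's basereg field). The constants are the values posted in patch 6/7, with a ULL suffix added here.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t iflags_t;

/* Values from the insns.h hunk in patch 6/7 (ULL suffix is this sketch's change). */
#define IF_AVX     0x0D00000000ULL
#define IF_AVX512  0x0F00000000ULL
#define IF_INSMASK 0xFF00000000ULL

/* Hypothetical helper: a register numbered 16 or above is accepted only
 * when the template's instruction-set field is IF_AVX512. */
static bool template_accepts_reg(iflags_t template_flags, int regnum)
{
    if (regnum >= 16 && (template_flags & IF_INSMASK) != IF_AVX512)
        return false; /* the patch returns MERR_ENCMISMATCH at this point */
    return true;
}
```

Under this rule a VEX template and an EVEX template with identical operand types are told apart by the register range alone, which is the "vmovq xmm30,xmm29" case described in the thread.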
From: Jin K. S. <jin...@in...> - 2013-08-27 03:29:57
From gas testsuite file, a text file containing raw bytecodes
is useful when verifying the output of NASM.

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 test/gas2nasm.py |   11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/test/gas2nasm.py b/test/gas2nasm.py
index de16745..a00af92 100755
--- a/test/gas2nasm.py
+++ b/test/gas2nasm.py
@@ -21,6 +21,9 @@ def setup():
     parser.add_option('-b', dest='bits', action='store',
             default="",
             help='Bits for output ASM file.')
+    parser.add_option('-r', dest='raw_output', action='store',
+            default="",
+            help='Name for raw output bytes in text')
     (options, args) = parser.parse_args()
     return options
 
@@ -77,11 +80,19 @@ def write(data, options):
             outstr = outstrfmt % tuple(insn)
             out.write(outstr)
 
+def write_rawbytes(data, options):
+    if options.raw_output:
+        with open(options.raw_output, 'wb') as out:
+            for insn in data:
+                out.write(insn[0] + '\n')
+
 if __name__ == "__main__":
     options = setup()
     recs = read(options)
     print "AVX3.1 instructions"
 
+    write_rawbytes(recs, options)
+
     recs = commas(recs)
 
     write(recs, options)
-- 
1.7.9.5
From: Jin K. S. <jin...@in...> - 2013-08-27 03:29:57
Fixed a bug that derived an incorrect N value for tuple types
of T2, T4, T8.

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 assemble.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/assemble.c b/assemble.c
index 313ff8a..baae15f 100644
--- a/assemble.c
+++ b/assemble.c
@@ -2257,7 +2257,7 @@ static bool is_disp8n(operand *input, insn *ins, int8_t *compdisp)
         if (vectlen + 7 <= (evex_w + 5) + (tuple - T2 + 1))
             n = 0;
         else
-            n = 1 << (tuple - T2 + evex_w + 3 + 1);
+            n = 1 << (tuple - T2 + evex_w + 3);
         break;
     case HVM:
     case QVM:
-- 
1.7.9.5
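The corrected shift can be checked with a little arithmetic: a T2/T4/T8 tuple holds 2, 4 or 8 elements of 4 bytes (EVEX.W = 0) or 8 bytes (EVEX.W = 1), so N = 2^(tuple - T2 + 1) * 2^(2 + evex_w) = 1 << (tuple - T2 + evex_w + 3). The sketch below is a standalone illustration; the enum tuple_kind and the function disp8_n are hypothetical stand-ins, and the vector-length guard that precedes the shift in is_disp8n() is deliberately left out.

```c
/* Hypothetical stand-ins for the tuple-type constants used in is_disp8n();
 * only their consecutive ordering matters for the arithmetic. */
enum tuple_kind { T2, T4, T8 };

/* N in bytes for tuple types T2/T4/T8, using the corrected shift.
 * evex_w is 0 for 32-bit elements and 1 for 64-bit elements. */
static unsigned disp8_n(enum tuple_kind tuple, unsigned evex_w)
{
    return 1u << ((unsigned)(tuple - T2) + evex_w + 3);
}
```

With the old constant 4, every result would be twice as large, for example 16 bytes instead of 8 for a T2 tuple of 32-bit elements.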
From: Jin K. S. <jin...@in...> - 2013-08-27 03:29:56
Since embedded rounding mode is following the last SIMD op,
GPR op should be skipped when finding the last SIMD op.

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 assemble.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/assemble.c b/assemble.c
index 4f0cd9c..313ff8a 100644
--- a/assemble.c
+++ b/assemble.c
@@ -1159,6 +1159,8 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,
                     rfield = nasm_regvals[opx->basereg];
                     /* find the last SIMD operand where ER decorator resides */
                     oplast = &ins->oprs[op1 > op2 ? op1 : op2];
+                    while (oplast && is_class(REG_CLASS_GPR, oplast->type))
+                        oplast--;
                 } else {
                     rflags = 0;
                     rfield = c & 7;
-- 
1.7.9.5
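The backward scan described in this patch can be sketched on a simplified operand array. The struct sketch_operand and the function last_simd_index are hypothetical; the real code walks NASM's operand structs with is_class(REG_CLASS_GPR, ...). Using an index with an explicit lower bound is a choice made for this sketch, not something taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical, simplified operand: only records whether it is a GPR. */
struct sketch_operand {
    bool is_gpr;
};

/* Return the index of the last non-GPR operand at or before `start`,
 * or -1 if every operand up to `start` is a GPR. */
static int last_simd_index(const struct sketch_operand *ops, int start)
{
    int i = start;
    while (i >= 0 && ops[i].is_gpr)
        i--;
    return i;
}
```

For an operand list shaped like VCVTSI2SD xmm1, xmm2, r/m64, starting at the last operand (index 2) the scan stops at index 1, the second XMM operand, which is where the {er} decorator sits in the gas-compatible syntax.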
From: Jin K. S. <jin...@in...> - 2013-08-27 03:29:55
|
This is for following the current syntax used in gas even though
this is not SDM conforming. According to SDM, {er} should follow
the last GPR op not SIMD op.

e.g. SDM  : VCVTSI2SD xmm1, xmm2, r/m64{er}
     NASM : VCVTSI2SD xmm1, xmm2{er}, r/m64

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 insns.dat | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/insns.dat b/insns.dat
index 320280a..7a0ec60 100644
--- a/insns.dat
+++ b/insns.dat
@@ -3504,10 +3504,10 @@ VCVTSD2SI reg64,xmmrm64|er [rm:t1f64:
 VCVTSD2SS xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.f2.0f.w1 5a /r ] AVX512,FUTURE
 VCVTSD2USI reg32,xmmrm64|er [rm:t1f64: evex.lig.f2.0f.w0 79 /r ] AVX512,FUTURE
 VCVTSD2USI reg64,xmmrm64|er [rm:t1f64: evex.lig.f2.0f.w1 79 /r ] AVX512,FUTURE
-VCVTSI2SD xmmreg,xmmreg,rm32|er [rvm:t1s: evex.nds.lig.f2.0f.w0 2a /r ] AVX512,FUTURE
-VCVTSI2SD xmmreg,xmmreg,rm64|er [rvm:t1s: evex.nds.lig.f2.0f.w1 2a /r ] AVX512,FUTURE
-VCVTSI2SS xmmreg,xmmreg,rm32|er [rvm:t1s: evex.nds.lig.f3.0f.w0 2a /r ] AVX512,FUTURE
-VCVTSI2SS xmmreg,xmmreg,rm64|er [rvm:t1s: evex.nds.lig.f3.0f.w1 2a /r ] AVX512,FUTURE
+VCVTSI2SD xmmreg,xmmreg|er,rm32 [rvm:t1s: evex.nds.lig.f2.0f.w0 2a /r ] AVX512,FUTURE
+VCVTSI2SD xmmreg,xmmreg|er,rm64 [rvm:t1s: evex.nds.lig.f2.0f.w1 2a /r ] AVX512,FUTURE
+VCVTSI2SS xmmreg,xmmreg|er,rm32 [rvm:t1s: evex.nds.lig.f3.0f.w0 2a /r ] AVX512,FUTURE
+VCVTSI2SS xmmreg,xmmreg|er,rm64 [rvm:t1s: evex.nds.lig.f3.0f.w1 2a /r ] AVX512,FUTURE
 VCVTSS2SD xmmreg|mask|z,xmmreg,xmmrm32|sae [rvm:t1s: evex.nds.lig.f3.0f.w0 5a /r ] AVX512,FUTURE
 VCVTSS2SI reg32,xmmrm32|er [rm:t1f32: evex.lig.f3.0f.w0 2d /r ] AVX512,FUTURE
 VCVTSS2SI reg64,xmmrm32|er [rm:t1f32: evex.lig.f3.0f.w1 2d /r ] AVX512,FUTURE
@@ -3527,10 +3527,10 @@ VCVTTSS2USI reg32,xmmrm32|sae [rm:t1f32:
 VCVTTSS2USI reg64,xmmrm32|sae [rm:t1f32: evex.lig.f3.0f.w1 78 /r ] AVX512,FUTURE
 VCVTUDQ2PD zmmreg|mask|z,ymmrm256|b32|er [rm:hv: evex.512.f3.0f.w0 7a /r ] AVX512,FUTURE
 VCVTUDQ2PS zmmreg|mask|z,zmmrm512|b32|er [rm:fv: evex.512.f2.0f.w0 7a /r ] AVX512,FUTURE
-VCVTUSI2SD xmmreg,xmmreg,rm32|er [rvm:t1s: evex.nds.lig.f2.0f.w0 7b /r ] AVX512,FUTURE
-VCVTUSI2SD xmmreg,xmmreg,rm64|er [rvm:t1s: evex.nds.lig.f2.0f.w1 7b /r ] AVX512,FUTURE
-VCVTUSI2SS xmmreg,xmmreg,rm32|er [rvm:t1s: evex.nds.lig.f3.0f.w0 7b /r ] AVX512,FUTURE
-VCVTUSI2SS xmmreg,xmmreg,rm64|er [rvm:t1s: evex.nds.lig.f3.0f.w1 7b /r ] AVX512,FUTURE
+VCVTUSI2SD xmmreg,xmmreg|er,rm32 [rvm:t1s: evex.nds.lig.f2.0f.w0 7b /r ] AVX512,FUTURE
+VCVTUSI2SD xmmreg,xmmreg|er,rm64 [rvm:t1s: evex.nds.lig.f2.0f.w1 7b /r ] AVX512,FUTURE
+VCVTUSI2SS xmmreg,xmmreg|er,rm32 [rvm:t1s: evex.nds.lig.f3.0f.w0 7b /r ] AVX512,FUTURE
+VCVTUSI2SS xmmreg,xmmreg|er,rm64 [rvm:t1s: evex.nds.lig.f3.0f.w1 7b /r ] AVX512,FUTURE
 VDIVPD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f.w1 5e /r ] AVX512,FUTURE
 VDIVPS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.0f.w0 5e /r ] AVX512,FUTURE
 VDIVSD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.f2.0f.w1 5e /r ] AVX512,FUTURE
@@ -3548,6 +3548,7 @@ VEXTRACTI32X4 xmmreg|mask|z,zmmreg,imm8 [mri: e
 VEXTRACTI64X4 mem256|mask,zmmreg,imm8 [mri:t4: evex.512.66.0f3a.w1 3b /r ib ] AVX512,FUTURE
 VEXTRACTI64X4 ymmreg|mask|z,zmmreg,imm8 [mri: evex.512.66.0f3a.w1 3b /r ib ] AVX512,FUTURE
 VEXTRACTPS rm32,xmmreg,imm8 [mri:t1s: evex.128.66.0f3a.wig 17 /r ib ] AVX512,FUTURE
+VEXTRACTPS rm64,xmmreg,imm8 [mri:t1s: evex.128.66.0f3a.w1 17 /r ib ] AVX512,FUTURE
 VFIXUPIMMPD zmmreg|mask|z,zmmreg,zmmrm512|b64|sae,imm8 [rvmi:fv: evex.nds.512.66.0f3a.w1 54 /r ib ] AVX512,FUTURE
 VFIXUPIMMPS zmmreg|mask|z,zmmreg,zmmrm512|b32|sae,imm8 [rvmi:fv: evex.nds.512.66.0f3a.w0 54 /r ib ] AVX512,FUTURE
 VFIXUPIMMSD xmmreg|mask|z,xmmreg,xmmrm64|sae,imm8 [rvmi:t1s: evex.nds.lig.66.0f3a.w1 55 /r ib ] AVX512,FUTURE
--
1.7.9.5
|
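Editorial sketch (not part of the patch): the operand-order difference described in the commit message can be pictured as a small rewrite from the SDM placement to the gas/NASM placement. The function name, the register regex and the use of plain operand strings are all assumptions made for illustration. The decorator spellings {rn-sae}, {rd-sae}, {ru-sae} and {rz-sae} are the concrete forms that the {er} placeholder stands for.

```python
import re

# Hypothetical patterns, for illustration only.
SIMD_REG = re.compile(r"^[xyz]mm\d+$", re.IGNORECASE)
ER_DECORATOR = re.compile(r"\{(rn|rd|ru|rz)-sae\}$", re.IGNORECASE)


def sdm_to_nasm_er(operands):
    """Move a trailing embedded-rounding decorator from the last operand
    (SDM order) to the last SIMD register operand (gas/NASM order)."""
    ops = [op.strip() for op in operands]
    match = ER_DECORATOR.search(ops[-1])
    if not match:
        return ops  # no decorator on the last operand, nothing to move
    decorator = match.group(0)
    ops[-1] = ops[-1][:match.start()].rstrip()
    # Walk backwards to find the last operand that is a SIMD register.
    for i in range(len(ops) - 1, -1, -1):
        if SIMD_REG.match(ops[i]):
            ops[i] += decorator
            return ops
    # No SIMD operand at all: restore the decorator where it was.
    ops[-1] += decorator
    return ops


print(sdm_to_nasm_er(["xmm1", "xmm2", "rax{rn-sae}"]))
```

When the last operand is itself a SIMD register the decorator stays where it is, which is why only the VCVTSI2SD/SS and VCVTUSI2SD/SS entries needed to change.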
From: Jin K. S. <jin...@in...> - 2013-08-27 03:29:54
|
Please review and pull patches from:

  git://repo.or.cz/nasm/avx512.git

A test case is added and checked against. Quite a few bugs are fixed in
this series of patches.

Gas and other tools expect the embedded rounding decorator to be located
next to the last SIMD operand, but this differs from what the AVX-512 spec
says. Currently NASM is implemented to be compatible with existing tools
such as gas.

The instruction flags (IF_*) in insns.dat ran out of space for accommodating
the increasing number of instruction set types such as AVX-512, so the data
type size is increased from 32 bits to 64 bits.

Jin Kyu Song (7):
  AVX-512: Add a test case for EVEX encoded instructions
  AVX-512: Moved {er} decorator position next to the last SIMD op
  AVX-512: Find the correct position of the last SIMD op
  AVX-512: Add a feature to generate a raw bytecode file
  AVX-512: Fix a bug in calculating Disp8*N value
  AVX-512: Change the data type for instruction flags
  AVX-512: Fix match function to check the range of registers

 assemble.c | 18 +-
 assemble.h | 4 +-
 disasm.c | 4 +-
 disasm.h | 2 +-
 insns.dat | 63 +-
 insns.h | 53 +-
 insns.pl | 15 +
 nasm.c | 8 +-
 nasm.h | 2 +
 ndisasm.c | 2 +-
 test/avx512f.asm | 9175 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 test/gas2nasm.py | 107 +
 12 files changed, 9383 insertions(+), 70 deletions(-)
 create mode 100644 test/avx512f.asm
 create mode 100755 test/gas2nasm.py
--
1.7.9.5
|
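Editorial sketch (not part of the series): the reason for widening the flags word can be illustrated with a toy allocator that gives every flag its own bit. This is a deliberate simplification. The real IF_* layout in insns.h may pack some values as multi-bit fields, and the flag names below are made up.

```python
def assign_flag_bits(flag_names, width):
    """Give each flag its own bit; fail when the word is too narrow."""
    if len(flag_names) > width:
        raise OverflowError(
            f"{len(flag_names)} flags do not fit in {width} bits")
    return {name: 1 << bit for bit, name in enumerate(flag_names)}


# 33 made-up flag names: one more than a 32-bit word can hold.
example_flags = [f"IF_EXAMPLE{n}" for n in range(33)]

wide = assign_flag_bits(example_flags, 64)   # fits
print(hex(wide["IF_EXAMPLE32"]))             # bit 32 needs a 64-bit word
```

Calling `assign_flag_bits(example_flags, 32)` raises `OverflowError`, which is the situation the cover letter describes.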
From: Song, J. K. <jin...@in...> - 2013-08-26 12:25:43
|
At first I considered the smac way of counting braces. But for better flexibility
of macros, I decided not to follow that way. When the number of braces in a
grouped parameter is odd, the macro expander would be confused again.
This example might not be a good one, but it shows what I was thinking about.

=== example ===
%macro mmacro 1
vcvtph2ps zmm1{%1
%endmacro
mmacro {k1\}\{z\},zmm3}
=== result ====
vcvtph2ps zmm1{k1}{z},zmm3
===============

This is why I chose to use backslash escaping: it removes any special meaning
from braces in a grouped parameter. It also gives the same benefit to smac code.

=== example ===
%define smacro(x) vaddpd zmm1{x,zmm3
smacro({k1\}\{z\},zmm2})
=== result ====
vaddpd zmm1{k1}{z},zmm2,zmm3
===============

Please note that the current patch searches only for the two specific strings
"\{" and "\}", so it is unlikely to break existing code that uses backslashes
in macro parameters.

Please let me know if the case I am concerned about, as shown above, does not
actually exist.

Thanks,
Jin

> -----Original Message-----
> From: anonymous coward [mailto:nas...@us...]
> Sent: Sunday, August 25, 2013 12:35 PM
> To: nas...@li...
> Subject: Re: [Nasm-devel] [PATCH 0/6] AVX-512: Bug fixes and additional
> features
>
> Instead of trying to introduce backslash escaping, you want to
> fix the mmac code to match the smac code, i.e. keep a count
> of curly braces.
>
> === example ===
>
> %define smacro(x) [x]
>
> smacro ({{a,b}})
>
> %macro mmacro 1
> <%1>
> %endmacro
>
> mmacro {{a,b}}
>
> === current ===
>
> %line 2+1 0.asm
>
> [{a,b}]
>
> %line 8+1 0.asm
>
> 0.asm:9: error: braces do not enclose all of macro parameter
> <{a,b>
>
> === desired ===
>
> %line 2+1 0.asm
>
> [{a,b}]
>
> %line 8+1 0.asm
>
> <{a,b}>
>
> _______________________________________________
> Nasm-devel mailing list
> Nas...@li...
> https://lists.sourceforge.net/lists/listinfo/nasm-devel
|
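Editorial sketch: the objection to brace counting in this message comes down to parameters whose braces are not balanced on their own. A minimal depth check, with a hypothetical function name, shows why such a parameter cannot be delimited by counting alone.

```python
def brace_balance(text):
    """Return the final nesting depth of the braces in text,
    or None if a '}' appears with no '{' left to close."""
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return None
    return depth


print(brace_balance("{a,b}"))        # balanced parameter
print(brace_balance("k1}{z},zmm3"))  # parameter from the mmacro example
```

The parameter `k1}{z},zmm3` closes a brace it never opened (the opening one lives in the macro body), so a counting scheme has no way to tell where the grouped parameter ends.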
From: C. B. <cbe...@pa...> - 2013-08-25 23:49:55
|
Hi

Anyone interested to take on some contract work to help us add AVX-512 or
Xeon PHI support to llvm mc or yasm-nextgen?

https://github.com/yasm/yasm-nextgen/

(We may have other work for adding various other vector extensions as well
if interested..)

Thanks

./C
|
From: anonymous c. <nas...@us...> - 2013-08-25 19:44:32
|
>> I think NASM way makes more sense because each data element size is
>> 64bits(QWORD) not ZWORD. But it is also true that
>
> I can't say for gas (better to ask gas developers then why zword there).
> Still using QWORD for nasm looks sane for me. Lets wait for people
> opinions.

PD --> packed double --> 8 byte elements --> QWORD gather accesses
|
From: anonymous c. <nas...@us...> - 2013-08-25 19:35:29
|
Instead of trying to introduce backslash escaping, you want to
fix the mmac code to match the smac code, i.e. keep a count
of curly braces.

=== example ===

%define smacro(x) [x]

smacro ({{a,b}})

%macro mmacro 1
<%1>
%endmacro

mmacro {{a,b}}

=== current ===

%line 2+1 0.asm

[{a,b}]

%line 8+1 0.asm

0.asm:9: error: braces do not enclose all of macro parameter
<{a,b>

=== desired ===

%line 2+1 0.asm

[{a,b}]

%line 8+1 0.asm

<{a,b}>
|
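Editorial sketch: the "keep a count of curly braces" idea can be written as a splitter that only treats a comma as a separator at depth zero and strips one enclosing brace pair per parameter. This is a standalone illustration with hypothetical names, not the NASM preprocessor code.

```python
def _strip_outer(param):
    """Strip one enclosing brace pair, but only if the first '{'
    is closed by the very last character."""
    if not (param.startswith("{") and param.endswith("}")):
        return param
    depth = 0
    for i, ch in enumerate(param):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0 and i != len(param) - 1:
                return param  # e.g. "{a}{b}": the pairs are siblings
    return param[1:-1]


def split_params_counting(arg_text):
    """Split macro arguments on commas that sit at brace depth zero."""
    params, current, depth = [], [], 0
    for ch in arg_text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched closing brace")
        if ch == "," and depth == 0:
            params.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError("unmatched opening brace")
    params.append("".join(current).strip())
    return [_strip_outer(p) for p in params]


print(split_params_counting("{{a,b}}"))
```

With this scheme `mmacro {{a,b}}` yields the single parameter `{a,b}`, matching the "desired" output above, while any parameter with unbalanced braces is rejected.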
From: anonymous c. <nas...@us...> - 2013-08-25 19:11:10
|
> Multi-line macro uses curly braces for enclosing a parameter
> containing comma(s). Passing curly braces as a part of a parameter
> which is already enclosed with braces confuses the macro expander.
>
> Escape character '\' is prefixed in this case.
> e.g.) mmacro {1,2,3}, {4,\{5,6\}}
> mmacro gets 2 parameters of '1,2,3' and '4,{5,6}'
>
> Signed-off-by: Jin Kyu Song <jin...@in...>

Yes, curly braces inside mmac params should be handled properly.

But no, you really do not want to introduce backslash escaping --
it breaks existing code that has backslashes in mmac params.

Also, there is no need to mess around with the curly brace code of
the preprocessor when it comes to AVX-512 -- the preprocessor has
no concept of {xxx} modifiers; from its perspective that's just a
curly brace, followed by xxx, followed by another curly brace.

Only the (assembler's) parser needs to understand {xxx} modifiers,
and it's really trivial to handle them there.
|
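Editorial sketch: the behaviour described in the quoted patch (an escaped brace is a literal character, an unescaped one groups a parameter) can be modelled as follows. The function name is hypothetical and the code is a standalone model of the description, not the patch itself.

```python
def split_params_escaped(arg_text):
    """Split macro arguments on commas outside a {...} group.
    The two-character sequences "\\{" and "\\}" produce literal braces;
    any other backslash is left untouched."""
    params, current = [], []
    in_group = False
    i = 0
    while i < len(arg_text):
        ch = arg_text[i]
        if ch == "\\" and arg_text[i + 1:i + 2] in ("{", "}"):
            current.append(arg_text[i + 1])  # literal brace, no grouping
            i += 2
            continue
        if ch == "{" and not in_group:
            in_group = True          # opening brace of a grouped parameter
        elif ch == "}" and in_group:
            in_group = False         # closing brace of a grouped parameter
        elif ch == "," and not in_group:
            params.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    params.append("".join(current).strip())
    return params


print(split_params_escaped(r"{1,2,3}, {4,\{5,6\}}"))
```

Because only the sequences `\{` and `\}` are special in this model, a parameter such as `a\b` passes through unchanged. Whether existing code relies on a literal backslash directly before a brace is the compatibility question raised in the reply.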
From: anonymous c. <nas...@us...> - 2013-08-25 18:07:46
|
>> In terms of modifier placement you probably want to look at
>> gas again -- I have seen code which has modifiers before an
>> operand, and I have seen code which has them as their own
>> operand. For example, {one} op1, {two}, op2.
>
> Could you explain a little bit more about this? Is it regarding
> {er} and {sae} that are put as if they are separate operands?

In short, yes. For a longer more detailed background, read on.

With L1OM and K1OM, Intel picked a specific operand syntax:

  {transform} ( operand {nt} {eh} ) {mask}

In particular:

  L1OM = {sss,ccccc} ( operand {nt} ) {kkk}
  K1OM = {sss} ( operand {eh} ) {kkk}

As a result you can face a bunch of non-compliant asm code:

- incorrect source operand has transform modifier
- non-destination operand has mask modifier
- transform modifier specified after operand
- non-temporal or eviction hint specified before operand
- mask modifier specified before operand
- transform modifier operand not preceded by "("
- transform modifier operand not followed by ")"
- non-temporal or eviction hint specified after ")", not before
- mask modifier specified before ")", not after
- transform modifier invalid for memory operand
- transform modifier invalid for register operand
- modifier specified as extra operand

As well as these "modifier used as an operand" cases:

- only operand --> ignored
- leading operand --> applied to next operand
- trailing operand --> applied to previous operand
- in between operands --> applied to previous operand

With AVX-512 Intel failed to prescribe a specific operand syntax.
So the modifiers can go anywhere -- before or after any operand,
or as their own operands.
|
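Editorial sketch: the four "modifier used as an operand" cases listed in this message can be expressed as a small attachment pass over an operand list. The function name and the string-based operand representation are assumptions for illustration and are not taken from any assembler's parser.

```python
def attach_modifiers(operands):
    """Apply the 'modifier used as an operand' cases from the message.
    Returns a list of (operand, [modifiers]) pairs."""
    def is_modifier(op):
        return op.startswith("{") and op.endswith("}")

    if all(is_modifier(op) for op in operands):
        return []  # modifier is the only operand: ignored

    result = []
    pending = []  # leading modifiers waiting for the next real operand
    for op in operands:
        if is_modifier(op):
            if result:
                # in between or trailing: applied to the previous operand
                result[-1][1].append(op)
            else:
                # leading: applied to the next operand
                pending.append(op)
        else:
            result.append((op, pending))
            pending = []
    return result


print(attach_modifiers(["{one}", "op1", "{two}", "op2"]))
```

A trailing `{sae}` given as its own operand ends up on the operand before it, which is the placement question the thread keeps returning to for {er} and {sae}.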
From: Cyrill G. <gor...@gm...> - 2013-08-23 18:27:37
|
On Fri, Aug 23, 2013 at 06:03:36PM +0000, Song, Jin Kyu wrote:
> I found one discrepancy between NASM and gas regarding the size specifier.
> In insns.dat, VGATHERQPD/DPD expects 64bits(QWORD), if specified, specifier for VSIB operand.
> NASM : VGATHERQPD xmmreg,xmem64,xmmreg [rmv: vm64x vex.dds.128.66.0f38.w1 93 /r] FUTURE,AVX2
> VGATHERDPD zmmreg|mask,ymem64 [rm:t1s: vsiby evex.512.66.0f38.w1 92 /r ] AVX512,FUTURE
> But gas thinks it is a ZWORD.
> Gas : vgatherdpd zmm30{k1}, ZMMWORD PTR [r14+ymm31*8-123]
>
> I think NASM way makes more sense because each data element size is 64bits(QWORD) not ZWORD. But it is also true that
> the eventual data size gathered is up to ZWORD. Is this discrepancy made intentionally? Does it need to be fixed to
> conform with gas or just to stay same as it used to be?

I can't say for gas (better to ask gas developers then why zword there).
Still using QWORD for nasm looks sane for me. Lets wait for people opinions.
|
From: Song, J. K. <jin...@in...> - 2013-08-23 18:03:56
|
I found one discrepancy between NASM and gas regarding the size specifier.
In insns.dat, VGATHERQPD/DPD expect a 64-bit (QWORD) size specifier, if one is
given, for the VSIB operand.

NASM : VGATHERQPD xmmreg,xmem64,xmmreg [rmv: vm64x vex.dds.128.66.0f38.w1 93 /r] FUTURE,AVX2
       VGATHERDPD zmmreg|mask,ymem64 [rm:t1s: vsiby evex.512.66.0f38.w1 92 /r ] AVX512,FUTURE

But gas treats it as a ZWORD.

Gas : vgatherdpd zmm30{k1}, ZMMWORD PTR [r14+ymm31*8-123]

I think the NASM way makes more sense because each data element is 64 bits
(QWORD), not a ZWORD. But it is also true that the total data size gathered is
up to a ZWORD. Is this discrepancy intentional? Should it be changed to conform
with gas, or stay as it is?

- Jin
|
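Editorial sketch: the two views in this message differ only in whether the size specifier names one gathered element or the whole destination. The arithmetic, with a hypothetical helper and NASM's size keywords as labels:

```python
# NASM size keywords by byte count.
SIZE_NAMES = {1: "BYTE", 2: "WORD", 4: "DWORD", 8: "QWORD",
              16: "OWORD", 32: "YWORD", 64: "ZWORD"}


def gather_sizes(dest_bits, element_bits):
    """Describe a gather by per-element size and by total size."""
    return {
        "elements": dest_bits // element_bits,
        "per_element": SIZE_NAMES[element_bits // 8],
        "total": SIZE_NAMES[dest_bits // 8],
    }


# VGATHERDPD into a zmm register: packed doubles, 64 bits each.
print(gather_sizes(512, 64))
```

For a 512-bit destination of 64-bit elements this gives 8 elements, QWORD per element and ZWORD in total: the per-element view matches the NASM entries in insns.dat, while the total matches the `ZMMWORD PTR` spelling in the gas example.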
From: Cyrill G. <gor...@gm...> - 2013-08-22 20:54:50
|
On Thu, Aug 22, 2013 at 08:33:23PM +0000, Song, Jin Kyu wrote:
> >
> > One question -- you use TOK_BRACE for both { and } terms, won't it be better to
> > introduce two terms instead TOK_OPEN_BRACE and TOK_CLOSE_BRACE? How tokenizer
> > will handle statements like
> >
> > term \{ term \{
> >
> > it will be treated as non-error case? (I must admit I didn't yet review
> > the whole avx code :(
>
> Hi Cyrill,
>
> This case might be treated as an error in a parser not in a preprocessor if braces
> do not match even after expanding all macros. But this patch is for the multi-line macro preprocessing.
>
> I used TOK_BRACE for the braces inside the parameter - "\{" or "\}". So they should be
> handled as a part of normal string without any special meaning. The reason why I added
> a new token type is tok_is_()/ tok_isnt_() macros check if it is TOK_OTHER or not.
> #define tok_is_(x,v) (tok_type_((x), TOK_OTHER) && !strcmp((x)->text,(v)))
> #define tok_isnt_(x,v) ((x) && ((x)->type!=TOK_OTHER || strcmp((x)->text,(v))))
>
> "{" with TOK_OTHER : an opening brace of a parameter
> "{" with TOK_BRACE : same as any normal character. Originally it is "\{".
>
> So a new token type could easily avoid this type checking while holding a curly brace
> as a string. I chose this way to minimize the change. Maybe I need to rename the new
> token type because people may think the name of TOK_BRACE means the brace actually enclosing the macro parameter.
>
> At first I tried to change the parsing logic of preprocessor but that way led me to much bigger code change.

Yeah, preprocessor is already complex enough, so big changes are not
approved :-) I see Jin what you're implementing here, need to think.
Still this should not stop you, we always can update code and logic
before release.
|
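Editorial sketch: the effect of the extra token type can be modelled outside the preprocessor. The two predicates below mirror the quoted tok_is_() and tok_isnt_() macros, assuming that tok_type_() checks for a non-NULL token of the given type. The Token class and constants are stand-ins for illustration, not NASM's data structures.

```python
from dataclasses import dataclass

TOK_OTHER = "TOK_OTHER"
TOK_BRACE = "TOK_BRACE"  # token type name proposed in the patch


@dataclass
class Token:
    type: str
    text: str


def tok_is(tok, text):
    """Model of tok_is_(): a TOK_OTHER token with exactly this text."""
    return tok is not None and tok.type == TOK_OTHER and tok.text == text


def tok_isnt(tok, text):
    """Model of tok_isnt_(): a token that is not TOK_OTHER with this text."""
    return tok is not None and (tok.type != TOK_OTHER or tok.text != text)


grouping = Token(TOK_OTHER, "{")  # unescaped brace: opens a parameter
literal = Token(TOK_BRACE, "{")   # came from "\{": ordinary character

print(tok_is(grouping, "{"), tok_is(literal, "{"))
```

Since the existing checks only match TOK_OTHER, a brace stored with a different type is skipped by the grouping logic without touching that logic, which is the "minimize the change" argument made in the quoted message.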