From: Song, J. K. <jin...@in...> - 2013-08-22 20:33:52

> -----Original Message-----
> From: Cyrill Gorcunov [mailto:gor...@gm...]
> Sent: Thursday, August 22, 2013 8:45 AM
> To: Song, Jin Kyu
> Cc: nas...@li...; H. Peter Anvin
> Subject: Re: [Nasm-devel] [PATCH 0/6] AVX-512: Bug fixes and additional features
>
> On Wed, Aug 21, 2013 at 07:29:07PM -0700, Jin Kyu Song wrote:
> > Please review these patches and pull if they look good.
> > git://repo.or.cz/nasm/avx512.git
> >
> > After running a test case, various issues were found. One major thing is
> > curly brace already used for grouping multi-line macro parameters.
> > An escape backward slash character '\' is added when braces are passed
> > as a part of enclosed parameter. The test asm file used here is also included.
> >
> > Patch "AVX-512: Add a test case for EVEX encoded instructions" is
> > relatively huge. So I did not attach that patch in this email. Please refer to
> > http://repo.or.cz/w/nasm/avx512.git/commitdiff/a4a573c47f3d9ddfd5c2521804454327765f367e
>
> Hi Jin, I've picked up all patches and pushed them on avx512 branch, thanks a lot,
> good job!
>
> One question -- you use TOK_BRACE for both { and } terms, won't it be better to
> introduce two terms instead TOK_OPEN_BRACE and TOK_CLOSE_BRACE? How tokenizer
> will handle statements like
>
>     term \{ term \{
>
> it will be treated as non-error case? (I must admit I didn't yet review
> the whole avx code :(

Hi Cyrill,

If the braces still do not match after all macros are expanded, this case
would be treated as an error in the parser, not in the preprocessor.
This patch only concerns multi-line macro preprocessing.

I used TOK_BRACE for the braces inside the parameter, "\{" or "\}", so they
are handled as part of a normal string without any special meaning.
The reason I added a new token type is that the tok_is_()/tok_isnt_() macros
check whether the token is TOK_OTHER or not.

    #define tok_is_(x,v) (tok_type_((x), TOK_OTHER) && !strcmp((x)->text,(v)))
    #define tok_isnt_(x,v) ((x) && ((x)->type!=TOK_OTHER || strcmp((x)->text,(v))))

    "{" with TOK_OTHER : an opening brace of a parameter
    "{" with TOK_BRACE : same as any normal character. Originally it is "\{".

So a new token type easily avoids this type check while still holding
a curly brace as a string. I chose this way to minimize the change.
Maybe I need to rename the new token type, because people may think the name
TOK_BRACE means the brace that actually encloses the macro parameter.

At first I tried to change the parsing logic of the preprocessor, but that
led to a much bigger code change.

Thanks,
Jin
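
To make the TOK_OTHER / TOK_BRACE distinction above concrete, here is a minimal sketch. It is not NASM's tokenizer: the `struct token`, `lex_brace()` and `tok_is()` names are hypothetical stand-ins, and only the idea (an escaped brace gets its own token type, so a TOK_OTHER-only comparison never sees it as a grouping brace) is taken from the message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down token model; NASM's real Token has more fields. */
enum tok_type { TOK_OTHER, TOK_BRACE };

struct token {
    enum tok_type type;
    const char *text;
};

/* Same idea as tok_is_(): only TOK_OTHER tokens can match a delimiter. */
static bool tok_is(const struct token *t, const char *v)
{
    return t && t->type == TOK_OTHER && strcmp(t->text, v) == 0;
}

/*
 * Classify a brace at p. Returns the number of source characters consumed,
 * or 0 when p does not start with a (possibly escaped) brace.
 */
static size_t lex_brace(const char *p, struct token *out)
{
    if (p[0] == '\\' && (p[1] == '{' || p[1] == '}')) {
        out->type = TOK_BRACE;          /* escaped: ordinary text */
        out->text = (p[1] == '{') ? "{" : "}";
        return 2;
    }
    if (p[0] == '{' || p[0] == '}') {
        out->type = TOK_OTHER;          /* unescaped: grouping delimiter */
        out->text = (p[0] == '{') ? "{" : "}";
        return 1;
    }
    return 0;
}
```

With this model, `\{` yields a token whose text is "{" but for which `tok_is(&t, "{")` is false, which is the property the reply relies on.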
From: Cyrill G. <gor...@gm...> - 2013-08-22 15:44:44

On Wed, Aug 21, 2013 at 07:29:07PM -0700, Jin Kyu Song wrote:
> Please review these patches and pull if they look good.
> git://repo.or.cz/nasm/avx512.git
>
> After running a test case, various issues were found. One major thing is
> curly brace already used for grouping multi-line macro parameters.
> An escape backward slash character '\' is added when braces are passed
> as a part of enclosed parameter. The test asm file used here is also included.
>
> Patch "AVX-512: Add a test case for EVEX encoded instructions" is
> relatively huge. So I did not attach that patch in this email. Please refer to
> http://repo.or.cz/w/nasm/avx512.git/commitdiff/a4a573c47f3d9ddfd5c2521804454327765f367e

Hi Jin, I've picked up all patches and pushed them to the avx512 branch,
thanks a lot, good job!

One question -- you use TOK_BRACE for both { and } terms; wouldn't it be better
to introduce two terms instead, TOK_OPEN_BRACE and TOK_CLOSE_BRACE? How will the
tokenizer handle statements like

    term \{ term \{

Will it be treated as a non-error case? (I must admit I haven't yet reviewed
the whole avx code :(
From: Jin K. S. <jin...@in...> - 2013-08-22 02:30:18

When a memory reference operand is a destination, this could have
an opmask decorator as well.

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 parser.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/parser.c b/parser.c
index ccbce49..585abe2 100644
--- a/parser.c
+++ b/parser.c
@@ -758,17 +758,20 @@ is_expression:
                 recover = true;
             } else {            /* we got the required ] */
                 i = stdscan(NULL, &tokval);
-                if (i == TOKEN_DECORATOR) {
+                if ((i == TOKEN_DECORATOR) || (i == TOKEN_OPMASK)) {
                     /*
-                     * according to AVX512 spec, only broacast decorator is
-                     * expected for memory reference operands
+                     * according to AVX512 spec, broacast or opmask decorator
+                     * is expected for memory reference operands
                      */
                     if (tokval.t_flag & TFLAG_BRDCAST) {
                         brace_flags |= GEN_BRDCAST(0);
                         i = stdscan(NULL, &tokval);
+                    } else if (i == TOKEN_OPMASK) {
+                        brace_flags |= VAL_OPMASK(nasm_regvals[tokval.t_integer]);
+                        i = stdscan(NULL, &tokval);
                     } else {
-                        nasm_error(ERR_NONFATAL, "broadcast decorator"
-                                                 "expected inside braces");
+                        nasm_error(ERR_NONFATAL, "broadcast or opmask "
+                                   "decorator expected inside braces");
                         recover = true;
                     }
                 }
-- 
1.7.9.5
From: Jin K. S. <jin...@in...> - 2013-08-22 02:30:17

ZWORD (512 bits) keyword is added

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 assemble.c |    2 ++
 disasm.c   |    3 +++
 nasm.h     |    1 +
 parser.c   |    5 +++++
 tokens.dat |    1 +
 5 files changed, 12 insertions(+)

diff --git a/assemble.c b/assemble.c
index 83971f6..4f0cd9c 100644
--- a/assemble.c
+++ b/assemble.c
@@ -265,6 +265,8 @@ static const char *size_name(int size)
         return "oword";
     case 32:
         return "yword";
+    case 64:
+        return "zword";
     default:
         return "???";
     }
diff --git a/disasm.c b/disasm.c
index 9d2e1b1..cc55d2c 100644
--- a/disasm.c
+++ b/disasm.c
@@ -1303,6 +1303,9 @@ int32_t disasm(uint8_t *data, char *output, int outbufsize, int segsize,
             if (t & BITS256)
                 slen +=
                     snprintf(output + slen, outbufsize - slen, "yword ");
+            if (t & BITS512)
+                slen +=
+                    snprintf(output + slen, outbufsize - slen, "zword ");
             if (t & FAR)
                 slen += snprintf(output + slen, outbufsize - slen, "far ");
             if (t & NEAR)
diff --git a/nasm.h b/nasm.h
index e46b5ca..fc5a18d 100644
--- a/nasm.h
+++ b/nasm.h
@@ -1011,6 +1011,7 @@ enum special_tokens {
     S_TWORD,
     S_WORD,
     S_YWORD,
+    S_ZWORD,
     SPECIAL_ENUM_LIMIT
 };

diff --git a/parser.c b/parser.c
index 4b3f059..ccbce49 100644
--- a/parser.c
+++ b/parser.c
@@ -660,6 +660,11 @@ is_expression:
                     result->oprs[operand].type |= BITS256;
                 setsize = 1;
                 break;
+            case S_ZWORD:
+                if (!setsize)
+                    result->oprs[operand].type |= BITS512;
+                setsize = 1;
+                break;
             case S_TO:
                 result->oprs[operand].type |= TO;
                 break;
diff --git a/tokens.dat b/tokens.dat
index 1a00e3d..d12b296 100644
--- a/tokens.dat
+++ b/tokens.dat
@@ -72,6 +72,7 @@ to
 tword
 word
 yword
+zword

 % TOKEN_FLOAT, 0, 0, 0
 __infinity__
-- 
1.7.9.5
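
As a quick illustration of where the new keyword sits, the sketch below maps an operand size in bytes to its NASM size keyword, in the spirit of the size_name() hunk above. Only the 16/32/64 cases and the "???" fallback appear in the patch; the function name `size_keyword()` and the smaller cases are filled in here as an assumption from NASM's usual size keywords.

```c
#include <string.h>

/*
 * Sketch only: byte count -> size keyword. A ZWORD is 512 bits, i.e. 64 bytes,
 * which is why the patch adds "case 64".
 */
static const char *size_keyword(int bytes)
{
    switch (bytes) {
    case 1:  return "byte";
    case 2:  return "word";
    case 4:  return "dword";
    case 8:  return "qword";
    case 10: return "tword";
    case 16: return "oword";
    case 32: return "yword";
    case 64: return "zword";    /* added by this patch */
    default: return "???";
    }
}
```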
From: Jin K. S. <jin...@in...> - 2013-08-22 02:30:16

When an instruction allows broadcasting, the memory element size is
different from the size of normal memory operation.
This information is provided in a decoflags field, so it should try
to match those properties before it fails.

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 assemble.c |   35 +++++++++++++++++++++++++++++++----
 nasm.h     |   18 ++++++++++++++++--
 tables.h   |    2 +-
 3 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/assemble.c b/assemble.c
index 6054d4a..83971f6 100644
--- a/assemble.c
+++ b/assemble.c
@@ -1915,10 +1915,22 @@ static enum match_result find_match(const struct itemplate **tempp,
     enum match_result m, merr;
     opflags_t xsizeflags[MAX_OPERANDS];
     bool opsizemissing = false;
+    int8_t broadcast = -1;
     int i;

+    /* find the position of broadcasting operand */
     for (i = 0; i < instruction->operands; i++)
-        xsizeflags[i] = instruction->oprs[i].type & SIZE_MASK;
+        if (instruction->oprs[i].decoflags & BRDCAST_MASK) {
+            broadcast = i;
+            break;
+        }
+
+    /* broadcasting uses a different data element size */
+    for (i = 0; i < instruction->operands; i++)
+        if (i == broadcast)
+            xsizeflags[i] = instruction->oprs[i].decoflags & BRSIZE_MASK;
+        else
+            xsizeflags[i] = instruction->oprs[i].type & SIZE_MASK;

     merr = MERR_INVALOP;

@@ -1936,7 +1948,10 @@ static enum match_result find_match(const struct itemplate **tempp,
                  * Missing operand size and a candidate for fuzzy matching...
                  */
                 for (i = 0; i < temp->operands; i++)
-                    xsizeflags[i] |= temp->opd[i] & SIZE_MASK;
+                    if (i == broadcast)
+                        xsizeflags[i] |= temp->deco[i] & BRSIZE_MASK;
+                    else
+                        xsizeflags[i] |= temp->opd[i] & SIZE_MASK;
                 opsizemissing = true;
             }
             if (m > merr)
@@ -1962,7 +1977,10 @@ static enum match_result find_match(const struct itemplate **tempp,
         if ((xsizeflags[i] & (xsizeflags[i]-1)))
             goto done;                /* No luck */

-        instruction->oprs[i].type |= xsizeflags[i]; /* Set the size */
+        if (i == broadcast)
+            instruction->oprs[i].decoflags |= xsizeflags[i];
+        else
+            instruction->oprs[i].type |= xsizeflags[i]; /* Set the size */
     }

     /* Try matching again... */
@@ -2107,7 +2125,16 @@ static enum match_result matches(const struct itemplate *itemp,
         } else if ((itemp->opd[i] & SIZE_MASK) &&
                    (itemp->opd[i] & SIZE_MASK) != (type & SIZE_MASK)) {
             if (type & SIZE_MASK) {
-                return MERR_INVALOP;
+                /*
+                 * when broadcasting, the element size depends on
+                 * the instruction type. decorator flag should match.
+                 */
+#define MATCH_BRSZ(bits) (((type & SIZE_MASK) == BITS##bits) &&             \
+                          ((itemp->deco[i] & BRSIZE_MASK) == BR_BITS##bits))
+                if (!((deco & BRDCAST_MASK) &&
+                      (MATCH_BRSZ(32) || MATCH_BRSZ(64)))) {
+                    return MERR_INVALOP;
+                }
             } else if (!is_class(REGISTER, type)) {
                 /*
                  * Note: we don't honor extrinsic operand sizes for registers,
diff --git a/nasm.h b/nasm.h
index 628ec43..e46b5ca 100644
--- a/nasm.h
+++ b/nasm.h
@@ -1038,6 +1038,7 @@ enum decorator_tokens {
  * ..........................1..... broadcast
  * .........................1...... static rounding
  * ........................1....... SAE
+ * ......................11........ broadcast element size
  */

 #define OP_GENVAL(val, bits, shift)     (((val) & ((UINT64_C(1) << (bits)) - 1)) << (shift))
@@ -1096,10 +1097,23 @@ enum decorator_tokens {
 #define SAE_MASK                OP_GENMASK(SAE_BITS, SAE_SHIFT)
 #define GEN_SAE(bit)            OP_GENBIT(bit, SAE_SHIFT)

+/*
+ * Broadcasting element size.
+ *
+ * Bits: 8 - 9
+ */
+#define BRSIZE_SHIFT            (8)
+#define BRSIZE_BITS             (2)
+#define BRSIZE_MASK             OP_GENMASK(BRSIZE_BITS, BRSIZE_SHIFT)
+#define GEN_BRSIZE(bit)         OP_GENBIT(bit, BRSIZE_SHIFT)
+
+#define BR_BITS32               GEN_BRSIZE(0)
+#define BR_BITS64               GEN_BRSIZE(1)
+
 #define MASK                    OPMASK_MASK             /* Opmask (k1 ~ 7) can be used */
 #define Z                       Z_MASK
-#define B32                     BRDCAST_MASK            /* {1to16} : load+op instruction can broadcast when it is reg-reg operation */
-#define B64                     BRDCAST_MASK            /* {1to8}  : There are two definitions just for conforming to SDM */
+#define B32                     (BRDCAST_MASK|BR_BITS32) /* {1to16} : broadcast 32b * 16 to zmm(512b) */
+#define B64                     (BRDCAST_MASK|BR_BITS64) /* {1to8}  : broadcast 64b *  8 to zmm(512b) */
 #define ER                      STATICRND_MASK          /* ER(Embedded Rounding) == Static rounding mode */
 #define SAE                     SAE_MASK                /* SAE(Suppress All Exception) */

diff --git a/tables.h b/tables.h
index d0db3b3..4b14566 100644
--- a/tables.h
+++ b/tables.h
@@ -62,7 +62,7 @@ extern const char * const nasm_insn_names[];
 extern const char * const nasm_reg_names[];
 /* regflags.c */
 typedef uint64_t opflags_t;
-typedef uint8_t  decoflags_t;
+typedef uint16_t  decoflags_t;
 extern const opflags_t nasm_reg_flags[];
 /* regvals.c */
 extern const int nasm_regvals[];
-- 
1.7.9.5
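
The bit layout this patch introduces can be checked with a small standalone sketch. The bit positions (broadcast at bit 5, element size at bits 8-9) are read off the nasm.h comment in the diff; the `GENMASK`/`GENBIT` helpers and the `broadcast_size_ok()` function are simplified, hypothetical stand-ins rather than NASM's actual OP_GENMASK/OP_GENBIT or matches() code.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t decoflags_t;   /* widened from uint8_t by the patch */

/* Assumed helper shapes: a run of `bits` ones, or a single bit, at `shift`. */
#define GENMASK(bits, shift) ((decoflags_t)(((1u << (bits)) - 1u) << (shift)))
#define GENBIT(bit, shift)   ((decoflags_t)(1u << ((shift) + (bit))))

#define BRDCAST_MASK  GENBIT(0, 5)      /* bit 5: broadcast decorator present */
#define BRSIZE_SHIFT  8
#define BRSIZE_BITS   2
#define BRSIZE_MASK   GENMASK(BRSIZE_BITS, BRSIZE_SHIFT)
#define BR_BITS32     GENBIT(0, BRSIZE_SHIFT)
#define BR_BITS64     GENBIT(1, BRSIZE_SHIFT)

#define B32 ((decoflags_t)(BRDCAST_MASK | BR_BITS32))   /* {1to16} */
#define B64 ((decoflags_t)(BRDCAST_MASK | BR_BITS64))   /* {1to8}  */

/*
 * Does an operand with an explicit element size (in bits) and a broadcast
 * decorator agree with the element size a template advertises?
 */
static bool broadcast_size_ok(int elem_bits, decoflags_t operand_deco,
                              decoflags_t templ_deco)
{
    if (!(operand_deco & BRDCAST_MASK))
        return false;           /* no {1toN} on the operand */
    switch (elem_bits) {
    case 32: return (templ_deco & BRSIZE_MASK) == BR_BITS32;
    case 64: return (templ_deco & BRSIZE_MASK) == BR_BITS64;
    default: return false;
    }
}
```

Note that B32 is 0x120 under these definitions, so it no longer fits in 8 bits; that is presumably why the tables.h hunk widens decoflags_t to uint16_t.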
From: Jin K. S. <jin...@in...> - 2013-08-22 02:30:15

Previous comment was not so clear.

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 parser.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/parser.c b/parser.c
index 5571c6f..4b3f059 100644
--- a/parser.c
+++ b/parser.c
@@ -196,7 +196,7 @@ static void process_size_override(insn *result, int operand)
 /*
  * when two or more decorators follow a register operand,
  * consecutive decorators are parsed here.
- * the order of decorators does not matter.
+ * opmask and zeroing decorators can be placed in any order.
  * e.g. zmm1 {k2}{z} or zmm2 {z,k3}
  * decorator(s) are placed at the end of an operand.
  */
-- 
1.7.9.5
From: Jin K. S. <jin...@in...> - 2013-08-22 02:30:13

Multi-line macro uses curly braces for enclosing a parameter
containing comma(s). Passing curly braces as a part of a parameter
which is already enclosed with braces confuses the macro expander.

Escape character '\' is prefixed in this case.
e.g.)
  mmacro {1,2,3}, {4,\{5,6\}}
  mmacro gets 2 parameters of '1,2,3' and '4,{5,6}'

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 preproc.c |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/preproc.c b/preproc.c
index e2b12e4..b878e4b 100644
--- a/preproc.c
+++ b/preproc.c
@@ -208,6 +208,7 @@ enum pp_token_type {
     TOK_PREPROC_Q, TOK_PREPROC_QQ,
     TOK_PASTE,              /* %+ */
     TOK_INDIRECT,           /* %[...] */
+    TOK_BRACE,              /* \{...\} */
     TOK_SMAC_PARAM,         /* MUST BE LAST IN THE LIST!!! */
     TOK_MAX = INT_MAX       /* Keep compiler from reducing the range */
 };
@@ -1103,6 +1104,10 @@ static Token *tokenize(char *line)
                 type = TOK_COMMENT;
                 while (*p)
                     p++;
+            } else if (p[0] == '\\' && (p[1] == '{' || p[1] == '}')) {
+                type = TOK_BRACE;
+                p += 2;
+                line++;
             } else {
                 /*
                  * Anything else is an operator of some kind. We check
-- 
1.7.9.5
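
The example in the commit message can be reproduced with a small character-level splitter. This is only a sketch under loud assumptions: NASM's preprocessor works on a token list, not on raw characters, and `split_params()` and `PARAM_MAX` are hypothetical names invented for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define PARAM_MAX 64    /* hypothetical per-parameter buffer size */

/*
 * Split a macro argument string into parameters.
 *  - a parameter that starts with '{' runs to the next unescaped '}'
 *  - "\{" and "\}" are copied as literal braces
 *  - otherwise a parameter ends at the next comma
 * Returns the number of parameters stored in out[].
 */
static int split_params(const char *s, char out[][PARAM_MAX], int max_params)
{
    int n = 0;

    while (*s && n < max_params) {
        size_t len = 0;
        bool grouped;

        while (*s == ' ' || *s == '\t')     /* skip leading blanks */
            s++;
        if (!*s)
            break;

        grouped = (*s == '{');
        if (grouped)
            s++;

        while (*s) {
            if (s[0] == '\\' && (s[1] == '{' || s[1] == '}')) {
                if (len + 1 < PARAM_MAX)
                    out[n][len++] = s[1];   /* escaped brace: keep literally */
                s += 2;
                continue;
            }
            if (grouped && *s == '}') {     /* end of the {...} group */
                s++;
                break;
            }
            if (!grouped && *s == ',')
                break;
            if (len + 1 < PARAM_MAX)
                out[n][len++] = *s;
            s++;
        }
        out[n][len] = '\0';
        n++;

        while (*s && *s != ',')             /* move past the separator */
            s++;
        if (*s == ',')
            s++;
    }
    return n;
}
```

Fed the string `{1,2,3}, {4,\{5,6\}}`, this returns two parameters, `1,2,3` and `4,{5,6}`, matching the behaviour described above.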
From: Jin K. S. <jin...@in...> - 2013-08-22 02:30:12

Please review these patches and pull if they look good.
git://repo.or.cz/nasm/avx512.git

After running a test case, various issues were found. One major issue is
that curly braces are already used for grouping multi-line macro parameters.
An escaping backslash character '\' is added when braces are passed
as part of an enclosed parameter. The test asm file used here is also included.

Patch "AVX-512: Add a test case for EVEX encoded instructions" is
relatively huge, so I did not attach that patch to this email. Please refer to
http://repo.or.cz/w/nasm/avx512.git/commitdiff/a4a573c47f3d9ddfd5c2521804454327765f367e

- Jin Song

Jin Kyu Song (6):
  AVX-512: Handle curly braces in multi-line macro parameters
  AVX-512: Add a test case for EVEX encoded instructions
  AVX-512: Reword comment about opmask decorators
  AVX-512: Fix instruction match function
  AVX-512: Add ZWORD keyword
  AVX-512: Fix parser to handle opmask decorator correctly

 assemble.c       |   37 +-
 disasm.c         |    3 +
 nasm.h           |   19 +-
 parser.c         |   20 +-
 preproc.c        |    5 +
 tables.h         |    2 +-
 test/avx512f.asm | 9221 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 test/gas2nasm.py |   88 +
 tokens.dat       |    1 +
 9 files changed, 9383 insertions(+), 13 deletions(-)
 create mode 100644 test/avx512f.asm
 create mode 100755 test/gas2nasm.py

-- 
1.7.9.5
From: Cyrill G. <gor...@gm...> - 2013-08-16 05:07:43

On Thu, Aug 15, 2013 at 07:01:25PM -0700, Jin Kyu Song wrote:
> EVEX encoding support includes 32 vector regs (XMM/YMM/ZMM),
> opmask, broadcasting, embedded rounding mode,
> suppress all exceptions, compressed displacement.
>
> Signed-off-by: Jin Kyu Song <jin...@in...>

Pushed into avx512 branch, thanks!
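
Of the features listed in that summary, "compressed displacement" (Disp8*N) is the one that benefits most from a worked example. The sketch below shows only the final arithmetic, assuming the scale factor N has already been chosen from the instruction's tuple type; `compress_disp8()` is a hypothetical helper, not the patch's is_disp8n(), which also derives N.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Try to encode `offset` as an EVEX compressed 8-bit displacement.
 * n is the Disp8*N scale factor (a power of two). On success the scaled
 * value is stored in *disp8 and true is returned; otherwise the caller
 * has to fall back to a full 32-bit displacement.
 */
static bool compress_disp8(int32_t offset, uint8_t n, int8_t *disp8)
{
    int32_t scaled;

    if (n == 0 || (n & (n - 1)) != 0)
        return false;               /* N must be a non-zero power of two */
    if (offset % n != 0)
        return false;               /* offset is not a multiple of N */

    scaled = offset / n;
    if (scaled < -128 || scaled > 127)
        return false;               /* does not fit in a signed byte */

    *disp8 = (int8_t)scaled;
    return true;
}
```

For a full 512-bit vector access (N = 64), an offset of 256 is stored as the single byte 4, while an offset of 260 cannot be compressed.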
From: Song, J. K. <jin...@in...> - 2013-08-16 02:19:34

> -----Original Message-----
> From: anonymous coward [mailto:nas...@us...]
> Sent: Tuesday, August 06, 2013 6:15 AM
> To: nas...@li...
> Subject: Re: [Nasm-devel] [PATCH] AVX-512: Add support for parsing braces
>
> > AVX-512 introduced new syntax using braces for decorators.
>
> Actually, the curly-brace operand modifiers were introduced
> by L1OM on Larrabee. Also, K1OM used them on Xeon Phi.
> AVX-512 is merely the third x86 extension to use them. ;-)

Yes, you are right. AVX-512 introduced curly braces to NASM. ;-)

> > + * the order of decorators does not matter.
>
> The order matters in terms of which one wins in case of any
> conflict, i.e. first vs last, warning vs error, etc.
>
> You should consider following gas behavior, simply because
> it effectively set the standard for this a long time ago.

That statement is misleading. I meant that opmask and zeroing decorators
can be placed in any order. I will reword this line.

> > + * e.g. zmm1 {k2}{z} or zmm2 {z,k3}
>
> You really do NOT want that comma syntax, for two reasons.
>
> First, it poses a problem if a vendor ever introduces a future
> modifier with a comma in it. For example, {a,b,c,d}.
>
> Second, it will simplify parsing/tokenizing. Because instead
> of having to bastardize the existing clean identifier handling
> with exceptions for curly braces and dashes, and exceptions
> for nested braces and multiple qualifiers, you'll end up with a
> rather simpler and more traditional sequence: opening curly
> brace, optional whitespace, a sequence of non-whitespace
> characters (that gets looked up against known qualifiers, in
> e.g. a hash table), optional whitespace, plus a closing curly
> brace. (Put another way, your modifications to that parsing/
> tokenizing code are very intrusive, and difficult to validate
> when it comes to corner cases.)

That is reasonable. I will simplify the parsing logic and remove
support for commas and nested braces.

> In terms of modifier placement you probably want to look at
> gas again -- I have seen code which has modifiers before an
> operand, and I have seen code which has them as their own
> operand. For example, {one} op1, {two}, op2.

Could you explain a little more about this? Is it about {er} and {sae},
which are written as if they were separate operands?

> I do not know if you care about supporting L1OM and K1OM
> eventually -- the longer you wait, the more obsolete they will
> be, of course :-) -- but that would come with more challenges.
> In particular, the regular braces which were used to enclose
> operands decorated with transform modifiers are really hard
> to get right.
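
The "simpler and more traditional sequence" suggested in the quoted review (opening brace, optional whitespace, a run of non-whitespace characters looked up in a table, optional whitespace, closing brace) can be sketched as follows. The `scan_decorator()` name and the contents of `known_decorators[]` are assumptions for the illustration; a linear search stands in for the hash table mentioned in the mail, and nothing here is taken from NASM's eventual implementation.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative qualifier table; a real assembler would have more entries. */
static const char *const known_decorators[] = {
    "k1", "k2", "k3", "k4", "k5", "k6", "k7",
    "z", "1to8", "1to16", "rn-sae", "sae",
};

/*
 * Scan one "{ name }" decorator at *pp.
 * On success, advance *pp past the closing brace and return the index of the
 * name in known_decorators[]. On any failure, leave *pp alone and return -1.
 */
static int scan_decorator(const char **pp)
{
    const char *p = *pp;
    const char *start;
    size_t len, i;

    if (*p != '{')
        return -1;
    p++;
    while (isspace((unsigned char)*p))
        p++;

    start = p;
    while (*p && *p != '}' && !isspace((unsigned char)*p))
        p++;
    len = (size_t)(p - start);

    while (isspace((unsigned char)*p))
        p++;
    if (len == 0 || *p != '}')
        return -1;
    p++;

    for (i = 0; i < sizeof(known_decorators) / sizeof(known_decorators[0]); i++) {
        if (strlen(known_decorators[i]) == len &&
            memcmp(known_decorators[i], start, len) == 0) {
            *pp = p;
            return (int)i;
        }
    }
    return -1;      /* well-formed braces, unknown qualifier */
}
```

Calling it repeatedly on `{k2}{ z }` yields "k2" and then "z", while the comma form `{z,k3}` is rejected as an unknown qualifier, which is the simplification the reviewer asks for.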
From: Jin K. S. <jin...@in...> - 2013-08-16 02:02:28

EVEX encoding support includes 32 vector regs (XMM/YMM/ZMM),
opmask, broadcasting, embedded rounding mode,
suppress all exceptions, compressed displacement.

Signed-off-by: Jin Kyu Song <jin...@in...>
---
 assemble.c |  326 +++++++++++++++++++++++++++++++++++++------
 disasm.c   |    6 +
 insns.dat  |  448 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 insns.h    |   10 +-
 insns.pl   |  140 +++++++++++++++++--
 nasm.h     |   54 +++++++-
 opflags.h  |    6 +-
 parser.c   |    6 +
 regs.dat   |    4 +-
 9 files changed, 925 insertions(+), 75 deletions(-)

diff --git a/assemble.c b/assemble.c
index b119f86..6054d4a 100644
--- a/assemble.c
+++ b/assemble.c
@@ -1,6 +1,6 @@
 /* ----------------------------------------------------------------------- *
  *
- *   Copyright 1996-2012 The NASM Authors - All Rights Reserved
+ *   Copyright 1996-2013 The NASM Authors - All Rights Reserved
  *   See the file AUTHORS included with the NASM distribution for
  *   the specific copyright holders.
  *
@@ -67,6 +67,35 @@
  *                 an arbitrary value in bits 3..0 (assembled as zero.)
  * \2ab          - a ModRM, calculated on EA in operand a, with the spare
  *                 field equal to digit b.
+ *
+ * \240..\243    - this instruction uses EVEX rather than REX or VEX/XOP, with the
+ *                 V field taken from operand 0..3.
+ * \250          - this instruction uses EVEX rather than REX or VEX/XOP, with the
+ *                 V field set to 1111b.
+ * EVEX prefixes are followed by the sequence:
+ * \cm\wlp\tup    where cm is:
+ *                  cc 000 0mm
+ *                  c = 2 for EVEX and m is the legacy escape (0f, 0f38, 0f3a)
+ *                and wlp is:
+ *                  00 wwl lpp
+ *                  [l0]  ll = 0 (.128, .lz)
+ *                  [l1]  ll = 1 (.256)
+ *                  [l2]  ll = 2 (.512)
+ *                  [lig] ll = 3 for EVEX.L'L don't care (always assembled as 0)
+ *
+ *                  [w0]  ww = 0 for W = 0
+ *                  [w1]  ww = 1 for W = 1
+ *                  [wig] ww = 2 for W don't care (always assembled as 0)
+ *                  [ww]  ww = 3 for W used as REX.W
+ *
+ *                  [p0]  pp = 0 for no prefix
+ *                  [60]  pp = 1 for legacy prefix 60
+ *                  [f3]  pp = 2
+ *                  [f2]  pp = 3
+ *
+ *                tup is tuple type for Disp8*N from %tuple_codes in insns.pl
+ *                    (compressed displacement encoding)
+ *
  * \254..\257    - a signed 32-bit operand to be extended to 64 bits.
  * \260..\263    - this instruction uses VEX/XOP rather than REX, with the
  *                 V field taken from operand 0..3.
@@ -76,9 +105,9 @@
  * VEX/XOP prefixes are followed by the sequence:
  * \tmm\wlp        where mm is the M field; and wlp is:
  *                 00 wwl lpp
- *                 [l0]  ll = 0 for L = 0 (.128, .lz)
- *                 [l1]  ll = 1 for L = 1 (.256)
- *                 [lig] ll = 2 for L don't care (always assembled as 0)
+ *                 [l0]  ll = 0 for L = 0 (.128, .lz)
+ *                 [l1]  ll = 1 for L = 1 (.256)
+ *                 [lig] ll = 2 for L don't care (always assembled as 0)
  *
  *                 [w0]  ww = 0 for W = 0
  *                 [w1 ] ww = 1 for W = 1
@@ -136,6 +165,7 @@
  *                 used for conditional jump over longer jump
  * \374          - this instruction takes an XMM VSIB memory EA
  * \375          - this instruction takes an YMM VSIB memory EA
+ * \376          - this instruction takes an ZMM VSIB memory EA
  */

 #include "compiler.h"
@@ -174,6 +204,7 @@ typedef struct {
     int bytes;                  /* # of bytes of offset needed */
     int size;                   /* lazy - this is sib+bytes+1 */
     uint8_t modrm, sib, rex, rip;  /* the bytes themselves */
+    int8_t disp8;               /* compressed displacement for EVEX */
 } ea;

 #define GEN_SIB(scale, index, base)                 \
@@ -200,9 +231,10 @@ static opflags_t regflag(const operand *);
 static int32_t regval(const operand *);
 static int rexflags(int, opflags_t, int);
 static int op_rexflags(const operand *, int);
+static int op_evexflags(const operand *, int, uint8_t);
 static void add_asp(insn *, int);

-static enum ea_type process_ea(operand *, ea *, int, int, int, opflags_t);
+static enum ea_type process_ea(operand *, ea *, int, int, opflags_t, insn *);

 static int has_prefix(insn * ins, enum prefix_pos pos, int prefix)
 {
@@ -820,6 +852,7 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,

     ins->rex = 0;               /* Ensure REX is reset */
     eat = EA_SCALAR;            /* Expect a scalar EA */
+    memset(ins->evex_p, 0, 3);  /* Ensure EVEX is reset */

     if (ins->prefixes[PPS_OSIZE] == P_O64)
         ins->rex |= REX_W;
@@ -910,6 +943,23 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,
             length++;
             break;

+        case4(0240):
+            ins->rex |= REX_EV;
+            ins->vexreg = regval(opx);
+            ins->evex_p[2] |= op_evexflags(opx, EVEX_P2VP, 2); /* High-16 NDS */
+            ins->vex_cm = *codes++;
+            ins->vex_wlp = *codes++;
+            ins->evex_tuple = (*codes++ - 0300);
+            break;
+
+        case 0250:
+            ins->rex |= REX_EV;
+            ins->vexreg = 0;
+            ins->vex_cm = *codes++;
+            ins->vex_wlp = *codes++;
+            ins->evex_tuple = (*codes++ - 0300);
+            break;
+
         case4(0254):
             length += 4;
             break;
@@ -1076,6 +1126,10 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,
             eat = EA_YMMVSIB;
             break;

+        case 0376:
+            eat = EA_ZMMVSIB;
+            break;
+
         case4(0100):
         case4(0110):
         case4(0120):
@@ -1093,6 +1147,7 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,
                 int rfield;
                 opflags_t rflags;
                 struct operand *opy = &ins->oprs[op2];
+                struct operand *oplast;

                 ea_data.rex = 0;           /* Ensure ea.REX is initially 0 */

@@ -1100,12 +1155,30 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,
                     /* pick rfield from operand b (opx) */
                     rflags = regflag(opx);
                     rfield = nasm_regvals[opx->basereg];
+                    /* find the last SIMD operand where ER decorator resides */
+                    oplast = &ins->oprs[op1 > op2 ? op1 : op2];
                 } else {
                     rflags = 0;
                     rfield = c & 7;
+                    oplast = opy;
                 }
-                if (process_ea(opy, &ea_data, bits,ins->addr_size,
-                               rfield, rflags) != eat) {
+
+                if (oplast->decoflags & ER) {
+                    /* set EVEX.RC (rounding control) and b */
+                    ins->evex_p[2] |= (((ins->evex_rm - BRC_RN) << 5) & EVEX_P2LL) |
+                                      EVEX_P2B;
+                } else {
+                    /* set EVEX.L'L (vector length) */
+                    ins->evex_p[2] |= ((ins->vex_wlp << (5 - 2)) & EVEX_P2LL);
+                    if ((oplast->decoflags & SAE) ||
+                        (opy->decoflags & BRDCAST_MASK)) {
+                        /* set EVEX.b */
+                        ins->evex_p[2] |= EVEX_P2B;
+                    }
+                }
+
+                if (process_ea(opy, &ea_data, bits,
+                               rfield, rflags, ins) != eat) {
                     errfunc(ERR_NONFATAL, "invalid effective address");
                     return -1;
                 } else {
@@ -1132,11 +1205,11 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,
         ins->rex &= ~REX_P;        /* Don't force REX prefix due to high reg */
     }

-    if (ins->rex & REX_V) {
+    if (ins->rex & (REX_V | REX_EV)) {
         int bad32 = REX_R|REX_W|REX_X|REX_B;

         if (ins->rex & REX_H) {
-            errfunc(ERR_NONFATAL, "cannot use high register in vex instruction");
+            errfunc(ERR_NONFATAL, "cannot use high register in AVX instruction");
             return -1;
         }
         switch (ins->vex_wlp & 060) {
@@ -1157,7 +1230,9 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,
             errfunc(ERR_NONFATAL, "invalid operands in non-64-bit mode");
             return -1;
         }
-        if (ins->vex_cm != 1 || (ins->rex & (REX_W|REX_X|REX_B)))
+        if (ins->rex & REX_EV)
+            length += 4;
+        else if (ins->vex_cm != 1 || (ins->rex & (REX_W|REX_X|REX_B)))
             length += 3;
         else
             length += 2;
@@ -1194,7 +1269,7 @@ static int64_t calcsize(int32_t segment, int64_t offset, int bits,
 static inline unsigned int emit_rex(insn *ins, int32_t segment, int64_t offset, int bits)
 {
     if (bits == 64) {
-        if ((ins->rex & REX_REAL) && !(ins->rex & REX_V)) {
+        if ((ins->rex & REX_REAL) && !(ins->rex & (REX_V | REX_EV))) {
             ins->rex = (ins->rex & REX_REAL) | REX_P;
             out(offset, segment, &ins->rex, OUT_RAWDATA, 1, NO_SEG, NO_SEG);
             ins->rex = 0;
@@ -1431,6 +1506,25 @@ static void gencode(int32_t segment, int64_t offset, int bits,
             offset += 4;
             break;

+        case4(0240):
+        case 0250:
+            codes += 3;
+            ins->evex_p[2] |= op_evexflags(&ins->oprs[0],
+                                           EVEX_P2Z | EVEX_P2AAA, 2);
+            ins->evex_p[2] ^= EVEX_P2VP;        /* 1's complement */
+            bytes[0] = 0x62;
+            /* EVEX.X can be set by either REX or EVEX for different reasons */
+            bytes[1] = (~(((ins->rex & 7) << 5) |
+                         (ins->evex_p[0] & (EVEX_P0X | EVEX_P0RP))) & 0xf0) |
+                       (ins->vex_cm & 3);
+            bytes[2] = ((ins->rex & REX_W) << (7 - 3)) |
+                       ((~ins->vexreg & 15) << 3) |
+                       (1 << 2) | (ins->vex_wlp & 3);
+            bytes[3] = ins->evex_p[2];
+            out(offset, segment, &bytes, OUT_RAWDATA, 4, NO_SEG, NO_SEG);
+            offset += 4;
+            break;
+
         case4(0260):
         case 0270:
             codes += 2;
@@ -1631,6 +1725,10 @@ static void gencode(int32_t segment, int64_t offset, int bits,
             eat = EA_YMMVSIB;
             break;

+        case 0376:
+            eat = EA_ZMMVSIB;
+            break;
+
         case4(0100):
         case4(0110):
         case4(0120):
@@ -1661,8 +1759,8 @@ static void gencode(int32_t segment, int64_t offset, int bits,
                     rfield = c & 7;
                 }

-                if (process_ea(opy, &ea_data, bits, ins->addr_size,
-                               rfield, rflags) != eat)
+                if (process_ea(opy, &ea_data, bits,
+                               rfield, rflags, ins) != eat)
                     errfunc(ERR_NONFATAL, "invalid effective address");

                 p = bytes;
@@ -1687,7 +1785,8 @@ static void gencode(int32_t segment, int64_t offset, int bits,
                 case 2:
                 case 4:
                 case 8:
-                    data = opy->offset;
+                    /* use compressed displacement, if available */
+                    data = ea_data.disp8 ? ea_data.disp8 : opy->offset;
                     s += ea_data.bytes;
                     if (ea_data.rip) {
                         if (opy->segment == segment) {
@@ -1702,9 +1801,9 @@ static void gencode(int32_t segment, int64_t offset, int bits,
                                 insn_end - offset, opy->segment, opy->wrt);
                         }
                     } else {
-                        if (overflow_general(opy->offset, ins->addr_size >> 3) ||
-                            signed_bits(opy->offset, ins->addr_size) !=
-                            signed_bits(opy->offset, ea_data.bytes * 8))
+                        if (overflow_general(data, ins->addr_size >> 3) ||
+                            signed_bits(data, ins->addr_size) !=
+                            signed_bits(data, ea_data.bytes * 8))
                             warn_overflow(ERR_PASS2, ea_data.bytes);

                         out(offset, segment, &data, OUT_ADDRESS,
@@ -1774,6 +1873,40 @@ static int rexflags(int val, opflags_t flags, int mask)
     return rex & mask;
 }

+static int evexflags(int val, decoflags_t deco,
+                     int mask, uint8_t byte)
+{
+    int evex = 0;
+
+    switch(byte) {
+    case 0:
+        if (val >= 16)
+            evex |= (EVEX_P0RP | EVEX_P0X);
+        break;
+    case 2:
+        if (val >= 16)
+            evex |= EVEX_P2VP;
+        if (deco & Z)
+            evex |= EVEX_P2Z;
+        if (deco & OPMASK_MASK)
+            evex |= deco & EVEX_P2AAA;
+        break;
+    }
+    return evex & mask;
+}
+
+static int op_evexflags(const operand * o, int mask, uint8_t byte)
+{
+    int val;
+
+    if (!is_register(o->basereg))
+        errfunc(ERR_PANIC, "invalid operand passed to op_evexflags()");
+
+    val = nasm_regvals[o->basereg];
+
+    return evexflags(val, o->decoflags, mask, byte);
+}
+
 static enum match_result find_match(const struct itemplate **tempp,
                                     insn *instruction,
                                     int32_t segment, int64_t offset, int bits)
@@ -1908,6 +2041,9 @@ static enum match_result matches(const struct itemplate *itemp,
         asize = BITS256;
         break;
     case IF_SZ:
+        asize = BITS512;
+        break;
+    case IF_SIZE:
         switch (bits) {
         case 16:
             asize = BITS16;
@@ -1961,10 +2097,12 @@ static enum match_result matches(const struct itemplate *itemp,
      */
     for (i = 0; i < itemp->operands; i++) {
         opflags_t type = instruction->oprs[i].type;
+        decoflags_t deco = instruction->oprs[i].decoflags;
         if (!(type & SIZE_MASK))
             type |= size[i];

-        if (itemp->opd[i] & ~type & ~SIZE_MASK) {
+        if ((itemp->opd[i] & ~type & ~SIZE_MASK) ||
+            (itemp->deco[i] & deco) != deco) {
             return MERR_INVALOP;
         } else if ((itemp->opd[i] & SIZE_MASK) &&
                    (itemp->opd[i] & SIZE_MASK) != (type & SIZE_MASK)) {
@@ -2036,16 +2174,116 @@ static enum match_result matches(const struct itemplate *itemp,
     return MOK_GOOD;
 }

+/*
+ * Check if offset is a multiple of N with corresponding tuple type
+ * if Disp8*N is available, compressed displacement is stored in compdisp
+ */
+static bool is_disp8n(operand *input, insn *ins, int8_t *compdisp)
+{
+    const uint8_t fv_n[2][2][VLMAX] = {{{16, 32, 64}, {4, 4, 4}},
+                                       {{16, 32, 64}, {8, 8, 8}}};
+    const uint8_t hv_n[2][VLMAX] = {{8, 16, 32}, {4, 4, 4}};
+    const uint8_t dup_n[VLMAX] = {8, 32, 64};
+
+    bool evex_b = input->decoflags & BRDCAST_MASK;
+    enum ttypes tuple = ins->evex_tuple;
+    /* vex_wlp composed as [wwllpp] */
+    enum vectlens vectlen = (ins->vex_wlp & 0x0c) >> 2;
+    /* wig(=2) is treated as w0(=0) */
+    bool evex_w = (ins->vex_wlp & 0x10) >> 4;
+    int32_t off = input->offset;
+    uint8_t n = 0;
+    int32_t disp8;
+
+    switch(tuple) {
+    case FV:
+        n = fv_n[evex_w][evex_b][vectlen];
+        break;
+    case HV:
+        n = hv_n[evex_b][vectlen];
+        break;
+
+    case FVM:
+        /* 16, 32, 64 for VL 128, 256, 512 respectively*/
+        n = 1 << (vectlen + 4);
+        break;
+    case T1S8:  /* N = 1 */
+    case T1S16: /* N = 2 */
+        n = tuple - T1S8 + 1;
+        break;
+    case T1S:
+        /* N = 4 for 32bit, 8 for 64bit */
+        n = evex_w ? 8 : 4;
+        break;
+    case T1F32:
+    case T1F64:
+        /* N = 4 for 32bit, 8 for 64bit */
+        n = (tuple == T1F32 ? 4 : 8);
+        break;
+    case T2:
+    case T4:
+    case T8:
+        if (vectlen + 7 <= (evex_w + 5) + (tuple - T2 + 1))
+            n = 0;
+        else
+            n = 1 << (tuple - T2 + evex_w + 4);
+        break;
+    case HVM:
+    case QVM:
+    case OVM:
+        n = 1 << (OVM - tuple + vectlen + 1);
+        break;
+    case M128:
+        n = 16;
+        break;
+    case DUP:
+        n = dup_n[vectlen];
+        break;
+
+    default:
+        break;
+    }
+
+    if (n && !(off & (n - 1))) {
+        disp8 = off / n;
+        /* if it fits in Disp8 */
+        if (disp8 >= -128 && disp8 <= 127) {
+            *compdisp = disp8;
+            return true;
+        }
+    }
+
+    *compdisp = 0;
+    return false;
+}
+
+/*
+ * Check if ModR/M.mod should/can be 01.
+ * - EAF_BYTEOFFS is set
+ * - offset can fit in a byte when EVEX is not used
+ * - offset can be compressed when EVEX is used
+ */
+#define IS_MOD_01()     (input->eaflags & EAF_BYTEOFFS ||               \
+                         (o >= -128 && o <= 127 &&                      \
+                          seg == NO_SEG && !forw_ref &&                 \
+                          !(input->eaflags & EAF_WORDOFFS) &&           \
+                          !(ins->rex & REX_EV)) ||                      \
+                         (ins->rex & REX_EV &&                          \
+                          is_disp8n(input, ins, &output->disp8)))
+
 static enum ea_type process_ea(operand *input, ea *output, int bits,
-                               int addrbits, int rfield, opflags_t rflags)
+                               int rfield, opflags_t rflags, insn *ins)
 {
     bool forw_ref = !!(input->opflags & OPFLAG_UNKNOWN);
+    int addrbits = ins->addr_size;

     output->type = EA_SCALAR;
     output->rip = false;

     /* REX flags for the rfield operand */
     output->rex |= rexflags(rfield, rflags, REX_R | REX_P | REX_W | REX_H);
+    /* EVEX.R' flag for the REG operand */
+    ins->evex_p[0] |= evexflags(rfield, 0, EVEX_P0RP, 0);

     if (is_class(REGISTER, input->type)) {
         /*
@@ -2054,10 +2292,17 @@ static enum ea_type process_ea(operand *input, ea *output, int bits,
         if (!is_register(input->basereg))
             goto err;

-        if (!is_class(REG_EA, regflag(input)))
+        if (!is_reg_class(REG_EA, input->basereg))
             goto err;

+        /* broadcasting is not available with a direct register operand. */
+        if (input->decoflags & BRDCAST_MASK) {
+            nasm_error(ERR_NONFATAL, "Broadcasting not allowed from a register");
+            goto err;
+        }
+
         output->rex |= op_rexflags(input, REX_B | REX_P | REX_W | REX_H);
+        ins->evex_p[0] |= op_evexflags(input, EVEX_P0X, 0);
         output->sib_present = false;    /* no SIB necessary */
         output->bytes = 0;              /* no offset necessary either */
         output->modrm = GEN_MODRM(3, rfield, nasm_regvals[input->basereg]);
@@ -2065,6 +2310,14 @@ static enum ea_type process_ea(operand *input, ea *output, int bits,
         /*
          * It's a memory reference.
          */
+
+        /* Embedded rounding or SAE is not available with a mem ref operand. */
+        if (input->decoflags & (ER | SAE)) {
+            nasm_error(ERR_NONFATAL,
+                       "Embedded rounding is available only with reg-reg op.");
+            return -1;
+        }
+
         if (input->basereg == -1 &&
             (input->indexreg == -1 || input->scale == 0)) {
             /*
@@ -2125,7 +2378,7 @@ static enum ea_type process_ea(operand *input, ea *output, int bits,
             }

             /* if either one are a vector register... */
-            if ((ix|bx) & (XMMREG|YMMREG) & ~REG_EA) {
+            if ((ix|bx) & (XMMREG|YMMREG|ZMMREG) & ~REG_EA) {
                 opflags_t sok = BITS32 | BITS64;
                 int32_t o = input->offset;
                 int mod, scale, index, base;
@@ -2134,7 +2387,7 @@ static enum ea_type process_ea(operand *input, ea *output, int bits,
                  * For a vector SIB, one has to be a vector and the other,
                  * if present, a GPR.  The vector must be the index operand.
                  */
-                if (it == -1 || (bx & (XMMREG|YMMREG) & ~REG_EA)) {
+                if (it == -1 || (bx & (XMMREG|YMMREG|ZMMREG) & ~REG_EA)) {
                     if (s == 0)
                         s = 1;
                     else if (s != 1)
@@ -2165,11 +2418,13 @@ static enum ea_type process_ea(operand *input, ea *output, int bits,
                     (addrbits == 64 && !(sok & BITS64)))
                     goto err;

-                output->type = (ix & YMMREG & ~REG_EA)
-                    ? EA_YMMVSIB : EA_XMMVSIB;
+                output->type = ((ix & ZMMREG & ~REG_EA) ? EA_ZMMVSIB
+                                : ((ix & YMMREG & ~REG_EA)
+                                ?
EA_YMMVSIB : EA_XMMVSIB)); - output->rex |= rexflags(it, ix, REX_X); - output->rex |= rexflags(bt, bx, REX_B); + output->rex |= rexflags(it, ix, REX_X); + output->rex |= rexflags(bt, bx, REX_B); + ins->evex_p[2] |= evexflags(it, 0, EVEX_P2VP, 2); index = it & 7; /* it is known to be != -1 */ @@ -2199,10 +2454,7 @@ static enum ea_type process_ea(operand *input, ea *output, int bits, seg == NO_SEG && !forw_ref && !(input->eaflags & (EAF_BYTEOFFS | EAF_WORDOFFS))) mod = 0; - else if (input->eaflags & EAF_BYTEOFFS || - (o >= -128 && o <= 127 && - seg == NO_SEG && !forw_ref && - !(input->eaflags & EAF_WORDOFFS))) + else if (IS_MOD_01()) mod = 1; else mod = 2; @@ -2293,10 +2545,7 @@ static enum ea_type process_ea(operand *input, ea *output, int bits, seg == NO_SEG && !forw_ref && !(input->eaflags & (EAF_BYTEOFFS | EAF_WORDOFFS))) mod = 0; - else if (input->eaflags & EAF_BYTEOFFS || - (o >= -128 && o <= 127 && - seg == NO_SEG && !forw_ref && - !(input->eaflags & EAF_WORDOFFS))) + else if (IS_MOD_01()) mod = 1; else mod = 2; @@ -2340,10 +2589,7 @@ static enum ea_type process_ea(operand *input, ea *output, int bits, seg == NO_SEG && !forw_ref && !(input->eaflags & (EAF_BYTEOFFS | EAF_WORDOFFS))) mod = 0; - else if (input->eaflags & EAF_BYTEOFFS || - (o >= -128 && o <= 127 && - seg == NO_SEG && !forw_ref && - !(input->eaflags & EAF_WORDOFFS))) + else if (IS_MOD_01()) mod = 1; else mod = 2; @@ -2428,9 +2674,7 @@ static enum ea_type process_ea(operand *input, ea *output, int bits, if (o == 0 && seg == NO_SEG && !forw_ref && rm != 6 && !(input->eaflags & (EAF_BYTEOFFS | EAF_WORDOFFS))) mod = 0; - else if (input->eaflags & EAF_BYTEOFFS || - (o >= -128 && o <= 127 && seg == NO_SEG && - !forw_ref && !(input->eaflags & EAF_WORDOFFS))) + else if (IS_MOD_01()) mod = 1; else mod = 2; diff --git a/disasm.c b/disasm.c index 97bf27e..9d2e1b1 100644 --- a/disasm.c +++ b/disasm.c @@ -328,6 +328,8 @@ static uint8_t *do_ea(uint8_t *data, int modrm, int asize, op->indexreg = 
nasm_rd_xmmreg[index | ((rex & REX_X) ? 8 : 0)]; else if (type == EA_YMMVSIB) op->indexreg = nasm_rd_ymmreg[index | ((rex & REX_X) ? 8 : 0)]; + else if (type == EA_ZMMVSIB) + op->indexreg = nasm_rd_zmmreg[index | ((rex & REX_X) ? 8 : 0)]; else if (index == 4 && !(rex & REX_X)) op->indexreg = -1; /* ESP/RSP cannot be an index */ else if (a64) @@ -868,6 +870,10 @@ static int matches(const struct itemplate *t, uint8_t *data, eat = EA_YMMVSIB; break; + case 0376: + eat = EA_ZMMVSIB; + break; + default: return false; /* Unknown code */ } diff --git a/insns.dat b/insns.dat index 0b55b68..320280a 100644 --- a/insns.dat +++ b/insns.dat @@ -1064,16 +1064,16 @@ PUSH reg_ds [-: 1e] 8086,NOLONG PUSH reg_fs [-: 0f a0] 386 PUSH reg_gs [-: 0f a8] 386 PUSH imm8 [i: 6a ib,s] 186 -PUSH sbyteword16 [i: o16 6a ib,s] 186,AR0,SZ,ND -PUSH imm16 [i: o16 68 iw] 186,AR0,SZ -PUSH sbytedword32 [i: o32 6a ib,s] 386,NOLONG,AR0,SZ,ND -PUSH imm32 [i: o32 68 id] 386,NOLONG,AR0,SZ +PUSH sbyteword16 [i: o16 6a ib,s] 186,AR0,SIZE,ND +PUSH imm16 [i: o16 68 iw] 186,AR0,SIZE +PUSH sbytedword32 [i: o32 6a ib,s] 386,NOLONG,AR0,SIZE,ND +PUSH imm32 [i: o32 68 id] 386,NOLONG,AR0,SIZE PUSH sbytedword32 [i: o32 6a ib,s] 386,NOLONG,SD,ND PUSH imm32 [i: o32 68 id] 386,NOLONG,SD -PUSH sbytedword64 [i: o64nw 6a ib,s] X64,AR0,SZ,ND -PUSH imm64 [i: o64nw 68 id,s] X64,AR0,SZ -PUSH sbytedword32 [i: o64nw 6a ib,s] X64,AR0,SZ,ND -PUSH imm32 [i: o64nw 68 id,s] X64,AR0,SZ +PUSH sbytedword64 [i: o64nw 6a ib,s] X64,AR0,SIZE,ND +PUSH imm64 [i: o64nw 68 id,s] X64,AR0,SIZE +PUSH sbytedword32 [i: o64nw 6a ib,s] X64,AR0,SIZE,ND +PUSH imm32 [i: o64nw 68 id,s] X64,AR0,SIZE PUSHA void [ odf 60] 186,NOLONG PUSHAD void [ o32 60] 386,NOLONG PUSHAW void [ o16 60] 186,NOLONG @@ -3457,7 +3457,437 @@ TZMSK reg32,rm32 [vm: xop.ndd.lz.m9.w0 01 /4] FUTURE,TBM TZMSK reg64,rm64 [vm: xop.ndd.lz.m9.w1 01 /4] LONG,FUTURE,TBM T1MSKC reg32,rm32 [vm: xop.ndd.lz.m9.w0 01 /7] FUTURE,TBM T1MSKC reg64,rm64 [vm: xop.ndd.lz.m9.w1 01 /7] LONG,FUTURE,TBM -+ 
+ +;# Intel AVX512 instructions +; +; based on pub number 319433-015 dated July 2013 +; +VADDPD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f.w1 58 /r ] AVX512,FUTURE +VADDPS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.0f.w0 58 /r ] AVX512,FUTURE +VADDSD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.f2.0f.w1 58 /r ] AVX512,FUTURE +VADDSS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.f3.0f.w0 58 /r ] AVX512,FUTURE +VALIGND zmmreg|mask|z,zmmreg,zmmrm512|b32,imm8 [rvmi:fv: evex.nds.512.66.0f3a.w0 03 /r ib ] AVX512,FUTURE +VALIGNQ zmmreg|mask|z,zmmreg,zmmrm512|b64,imm8 [rvmi:fv: evex.nds.512.66.0f3a.w1 03 /r ib ] AVX512,FUTURE +VBLENDMPD zmmreg|mask|z,zmmreg,zmmrm512|b64 [rvm:fv: evex.nds.512.66.0f38.w1 65 /r ] AVX512,FUTURE +VBLENDMPS zmmreg|mask|z,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f38.w0 65 /r ] AVX512,FUTURE +VBROADCASTF32X4 zmmreg|mask|z,mem128 [rm:t4: evex.512.66.0f38.w0 1a /r ] AVX512,FUTURE +VBROADCASTF64X4 zmmreg|mask|z,mem256 [rm:t4: evex.512.66.0f38.w1 1b /r ] AVX512,FUTURE +VBROADCASTI32X4 zmmreg|mask|z,mem128 [rm:t4: evex.512.66.0f38.w0 5a /r ] AVX512,FUTURE +VBROADCASTI64X4 zmmreg|mask|z,mem256 [rm:t4: evex.512.66.0f38.w1 5b /r ] AVX512,FUTURE +VBROADCASTSD zmmreg|mask|z,mem64 [rm:t1s: evex.512.66.0f38.w1 19 /r ] AVX512,FUTURE +VBROADCASTSD zmmreg|mask|z,xmmreg [rm: evex.512.66.0f38.w1 19 /r ] AVX512,FUTURE +VBROADCASTSS zmmreg|mask|z,mem32 [rm:t1s: evex.512.66.0f38.w0 18 /r ] AVX512,FUTURE +VBROADCASTSS zmmreg|mask|z,xmmreg [rm: evex.512.66.0f38.w0 18 /r ] AVX512,FUTURE +VCMPPD opmaskreg|mask,zmmreg,zmmrm512|b64|sae,imm8 [rvmi:fv: evex.nds.512.66.0f.w1 c2 /r ib ] AVX512,FUTURE +VCMPPS opmaskreg|mask,zmmreg,zmmrm512|b32|sae,imm8 [rvmi:fv: evex.nds.512.0f.w0 c2 /r ib ] AVX512,FUTURE +VCMPSD opmaskreg|mask,xmmreg,xmmrm64|sae,imm8 [rvmi:t1s: evex.nds.lig.f2.0f.w1 c2 /r ib ] AVX512,FUTURE +VCMPSS opmaskreg|mask,xmmreg,xmmrm32|sae,imm8 [rvmi:t1s: evex.nds.lig.f3.0f.w0 c2 /r ib ] AVX512,FUTURE 
+VCOMISD xmmreg,xmmrm64|sae [rm:t1s: evex.lig.66.0f.w1 2f /r ] AVX512,FUTURE +VCOMISS xmmreg,xmmrm32|sae [rm:t1s: evex.lig.0f.w0 2f /r ] AVX512,FUTURE +VCOMPRESSPD mem512|mask,zmmreg [mr:t1s: evex.512.66.0f38.w1 8a /r ] AVX512,FUTURE +VCOMPRESSPD zmmreg|mask|z,zmmreg [mr: evex.512.66.0f38.w1 8a /r ] AVX512,FUTURE +VCOMPRESSPS mem512|mask,zmmreg [mr:t1s: evex.512.66.0f38.w0 8a /r ] AVX512,FUTURE +VCOMPRESSPS zmmreg|mask|z,zmmreg [mr: evex.512.66.0f38.w0 8a /r ] AVX512,FUTURE +VCVTDQ2PD zmmreg|mask|z,ymmrm256|b32|er [rm:hv: evex.512.f3.0f.w0 e6 /r ] AVX512,FUTURE +VCVTDQ2PS zmmreg|mask|z,zmmrm512|b32|er [rm:fv: evex.512.0f.w0 5b /r ] AVX512,FUTURE +VCVTPD2DQ ymmreg|mask|z,zmmrm512|b64|er [rm:fv: evex.512.f2.0f.w1 e6 /r ] AVX512,FUTURE +VCVTPD2PS ymmreg|mask|z,zmmrm512|b64|er [rm:fv: evex.512.66.0f.w1 5a /r ] AVX512,FUTURE +VCVTPD2UDQ ymmreg|mask|z,zmmrm512|b64|er [rm:fv: evex.512.0f.w1 79 /r ] AVX512,FUTURE +VCVTPH2PS zmmreg|mask|z,ymmrm256|sae [rm:hvm: evex.512.66.0f38.w0 13 /r ] AVX512,FUTURE +VCVTPS2DQ zmmreg|mask|z,zmmrm512|b32|er [rm:fv: evex.512.66.0f.w0 5b /r ] AVX512,FUTURE +VCVTPS2PD zmmreg|mask|z,ymmrm256|b32|sae [rm:hv: evex.512.0f.w0 5a /r ] AVX512,FUTURE +VCVTPS2PH mem256|mask,zmmreg|sae,imm8 [mri:hvm: evex.512.66.0f3a.w0 1d /r ib ] AVX512,FUTURE +VCVTPS2PH ymmreg|mask|z,zmmreg|sae,imm8 [mri:hvm: evex.512.66.0f3a.w0 1d /r ib ] AVX512,FUTURE +VCVTPS2UDQ zmmreg|mask|z,zmmrm512|b32|er [rm:fv: evex.512.0f.w0 79 /r ] AVX512,FUTURE +VCVTSD2SI reg32,xmmrm64|er [rm:t1f64: evex.lig.f2.0f.w0 2d /r ] AVX512,FUTURE +VCVTSD2SI reg64,xmmrm64|er [rm:t1f64: evex.lig.f2.0f.w1 2d /r ] AVX512,FUTURE +VCVTSD2SS xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.f2.0f.w1 5a /r ] AVX512,FUTURE +VCVTSD2USI reg32,xmmrm64|er [rm:t1f64: evex.lig.f2.0f.w0 79 /r ] AVX512,FUTURE +VCVTSD2USI reg64,xmmrm64|er [rm:t1f64: evex.lig.f2.0f.w1 79 /r ] AVX512,FUTURE +VCVTSI2SD xmmreg,xmmreg,rm32|er [rvm:t1s: evex.nds.lig.f2.0f.w0 2a /r ] AVX512,FUTURE +VCVTSI2SD xmmreg,xmmreg,rm64|er 
[rvm:t1s: evex.nds.lig.f2.0f.w1 2a /r ] AVX512,FUTURE +VCVTSI2SS xmmreg,xmmreg,rm32|er [rvm:t1s: evex.nds.lig.f3.0f.w0 2a /r ] AVX512,FUTURE +VCVTSI2SS xmmreg,xmmreg,rm64|er [rvm:t1s: evex.nds.lig.f3.0f.w1 2a /r ] AVX512,FUTURE +VCVTSS2SD xmmreg|mask|z,xmmreg,xmmrm32|sae [rvm:t1s: evex.nds.lig.f3.0f.w0 5a /r ] AVX512,FUTURE +VCVTSS2SI reg32,xmmrm32|er [rm:t1f32: evex.lig.f3.0f.w0 2d /r ] AVX512,FUTURE +VCVTSS2SI reg64,xmmrm32|er [rm:t1f32: evex.lig.f3.0f.w1 2d /r ] AVX512,FUTURE +VCVTSS2USI reg32,xmmrm32|er [rm:t1f32: evex.lig.f3.0f.w0 79 /r ] AVX512,FUTURE +VCVTSS2USI reg64,xmmrm32|er [rm:t1f32: evex.lig.f3.0f.w1 79 /r ] AVX512,FUTURE +VCVTTPD2DQ ymmreg|mask|z,zmmrm512|b64|sae [rm:fv: evex.512.66.0f.w1 e6 /r ] AVX512,FUTURE +VCVTTPD2UDQ ymmreg|mask|z,zmmrm512|b64|sae [rm:fv: evex.512.0f.w1 78 /r ] AVX512,FUTURE +VCVTTPS2DQ zmmreg|mask|z,zmmrm512|b32|sae [rm:fv: evex.512.f3.0f.w0 5b /r ] AVX512,FUTURE +VCVTTPS2UDQ zmmreg|mask|z,zmmrm512|b32|sae [rm:fv: evex.512.0f.w0 78 /r ] AVX512,FUTURE +VCVTTSD2SI reg32,xmmrm64|sae [rm:t1f64: evex.lig.f2.0f.w0 2c /r ] AVX512,FUTURE +VCVTTSD2SI reg64,xmmrm64|sae [rm:t1f64: evex.lig.f2.0f.w1 2c /r ] AVX512,FUTURE +VCVTTSD2USI reg32,xmmrm64|sae [rm:t1f64: evex.lig.f2.0f.w0 78 /r ] AVX512,FUTURE +VCVTTSD2USI reg64,xmmrm64|sae [rm:t1f64: evex.lig.f2.0f.w1 78 /r ] AVX512,FUTURE +VCVTTSS2SI reg32,xmmrm32|sae [rm:t1f32: evex.lig.f3.0f.w0 2c /r ] AVX512,FUTURE +VCVTTSS2SI reg64,xmmrm32|sae [rm:t1f32: evex.lig.f3.0f.w1 2c /r ] AVX512,FUTURE +VCVTTSS2USI reg32,xmmrm32|sae [rm:t1f32: evex.lig.f3.0f.w0 78 /r ] AVX512,FUTURE +VCVTTSS2USI reg64,xmmrm32|sae [rm:t1f32: evex.lig.f3.0f.w1 78 /r ] AVX512,FUTURE +VCVTUDQ2PD zmmreg|mask|z,ymmrm256|b32|er [rm:hv: evex.512.f3.0f.w0 7a /r ] AVX512,FUTURE +VCVTUDQ2PS zmmreg|mask|z,zmmrm512|b32|er [rm:fv: evex.512.f2.0f.w0 7a /r ] AVX512,FUTURE +VCVTUSI2SD xmmreg,xmmreg,rm32|er [rvm:t1s: evex.nds.lig.f2.0f.w0 7b /r ] AVX512,FUTURE +VCVTUSI2SD xmmreg,xmmreg,rm64|er [rvm:t1s: evex.nds.lig.f2.0f.w1 7b /r ] 
AVX512,FUTURE +VCVTUSI2SS xmmreg,xmmreg,rm32|er [rvm:t1s: evex.nds.lig.f3.0f.w0 7b /r ] AVX512,FUTURE +VCVTUSI2SS xmmreg,xmmreg,rm64|er [rvm:t1s: evex.nds.lig.f3.0f.w1 7b /r ] AVX512,FUTURE +VDIVPD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f.w1 5e /r ] AVX512,FUTURE +VDIVPS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.0f.w0 5e /r ] AVX512,FUTURE +VDIVSD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.f2.0f.w1 5e /r ] AVX512,FUTURE +VDIVSS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.f3.0f.w0 5e /r ] AVX512,FUTURE +VEXPANDPD zmmreg|mask|z,mem512 [rm:t1s: evex.512.66.0f38.w1 88 /r ] AVX512,FUTURE +VEXPANDPD zmmreg|mask|z,zmmreg [rm:t1s: evex.512.66.0f38.w1 88 /r ] AVX512,FUTURE +VEXPANDPS zmmreg|mask|z,mem512 [rm:t1s: evex.512.66.0f38.w0 88 /r ] AVX512,FUTURE +VEXPANDPS zmmreg|mask|z,zmmreg [rm:t1s: evex.512.66.0f38.w0 88 /r ] AVX512,FUTURE +VEXTRACTF32X4 mem128|mask,zmmreg,imm8 [mri:t4: evex.512.66.0f3a.w0 19 /r ib ] AVX512,FUTURE +VEXTRACTF32X4 xmmreg|mask|z,zmmreg,imm8 [mri:t4: evex.512.66.0f3a.w0 19 /r ib ] AVX512,FUTURE +VEXTRACTF64X4 mem256|mask,zmmreg,imm8 [mri:t4: evex.512.66.0f3a.w1 1b /r ib ] AVX512,FUTURE +VEXTRACTF64X4 ymmreg|mask|z,zmmreg,imm8 [mri: evex.512.66.0f3a.w1 1b /r ib ] AVX512,FUTURE +VEXTRACTI32X4 mem128|mask,zmmreg,imm8 [mri:t4: evex.512.66.0f3a.w0 39 /r ib ] AVX512,FUTURE +VEXTRACTI32X4 xmmreg|mask|z,zmmreg,imm8 [mri: evex.512.66.0f3a.w0 39 /r ib ] AVX512,FUTURE +VEXTRACTI64X4 mem256|mask,zmmreg,imm8 [mri:t4: evex.512.66.0f3a.w1 3b /r ib ] AVX512,FUTURE +VEXTRACTI64X4 ymmreg|mask|z,zmmreg,imm8 [mri: evex.512.66.0f3a.w1 3b /r ib ] AVX512,FUTURE +VEXTRACTPS rm32,xmmreg,imm8 [mri:t1s: evex.128.66.0f3a.wig 17 /r ib ] AVX512,FUTURE +VFIXUPIMMPD zmmreg|mask|z,zmmreg,zmmrm512|b64|sae,imm8 [rvmi:fv: evex.nds.512.66.0f3a.w1 54 /r ib ] AVX512,FUTURE +VFIXUPIMMPS zmmreg|mask|z,zmmreg,zmmrm512|b32|sae,imm8 [rvmi:fv: evex.nds.512.66.0f3a.w0 54 /r ib ] AVX512,FUTURE +VFIXUPIMMSD 
xmmreg|mask|z,xmmreg,xmmrm64|sae,imm8 [rvmi:t1s: evex.nds.lig.66.0f3a.w1 55 /r ib ] AVX512,FUTURE +VFIXUPIMMSS xmmreg|mask|z,xmmreg,xmmrm32|sae,imm8 [rvmi:t1s: evex.nds.lig.66.0f3a.w0 55 /r ib ] AVX512,FUTURE +VFMADD132PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 98 /r ] AVX512,FUTURE +VFMADD132PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 98 /r ] AVX512,FUTURE +VFMADD132SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 99 /r ] AVX512,FUTURE +VFMADD132SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 99 /r ] AVX512,FUTURE +VFMADD213PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 a8 /r ] AVX512,FUTURE +VFMADD213PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 a8 /r ] AVX512,FUTURE +VFMADD213SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 a9 /r ] AVX512,FUTURE +VFMADD213SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 a9 /r ] AVX512,FUTURE +VFMADD231PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 b8 /r ] AVX512,FUTURE +VFMADD231PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 b8 /r ] AVX512,FUTURE +VFMADD231SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 b9 /r ] AVX512,FUTURE +VFMADD231SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 b9 /r ] AVX512,FUTURE +VFMADDSUB132PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 96 /r ] AVX512,FUTURE +VFMADDSUB132PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 96 /r ] AVX512,FUTURE +VFMADDSUB213PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 a6 /r ] AVX512,FUTURE +VFMADDSUB213PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 a6 /r ] AVX512,FUTURE +VFMADDSUB231PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 b6 /r ] AVX512,FUTURE 
+VFMADDSUB231PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 b6 /r ] AVX512,FUTURE +VFMSUB132PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 9a /r ] AVX512,FUTURE +VFMSUB132PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 9a /r ] AVX512,FUTURE +VFMSUB132SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 9b /r ] AVX512,FUTURE +VFMSUB132SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 9b /r ] AVX512,FUTURE +VFMSUB213PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 aa /r ] AVX512,FUTURE +VFMSUB213PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 aa /r ] AVX512,FUTURE +VFMSUB213SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 ab /r ] AVX512,FUTURE +VFMSUB213SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 ab /r ] AVX512,FUTURE +VFMSUB231PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 ba /r ] AVX512,FUTURE +VFMSUB231PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 ba /r ] AVX512,FUTURE +VFMSUB231SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 bb /r ] AVX512,FUTURE +VFMSUB231SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 bb /r ] AVX512,FUTURE +VFMSUBADD132PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 97 /r ] AVX512,FUTURE +VFMSUBADD132PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 97 /r ] AVX512,FUTURE +VFMSUBADD213PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 a7 /r ] AVX512,FUTURE +VFMSUBADD213PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 a7 /r ] AVX512,FUTURE +VFMSUBADD231PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 b7 /r ] AVX512,FUTURE +VFMSUBADD231PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 b7 /r ] AVX512,FUTURE 
+VFNMADD132PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 9c /r ] AVX512,FUTURE +VFNMADD132PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 9c /r ] AVX512,FUTURE +VFNMADD132SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 9d /r ] AVX512,FUTURE +VFNMADD132SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 9d /r ] AVX512,FUTURE +VFNMADD213PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 ac /r ] AVX512,FUTURE +VFNMADD213PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 ac /r ] AVX512,FUTURE +VFNMADD213SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 ad /r ] AVX512,FUTURE +VFNMADD213SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 ad /r ] AVX512,FUTURE +VFNMADD231PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 bc /r ] AVX512,FUTURE +VFNMADD231PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 bc /r ] AVX512,FUTURE +VFNMADD231SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 bd /r ] AVX512,FUTURE +VFNMADD231SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 bd /r ] AVX512,FUTURE +VFNMSUB132PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 9e /r ] AVX512,FUTURE +VFNMSUB132PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 9e /r ] AVX512,FUTURE +VFNMSUB132SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 9f /r ] AVX512,FUTURE +VFNMSUB132SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 9f /r ] AVX512,FUTURE +VFNMSUB213PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 ae /r ] AVX512,FUTURE +VFNMSUB213PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 ae /r ] AVX512,FUTURE +VFNMSUB213SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 af /r ] AVX512,FUTURE +VFNMSUB213SS 
xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 af /r ] AVX512,FUTURE +VFNMSUB231PD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f38.w1 be /r ] AVX512,FUTURE +VFNMSUB231PS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.66.0f38.w0 be /r ] AVX512,FUTURE +VFNMSUB231SD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.66.0f38.w1 bf /r ] AVX512,FUTURE +VFNMSUB231SS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.66.0f38.w0 bf /r ] AVX512,FUTURE +VGATHERDPD zmmreg|mask,ymem64 [rm:t1s: vsiby evex.512.66.0f38.w1 92 /r ] AVX512,FUTURE +VGATHERDPS zmmreg|mask,zmem32 [rm:t1s: vsibz evex.512.66.0f38.w0 92 /r ] AVX512,FUTURE +VGATHERQPD zmmreg|mask,zmem64 [rm:t1s: vsibz evex.512.66.0f38.w1 93 /r ] AVX512,FUTURE +VGATHERQPS ymmreg|mask,zmem32 [rm:t1s: vsibz evex.512.66.0f38.w0 93 /r ] AVX512,FUTURE +VGETEXPPD zmmreg|mask|z,zmmrm512|b64|sae [rm:fv: evex.512.66.0f38.w1 42 /r ] AVX512,FUTURE +VGETEXPPS zmmreg|mask|z,zmmrm512|b32|sae [rm:fv: evex.512.66.0f38.w0 42 /r ] AVX512,FUTURE +VGETEXPSD xmmreg|mask|z,xmmreg,xmmrm64|sae [rvm:t1s: evex.nds.lig.66.0f38.w1 43 /r ] AVX512,FUTURE +VGETEXPSS xmmreg|mask|z,xmmreg,xmmrm32|sae [rvm:t1s: evex.nds.lig.66.0f38.w0 43 /r ] AVX512,FUTURE +VGETMANTPD zmmreg|mask|z,zmmrm512|b64|sae,imm8 [rmi:fv: evex.512.66.0f3a.w1 26 /r ib ] AVX512,FUTURE +VGETMANTPS zmmreg|mask|z,zmmrm512|b32|sae,imm8 [rmi:fv: evex.512.66.0f3a.w0 26 /r ib ] AVX512,FUTURE +VGETMANTSD xmmreg|mask|z,xmmreg,xmmrm64|sae,imm8 [rvmi:t1s: evex.nds.lig.66.0f3a.w1 27 /r ib ] AVX512,FUTURE +VGETMANTSS xmmreg|mask|z,xmmreg,xmmrm32|sae,imm8 [rvmi:t1s: evex.nds.lig.66.0f3a.w0 27 /r ib ] AVX512,FUTURE +VINSERTF32X4 zmmreg|mask|z,zmmreg,xmmrm128,imm8 [rvmi:t4: evex.nds.512.66.0f3a.w0 18 /r ib ] AVX512,FUTURE +VINSERTF64X4 zmmreg|mask|z,zmmreg,ymmrm256,imm8 [rvmi:t4: evex.nds.512.66.0f3a.w1 1a /r ib ] AVX512,FUTURE +VINSERTI32X4 zmmreg|mask|z,zmmreg,xmmrm128,imm8 [rvmi:t4: evex.nds.512.66.0f3a.w0 38 /r ib ] AVX512,FUTURE 
+VINSERTI64X4 zmmreg|mask|z,zmmreg,ymmrm256,imm8 [rvmi:t4: evex.nds.512.66.0f3a.w1 3a /r ib ] AVX512,FUTURE +VINSERTPS xmmreg,xmmreg,xmmrm32,imm8 [rvmi:t1s: evex.nds.128.66.0f3a.w0 21 /r ib ] AVX512,FUTURE +VMAXPD zmmreg|mask|z,zmmreg,zmmrm512|b64|sae [rvm:fv: evex.nds.512.66.0f.w1 5f /r ] AVX512,FUTURE +VMAXPS zmmreg|mask|z,zmmreg,zmmrm512|b32|sae [rvm:fv: evex.nds.512.0f.w0 5f /r ] AVX512,FUTURE +VMAXSD xmmreg|mask|z,xmmreg,xmmrm64|sae [rvm:t1s: evex.nds.lig.f2.0f.w1 5f /r ] AVX512,FUTURE +VMAXSS xmmreg|mask|z,xmmreg,xmmrm32|sae [rvm:t1s: evex.nds.lig.f3.0f.w0 5f /r ] AVX512,FUTURE +VMINPD zmmreg|mask|z,zmmreg,zmmrm512|b64|sae [rvm:fv: evex.nds.512.66.0f.w1 5d /r ] AVX512,FUTURE +VMINPS zmmreg|mask|z,zmmreg,zmmrm512|b32|sae [rvm:fv: evex.nds.512.0f.w0 5d /r ] AVX512,FUTURE +VMINSD xmmreg|mask|z,xmmreg,xmmrm64|sae [rvm:t1s: evex.nds.lig.f2.0f.w1 5d /r ] AVX512,FUTURE +VMINSS xmmreg|mask|z,xmmreg,xmmrm32|sae [rvm:t1s: evex.nds.lig.f3.0f.w0 5d /r ] AVX512,FUTURE +VMOVAPD mem512|mask,zmmreg [mr:fvm: evex.512.66.0f.w1 29 /r ] AVX512,FUTURE +VMOVAPD zmmreg|mask|z,zmmreg [mr: evex.512.66.0f.w1 29 /r ] AVX512,FUTURE +VMOVAPD zmmreg|mask|z,zmmrm512 [rm:fvm: evex.512.66.0f.w1 28 /r ] AVX512,FUTURE +VMOVAPS mem512|mask,zmmreg [mr:fvm: evex.512.0f.w0 29 /r ] AVX512,FUTURE +VMOVAPS zmmreg|mask|z,zmmreg [mr: evex.512.0f.w0 29 /r ] AVX512,FUTURE +VMOVAPS zmmreg|mask|z,zmmrm512 [rm:fvm: evex.512.0f.w0 28 /r ] AVX512,FUTURE +VMOVD rm32,xmmreg [mr:t1s: evex.128.66.0f.w0 7e /r ] AVX512,FUTURE +VMOVD xmmreg,rm32 [rm:t1s: evex.128.66.0f.w0 6e /r ] AVX512,FUTURE +VMOVDDUP zmmreg|mask|z,zmmrm512 [rm:dup: evex.512.f2.0f.w1 12 /r ] AVX512,FUTURE +VMOVDQA32 mem512|mask,zmmreg [mr:fvm: evex.512.66.0f.w0 7f /r ] AVX512,FUTURE +VMOVDQA32 zmmreg|mask|z,zmmreg [mr: evex.512.66.0f.w0 7f /r ] AVX512,FUTURE +VMOVDQA32 zmmreg|mask|z,zmmrm512 [rm:fvm: evex.512.66.0f.w0 6f /r ] AVX512,FUTURE +VMOVDQA64 mem512|mask,zmmreg [mr:fvm: evex.512.66.0f.w1 7f /r ] AVX512,FUTURE +VMOVDQA64 
zmmreg|mask|z,zmmreg [mr: evex.512.66.0f.w1 7f /r ] AVX512,FUTURE +VMOVDQA64 zmmreg|mask|z,zmmrm512 [rm:fvm: evex.512.66.0f.w1 6f /r ] AVX512,FUTURE +VMOVDQU32 mem512|mask,zmmreg [mr:fvm: evex.512.f3.0f.w0 7f /r ] AVX512,FUTURE +VMOVDQU32 zmmreg|mask|z,zmmreg [mr: evex.512.f3.0f.w0 7f /r ] AVX512,FUTURE +VMOVDQU32 zmmreg|mask|z,zmmrm512 [rm:fvm: evex.512.f3.0f.w0 6f /r ] AVX512,FUTURE +VMOVDQU64 mem512|mask,zmmreg [mr:fvm: evex.512.f3.0f.w1 7f /r ] AVX512,FUTURE +VMOVDQU64 zmmreg|mask|z,zmmreg [mr: evex.512.f3.0f.w1 7f /r ] AVX512,FUTURE +VMOVDQU64 zmmreg|mask|z,zmmrm512 [rm:fvm: evex.512.f3.0f.w1 6f /r ] AVX512,FUTURE +VMOVHLPS xmmreg,xmmreg,xmmreg [rvm: evex.nds.128.0f.w0 12 /r ] AVX512,FUTURE +VMOVHPD mem64,xmmreg [mr:t1s: evex.128.66.0f.w1 17 /r ] AVX512,FUTURE +VMOVHPD xmmreg,xmmreg,mem64 [rvm:t1s: evex.nds.128.66.0f.w1 16 /r ] AVX512,FUTURE +VMOVHPS mem64,xmmreg [mr:t2: evex.128.0f.w0 17 /r ] AVX512,FUTURE +VMOVHPS xmmreg,xmmreg,mem64 [rvm:t2: evex.nds.128.0f.w0 16 /r ] AVX512,FUTURE +VMOVLHPS xmmreg,xmmreg,xmmreg [rvm: evex.nds.128.0f.w0 16 /r ] AVX512,FUTURE +VMOVLPD mem64,xmmreg [mr:t1s: evex.128.66.0f.w1 13 /r ] AVX512,FUTURE +VMOVLPD xmmreg,xmmreg,mem64 [rvm:t1s: evex.nds.128.66.0f.w1 12 /r ] AVX512,FUTURE +VMOVLPS mem64,xmmreg [mr:t2: evex.128.0f.w0 13 /r ] AVX512,FUTURE +VMOVLPS xmmreg,xmmreg,mem64 [rvm:t2: evex.nds.128.0f.w0 12 /r ] AVX512,FUTURE +VMOVNTDQ mem512,zmmreg [mr:fvm: evex.512.66.0f.w0 e7 /r ] AVX512,FUTURE +VMOVNTDQA zmmreg,mem512 [rm:fvm: evex.512.66.0f38.w0 2a /r ] AVX512,FUTURE +VMOVNTPD mem512,zmmreg [mr:fvm: evex.512.66.0f.w1 2b /r ] AVX512,FUTURE +VMOVNTPS mem512,zmmreg [mr:fvm: evex.512.0f.w0 2b /r ] AVX512,FUTURE +VMOVQ rm64,xmmreg [mr:t1s: evex.128.66.0f.w1 7e /r ] AVX512,FUTURE +VMOVQ xmmreg,rm64 [rm:t1s: evex.128.66.0f.w1 6e /r ] AVX512,FUTURE +VMOVQ xmmreg,xmmrm64 [rm:t1s: evex.128.f3.0f.w1 7e /r ] AVX512,FUTURE +VMOVQ xmmrm64,xmmreg [mr:t1s: evex.128.66.0f.w1 d6 /r ] AVX512,FUTURE +VMOVSD mem64|mask,xmmreg [mr:t1s: 
evex.lig.f2.0f.w1 11 /r ] AVX512,FUTURE +VMOVSD xmmreg|mask|z,mem64 [rm:t1s: evex.lig.f2.0f.w1 10 /r ] AVX512,FUTURE +VMOVSD xmmreg|mask|z,xmmreg,xmmreg [mvr: evex.nds.lig.f2.0f.w1 11 /r ] AVX512,FUTURE +VMOVSD xmmreg|mask|z,xmmreg,xmmreg [rvm: evex.nds.lig.f2.0f.w1 10 /r ] AVX512,FUTURE +VMOVSHDUP zmmreg|mask|z,zmmrm512 [rm:fvm: evex.512.f3.0f.w0 16 /r ] AVX512,FUTURE +VMOVSLDUP zmmreg|mask|z,zmmrm512 [rm:fvm: evex.512.f3.0f.w0 12 /r ] AVX512,FUTURE +VMOVSS mem32|mask,xmmreg [mr:t1s: evex.lig.f3.0f.w0 11 /r ] AVX512,FUTURE +VMOVSS xmmreg|mask|z,mem32 [rm:t1s: evex.lig.f3.0f.w0 10 /r ] AVX512,FUTURE +VMOVSS xmmreg|mask|z,xmmreg,xmmreg [mvr: evex.nds.lig.f3.0f.w0 11 /r ] AVX512,FUTURE +VMOVSS xmmreg|mask|z,xmmreg,xmmreg [rvm: evex.nds.lig.f3.0f.w0 10 /r ] AVX512,FUTURE +VMOVUPD mem512|mask,zmmreg [mr:fvm: evex.512.66.0f.w1 11 /r ] AVX512,FUTURE +VMOVUPD zmmreg|mask|z,zmmreg [mr: evex.512.66.0f.w1 11 /r ] AVX512,FUTURE +VMOVUPD zmmreg|mask|z,zmmrm512 [rm:fvm: evex.512.66.0f.w1 10 /r ] AVX512,FUTURE +VMOVUPS mem512|mask,zmmreg [mr:fvm: evex.512.0f.w0 11 /r ] AVX512,FUTURE +VMOVUPS zmmreg|mask|z,zmmreg [mr: evex.512.0f.w0 11 /r ] AVX512,FUTURE +VMOVUPS zmmreg|mask|z,zmmrm512 [rm:fvm: evex.512.0f.w0 10 /r ] AVX512,FUTURE +VMULPD zmmreg|mask|z,zmmreg,zmmrm512|b64|er [rvm:fv: evex.nds.512.66.0f.w1 59 /r ] AVX512,FUTURE +VMULPS zmmreg|mask|z,zmmreg,zmmrm512|b32|er [rvm:fv: evex.nds.512.0f.w0 59 /r ] AVX512,FUTURE +VMULSD xmmreg|mask|z,xmmreg,xmmrm64|er [rvm:t1s: evex.nds.lig.f2.0f.w1 59 /r ] AVX512,FUTURE +VMULSS xmmreg|mask|z,xmmreg,xmmrm32|er [rvm:t1s: evex.nds.lig.f3.0f.w0 59 /r ] AVX512,FUTURE +VPABSD zmmreg|mask|z,zmmrm512|b32 [rm:fv: evex.512.66.0f38.w0 1e /r ] AVX512,FUTURE +VPABSQ zmmreg|mask|z,zmmrm512|b64 [rm:fv: evex.512.66.0f38.w1 1f /r ] AVX512,FUTURE +VPADDD zmmreg|mask|z,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f.w0 fe /r ] AVX512,FUTURE +VPADDQ zmmreg|mask|z,zmmreg,zmmrm512|b64 [rvm:fv: evex.nds.512.66.0f.w1 d4 /r ] AVX512,FUTURE +VPANDD 
zmmreg|mask|z,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f.w0 db /r ] AVX512,FUTURE +VPANDND zmmreg|mask|z,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f.w0 df /r ] AVX512,FUTURE +VPANDNQ zmmreg|mask|z,zmmreg,zmmrm512|b64 [rvm:fv: evex.nds.512.66.0f.w1 df /r ] AVX512,FUTURE +VPANDQ zmmreg|mask|z,zmmreg,zmmrm512|b64 [rvm:fv: evex.nds.512.66.0f.w1 db /r ] AVX512,FUTURE +VPBLENDMD zmmreg|mask|z,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f38.w0 64 /r ] AVX512,FUTURE +VPBLENDMQ zmmreg|mask|z,zmmreg,zmmrm512|b64 [rvm:fv: evex.nds.512.66.0f38.w1 64 /r ] AVX512,FUTURE +VPBROADCASTD zmmreg|mask|z,mem32 [rm:t1s: evex.512.66.0f38.w0 58 /r ] AVX512,FUTURE +VPBROADCASTD zmmreg|mask|z,reg32 [rm: evex.512.66.0f38.w0 7c /r ] AVX512,FUTURE +VPBROADCASTD zmmreg|mask|z,xmmreg [rm: evex.512.66.0f38.w0 58 /r ] AVX512,FUTURE +VPBROADCASTQ zmmreg|mask|z,mem64 [rm:t1s: evex.512.66.0f38.w1 59 /r ] AVX512,FUTURE +VPBROADCASTQ zmmreg|mask|z,reg64 [rm: evex.512.66.0f38.w1 7c /r ] AVX512,FUTURE +VPBROADCASTQ zmmreg|mask|z,xmmreg [rm: evex.512.66.0f38.w1 59 /r ] AVX512,FUTURE +VPCMPD opmaskreg|mask,zmmreg,zmmrm512|b32,imm8 [rvmi:fv: evex.nds.512.66.0f3a.w0 1f /r ib ] AVX512,FUTURE +VPCMPEQD opmaskreg|mask,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f.w0 76 /r ] AVX512,FUTURE +VPCMPEQQ opmaskreg|mask,zmmreg,zmmrm512|b64 [rvm:fv: evex.nds.512.66.0f38.w1 29 /r ] AVX512,FUTURE +VPCMPGTD opmaskreg|mask,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f.w0 66 /r ] AVX512,FUTURE +VPCMPGTQ opmaskreg|mask,zmmreg,zmmrm512|b64 [rvm:fv: evex.nds.512.66.0f38.w1 37 /r ] AVX512,FUTURE +VPCMPQ opmaskreg|mask,zmmreg,zmmrm512|b64,imm8 [rvmi:fv: evex.nds.512.66.0f3a.w1 1f /r ib ] AVX512,FUTURE +VPCMPUD opmaskreg|mask,zmmreg,zmmrm512|b32,imm8 [rvmi:fv: evex.nds.512.66.0f3a.w0 1e /r ib ] AVX512,FUTURE +VPCMPUQ opmaskreg|mask,zmmreg,zmmrm512|b64,imm8 [rvmi:fv: evex.nds.512.66.0f3a.w1 1e /r ib ] AVX512,FUTURE +VPCOMPRESSD mem512|mask,zmmreg [mr:t1s: evex.512.66.0f38.w0 8b /r ] AVX512,FUTURE +VPCOMPRESSD 
zmmreg|mask|z,zmmreg [mr: evex.512.66.0f38.w0 8b /r ] AVX512,FUTURE +VPCOMPRESSQ mem512|mask,zmmreg [mr:t1s: evex.512.66.0f38.w1 8b /r ] AVX512,FUTURE +VPCOMPRESSQ zmmreg|mask|z,zmmreg [mr: evex.512.66.0f38.w1 8b /r ] AVX512,FUTURE +VPERMD zmmreg|mask|z,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f38.w0 36 /r ] AVX512,FUTURE +VPERMI2D zmmreg|mask|z,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f38.w0 76 /r ] AVX512,FUTURE +VPERMI2PD zmmreg|mask|z,zmmreg,zmmrm512|b64 [rvm:fv: evex.nds.512.66.0f38.w1 77 /r ] AVX512,FUTURE +VPERMI2PS zmmreg|mask|z,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f38.w0 77 /r ] AVX512,FUTURE +VPERMI2Q zmmreg|mask|z,zmmreg,zmmrm512|b64 [rvm:fv: evex.nds.512.66.0f38.w1 76 /r ] AVX512,FUTURE +VPERMILPD zmmreg|mask|z,zmmreg,zmmrm512|b64 [rvm:fv: evex.nds.512.66.0f38.w1 0d /r ] AVX512,FUTURE +VPERMILPD zmmreg|mask|z,zmmrm512|b64,imm8 [rmi:fv: evex.512.66.0f3a.w1 05 /r ib ] AVX512,FUTURE +VPERMILPS zmmreg|mask|z,zmmreg,zmmrm512|b32 [rvm:fv: evex.nds.512.66.0f38.w0 0c /r ] AVX512,FUTURE +VPERMILPS zmmreg|mask|z,zmmrm512|b32,imm8 [rmi:fv: evex.512.66.0f3a.w0 04 /r ib ] AVX512,FUTURE +VPERMPD zmmreg|mask|z,zmmreg,zmmrm512|b64 ... [truncated message content] |
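For readers following the is_disp8n() hunk in the patch above, here is a minimal sketch of only its final step: testing whether an offset is a multiple of N and whether the quotient fits in a signed byte. The function name compress_disp8() is hypothetical and not part of the patch. It assumes N has already been chosen from the tuple type and is a power of two, with 0 meaning "no compression available".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: try to express `off`
 * as a Disp8*N compressed displacement.
 *
 * n is the scale factor for the instruction's tuple type and must be a
 * power of two; n == 0 means compression is not available.
 * On success the scaled value is stored in *compdisp.
 */
bool compress_disp8(int32_t off, uint8_t n, int8_t *compdisp)
{
    /* off must be an exact multiple of n (n is a power of two) */
    if (n != 0 && (off & (n - 1)) == 0) {
        int32_t scaled = off / n;

        /* the scaled displacement must fit in a signed byte */
        if (scaled >= -128 && scaled <= 127) {
            *compdisp = (int8_t)scaled;
            return true;
        }
    }

    *compdisp = 0;
    return false;
}
```

With N = 64, which is the value the patch's fv_n table gives for a full-vector tuple at 512 bits without broadcast, an offset of 128 compresses to a disp8 of 2, while an offset of 100 does not compress because it is not a multiple of 64.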
From: anonymous c. <nas...@us...> - 2013-08-06 13:14:46
|
> AVX-512 introduced new syntax using braces for decorators.

Actually, the curly-brace operand modifiers were introduced by L1OM on Larrabee. Also, K1OM used them on Xeon Phi. AVX-512 is merely the third x86 extension to use them. ;-)

> + * the order of decorators does not matter.

The order matters in terms of which one wins in case of any conflict, i.e. first vs last, warning vs error, etc. You should consider following gas behavior, simply because it effectively set the standard for this a long time ago.

> + * e.g. zmm1 {k2}{z} or zmm2 {z,k3}

You really do NOT want that comma syntax, for two reasons.

First, it poses a problem if a vendor ever introduces a future modifier with a comma in it. For example, {a,b,c,d}.

Second, dropping it will simplify parsing/tokenizing. Instead of having to bastardize the existing clean identifier handling with exceptions for curly braces and dashes, and exceptions for nested braces and multiple qualifiers, you'll end up with a rather simpler and more traditional sequence: opening curly brace, optional whitespace, a sequence of non-whitespace characters (that gets looked up against known qualifiers, in e.g. a hash table), optional whitespace, plus a closing curly brace. (Put another way, your modifications to that parsing/tokenizing code are very intrusive, and difficult to validate when it comes to corner cases.)

In terms of modifier placement you probably want to look at gas again -- I have seen code which has modifiers before an operand, and I have seen code which has them as their own operand. For example, {one} op1, {two}, op2.

I do not know if you care about supporting L1OM and K1OM eventually -- the longer you wait, the more obsolete they will be, of course :-) -- but that would come with more challenges. In particular, the regular braces which were used to enclose operands decorated with transform modifiers are really hard to get right.
|
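The tokenizing sequence suggested in this reply (opening curly brace, optional whitespace, a run of non-whitespace characters looked up against known qualifiers, optional whitespace, closing curly brace) could be sketched roughly as follows. This is only an illustration, not NASM or gas code: `scan_decorator` and the `known_decorators` table are hypothetical names, and a linear search stands in for the hash table.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical qualifier table; a real assembler would use a hash lookup. */
static const char *const known_decorators[] = {
    "k1", "k2", "k3", "k4", "k5", "k6", "k7",
    "z", "1to8", "1to16",
    "rn-sae", "rd-sae", "ru-sae", "rz-sae", "sae"
};

#define N_DECORATORS (sizeof known_decorators / sizeof known_decorators[0])

/*
 * Scan a single "{name}" group at *pp.
 * Returns the index of the name in known_decorators[] and advances *pp
 * past the closing brace, or returns -1 and leaves *pp untouched.
 */
int scan_decorator(const char **pp)
{
    const char *p, *name;
    size_t len, i;

    assert(pp != NULL && *pp != NULL);
    p = *pp;

    if (*p != '{')
        return -1;
    p++;
    while (isspace((unsigned char)*p))      /* optional whitespace */
        p++;

    name = p;                               /* run of non-whitespace chars */
    while (*p != '\0' && *p != '}' && !isspace((unsigned char)*p))
        p++;
    len = (size_t)(p - name);

    while (isspace((unsigned char)*p))      /* optional whitespace */
        p++;
    if (len == 0 || *p != '}')              /* empty or unterminated group */
        return -1;
    p++;

    for (i = 0; i < N_DECORATORS; i++) {
        if (strlen(known_decorators[i]) == len &&
            memcmp(known_decorators[i], name, len) == 0) {
            *pp = p;
            return (int)i;
        }
    }
    return -1;                              /* unknown qualifier */
}
```

With this shape, `zmm1 {k2}{z}` is just two calls in a row, while `{z,k3}` is an unknown name and gets rejected without any special comma handling.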
From: Cyrill G. <gor...@gm...> - 2013-08-06 05:43:22
|
On Mon, Aug 05, 2013 at 10:37:33PM -0700, H. Peter Anvin wrote:
> On 08/05/2013 10:34 PM, Cyrill Gorcunov wrote:
> > On Mon, Aug 05, 2013 at 08:46:18PM -0700, Jin Kyu Song wrote:
> >> AVX-512 introduced new syntax using braces for decorators.
> >> Opmask, broadcast, rounding control use this new syntax.
> >>
> >> http://software.intel.com/sites/default/files/319433-015.pdf
> >
> > Hi Jin, great job!!! I'm about to merge this, Peter?
> >
>
> Definitely, but let's stick it on an "avx512" branch until the whole
> thing is complete and usable (which hopefully should be soon.)

Sure! Jin, I've merged it to the avx512 branch, so fetch it there, thanks!
|
From: H. P. A. <hp...@zy...> - 2013-08-06 05:38:23
|
On 08/05/2013 10:34 PM, Cyrill Gorcunov wrote:
> On Mon, Aug 05, 2013 at 08:46:18PM -0700, Jin Kyu Song wrote:
>> AVX-512 introduced new syntax using braces for decorators.
>> Opmask, broadcast, rounding control use this new syntax.
>>
>> http://software.intel.com/sites/default/files/319433-015.pdf
>
> Hi Jin, great job!!! I'm about to merge this, Peter?
>

Definitely, but let's stick it on an "avx512" branch until the whole thing is complete and usable (which hopefully should be soon.)

-hpa
|
From: Cyrill G. <gor...@gm...> - 2013-08-06 05:34:48
|
On Mon, Aug 05, 2013 at 08:46:18PM -0700, Jin Kyu Song wrote:
> AVX-512 introduced new syntax using braces for decorators.
> Opmask, broadcast, rounding control use this new syntax.
>
> http://software.intel.com/sites/default/files/319433-015.pdf

Hi Jin, great job!!! I'm about to merge this, Peter?
|
From: Jin K. S. <jin...@in...> - 2013-08-06 03:47:24
|
AVX-512 introduced new syntax using braces for decorators. Opmask, broadcat, rounding control use this new syntax. http://software.intel.com/sites/default/files/319433-015.pdf Signed-off-by: Jin Kyu Song <jin...@in...> --- eval.c | 4 +++ nasm.h | 108 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- opflags.h | 21 ++++++++---- parser.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++-- regs.dat | 15 ++++++++- regs.pl | 2 +- stdscan.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++++--- tables.h | 3 +- tokens.dat | 28 ++++++++++++---- tokhash.pl | 17 +++++++--- 10 files changed, 358 insertions(+), 28 deletions(-) diff --git a/eval.c b/eval.c index 0035088..c57ff04 100644 --- a/eval.c +++ b/eval.c @@ -869,6 +869,7 @@ static expr *expr6(int critical) case TOKEN_INSN: /* Opcodes that occur here are really labels */ case TOKEN_HERE: case TOKEN_BASE: + case TOKEN_DECORATOR: begintemp(); switch (i) { case TOKEN_NUM: @@ -938,6 +939,9 @@ static expr *expr6(int critical) if (label_seg != NO_SEG) addtotemp(EXPR_SEGBASE + label_seg, 1L); break; + case TOKEN_DECORATOR: + addtotemp(EXPR_RDSAE, tokval->t_integer); + break; } i = scan(scpriv, tokval); return finishtemp(); diff --git a/nasm.h b/nasm.h index 7802d9b..fb6c6e9 100644 --- a/nasm.h +++ b/nasm.h @@ -226,6 +226,8 @@ enum token_type { /* token types, other than chars */ TOKEN_FLOATIZE, /* __floatX__ */ TOKEN_STRFUNC, /* __utf16*__, __utf32*__ */ TOKEN_IFUNC, /* __ilog2*__ */ + TOKEN_DECORATOR, /* decorators such as {...} */ + TOKEN_OPMASK, /* translated token for opmask registers */ }; enum floatize { @@ -272,6 +274,7 @@ struct tokenval { int64_t t_integer; int64_t t_inttwo; enum token_type t_type; + int8_t t_flag; }; typedef int (*scanner)(void *private_data, struct tokenval *tv); @@ -352,11 +355,14 @@ typedef expr *(*evalfunc)(scanner sc, void *scprivate, /* * Special values for expr->type. * These come after EXPR_REG_END as defined in regs.h. 
+ * Expr types : 0 ~ EXPR_REG_END, EXPR_UNKNOWN, EXPR_...., EXPR_RDSAE, + * EXPR_SEGBASE ~ EXPR_SEGBASE + SEG_ABS, ... */ #define EXPR_UNKNOWN (EXPR_REG_END+1) /* forward references */ #define EXPR_SIMPLE (EXPR_REG_END+2) #define EXPR_WRT (EXPR_REG_END+3) -#define EXPR_SEGBASE (EXPR_REG_END+4) +#define EXPR_RDSAE (EXPR_REG_END+4) +#define EXPR_SEGBASE (EXPR_REG_END+5) /* * Linked list of strings @@ -466,6 +472,14 @@ enum ccode { /* condition code names */ C_none = -1 }; +/* + * token flags + */ +#define TFLAG_BRC (1 << 0) /* valid only with braces. {1to8}, {rd-sae}, ...*/ +#define TFLAG_BRC_OPT (1 << 1) /* may or may not have braces. opmasks {k1} */ +#define TFLAG_BRC_ANY (TFLAG_BRC | TFLAG_BRC_OPT) +#define TFLAG_BRDCAST (1 << 2) /* broadcasting decorator */ + static inline uint8_t get_cond_opcode(enum ccode c) { static const uint8_t ccode_opcodes[] = { @@ -563,6 +577,7 @@ typedef struct operand { /* operand to an instruction */ int32_t wrt; /* segment base it's relative to */ int eaflags; /* special EA flags */ int opflags; /* see OPFLAG_* defines below */ + decoflags_t decoflags; /* decorator flags such as {...} */ } operand; #define OPFLAG_FORWARD 1 /* operand is a forward reference */ @@ -627,6 +642,7 @@ typedef struct insn { /* an instruction itself */ int vexreg; /* Register encoded in VEX prefix */ int vex_cm; /* Class and M field for VEX prefix */ int vex_wlp; /* W, P and L information for VEX prefix */ + int evex_rm; /* static rounding mode for AVX3 (EVEX) */ } insn; enum geninfo { GI_SWITCH }; @@ -951,6 +967,96 @@ enum special_tokens { SPECIAL_ENUM_LIMIT }; +enum decorator_tokens { + DECORATOR_ENUM_START = SPECIAL_ENUM_LIMIT, + BRC_1TO8 = DECORATOR_ENUM_START, + BRC_1TO16, + BRC_RN, + BRC_RU, + BRC_RD, + BRC_RZ, + BRC_SAE, + BRC_Z, + DECORATOR_ENUM_LIMIT +}; + +/* + * AVX512 Decorator (decoflags_t) bits distribution (counted from 0) + * 3 2 1 + * 10987654321098765432109876543210 + * | + * | word boundary + * ............................1111 opmask + * 
...........................1.... zeroing / merging + * ..........................1..... broadcast + * .........................1...... static rounding + * ........................1....... SAE + */ + +/* + * Opmask register number + * identical to EVEX.aaa + * + * Bits: 0 - 3 + */ +#define OPMASK_SHIFT (0) +#define OPMASK_BITS (4) +#define OPMASK_MASK OP_GENMASK(OPMASK_BITS, OPMASK_SHIFT) +#define GEN_OPMASK(bit) OP_GENBIT(bit, OPMASK_SHIFT) +#define VAL_OPMASK(val) OP_GENVAL(val, OPMASK_BITS, OPMASK_SHIFT) + +/* + * zeroing / merging control available + * matching to EVEX.z + * + * Bits: 4 + */ +#define Z_SHIFT (4) +#define Z_BITS (1) +#define Z_MASK OP_GENMASK(Z_BITS, Z_SHIFT) +#define GEN_Z(bit) OP_GENBIT(bit, Z_SHIFT) +#define VAL_Z(val) OP_GENVAL(val, Z_BITS, Z_SHIFT) + +/* + * broadcast - Whether this operand can be broadcasted + * + * Bits: 5 + */ +#define BRDCAST_SHIFT (5) +#define BRDCAST_BITS (1) +#define BRDCAST_MASK OP_GENMASK(BRDCAST_BITS, BRDCAST_SHIFT) +#define GEN_BRDCAST(bit) OP_GENBIT(bit, BRDCAST_SHIFT) +#define VAL_BRDCAST(val) OP_GENVAL(val, BRDCAST_BITS, BRDCAST_SHIFT) + +/* + * Whether this instruction can have a static rounding mode. + * It goes with the last simd operand because the static rounding mode + * decorator is located between the last simd operand and imm8 (if any). 
+ * + * Bits: 6 + */ +#define STATICRND_SHIFT (6) +#define STATICRND_BITS (1) +#define STATICRND_MASK OP_GENMASK(STATICRND_BITS, STATICRND_SHIFT) +#define GEN_STATICRND(bit) OP_GENBIT(bit, STATICRND_SHIFT) + +/* + * SAE(Suppress all exception) available + * + * Bits: 7 + */ +#define SAE_SHIFT (7) +#define SAE_BITS (1) +#define SAE_MASK OP_GENMASK(SAE_BITS, SAE_SHIFT) +#define GEN_SAE(bit) OP_GENBIT(bit, SAE_SHIFT) + +#define MASK OPMASK_MASK /* Opmask (k1 ~ 7) can be used */ +#define Z Z_MASK +#define B32 BRDCAST_MASK /* {1to16} : load+op instruction can broadcast when it is reg-reg operation */ +#define B64 BRDCAST_MASK /* {1to8} : There are two definitions just for conforming to SDM */ +#define ER STATICRND_MASK /* ER(Embedded Rounding) == Static rounding mode */ +#define SAE SAE_MASK /* SAE(Suppress All Exception) */ + /* * Global modes */ diff --git a/opflags.h b/opflags.h index 41fce3d..ed7f8ee 100644 --- a/opflags.h +++ b/opflags.h @@ -39,6 +39,7 @@ #define NASM_OPFLAGS_H #include "compiler.h" +#include "tables.h" /* for opflags_t and nasm_reg_flags[] */ /* * Here we define the operand types. These are implemented as bit @@ -53,10 +54,9 @@ * if and only if "operand" belongs to class type "class". */ -typedef uint64_t opflags_t; - #define OP_GENMASK(bits, shift) (((UINT64_C(1) << (bits)) - 1) << (shift)) #define OP_GENBIT(bit, shift) (UINT64_C(1) << ((shift) + (bit))) +#define OP_GENVAL(val, bits, shift) (((val) & ((UINT64_C(1) << (bits)) - 1)) << (shift)) /* * Type of operand: memory reference, register, etc. 
@@ -162,11 +162,14 @@ typedef uint64_t opflags_t; #define REG_CLASS_RM_MMX GEN_REG_CLASS(4) #define REG_CLASS_RM_XMM GEN_REG_CLASS(5) #define REG_CLASS_RM_YMM GEN_REG_CLASS(6) +#define REG_CLASS_RM_ZMM GEN_REG_CLASS(7) +#define REG_CLASS_OPMASK GEN_REG_CLASS(8) -#define is_class(class, op) (!((opflags_t)(class) & ~(opflags_t)(op))) +#define is_class(class, op) (!((opflags_t)(class) & ~(opflags_t)(op))) +#define is_reg_class(class, reg) is_class((class), nasm_reg_flags[(reg)]) -#define IS_SREG(op) is_class(REG_SREG, nasm_reg_flags[(op)]) -#define IS_FSGS(op) is_class(REG_FSGS, nasm_reg_flags[(op)]) +#define IS_SREG(op) is_reg_class(REG_SREG, (op)) +#define IS_FSGS(op) is_reg_class(REG_FSGS, (op)) /* Register classes */ #define REG_EA ( REGMEM | REGISTER) /* 'normal' reg, qualifies as EA */ @@ -186,6 +189,12 @@ typedef uint64_t opflags_t; #define RM_YMM ( REG_CLASS_RM_YMM | REGMEM) /* YMM (AVX) operand */ #define YMMREG ( REG_CLASS_RM_YMM | REGMEM | REGISTER) /* YMM (AVX) register */ #define YMM0 (GEN_SUBCLASS(1) | REG_CLASS_RM_YMM | REGMEM | REGISTER) /* YMM register zero */ +#define RM_ZMM ( REG_CLASS_RM_ZMM | REGMEM) /* ZMM (AVX512) operand */ +#define ZMMREG ( REG_CLASS_RM_ZMM | REGMEM | REGISTER) /* ZMM (AVX512) register */ +#define ZMM0 (GEN_SUBCLASS(1) | REG_CLASS_RM_ZMM | REGMEM | REGISTER) /* ZMM register zero */ +#define RM_OPMASK ( REG_CLASS_OPMASK | REGMEM) /* Opmask operand */ +#define OPMASKREG ( REG_CLASS_OPMASK | REGMEM | REGISTER) /* Opmask register */ +#define OPMASK0 (GEN_SUBCLASS(1) | REG_CLASS_OPMASK | REGMEM | REGISTER) /* Opmask register zero (k0) */ #define REG_CDT ( REG_CLASS_CDT | BITS32 | REGISTER) /* CRn, DRn and TRn */ #define REG_CREG (GEN_SUBCLASS(1) | REG_CLASS_CDT | BITS32 | REGISTER) /* CRn */ #define REG_DREG (GEN_SUBCLASS(2) | REG_CLASS_CDT | BITS32 | REGISTER) /* DRn */ @@ -232,7 +241,7 @@ typedef uint64_t opflags_t; #define YMEM (GEN_SUBCLASS(4) | MEMORY) /* 256-bit vector SIB */ /* memory which matches any type of r/m operand */ 
-#define MEMORY_ANY (MEMORY | RM_GPR | RM_MMX | RM_XMM | RM_YMM) +#define MEMORY_ANY (MEMORY | RM_GPR | RM_MMX | RM_XMM | RM_YMM | RM_ZMM) /* special immediate values */ #define UNITY (GEN_SUBCLASS(0) | IMMEDIATE) /* operand equals 1 */ diff --git a/parser.c b/parser.c index afc422a..f7139f3 100644 --- a/parser.c +++ b/parser.c @@ -193,6 +193,51 @@ static void process_size_override(insn *result, int operand) } } +/* + * when two or more decorators follow a register operand, + * consecutive decorators are parsed here. + * the order of decorators does not matter. + * e.g. zmm1 {k2}{z} or zmm2 {z,k3} + * decorator(s) are placed at the end of an operand. + */ +static bool parse_braces(decoflags_t *decoflags) +{ + int i; + bool recover = false; + + i = tokval.t_type; + do { + if (i == TOKEN_OPMASK) { + if (*decoflags & OPMASK_MASK) { + nasm_error(ERR_NONFATAL, "opmask k%lu is already set", + *decoflags & OPMASK_MASK); + *decoflags &= ~OPMASK_MASK; + } + *decoflags |= VAL_OPMASK(nasm_regvals[tokval.t_integer]); + } else if (i == TOKEN_DECORATOR) { + switch (tokval.t_integer) { + case BRC_Z: + /* + * according to AVX512 spec, only zeroing/merging decorator + * is supported with opmask + */ + *decoflags |= GEN_Z(0); + break; + } + } else if (i == ',' || i == TOKEN_EOS){ + break; + } else { + nasm_error(ERR_NONFATAL, "only a series of valid decorators" + " expected"); + recover = true; + break; + } + i = stdscan(NULL, &tokval); + } while(1); + + return recover; +} + insn *parse_line(int pass, char *buffer, insn *result, ldfunc ldef) { bool insn_is_label = false; @@ -557,10 +602,12 @@ is_expression: int mref; /* is this going to be a memory ref? */ int bracket; /* is it a [] mref, or a & mref? 
*/ int setsize = 0; + decoflags_t brace_flags = 0; /* flags for decorators in braces */ result->oprs[operand].disp_size = 0; /* have to zero this whatever */ result->oprs[operand].eaflags = 0; /* and this */ result->oprs[operand].opflags = 0; + result->oprs[operand].decoflags = 0; i = stdscan(NULL, &tokval); if (i == TOKEN_EOS) @@ -702,17 +749,37 @@ is_expression: recover = true; } else { /* we got the required ] */ i = stdscan(NULL, &tokval); + if (i == TOKEN_DECORATOR) { + /* + * according to AVX512 spec, only broacast decorator is + * expected for memory reference operands + */ + if (tokval.t_flag & TFLAG_BRDCAST) { + brace_flags |= GEN_BRDCAST(0); + i = stdscan(NULL, &tokval); + } else { + nasm_error(ERR_NONFATAL, "broadcast decorator" + "expected inside braces"); + recover = true; + } + } + if (i != 0 && i != ',') { nasm_error(ERR_NONFATAL, "comma or end of line expected"); recover = true; } } } else { /* immediate operand */ - if (i != 0 && i != ',' && i != ':') { - nasm_error(ERR_NONFATAL, "comma, colon or end of line expected"); + if (i != 0 && i != ',' && i != ':' && + i != TOKEN_DECORATOR && i != TOKEN_OPMASK) { + nasm_error(ERR_NONFATAL, "comma, colon, decorator or end of " + "line expected after operand"); recover = true; } else if (i == ':') { result->oprs[operand].type |= COLON; + } else if (i == TOKEN_DECORATOR || i == TOKEN_OPMASK) { + /* parse opmask (and zeroing) after an operand */ + recover = parse_braces(&brace_flags); } } if (recover) { @@ -856,6 +923,7 @@ is_expression: result->oprs[operand].indexreg = i; result->oprs[operand].scale = s; result->oprs[operand].offset = o; + result->oprs[operand].decoflags |= brace_flags; } else { /* it's not a memory reference */ if (is_just_unknown(value)) { /* it's immediate but unknown */ result->oprs[operand].type |= IMMEDIATE; @@ -891,6 +959,27 @@ is_expression: result->oprs[operand].type |= SDWORD; } } + } else if(value->type == EXPR_RDSAE) { + /* + * it's not an operand but a rounding or SAE decorator. 
+ * put the decorator information in the (opflag_t) type field + * of previous operand. + */ + operand --; + switch (value->value) { + case BRC_RN: + case BRC_RU: + case BRC_RD: + case BRC_RZ: + case BRC_SAE: + result->oprs[operand].decoflags |= + (value->value == BRC_SAE ? SAE : ER); + result->evex_rm = value->value; + break; + default: + nasm_error(ERR_NONFATAL, "invalid decorator"); + break; + } } else { /* it's a register */ opflags_t rs; @@ -923,6 +1012,7 @@ is_expression: result->oprs[operand].type &= TO; result->oprs[operand].type |= REGISTER; result->oprs[operand].type |= nasm_reg_flags[value->type]; + result->oprs[operand].decoflags |= brace_flags; result->oprs[operand].basereg = value->type; if (rs && (result->oprs[operand].type & SIZE_MASK) != rs) diff --git a/regs.dat b/regs.dat index 57cef6a..742b69d 100644 --- a/regs.dat +++ b/regs.dat @@ -36,12 +36,17 @@ # # The columns are: # -# register name, assembler class, disassembler class(es), x86 register number +# register name, assembler class, disassembler class(es), x86 register number[, token flag] # # If the register name ends in two numbers separated by a dash, then it is # repeated as many times as indicated, and the register number is # updated with it. # +# If 'token flag' is present, this value will be assigned to tokflag field in +# 'struct tokendata tokendata[]' table. Token flag can be used for specifying +# special usage of corresponding register. E.g. opmask registers can be either +# enclosed by curly braces or standalone operand depending on the usage. 
+# # General-purpose registers al REG_AL reg8,reg8_rex 0 @@ -117,3 +122,11 @@ xmm1-15 XMMREG xmmreg 1 # AVX registers ymm0 YMM0 ymmreg 0 ymm1-15 YMMREG ymmreg 1 + +# AVX3 registers +zmm0 ZMM0 zmmreg 0 +zmm1-31 ZMMREG zmmreg 1 + +# Opmask registers +k0 OPMASK0 opmaskreg 0 +k1-7 OPMASKREG opmaskreg 1 TFLAG_BRC_OPT diff --git a/regs.pl b/regs.pl index 82c4829..52e5ca3 100755 --- a/regs.pl +++ b/regs.pl @@ -48,7 +48,7 @@ sub process_line($) { my($line) = @_; my @v; - if ( $line !~ /^\s*(\S+)\s*(\S+)\s*(\S+)\s*([0-9]+)$/i ) { + if ( $line !~ /^\s*(\S+)\s*(\S+)\s*(\S+)\s*([0-9]+)\s*(\S*)/i ) { die "regs.dat:$nline: invalid input\n"; } $reg = $1; diff --git a/stdscan.c b/stdscan.c index b7d8000..b5e389d 100644 --- a/stdscan.c +++ b/stdscan.c @@ -53,6 +53,8 @@ static char *stdscan_bufptr = NULL; static char **stdscan_tempstorage = NULL; static int stdscan_tempsize = 0, stdscan_templen = 0; +static int brace = 0; /* nested brace counter */ +static bool brace_opened = false; /* if brace is just opened */ #define STDSCAN_TEMP_DELTA 256 void stdscan_set(char *str) @@ -105,6 +107,40 @@ static char *stdscan_copy(char *p, int len) return text; } +/* + * a token is enclosed with braces. proper token type will be assigned + * accordingly with the token flag. + * a closing brace is treated as an ending character of corresponding token. 
+ */ +static int stdscan_handle_brace(struct tokenval *tv) +{ + if (!(tv->t_flag & TFLAG_BRC_ANY)) { + /* invalid token is put inside braces */ + nasm_error(ERR_NONFATAL, + "%s is not a valid decorator with braces", tv->t_charptr); + tv->t_type = TOKEN_INVALID; + } else if (tv->t_flag & TFLAG_BRC_OPT) { + if (is_reg_class(OPMASKREG, tv->t_integer)) { + /* within braces, opmask register is now used as a mask */ + tv->t_type = TOKEN_OPMASK; + } + } + + stdscan_bufptr = nasm_skip_spaces(stdscan_bufptr); + + if (stdscan_bufptr[0] == '}') { + stdscan_bufptr ++; /* skip the closing brace */ + brace --; + } else if (stdscan_bufptr[0] != ',') { + /* treat {foo,bar} as {foo}{bar} + * by regarding ',' as a mere separator between decorators + */ + nasm_error(ERR_NONFATAL, "closing brace expected"); + tv->t_type = TOKEN_INVALID; + } + return tv->t_type; +} + int stdscan(void *private_data, struct tokenval *tv) { char ourcopy[MAX_KEYWORD + 1], *r, *s; @@ -112,14 +148,22 @@ int stdscan(void *private_data, struct tokenval *tv) (void)private_data; /* Don't warn that this parameter is unused */ stdscan_bufptr = nasm_skip_spaces(stdscan_bufptr); - if (!*stdscan_bufptr) + if (!*stdscan_bufptr) { + /* nested brace shouldn't affect following lines */ + brace = 0; return tv->t_type = TOKEN_EOS; + } /* we have a token; either an id, a number or a char */ if (isidstart(*stdscan_bufptr) || - (*stdscan_bufptr == '$' && isidstart(stdscan_bufptr[1]))) { + (*stdscan_bufptr == '$' && isidstart(stdscan_bufptr[1])) || + (brace && isidchar(*stdscan_bufptr))) { /* because of {1to8} */ /* now we've got an identifier */ bool is_sym = false; + int token_type; + + /* opening brace is followed by any letter */ + brace_opened = false; if (*stdscan_bufptr == '$') { is_sym = true; @@ -128,7 +172,8 @@ int stdscan(void *private_data, struct tokenval *tv) r = stdscan_bufptr++; /* read the entire buffer to advance the buffer pointer but... 
*/ - while (isidchar(*stdscan_bufptr)) + /* {rn-sae}, {rd-sae}, {ru-sae}, {rz-sae} contain '-' in tokens. */ + while (isidchar(*stdscan_bufptr) || (brace && *stdscan_bufptr == '-')) stdscan_bufptr++; /* ... copy only up to IDLEN_MAX-1 characters */ @@ -143,7 +188,19 @@ int stdscan(void *private_data, struct tokenval *tv) *r = '\0'; /* right, so we have an identifier sitting in temp storage. now, * is it actually a register or instruction name, or what? */ - return nasm_token_hash(ourcopy, tv); + token_type = nasm_token_hash(ourcopy, tv); + + if (likely(!brace)) { + if (likely(!(tv->t_flag & TFLAG_BRC))) { + /* most of the tokens fall into this case */ + return token_type; + } else { + return tv->t_type = TOKEN_ID; + } + } else { + /* handle tokens inside braces */ + return stdscan_handle_brace(tv); + } } else if (*stdscan_bufptr == '$' && !isnumchar(stdscan_bufptr[1])) { /* * It's a $ sign with no following hex number; this must @@ -267,6 +324,35 @@ int stdscan(void *private_data, struct tokenval *tv) } else if (stdscan_bufptr[0] == '|' && stdscan_bufptr[1] == '|') { stdscan_bufptr += 2; return tv->t_type = TOKEN_DBL_OR; + } else if (stdscan_bufptr[0] == '{') { + stdscan_bufptr ++; /* skip the opening brace */ + brace ++; /* in case of nested braces */ + brace_opened = true; /* brace is just opened */ + return stdscan(private_data, tv); + } else if (stdscan_bufptr[0] == ',' && brace) { + /* + * a comma inside braces should be treated just as a separator. + * this is almost same as an opening brace except increasing counter. 
+ */ + stdscan_bufptr ++; + brace_opened = true; /* brace is just opened */ + return stdscan(private_data, tv); + } else if (stdscan_bufptr[0] == '}') { + stdscan_bufptr ++; /* skip the closing brace */ + if (brace) { + /* unhandled nested closing brace */ + brace --; + /* if brace is closed without any content in it */ + if (brace_opened) { + brace_opened = false; + nasm_error(ERR_NONFATAL, "nothing inside braces"); + } + return stdscan(private_data, tv); + } else { + /* redundant closing brace */ + return tv->t_type = TOKEN_INVALID; + } + return stdscan(private_data, tv); } else /* just an ordinary char */ return tv->t_type = (uint8_t)(*stdscan_bufptr++); } diff --git a/tables.h b/tables.h index e6f84cb..d0db3b3 100644 --- a/tables.h +++ b/tables.h @@ -43,7 +43,6 @@ #include "compiler.h" #include <inttypes.h> #include "insnsi.h" /* For enum opcode */ -#include "opflags.h" /* For opflags_t */ /* --- From standard.mac via macros.pl: --- */ @@ -62,6 +61,8 @@ extern const char * const nasm_insn_names[]; /* regs.c */ extern const char * const nasm_reg_names[]; /* regflags.c */ +typedef uint64_t opflags_t; +typedef uint8_t decoflags_t; extern const opflags_t nasm_reg_flags[]; /* regvals.c */ extern const int nasm_regvals[]; diff --git a/tokens.dat b/tokens.dat index c2df469..1a00e3d 100644 --- a/tokens.dat +++ b/tokens.dat @@ -35,7 +35,7 @@ # Tokens other than instructions and registers # -% TOKEN_PREFIX, 0, P_* +% TOKEN_PREFIX, 0, 0, P_* a16 a32 a64 @@ -55,7 +55,7 @@ wait xacquire xrelease -% TOKEN_SPECIAL, 0, S_* +% TOKEN_SPECIAL, 0, 0, S_* abs byte dword @@ -73,13 +73,13 @@ tword word yword -% TOKEN_FLOAT, 0, 0 +% TOKEN_FLOAT, 0, 0, 0 __infinity__ __nan__ __qnan__ __snan__ -% TOKEN_FLOATIZE, 0, FLOAT_{__float*__} +% TOKEN_FLOATIZE, 0, 0, FLOAT_{__float*__} __float8__ __float16__ __float32__ @@ -89,7 +89,7 @@ __float80e__ __float128l__ __float128h__ -% TOKEN_STRFUNC, 0, STRFUNC_{__*__} +% TOKEN_STRFUNC, 0, 0, STRFUNC_{__*__} __utf16__ __utf16le__ __utf16be__ @@ 
-97,12 +97,26 @@ __utf32__ __utf32le__ __utf32be__ -% TOKEN_IFUNC, 0, IFUNC_{__*__} +% TOKEN_IFUNC, 0, 0, IFUNC_{__*__} __ilog2e__ __ilog2w__ __ilog2f__ __ilog2c__ -% TOKEN_*, 0, 0 +% TOKEN_*, 0, 0, 0 seg wrt + +% TOKEN_DECORATOR, 0, TFLAG_BRC | TFLAG_BRDCAST , BRC_1TO{1to*} +1to8 +1to16 + +% TOKEN_DECORATOR, 0, TFLAG_BRC, BRC_{*-sae} +rn-sae +rd-sae +ru-sae +rz-sae + +% TOKEN_DECORATOR, 0, TFLAG_BRC, BRC_* +sae +z diff --git a/tokhash.pl b/tokhash.pl index 6c05802..4ea387d 100755 --- a/tokhash.pl +++ b/tokhash.pl @@ -65,14 +65,14 @@ while (defined($line = <ID>)) { # Single instruction token if (!defined($tokens{$token})) { $tokens{$token} = scalar @tokendata; - push(@tokendata, "\"${token}\", TOKEN_INSN, C_none, I_${insn}"); + push(@tokendata, "\"${token}\", TOKEN_INSN, C_none, 0, I_${insn}"); } } else { # Conditional instruction foreach $cc (@conditions) { if (!defined($tokens{$token.$cc})) { $tokens{$token.$cc} = scalar @tokendata; - push(@tokendata, "\"${token}${cc}\", TOKEN_INSN, C_\U$cc\E, I_${insn}"); + push(@tokendata, "\"${token}${cc}\", TOKEN_INSN, C_\U$cc\E, 0, I_${insn}"); } } } @@ -85,8 +85,9 @@ close(ID); # open(RD, "< ${regs_dat}") or die "$0: cannot open $regs_dat: $!\n"; while (defined($line = <RD>)) { - if ($line =~ /^([a-z0-9_-]+)\s/) { + if ($line =~ /^([a-z0-9_-]+)\s*\S+\s*\S+\s*[0-9]+\s*(\S*)/) { $reg = $1; + $reg_flag = $2; if ($reg =~ /^(.*[^0-9])([0-9]+)\-([0-9]+)(|[^0-9].*)$/) { $nregs = $3-$2+1; @@ -104,7 +105,11 @@ while (defined($line = <RD>)) { die "Duplicate definition: $reg\n"; } $tokens{$reg} = scalar @tokendata; - push(@tokendata, "\"${reg}\", TOKEN_REG, 0, R_\U${reg}\E"); + if ($reg_flag eq '') { + push(@tokendata, "\"${reg}\", TOKEN_REG, 0, 0, R_\U${reg}\E"); + } else { + push(@tokendata, "\"${reg}\", TOKEN_REG, 0, ${reg_flag}, R_\U${reg}\E"); + } if (defined($reg_prefix)) { $reg_nr++; @@ -214,7 +219,8 @@ if ($output eq 'h') { print "struct tokendata {\n"; print " const char *string;\n"; print " int16_t tokentype;\n"; - print " 
int16_t aux;\n"; + print " int8_t aux;\n"; + print " int8_t tokflag;\n"; print " int32_t num;\n"; print "};\n"; print "\n"; @@ -270,6 +276,7 @@ if ($output eq 'h') { print "\n"; print " tv->t_integer = data->num;\n"; print " tv->t_inttwo = data->aux;\n"; + print " tv->t_flag = data->tokflag;\n"; print " return tv->t_type = data->tokentype;\n"; print "}\n"; } -- 1.7.9.5 |
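The decoflags_t bit layout described in the patch above (opmask number in bits 0-3, zeroing in bit 4) can be tried out in isolation. The macros below are copied from the patch; the three helper functions are hypothetical and exist only to show how the fields pack and unpack, so treat this as a sketch rather than as how NASM itself uses the flags.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t decoflags_t;

/* Generator macros as they appear in the patch (opflags.h) */
#define OP_GENMASK(bits, shift)     (((UINT64_C(1) << (bits)) - 1) << (shift))
#define OP_GENBIT(bit, shift)       (UINT64_C(1) << ((shift) + (bit)))
#define OP_GENVAL(val, bits, shift) (((val) & ((UINT64_C(1) << (bits)) - 1)) << (shift))

/* Field layout as in the patch (nasm.h): bits 0-3 opmask, bit 4 zeroing */
#define OPMASK_SHIFT    (0)
#define OPMASK_BITS     (4)
#define OPMASK_MASK     OP_GENMASK(OPMASK_BITS, OPMASK_SHIFT)
#define VAL_OPMASK(val) OP_GENVAL(val, OPMASK_BITS, OPMASK_SHIFT)
#define Z_SHIFT         (4)
#define Z_BITS          (1)
#define Z_MASK          OP_GENMASK(Z_BITS, Z_SHIFT)
#define GEN_Z(bit)      OP_GENBIT(bit, Z_SHIFT)

/* Hypothetical helper: build the flags for "{kN}" with an optional "{z}". */
decoflags_t make_decoflags(unsigned opmask_reg, int zeroing)
{
    decoflags_t flags = 0;

    assert(opmask_reg <= 7);                    /* k0..k7 */
    flags |= (decoflags_t)VAL_OPMASK(opmask_reg);
    if (zeroing)
        flags |= (decoflags_t)GEN_Z(0);
    return flags;
}

/* Hypothetical accessor: opmask register number stored in the flags. */
unsigned decoflags_opmask(decoflags_t flags)
{
    return (unsigned)(flags & OPMASK_MASK);
}

/* Hypothetical accessor: non-zero if the zeroing bit is set. */
int decoflags_zeroing(decoflags_t flags)
{
    return (flags & Z_MASK) != 0;
}
```

Under these assumptions, `zmm1 {k2}{z}` would come out as 0x12: the low nibble holds the opmask number and bit 4 marks zeroing.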
From: H. P. A. <hp...@zy...> - 2013-08-05 20:56:12
|
On 08/04/2013 02:34 PM, Frank Kotler wrote:
> Sorry to bother you guys again, but I wonder if anyone has time to look
> at this:
> http://forum.nasm.us/index.php?topic=1691.new#new
>
> In short, "-f macho64" is generating an error about 32-bit addresses.
> "-f elf64" and "-f win64" seem to work. I can't see anything wrong with
> the guy's code. Adding "bits 64" or "use64" as a section attribute seems
> not to help... Help!
>
> Best,
> Frank
>

default rel
global _start
_start:
    mov al,'*'
    mov byte [Msg+rsi],al ; ERROR

The problem is that Msg here isn't PC-relative since it has an index. This is presumably illegal in MachO64.

Instead he should use:

    lea rdi,[Msg]
    mov [rdi+rsi],al

Note lea, not mov...

-hpa
|
From: Frank K. <fbk...@my...> - 2013-08-04 22:58:25
|
Sorry to bother you guys again, but I wonder if anyone has time to look at this:

http://forum.nasm.us/index.php?topic=1691.new#new

In short, "-f macho64" is generating an error about 32-bit addresses. "-f elf64" and "-f win64" seem to work. I can't see anything wrong with the guy's code. Adding "bits 64" or "use64" as a section attribute seems not to help... Help!

Best,
Frank
|
From: anonymous c. <nas...@us...> - 2013-07-17 04:36:38
|
The original goal of the "interminable macro recursion" message in expand_smacro() was to cope with a true case of smacro recursion. Here is a minimal (though meaningless) test case:

%define hang h %+ a %+ n %+ g
hang

The message also triggers for tlines with lots of tokens because its code resides inside the tline loop. Given sufficient memory and time, a lengthy tline tends to complete just fine -- case in point: a test case with a little over a million tokens completes instantaneously on a modern system with a multi-GHz processor.

By contrast, true recursion will eat all memory and then fail in e.g. nasm_malloc() -- and with lots of memory it is going to take quite some time to get there. This is what the message was meant for. Also, it was meant to be a suppressible warning rather than an error, so that code wouldn't be precluded from intentional recursion.
|
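The distinction drawn in this message, between a counter that ticks on every token of a long line and one that ticks only when something is actually expanded, can be illustrated with a toy model. This is a sketch under stated assumptions, not the preproc.c code: the token representation, `TOY_DEADMAN_LIMIT` and both functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DEADMAN_LIMIT 1000   /* hypothetical; kept small for illustration */

/*
 * Toy token model: each array entry says how many times that token must be
 * expanded before it turns into plain text. A negative value stands for a
 * macro that expands to itself forever (true recursion).
 */

/* One counter for the whole line: every token costs a tick. */
int scan_ticking_per_token(const int *expansions, size_t ntokens)
{
    int deadman = TOY_DEADMAN_LIMIT;
    size_t i;

    assert(expansions != NULL || ntokens == 0);
    for (i = 0; i < ntokens; i++) {
        int left = expansions[i];

        if (!--deadman)
            return -1;          /* trips on long lines, macros or not */
        while (left != 0) {
            if (!--deadman)
                return -1;
            if (left > 0)
                left--;
        }
    }
    return 0;
}

/* Counter ticks only on expansions, with a fresh budget per token. */
int scan_ticking_per_expansion(const int *expansions, size_t ntokens)
{
    size_t i;

    assert(expansions != NULL || ntokens == 0);
    for (i = 0; i < ntokens; i++) {
        int deadman = TOY_DEADMAN_LIMIT;
        int left = expansions[i];

        while (left != 0) {
            if (!--deadman)
                return -1;      /* only runaway expansion trips this */
            if (left > 0)
                left--;
        }
    }
    return 0;
}
```

In this model a 5000-token line of plain text fails the first check and passes the second, while a self-expanding token fails both.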
From: Cyrill G. <gor...@gm...> - 2013-07-16 04:29:53
|
On Mon, Jul 15, 2013 at 06:33:35PM -0400, Frank Kotler wrote:
> Hi list,
>
> Since you're looking at bugs, Cyrill, let me run this one by you.
> This came to my attention from a question at "Stack Overflow" (why
> would anyone call themselves that???). A guy was getting an
> "interminable macro expansion" error message. He was trying to
> include an insanely large (IMO) lookup table. I think the actual
> problem was that he was putting everything on one insanely long
> (IMO) line. I'll attach the Python script he was using. I don't know
> Python, but was planning to try to "hack" the script to use multiple
> lines but, well... I haven't gotten around to it. If anyone knows
> Python, I imagine it's easy. I have confirmed that an include file
> of similar size, but on multiple lines works fine. I have also
> confirmed that we can assemble his file by increasing the value of
> DEADMAN_LIMIT in preproc.c to 1<<31. I don't consider it a "bug"
> that needs to be "fixed", but I promised him I'd discuss it with
> y'all. Any thoughts?

Hi Frank! I agree with Ed that it's nasm abuse and the lines should be split.
|
From: Ed B. <be...@mi...> - 2013-07-16 03:01:56
|
Frank Kotler wrote:
> works fine. I have also confirmed that we can assemble his file by
> increasing the value of DEADMAN_LIMIT in preproc.c to 1<<31. I don't
> consider it a "bug" that needs to be "fixed", but I promised him I'd
> discuss it with y'all. Any thoughts?

Definitely nasm abuse. Tell the poster to break up the lines into human-sized chunks. A simple way to do this is to replace the ln0_10_2() function with the following enhanced version (the four new lines are marked with # new line):

import math

def ieee754(x):
    import struct
    return hex(struct.unpack('>l', struct.pack('>f', x))[0])

def ln(x, n):
    ln_x = 0
    for k in range(1, n):
        k = float(k)
        ln_x += float((1/(2*k-1)) * (((x-1)/(x+1))**(2*k-1)))
    return float(2*ln_x)

def ln0_10_2(i):
    start = 0
    step = 1
    stop = 10**i
    n = 0 # new line
    with open('log_0x10.inc','w') as file:
        file.write('log2_table dd ')
    while (start != stop):
        start += step
        with open('log_0x10.inc','a') as file:
            if (n & 3) == 3: # new line
                file.write("\n dd ") # new line
            n += 1 # new line
            file.write('%s, ' % ieee754(math.log(start,2)))

ln0_10_2(6)
print(ieee754(math.log(10**4,2)))

'''
ln(x) = ln(a * 10 ^ i)
x = a * 10 ^ i
a = x / 10 ^ i
0.0005 =
'''
|
From: Frank K. <fbk...@my...> - 2013-07-16 00:03:29
|
Hi list,

Since you're looking at bugs, Cyrill, let me run this one by you. This came to my attention from a question at "Stack Overflow" (why would anyone call themselves that???). A guy was getting an "interminable macro expansion" error message. He was trying to include an insanely large (IMO) lookup table. I think the actual problem was that he was putting everything on one insanely long (IMO) line.

I'll attach the Python script he was using. I don't know Python, but was planning to try to "hack" the script to use multiple lines but, well... I haven't gotten around to it. If anyone knows Python, I imagine it's easy.

I have confirmed that an include file of similar size, but on multiple lines works fine. I have also confirmed that we can assemble his file by increasing the value of DEADMAN_LIMIT in preproc.c to 1<<31. I don't consider it a "bug" that needs to be "fixed", but I promised him I'd discuss it with y'all. Any thoughts?

Best,
Frank
|
From: Cyrill G. <gor...@gm...> - 2013-07-15 21:35:58
|
We know that P_none = 0, so instead of using a for() loop, assign them all in one memset call.

Signed-off-by: Cyrill Gorcunov <gor...@gm...>
---

Peter, was there some reason for not using memset here before? I'm working on a bug atm and found this code snippet meanwhile, which I think is worth simplifying.

 parser.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/parser.c b/parser.c
index 99bbc25..a71dd33 100644
--- a/parser.c
+++ b/parser.c
@@ -201,7 +201,6 @@ insn *parse_line(int pass, char *buffer, insn *result, ldfunc ldef)
     int critical;
     bool first;
     bool recover;
-    int j;
 
 restart_parse:
     first = true;
@@ -261,8 +260,8 @@ restart_parse:
         return result;
     }
 
-    for (j = 0; j < MAXPREFIX; j++)
-        result->prefixes[j] = P_none;
+    nasm_build_assert(P_none != 0);
+    memset(result->prefixes, P_none, sizeof(result->prefixes));
     result->times = 1L;
 
     while (i == TOKEN_PREFIX ||
-- 
1.8.1.4
|
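The replacement in this patch relies on P_none being zero: memset() fills bytes, not array elements, so it reproduces the loop only when the fill value is 0 (or every byte of an element happens to be the same). A minimal sketch of that point with a compile-time guard follows; the enum values, the `MAXPREFIX` value and `struct toy_insn` are illustrative stand-ins, not the NASM definitions.

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>

enum toy_prefix { P_none = 0, P_lock, P_rep };  /* stand-in values */
#define MAXPREFIX 7                             /* stand-in size */

struct toy_insn {
    int prefixes[MAXPREFIX];
};

/* Refuse to compile if the memset shortcut would stop being valid. */
static_assert(P_none == 0, "memset-based reset requires P_none == 0");

/* Loop form: correct for any value of P_none. */
void reset_prefixes_loop(struct toy_insn *ins)
{
    int j;

    for (j = 0; j < MAXPREFIX; j++)
        ins->prefixes[j] = P_none;
}

/* memset form: fills bytes, so it matches the loop only because P_none is 0. */
void reset_prefixes_memset(struct toy_insn *ins)
{
    memset(ins->prefixes, P_none, sizeof(ins->prefixes));
}
```

The `nasm_build_assert(P_none != 0)` line in the patch reads as the same kind of guard only if that macro breaks the build when its argument is true; that reading is an assumption here.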
From: Cyrill G. <gor...@gm...> - 2013-07-13 15:56:30
|
On Sat, Jul 13, 2013 at 08:40:35AM -0700, H. Peter Anvin wrote:
> Sorry everyone, but I am trying to figure out if there is a problem with
> the nasm-devel list.
>

:-)
|
From: H. P. A. <hp...@zy...> - 2013-07-13 15:40:58
|
Sorry everyone, but I am trying to figure out if there is a problem with the nasm-devel list.

-hpa
|