From: Joe Forster/S. <st...@c6...> - 2017-12-29 10:44:08
I added to NASM:
- 8086-friendliness;
- memory models and languages (far/near procedures, parameter passing);
- procedures (arguments, local variables, used registers);
- smart instructions;
- typed variables/arguments/labels;
mostly taken from Turbo Assembler with as few changes as possible.

Also:
- package for standard integer types;
- rounded out already existing functionality with what seemed logical.

Chapters below:
- Fixes: Bugs I found in the meantime.
- Changes: Added functionality.
- Technical: List of actual changes in each file.

The changes were originally done to 2.13.01 but I have now reapplied them to 2.13.02. You can also find the whole source package, including the diff file quoted below, at http://sta.c64.org/shared/nasm-2.13.02-sta.zip .

Fixes

- x86/disp8.c:
  - evex_* variables changed from bool to int type (compiler warning).
- asm/assemble.c:
  - matches(): Return "operand size mismatch" error only if instruction specifies size matching.
- asm/nasm.c:
  - assemble_file(): At the beginning of each pass, the previous global label is (re)initialized to empty, to clear the one left over from the previous pass.
- include/iflag.h:
  - iflag_ffs(): Reversed scan direction. (It's supposed to find the highest bit set in the highest field, not the lowest field.)
Changes

- New smart instructions:
  - bswap replacement for 386:
      bswap eREGx
      ->
      xchg REGl, REGh
      ror eREGx, 16
      xchg REGl, REGh
  - call replacement for far target in the same segment in 16/32-bit mode:
      call far LABEL
      ->
      push cs
      call near LABEL
  - enter (without operands) for 186:
      enter
      ->
      enter 0,0
  - enter replacement for 8086:
      enter[ 0,0]
      ->
      push bp
      mov bp, sp
    (or)
      enter IMM8/IMM16,0
      ->
      push bp
      mov bp, sp
      sub sp, IMM8/IMM16
  - jcxz/jecxz/jrcxz for near target:
      jSIZEcxz LABEL
      ->
      jSIZEcxz .ZERO
      jmp short .NONZERO
      .ZERO:
      jmp near LABEL
      .NONZERO:
  - leave replacement for 8086:
      leave
      ->
      mov sp, bp
      pop bp
  - loop/loope/loopne/loopnz/loopz for near target:
      loopCC LABEL
      ->
      loopCC .NONZERO
      jmp short .ZERO
      .NONZERO:
      jmp near LABEL
      .ZERO:
  - push replacement for 8086 for constant source:
      push IMM16
      ->
      push bp
      mov bp, sp
      mov word [bp][2], IMM16
      pop bp
    (or)
      push IMM32
      ->
      push bp
      push bp
      push bp
      mov bp, sp
      mov word [bp][4], word (IMM32 >> 16   )
      mov word [bp][2], word (IMM32 & 0FFFFh)
      pop bp
    (or)
      push IMM64
      ->
      add sp, 8
      mov bp, sp
      mov word [bp][8], word ( IMM64 >> 48           )
      mov word [bp][6], word ((IMM64 >> 32) & 0FFFFh)
      mov word [bp][4], word ((IMM64 >> 16) & 0FFFFh)
      mov word [bp][2], word ( IMM64        & 0FFFFh)
      pop bp
  - push replacement for 8086 for 32/64-bit memory source:
      push dword [ADDR]
      ->
      push word [ADDR][2]
      push word [ADDR]
    (or)
      push qword [ADDR]
      ->
      push word [ADDR][6]
      push word [ADDR][4]
      push word [ADDR][2]
      push word [ADDR]
  - push replacement for non-x86_64 for 64-bit constant source:
      push IMM64
      ->
      push IMM64 >> 32
      push IMM64 & 0FFFFFFFFh
  - push replacement for non-x86_64 for 64-bit memory source:
      push qword [ADDR]
      ->
      push dword [ADDR][4]
      push dword [ADDR]
  - push replacement for 32-bit constant source in 64-bit mode:
      push IMM32
      ->
      push word (IMM32 >> 16   )
      push word (IMM32 & 0FFFFh)
    (or)
      push dword [ADDR]
      ->
      push word [ADDR][2]
      push word [ADDR]
  - rcl/rcr/rol/ror/sal/sar/shl/shr replacement for 8086 for shift counts greater than 1:
      OPCODE OPERAND,N
      ->
      N times OPCODE OPERAND,1
  - ret inside %proc:
      ret
-> retn/retf [IMM8/IMM16] - New instruction aliases for forcing 16-32-bit operand size: - enterw/enterd - leavew/leaved - loopwCC/loopdCC - New operators: - |?: Conditional. (Note the pipe character in front of the question mark! A lone question mark is a valid identifier.) - Syntax: COND |? TRUE : FALSE - Result: TRUE, if COND is true; FALSE otherwise. - Precedence: Below ||, above wrt. - New preprocessor macros: - %ifconst: Check whether value is a constant. - Syntax: %[el]if[n]const ID [...] %endif - Result: Conditional block is included, if ID is (or isn't) a constant (numeric or string) value; skipped otherwise. - %ifreg: Check whether value is a register. - Syntax: %[el]if[n]reg ID [...] %endif - Result: Conditional block is included, if ID is (or isn't) a register name; skipped otherwise. - %procdesc/%proc/%endp: Declare/start/end procedure. - Syntax: %proc[desc] NAME[ ATTRIB...][ARG...] [...] %endp[ NAME] - Result: Declare/define procedure NAME, with ATTRIB attributes and ARG arguments. - Arguments: - See %arg for syntax. - Names for arguments are not required but types are. If you omit names, you can add them later in %arg's. Thus any of the following syntax is valid for defining procedure PROC with an argument called ARG of type TYPE: %procdesc PROC[...] ARG:TYPE %proc PROC (or) %procdesc PROC[...] :TYPE %proc PROC ARG (or) %proc PROC ARG:TYPE (or) %proc PROC :TYPE %arg ARG (or) %proc PROC %arg ARG:TYPE - Attributes: - [no_]auto_enter: Insert (or don't insert) %__enter/%__leave automatically. Global default: no_auto_enter. - enter_style=STYLE: Convert (or don't convert) "enter" and "leave" instructions to simpler/shorter instructions according to STYLE. Possible STYLEs: - normal: Both instructions are kept unchanged. - bp: 8086-compatible replacements (see smart instructions above). - short: "leave" instruction is kept unchanged; "enter 0,0" is converted to 8086-compatible replacement (also 1 byte shorter), otherwise kept unchanged. 
(Like Turbo/Borland Pascal.) Global default: enter_style=normal or, for 8086, enter_style=bp.
      - near/far: Force near/far. Default from: __FARCODE__; local copy: %$__FARCODE__.
      - use16/use32/use64: Force bits. Default from: __BITS__; local copy: %$__BITS__.
      - LANGUAGE: Force language interface to LANGUAGE. Defaults from: __LANGUAGE__ and __LANGUAGE_NUM__; local copies: %$__LANGUAGE__ and %$__LANGUAGE_NUM__.
    - Notes:
      - If NAME is "__global__", changes global procedure attributes. (Attributes near/far/use*/LANGUAGE override MODEL and BITS for all procedures, although this is not recommended.)
      - If a procedure has been declared with %procdesc, its definition - a combination of %proc and %arg - must be identical, otherwise an error occurs. A special case is if a %procdesc declaration is followed by an empty %proc definition and no %arg's: the procedure inherits all attributes and arguments from the declaration.
      - Automatically pushes/pops "proc" context; must not be pushed or popped manually.
      - Automatically pushes/pops previous global label.
      - Inherits global attributes into context-specific macros, which can be overridden.
      - Automatically sets %stacksize; %stacksize is ignored with a warning.
      - Automatically initializes %$localsize and %$argsize to 0.
      - Procedures have implicit types: 0xFFFE for far and 0xFFFF for near (like proc in TASM).
      - The optional procedure name of %endp must match that of the previous %proc, otherwise an error occurs.
      - Procedures cannot be nested.
      - Pascal procedures are always forced to far.
      - Fills %$__SP__/%$__BP__ with name of stack pointer/base pointer register.
  - %__enter/%__leave: Enter/leave procedure.
    - Syntax: %__enter/%__leave
    - Result: Prolog/epilog emitted for entering/leaving procedure.
    - Notes:
      - The "enter"/"leave" instructions (without operands) in procedures are replaced by these automatically.
      - Prolog/epilog is emitted only if procedure language is other than no language and if there are arguments and/or local variables.
      - Both are inserted automatically into the instruction stream if the "auto_enter" attribute is enabled for the procedure:
        - %__enter: before the first instruction/label;
        - %__leave: before each "ret"/"%__ret". (Note: Multiple ret's are allowed and result in multiple epilogs.)
      - Actual "enter"/"leave" instructions are replaced according to the enter_style attribute of the procedure (see above).
      - Actual "enter"/"leave" instructions are forced to "enterw"/"leavew" or "enterd"/"leaved" if procedure bits differ from global bits.
      - %__enter saves the current procedure into procdefs.
  - %__ret: Return from procedure.
    - Syntax: %__ret
    - Result: Converted to "retn/retf[ IMM8/IMM16]", according to procedure attributes: near/far, total size of arguments and language.
    - Note: The "ret" instruction (without operands) in procedures is replaced by this automatically.
  - %__call: Call procedure with arguments. (See "call" below.) !!!TODO: Unfinished, don't use!!!
  - %uses: Save/restore registers/variables upon entering/leaving procedure.
    - Syntax: %uses VALUE[[,] VALUE...]
    - Result: Emits instructions for saving VALUEs at the end of the prolog and instructions for restoring them, in reverse order, at the beginning of each epilog.
    - Notes:
      - If the first VALUE is preceded by the "distinct" keyword, duplicate VALUEs are omitted from the list.
      - If procedure language is no language, a warning is given (TASM doesn't allow "uses" at all in such cases).
  - %setlabel: Set previous global label.
    - Syntax: %setlabel LABEL
    - Result: The previous global label is set to LABEL.
    - Note: Macros in LABEL are expanded.
  - %setsize/%settype: Force data size (in bytes)/type of label/macro.
    - Syntax: %setsize/%settype ID SIZE/TYPE
    - Result: Internal data size/type of ID is set to SIZE/TYPE.
  - %strstr: Search for substring.
    - Syntax: %strstr VAR STR[,] SUBSTR
    - Result: VAR is set to the position of the first instance of SUBSTR in STR; VAR is set to 0, if SUBSTR is not part of STR.
  - %numstr: Convert numeric constant into string format.
    - Syntax: %numstr VAR NUM[[,] RADIX[[,] FORMAT]]
    - Result: VAR is set to NUM in string format in RADIX radix; if RADIX is missing, it is considered 10. FORMAT is in C-style syntax: [%][flags][width][prec][type] where:
      - %: optional, for C-style;
      - flags:
        - -: justify to the left, pad on the right (by default, justify to the right, pad on the left);
        - +: force sign (- for negative, + otherwise) (by default, only a negative sign is prepended);
        - space: force place of sign (- for negative, space otherwise);
        - 0: pad with zeros (default is spaces);
      - width:
        - NUM: pad to at least NUM characters;
      - prec:
        - .NUM: cut to at most NUM characters (after padding);
      - type:
        - lowercase letter: convert digits above 9 (A etc.) to lowercase;
        - uppercase letter: convert digits above 9 (A etc.) to uppercase.
- New intrinsic macros:
  - __PROC__: Name of current procedure, in unquoted identifier (not quoted string!) form. Empty outside %proc.
  - __LABEL__: Name of previous global label, in unquoted identifier form.
  - __CPU__: CPU, as set by cpu, in numeric form. Values:
    - bits 0-7: 0 = 8086, ... 4 = 486, 5 = Pentium, 6 = P6, ... 12 = Sandybridge, 255 = future;
    - bit 8: 1 = x64;
    - bit 8: 1 = IA64;
  - __CPU_VENDOR__: CPU vendor, in identifier form. Currently cannot be set. Values: AMD, Cyrix or Intel.
  - __CPU_SMART__: Whether smart instructions are enabled. Values: 0 = disabled, 1 = enabled.
  - __MODEL__: Memory model, as set by model, in identifier form. Values: tiny, small, medium, compact, large, huge.
  - __MODEL_NUM__: Memory model, in numeric form (like @Model in TASM). Values: 1 = tiny, 2 = small, 3 = medium, 4 = compact, 5 = large, 6 = huge.
  - __LANGUAGE__: Language interface, as set by model, in identifier form. Values: nolanguage, c, syscall, stdcall, pascal, fortran, basic, prolog, cpp.
  - __LANGUAGE_NUM__: Language interface, in numeric form (like @Interface in TASM).
Values: 0 = nolanguage, 1 = c, 2 = syscall, 3 = stdcall, 4 = pascal, 5 = fortran, 6 = basic, 7 = prolog, 8 = cpp. - __FARCODE__: Code pointer size, in numeric form (like @CodeSize in TASM). Currently cannot be set. Values: 0 = near, 1 = far. - __FARDATA__: Data pointer size, in numeric form (like @DataSize in TASM). Currently cannot be set. Values: 0 = near, 1 = far, 2 = huge. - __INT_T__: Native integer type accordingly to __BITS__: 8 = byte, 16 = word, 32 = dword, 64 = qword. - __DEF_INT_T__: Define native integer type accordingly to __BITS__: 8 = db, 16 = dw, 32 = dd, 64 = dq. - __RES_INT_T__: Reserve native integer type accordingly to __BITS__: 8 = resb, 16 = resw, 32 = resd, 64 = resq. - __CODEPTR__: Code pointer type (similar to codeptr in TASM). Values: word = 16-bit near, dword = 16-bit far or 32-bit near, fword = 32-bit far, qword = 64-bit near/far. - __DEF_CODEPTR__/__RES_CODEPTR__: Define/reserve code pointer type. - __DATAPTR__: Data pointer type (similar to dataptr in TASM). Values like __CODEPTR__. - __DEF_DATAPTR__/__RES_DATAPTR__: Define/reserve data pointer type. - New intrinsic functions: - __sizeof__/__typeof__(): Determine data size/type of argument. - Syntax: __sizeof__/__typeof__(SYMBOL) - Result: Data size (in bytes)/type of SYMBOL. If SYMBOL is: - a register, then its size/type; - a type, then its own size/type; - a single-line macro with no arguments, then expanded to its first token and retried recursively; - a label or structure type member (technically just a local label), then size/type associated with it; - a numerical constant, then the smallest numerical data type that the constant value fits into; - a procedure, then 0xFFFE size and "far" type for far or 0xFFFF size and "near" type for near. 
    - Notes:
      - Size and type associated with identifiers are maintained internally and are set implicitly for:
        - labels: generated automatically from following D*/RES*;
        - %arg/%local: preprocessor generates __settype__() type override prefix automatically;
        - structure types: struc macro sets via %setsize/%settype macros;
        - structure instances: istruc/resistruc macros set via %setsize/%settype macros;
        or explicitly for:
        - single-line macros: via %setsize/%settype macros.
      - Setting type implicitly also sets size.
      - For an unknown/unset type, type is determined from size (if known/set): 1 = byte, 2 = word, 4 = dword, 6 = fword, 8 = qword, 10 = tword, 16 = oword, 32 = yword, 64 = zword.
  - __str__(): Stringize.
    - Syntax: __str__(SYMBOL[ SYMBOL...][, STYLE])
    - Result: SYMBOLs enclosed into quotation marks according to STYLE.
    - Notes:
      - Multiple SYMBOLs are separated in the resulting string with a single space each.
      - STYLE must be a string. If STYLE is specified, the starting quotation mark is the first, the ending one is the second character of STYLE; if STYLE consists of a single character, the ending quotation mark equals the starting one; if STYLE is empty or not specified, double quotation marks are used as default.
  - __tok__(): Tokenize.
    - Syntax: __tok__(STRING)
    - Result: STRING unquoted to a token or (!!!TODO!), if STRING contains spaces, multiple tokens.
  - __proc_args__(): List of procedure argument types.
    - Syntax: __proc_args__(PROC)
    - Result: Tokenized list of types of arguments of PROC procedure, separated by commas. Special type is vararg.
  - resistruc: Reserve structure instance.
    - Syntax: INSTANCE resistruc TYPE
    - Result: At label INSTANCE, bytes are reserved for the size of TYPE. (Default values for structure members may not be specified. AT does not work. IEND is not necessary and causes an error instead.)
- New macros:
  - pushlabel/poplabel: Push/pop previous global label to/from the context stack (context "prevlabel"). To declare global labels without affecting local labels.
- New directives: - model: Define memory model and language interface. - Syntax: model MODEL[, LANGUAGE] - Result: __MODEL__ is set to MODEL; __LANGUAGE__ is set to LANGUAGE. - smart: Enable smart instructions. - nosmart: Disable smart instructions. (Note: Primitive forms - those enclosed into brackets - are always available, user-level forms - without brackets - only after %use tasm.) - New syntax: - %arg/%local: - Syntax: %arg/%local NAME:[ ]TYPE[, NAME...] - Result: Declares an argument/local variable called NAME of type TYPE. TYPE may also be a struc or anything that has a size/type (see __sizeof__/__typeof__()). - Notes: - All arguments and local variables are truly local to the procedure: they are undefined upon %endp. - New type: vararg: - Only last argument may be vararg. Only one such argument is allowed; none are allowed if procedure language is Pascal. - Vararg argument has no implicit type or size. - Macros in TYPE are expanded. - Adds implicit type and size to variable. - Whitespaces are allowed after colon. - Error if %arg is inside %proc and no language has been defined yet. - %arg maintains total size of arguments (in bytes) in %$argsize, %local in %$localsize. - %substr: - Syntax: %substr VAR STR[,] START[, NUM] - Note: A comma is allowed between the first and second arguments. - []: Memory references can be pasted together. - Syntax: [ADDR1][ADDR2][...] - Result: A memory reference whose offset is the sum of offsets, bases, indexes and scales in ADDR's; segment and wrt from the first ADDR that has any; recombines bases, indexes and scales into optimal code. - Note: This helps with adding offsets to the arguments of macros, e.g. on a 8086: inc dword [VAR] -> inc word [VAR][0] jne .SKIP inc word [VAR][2] .SKIP: (rather than the original syntax) inc word [VAR+0] jne .SKIP inc word [VAR+2] .SKIP: - +: The sum of values inherits the data size of the value to the right, if it has one; otherwise that of the value to the left. 
E.g.: !!!TODO - at: Local labels (structure members) are forced to be those of the structure type (not the structure instance). - call/jmp far ADDR: Forces segment to current segment. (Causes segment fixup, not supported by bin output format.) - call/jmp far [ADDR]: Default operand size is: 16-bit = word (far turns it into dword), 32-bit = dword (fword), 64-bit = qword. - call PROC[,ARG...]: Procedure call with arguments. - Checks number and type of arguments against procedure definition (%proc or %procdesc): - If an argument of the procedure is of type vararg (available as the last argument for non-Pascal procedures only), any number (including zero) and type of actual arguments may be passed in its place. - The type of actual arguments is forced (promoted or demoted) to that in the procedure definition: - Automatic promotion/demotion is: - possible for: - constants; - memory references (!); - but not possible for: - registers; - "strict" types. - If the type is unknown (or vararg), the global integer type (bases on __BITS__) is forced (like in C variable arguments). - If the forced type is smaller than the stack size of the procedure (e.g. byte in 16-bit mode), it is further promoted to the stack size. - Automatically generates: - Pushing arguments onto the stack. - Calling the procedure (without arguments). - Popping arguments from the stack. (Non-Pascal procedures only; Pascal procedures pop arguments themselves.) - '...' and "..." strings: Do not necessarily end at the first matching quote character; two quote characters in the string are replaced by a single quote character. (!!!TODO: Unfortunately, also activates on tokens pasted together with %+.) - New packages: - stdint: Standard integer package. 
Defines: - Constants: - CHAR_BIT = 8 - false/true = 0/1 - CHAR_MIN/CHAR_MAX = -80h/7Fh - UCHAR_MIN/UCHAR_MAX = 0/0FFh - WCHAR_MIN/WCHAR_MAX = 0/0FFFFh - INT8_MIN/INT8_MAX/UINT8_MAX = -80h/7Fh/0FFh - INT16_MIN/INT16_MAX/UINT16_MAX = -8000h/7FFFh/0FFFFh - INT32_MIN/INT32_MAX/UINT32_MAX = -80000000h/7FFFFFFFh/0FFFFFFFFh - INT64_MIN/INT64_MAX/UINT64_MAX = -8000000000000000h/7FFFFFFFFFFFFFFFh/0FFFFFFFFFFFFFFFFh - INT_MIN/INT_MAX - Types: - regl_t = byte - regx_t = word - eregx_t = dword - rregx_t = qword - bool = regx_t - bool8_t/bool16_t/bool32_t/bool64_t - [u]char = byte - wchar = word - [u]int = __INT_T__ - [u]int8_t/[u]int16_t/[u]int32_t/[u]int64_t - off_t = dword - size_t = dword - size64_t = qword - void@ = __DATAPTR__ - proc@ = __CODEPTR__ - Functions: - sizeof/typeof() = __sizeof__/__typeof__() - Macros: - D*/RES* for all types. - tasm: TASM compatibility package. Defines: - Automatic settings: - Automatic entering/leaving of procedures enabled. - Smart instructions enabled. - CPU forced to 8086. (Higher CPU's must be set explicitly.) - Types: - codeptr = __CODEPTR__ - dataptr = __DATAPTR__ - Functions: - size/type() = __sizeof__/__typeof__() - Macros: - offset = (empty) - ptr = (empty) - @WordSize = (__BITS__ / 8) - @32Bit = (__BITS__ / 32) - @64Bit = (__BITS__ / 64) - @CodeSize = __FARCODE__ - @DataSize = __FARDATA__ - @Model = __MODEL_NUM__ - @ModelStr = __MODEL__ - @Interface = __LANGUAGE_NUM__ - @InterfaceStr = __LANGUAGE__ - [.|p]XX86[n|p] = cpu XX - .code/codeseg[ NAME] = section [.]text[ NAME] (dot is missing in obj output format, present otherwise) - [.]const = section [.]const - .data/dataseg = section [.]data - .data?/udataseg = section [.]bss - [.]stack[ SIZE] = section [.]stack stack [SIZE][; resb SIZE] - segcs/segds/seges/segss/segfs/seggs = cs/ds/es/ss/fs/gs - retcode = ret (inside %proc) or retn/retf (outside %proc, according to __FARCODE__) Technical - asm/assemble.c: - Added instruction prefixes: - 0327 (o32!) 
- 0372 (rlen) - calcsize(): - Added instruction prefixes: - 0327 - 0372 - emit_prefix(): Added P_PUSHCS. - gencode(): - Added instruction prefixes: - 0327 - 0372 - matches(): Return "operand size mismatch" error only if instruction specifies size matching. - size_name(): Moved to asm/preproc.c. - asm/directiv.c: - New variables and functions: - get_model() - static int memmodel_farcode - static int memmodel_fardata - process_directives(): - Added support for directives: - model - smart - nosmart - [CPU]: Keeps smart flag. - get_bits(): Changed to non-static. - asm/directiv.dat: - Added directives: - model - smart - nosmart - asm/eval.c: - expr0(): Renamed to expr0a(), replaced with |?: processing. - addtotemp(): Added size argument. - finishtemp(): Added moving to the first expression the size of last expression with a forced size. - expr6(): Added __settype__() type override prefix. - asm/labels.c: - union label: Added typesize, typename and size members. - lookup_label(): Added typesize, typename and size arguments. - get_prevlabel(): New function. - set_prevlabel(): New function. - set_label_typename(): New function. - set_label_size(): New function. - redefine_label(): Added initialization of typesize, typename and size members. - define_label(): Added initialization of typesize, typename and size members. - asm/nasm.c: - preproc: Changes to non-static. (Declared in nasm.h.) - assemble_file(): - Added DF/RESF. - Changed computation of size according to original opcode. (Not any RES* converted to RESB.) - Added setting the size of label to that of the first D*/RES* operand. - asm/parser.c: - process_type_override(): New function. - process_size_override(): - Added force argument. - Added FWORD type. - size_to_opflags(): New function. - parse_line(): - operstrs: Operands in original string format. - Added initialization of orig_opcode, typesize, typename, size and more_oprs members. 
- Parses up to 256 operands; above MAX_OPERANDS, collected into dynamically allocated more_oprs member. - Additional "__noargs__" specifier for calling procedures without arguments. (Overrides checking the number and type of arguments.) - D*/INCBIN: - Added initialization of typesize member and calculation of size member. - Memory references: - Added [ADDR1][ADDR2][...] syntax for pasting. - Added fetching size from expression. - Added __settype__() type override prefix. - Added ZERO, FWORD, SBYTE and UWORD operand types. - RES*: Added calculation of typesize, typename and size members. - Added default operand size for CALL/JMP FAR indirect. - Added automatic entering/leaving of procedures: - enter before the first instruction/label; - leave before every ret. Explicit enter's/leave's (without operands) are allowed and are replaced accordingly to procedure-specific settings. - Label is defined only if an implicit automatic enter hasn't been generated yet. (Automatic enter comes first, then the label.) - Added smart instructions: - call far PROC -> push cs call near PROC - rol/ror/shl/shr OP,N -> times N rol/ror/shl/shr OP,1 - enter (in %proc) -> %__enter - leave (in %proc) -> %__leave - Added making the implicit segment of CALL/JMP FAR explicit. (Causes segment fixup, not supported by bin output format.) - cleanup_insn(): - Uninitializes more_oprs member. - asm/pptok.dat: - Added macros: - *const - *reg - %endp - %numstr - %proc - %setlabel - %setsize - %settype - %strstr - %uses - %__enter - %__leave - %__ret - asm/pptok.dat: - Added allowing underscores to be part of token names. 
- asm/preproc.c: - New variables: - extern iflag_t cpu - static char *memmodel_names[] - static char *langint_names[] - struct proc procglobal - struct proc proclocal - static struct hash_table procdefs - New functions: - set_Token_type() - set_stack_size() - get_pointer_size() - stack_pointer_name() - ctx_push() - push_istk() - push_istk_sprintf() - constant_size() - save_proclocal() - IsPascalCallConvention() - typeof_sizeof() - pp_putline() - New macros: - skip_white_multi_() - StackSize, StackPointer, ArgOffset, LocalOffset: Moved to struct proc. - StackPointer: Renamed to StackBasePointer. - struct SMacro: Added typesize, typename and size members. - struct Token: Added typesize, typename and size members. - hash_findix(): Changed to non-static. (Exported from nasm.h header.) - parse_size(): - Changed to non-static. (Exported from nasm.h header.) - Added FWORD and ZWORD types. - size_name(): Moved from asm/assemble.c here. - tokenize(): Added |?:. - ppscan(): Added |?:. - if_condition(): - Added %ifconst and %ifreg. - Changed checks for comma token into calls of tok_is_() and skips of whitespaces to calls of skip_white_multi_(). - do_directive(): - Permit comma separators in %substr. - Added %proc, %endp, %uses, %strstr, %numstr, %setlabel, %setsize, %settype, %__enter, %__leave and %__ret. - %arg/%local: - Syntax: %arg/%local VAR:[ ]TYPE - Adds implicit type and size to variable. - Added vararg argument type. - Whitespaces are allowed after colon. - Error if %arg is inside %proc and no language has been defined yet. - Merged code. - Changed snprintf() + do_directive() calls into faster define_smacro() calls. - %stacksize: Most code moved into set_stack_size() function. - %push: Most code moved into ctx_push() function. - %[i][x]define: - Added __settype__() type override prefix. - %[i]defstr: Changed allocation of macro_start from nasm_malloc() to new_Token(). - %pathsearch: Changed allocation of macro_start from nasm_malloc() to new_Token(). 
- %strlen: Changed allocation of macro_start from nasm_malloc() to new_Token(). - %strcat: Changed allocation of macro_start from nasm_malloc() to new_Token(). - %[i]assign: Changed allocation of macro_start from nasm_malloc() to new_Token(). - Changed checks for comma token into calls of tok_is[nt]_() and skips of whitespaces to calls of skip_white_multi_(). - find_cc(): Changed checks for comma token into calls of _tok_is(). - is_mmacro(): Added warning if recursion is not allowed (maximum depth is 0). - define_smacro(): Added initialization of typesize, typename and size members. - line_from_stdmac(): Added call to set_Token_type(). - expand_mmac_params_range(): Added calls to set_Token_type(). - expand_mmac_params(): Added call to set_Token_type(). - pp_init(): Added initialization of procdefs. - expand_smacro(): - Added intrinsic macros: - __PROC__ - __LABEL__ - __CPU__ - __CPU_VENDOR__ - __CPU_SMART__ - __MODEL__ - __MODEL_NUM__ - __LANGUAGE__ - __LANGUAGE_NUM__ - __FARCODE__ - __FARDATA__ - __INT_T__ - __DEF_INT_T__ - __RES_INT_T__ - __CODEPTR__ - __DEF_CODEPTR__ - __RES_CODEPTR__ - __DATAPTR__ - __DEF_DATAPTR__ - __RES_DATAPTR__ - __typeof__() - __sizeof__() - __str__() - __tok__() - __proc_args__() - Added calls to set_Token_type(). - expand_mmacro(): - Added calls to set_Token_type(). - Changed checks for comma token into calls of _tok_is(). - pp_getline(): - Added an extra condition to distinguish %rep blocks from actual macro calls. - Added calls to set_Token_type(). - Added automatic insertion of "%enter" and "%leave" tokens, if needed. - make_tok_num(): Added initialization of typesize, typename and size members. - nasmpp: Added pp_putline(). - asm/preproc.c: - nop_putline(): New function. - preproc_nop: Added nop_putline(). - asm/quote.c: - nasm_unquote: Added termination of strings quoted with backquotes. - asm/stdscan.c: - stdscan(): Added |?:. - asm/tokens.dat: - Added FWORD type. 
- Added macro group TYPEFUNC, with tokens: - __settype__ - __typeof__ - __sizeof__ - __tok__ - __str__ - __proc_args__ - __noargs__ - asm/tokhash.pl: - nasm_token_hash(): Added initialization of t_size and t_typename members. - common/common.c: - New global variables: - enum memory_models memmodel - idata_bytes(): - Added DF. - resv_bytes(): - Added RESF. - disasm/disasm.c: - Added translation of bl/bx/ebx/rbx registers. - Added support for ZERO operand type. - include/iflag.h: - iflag_cmp_cpu_level(): Added test for SMART flag. - include/labels.h: - lookup_label(): Added size and typename arguments. - get_prevlabel(): New function. - set_prevlabel(): New function. - set_label_typename(): New function. - set_label_size(): New function. - include/nasm.h: - enum token_type: Added TOKEN_TERNARY (|?:) and TOKEN_TYPEFUNC. - enum prefixes: Added P_PUSHCS. - enum prefix_pos: Added PPS_PUSHCS. - enum special_tokens: Added S_FWORD. - struct preproc_ops: Added putline member. - struct tokenval: Added t_typesize, t_typename and t_size members. - struct expr: Added typesize member. - struct insn: Added orig_opcode, typesize, typename, size and more_oprs members. - Added types, global variables and functions: - extern const struct preproc_ops *preproc - enum typefunc - enum memory_models - extern enum memory_models memmodel - enum language_interfaces - extern enum language_interfaces langint - enum proc_flags - struct procarg - struct proc - Moved into it: StackSize, Stack[Base]Pointer, ArgOffset and LocalOffset. 
- extern struct proc procglobal - extern struct proc proclocal - extern struct hash_table procdefs - parse_size() - size_name() - size_to_opflags() - opflags_to_size() - get_bits() - get_langint() - hash_findix() - get_pointer_size() - stack_pointer_name() - IsPascalCallConvention() - TY_FWORD - OPFLAG_NOARGS - include/opflags.h: - Added defines for operand types: - REG_BASE (bl/bx/ebx/rbx) - REG_BL - REG_BX - REG_EBX - REG_RBX - ZERO (operand equals 0) - SBYTE (operand between -128..127, fits into a signed byte) - UWORD (operand between 0..0xFFFF, fits into an unsigned word) - BITS48 (fword) - BITSxxx (BITS8 | ... | BITS512) - SIZE_BITS: Changed to 12. - macros/standard.mac: - Added intrinsic macros: - __PROC__ - __LABEL__ - __CPU__ - __CPU_VENDOR__ - __CPU_SMART__ - __MODEL__ - __MODEL_NUM__ - __LANGUAGE_NUM__ - __FARCODE__ - __FARDATA__ - __INT_T__ - __DEF_INT_T__ - __RES_INT_T__ - __CODEPTR__ - __DEF_CODEPTR__ - __RES_CODEPTR__ - __DATAPTR__ - __DEF_DATAPTR__ - __RES_DATAPTR__ - __typeof__() - __sizeof__() - __str__() - __tok__() - __proc_args__() - Added macros: - quoteid() - Added TASM-specific macros: - MASM (empty) - Removed TASM-specific macros (are defined in TASM compatibility package instead): - P386 - P486 - P586 - endstruc: Added setting type and size of structure type to itself. - istruc: Changed last label to the name of structure instance so that local labels (structure members) are named after structure instance. - iend: Added setting type and size of structure instance to structure type. - resistruc: New macro. - macros/stdint.mac: New file. - macros/tasm.mac: New file. - Mkfiles/openwcom.mak: - Enclosed path all -Ipath options into quotation marks (compiler error under Windows). (Suggested here: https://forum.nasm.us/index.php?topic=2355#msg10563 .) - x86/insns.dat: - Added smart instructions. - Added instruction aliases. - Added DF and RESF. - x86/insns.pl: - Added instruction prefixes: - rlen (0372): register size. - o32! 
(0327): force 32-bit operand. - x86/insns-iflags.ph: - Added SMART flag (92). - x86/regs.dat: - Moved bl/bx/ebx/rbx registers into their own group rather than general non-accumulator group. Signed-off-by: Joe Forster/STA <st...@c6...> (yes, pseudonym, please ;-) ) --- --- asm/assemble.c.bak 2017-11-29 20:44:08.000000000 +0100 +++ asm/assemble.c 2017-12-28 01:00:00.000000000 +0100 @@ -146,6 +146,8 @@ * \325 nohi instruction which always uses spl/bpl/sil/dil * \326 nof3 instruction not valid with 0xF3 REP prefix. Hint for disassembler only; for SSE instructions. + * \327 o32! forces 32-bit operand size: assemble 0x66 if bits==16; + * do nothing otherwise * \330 a literal byte follows in the code stream, to be added * to the condition code value of the instruction. * \331 norep instruction not valid with REP prefix. Hint for @@ -168,6 +170,8 @@ * \367 address-size prefix (0x67) used as opcode extension * \370,\371 jcc8 match only if operand 0 meets byte jump criteria. * jmp8 370 is used for Jcc, 371 is used for JMP. 
+ * \372 rlen        assemble 0x02 if bits==16, 0x04 if bits==32;
+ *                  used for register-sized offsets
  * \373 jlen        assemble 0x03 if bits==16, 0x05 if bits==32;
  *                  used for conditional jump over longer jump
  * \374 vsibx|vm32x|vm64x this instruction takes an XMM VSIB memory EA
@@ -266,30 +270,6 @@ static void assert_no_prefix(insn * ins,
                    prefix_name(ins->prefixes[pos]));
 }

-static const char *size_name(int size)
-{
-    switch (size) {
-    case 1:
-        return "byte";
-    case 2:
-        return "word";
-    case 4:
-        return "dword";
-    case 8:
-        return "qword";
-    case 10:
-        return "tword";
-    case 16:
-        return "oword";
-    case 32:
-        return "yword";
-    case 64:
-        return "zword";
-    default:
-        return "???";
-    }
-}
-
 static void warn_overflow(int size)
 {
     nasm_error(ERR_WARNING | ERR_PASS2 | ERR_WARN_NOV,
@@ -1120,6 +1100,11 @@ static int64_t calcsize(int32_t segment,
         case 0326:
             break;

+        case 0327:
+            if (globalbits == 16)
+                length++;
+            break;
+
         case 0330:
             codes++, length++;
             break;
@@ -1185,6 +1170,7 @@ static int64_t calcsize(int32_t segment,
         case 0371:
             break;

+        case 0372:
         case 0373:
             length++;
             break;
@@ -1413,6 +1399,9 @@ static int emit_prefix(struct out_data *
         case P_WAIT:
             c = 0x9B;
             break;
+        case P_PUSHCS:
+            c = 0x0E;
+            break;
         case P_LOCK:
             c = 0xF0;
             break;
@@ -1801,6 +1790,11 @@ static void gencode(struct out_data *dat
         case 0326:
             break;

+        case 0327:
+            if (globalbits == 16)
+                out_rawbyte(data, 0x66);
+            break;
+
         case 0330:
             out_rawbyte(data, *codes++ ^ get_cond_opcode(ins->condition));
             break;
@@ -1852,7 +1846,12 @@ static void gencode(struct out_data *dat
             out_rawbyte(data, c - 0366 + 0x66);
             break;

-        case3(0370):
+        case 0370:
+        case 0371:
+            break;
+
+        case 0372:
+            out_rawbyte(data, bits == 16 ? 2 : 4);
             break;

         case 0373:
@@ -2346,6 +2345,7 @@ static enum match_result matches(const s

     for (i = 0; i < itemp->operands; i++) {
         if (!(itemp->opd[i] & SIZE_MASK) &&
+            size[i] &&
             (instruction->oprs[i].type & SIZE_MASK & ~size[i]))
             return MERR_OPSIZEMISMATCH;
     }
--- asm/directiv.c.bak 2017-11-29 20:44:08.000000000 +0100
+++ asm/directiv.c 2017-12-28 01:00:00.000000000 +0100
@@ -104,7 +104,7 @@ static iflag_t get_cpu(char *value)
     return r;
 }

-static int get_bits(char *value)
+int get_bits(char *value)
 {
     int i;

@@ -131,6 +131,63 @@ static int get_bits(char *value)
     return i;
 }

+static enum memory_models get_model(char *value, int len, bool error)
+{
+    enum memory_models i;
+
+    if (!nasm_strnicmp(value, "tiny", len))
+        i = MODEL_TINY;
+    else if (!nasm_strnicmp(value, "small", len) ||
+             !nasm_strnicmp(value, "flat", len))
+        i = MODEL_SMALL;
+    else if (!nasm_strnicmp(value, "medium", len))
+        i = MODEL_MEDIUM;
+    else if (!nasm_strnicmp(value, "compact", len))
+        i = MODEL_COMPACT;
+    else if (!nasm_strnicmp(value, "large", len))
+        i = MODEL_LARGE;
+    else if (!nasm_strnicmp(value, "huge", len))
+        i = MODEL_HUGE;
+    else {
+        i = -1;
+        if (error)
+            nasm_error(pass0 < 2 ? ERR_NONFATAL : ERR_FATAL,
+                       "unknown memory model");
+    }
+    return i;
+}
+
+enum language_interfaces get_langint(char *value, int len, bool error)
+{
+    enum language_interfaces i;
+
+    if (!nasm_strnicmp(value, "nolanguage", len))
+        i = LANG_NONE;
+    else if (!nasm_strnicmp(value, "c", len))
+        i = LANG_C;
+    else if (!nasm_strnicmp(value, "syscall", len))
+        i = LANG_SYSCALL;
+    else if (!nasm_strnicmp(value, "stdcall", len))
+        i = LANG_STDCALL;
+    else if (!nasm_strnicmp(value, "pascal", len))
+        i = LANG_PASCAL;
+    else if (!nasm_strnicmp(value, "fortran", len))
+        i = LANG_FORTRAN;
+    else if (!nasm_strnicmp(value, "basic", len))
+        i = LANG_BASIC;
+    else if (!nasm_strnicmp(value, "prolog", len))
+        i = LANG_PROLOG;
+    else if (!nasm_strnicmp(value, "cpp", len))
+        i = LANG_CPP;
+    else {
+        i = -1;
+        if (error)
+            nasm_error(pass0 < 2 ? ERR_NONFATAL : ERR_FATAL,
+                       "unknown language interface");
+    }
+    return i;
+}
+
 static enum directive parse_directive_line(char **directive, char **value)
 {
     char *p, *q, *buf;
@@ -182,6 +239,9 @@ static enum directive parse_directive_li
     return directive_find(*directive);
 }

+static int memmodel_farcode[MODEL_MAX] = {0, 0, 0, 1, 0, 1, 1};
+static int memmodel_fardata[MODEL_MAX] = {0, 0, 0, 0, 1, 1, 2};
+
 /*
  * Process a line from the assembler and try to handle it if it
  * is a directive. Return true if the line was handled (including
@@ -323,6 +383,33 @@ bool process_directives(char *directive)
         globalbits = get_bits(value);
         break;

+    case D_MODEL: /* [MODEL model, language] */
+        q = value;
+        while (*q && !nasm_isspace(*q) && *q != ',')
+            q++;
+        memmodel = get_model(value, q - value, true);
+        procglobal.flags = (procglobal.flags & ~PROCFLAG_FARCODE) |
+            ((memmodel_farcode[memmodel] << PROCFLAG_FARCODE_SHIFT) & PROCFLAG_FARCODE);
+        procglobal.flags = (procglobal.flags & ~PROCFLAG_FARDATA) |
+            ((memmodel_fardata[memmodel] << PROCFLAG_FARDATA_SHIFT) & PROCFLAG_FARDATA);
+        q = nasm_skip_spaces(q);
+        if (*q == ',') {
+            q++;
+            q = nasm_skip_spaces(q);
+            p = q;
+            q = nasm_skip_word(q);
+            procglobal.langint = get_langint(p, q - p, true);
+        }
+        break;
+
+    case D_SMART: /* [SMART] */
+        iflag_set(&cpu, IF_SMART);
+        break;
+
+    case D_NOSMART: /* [NOSMART] */
+        iflag_clear(&cpu, IF_SMART);
+        break;
+
     case D_GLOBAL: /* [GLOBAL symbol:special] */
         if (*value == '$')
             value++; /* skip initial $ if present */
@@ -487,9 +574,13 @@ bool process_directives(char *directive)
         }
         break;

-    case D_CPU: /* [CPU] */
+    case D_CPU: { /* [CPU] */
+        unsigned int cpu_smart = iflag_test(&cpu, IF_SMART);
         cpu = get_cpu(value);
+        if (cpu_smart)
+            iflag_set(&cpu, IF_SMART);
         break;
+    }

     case D_LIST: /* [LIST {+|-}] */
         value = nasm_skip_spaces(value);
--- asm/directiv.dat.bak 2017-11-29 20:44:08.000000000 +0100
+++ asm/directiv.dat 2017-12-28 01:00:00.000000000 +0100
@@ -67,8 +67,11 @@ extern
 float
 global
 list
+model
 section
 segment
+smart
+nosmart
 warning
 sectalign
 pragma
--- asm/eval.c.bak 2017-11-29 20:44:08.000000000 +0100
+++ asm/eval.c 2017-12-28 01:00:00.000000000 +0100
@@ -93,7 +93,7 @@ static void begintemp(void)
     tempexpr_size = ntempexpr = 0;
 }

-static void addtotemp(int32_t type, int64_t value)
+static void addtotemp(int32_t type, int64_t value, int typesize)
 {
     while (ntempexpr >= tempexpr_size) {
         tempexpr_size += TEMPEXPR_DELTA;
@@ -101,12 +101,19 @@ static void addtotemp(int32_t type, int6
                                 tempexpr_size * sizeof(*tempexpr));
     }
     tempexpr[ntempexpr].type = type;
-    tempexpr[ntempexpr++].value = value;
+    tempexpr[ntempexpr].value = value;
+    tempexpr[ntempexpr++].typesize = typesize;
 }

 static expr *finishtemp(void)
 {
-    addtotemp(0L, 0L); /* terminate */
+    int i;
+    int typesize;
+    addtotemp(0L, 0L, 0); /* terminate */
+    for (i = 0, typesize = 0; i < ntempexpr; i++)
+        if (tempexpr[i].typesize != 0/* && typesize == 0*/)
+            typesize = tempexpr[i].typesize;
+    tempexpr[0].typesize = typesize;
     while (ntempexprs >= tempexprs_size) {
         tempexprs_size += TEMPEXPRS_DELTA;
         tempexprs = nasm_realloc(tempexprs,
@@ -134,15 +141,18 @@ static expr *add_vectors(expr * p, expr
         int lasttype;

         if (p->type > q->type) {
-            addtotemp(q->type, q->value);
+            /* set data size of sum expression to that of second vector,
+               if it has one; if it doesn't, keep that of first vector
+               result: always gets the size of last subexpression */
+            addtotemp(q->type, q->value, q->typesize ? q->typesize : p->typesize);
             lasttype = q++->type;
         } else if (p->type < q->type) {
-            addtotemp(p->type, p->value);
+            addtotemp(p->type, p->value, q->typesize ? q->typesize : p->typesize);
             lasttype = p++->type;
         } else { /* *p and *q have same type */
             int64_t sum = p->value + q->value;
             if (sum) {
-                addtotemp(p->type, sum);
+                addtotemp(p->type, sum, q->typesize ? q->typesize : p->typesize);
                 if (hint)
                     hint->type = EAH_SUMMED;
             }
@@ -154,11 +164,11 @@ static expr *add_vectors(expr * p, expr
         }
     }
     while (p->type && (preserve || p->type < EXPR_SEGBASE + SEG_ABS)) {
-        addtotemp(p->type, p->value);
+        addtotemp(p->type, p->value, q->typesize ? q->typesize : p->typesize);
         p++;
     }
     while (q->type && (preserve || q->type < EXPR_SEGBASE + SEG_ABS)) {
-        addtotemp(q->type, q->value);
+        addtotemp(q->type, q->value, q->typesize ? q->typesize : p->typesize);
         q++;
     }

@@ -196,14 +206,14 @@ static expr *scalar_mult(expr * vect, in
 static expr *scalarvect(int64_t scalar)
 {
     begintemp();
-    addtotemp(EXPR_SIMPLE, scalar);
+    addtotemp(EXPR_SIMPLE, scalar, 0);
     return finishtemp();
 }

 static expr *unknown_expr(void)
 {
     begintemp();
-    addtotemp(EXPR_UNKNOWN, 1L);
+    addtotemp(EXPR_UNKNOWN, 1L, 0);
     return finishtemp();
 }

@@ -239,7 +249,7 @@ static expr *segment_part(expr * e)
             begintemp();
             addtotemp((base == NO_SEG ? EXPR_UNKNOWN : EXPR_SEGBASE + base),
-                      1L);
+                      1L, 0);
             return finishtemp();
         }
     }
@@ -260,7 +270,8 @@ static expr *segment_part(expr * e)
  *
  * expr  : bexpr [ WRT expr6 ]
  * bexpr : rexp0 or expr0 depending on relative-mode setting
- * rexp0 : rexp1 [ {||} rexp1...]
+ * rexp0 : rexp0a[ {|?} rexp0a {:} rexp0a...]
+ * rexp0a: rexp1 [ {||} rexp1...]
  * rexp1 : rexp2 [ {^^} rexp2...]
  * rexp2 : rexp3 [ {&&} rexp3...]
  * rexp3 : expr0 [ {=,==,<>,!=,<,>,<=,>=} expr0 ]
@@ -279,7 +290,7 @@ static expr *segment_part(expr * e)

 static expr *rexp0(int), *rexp1(int), *rexp2(int), *rexp3(int);

-static expr *expr0(int), *expr1(int), *expr2(int), *expr3(int);
+static expr *expr0(int), *expr0a(int), *expr1(int), *expr2(int), *expr3(int);
 static expr *expr4(int), *expr5(int), *expr6(int);

 static expr *(*bexpr) (int);
@@ -423,6 +434,40 @@ static expr *rexp3(int critical)

 static expr *expr0(int critical)
 {
+    expr *e, *f, *g;
+
+    e = expr0a(critical);
+    if (!e)
+        return NULL;
+
+    while (i == TOKEN_TERNARY) {
+        int j;
+        i = scan(scpriv, tokval);
+        f = expr0a(critical);
+        j = i;
+        if (j != ':') {
+            nasm_error(ERR_NONFATAL, "`:' operator expected after first"
+                       " operand of `?'");
+        }
+        i = scan(scpriv, tokval);
+        g = expr0a(critical);
+        if (!f || !g)
+            return NULL;
+        if (!(is_simple(e) || is_just_unknown(e))) {
+            nasm_error(ERR_NONFATAL, "`?:' operator requires scalar"
+                       " first operand");
+        }
+
+        if (is_just_unknown(e) || is_just_unknown(f) || is_just_unknown(g))
+            e = unknown_expr();
+        else
+            e = (int64_t)(reloc_value(e)) ? f : g;
+    }
+    return e;
+}
+
+static expr *expr0a(int critical)
+{
     expr *e, *f;

     e = expr1(critical);
@@ -678,7 +723,7 @@ static expr *eval_floatize(enum floatize
     }

     begintemp();
-    addtotemp(EXPR_SIMPLE, val);
+    addtotemp(EXPR_SIMPLE, val, formats[type].bytes);

     i = scan(scpriv, tokval);
     return finishtemp();
@@ -721,7 +766,7 @@ static expr *eval_strfunc(enum strfunc t
         nasm_error(ERR_WARNING|ERR_PASS1, "character constant too long");

     begintemp();
-    addtotemp(EXPR_SIMPLE, val);
+    addtotemp(EXPR_SIMPLE, val, 0);

     i = scan(scpriv, tokval);
     return finishtemp();
@@ -767,6 +812,8 @@ static expr *expr6(int critical)
     int64_t tmpval;
     bool rn_warn;
     char *scope;
+    int typesize = 0;
+rescan:

     switch (i) {
     case '-':
@@ -868,23 +915,44 @@ static expr *expr6(int critical)
         begintemp();
         switch (i) {
         case TOKEN_NUM:
-            addtotemp(EXPR_SIMPLE, tokval->t_integer);
+            addtotemp(EXPR_SIMPLE, tokval->t_integer, 0);
             break;
         case TOKEN_STR:
             tmpval = readstrnum(tokval->t_charptr, tokval->t_inttwo, &rn_warn);
             if (rn_warn)
                 nasm_error(ERR_WARNING|ERR_PASS1, "character constant too long");
-            addtotemp(EXPR_SIMPLE, tmpval);
+            addtotemp(EXPR_SIMPLE, tmpval, 0);
             break;
         case TOKEN_REG:
-            addtotemp(tokval->t_integer, 1L);
+            addtotemp(tokval->t_integer, 1L, 0);
             if (hint && hint->type == EAH_NOHINT)
                 hint->base = tokval->t_integer, hint->type = EAH_MAKEBASE;
             break;
         case TOKEN_ID:
         case TOKEN_INSN:
         case TOKEN_HERE:
-        case TOKEN_BASE:
+        case TOKEN_BASE: {
+            /* __settype__ type override prefix */
+            if (i == TOKEN_TYPEFUNC && tokval->t_integer == TYPEFUNC_SETTYPE) {
+                i = scan(scpriv, tokval);
+                if (i != '(') {
+                    nasm_error(ERR_NONFATAL, "`(' expected after `__settype__'");
+                    return NULL;
+                }
+                i = scan(scpriv, tokval);
+                typesize = parse_size(tokval->t_charptr);
+                if (!typesize) {
+                    nasm_error(ERR_NONFATAL, "invalid type in `__settype__'");
+                    return NULL;
+                }
+                i = scan(scpriv, tokval);
+                if (i != ')') {
+                    nasm_error(ERR_NONFATAL, "`)' expected to terminate `__settype__'");
+                    return NULL;
+                }
+                i = scan(scpriv, tokval);
+                goto rescan;
+            }
             /*
              * If !location.known, this indicates that no
              * symbol, Here or Base references are valid because we
@@ -896,7 +964,7 @@ static expr *expr6(int critical)
                            (i == TOKEN_HERE ? "`$'" :
                             i == TOKEN_BASE ? "`$$'" :
                             "symbol references"));
-                addtotemp(EXPR_UNKNOWN, 1L);
+                addtotemp(EXPR_UNKNOWN, 1L, 0);
                 break;
             }

@@ -908,6 +976,7 @@ static expr *expr6(int critical)
                 label_seg = in_absolute ? absolute.segment : location.segment;
                 label_ofs = in_absolute ? absolute.offset : location.offset;
             } else {
+                int label_typesize = 0;
                 if (!lookup_label(tokval->t_charptr, &label_seg, &label_ofs)) {
                     scope = local_scope(tokval->t_charptr);
                     if (critical == 2) {
@@ -927,15 +996,19 @@ static expr *expr6(int critical)
                         label_ofs = 1;
                     }
                 }
+                get_label_type(tokval->t_charptr, &label_typesize, NULL, NULL);
+                if (label_typesize != 0)
+                    typesize = label_typesize;
                 if (opflags && is_extern(tokval->t_charptr))
                     *opflags |= OPFLAG_EXTERN;
             }
-            addtotemp(type, label_ofs);
+            addtotemp(type, label_ofs, typesize);
             if (label_seg != NO_SEG)
-                addtotemp(EXPR_SEGBASE + label_seg, 1L);
+                addtotemp(EXPR_SEGBASE + label_seg, 1L, 0);
             break;
+        }
         case TOKEN_DECORATOR:
-            addtotemp(EXPR_RDSAE, tokval->t_integer);
+            addtotemp(EXPR_RDSAE, tokval->t_integer, 0);
             break;
         }
         i = scan(scpriv, tokval);
@@ -1005,7 +1078,7 @@ expr *evaluate(scanner sc, void *scpriva
             nasm_error(ERR_NONFATAL, "invalid right-hand operand to WRT");
             return NULL;
         }
-        addtotemp(EXPR_WRT, value);
+        addtotemp(EXPR_WRT, value, 0);
         g = finishtemp();
     }
     e = add_vectors(e, g);
--- asm/labels.c.bak 2017-11-29 20:44:08.000000000 +0100
+++ asm/labels.c 2017-12-28 01:00:00.000000000 +0100
@@ -94,6 +94,9 @@ union label { /* actua
         int64_t offset;
         char *label, *special;
         int is_global, is_norm;
+        int typesize;
+        char *typename;
+        int size;
     } defn;
     struct {
         int32_t movingon;
@@ -212,14 +215,58 @@ bool lookup_label(const char *label, int

     lptr = find_label(label, 0, NULL);
     if (lptr && (lptr->defn.is_global & DEFINED_BIT)) {
-        *segment = lptr->defn.segment;
-        *offset = lptr->defn.offset;
+        if (segment)
+            *segment = lptr->defn.segment;
+        if (offset)
+            *offset = lptr->defn.offset;
         return true;
     }

     return false;
 }

+char *get_prevlabel(void) {
+    return prevlabel;
+}
+
+void set_prevlabel(const char *label) {
+    prevlabel = (char *)label;
+}
+
+
+void get_label_type(const char *label, int *typesize, char **typename, int *size)
+{
+    union label *lptr;
+
+    if (!initialized)
+        return;
+
+    lptr = find_label(label, 0, NULL);
+    if (lptr && (lptr->defn.is_global & DEFINED_BIT)) {
+        if (typesize)
+            *typesize = lptr->defn.typesize;
+        if (typename)
+            *typename = lptr->defn.typename;
+        if (size)
+            *size = lptr->defn.size;
+    }
+}
+
+void set_label_type(const char *label, int typesize, const char *typename, int size)
+{
+    union label *lptr;
+
+    if (!initialized)
+        return;
+
+    lptr = find_label(label, 0, NULL);
+    if (lptr && (lptr->defn.is_global & DEFINED_BIT)) {
+        lptr->defn.typesize = typesize;
+        lptr->defn.typename = typename ? nasm_strdup(typename) : NULL;
+        lptr->defn.size = size;
+    }
+}
+
 bool is_extern(const char *label)
 {
     union label *lptr;
@@ -271,6 +318,8 @@ void redefine_label(char *label, int32_t

     lptr->defn.offset = offset;
     lptr->defn.segment = segment;
+    lptr->defn.typesize = lptr->defn.size = 0;
+    lptr->defn.typename = NULL;

     if (pass0 == 1) {
         exi = !!(lptr->defn.is_global & GLOBAL_BIT);
@@ -332,6 +381,8 @@ void define_label(char *label, int32_t s
     lptr->defn.segment = segment;
     lptr->defn.offset = offset;
     lptr->defn.is_norm = (!islocalchar(label[0]) && is_norm);
+    lptr->defn.typesize = lptr->defn.size = 0;
+    lptr->defn.typename = NULL;

     if (pass0 == 1 || (!is_norm && !isextrn && (segment > 0) && (segment & 1))) {
         exi = !!(lptr->defn.is_global & GLOBAL_BIT);
--- asm/nasm.c.bak 2017-11-29 20:44:08.000000000 +0100
+++ asm/nasm.c 2017-12-28 01:00:00.000000000 +0100
@@ -122,7 +122,7 @@ static struct RAA *offsets;
 static struct SAA *forwrefs; /* keep track of forward references */
 static const struct forwrefinfo *forwref;

-static const struct preproc_ops *preproc;
+const struct preproc_ops *preproc;

 #define OP_NORMAL (1u << 0)
 #define OP_PREPROCESS (1u << 1)
@@ -1312,6 +1312,7 @@ static void assemble_file(char *fname, S
     pass_max = prev_offset_changed = (INT_MAX >> 1) + 2; /* Almost unlimited */
     for (passn = 1; pass0 <= 2; passn++) {
         ldfunc def_label;
+        set_prevlabel("");

         pass1 = pass0 == 2 ? 2 : 1; /* 1, 1, 1, ..., 1, 2 */
         pass2 = passn > 1 ? 2 : 1; /* 1, 2, 2, ..., 2, 2 */
@@ -1468,7 +1469,7 @@ static void assemble_file(char *fname, S
                     /* this is done here so we can do debug type info */
                     int32_t typeinfo =
                         TYS_ELEMENTS(output_ins.operands);
-                    switch (output_ins.opcode) {
+                    switch (output_ins.orig_opcode) {
                     case I_RESB:
                         typeinfo =
                             TYS_ELEMENTS(output_ins.oprs[0].offset) | TY_BYTE;
@@ -1481,6 +1482,10 @@ static void assemble_file(char *fname, S
                         typeinfo =
                             TYS_ELEMENTS(output_ins.oprs[0].offset) | TY_DWORD;
                         break;
+                    case I_RESF:
+                        typeinfo =
+                            TYS_ELEMENTS(output_ins.oprs[0].offset) | TY_FWORD;
+                        break;
                     case I_RESQ:
                         typeinfo =
                             TYS_ELEMENTS(output_ins.oprs[0].offset) | TY_QWORD;
@@ -1513,6 +1518,9 @@ static void assemble_file(char *fname, S
                         else
                             typeinfo |= TY_DWORD;
                         break;
+                    case I_DF:
+                        typeinfo |= TY_FWORD;
+
... [truncated message content]
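
As a reading aid for the asm/eval.c hunks in the diff: the loop added to finishtemp() records the last non-zero typesize found in the term vector in element 0, so a typed operand determines the size of the whole expression. Below is a minimal standalone sketch of that rule only. struct term and propagate_typesize() are hypothetical names invented for this illustration; they are not part of the patch or of NASM, and the real expr structure has other users that this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one element of an expression
 * vector; only the field that matters for this sketch is modelled
 * faithfully. */
struct term {
    int  type;      /* 0 marks the terminating element */
    long value;
    int  typesize;  /* 0 means "no type attached" */
};

/* Scan all n elements (terminator included, as the patched loop does),
 * remember the last non-zero typesize, and store it in element 0.
 * Returns the size that was stored, or 0 for an untyped expression. */
int propagate_typesize(struct term *v, size_t n)
{
    int typesize = 0;
    size_t i;

    assert(v != NULL || n == 0);

    for (i = 0; i < n; i++)
        if (v[i].typesize != 0)
            typesize = v[i].typesize;   /* later terms override earlier ones */

    if (n > 0)
        v[0].typesize = typesize;
    return typesize;
}
```

Under this rule, an expression that mixes a dword-typed term with a later word-typed term ends up word-sized. That matches the comment in add_vectors() saying the result always gets the size of the last subexpression.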