#116 GAS-like align + logical NOT (diff attached)

closed
nobody
None
1
2008-02-16
2006-09-11
No

Hi,

Here is an implementation of the `p2align'
pseudo-instruction, which uses a GAS-like mechanism for
aligning code. The syntax is:
p2align 3 ; align on 8-byte boundary

p2align 3,,4 ; as above, but skip at most 4 bytes

p2align 3,0x90,4 ; as above, but pad with NOPs

p2align 5,0x00 ; align on 32-byte boundary using 0s
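
The padding arithmetic behind these four forms can be sketched as follows. This is a Python model, not NASM code and not the attached patch. The function names are hypothetical, and it assumes the GAS behaviour where the alignment is dropped entirely when it would need more than the maximum number of bytes.

```python
def p2align_padding(offset, power, max_skip=None):
    """Number of padding bytes needed to reach the next 2**power boundary.

    offset   -- current position within the section
    power    -- log2 of the boundary (3 means 8 bytes, 5 means 32 bytes)
    max_skip -- optional limit; assumption: if more bytes than this would be
                needed, no padding is emitted at all (as GAS does)
    """
    if power < 0:
        raise ValueError("power must be non-negative")
    boundary = 1 << power
    # Two's-complement trick: distance from offset up to the next multiple.
    padding = -offset & (boundary - 1)
    if max_skip is not None and padding > max_skip:
        return 0
    return padding


def p2align_fill(offset, power, fill, max_skip=None):
    """Padding bytes themselves, using a single fill byte such as 0x90 or 0x00."""
    return bytes([fill]) * p2align_padding(offset, power, max_skip)
```

For example, at offset 5 a `p2align 3` needs 3 bytes, while `p2align 3,,4` at offset 1 would need 7 bytes and is therefore skipped under the assumption above.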

There is also a new operator for critical expressions.
`!' performs much like logical NOT in C (except this
always returns 1 instead of whatever logical TRUE maps to):
!exp -> 0 if exp != 0
!exp -> 1 if exp == 0
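
The truth table above, written out as a Python model (a sketch only; `logical_not` and `to_flag` are hypothetical names, and this is not the eval.c code from the patch):

```python
def logical_not(value):
    """Model of the proposed `!' operator: 1 if value is zero, otherwise 0."""
    return 1 if value == 0 else 0


def to_flag(value):
    """Double negation normalises any non-zero value to exactly 1."""
    return logical_not(logical_not(value))
```

Because the result is always exactly 0 or 1, it can be used as a multiplier to switch a count on or off inside an expression.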

attached nasm-cvs-2006-09-12.patch contains the
implementation of the two (against CVS of 12 sep 2006)

attached nasm-0.98.39.patch contains the implementation
of the two (against 0.98.39 release)

attached nasmdoc.txt contains the description of the two.

attached palign.mac contains a `palign' macro which
does roughly the same as `p2align', and uses the
logical NOT operator

Discussion

  • Daniel Borca

    Daniel Borca - 2006-09-11

    p2align + logical NOT

     
  • nasm64developer

    nasm64developer - 2006-09-12
    • priority: 5 --> 1
     
  • nasm64developer

    nasm64developer - 2006-09-12

    Logged In: YES
    user_id=804543

    The proposed support for the "!" operator should
    be applied to eval.c and the user manual. I could
    have sworn that we had discussed this one before,
    but I can't seem to find a record at SF...

    By contrast the proposed support for the P2ALIGN
    pseudo-instruction should *NOT* be applied.

    Instead the existing USE16, USE32, and BITS macros
    should be enhanced as follows...

    %imacro bits 1+.nolist
    [bits %1]
    + %if (%1) == 16
    + %undef __BITS_32__
    + %define __BITS_16__
    + %endif
    + %if (%1) == 32
    + %undef __BITS_16__
    + %define __BITS_32__
    + %endif
    %endmacro

    %imacro use16 0.nolist
    [bits 16]
    + %undef __BITS_32__
    + %define __BITS_16__
    %endmacro
    %imacro use32 0.nolist
    [bits 32]
    + %undef __BITS_16__
    + %define __BITS_32__
    %endmacro

    ...that is, they should retain a notion of which
    mode the assembler is in, so that subsequent code
    can take mode-sensitive actions.

    At that point P2ALIGN should become a multi-line
    macro. Why is a macro preferable? Because it does
    not introduce new code into the assembler itself,
    and because it does not pollute the name space --
    that said, the macro should not be in standard.mac
    but rather in a file that the user can include if
    need be.

    Last but not least, unrelated requests should be
    filed separately. ;-)

     
  • nasm64developer

    nasm64developer - 2006-09-12

    Logged In: YES
    user_id=804543

    Two more comments.

    1. You might find these "filler" sequences useful.

    0000 90 <0> nop
    0001 8D1F <0> lea bx,[bx]
    0003 8D5F00 <0> lea bx,[byte bx]
    0006 8D9F0000 <0> lea bx,[word bx]
    000A 3E8D9F0000 <0> lea bx,[word ds:bx]

    0000 90 <0> nop
    0001 8D00 <0> lea eax,[eax]
    0003 8D4000 <0> lea eax,[byte eax]
    0006 8D442000 <0> lea eax,[byte sib0 eax]
    000A 3E8D442000 <0> lea eax,[byte sib0 ds:eax]
    000F 8D8000000000 <0> lea eax,[dword eax]
    0015 8D842000000000 <0> lea eax,[dword sib0 eax]
    001C 3E8D842000000000 <0> lea eax,[dword sib0 ds:eax]

    0000 90 <0> nop
    0001 4890 <0> xchg rax,rax
    0003 488D00 <0> lea rax,[rax]
    0006 488D4000 <0> lea rax,[byte rax]
    000A 488D442000 <0> lea rax,[byte sib0 rax]
    000F 64488D442000 <0> lea rax,[byte sib0 fs:rax]
    0015 488D8000000000 <0> lea rax,[dword rax]
    001C 488D842000000000 <0> lea rax,[dword sib0 rax]
    0024 64488D8420000000- <0> lea rax,[dword sib0 fs:rax]
    002C 00 <0>

    2. The recommended "filler" sequences vary from CPU to CPU.
    For example, for K8 AMD recommends 66h-prefixed NOPs.

    This is another reason why P2ALIGN should not be a
    pseudo-instruction, but rather a macro.
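
A minimal sketch of how a padding gap could be filled from the sequences listed above, one table per mode. It is written in Python as a model of the byte output, not as NASM macro code. The greedy longest-first choice and the names `FILLERS` and `build_padding` are assumptions; as noted in point 2, the preferred sequences differ between CPUs.

```python
# Filler encodings copied from the listings above, keyed by mode (bits).
FILLERS = {
    16: ["90", "8D1F", "8D5F00", "8D9F0000", "3E8D9F0000"],
    32: ["90", "8D00", "8D4000", "8D442000", "3E8D442000",
         "8D8000000000", "8D842000000000", "3E8D842000000000"],
    64: ["90", "4890", "488D00", "488D4000", "488D442000",
         "64488D442000", "488D8000000000", "488D842000000000",
         "64488D842000000000"],
}


def build_padding(length, bits):
    """Fill `length` bytes using the longest fillers that still fit."""
    if length < 0:
        raise ValueError("length must be non-negative")
    table = sorted((bytes.fromhex(h) for h in FILLERS[bits]),
                   key=len, reverse=True)
    out = bytearray()
    remaining = length
    while remaining > 0:
        # A 1-byte NOP is in every table, so a fitting filler always exists.
        filler = next(f for f in table if len(f) <= remaining)
        out += filler
        remaining -= len(filler)
    return bytes(out)
```

Keeping the tables as data is what makes the per-CPU variation easy to handle: a different target only needs a different table.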

     
  • Daniel Borca

    Daniel Borca - 2006-09-12

    Logged In: YES
    user_id=718668

    I agree the unrelated requests should be filed separately
    (as a pun, they *are* in separate files - inside the zip).

    I guess I am too lazy...

    Besides, they were spawned in the same context, of palign; I
    really needed that `!' to make palign macro.

    Although I'm not dying to see p2align in assembler, I still
    have a few notes on your reply:
    1. why introducing new code into the assembler itself is a
    bad thing? especially if code is only added, not changed.
    is the assembler code frozen?
    2. which namespace [pollution] are you referring to exactly?
    3. a macro can't "jump" if the padded section is too large
    4. i'm not all too happy to completely rely on the
    preprocessor (may run into troubles with nasm -a/-e)
    5. not sure about using prefixes (either seg or opsize).
    have to check a few specs, but i have the feeling it's not
    right.
    6. and last, but not least, the assembler knows the CPU type
    at assembly-time, thus p2align can use it for generating
    efficient fillers (something that's a killa using macros).

     
  • nasm64developer

    nasm64developer - 2006-09-13

    Logged In: YES
    user_id=804543

    > 1. why introducing new code into the assembler
    > itself is a bad thing?

    I didn't say that it is a bad thing.
    I merely stated my preference for using a macro.

    > especially if code is only added, not changed.

    Touch the assembler, and there's a chance of breaking it.
    Write a macro instead, and there's no such chance.

    > is the assembler code frozen?

    No.
    I just don't think it should be changed for this one.

    > 2. which namespace [pollution] are you referring to exactly?

    Existing code might use P2ALIGN as an identifier.
    Add it as a pseudo instruction, and said code will break.
    A macro avoids that problem: the user can name the macro.
    And yes, said macro should not reside in standard.mac.

    Granted, that problem applies to any new instruction.
    I just don't see a good reason for making it worse.
    That is, only add instructions if there really is a need.
    Otherwise use macros.

    > 3. a macro can't "jump" if the padded section is
    > too large

    I'm not sure what you mean by that.
    Can you explain?

    > 4. i'm not all too happy to completely rely on the
    > preprocessor (may run into troubles with nasm -a/-e)

    Yes, $ or $$ references won't work with -e.
    But then, how often do you use -e and then -a? Or just -e?
    In other words: most code wants the preprocessor anyway.

    > 5. not sure about using prefixes (either seg or
    > opsize). have to check a few specs, but i have
    > the feeling it's not right.

    Since LEA doesn't access memory, DS: and FS: are ignored.
    Also, my sequences don't use operand size prefixes.
    Except for REX.W in 64-bit mode, of course. :)

    > 6. and last, but not least, the assembler knows
    > the CPU type at assembly-time

    Note that "knows the CPU" covers two categories.

    First, the feature set of the target CPU.
    Think CPUID flags. Like "has MMX", "has (RD)TSC", etc.

    Second, the actual target CPU/implementation.
    Think scheduling. Like "avoid or prefer this encoding".

    (If you're familiar with GCC, think -march vs. -mtune.)

    > thus p2align can use it for generating efficient
    > fillers (something that's a killa using macros).

    So can a macro, with the method I showed.
    (That is, if the CPU macro leaves an indication around.)

     
  • Daniel Borca

    Daniel Borca - 2006-09-13

    Logged In: YES
    user_id=718668

    > Touch the assembler, and there's a chance of breaking it.
    > Write a macro instead, and there's no such chance.

    I'd say this is hardly the case with this patch. :)

    It doesn't *change* the functionality of the existing code
    (so no regression bugs should appear). If it breaks
    anything at all, the chances are skyhigh there's a bug in
    the existing code.

    > Existing code might use P2ALIGN as an identifier.
    > Granted, that problem applies to any new instruction.
    > I just don't see a good reason for making it worse.

    True. However, new instructions are added from time to time
    and the risk will always be with the user. This is not a
    reason for not adding them. Some quick tests indicate that
    trying to use P2ALIGN will not do the wrong thing silently,
    but trigger an error - which is a good thing.

    > Since LEA doesn't access memory, DS: and FS: are ignored.
    > Also, my sequences don't use operand size prefixes.
    > Except for REX.W in 64-bit mode, of course. :)

    Used and Decoded are two different things. The decoder has
    to... well... decode the prefixes to reach the instruction
    opcode and see that the whole sweat was for naught. As I
    said, I have to check a few specs.

    NB: My reference to the opsize prefix was for "for K8 AMD
    recommends 66h-prefixed NOPs".

    > Note that "knows the CPU" covers two categories.

    True. I wasn't specific enough. I was referring to the CPU
    directive in NASM, not the runtime capabilities of the
    current CPU.

    >> 3. a macro can't "jump" if the padded section is too
    >> large
    >
    > I'm not sure what you mean by that.
    > Can you explain?

    Alignments which generate a large number of bytes should
    have a JMP in front of them. Not sure that's doable using
    the preprocessor.
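
The byte layout being described can be sketched like this. It is a Python model of the output bytes, not NASM code. The threshold of 8 bytes and the name `pad_with_jump` are assumptions, and only the short `JMP rel8` form (opcode EBh) is modelled.

```python
NOP = 0x90
JMP_SHORT = 0xEB  # JMP rel8: two bytes, displacement counted from the next instruction


def pad_with_jump(pad_len, jump_threshold=8):
    """Return pad_len bytes of padding, jumping over it when it is long.

    jump_threshold is a hypothetical cut-off: gaps up to this size are
    filled with plain NOPs, larger gaps start with a short JMP.
    """
    if pad_len < 0:
        raise ValueError("pad_len must be non-negative")
    if pad_len < 2 or pad_len <= jump_threshold:
        return bytes([NOP]) * pad_len
    skipped = pad_len - 2  # bytes after the 2-byte JMP
    if skipped > 127:
        raise ValueError("a short JMP cannot skip more than 127 bytes")
    return bytes([JMP_SHORT, skipped]) + bytes([NOP]) * skipped
```

With the default threshold, a 16-byte gap becomes `EB 0E` followed by 14 NOPs, so execution reaches the aligned address after a single instruction.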

    > Yes, $ or $$ references won't work with -e.
    > In other words: most code wants the preprocessor anyway.

    Most, but not all. I'm thinking NASM as a back-end for
    whatever.

     
  • nasm64developer

    nasm64developer - 2006-09-14

    a powerful ALIGN macro... instead of the proposed pseudo instruction

     
  • nasm64developer

    nasm64developer - 2006-09-14

    Logged In: YES
    user_id=804543

    > Alignments which generate a large number of
    > bytes should have a JMP in front of them.
    > Not sure that's doable using the preprocessor.

    It is.

    Take a look at the attached macro. It supports
    arbitrary alignments, padding a la GAS (*OLD*),
    K8 (*NEW*), default (NOP or RESB 1), or a byte
    specified by the user, as well as skipping said
    padding if it takes more than 3 instructions.

    I hope I got it right... but I might have over-
    looked something in my long list of testcases.

    >> In other words: most code wants the preprocessor
    >> anyway.
    > Most, but not all.
    > I'm thinking NASM as a back-end for whatever.

    And that precludes the NASM preprocessor? ;-)

     
  • Daniel Borca

    Daniel Borca - 2006-09-14

    Logged In: YES
    user_id=718668

    > Take a look at the attached macro. It supports
    > arbitrary alignments, padding a la GAS (*OLD*),

    Good stuff! I suppose the 3rd parameter can also be done.

    However, it requires defining (or redefining) 6 additional
    macros. Since you brought the user-code concern in line,
    I'd say that this is more likely to cause silent problems
    than an additional pseudo-instruction.

    >> I'm thinking NASM as a back-end for whatever.
    >
    > And that precludes the NASM preprocessor? ;-)

    Not necessarily. But one might want to skip the
    preprocessing phase for speed.

    Mixing assembly-time constants with preprocessor constants
    is dangerous and confusing. Two different binaries will be
    produced, depending on how the source was compiled (via
    temporary preprocessed file or not).

    Not to mention that it defeats the purpose of a
    preprocess-only command-line switch. It could as well be
    removed, because "most code wants the preprocessor anyway".

    Ah well, they can always choose not to include the macro... :)

     
  • nasm64developer

    nasm64developer - 2006-09-14

    Logged In: YES
    user_id=804543

    > Good stuff!

    Thank you.

    > I suppose the 3rd parameter can also be done.

    You mean "skip the whole thing if it would emit
    more than x bytes"? Yeah, that would be trivial.

    > However, it requires defining (or redefining)
    > 6 additional macros.

    It merely requires one big fat ALIGN macro, and
    minor modifications to the SEGMENT/ABSOLUTE as
    well as the BITS/MODE macros. (Fwiw, the latter
    are done in the namespace that is more or less
    reserved for NASM, i.e. identifiers which begin
    with two underscore characters. Similar to C.)

    Also, said minor modifications to the existing
    macros come in useful for other purposes. (That
    is, knowing whether NASM is in a segment or in
    absolute space, as well as knowing whether it is
    in 16-bit or 32-bit mode, comes in quite handy.)

    > Since you brought the user-code concern in line,
    > I'd say that this is more likely to cause silent
    > problems than an additional pseudo-instruction.

    However, a user can (a) decide when to put this
    into his/her code, (b) how to name things, and
    (c) customize it further without having to make
    changes to NASM, wait for the next release, etc.

    Anyway. I can't think of anything else right now.

     
  • Nobody/Anonymous

    Logged In: NO

    The "logical NOT" part is in 0.99.

    A powerful align macro is attached.

    With that, this request is "done".

     
  • H. Peter Anvin

    H. Peter Anvin - 2008-02-16
    • status: open --> closed
     
  • H. Peter Anvin

    H. Peter Anvin - 2008-02-16

    Logged In: YES
    user_id=58697
    Originator: NO

    Looks like we have what we need for this to be done by a macro.

     
