
#43 Mishandled instructions

1.54.1900
open
nobody
None
5
2019-08-19
2019-08-19
No

I don't expect these have adversely impacted anyone, but for the sake of completeness:

  • On the 6510, opcode $93 is "SHA (ZP),Y", but 64tass is expecting "SHA ABS,X". (Reference: NMOS 6510 Unintended Opcodes)
  • On the 65816, the COP instruction should take a 1-byte argument. 64tass requires it not to have one.
  • On the 65816, opcode $42 is WDM, which is vaguely specified. I would argue that it should be treated the same as COP, since it's meaningless without an argument. 64tass does not recognize this at all.
  • The BRK instruction is documented (starting with the 65C02?) as a two-byte instruction that takes an optional argument. 64tass does not allow an arg. (Most things treat it as a 1-byte instruction, possibly because that's how the 6502 data sheet had it.)

I discovered these while writing the regression tests for a disassembler that can generate 64tass output (https://6502bench.com/).

Discussion

  • Soci/Singular

    Soci/Singular - 2019-08-19

    It's fixed now in r1915, thanks!

    ;Hex            ;Monitor        ;Source
    93 0c           sha ($0c),y     sha (12),y
    93 0c           sha ($0c),y     ahx (12),y
    

    Both COP and BRK had parameters which could be optionally omitted since 1.46 (2010) so I'm not sure what is meant.

    02              cop #           cop
    02 03           cop #$03        cop #3
    00              brk #           brk
    00 03           brk #$03        brk #3
    

    I've added WDM now:

    42 09           wdm #$09        wdm #9
    
     
  • Andy McFadden

    Andy McFadden - 2019-08-19

    Ah. BRK/COP are normally written without the '#'. For example, ACME and cc65 accept "COP 3" but reject "COP #3". Merlin 32 accepts both forms.

    In terms of reference materials, Programming the 65816 (Eyes & Lichty) p. 448 shows "COP const", and on the facing page the immediate mode for CPX shows "CPX #const". So the custom probably originated from the spec sheet recommendations. Also, the COP argument is described as mandatory, and is required by ACME and cc65.

    BRK takes an optional non-immediate argument in cc65 (BRK 3 or BRK), and again '#' is not allowed. ACME doesn't allow a BRK arg at all. Merlin 32 allows everything, because that's how they roll. :-)

    I think this situation is different from the MVN/MVP operands, where immediate mode means something different from non-immediate (8-bit constant vs. bank byte of value).

    So if you want to do the "standard" thing, COP should require a non-immediate 8-bit argument, and BRK should allow an optional non-immediate 8-bit argument. I can make the SourceGen code generator work with whatever 64tass accepts, but it's best if code doesn't stop working when the assembler gets an update, so it'd be helpful to know which format 64tass will accept in the future. (I'm currently just outputting a pair of hex bytes for COP in 64tass, and frankly COP is pretty rare, so this is not a high priority. I also recently switched to single-byte BRK because I felt like I was paddling against the current, but that may become selectable in the future.)

    FWIW, I'd treat WDM in whatever way you treat COP (i.e. same argument format and mandatory-ness). AFAIK nothing has ever actually used WDM, but it ought to be defined as something for the sake of completeness, and making it like COP seems reasonable.
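The BRK/COP conventions described so far can be summarized as a small lookup table. The following is only a sketch of how a code generator might pick the operand form per assembler; it is not SourceGen's actual code. The names `STYLES` and `format_software_interrupt` are hypothetical, lowercase mnemonics and `$xx` hex formatting are assumptions, and Merlin 32 is left out because it accepts both forms.

```python
# Operand conventions for BRK/COP as reported in this thread.
# "prefix" is placed before the operand, "cop_required" says whether COP
# must have an operand, "brk_allowed" says whether BRK may have one.
STYLES = {
    "64tass": {"prefix": "#", "cop_required": False, "brk_allowed": True},
    "acme": {"prefix": "", "cop_required": True, "brk_allowed": False},
    "cc65": {"prefix": "", "cop_required": True, "brk_allowed": True},
}


def format_software_interrupt(mnemonic, operand, assembler):
    """Return source text for BRK or COP in the given assembler's style.

    operand is an int in 0..255, or None to omit the operand.
    Raises ValueError when the assembler would reject the combination.
    """
    name = mnemonic.lower()
    if name not in ("brk", "cop"):
        raise ValueError(f"unsupported mnemonic: {mnemonic}")
    style = STYLES[assembler]

    if operand is None:
        if name == "cop" and style["cop_required"]:
            raise ValueError(f"{assembler} requires an operand for COP")
        return name

    if not 0 <= operand <= 0xFF:
        raise ValueError("operand must fit in one byte")
    if name == "brk" and not style["brk_allowed"]:
        raise ValueError(f"{assembler} does not accept an operand for BRK")
    return f"{name} {style['prefix']}${operand:02x}"
```

For example, `format_software_interrupt("cop", 3, "64tass")` gives `cop #$03`, while the same call with `"cc65"` gives `cop $03`.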

     
  • Soci/Singular

    Soci/Singular - 2019-08-19

    BRK, COP and WDM are in the immediate column: they do not fetch their parameter from elsewhere, only from the byte immediately after the instruction. Using a number alone implies they fetch it from that address, but that's not what happens.

    Therefore I stick to the immediate mode for them. It's not going to change to the alternatives proposed.

    MVP/MVN have two immediate operands by definition. They are not addresses but immediately loaded bank constants. The plain number variant may give warnings or errors in future.

    BRK/COP have their parameters optional as their handling is usually implemented in software through vectors. At least for BRK, monitor programs routinely fix its return address up. WDM has no vector and cannot be custom handled, so its parameter is mandatory.

    Btw. while we are here I'd mention a few other related things.

    ASL/LSR/ROL/ROR/INC/DEC and other instructions which operate on the accumulator need the accumulator written out explicitly as the addressing mode (e.g. ASL A). The implied variant is not preferred and does not compile in all cases. INC A is preferred over INA.

    For PEI the zeropage addressing is preferred without any indirection. E.g. PEI $12 and not PEI ($12). There is no indirection involved and it just loads the 16 bit value from there.

    For PEA immediate addressing #$1234 mode is preferred and not the absolute. It does not load from the address, it's just an immediate numeric constant.

    For PER it's program bank relative $1234. It can be thought of as a deferred relative jump. As the value depends on the instruction's location it's not an immediate constant.

    Instructions, directives, register names are preferred to be written lower case. Using them in upper case won't work with case sensitivity enabled and there are users who want to have it on.

    Source code format is preferred to be encoded in UTF-8 if non-ASCII letters are needed. Use NFKC normalization. BOM is not needed. Labels may contain Unicode letters as per UAX #31 but choosing the right symbols is the user's responsibility. Tabs/spaces and line ending style does not matter.

    Label definitions are preferred to not use any trailing colons unless it's absolutely required. That is when they match an instruction name (e.g. sec). They should start in the first column or else some of them may trigger warnings (mistyped implied instructions).

    The .addr directive is preferred for 16 bit addresses and .word/.sint for numeric constants (if they can be distinguished).

    If for any reason path names need to be put in the sources, these should only use forward slashes as the path separator and must be relative. File name case should match as well.
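A generator that emits include paths could apply the path rule with a check along these lines. This is only a sketch; `to_portable_source_path` is a hypothetical helper, and treating a leading slash or a drive letter as "not relative" is an assumption about what counts as an absolute path. It does not check file name case.

```python
import re


def to_portable_source_path(path):
    """Convert a path to forward slashes and reject non-relative paths."""
    normalized = path.replace("\\", "/")
    # Assumption: a leading slash or a drive letter such as "C:" marks
    # an absolute path.
    if normalized.startswith("/") or re.match(r"[A-Za-z]:", normalized):
        raise ValueError(f"path must be relative: {path}")
    return normalized
```

For example, `to_portable_source_path("lib\\macros.asm")` returns `lib/macros.asm`.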

    Btw. it's an interesting project even if I can't use it for now.

     
  • Andy McFadden

    Andy McFadden - 2019-08-19

    Great, that's what I needed to know for BRK/COP. I'll add a '#' for 64tass.

    For MVN/MVP, most assemblers accept "MVN #1,#2", the exception being ACME (there's a ticket open). Most others, including the official WDC tools (see https://westerndesigncenter.com/wdc/documentation/Assembler_Linker.pdf page 53), accept non-immediate values, which are interpreted as 24-bit. The operands are right-shifted by 16 bits to get the bank byte. So "MVN #1,#2" is the same as "MVN $010000,$020000". (cc65 v2.18 supports this, as do at least some Apple IIgs assemblers. Merlin 32 supports this, and also lets you write "MVN 1,2", which is potentially ambiguous.)
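The relationship between the two MVN/MVP operand forms is just a shift. As a rough sketch (the helper names are hypothetical, and the lowercase `#$xx` output format is an assumption modeled on the 64tass listings earlier in the thread):

```python
def bank_byte(address):
    """Return the bank byte (bits 16-23) of a 24-bit address."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    return address >> 16


def block_move_immediate(mnemonic, src_address, dst_address):
    """Rewrite 'mvn src24,dst24' in the immediate bank-constant form."""
    return (
        f"{mnemonic.lower()} "
        f"#${bank_byte(src_address):02x},#${bank_byte(dst_address):02x}"
    )
```

With these definitions, `block_move_immediate("MVN", 0x010000, 0x020000)` returns `mvn #$01,#$02`.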

    All assemblers I've worked with accept PEI ($12), PEA $1234, and PER $1234, so that's what I generate. Eyes & Lichty show this syntax, but point out that it's misleading. (So it might be a bad choice, but it's apparently a deliberate choice.) The page from the WDC tool manual shows multiple variations for PEA, PEI, and PER, so perhaps they have relented. (For PER it shows "PER label" and "PER #offset", so I assume PER with an immediate constant would be specifying the actual offset value rather than the target address.)

    I personally prefer an explicit accumulator operand (LSR A), so I output that by default. ACME rejects it, so I suppress it for them.

    If you want to see what all 256 opcodes look like when generated for 64tass (including the updated COP): https://github.com/fadden/6502bench/blob/master/SourceGen/SGTestData/Expected/1000-allops-value-65816_64tass.S

    (Assembled with --case-sensitive --nostart --long-address -Wall ... normally that appears in a comment at the top so people know how to run the assembler, but I suppress that for the regression tests. The tests also suppress the automatic generation of local labels, because I'm still fiddling with the localization and don't want all my tests to break if I change it.)

     
