
#3735 MOS6502 code generation incorrect for expression "x & ~y"

closed-fixed
None
MOS6502
5
2024-06-04
2024-05-21
No

For an expression (x & ~y), the mos6502 codegen will generate a faulty instruction sequence unless the result of this expression is stored to an intermediate variable.

An example using GBDK can be found here: https://github.com/michel-iwaniec/gbdk-2020/commit/22d37a1ae44fcdf5955329a5a4cc6b8861d1e00c

The example creates two global 8-bit variables:

uint8_t x = 0xFF;
uint8_t y = 0xAA;

As well as a macro and a function returning the (x & ~y) expression:

#define XANDNOTY_MACRO() (x & ~y)

uint8_t xandnoty_function()
{
    return (x & ~y);
}

The main function then calls a small helper function, print_hex, in 3 different ways:
1. By passing the return value of the function directly: print_hex(xandnoty_function());
2. By first storing the expression of the macro into a local variable r: r = XANDNOTY_MACRO(); print_hex(r);
3. By passing the expression of the macro directly: print_hex(XANDNOTY_MACRO());

Cases 1 and 2 work fine and give the expected value of 0x55, while case 3 does not.

Looking at the code generated for these cases, cases 1 and 2 each produce the short, expected sequence of instructions:
case 1:

      00C4E8                        115 _xandnoty_function:
                           000000   116     C$x_and_not_y.c$25$1_0$103 ==.
                                    117 ;   src/x_and_not_y.c: 25: return (x & ~y);
      00C4E8 AD 01 03         [ 4]  118     lda _y
      00C4EB 49 FF            [ 2]  119     eor #0xff
      00C4ED 2D 00 03         [ 4]  120     and _x
                           000008   121     C$x_and_not_y.c$26$1_0$103 ==.
                                    122 ;   src/x_and_not_y.c: 26: }
                           000008   123     C$x_and_not_y.c$26$1_0$103 ==.
                           000008   124     XG$xandnoty_function$0$0 ==.
      00C4F0 60               [ 6]  125     rts

[...]

                                    241 ;   src/x_and_not_y.c: 51: print_hex(xandnoty_function());
      00C559 20 E8 C4         [ 6]  242     jsr _xandnoty_function
      00C55C A2 00            [ 2]  243     ldx #0x00
      00C55E 20 F1 C4         [ 6]  244     jsr _print_hex

case 2:

                                    266 ;   src/x_and_not_y.c: 56: r = XANDNOTY_MACRO();
      00C57C AD 01 03         [ 4]  267     lda _y
      00C57F 49 FF            [ 2]  268     eor #0xff
      00C581 2D 00 03         [ 4]  269     and _x
                           00009C   270     C$x_and_not_y.c$57$1_0$107 ==.
                                    271 ;   src/x_and_not_y.c: 57: print_hex(r);
      00C584 A2 00            [ 2]  272     ldx #0x00
      00C586 20 F1 C4         [ 6]  273     jsr _print_hex

Whereas case 3 generates a very long sequence of instructions, where the value of the "lda _y" gets immediately overwritten by a "ldx #0x00 / txa" pair:

                                    295 ;   src/x_and_not_y.c: 62: print_hex(XANDNOTY_MACRO());
      00C5A4 AD 01 03         [ 4]  296     lda _y
      00C5A7 A2 00            [ 2]  297     ldx #0x00
      00C5A9 8A               [ 2]  298     txa
      00C5AA 49 FF            [ 2]  299     eor #0xff
      00C5AC 85 2F            [ 3]  300     sta (_main_sloc0_1_0 + 1)
      00C5AE 49 FF            [ 2]  301     eor #0xff
      00C5B0 85 2E            [ 3]  302     sta _main_sloc0_1_0
      00C5B2 AD 00 03         [ 4]  303     lda _x
      00C5B5 25 2E            [ 3]  304     and _main_sloc0_1_0
      00C5B7 85 34            [ 3]  305     sta (REGTEMP+0)
      00C5B9 8A               [ 2]  306     txa
      00C5BA 25 2F            [ 3]  307     and (_main_sloc0_1_0 + 1)
      00C5BC AA               [ 2]  308     tax
      00C5BD A5 34            [ 3]  309     lda (REGTEMP+0)
      00C5BF 20 F1 C4         [ 6]  310     jsr _print_hex
                           0000DA   311     C$x_and_not_y.c$63$1_0$107 ==.
                                    312 ;   src/x_and_not_y.c: 63: vsync();
      00C5C2 20 58 C3         [ 6]  313     jsr _vsync
                           0000DD   314     C$x_and_not_y.c$64$1_0$107 ==.
                                    315 ;   src/x_and_not_y.c: 64: }
                           0000DD   316     C$x_and_not_y.c$64$1_0$107 ==.
                           0000DD   317     XG$main$0$0 ==.
      00C5C5 60               [ 6]  318     rts

  Other targets like Z80 and SM83 do not seem to suffer from this kind of issue.
1 Attachments

Discussion

  • Michel Iwaniec

    Michel Iwaniec - 2024-05-21

    Case 3 in the above (the actual failing case) was a bit incorrect/badly formatted, but I can't edit the ticket.

    It should say:

    Whereas case 3 generates a very long sequence of instructions, where the value of the "lda _y" gets immediately overwritten by a "ldx #0x00 / txa" pair:

    295 ; src/x_and_not_y.c: 62: print_hex(XANDNOTY_MACRO());
    00C5A4 AD 01 03 [ 4] 296 lda _y
    00C5A7 A2 00 [ 2] 297 ldx #0x00
    00C5A9 8A [ 2] 298 txa
    00C5AA 49 FF [ 2] 299 eor #0xff
    00C5AC 85 2F [ 3] 300 sta (_main_sloc0_1_0 + 1)
    00C5AE 49 FF [ 2] 301 eor #0xff
    00C5B0 85 2E [ 3] 302 sta _main_sloc0_1_0
    00C5B2 AD 00 03 [ 4] 303 lda _x
    00C5B5 25 2E [ 3] 304 and _main_sloc0_1_0
    00C5B7 85 34 [ 3] 305 sta (REGTEMP+0)
    00C5B9 8A [ 2] 306 txa
    00C5BA 25 2F [ 3] 307 and (_main_sloc0_1_0 + 1)
    00C5BC AA [ 2] 308 tax
    00C5BD A5 34 [ 3] 309 lda (REGTEMP+0)
    00C5BF 20 F1 C4 [ 6] 310 jsr _print_hex
    
     
  • Michel Iwaniec

    Michel Iwaniec - 2024-06-04

    Seems it was actually fixed in [r14650]. So this ticket can be closed.

    Case 3 still generates a 16-bit sequence, which looks redundant to me given that the two input variables and the function parameter are all 8-bit.
    But that's not a functional bug, just a possible optimization, assuming the compiler can determine this.

     

    Related

    Commit: [r14650]


    Last edit: Maarten Brock 2024-06-06
  • Philipp Klaus Krause

    • status: open --> closed-fixed
    • assigned_to: Philipp Klaus Krause
     
