
#986 z80 access to global/static variable through (IY) (worst case in perf/size)

open
nobody
None
5
2026-05-03
2025-02-17
No

sdcc version: 4.5.0 #15242 (Linux)

Compile the attached file for the z80 with:

sdcc -mz80 --opt-code-speed -c main.c

Then compare the two functions. The one that accesses the global/static variable directly gets the worst-case code in both speed and size, because it reaches the variable through (IY). The second one, which accesses the same variables through a pointer, gets much better code that uses the HL register.

This hurts performance badly: IY-indexed instructions carry an extra prefix and displacement byte and take more cycles than their HL or register counterparts, which makes IY the most expensive register to use.
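
To put numbers on that cost, here is a small sketch in C that tabulates the standard Z80 size and timing of one `sra` for the three operand kinds. The struct and function names are made up for illustration, and the helper deliberately ignores pointer setup and any `inc`/`dec` between the two bytes.

```c
/* Standard Z80 costs of one arithmetic right shift, by operand kind.
   `rr` has the same size and timing as `sra` in each addressing mode. */
struct shift_cost {
    const char *operand;
    int bytes;
    int t_states;
};

static const struct shift_cost sra_costs[] = {
    { "sra d",      2,  8 },  /* register operand */
    { "sra (hl)",   2, 15 },  /* memory through HL */
    { "sra 1 (iy)", 4, 23 },  /* memory through IY plus displacement */
};

/* T-states for shifting a 16-bit value right by `count` bits:
   one sra (high byte) plus one rr (low byte) per bit.
   Hypothetical helper; excludes pointer loads and inc/dec. */
int shift16_t_states(const struct shift_cost *c, int count)
{
    return 2 * c->t_states * count;
}
```

For a shift by 4 this gives 64 T-states in registers, 120 through (HL) and 184 through (IY), before counting the `ld iy, #nn` needed to set up the index register.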

2 Attachments

Discussion

  • Janko Stamenović

    It seems to me that SDCC treats accesses to globals and statics as if the copy in memory had to be kept up to date after every single operation.

    Starting with the function:

    int gOffsetX = 100;
    
    int calc(int i) {
        static int dx;
        dx = gOffsetX >> 4;
        dx += i;
        return dx;
    }
    

    even with

    sdcc -mz80 --opt-code-speed --reserve-regs-iy \
        --max-allocs-per-node2900000 -c t.c
    

    we get many read-modify-write updates of the memory reserved for dx, even though keeping the value in a temporary register pair would be better:

    _calc::
        ex  de, hl
    ;t.c:5: dx = gOffsetX >> 4;
        ld  a, (#_gOffsetX)
        ld  (#_calc_dx_10000_2),a
        ld  a, (#_gOffsetX + 1)
        ld  hl, #_calc_dx_10000_2 + 1
        ld  (hl), a
        sra (hl)
        dec hl
        rr  (hl)
        inc hl
        sra (hl)
        dec hl
        rr  (hl)
        inc hl
        sra (hl)
        dec hl
        rr  (hl)
        inc hl
        sra (hl)
        dec hl
        rr  (hl)
    ;t.c:6: dx += i;
        ld  a, (hl)
        add a, e
        ld  (hl), a
        inc hl
        ld  a, (hl)
        adc a, d
        ld  (hl), a
    ;t.c:7: return dx;
        dec hl
        ld  e, (hl)
        inc hl
        ld  d, (hl)
    ;t.c:8: }
        ret
    

    Explicitly using the temporary variable produces much better code:

    int gOffsetX = 100;
    
    int calc_t(int i) {
        static int dx;
        int tdx = gOffsetX >> 4;
        tdx += i;
        dx = tdx;
        return tdx;
    }
    
    _calc_t::
        ld  c, l
        ld  b, h
    ;t.c:20: int tdx = gOffsetX >> 4;
        ld  de, (_gOffsetX)
        sra d
        rr  e
        sra d
        rr  e
        sra d
        rr  e
        sra d
        rr  e
    ;t.c:21: tdx += i;
        ex  de, hl
        add hl, bc
    ;t.c:22: dx = tdx;
        ld  (_calc_t_dx_10000_6), hl
    ;t.c:23: return tdx;
        ex  de, hl
    ;t.c:24: }
        ret
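
    For anyone who wants to double-check that the rewrite is behavior-preserving, a minimal host-side sketch follows. The body of calc_t is taken from the source comments in the listing above (t.c:20 to t.c:23); `variants_agree` is a hypothetical helper, and on a host compiler `int` is usually 32 bits rather than the 16 bits of the z80 port, so this only checks the logic, not overflow behavior.

```c
int gOffsetX = 100;

/* Variant 1: works directly on the static variable. */
int calc(int i)
{
    static int dx;
    dx = gOffsetX >> 4;
    dx += i;
    return dx;
}

/* Variant 2: works on a local temporary, stores to the static once. */
int calc_t(int i)
{
    static int dx;
    int tdx = gOffsetX >> 4;
    tdx += i;
    dx = tdx;
    return tdx;
}

/* Hypothetical helper: returns 1 if both variants agree on a range of inputs. */
int variants_agree(void)
{
    for (int i = -1000; i <= 1000; ++i) {
        if (calc(i) != calc_t(i))
            return 0;
    }
    return 1;
}
```

    Both functions return the same value for every input, so the only difference is how often the static is touched in memory.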
    

    To confirm the initial report: as soon as --reserve-regs-iy is not specified, calc is compiled (TD- 4.5.2 #15383), even with --max-allocs-per-node2900000, to:

    _calc::
        ex  de, hl
    ;t.c:5: dx = gOffsetX >> 4;
        ld  a, (_gOffsetX+0)
        ld  (_calc_dx_10000_2+0), a
        ld  a, (_gOffsetX+1)
        ld  (_calc_dx_10000_2+1), a
        ld  iy, #_calc_dx_10000_2
        sra 1 (iy)
        rr  0 (iy)
        sra 1 (iy)
        rr  0 (iy)
        sra 1 (iy)
        rr  0 (iy)
        sra 1 (iy)
        rr  0 (iy)
    ;t.c:6: dx += i;
        ld  hl, #_calc_dx_10000_2
        ld  a, (hl)
        add a, e
        ld  (hl), a
        inc hl
        ld  a, (hl)
        adc a, d
        ld  (hl), a
    ;t.c:7: return dx;
        ld  de, (_calc_dx_10000_2)
    ;t.c:8: }
        ret
    
     

    Last edit: Janko Stamenović 2025-10-13
  • Ragozini Arturo

    Ragozini Arturo - 2026-05-03

    The generated code from v4.5 is very disappointing...
    I hope you have improved this issue in the current beta

     
    • Philipp Klaus Krause

      No, I still see the same code being generated using SDCC from current trunk.

       
  • Ragozini Arturo

    Ragozini Arturo - 2026-05-03

    Hand-coding the C function from the example in assembly leads to

    _calc_t:
    ; int tdx = gOffsetX >> 4;
        ld  de, (_gOffsetX)
        sra d
        rr  e
        sra d
        rr  e
        sra d
        rr  e
        sra d
        rr  e
    ; tdx += i;
        add hl, de
    ; dx = tdx;
        ld  (_calc_t_dx_10000_6), hl
    ; return tdx;
        ex  de, hl
        ret
    

    Even when IY is not used, the compiler ignores the fact that addition is commutative: i can stay in HL and be added to DE directly, instead of first being copied to BC and then swapped back.
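
    As a rough tally of what that costs, the sketch below sums standard Z80 T-states for the SDCC-generated `_calc_t` and for the hand-written version. The table layout and names are only for illustration; the instruction sequences are the ones quoted in this thread.

```c
/* One row per distinct instruction: how often it occurs and its T-states. */
struct insn {
    const char *text;
    int count;
    int t_states;
};

/* SDCC output for _calc_t as quoted earlier in the thread. */
static const struct insn sdcc_calc_t[] = {
    { "ld c, l",      1,  4 },
    { "ld b, h",      1,  4 },
    { "ld de, (nn)",  1, 20 },
    { "sra d",        4,  8 },
    { "rr e",         4,  8 },
    { "ex de, hl",    2,  4 },
    { "add hl, bc",   1, 11 },
    { "ld (nn), hl",  1, 16 },
    { "ret",          1, 10 },
};

/* Hand-written version: i stays in HL and is added to DE directly. */
static const struct insn hand_calc_t[] = {
    { "ld de, (nn)",  1, 20 },
    { "sra d",        4,  8 },
    { "rr e",         4,  8 },
    { "add hl, de",   1, 11 },
    { "ld (nn), hl",  1, 16 },
    { "ex de, hl",    1,  4 },
    { "ret",          1, 10 },
};

#define SDCC_ROWS ((int)(sizeof sdcc_calc_t / sizeof sdcc_calc_t[0]))
#define HAND_ROWS ((int)(sizeof hand_calc_t / sizeof hand_calc_t[0]))

/* Sum count * t_states over the first n rows of a sequence. */
int total_t_states(const struct insn *seq, int n)
{
    int total = 0;
    for (int k = 0; k < n; ++k)
        total += seq[k].count * seq[k].t_states;
    return total;
}
```

    That works out to 137 T-states for the compiler output against 125 for the hand-written routine: the two register copies and the extra `ex de, hl` account for the 12 T-state (and 3 byte) difference.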

     
