#3196 [Z80] Small regression since r12108

closed
z80 (18)
Z80
5
2021-04-30
2021-03-19
No

Some recent changes make the generated code less optimized:

 ;--------------------------------------------------------
 ; File Created by SDCC : free open source ANSI-C Compiler
-; Version 4.1.1 #12108 (Linux)
+; Version 4.1.2 #12144 (Linux)
 ;--------------------------------------------------------
        .module asincosf
        .optsdcc -mz80
@@ -162,12 +162,15 @@
        pop     af
        pop     af
        ex      (sp), hl
-;../../../../sdcc/device/lib/z80/../asincosf.c:73: y = sqrtf(g);
        ld      -7 (ix), e
        ld      -6 (ix), d
+;../../../../sdcc/device/lib/z80/../asincosf.c:73: y = sqrtf(g);

+       pop     de
+       pop     hl
+       ex      de, hl
+       push    de
+       push    hl
+       pop     hl
+       push    hl
        push    de
-       ld      l, -9 (ix)
-       ld      h, -8 (ix)
        push    hl
        call    _sqrtf

There are a lot of such changes in the standard library, where registers are now loaded using pop/push instead of access via IX.

But there is also an inverse change:

 _atomic_flag_clear::
 ;../../../../sdcc/device/lib/z80/../atomic_flag_clear.c:34: object->flag = 1;

-       pop     bc
-       pop     hl
-       push    hl
-       push    bc
+       ld      hl, #2
+       add     hl, sp
+       ld      a, (hl)
+       inc     hl
+       ld      h, (hl)
+       ld      l, a
        ld      (hl), #0x01
 ;../../../../sdcc/device/lib/z80/../atomic_flag_clear.c:40: }
        ret
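
For a rough sense of what this inverse change costs, the two sequences can be tallied using the standard documented Z80 instruction sizes and T-state counts. The figures below are added for illustration and are not part of the compiler output. They apply to the plain Z80 only; other targets of this backend, such as gbz80, have different timings.

```asm
; r12108: load the pointer argument by popping and re-pushing
        pop     bc              ; 1 byte,  10 cycles (return address)
        pop     hl              ; 1 byte,  10 cycles (argument)
        push    hl              ; 1 byte,  11 cycles
        push    bc              ; 1 byte,  11 cycles
                                ; total: 4 bytes, 42 cycles

; r12144: load the same argument via sp-relative addressing
        ld      hl, #2          ; 3 bytes, 10 cycles
        add     hl, sp          ; 1 byte,  11 cycles
        ld      a, (hl)         ; 1 byte,   7 cycles
        inc     hl              ; 1 byte,   6 cycles
        ld      h, (hl)         ; 1 byte,   7 cycles
        ld      l, a            ; 1 byte,   4 cycles
                                ; total: 8 bytes, 45 cycles
```

On these figures the newer sequence is both larger and slightly slower on a plain Z80. It clobbers a where the older sequence clobbers bc.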

Discussion

  • Sebastian Riedel

    Oh, the second one looks like a part I touched.
    I disabled the pop/pop/push/push sequence for gbz80 if size optimization isn't active.
    I'm going to check whether I messed up the condition.

     
  • Philipp Klaus Krause

    • status: open --> closed
    • assigned_to: Philipp Klaus Krause
    • Category: other --> Z80
     
  • Philipp Klaus Krause

    I had a look at the situation in current [r12261].
    The code looks good to me now.
    E.g. in asincosf, we use

        pop     de
        pop     hl
        push    hl
        push    de
    

    to load from stack into hl. This is 4 bytes and 42 cycles, while normal access via the frame pointer would be 6 bytes and 38 cycles. That is a reasonable choice for the default and when optimizing for code size.

    However, in [r12262] I made a change so that access via the frame pointer is used instead when optimizing for code speed.
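
    The byte and cycle counts quoted above can be reproduced per instruction as follows, assuming the standard documented Z80 timings (10 cycles for a 16-bit pop, 11 for a push, 19 for an ix-indexed 8-bit load). The -9 (ix) and -8 (ix) offsets are taken from the asincosf diff in the report and serve only as an example.

    ```asm
    ; default and code size: pop/pop/push/push
            pop     de              ; 1 byte,  10 cycles
            pop     hl              ; 1 byte,  10 cycles
            push    hl              ; 1 byte,  11 cycles
            push    de              ; 1 byte,  11 cycles
                                    ; total: 4 bytes, 42 cycles

    ; code speed: access via the frame pointer
            ld      l, -9 (ix)      ; 3 bytes, 19 cycles
            ld      h, -8 (ix)      ; 3 bytes, 19 cycles
                                    ; total: 6 bytes, 38 cycles
    ```

    In SDCC these two modes are normally selected with the --opt-code-size and --opt-code-speed command-line options.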

     

    Last edit: Maarten Brock 2021-05-13
