Some recent changes make the generated code less optimized:
;--------------------------------------------------------
; File Created by SDCC : free open source ANSI-C Compiler
-; Version 4.1.1 #12108 (Linux)
+; Version 4.1.2 #12144 (Linux)
;--------------------------------------------------------
.module asincosf
.optsdcc -mz80
@@ -162,12 +162,15 @@
pop af
pop af
ex (sp), hl
-;../../../../sdcc/device/lib/z80/../asincosf.c:73: y = sqrtf(g);
ld -7 (ix), e
ld -6 (ix), d
+;../../../../sdcc/device/lib/z80/../asincosf.c:73: y = sqrtf(g);
+ pop de
+ pop hl
+ ex de, hl
+ push de
+ push hl
+ pop hl
+ push hl
push de
- ld l, -9 (ix)
- ld h, -8 (ix)
push hl
call _sqrtf
There are a lot of such changes in the standard library, where registers are now loaded using pop/push instead of access via IX.
But there is also the inverse change:
_atomic_flag_clear::
;../../../../sdcc/device/lib/z80/../atomic_flag_clear.c:34: object->flag = 1;
- pop bc
- pop hl
- push hl
- push bc
+ ld hl, #2
+ add hl, sp
+ ld a, (hl)
+ inc hl
+ ld h, (hl)
+ ld l, a
ld (hl), #0x01
;../../../../sdcc/device/lib/z80/../atomic_flag_clear.c:40: }
ret
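For reference, the two sequences in this diff can be costed by hand. This is only a sketch using standard instruction sizes and T-state timings for a plain Z80; other targets such as gbz80 have different timings, so the comparison may come out differently there.

```asm
; old: fetch the pointer argument by popping past the return address
    pop  bc          ; 1 byte,  10 cycles (return address)
    pop  hl          ; 1 byte,  10 cycles (object)
    push hl          ; 1 byte,  11 cycles
    push bc          ; 1 byte,  11 cycles
                     ; total: 4 bytes, 42 cycles

; new: compute the address sp+2 and load the pointer through hl
    ld   hl, #2      ; 3 bytes, 10 cycles
    add  hl, sp      ; 1 byte,  11 cycles
    ld   a, (hl)     ; 1 byte,   7 cycles
    inc  hl          ; 1 byte,   6 cycles
    ld   h, (hl)     ; 1 byte,   7 cycles
    ld   l, a        ; 1 byte,   4 cycles
                     ; total: 8 bytes, 45 cycles
```

On these numbers the new sequence is twice the size and slightly slower on Z80. The old one does clobber bc, while the new one clobbers a.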
Oh, the second one looks like a part I touched.
I disabled pop pop push push for gbz80 if size optimization isn't active.
Gonna check whether I messed up the condition.
I had a look at the situation in current [r12261].
The code looks good to me now.
E.g. in asincosf, we use a pop/pop/push/push sequence to load from the stack into hl. This is 4 bytes and 42 cycles, while normal access via the frame pointer would be 6 bytes and 38 cycles. That is a reasonable choice for the default and when optimizing for code size.
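The byte and cycle counts quoted here can be checked against standard Z80 instruction sizes and timings. The listing below is a sketch only: it assumes the stack-based load is a pop/pop/push/push sequence with de as the scratch register (the exact registers are not shown in the post), and it reuses the frame-pointer offsets from the diff earlier in the thread.

```asm
; 16-bit load into hl via the stack (register choice is an assumption)
    pop  de          ; 1 byte,  10 cycles
    pop  hl          ; 1 byte,  10 cycles
    push hl          ; 1 byte,  11 cycles
    push de          ; 1 byte,  11 cycles
                     ; total: 4 bytes, 42 cycles

; 16-bit load into hl via the frame pointer (offsets from the diff above)
    ld   l, -9 (ix)  ; 3 bytes, 19 cycles
    ld   h, -8 (ix)  ; 3 bytes, 19 cycles
                     ; total: 6 bytes, 38 cycles
```

The stack variant saves 2 bytes at a cost of 4 cycles and overwrites the scratch register, which is why it suits the default and size-optimized cases but not speed optimization.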
However, in [r12262] I made a change so that the access via frame pointer is used instead when optimizing for code speed.
Last edit: Maarten Brock 2021-05-13