#431 Less optimal loop code for z80 port

Milestone: None
Status: open
Owner: nobody
Labels: z80 port (30)
Priority: 5
Updated: 2015-02-03
Created: 2013-05-14
Creator: Falk
Private: No

I've encountered some very suboptimal code when using for loops with an 8-bit counter in the Z80 port.
A simple example to reproduce it is:

#include <stdint.h>
int main()
{
    uint_fast8_t i;
    for(i=0; i<16;++i)
    {  
    }

    for(i=16;i>0;--i)
    {
    }

    return 0;
}

The assembler output of this looks like:

        ld      a,#0x10
00105$:
        add     a,#0xFF
        or      a, a
        jr      NZ,00105$
;forLoopTest.c:12: for(i=16;i>0;--i)
        ld      a,#0x10
00107$:
        dec     a
        or      a, a
        jr      NZ,00107$

The "or a,a" after the add is definitely not required, because "add a,#0xFF" already sets the condition flags (see page 143 of the Z80-UM). The same goes for "dec a": a dec with an 8-bit operand also sets the condition flags (see page 165 of the Z80-UM), so the "or" is not required here either.
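To make the flag argument concrete, here is a minimal C sketch that models only the Z flag of the three instructions involved. It assumes the usual Z80 rule that an 8-bit "dec", "add" or "or" sets Z exactly when the 8-bit result is zero; the helper names are made up for illustration and are not part of SDCC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: Z is set when the 8-bit result is zero. */
static bool zero_flag(uint8_t result)
{
    return result == 0;
}

/* Z flag right after "dec a". */
static bool z_after_dec(uint8_t a)
{
    return zero_flag((uint8_t)(a - 1));
}

/* Z flag right after "add a,#0xFF" (8-bit wrap-around, same result as dec). */
static bool z_after_add_ff(uint8_t a)
{
    return zero_flag((uint8_t)(a + 0xFF));
}

/* Z flag after "dec a" followed by the extra "or a,a". */
static bool z_after_dec_then_or(uint8_t a)
{
    uint8_t r = (uint8_t)(a - 1);
    return zero_flag((uint8_t)(r | r)); /* or a,a leaves the value unchanged */
}

/* Returns true if the extra "or a,a" never changes Z for any 8-bit input. */
bool or_is_redundant_for_z(void)
{
    for (unsigned v = 0; v < 256; ++v) {
        uint8_t a = (uint8_t)v;
        if (z_after_dec(a) != z_after_dec_then_or(a))
            return false;
        if (z_after_add_ff(a) != z_after_dec_then_or(a))
            return false;
    }
    return true;
}
```

In this model or_is_redundant_for_z() returns true, which matches the claim that "jr NZ" can test the flags left by "dec a" or "add a,#0xFF" directly. The sketch says nothing about the other flags, which the loop does not use.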

You may ask why one should bother about a single instruction. Extend the example a little bit:

#include <stdint.h>
int main()
{

    uint_fast8_t i;
    uint_fast8_t j=0;
    for(i=0; i<16;++i)
    {
        j+=5; //Just do something here 
    }

    for(i=16;i>0;--i)
    {
        j-=3; //Just do something here
    }

    return j;
}

The asm output looks like this:

        ld      de,#0x0010
00105$:
;forLoopTest.c:9: j+=5; //Just do something here
        ld      a,d
        add     a, #0x05
        ld      d,a
        dec     e
;forLoopTest.c:7: for(i=0; i<16;++i)
        ld      a,e
        or      a, a
        jr      NZ,00105$
;forLoopTest.c:12: for(i=16;i>0;--i)
        ld      e,#0x10
00106$:
;forLoopTest.c:14: j-=3; //Just do something here
        ld      a,d
        add     a,#0xFD
        ld      d,a
;forLoopTest.c:12: for(i=16;i>0;--i)
        dec     e
        ld      a,e
        or      a, a
        jr      NZ,00106$

You can clearly see that the code generator runs into trouble because it has to save and restore the "a" register around the test. This generates a lot of overhead that would disappear if the unnecessary "or a,a" were not emitted.
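For a rough idea of the cost, the following C sketch adds up the T-states of the loop-control instructions only (not the loop body). The cycle counts are my own figures from the Z80 instruction timing tables, not from the compiler output above: 4 for "dec r", "ld r,r'" and "or r", and 12/7 for "jr NZ" taken/not taken. The function names are hypothetical.

```c
/* T-state counts, assumed from the Z80 instruction timing tables. */
enum {
    T_DEC_R        = 4,  /* dec e            */
    T_LD_R_R       = 4,  /* ld a,e           */
    T_OR_R         = 4,  /* or a,a           */
    T_JR_TAKEN     = 12, /* jr NZ, taken     */
    T_JR_NOT_TAKEN = 7   /* jr NZ, fall through */
};

/* Loop-control cost: the jump is taken on all but the last iteration. */
unsigned loop_control_tstates(unsigned iterations, unsigned before_jr)
{
    if (iterations == 0)
        return 0;
    return (iterations - 1) * (before_jr + T_JR_TAKEN)
         + before_jr + T_JR_NOT_TAKEN;
}

/* As generated: dec e / ld a,e / or a,a / jr NZ */
unsigned generated_overhead(unsigned iterations)
{
    return loop_control_tstates(iterations, T_DEC_R + T_LD_R_R + T_OR_R);
}

/* Without the redundant test: dec e / jr NZ */
unsigned expected_overhead(unsigned iterations)
{
    return loop_control_tstates(iterations, T_DEC_R);
}
```

For the 16 iterations of the example, generated_overhead(16) gives 379 T-states and expected_overhead(16) gives 251, i.e. 8 T-states saved per iteration. This still ignores the extra "ld a,d" / "ld d,a" pair in the loop body, so the real gain would probably be larger.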

Both examples above are compiled with:

sdcc -mz80 --std-sdcc99 forLoopTest.c

The SDCC version is the latest from SVN:

sdcc -v
SDCC : gbz80/z80/z180 3.3.0 #8610 (May 12 2013) (Linux)

Discussion

  • labels: --> z80 port

  • Ticket moved from /p/sdcc/bugs/2164/

    Can't be converted:

    • _category: Z80