Menu

[z80] peepholer is too fragile

alvin
2016-01-18
2016-01-22
  • alvin

    alvin - 2016-01-18

    I've been examining a lot of code generation recently and have been keeping track of some places where the peepholer is unable to apply rules where it should be able to. I've taken a closer look at the z80's peep.c and the problems seems to be coming from the peepholer's fragility.

    First, from z80MightRead(), I think these are bugs:

    if(!strcmp(pl->line, "ex\t(sp), hl") || !strcmp(pl->line, "ex\t(sp),hl"))
        return(!strchr(what, 'h') || !strchr(what, 'l'));
      if(!strcmp(pl->line, "ex\tde, hl") || !strcmp(pl->line, "ex\tde,hl"))
        return(!strchr(what, 'h') || !strchr(what, 'l') || !strchr(what, 'd') || !strchr(what, 'e'));
    

    The sense on the strchr tests are wrong? Shouldn't they be this:

    if(!strcmp(pl->line, "ex\t(sp), hl") || !strcmp(pl->line, "ex\t(sp),hl"))
        return(strchr(what, 'h') || strchr(what, 'l'));
      if(!strcmp(pl->line, "ex\tde, hl") || !strcmp(pl->line, "ex\tde,hl"))
        return(strchr(what, 'h') || strchr(what, 'l') || strchr(what, 'd') || strchr(what, 'e'));
    

    In quite a few places the testing against 'what' is done with a very exact strcmp() rather than strchr():

      if(!IS_GB && ISINST(pl->line, "ldir"))
          return(!strcmp(what, "b") || !strcmp(what, "c") || !strcmp(what, "d") || !strcmp(what, "e") || !strcmp(what, "h") || !strcmp(what, "l"));
    

    This is going to fail if, eg, what="bc". I'm assuming the 'what' can be a string containing more than one register and comes from peephole rules like "notUsed('bc')".

    A more accurate ldir would then be:

      if(!IS_GB && ISINST(pl->line, "ldir"))
          return(strchr(what, 'b') || strchr(what, 'c') || strchr(what, 'd') || strchr(what, 'e') || strchr(what, 'h') || strchr(what, 'l'));
    

    There are many cases like this in z80MightRead()

    In looking at cases for why the peepholer was not killing a dead load involving register 'A' I found problems in the "ld" case:

        ld  a,h
        jr  NZ,l_kn_open_00107$
        ld  hl,#0x0000
        ld  (_udata + 0x009f),hl
    l_kn_open_00107$:
        ld  hl,#_kn_open_ninode_1_266 + 1
        ld  a,(hl)
    

    I used to think the peepholer was not able to pursue branches recursively as I had seen this happening with several registers, not just 'A'. The initial "ld a,h" is dead of course because of the eventual "ld a,(hl)".

    The problem is in the "ld" case for z80MightRead():

      if(ISINST(pl->line, "ld\t"))
        {
          if(strstr(strchr(pl->line, ','), what) && strchr(pl->line, ',')[1] != '#' && !(strchr(pl->line, ',')[1] == '(' && strchr(pl->line, ',')[2] == '#') && !(strchr(pl->line, ',')[1] == '(' && strchr(pl->line, ',')[3] != ')' && strchr(pl->line, ',')[4] != ')'))
            return TRUE;
          if(*(strchr(pl->line, ',') - 1) == ')' && strstr(pl->line + 3, what) && (strchr(pl->line, '#') == 0 || strchr(pl->line, '#') > strchr(pl->line, ',')))
            return TRUE;
          return FALSE;
        }
    

    The first contained "if" is looking for registers after the comma (the last part of the conditional will fail for short label names containing register letters) and the second is looking for registers before the comma as in "(hl)", "(de)", etc

    For the case "ld (udata + 0x009f),hl" follow the second "if". The conditional finds a closing bracket to the left of comma, finds the letter 'A' in the label name and no '#" is found so returns, yes, register "A" is read. sdcc is generating code using labels without # so it's not an artifact of z88dk.

    Here's another linear case that fails to kill 'A':

        ld  a,l
        ld  (#_i_alloc_j_1_316),hl
        ld  -2 (ix),c
        ld  -1 (ix),b
    l_i_alloc_00125$:
        ld  hl,(#_i_alloc_j_1_316)
        add hl,hl
        ld  de,#_i_alloc_j_1_316 + 1
        add hl,hl
        add hl,hl
        add hl,hl
        add hl,hl
        add hl,hl
        ex  de,hl
        pop hl
        push    hl
        add hl,de
        ld  e,(hl)
        inc hl
        ld  d,(hl)
        dec hl
        ld  a,d
    

    However in this one I think it's the incorrect "ex de,hl" in z80MightRead().

      if(!strcmp(pl->line, "ex\tde, hl") || !strcmp(pl->line, "ex\tde,hl"))
        return(!strchr(what, 'h') || !strchr(what, 'l') || !strchr(what, 'd') || !strchr(what, 'e'));
    

    If you ask is "A" read, this returns true.

    Some more cases:

        ld  c,l
        bit 0,l
        jr  NZ,l_d_open_00102$
        ld  hl,#0x0006
        ld  (_udata + 0x000c),hl
        ld  hl,#0xFFFF
        jr  l_d_open_00103$
    l_d_open_00102$:
        ld  c,5 (ix)
    ...
    ...
    l_d_open_00103$:
        ld  sp, ix
        pop ix
        ret
    

    Failure to kill "c". This might be "ld (udata + 0x000c),hl" where "c" appears on the left of the load.

        ex  de,hl
    l_ch_link_00123$:
        ld  hl,#0x0020
        ld  (_udata + 0x0062),hl
        ld  c,-6 (ix)
        ld  b,-5 (ix)
        ld  (_udata + 0x0060),bc
        ld  hl,#(_udata + 0x006a)
        ld  (hl),#0x01
        xor a, a
        push    af
        inc sp
        ld  l,-4 (ix)
        ld  h,-3 (ix)
        push    hl
        call    _writei
    

    There is a rule that eliminates "ex de,hl" if notUsed('de') and notUsed('hl'). The first "ex de,hl" should be eliminated. "hl" is killed by the load immediately following and "de" should be killed by the function call. I am wondering if this is another 'ld' problem where the letter 'd' is found in "ld (udata + 0x0062),hl".

        ld  c,a
        sub a, #0xC6
        jr  NZ,l_baddev_00103$
        ld  a,b
        sub a, #0x31
        jr  NZ,l_baddev_00103$
        ld  a,#0x01
        jr  l_baddev_00104$
    l_baddev_00103$:
        xor a,a
    l_baddev_00104$:
        xor a, #0x01
        ld  l, a
        pop ix
        ret
    

    Here I am not sure why the load of 'C' is not killed.

     
    • Philipp Klaus Krause

      This is going to fail if, eg, what="bc". I'm assuming the 'what' can be a string containing more than one register and comes from peephole rules like "notUsed('bc')".

      No. The "notUsed('bc')" would result in two calls to z80MightRead, one with "b" for what, one with "c" for what. See the if in line 657 of peep.c

      Philipp

       
    • Philipp Klaus Krause

      Here I am not sure why the load of 'C' is not killed.

      Change the #if 0 to #if 1 in line 34 of peep.c to find out more about this and similar issues.

      Philipp

       
      • alvin

        alvin - 2016-01-18

        Cheers. I will try to apply some fixes later on and re-run the compiles to see if things sort out.

        This one:

            ld  c,a
            sub a, #0xC6
        

        May actually be in argCont() where it finds the 'C' in the constant.

         
        • Philipp Klaus Krause

          Can you check if the #0xc6 issue still appears for you in revision #9469?

          Philipp

           
    • Philipp Klaus Krause

      For the case "ld (udata + 0x009f),hl" follow the second "if". The conditional finds a closing bracket to the left of comma, finds the letter 'A' in the label name and no '#" is found so returns, yes, register "A" is read. sdcc is generating code using labels without # so it's not an artifact of z88dk.

      When I first implemented notUsed(), the priority was to get it working for the common cases quickly. It din't have to be exact as long as it always fails in a safe way (i.e. sometimes missing an optimization is ok, but never generating wrong code). Over time it was made more exact, to optimize more, but it is still far from perfect.

      Philipp

       
      • alvin

        alvin - 2016-01-18

        Yes I realize it's all a work in progress :) I'm poking at the peepholer quite a bit so I get see where issues come up that probably never came up before or were unimportant.

         

        Last edit: alvin 2016-01-18
  • alvin

    alvin - 2016-01-19

    I found three cases in the source code I was testing:

    Scanning ld ((_udata + 0x000c)), hl for d
    S4O_RD_OP
    
    Scanning ld bc, (#(_udata + 0x009f) + 0) for d
    S4O_RD_OP
    
    Scanning sub    a, #0x2E for e
    S4O_RD_OP
    

    The "ex de,hl" and "ex (sp),hl" can be corrected as above.

    For "ex de,hl" I'm wondering if it's possible just to swap de for hl in the "what" string and continue on?

     
    • Philipp Klaus Krause

      For "ex de,hl" I'm wondering if it's possible just to swap de for hl in the "what" string and continue on?

      No. That could work for straight-line code, but not in general.

      Philipp

       
  • alvin

    alvin - 2016-01-20

    I've got a patch that will fix "ex de,hl", "ex (sp),hl", the load issues and the hex constants being confused with registers, attached below.

    In output code I could not see any failed dead register elimination and I just got through looking at 10k lines of assembly that I could not see an obvious way to improve. The code was cleaner too.

    One test compile was on a lisp interpretter. Binary size shrank from 30623 bytes to 30384 bytes (this is with the z88dk rules).

    I'll be looking at "ex de,hl" and swapping the what register to get an exact answer tomorrow.

    These changes might be too much just before a release and I have to say the code could probably be improved -- this is a first attempt at it and incremental changes were made to accommodate "ld" so that spoiled the quality.

    I replaced argCont() with a new function argContPrec() (precise). The original argCont() is still needed because z80 instruction size computation needs the more relaxed matching in that function.

    argContPrec() splits the argument into two parts separated by the comma and looks for register references in each half separately. There is a parameter that indicates which side of the comma you want to look at because "ld" treats both sides differently. The return value is more than bool with non-zero values indicating what sort of register reference there was (like (bc) or bc). Again this is because the left side of "ld" only generates reads of registers if the left side is (bc) (de) (hl) (ix) or (iy).

    The patch is for sdcc/src/z80/peep.c

     

    Last edit: alvin 2016-01-23
    • Philipp Klaus Krause

      I applied variants of your ex, rrd, rld changes in revision #9468.

      Philipp

       
  • alvin

    alvin - 2016-01-20

    ok. I updated the patch to the version I'll foist on z88dk.

    I found one more bug:

    Scanning jp PO, l_bmap_00178$ for c
    S4O_ABORT
    

    This is caused by line 174 in findLabel(). The space between the comma and the label name is causing the label name to improperly include a leading space. The line should have a case for space in addition to comma and tab.

        if (*p == ',' || *p == '\t' || *p == ' ')
    

    I added cases for "exx" "ex (sp),iy" "ex (sp),ix" but this will only affect inline asm.

     
  • Philipp Klaus Krause

    Fixing the issue with condtional jumps in #9474 gave a nice improvement in code size. I also added the handling for ex, since ex (sp), iy will probably be used by code generation in the future.

    Philipp

     

Log in to post a comment.