Menu

#2600 Bug in how SDCCpeeph.c matches source lines and peephole rules

closed-fixed
None
other
5
2017-05-04
2017-04-19
alvin
No

In function matchLine() at line 1905 of SDCCpeeph.c there is minor error in how a line of source text is matched against a line of a peephole rule.

This portion of code:

      /* they should be an exact match other wise */
      if (*s && *d)
        {
          if (*s++ != *d++)
            return FALSE;
        }

    }

should only be running if the "destination is not a var". The previous "if" statement is taking care of the case of a "%" in the peephole rule. After taking care of that it exits the "if" and requires an exact match of the following character in the rule and source. This prevents rules having two successive % from being possible, for example:

replace restart {
    ld  c,l
    ld  b,h
    %1  %2,%3
} by {
    %1  %2,%3
    ld  c,l
    ld  b,h
    ; peephole z88dk-485c1
} if same(%1 'set' 'res' 'bit'), operandsNotRelated(%3 'bc')

In the above rule for z80 code, I am looking for any set,res,bit instruction so "%1 %2" appears in sequence. This rule is not currently matched by SDCCpeeph.c.

The fix is to make the exact match step part of the "else" part of the "if":

.....
          while (ISCHARSPACE (*s))
            s++;
          while (ISCHARSPACE (*d))
            d++;
        }
      else if (*s && *d)
        {
          /* they should be an exact match otherwise */

          if (*s++ != *d++)
              return FALSE;
        }
    }

  /* get rid of the trailing spaces
     in both source & destination */
.....

Discussion

  • Philipp Klaus Krause

    • assigned_to: Philipp Klaus Krause
     
  • Philipp Klaus Krause

    • status: open --> closed-fixed
     
  • Philipp Klaus Krause

    Fixed in revision [r9888].

    Philipp

     
  • alvin

    alvin - 2017-05-03

    Philip, the patch applied is flawed. I found recently compiled programs were 2k+ larger because some peephole rules were not being applied.

    The code that is in there now in function matchLine():

    static bool
    matchLine (char *s, const char *d, hTab ** vars)
    {
      if (!s || !(*s))
        return FALSE;
    
      /* skip leading white spaces */
      while (ISCHARSPACE (*s))
        s++;
      while (ISCHARSPACE (*d))
        d++;
    
      while (*s && *d)
        {
          /* if the destination is a var */
          if (*d == '%' && ISCHARDIGIT (*(d + 1)) && vars)
            {
              const char *v = hTabItemWithKey (*vars, keyForVar (d + 1));
              /* if the variable is already bound
                 then it MUST match with dest */
              if (v)
                {
                  while (*v)
                    if (*v++ != *s++)
                      return FALSE;
                }
              else
                /* variable not bound we need to bind it */
                bindVar (keyForVar (d + 1), &s, vars);
    
              /* in either case go past the variable */
              d++;
              while (ISCHARDIGIT (*d))
                d++;
            }
          else if (ISCHARSPACE (*s) && ISCHARSPACE (*d)) /* whitespace sequences match any whitespace sequences */
            {
              while (ISCHARSPACE (*s))
                s++;
              while (ISCHARSPACE (*d))
                d++;
            }
          else if (*s && *d) /* they should be an exact match otherwise */
            {
              if (*s++ != *d++)
                return FALSE;
            }
        }
    
      /* skip trailing whitespaces */
      if (*s)
        while (ISCHARSPACE (*s))
          s++;
    
      if (*d)
        while (ISCHARSPACE (*d))
          d++;
    
      /* after all this if only one of them
         has something left over then no match */
      if (*s || *d)
        return FALSE;
    
      return TRUE;
    }
    

    is not functionally equivalent to this, which is working for me and is based on your code and code previous to the patch:

    static bool
    matchLine (char *s, const char *d, hTab ** vars)
    {
      if (!s || !(*s))
        return FALSE;
    
      while (*s && *d)
        {
          /* skip white space in both */
          while (ISCHARSPACE(*s))
              s++;
          while (ISCHARSPACE(*d))
              d++;
    
          /* if the destination is a var */
          if (*d == '%' && ISCHARDIGIT (*(d + 1)) && vars)
            {
              const char *v = hTabItemWithKey (*vars, keyForVar (d + 1));
              /* if the variable is already bound
                 then it MUST match with dest */
              if (v)
                {
                  while (*v)
                    if (*v++ != *s++)
                      return FALSE;
                }
              else
                /* variable not bound we need to bind it */
                bindVar (keyForVar (d + 1), &s, vars);
    
              /* in either case go past the variable */
              d++;
              while (ISCHARDIGIT (*d))
                d++;
            }
          else if (*s && *d)
            {
              /* they should be an exact match otherwise */
    
              if (*s++ != *d++)
                  return FALSE;
            }
        }
    
      /* skip trailing whitespaces */
      if (*s)
        while (ISCHARSPACE (*s))
          s++;
    
      if (*d)
        while (ISCHARSPACE (*d))
          d++;
    
      /* after all this if only one of them
         has something left over then no match */
      if (*s || *d)
        return FALSE;
    
      return TRUE;
    }
    
     
    • Philipp Klaus Krause

      I didn't see any regression in the peephole rules that come with SDCC, Can you give an example peephole, where this bug is visible?

      Philipp

       
  • alvin

    alvin - 2017-05-03

    It was brought to my attention in this project: https://bitbucket.org/CmGonzalez/pietro_bros/overview
    The binary size increased from ~32700 bytes to ~34000 bytes. I could see that most peephole rules were not being applied.

    The enemy_coin1 function can be used as example. This is code generated by 9880:

    811   0000              ;   ---------------------------------
    812   0000              ; Function enemy_coin1
    813   0000              ; ---------------------------------
    814   0000              _enemy_coin1:
    815   0000  3A 00 00        ld  a,(_phase_left)
    816   0003  B7              or  a, a
    817   0004  28 2B           jr  Z,l_enemy_coin1_00107
    818   0006  CD 05 0E        call    _enemy_walk
    819   0009  2A 00 00        ld  hl,(_sprite)
    820   000C  26 00           ld  h,0x00
    821   000E  01 00 00        ld  bc,_lin
    822   0011  09              add hl,bc
    823   0012  7E              ld  a,(hl)
    824   0013  D6 98           sub a,0x98
    825   0015  20 1A           jr  NZ,l_enemy_coin1_00107
    826   0017  2A 00 00        ld  hl,(_sprite)
    827   001A  26 00           ld  h,0x00
    828   001C  01 00 00        ld  bc,_col
    829   001F  09              add hl,bc
    830   0020  7E              ld  a,(hl)
    831   0021  4F              ld  c,a
    832   0022  D6 03           sub a,0x03
    833   0024  28 05           jr  Z,l_enemy_coin1_00101
    834   0026  79              ld  a,c
    835   0027  D6 1B           sub a,0x1b
    836   0029  20 06           jr  NZ,l_enemy_coin1_00107
    837   002B              l_enemy_coin1_00101:
    838   002B  2A 00 00        ld  hl,(_sprite)
    839   002E  CD 00 00        call    _spr_destroy
    840   0031              l_enemy_coin1_00107:
    841   0031  C9              ret
    

    By 9889:

    811   0000              ;   ---------------------------------
    812   0000              ; Function enemy_coin1
    813   0000              ; ---------------------------------
    814   0000              _enemy_coin1:
    815   0000  3A 00 00        ld  a,(_phase_left)
    816   0003  B7              or  a, a
    817   0004  28 2E           jr  Z,l_enemy_coin1_00107
    818   0006  CD 99 0F        call    _enemy_walk
    819   0009  3E 00           ld  a,+((_lin) & 0xFF)
    820   000B  21 00 00        ld  hl,_sprite
    821   000E  86              add a, (hl)
    822   000F  4F              ld  c,a
    823   0010  3E 00           ld  a,+((_lin) / 256)
    824   0012  CE 00           adc a,0x00
    825   0014  47              ld  b,a
    826   0015  0A              ld  a,(bc)
    827   0016  D6 98           sub a,0x98
    828   0018  20 1A           jr  NZ,l_enemy_coin1_00107
    829   001A  3E 00           ld  a,+((_col) & 0xFF)
    830   001C  86              add a, (hl)
    831   001D  4F              ld  c,a
    832   001E  3E 00           ld  a,+((_col) / 256)
    833   0020  CE 00           adc a,0x00
    834   0022  47              ld  b,a
    835   0023  0A              ld  a,(bc)
    836   0024  4F              ld  c,a
    837   0025  D6 03           sub a,0x03
    838   0027  28 05           jr  Z,l_enemy_coin1_00101
    839   0029  79              ld  a,c
    840   002A  D6 1B           sub a,0x1b
    841   002C  20 06           jr  NZ,l_enemy_coin1_00107
    842   002E              l_enemy_coin1_00101:
    843   002E  2A 00 00        ld  hl,(_sprite)
    844   0031  CD 00 00        call    _spr_destroy
    845   0034              l_enemy_coin1_00107:
    846   0034  C9              ret
    

    With 1200+ bytes increase in size there are a lot of instances like this.

    I'll see if I can isolate this function and applicable rules.

     
    • Philipp Klaus Krause

      --fverbose-asm might be helpful (if the rules have comments, like the ones that come with SDCC do).

      Philipp

       
  • alvin

    alvin - 2017-05-03

    Yes that's how I found the rules.

    I think it's only going to let me attach one file. So here is test.c:

    #define GAME_LIN_FLOOR 152
    
    extern unsigned char phase_left;
    extern unsigned char lin[8];
    extern unsigned char col[8];
    extern unsigned char    sprite;
    
    extern void enemy_walk(void);
    extern void spr_destroy(unsigned char f_sprite) __z88dk_fastcall;
    
    void enemy_coin1(void){
        //COINS N SLIPICE
        if ( phase_left > 0 ) {
            enemy_walk();
            if ( ( lin[sprite] == GAME_LIN_FLOOR )  && ( col[sprite] == 3 || col[sprite] == 27) ) {
                spr_destroy(sprite);
            }
        }
    }
    

    and peep.def is attached.

    To reproduce:

    sdcc -v
    9897 MINGW

    First without the additional rules:

    sdcc -mz80 -S --reserve-regs-iy test.c

    It must be a reserve-regs-iy compile.

    Then with the rules:

    sdcc -mz80 -S --reserve-regs-iy test.c --peep-file peep.def

    I can see that this only changes the last store in the translated function. I'm trying to find an earler sdcc now to verify more rules are applied.

     
  • alvin

    alvin - 2017-05-03

    Yes, sdcc #9883 is still available at the sdcc snapshot page and I've just verified that it does apply the full set of rules.

    Edit: Or rather more of them. The output still doesn't match the 9880 at the original website. I'll check if I missed something in the peep.def file.

    Edit: No nothing is missing. I think the discrepancy is down to there not being any interplay between the sdcc rules and peep.def in the examples we have here. In z88dk the two rulesets are together in the same file so they both modify at the same time. You may be able to get the same result by appending these rules to the existing sdcc ones before the jp/jr rules.

     

    Last edit: alvin 2017-05-03
    • Philipp Klaus Krause

      I see the difference. Previously, sequences of white space could match against nothing. Now sequences of white-space only match against other sequences of white-space.

      This was actually an intentional change. After all, (on a hypothetic architecure with such instructions), we wouldn't want "add c b" to match "addc b".

      But the change breaks your peephole rules that do not have whitespace after the comma.
      I guess making ',' a special case, so that ',' matches ',' followed by whitespace should be an ok fix.

      Philipp

       
      • Philipp Klaus Krause

        The change in handling of whitespace after comma was implemented in [r9898].

        Philipp

         
  • Philipp Klaus Krause

    • status: closed-fixed --> open-fixed
     
  • Philipp Klaus Krause

    • status: open-fixed --> closed-fixed
     

Log in to post a comment.