Small Device C Compiler (SDCC) / Patches / #488 STM8 Peephole optimizer "argCont(...)" Revamp

Philipp Klaus Krause - 2025-03-04

I tried the patch. For me, the stm8 and stm8-large regression tests pass, and I see a small reduction in code size.

It looks like you intend argCont to assume a two-operand instruction, so that it checks whether reading the second argument of the line arg implies reading what. That is a bit different from the previous semantics ("Check if reading arg implies reading what"). IMO, such a change in semantics, if intentional, should be reflected in the function name and comment.

The strchr at the beginning looks like it could go all the way into a comment containing a (, thus skipping over the actual arguments.
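
To illustrate the concern, here is a minimal sketch (not part of the patch). The helper name find_before_comment is hypothetical; it assumes that ';' starts a comment and simply ignores any match that lies after it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the patch): return a pointer to the first
   occurrence of ch in line that lies before any ';' comment, or NULL. */
const char *find_before_comment (const char *line, int ch)
{
  const char *comment = strchr (line, ';');
  const char *hit = strchr (line, ch);

  if (!hit || (comment && hit > comment))
    return NULL;
  return hit;
}

/* Shows the difference between a plain strchr and the guarded search. */
void selfCheckComment (void)
{
  const char *line = "ld a, xl ; copy (low byte)";

  assert (strchr (line, '(') != NULL);              /* lands inside the comment */
  assert (find_before_comment (line, '(') == NULL); /* comment is ignored */
}
```

With a line such as "ld a, (x) ; load", the guarded search still finds the real parenthesis, because it comes before the ';'.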

- Mike Vand - 2025-03-05
  
  Thanks, I tried to clarify the new function with a better name and comments (I hope). I also wasn't aware that comments could show up here. See if anything else needs to be addressed (so I can update the patch accordingly):
  
  /* Check if instruction in the line reads "reg" ("x", "y" with their
   * byte-halfs xh:xl, yh:yl and "a") based on their location. */
  static bool readReg(const char *line, const char *reg)
  {
    char *tmp;
    char *c = strchr (line, ';'); // Comments

    if (!(tmp = strchr (line, '(')) || (c && tmp > c))
      tmp = strchr (line, ',');
    if (!tmp || (c && tmp > c))
      return false;

    line = tmp + strspn (tmp + 1, " \t") + 1; // Skip blank characters
    if (*line == reg[0] && line[strspn (line + 1, " \t") + 1] == ')')
      return true;

    if ((tmp = strchr (line, ',')) && (!c || tmp < c))
      line = tmp + strspn (tmp + 1, " \t") + 1;
    if (*line != reg[0])
      {
        if ((tmp = strchr (line, ',')) && (!c || tmp < c))
          {
            line = tmp + strspn (tmp + 1, " \t") + 1;
            if (*line != reg[0])
              return false;
          }
        else
          return false;
      }

    // Register's first letter has already matched
    if (!reg[1]) // Return true if "reg" is a single-letter register ("a", "x" ...)
      return true;
    line++;
    if (*line == reg[1]) // Opcodes reading low or high byte
      return true;
    line += strspn (line, " \t");
    return (!*line || *line == ')'); // Opcodes reading complete register
  }
  
  - Philipp Klaus Krause - 2025-03-06
    
    This looks better, but has limitations:
    * It won't check the first argument, if that argument is a register argument (e.g. add a, (x) or addw x, _var).
    
    I wonder if this would be simpler/cleaner if it gets split into two functions:
    * A new argCont that checks a single instruction argument for a possible read of what (this function would receive a pointer to the beginning of the argument it is supposed to check, and maybe one to the end)
    * A function like your readReg that finds the two arguments, then passes them to argCont.
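
The suggested split could be sketched roughly as follows. This is only an illustration, not the patch: the names argReads and instReads, the char-typed what parameter, and the tokenizing rules are all assumptions. The sketch checks every argument, including a destination such as a in ld a, (x), so it over-reports reads rather than under-reports them.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

static bool isIdChar (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Hypothetical single-argument check: does the argument in [beg, end)
   name register what ('a', 'x' or 'y') or, for x/y, one of its halves? */
bool argReads (const char *beg, const char *end, char what)
{
  while (beg < end && isblank ((unsigned char) *beg))
    beg++;
  if (beg < end && *beg == '#') /* Immediate operand: no register read */
    return false;
  for (const char *p = beg; p < end; p++)
    {
      const char *q = p + 1;
      if (*p != what || (p > beg && isIdChar (p[-1])))
        continue; /* Not the start of a token, e.g. the x in 0x10 or _x */
      if (what != 'a' && q < end && (*q == 'l' || *q == 'h'))
        q++; /* xl, xh, yl, yh */
      if (q == end || !isIdChar (*q))
        return true;
    }
  return false;
}

/* Hypothetical driver: skips the mnemonic, stops at a ';' comment, splits
   on commas outside parentheses and hands each argument to argReads. */
bool instReads (const char *line, char what)
{
  const char *end = strchr (line, ';');
  const char *p = line;
  const char *argBeg;
  int depth = 0;

  if (!end)
    end = line + strlen (line);
  while (p < end && isblank ((unsigned char) *p))
    p++;
  while (p < end && !isblank ((unsigned char) *p))
    p++; /* Skip the mnemonic */
  for (argBeg = p;; p++)
    {
      if (p == end || (*p == ',' && depth == 0))
        {
          if (argReads (argBeg, p, what))
            return true;
          if (p == end)
            return false;
          argBeg = p + 1;
        }
      else if (*p == '(')
        depth++;
      else if (*p == ')')
        depth--;
    }
}

/* The two examples named in the limitation above. */
void selfCheckSplit (void)
{
  assert (instReads ("add a, (x)", 'a'));
  assert (instReads ("addw x, _var", 'x'));
}
```

A caller that knows which operand positions are sources for a given instruction could pass only those arguments to argReads instead of scanning all of them.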
    
    Last edit: Philipp Klaus Krause 2025-03-06
    
    - Mike Vand - 2025-03-07
      
      Thanks. Based on your feedback, I've decided to focus the function on index registers only and give the "a" register its own independent lines of code (since this function is used in only two places inside stm8MightRead). For the "a" register, since we're only interested in the ld/ldf instructions, the code is much simpler, and I think we can inline it directly in the stm8MightRead function:
      
      if (ISINST (pl->line, "ld") || ISINST (pl->line, "ldf"))
        {
          // check to see if 'a' is used in 'source' positions
          char *s = strchr (pl->line, ',');
          char *c = strchr (pl->line, ';'); // Comments
          if (s && (!c || s < c))
            {
              char *s2 = strchr (s + 1, ','); // 2nd comma (if any)
              if (s2 && (!c || s2 < c))
                s = s2;
              s += strspn (s + 1, " \t") + 1; // Skip blank characters
              if (*s == 'a')
                return true;
            }
        }
      
      As for readReg function, now renamed readIdxReg, not much has changed:
      
      /* Check if instruction in the line reads index "reg" (i.e. "x" or "y" with
       * their byte-halfs xh:xl, yh:yl) based on their string location. */
      static bool readIdxReg(const char *line, const char *reg)
      {
        char *tmp;
        char *c = strchr (line, ';'); // Comments

        if (!(tmp = strchr (line, '(')) || (c && tmp > c))
          tmp = strchr (line, ',');
        if (!tmp || (c && tmp > c))
          return false;

        line = tmp + strspn (tmp + 1, " \t") + 1; // Skip blank characters
        if (*line == reg[0] && line[strspn (line + 1, " \t") + 1] == ')')
          return true;

        if ((tmp = strchr (line, ',')) && (!c || tmp < c))
          line = tmp + strspn (tmp + 1, " \t") + 1;
        if (*line != reg[0])
          {
            if ((tmp = strchr (line, ',')) && (!c || tmp < c))
              {
                line = tmp + strspn (tmp + 1, " \t") + 1;
                if (*line != reg[0])
                  return false;
              }
            else
              return false;
          }

        // Register's first letter has already matched
        if (!reg[1]) // Return true if "reg" is a single-letter register x or y.
          return true;
        line++;
        return (*line == reg[1] || (*line != 'l' && *line != 'h'));
      }
      
      But semantically it's much clearer, the code before its return is slightly simplified and, more importantly, it requires a much smaller mental map to navigate.
      What do you think?
      
      - Mike Vand - 2025-03-12
        
        I'm posting the latest diff file here for reference, to keep track of the changes throughout the discussion and to make a full review easier when needed. Thanks.
        
        peep.c.rev2.diff
        
        Philipp Klaus Krause - 2025-03-26
        
        Thanks. Sorry for taking so long to look through it.
        
        In readIdxReg, tmp and c should be of type const char *, not char * (using const char * was always the better style, and since ISO C23, strchr, etc. actually return const char * when their argument is const char *).
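
As a small illustration of that point, the sketch below keeps the result of strchr in a const char *, which is accepted both by older compilers and under the C23 rule. The helper name codeLength is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of characters in line before any ';' comment. */
size_t codeLength (const char *line)
{
  /* const char * matches what strchr yields for a const char * argument
     in C23, and is a safe conversion from plain char * before that. */
  const char *c = strchr (line, ';');

  return c ? (size_t) (c - line) : strlen (line);
}

void selfCheckConst (void)
{
  assert (codeLength ("nop") == 3);
}
```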
        
        Otherwise, readIdxReg looks like it should work, except for the exgw, rlwa, and rrwa instructions (I haven't tested it yet).
        
        Mike Vand - 2025-03-27
        
        Thanks!
        Very happy to see that you see some potential with this.
        
        Regarding handling exgw, rlwa and others like these, there are special if/then segments that properly handle them (especially the curious case of cpw that needs special handling + readIdxReg!). Because of their simple nature, I don't think readIdxReg should be asked to handle them.
        
        My question, though, is how far we should go to accommodate users' hand-crafted inline assembly and its irregular whitespace. The SDCC manual strongly recommends against using --peep-asm, so we're not obliged to. In readIdxReg I tried to accommodate irregular whitespace. But there are many pl->line[n] == extra checks in the code that assume single spaces. We can cover them with a macro like:
        
        #define ISCHARAFTERINST(l, len, c) ( (l)[(len) + strspn((l) + (len), " \t")] == (c) )
        
        and replace a line like
        
        if ((ISINST (pl->line, "div") || ISINST (pl->line, "mul")) && pl->line[4] == extra) return true;
        
        by
        
        if ((ISINST (pl->line, "div") || ISINST (pl->line, "mul")) && ISCHARAFTERINST(pl->line, 4, extra)) return true;
        
        But I'm not sure.
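
For illustration, the proposed macro can be exercised outside peep.c as below. The wrapper operandStartsWith and its length guard are assumptions added for this sketch; the macro itself is the one quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Macro as proposed above: compare the first non-blank character at or
   after offset len with c. */
#define ISCHARAFTERINST(l, len, c) ((l)[(len) + strspn ((l) + (len), " \t")] == (c))

/* Hypothetical wrapper so the check can be tried on plain strings. */
bool operandStartsWith (const char *line, size_t len, char c)
{
  /* Guard: do not index past the terminating NUL on a short line. */
  if (strlen (line) < len)
    return false;
  return ISCHARAFTERINST (line, len, c);
}

void selfCheckMacro (void)
{
  const char *padded = "mul    x, a";

  assert (padded[4] != 'x');                    /* fixed offset misses it */
  assert (operandStartsWith (padded, 4, 'x'));  /* macro skips the blanks */
}
```

On a line with a single space, such as "mul x, a", the macro gives the same answer as the fixed-offset check.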
        
        peep.c.rev3.diff
        
        Philipp Klaus Krause - 2025-03-27
        
        Regarding hand-written asm:
        It is fine to not fully handle it, as long as we always err on the safe side.
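
One way to read "err on the safe side" is sketched below, under the assumption that the safe answer for a might-read query is true (a reported read only blocks an optimization). The function names and the notion of regular spacing are hypothetical and not taken from peep.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical: true when the code part of line (before any ';') contains
   no tabs and no runs of two or more spaces. */
static bool hasRegularSpacing (const char *line)
{
  for (const char *p = line; *p && *p != ';'; p++)
    if (*p == '\t' || (p[0] == ' ' && p[1] == ' '))
      return false;
  return true;
}

/* Hypothetical conservative check: returns false only when the line is a
   regularly spaced mul/div whose first operand is provably not idx. */
bool mulDivMightReadIdx (const char *line, char idx)
{
  bool isMulDiv = !strncmp (line, "mul ", 4) || !strncmp (line, "div ", 4);

  if (!isMulDiv || !hasRegularSpacing (line))
    return true; /* Unknown instruction or layout: assume a read */
  return line[4] == idx; /* Fixed offset is trustworthy here */
}

void selfCheckSafe (void)
{
  assert (!mulDivMightReadIdx ("mul y, a", 'x')); /* provably no read of x */
  assert (mulDivMightReadIdx ("mul  y, a", 'x')); /* odd spacing: assume read */
}
```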
        