
#415 Candidates for Peephole Optimization on STM8

None
closed-fixed
Ben Shi
None
5
2014-11-22
2014-10-03
No

I am using Build 9082 of SDCC on the STM8 platform

My ring buffer function compiles, and I found two points where peephole optimization would make sense:

unsigned char rx_ring[256];
unsigned char rx_head;
unsigned char rx_tail;

unsigned char RXBuffer_ReadBytes()
{
  unsigned char temp;

  temp = rx_ring[rx_tail];
  rx_tail++;

  return temp;
}

compiles to:

;   rx_ringbuffer.c: 27: temp = rx_ring[rx_tail];
    ldw x, #_rx_ring+0
    ld  a, xl
    add a, _rx_tail+0
    ld  xl, a
    ld  a, xh
    adc a, #0x00
    ld  xh, a
    ld  a, (x)
    ld  xl, a
;   rx_ringbuffer.c: 28: rx_tail++;
    inc _rx_tail+0
;   rx_ringbuffer.c: 30: return temp;
    ld  a, xl
    ret

which could be optimized to something like this:

;   rx_ringbuffer.c: 27: temp = rx_ring[rx_tail];
    ldw x, #_rx_ring+0
    ld  a, xl
    add a, _rx_tail+0
    rlwa
    adc a, #0x00
    ld  xh, a
    ld  a, (x)
;   rx_ringbuffer.c: 28: rx_tail++;
    inc _rx_tail+0
;   rx_ringbuffer.c: 30: return temp;
    ret
2 Attachments

Discussion

  • Ben Shi

    Ben Shi - 2014-10-03
    • assigned_to: Ben Shi
    • Group: -->
     
  • Ben Shi

    Ben Shi - 2014-10-03

    The following peephole rule does not work.

    replace restart {
    ld xl, a
    ld a, xh
    } by {
    rlwa x, a
    } if notUsed('xh')

    I will try to find the reason, or see whether this can be fixed in the upstream code generator.

     
    • Philipp Klaus Krause

      If the rule does not work, you might want to check whether something is wrong with notUsed(). A problem in notUsed() might affect other rules as well.

      However, there might be cases in which the above peephole is problematic, since rlwa affects the flags, while ld between registers does not.

      Philipp
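
To make the register side of this concrete, here is a minimal sketch in C that models only A, XH and XL. The `Regs` struct and both function names are hypothetical, invented for illustration; they are not part of SDCC. It assumes `rlwa x` rotates A <- XH <- XL <- A, and it deliberately does not model the condition flags, which is the difference raised above.

```c
#include <stdint.h>

/* Hypothetical register model: accumulator plus the two halves of X. */
typedef struct {
    uint8_t a;
    uint8_t xh;
    uint8_t xl;
} Regs;

/* Models the original pair:  ld xl, a  ;  ld a, xh  */
Regs ld_pair(Regs r)
{
    r.xl = r.a;   /* ld xl, a */
    r.a  = r.xh;  /* ld a, xh */
    return r;     /* XH is left unchanged */
}

/* Models  rlwa x  (assumed semantics: A <- XH <- XL <- A). */
Regs rlwa_x(Regs r)
{
    Regs n;
    n.a  = r.xh;  /* A  receives the old XH */
    n.xh = r.xl;  /* XH receives the old XL */
    n.xl = r.a;   /* XL receives the old A  */
    return n;
}
```

In this model both sequences leave the same values in A and XL and differ only in XH, which is consistent with the rule being guarded by notUsed('xh'). Whether the flag difference matters depends on the instructions that follow.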

       
  • Ben Shi

    Ben Shi - 2014-10-03

    I suggest changing

    unsigned char rx_head;
    unsigned char rx_tail;

    to

    unsigned int rx_head;
    unsigned int rx_tail;

    then more efficient code is generated.
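
One thing to watch with this change: with an unsigned char index, rx_tail wraps from 255 back to 0 on its own, which matches the 256-byte buffer. With unsigned int the wrap has to be done explicitly. A minimal sketch, assuming the buffer size stays a power of two (the `RX_RING_SIZE` macro is a name introduced here for illustration, and whether SDCC still generates shorter code with the added mask has not been checked):

```c
#define RX_RING_SIZE 256u   /* hypothetical name; must be a power of two */

unsigned char rx_ring[RX_RING_SIZE];
unsigned int rx_tail;       /* widened from unsigned char as suggested */

unsigned char RXBuffer_ReadBytes(void)
{
    unsigned char temp;

    temp = rx_ring[rx_tail];
    /* Explicit wrap-around: an unsigned int no longer overflows at 256. */
    rx_tail = (rx_tail + 1u) & (RX_RING_SIZE - 1u);

    return temp;
}
```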

     
    • Georg Ottinger

      Georg Ottinger - 2014-10-03

      I am not quite sure whether the RLWA operation expects operands.

      • Philipp Klaus Krause

        rlwa has only one argument. But that would not stop the peephole optimizer from emitting an rlwa with two operands (later stages should emit an error though).

        Philipp

         
  • Ben Shi

    Ben Shi - 2014-10-04
    1. This rule is not correct, because rlwa affects the flags.
    2. The reason why notUsed("xh") does not work needs to be addressed.
    3. This feature request can be implemented in the code generator.
     
  • Maarten Brock

    Maarten Brock - 2014-10-04

    Does the generated code improve if you post-increment rx_tail inside the array access?
    temp = rx_ring[rx_tail++];

    aopGet() should be able to deal with the post-increment and temp should become return-use-only and not go through xl.
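
For reference, the variant being asked about would look roughly like this. It is only a sketch of the source-level change; it makes no claim about the code SDCC generates for it.

```c
unsigned char rx_ring[256];
unsigned char rx_tail;

unsigned char RXBuffer_ReadBytes(void)
{
    /* Read the element and post-increment the index in one expression;
       the unsigned char index still wraps from 255 to 0 by itself. */
    return rx_ring[rx_tail++];
}
```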

     
  • Ben Shi

    Ben Shi - 2014-10-04

    That might not help, since

    ld  a, xl
    add a, _rx_tail+0
    ld  xl, a
    ld  a, xh
    adc a, #0x00
    ld  xh, a
    

    are generated within one IR instruction: rx_ring (16-bit pointer) + rx_tail (8-bit offset), not by the post-increment of rx_tail.
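
What the six instructions above compute can be sketched in C as a byte-wise addition with carry. The function name `add_offset_bytewise` is hypothetical and exists only to illustrate the arithmetic; it is not SDCC code.

```c
#include <stdint.h>

/* 16-bit base + 8-bit offset, done one byte at a time as on the STM8. */
uint16_t add_offset_bytewise(uint16_t base, uint8_t offset)
{
    uint8_t lo = (uint8_t)(base & 0xFFu);   /* ld  a, xl            */
    uint8_t hi = (uint8_t)(base >> 8);
    uint16_t sum = (uint16_t)(lo + offset); /* add a, _rx_tail+0    */
    uint8_t carry = (uint8_t)(sum >> 8);    /* carry out of low add */

    lo = (uint8_t)sum;                      /* ld  xl, a            */
    hi = (uint8_t)(hi + carry);             /* ld a, xh ; adc a, #0 */
    return (uint16_t)(((uint16_t)hi << 8) | lo); /* ld xh, a        */
}
```

The result equals the plain 16-bit sum `base + offset` (modulo 65536), which is why the whole sequence belongs to the address computation rather than to the increment of rx_tail.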

     
  • Ben Shi

    Ben Shi - 2014-11-22

    Implemented in revision 9113 by a peephole rule, not in the upstream code generator.

     
  • Ben Shi

    Ben Shi - 2014-11-22
    • status: open --> closed-fixed
     
