Small Device C Compiler (SDCC) / Feature Requests / #902 optimized rotation

#902 optimized rotation

Status: open

Owner: nobody

Labels: 8051 (8)

Priority: 5

Updated: 2024-03-14

Created: 2024-03-13

Creator: Konstantin Kim

Private: No

An interesting approach was found here: https://lab.whitequark.org/notes/2020-04-06/synthesizing-optimal-8051-code/

An SMT solver (Racket + Rosette) was used to find the optimal instructions sequence.
I believe this approach could potentially be used to generate new peephole rules and verify existing ones.

Discussion

Philipp Klaus Krause - 2024-03-13

I guess so. Another approach with similar benefits and drawbacks is the generation of peephole optimizer rules via a superoptimizer.

Personally, I still intend to look into major code generation and register allocation improvements for mcs51 later this year, so I'm not too motivated to look into the peephole optimizer rules (changes in register allocation and code generation have a tendency to make existing peephole optimizer rules obsolete).

P.S.: Looking at the examples (8- and 16-bit rotates) that the approach was tried on: No surprise for 8-bit, the solver found the "obvious" approaches anyone would have used (and proved them optimal). But some of the 16-bit cases are interesting can be good inspiration for rotation codegen improvements.

Last edit: Philipp Klaus Krause 2024-03-13

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Konstantin Kim - 2024-03-14
  
  I would like to draw attention to a homogeneous approach that utilizes ready-to-use solid tools, potentially allowing decentralization of the work.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maarten Brock - 2024-03-14

Is this just a tip about an approach you found somewhere or do you actually see generated code that can be improved? If the latter, can you give an example?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Konstantin Kim - 2024-03-14
  
  It appears to be both.
  
  The article result shows better code for a 16-bit circular constant shift and offers a straightforward idea for improving the peephole rules.
  
  The 8-bit rotations seem to match the actual SDCC outputs. However, the 16-bit ones are just brilliant!
  
  My favorite is 3 bit shift
  
  unsigned short RR1(unsigned short v) { return ((v) >> 3) | (v << (16-3)); }
  
  sdcc gives (-mmcs51 --opt-code-size )
  
  mov r6, dpl mov r7, dph mov ar4,r6 mov a,r7 swap a rl a xch a,r4 swap a rl a anl a,#0x1f xrl a,r4 xch a,r4 anl a,#0x1f xch a,r4 xrl a,r4 xch a,r4 mov r5,a mov a,r6 swap a rl a anl a,#0xe0 mov r7,a clr a orl a,r4 mov dpl,a mov a,r7 orl a,r5 mov dph,a
  
  For R1:R0 pair, the article suggests the following code:
  
  mov a, r0 xrl a, r1 mov r1, a anl a, #0xf8 xrl a, r0 swap a rl a xch a, r1 swap a rl a xrl a, r1 mov r0, a
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Log in to post a comment.