## #320 Optimise certain bitwise operations.

Status: closed
Priority: 5
Updated: 2011-07-27
Created: 2011-05-06
Private: No

The following code:

======== set_res.c ========
#define BIT7 0x80
unsigned char bitfield;

void
set_bit(void)
{
    bitfield |= BIT7;
}

void
res_bit(void)
{
    bitfield &= ~BIT7;
}
======== end set_res.c ========

when compiled with "sdcc -mz80 -c set_res.c" produces the following assembler (skipping the pre/post-amble):

;set_res.c:6: set_bit(void)
; ---------------------------------
; Function set_bit
; ---------------------------------
_set_bit_start::
_set_bit:
;set_res.c:8: bitfield |= BIT7;
	ld iy,#_bitfield
	ld a,0 (iy)
	set 7, a
	ld 0 (iy),a
	ret
_set_bit_end::
;set_res.c:12: res_bit(void)
; ---------------------------------
; Function res_bit
; ---------------------------------
_res_bit_start::
_res_bit:
;set_res.c:14: bitfield &= ~BIT7;
	ld iy,#_bitfield
	ld a,0 (iy)
	and a,#0x7F
	ld 0 (iy),a
	ret
_res_bit_end::
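
The listing turns `|= BIT7` into `set 7, a` but still handles `&= ~BIT7` with `and a,#0x7F`. Both forms leave the same value in the register; they differ only in that `and` updates the flags while `set`/`res` do not. A minimal C sketch that checks the value equivalence for every byte follows. The helpers `set_bit_n`, `res_bit_n` and `masks_match_bit_ops` are hypothetical models written for this note, not SDCC code.

```c
#include <assert.h>
#include <stdint.h>

#define BIT7 0x80

/* Hypothetical model of the Z80 "set n, r" instruction:
   sets exactly one bit and leaves the other seven alone. */
uint8_t set_bit_n(uint8_t value, unsigned n)
{
    assert(n < 8);
    return (uint8_t)(value | (1u << n));
}

/* Hypothetical model of the Z80 "res n, r" instruction:
   clears exactly one bit and leaves the other seven alone. */
uint8_t res_bit_n(uint8_t value, unsigned n)
{
    assert(n < 8);
    return (uint8_t)(value & ~(1u << n));
}

/* Returns 1 if the mask forms from set_res.c give the same result as
   the single-bit forms for all 256 byte values, 0 otherwise. */
int masks_match_bit_ops(void)
{
    unsigned v;
    for (v = 0; v < 256; v++) {
        uint8_t b = (uint8_t)v;
        if ((uint8_t)(b | BIT7) != set_bit_n(b, 7)) return 0;
        if ((uint8_t)(b & ~BIT7) != res_bit_n(b, 7)) return 0;
        if ((uint8_t)(b & 0x7F) != res_bit_n(b, 7)) return 0;
    }
    return 1;
}
```

Calling `masks_match_bit_ops()` from a test harness should return 1, which is all the substitution of `res 7` for the `and` needs as far as the stored value is concerned.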

There are multiple things I personally think could be done better here (although I accept that registers already in use in more complex examples could get in the way):
1) It's great that something spotted that "|= 0x80" can be optimised to a set operation. However, the load/modify/store could be further optimised to an atomic "set 7, 0(iy)", and even modified to use hl (if available) to further save bytes and execution time.
2) The res_bit function could also be modified to make use of the res instruction. Then, the same as above applies with regard to the register selection. My preference would be hl as it uses fewer bytes, but a direct res 7, 0(iy) could be used as well.
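
Why a single `set 7, 0(iy)` matters for interrupts can be shown with a small simulation. This is only a sketch in plain C, assuming the usual Z80 behaviour that a maskable interrupt is accepted between instructions and never in the middle of one. The handler `isr` and both model functions are hypothetical names invented for this illustration.

```c
#include <stdint.h>

uint8_t bitfield;

/* Hypothetical interrupt handler that sets bit 0 of the shared byte. */
void isr(void)
{
    bitfield |= 0x01;
}

/* Models "ld a,0 (iy) / set 7, a / ld 0 (iy),a" with the interrupt
   arriving after the load but before the store. */
uint8_t three_step_with_interrupt(uint8_t initial)
{
    uint8_t a;
    bitfield = initial;
    a = bitfield;            /* ld a,0 (iy)                          */
    isr();                   /* interrupt accepted between instructions */
    a |= 0x80;               /* set 7, a                             */
    bitfield = a;            /* ld 0 (iy),a: overwrites isr's change */
    return bitfield;
}

/* Models "set 7, 0 (iy)": the interrupt can only run before or after
   the read-modify-write, never inside it. */
uint8_t single_step_with_interrupt(uint8_t initial, int interrupt_first)
{
    bitfield = initial;
    if (interrupt_first) isr();
    bitfield |= 0x80;        /* set 7, 0 (iy) as one indivisible step */
    if (!interrupt_first) isr();
    return bitfield;
}
```

Starting from 0x00, the three-step model ends with 0x80 (the handler's bit 0 is lost), while the single-step model ends with 0x81 whichever side the interrupt lands on.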

I suspect this should be done at the code generation / register selection stage rather than trying to do something with the peephole optimiser.

Using "set 7, 0(iy)" reduces the example code from 12 bytes to 8 (not counting the ret), and using "set 7, (hl)" reduces it to just 5 bytes.
Taking this approach would also make such code safe to be shared by interrupt handlers, although only for the single bit case.
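
The byte counts quoted above can be tallied from the standard Z80 instruction encodings. The sketch below does that arithmetic in C; the enum and function names are made up for this note, and the trailing `ret` (1 byte in every variant) is left out, as it is in the figures above.

```c
/* Encoded length in bytes of each instruction used in the example. */
enum {
    LD_IY_NN  = 4,  /* FD 21 lo hi */
    LD_A_IYD  = 3,  /* FD 7E d     */
    SET_7_A   = 2,  /* CB FF       */
    LD_IYD_A  = 3,  /* FD 77 d     */
    SET_7_IYD = 4,  /* FD CB d FE  */
    LD_HL_NN  = 3,  /* 21 lo hi    */
    SET_7_HL  = 2   /* CB FE       */
};

/* ld iy,#_bitfield / ld a,0 (iy) / set 7, a / ld 0 (iy),a */
unsigned size_current(void)
{
    return LD_IY_NN + LD_A_IYD + SET_7_A + LD_IYD_A;
}

/* ld iy,#_bitfield / set 7, 0 (iy) */
unsigned size_direct_iy(void)
{
    return LD_IY_NN + SET_7_IYD;
}

/* ld hl,#_bitfield / set 7, (hl) */
unsigned size_direct_hl(void)
{
    return LD_HL_NN + SET_7_HL;
}
```

The three functions return 12, 8 and 5 respectively. The `res` forms have the same lengths as the corresponding `set` forms, so the same totals apply to `res_bit`.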

The above output was produced using SDCC : z80 3.0.2 #6484 (May 6 2011) (Solaris i386)

## Discussion

• assigned_to: nobody --> spth

• The optralloc branch has already been generating the res you suggested. As for optimizing into
res 7, 0 (iy)
I've now added a peephole in the optralloc branch to do that (and one for the set).
Out of laziness I made it a peephole for now instead of doing it earlier.
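
As a rough picture of what such a peephole does, here is a toy version in C that works on instruction strings. It is not SDCC's peephole rule format or implementation, only an illustration; `fused_form` and `peephole_bit7` are hypothetical names, and the sketch simply assumes that register a is not needed after the sequence, which a real rule has to verify.

```c
#include <assert.h>
#include <string.h>

/* Map the middle instruction of the three-line pattern to its fused
   form, or NULL if the toy rule does not handle it. */
const char *fused_form(const char *middle)
{
    if (strcmp(middle, "set 7, a") == 0) return "set 7, 0 (iy)";
    if (strcmp(middle, "res 7, a") == 0) return "res 7, 0 (iy)";
    return NULL;
}

/* Rewrites every occurrence of
       ld a,0 (iy) / <set|res> 7, a / ld 0 (iy),a
   into the single fused instruction, compacting the array in place.
   Returns the new number of lines.
   Assumption: a is dead after the matched sequence. */
size_t peephole_bit7(const char *lines[], size_t count)
{
    size_t in = 0, out = 0;
    assert(lines != NULL);
    while (in < count) {
        const char *fused = NULL;
        if (count - in >= 3 &&
            strcmp(lines[in], "ld a,0 (iy)") == 0 &&
            strcmp(lines[in + 2], "ld 0 (iy),a") == 0) {
            fused = fused_form(lines[in + 1]);
        }
        if (fused != NULL) {
            lines[out++] = fused;      /* three lines become one */
            in += 3;
        } else {
            lines[out++] = lines[in++]; /* copy through unchanged */
        }
    }
    return out;
}
```

Fed the five instructions of `set_bit` from the report, this returns 3 lines: the `ld iy,#_bitfield`, a `set 7, 0 (iy)`, and the `ret`.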

I'll have a look at the iy vs. hl thing later, probably will make a peephole out of that one, too. Unfortunately the iy vs. hl issue can't be solved very well in the code generator, since we don't know if we might be able to reuse the value in iy at some later time.

The clean solution to this mess would be to make #_bitfield a rematerializable variable. Then the optimal register allocator in the optralloc branch would automatically generate the optimal code using hl, without needing any peepholes (or use iy, if it decides that's better, e.g. when the value will be needed again later on and we could profit more by using hl for something else in between). However, implementing that will be more work.
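
The hl vs. iy trade-off can be made concrete with a deliberately crude byte-count model. This is only a sketch under stated assumptions: it counts code size alone, ignores cycle counts and everything else the allocator weighs, and the three function names are invented for this note. Here n is the number of single-bit set/res operations on the same address.

```c
/* Address kept in iy throughout:
   one ld iy,#addr (4 bytes) plus n indexed bit operations (4 bytes each). */
unsigned cost_iy_kept(unsigned n)
{
    return 4u + 4u * n;
}

/* Address kept in hl throughout (hl not needed for anything else):
   one ld hl,#addr (3 bytes) plus n bit operations on (hl) (2 bytes each). */
unsigned cost_hl_kept(unsigned n)
{
    return 3u + 2u * n;
}

/* hl is reused for other work between the operations, so the address
   is reloaded before each one: n * (3 + 2) bytes. */
unsigned cost_hl_reloaded(unsigned n)
{
    return 5u * n;
}
```

In this model hl is the smaller choice whenever it can hold the address undisturbed (5 bytes against 8 for a single operation). If hl has to be reloaded every time, the two break even at four operations (20 bytes each), and iy comes out ahead from five onwards (24 against 25), which is the kind of situation where keeping the value in iy could pay off.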

I'll leave this open until the optralloc branch is merged.

Philipp

• In sdcc 3.0.4 #6686 (probably since the optralloc merge) set and res are used as you suggested.

The use of set and res is already there in code generation; however, a peephole is used to make them operate directly on 0(iy). I might later add another peephole that further transforms it to use (hl) instead (if I encounter this sequence in production code).

Philipp

• status: open --> closed
