The _divulong function uses a bool variable, but this is not optimal:
- This forces the use of the "T" reg as soon as a 32-bit division is used on
mcs51, thus potentially reducing the stack size by up to 24 bytes.
- This forces the compiler to use a bit, which produces suboptimal code.
- This does not reduce stack usage, as the compiler can allocate the bool
variable to a register.
The following patch uses an unsigned char for the boolean variable. For the
large stack-auto code model, the generated code changes as follows:
;x Allocated to stack - _bp +1
;reste Allocated to registers r2 r3 r6 r7
;count Allocated to registers r5
-;c Allocated to registers b0
+;c Allocated to registers r4
;------------------------------------------------------------
; _divulong.c:335: _divulong (unsigned long x, unsigned long y)
; -----------------------------------------
@@ -148,8 +134,6 @@
rl a
anl a,#0x01
mov r4,a
- add a,#0xff
- mov b0,c
; _divulong.c:345: x <<= 1;
mov r0,_bp
inc r0
@@ -182,7 +166,8 @@
rlc a
mov r7,a
; _divulong.c:347: if (c)
- jnb b0,00102$
+ mov a,r4
+ jz 00102$
; _divulong.c:348: reste |= 1L;
orl ar2,#0x01
00102$:
On the cc253x SoC this leads to a reduction of 4 bytes in code size and 5 cycles per
loop iteration, thus saving 160 cycles (5 cycles x 32 iterations) on a 32-bit division.
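
For readers without the library source at hand, the loop touched by this patch can be sketched in portable C. This is only a minimal sketch, assuming the usual shift-and-subtract (restoring) division. The names x, reste, count and c and the statements at source lines 345-348 come from the listing above. The function name divulong_sketch, the uint32_t types and the rest of the loop body are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical name: a portable sketch, not the SDCC library source. */
uint32_t divulong_sketch(uint32_t x, uint32_t y)
{
    uint32_t reste = 0;        /* running remainder */
    unsigned char count = 32;  /* assumed: one iteration per dividend bit */
    unsigned char c;           /* bit shifted out of x, held in a byte rather than a bit */

    do {
        c = (unsigned char)((x >> 31) & 1u);  /* top bit of x, always 0 or 1 */
        x <<= 1;
        reste <<= 1;
        if (c)                 /* the test that became "mov a,r4 / jz" in the diff */
            reste |= 1u;
        if (reste >= y) {      /* assumed restoring step: subtract, set quotient bit */
            reste -= y;
            x |= 1u;
        }
    } while (--count);

    return x;                  /* x now holds the quotient */
}
```

Because c is only ever assigned 0 or 1, a byte and a bit behave identically in this loop. The difference lies only in the code the compiler emits for the store and the test. The sketch does not guard against y == 0.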
I don't like having a special case for such a small gain. IMO, the right thing to do would be to just implement _Bool for mcs51 (instead of the non-compliant fake using __bit we have now).
Philipp
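
For context on what a compliant _Bool involves, standard C requires that converting any nonzero scalar to _Bool yields 1, whereas converting to unsigned char keeps only the low-order bits. The snippet below is a small sketch of that difference in standard C, assuming 8-bit chars. The helper names as_byte and as_bool are hypothetical, and nothing here describes how SDCC's __bit-based type behaves.

```c
#include <stdbool.h>

/* Hypothetical helpers, for illustration only. */

/* Conversion to unsigned char keeps the low 8 bits (assuming CHAR_BIT == 8). */
unsigned char as_byte(unsigned int v)
{
    return (unsigned char)v;
}

/* Conversion to _Bool yields 1 for any nonzero value, 0 otherwise. */
bool as_bool(unsigned int v)
{
    return (bool)v;
}
```

With these helpers, as_bool(0x100u) is 1 while as_byte(0x100u) is 0, so a byte is a safe stand-in for a boolean only when the stored value is already known to be 0 or 1.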
A full implementation of _Bool for mcs51 would make no difference IMO, because it should still map booleans to bitspace whenever possible. In general this should be more efficient. Also, most of the time there are ISRs that use alternate register banks and some global variables that easily fill the 24 bytes.
What remains is that the current implementation does not generate optimal code.
The current version of SDCC 3.5.5 #9397 no longer generates the inefficient
But instead gives
This makes the current implementation with bool 1 byte more efficient than the unsigned char in this patch.