Using information on which registers are preserved by a called function, SDCC could generate more efficient code by not saving those registers around the call. This would also automatically affect register allocation choices when communicated via the cost function.
The information would be obtained in two ways:
1) From the user using a new keyword. This would mostly be useful for functions implemented in assembler.
2) By automatic analysis of the generated asm code. The information then would be available at subsequent calls to the function in the same compilation unit.
Due to the nature of this optimization, it would probably be most useful for the z80 and z180, because of their relatively large number of registers and their CISC instructions (the latter resulting in asm-implemented standard library functions).
Philipp
For the mcs51, some work in the direction of way 2) has been done in
sdcc/src/mcs51/rtrack.c
which does value tracking (of literals) in registers by parsing the asm code while it is being generated. It discards info as soon as it sees a label though.
Here's a first implementation of the frontend part for 1)
Philipp
In revision [r9470], I implemented a first version. It is incomplete, but already useful.
What should work: For the z80-related ports, the information on registers b, c, d, e in declarations is taken into account when saving registers for function calls. The information is used by the register allocator, i.e. it will prefer to put variables that are live across function calls into preserved registers.
Philipp
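A minimal sketch of how such a declaration might look, not the attachment from this thread. It assumes the keyword is spelled `__preserves_regs(...)` and follows the parameter list, and that SDCC defines `__SDCC`. The names `asm_strlen`, `total_length` and `PRESERVES_BCDE` are hypothetical. The routine is assumed to be written in assembler, so the C definition is only a host-side stand-in for trying the code off-target.

```c
#include <stddef.h>

/* Assumption: SDCC defines __SDCC and accepts __preserves_regs(...) after
   the parameter list. On other compilers the macro expands to nothing. */
#ifdef __SDCC
#define PRESERVES_BCDE __preserves_regs(b, c, d, e)
#else
#define PRESERVES_BCDE
#endif

/* Hypothetical routine, assumed to be implemented in assembler so that it
   really leaves b, c, d and e untouched. */
size_t asm_strlen(const char *s) PRESERVES_BCDE;

#ifndef __SDCC
/* Host-only stand-in so the sketch compiles and runs off-target. */
size_t asm_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
#endif

/* 'total' and 'b' are live across a call. With the declaration above, the
   register allocator may keep such values in b, c, d or e and skip the
   push/pop pair around the call. */
size_t total_length(const char *a, const char *b)
{
    size_t total = asm_strlen(a);
    total += asm_strlen(b);
    return total;
}
```

Comparing the output of `sdcc -mz80 -S` with and without the attribute should show whether the saves around the calls disappear.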
Attachment: Usage example that also shows the interaction with the register allocator.
Last edit: Maarten Brock 2016-01-22
Warning: Support on the peephole optimizer side is still missing, so code generated with the peephole optimizer active is likely to be wrong.
Philipp
Thanks for this feature! :)
Anyway I don't get why the peephole optimizer would break the generated code... isn't it supposed to eventually change asm code with different asm code that does the same thing?
notUsed() would previously assume that no registers other than ix are preserved across function calls, and thus the peephole optimizer would assume all writes to registers before the function call are dead unless the registers are used for register parameters.
Philipp
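To make the problem concrete, here is a simplified model of the two assumptions. It is not the code in peep.c. The register masks and both function names are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical bit masks for a few z80 registers. */
enum {
    REG_B = 1 << 0, REG_C = 1 << 1, REG_D = 1 << 2,
    REG_E = 1 << 3, REG_H = 1 << 4, REG_L = 1 << 5
};

/* Old assumption: the callee clobbers every register, so a value written
   before the call is dead unless it is passed as a register parameter. */
bool write_is_dead_old(unsigned reg, unsigned param_regs)
{
    return (reg & param_regs) == 0;
}

/* With preserved-register information: the value also survives the call
   if the callee preserves the register, so the write is still needed when
   the register is read after the call. */
bool write_is_dead_new(unsigned reg, unsigned param_regs,
                       unsigned preserved_regs, unsigned read_after_call)
{
    if (reg & param_regs)
        return false;                       /* consumed by the callee */
    if ((reg & preserved_regs) && (reg & read_after_call))
        return false;                       /* survives and is read later */
    return true;
}
```

In this model, a load into c before a call that preserves c, followed by a read of c after the call, is dead under the old assumption and live under the new one. Removing it under the old assumption is exactly the kind of wrong code the warning above is about.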
Oh, wait, I realized what you mean (the peephole could replace registers...)
In revision [r9471] I fixed the handling in the peephole optimizer. It currently is overly conservative, resulting in code size regressions for calls through function pointers. But I also added preserved register information for three standard-library functions, and in the regression tests the code size gains from that are far bigger than what we lose on function pointers.
This feature should now be safe to use. But there is more to implement to unlock further benefits.
Philipp
Last edit: Maarten Brock 2016-01-22
Thanks, I'll start playing with it ASAP (mmm... what happened to snapshot builds? :| )
Philipp,
Can you explain how it is conservative for function pointers? I don't see any exceptions for it in your commits. And a function pointer is called with a 'call' instruction just like any other function, isn't it?
Further, I think the frontend should check this keyword (probably in compareFuncType() in SDCCsymt.c) on function pointer assignment. At least one way. You may assign a preserving function to a non-preserving function pointer, but not the other way around.
Maarten
Function pointers are called via call (hl) or call (iy). Since (hl) and (iy) are not valid identifiers, findSym in line 447 of peep.c will return 0, and we get to the conservative fallback in line 472.
I agree on checking function pointer assignments. The assignment x = y should be ok iff the set of preserved registers for x is a subset of the set of preserved registers for y.
Philipp
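The subset rule can be sketched with bit masks. This only illustrates the rule as stated above, it is not the check in SDCCsymt.c, and the masks and the name `assignment_ok` are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical bit masks for the registers a declaration can list. */
enum {
    REG_B = 1 << 0, REG_C = 1 << 1,
    REG_D = 1 << 2, REG_E = 1 << 3
};

/* x = y is acceptable iff every register that the pointer type x promises
   to preserve is also preserved by the function y. */
bool assignment_ok(unsigned x_preserved, unsigned y_preserved)
{
    return (x_preserved & ~y_preserved) == 0;
}
```

So assigning a function that preserves b, c, d and e to a pointer that promises nothing is fine, while assigning a function that preserves only b and c to a pointer that promises b, c and d is rejected.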
In revision #9475, the estimate got a bit more exact, and thus less conservative. Some information on preserved registers from code generation is passed to the peephole optimizer. This helps with the code size regression on calls through function pointers.
Philipp
The check on function pointers is implemented in revision #9476.
Philipp
I'm using revision #9479, which I just downloaded, and I can't make this work. :|
For instance, I declared this function, both in my code and in the .h header file:
then I use that function in my program, but in the generated code I see the register pushing/popping which hasn't changed:
I don't get what I'm doing wrong :|
Thanks
Please give a small, compilable example.
Philipp
Sure, sorry.
It looks like the z88dk_fastcall decoration has to be last.
Actually, the preserved regs had to be first, no matter what else or how many others there would be. Fixed in revision #9484.
Philipp
I had some build trouble with MSVC so hopefully it isn't affecting the results here. The SDCC nightly build is still at #9479 where the same problem below is present.
zsdcc -v
3.5.5 #9485
sdcc -mz80 -S test.c
It looks like z88dk_fastcall now has to be first in the list otherwise it is ignored.
I've just started looking at some results (I've only done preserve_regs on string.h) but here's one snippet:
BEFORE:
AFTER:
strlen_fastcall preserves DE. In the after case, the compiler manages to move what was formerly in BC in the BEFORE case into DE so that in the second case it does not need to push around the call to strlen. Very neat :) I'm not sure how many functions this will touch but it did touch more than I thought in string.h. In the whole program, the BEFORE and AFTER cases came out to the same number of asm lines but I think that's probably down to different code patterns in the AFTER case that are not being optimized by existing peephole cases.
Last edit: alvin 2016-01-26
It always had to be first, this is not a new bug. Fixed in revision #9487.
Philipp
It seems to me it's now working as expected, no more uselessly pushing/popping untouched registers around function calls :)
I'm just curious if the compiler can at this point realize when it's assigning a constant to a register which is already holding that value, across a function call that won't modify it.
I mean, say I call a z88dk_fastcall function twice, passing the same constant value, and I declared that the called function will preserve HL. Next assignment would thus be useless, right?
I understand that it may be way more complicated than what it looks to me, though.
Thanks!
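For reference, the pattern in question might look like the sketch below. It assumes the keywords are spelled `__z88dk_fastcall` and `__preserves_regs(h, l)` and that SDCC defines `__SDCC`. The names `set_colour`, `paint_twice`, `stub_calls` and `stub_last_value` are hypothetical. The routine is assumed to be written in assembler, and the C body is only a host-side stand-in.

```c
/* Assumption: under SDCC the parameter arrives in hl and the callee leaves
   h and l untouched. On other compilers the macro expands to nothing. */
#ifdef __SDCC
#define FASTCALL_KEEPS_HL __z88dk_fastcall __preserves_regs(h, l)
#else
#define FASTCALL_KEEPS_HL
#endif

/* Hypothetical assembler routine taking one 16-bit argument. */
void set_colour(unsigned int value) FASTCALL_KEEPS_HL;

#ifndef __SDCC
/* Host-only stand-in that records what it was called with. */
int stub_calls;
unsigned int stub_last_value;

void set_colour(unsigned int value)
{
    stub_last_value = value;
    stub_calls++;
}
#endif

void paint_twice(void)
{
    set_colour(0x1234);   /* constant is loaded into hl for the call */
    set_colour(0x1234);   /* hl would still hold 0x1234 if it is preserved,
                             so a second load would be redundant */
}
```

Whether the second load is actually dropped depends on the code generator and the peephole rules making use of the h and l information.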
Maybe similar to sverx and a solution I am trying with peephole rules below. This is a snippet of output code:
srand_fastcall() preserves a,b,c,d,e,h,l (HL is the input parameter which is unchanged by the function call). The output from either atoi() or randomize(), in HL, is passed to srand() in HL to act as seed. As you can see the compiler is moving HL into DE and then it is saving that value into BC for reuse after the srand_fastcall(). Of course it doesn't have to do anything - it could just use the value in HL without moving into any registers and also have that HL after the srand_fastcall().
Ideal code would look like this:
But code generation issues aside, I had an idea to partially address this with peephole rules. I tried this:
notUsedFrom() works with conditional branches, so I thought I'd try it with calls as well, but it does not work. The hope embedded in the rule above is that the load into bc will be eliminated by other rules.
Is it even possible to get notUsedFrom() to work with function call targets maybe by looking at the preserves_reg() set? A similar rule could be used to fix sverx's example if it cannot be easily fixed in the compiler.
Last edit: alvin 2016-01-27
Currently only information on preserved b, c, d, e is used in code generation.
Philipp