Hello, compiler won't compute constant expressions at compile time (when passing constant argument via function). It's my first time using SDCC. Sorry if I missed something.
Example below:
uint32_t foo(const uint32_t a)
{
return 1000 / a;
}
_foo:
ldx #0xe8
stx *__divulong_PARM_1
ldx #0x03
stx *(__divulong_PARM_1 + 1)
ldx #0x00
stx *(__divulong_PARM_1 + 2)
stx *(__divulong_PARM_1 + 3)
lda *_foo_PARM_1
sta *__divulong_PARM_2
lda *(_foo_PARM_1 + 1)
sta *(__divulong_PARM_2 + 1)
lda *(_foo_PARM_1 + 2)
sta *(__divulong_PARM_2 + 2)
lda *(_foo_PARM_1 + 3)
sta *(__divulong_PARM_2 + 3)
jsr __divulong
sta *_foo_sloc0_1_0
stx *(_foo_sloc0_1_0 + 1)
lda *___SDCC_m6502_ret2
sta *(_foo_sloc0_1_0 + 2)
lda *___SDCC_m6502_ret3
sta *(_foo_sloc0_1_0 + 3)
ora #0x00
sta *___SDCC_m6502_ret3
lda *(_foo_sloc0_1_0 + 2)
sta *___SDCC_m6502_ret2
ldx *(_foo_sloc0_1_0 + 1)
lda *_foo_sloc0_1_0
rts
SDCC version 4.4.0
command:
'sdcc',
'-c',
'-mmos6502',
'--opt-code-speed',
'--model-small',
'foo.c'
What do you expect to be produced but a function foo, in which the only constant is 1000 which is 0x3e8 hex, which appears in
ldx #0xe8andldx #0x03?I do expect some sort of optimization, i.e. when division (or any other arithmetic operation) is performed on two constant values, the final code would perform the load and store instructions only (to return the constant value, calculated on compiler's side), without jump to divide subroutine. I saw that in another implementations (like "kickc" ).
You write "two constant values" but the function argument is a 32-bit number which is unknown to the function, there in the 4 bytes addressed with
_foo_PARM_1. That there's a "const" specified in theconst uint32_t adoesn't change that: a is unknown. The division has to be performed in the function.Moreover, for SDCC, sub-optimal code generation is not considered a bug.
Edit: I've just tried kickc, and for this function I get
error: Runtime division not supported. foo::return#0 = $3e8 / foo::a#0with your function. And as you see, SDCC generates a call to the function which performs division at runtime.Last edit: Janko Stamenović 2024-10-01
Ticket moved from /p/sdcc/bugs/3784/
Can't be converted:
Moving to feature requests, since the generated code is correct.
It's true, kickc doesn't support runtime division. But I'm talking about static code generation at compile time. I'm not saying it's a bug (maybe I created this topic in a wrong section) but like you said, sub-optimal code generation. I think there is room for improvement for your consideration. Besides that, I really appreciate your work on this project and for giving others the opportunity to use such a tool.
kickc results:
c source
asm
Currently SDCC will only be able to do the division at compile time instead of runtime, if the function call is inlined. And SDCC doesn't have much of an inlining heuristic; to get it to inline anything, you need to use the inline keyword, e.g.:
Also, the const qualifier's meaning is "read-only". I.e. a cannot be written to within the function foo. It is perfectly fine to call foo with an argument that is not a constant.
So I guess, we can treat this ticket as a feature request for a better inlining heuristic?
Since the optimization you want happens when the function gets inlined. And an inlining heuristic could see that the function has no side-effects, does not access global state, and thus when called with an integer constant, can be completly calculated at compile-time, so this function should be considered a good candidate for inlining.
We probably want [feature-requests:#413] together with this, so if all calls get inlined, we do not need to put the function code into the binary at all.
Related
Feature Requests: #413
Last edit: Philipp Klaus Krause 2024-10-02
As far as I've seen as I've tried kickc, its precondition is to always compile the program as the whole: trying to compile just a function gives:
On another side SDCC compiles the compilation units and maintains the non-static functions fully functional for the case they are being called through the calls which aren't known by the compiler. If I understand, unless some new modes are introduced, only a static function whose address is impossible to use would be a candidate for inlining and omission of its callable code.
The inlining decision would be made in the compiler, but the compiler would still emit a callable function for use from other translation units. Then [feature-requests:#413] would optimize out the callable function at link time, if it is unused.
Related
Feature Requests: #413
It would be great! That's exactly what I meant. Thank you.