Menu

#944 inlining heuristic: pure functions are good candidatesg, especially for constant expression arguments

None
open
nobody
5
2024-10-04
2024-10-01
Tomasz
No

Hello, compiler won't compute constant expressions at compile time (when passing constant argument via function). It's my first time using SDCC. Sorry if I missed something.
Example below:

uint32_t foo(const uint32_t a)
{
    return 1000 / a;
}

_foo:
    ldx #0xe8
    stx *__divulong_PARM_1
    ldx #0x03
    stx *(__divulong_PARM_1 + 1)
    ldx #0x00
    stx *(__divulong_PARM_1 + 2)
    stx *(__divulong_PARM_1 + 3)
    lda *_foo_PARM_1
    sta *__divulong_PARM_2
    lda *(_foo_PARM_1 + 1)
    sta *(__divulong_PARM_2 + 1)
    lda *(_foo_PARM_1 + 2)
    sta *(__divulong_PARM_2 + 2)
    lda *(_foo_PARM_1 + 3)
    sta *(__divulong_PARM_2 + 3)
    jsr __divulong
    sta *_foo_sloc0_1_0
    stx *(_foo_sloc0_1_0 + 1)
    lda *___SDCC_m6502_ret2
    sta *(_foo_sloc0_1_0 + 2)
    lda *___SDCC_m6502_ret3
    sta *(_foo_sloc0_1_0 + 3)
    ora #0x00
    sta *___SDCC_m6502_ret3
    lda *(_foo_sloc0_1_0 + 2)
    sta *___SDCC_m6502_ret2
    ldx *(_foo_sloc0_1_0 + 1)
    lda *_foo_sloc0_1_0
    rts

SDCC version 4.4.0
command:
'sdcc',
'-c',
'-mmos6502',
'--opt-code-speed',
'--model-small',
'foo.c'

Discussion

  • Janko Stamenović

    What do you expect to be produced but a function foo, in which the only constant is 1000 which is 0x3e8 hex, which appears in ldx #0xe8 and ldx #0x03?

     
  • Tomasz

    Tomasz - 2024-10-01

    I do expect some sort of optimization, i.e. when division (or any other arithmetic operation) is performed on two constant values, the final code would perform the load and store instructions only (to return the constant value, calculated on compiler's side), without jump to divide subroutine. I saw that in another implementations (like "kickc" ).

     
    • Janko Stamenović

      You write "two constant values" but the function argument is a 32-bit number which is unknown to the function, there in the 4 bytes addressed with _foo_PARM_1. That there's a "const" specified in the const uint32_t a doesn't change that: a is unknown. The division has to be performed in the function.

      Moreover, for SDCC, sub-optimal code generation is not considered a bug.

      Edit: I've just tried kickc, and for this function I get error: Runtime division not supported. foo::return#0 = $3e8 / foo::a#0 with your function. And as you see, SDCC generates a call to the function which performs division at runtime.

       

      Last edit: Janko Stamenović 2024-10-01
  • Philipp Klaus Krause

    Ticket moved from /p/sdcc/bugs/3784/

    Can't be converted:

    • _category: MOS6502
     
  • Philipp Klaus Krause

    Moving to feature requests, since the generated code is correct.

     
  • Tomasz

    Tomasz - 2024-10-02

    It's true, kickc doesn't support runtime division. But I'm talking about static code generation at compile time. I'm not saying it's a bug (maybe I created this topic in a wrong section) but like you said, sub-optimal code generation. I think there is room for improvement for your consideration. Besides that, I really appreciate your work on this project and for giving others the opportunity to use such a tool.

    kickc results:
    c source

    #define F_CPU               25000000UL
    
    void CounterInit(uint32_t ms)
    {
        uint32_t initValue = (uint32_t)(F_CPU * ms) / 1000;
        COUNTER0W.INIT_VAL0 = (uint8_t)initValue;
        COUNTER0W.INIT_VAL1 = (uint8_t)(initValue>>8);
        COUNTER0W.INIT_VAL2 = (uint8_t)(initValue>>16);
        COUNTER0W.MCFG = COUNTER_EN | COUNTER_DIR | COUNTER_IRQ_MASK | COUNTER_REINIT;
    }
    
    uint32_t foo(const uint32_t a)
    {
        return 1000 / a;
    }
    
    void main(void)
    {
        CounterInit(100);
        LEDC0W.BYTE0 = (uint8_t)foo(10);
    }
    

    asm

    // void CounterInit(unsigned long ms)
    CounterInit: {
        .const ms = $64
        .const initValue = $17d7840*ms/$3e8
        // COUNTER0W.INIT_VAL0 = (uint8_t)initValue
        lda #$ff&initValue
        sta $d300
        // COUNTER0W.INIT_VAL1 = (uint8_t)(initValue>>8)
        lda #$ff&initValue>>8
        sta $d300+1
        // COUNTER0W.INIT_VAL2 = (uint8_t)(initValue>>16)
        lda #initValue>>$10
        sta $d300+2
        // COUNTER0W.MCFG = COUNTER_EN | COUNTER_DIR | COUNTER_IRQ_MASK | COUNTER_REINIT
        lda #COUNTER_EN|COUNTER_DIR|COUNTER_IRQ_MASK|COUNTER_REINIT
        sta $d300+3
        // }
        rts
    }
    // unsigned long foo(const unsigned long a)
    foo: {
        .const a = $a
        .label return = $3e8/a
        rts
    }
    
     
  • Philipp Klaus Krause

    Currently SDCC will only be able to do the division at compile time instead of runtime, if the function call is inlined. And SDCC doesn't have much of an inlining heuristic; to get it to inline anything, you need to use the inline keyword, e.g.:

    inline uint32_t foo(const uint32_t a)
    {
        return 1000 / a;
    }
    

    Also, the const qualifier's meaning is "read-only". I.e. a cannot be written to within the function foo. It is perfectly fine to call foo with an argument that is not a constant.

     
    • Philipp Klaus Krause

      So I guess, we can treat this ticket as a feature request for a better inlining heuristic?
      Since the optimization you want happens when the function gets inlined. And an inlining heuristic could see that the function has no side-effects, does not access global state, and thus when called with an integer constant, can be completly calculated at compile-time, so this function should be considered a good candidate for inlining.
      We probably want [feature-requests:#413] together with this, so if all calls get inlined, we do not need to put the function code into the binary at all.

       

      Related

      Feature Requests: #413


      Last edit: Philipp Klaus Krause 2024-10-02
      • Janko Stamenović

        As far as I've seen as I've tried kickc, its precondition is to always compile the program as the whole: trying to compile just a function gives:

        Required main() not defined in program.
        

        On another side SDCC compiles the compilation units and maintains the non-static functions fully functional for the case they are being called through the calls which aren't known by the compiler. If I understand, unless some new modes are introduced, only a static function whose address is impossible to use would be a candidate for inlining and omission of its callable code.

         
        • Philipp Klaus Krause

          The inlining decision would be made in the compiler, but the compiler would still emit a callable function for use from other translation units. Then [feature-requests:#413] would optimize out the callable function at link time, if it is unused.

           

          Related

          Feature Requests: #413

  • Philipp Klaus Krause

    • summary: Preventing code from being generated by using const qualifier --> inlining heuristic: pure functions are good candidatesg, especially for constant expression arguments
    • Group: -->
     
  • Tomasz

    Tomasz - 2024-10-02

    It would be great! That's exactly what I meant. Thank you.

     

Log in to post a comment.

MongoDB Logo MongoDB