Small Device C Compiler (SDCC) / Feature Requests / #944 inlining heuristic: pure functions are good candidatesg, especially for constant expression arguments

Janko Stamenović - 2024-10-01

What do you expect to be produced but a function foo, in which the only constant is 1000 which is 0x3e8 hex, which appears in ldx #0xe8 and ldx #0x03?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Tomasz - 2024-10-01

I do expect some sort of optimization, i.e. when division (or any other arithmetic operation) is performed on two constant values, the final code would perform the load and store instructions only (to return the constant value, calculated on compiler's side), without jump to divide subroutine. I saw that in another implementations (like "kickc" ).

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Janko Stamenović - 2024-10-01
  
  You write "two constant values" but the function argument is a 32-bit number which is unknown to the function, there in the 4 bytes addressed with _foo_PARM_1. That there's a "const" specified in the const uint32_t a doesn't change that: a is unknown. The division has to be performed in the function.
  
  Moreover, for SDCC, sub-optimal code generation is not considered a bug.
  
  Edit: I've just tried kickc, and for this function I get error: Runtime division not supported. foo::return#0 = $3e8 / foo::a#0 with your function. And as you see, SDCC generates a call to the function which performs division at runtime.
  
  Last edit: Janko Stamenović 2024-10-01
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Philipp Klaus Krause - 2024-10-02

Ticket moved from /p/sdcc/bugs/3784/

Can't be converted:

_category: MOS6502
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Philipp Klaus Krause - 2024-10-02

Moving to feature requests, since the generated code is correct.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

It's true, kickc doesn't support runtime division. But I'm talking about static code generation at compile time. I'm not saying it's a bug (maybe I created this topic in a wrong section) but like you said, sub-optimal code generation. I think there is room for improvement for your consideration. Besides that, I really appreciate your work on this project and for giving others the opportunity to use such a tool.

kickc results:
c source

#define F_CPU               25000000UL

void CounterInit(uint32_t ms)
{
    uint32_t initValue = (uint32_t)(F_CPU * ms) / 1000;
    COUNTER0W.INIT_VAL0 = (uint8_t)initValue;
    COUNTER0W.INIT_VAL1 = (uint8_t)(initValue>>8);
    COUNTER0W.INIT_VAL2 = (uint8_t)(initValue>>16);
    COUNTER0W.MCFG = COUNTER_EN | COUNTER_DIR | COUNTER_IRQ_MASK | COUNTER_REINIT;
}

uint32_t foo(const uint32_t a)
{
    return 1000 / a;
}

void main(void)
{
    CounterInit(100);
    LEDC0W.BYTE0 = (uint8_t)foo(10);
}

asm

// void CounterInit(unsigned long ms)
CounterInit: {
    .const ms = $64
    .const initValue = $17d7840*ms/$3e8
    // COUNTER0W.INIT_VAL0 = (uint8_t)initValue
    lda #$ff&initValue
    sta $d300
    // COUNTER0W.INIT_VAL1 = (uint8_t)(initValue>>8)
    lda #$ff&initValue>>8
    sta $d300+1
    // COUNTER0W.INIT_VAL2 = (uint8_t)(initValue>>16)
    lda #initValue>>$10
    sta $d300+2
    // COUNTER0W.MCFG = COUNTER_EN | COUNTER_DIR | COUNTER_IRQ_MASK | COUNTER_REINIT
    lda #COUNTER_EN|COUNTER_DIR|COUNTER_IRQ_MASK|COUNTER_REINIT
    sta $d300+3
    // }
    rts
}
// unsigned long foo(const unsigned long a)
foo: {
    .const a = $a
    .label return = $3e8/a
    rts
}

Philipp Klaus Krause - 2024-10-02

Currently SDCC will only be able to do the division at compile time instead of runtime, if the function call is inlined. And SDCC doesn't have much of an inlining heuristic; to get it to inline anything, you need to use the inline keyword, e.g.:

inline uint32_t foo(const uint32_t a) { return 1000 / a; }

Also, the const qualifier's meaning is "read-only". I.e. a cannot be written to within the function foo. It is perfectly fine to call foo with an argument that is not a constant.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Philipp Klaus Krause - 2024-10-02
  
  So I guess, we can treat this ticket as a feature request for a better inlining heuristic?
  Since the optimization you want happens when the function gets inlined. And an inlining heuristic could see that the function has no side-effects, does not access global state, and thus when called with an integer constant, can be completly calculated at compile-time, so this function should be considered a good candidate for inlining.
  We probably want [feature-requests:#413] together with this, so if all calls get inlined, we do not need to put the function code into the binary at all.
  
  Related
  
  Feature Requests: #413
  
  Last edit: Philipp Klaus Krause 2024-10-02
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - Janko Stamenović - 2024-10-02
    
    As far as I've seen as I've tried kickc, its precondition is to always compile the program as the whole: trying to compile just a function gives:
    
    Required main() not defined in program.
    
    On another side SDCC compiles the compilation units and maintains the non-static functions fully functional for the case they are being called through the calls which aren't known by the compiler. If I understand, unless some new modes are introduced, only a static function whose address is impossible to use would be a candidate for inlining and omission of its callable code.
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
    - Philipp Klaus Krause - 2024-10-04
      
      The inlining decision would be made in the compiler, but the compiler would still emit a callable function for use from other translation units. Then [feature-requests:#413] would optimize out the callable function at link time, if it is unused.
      
      Related
      
      Feature Requests: #413
      
      If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Philipp Klaus Krause - 2024-10-02

summary: Preventing code from being generated by using const qualifier --> inlining heuristic: pure functions are good candidatesg, especially for constant expression arguments

Group: -->
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Tomasz - 2024-10-02

It would be great! That's exactly what I meant. Thank you.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

inlining heuristic: pure functions are good candidatesg, especially for...

The Small Device C Compiler (SDCC), targeting 8-bit architectures

Group

Searches

Help

#944 inlining heuristic: pure functions are good candidatesg, especially for constant expression arguments

Discussion

Related

Related