Suspected quadratic performance hog when compiling inline functions

The Small Device C Compiler (SDCC), targeting 8-bit architectures

Brought to you by: benshi, drdani, epetrich, jesusc, and 3 others

#3285 Suspected quadratic performance hog when compiling inline functions

Status: open

Owner: nobody

Labels: None

Category: other

Priority: 5

Updated: 2021-10-02

Created: 2021-10-01

Creator: Oleg Endo

Private: No

I'm compiling large, mostly generated code for MCS-51 that consists of many (nested) inline functions. The compile times are far too long: minutes for a single source file.

Is this a known issue? Any suggestions where to look or how to improve it?
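For readers who want to picture the kind of input described above, here is a minimal sketch of nested inline helpers. It is not taken from the reporter's project: the function names (`add_sat`, `scale2`, `scale4`, `mix`) are hypothetical, and real generated code would presumably nest many more calls per function.

```c
#include <stdint.h>

/* Hypothetical stand-ins for generated helpers; names are illustrative only. */

/* Saturating 8-bit add: clamps the sum to 255. */
static inline uint8_t add_sat(uint8_t a, uint8_t b)
{
    uint16_t s = (uint16_t)a + b;
    return (s > 255u) ? (uint8_t)255u : (uint8_t)s;
}

/* One level of nesting: inlines add_sat once. */
static inline uint8_t scale2(uint8_t x)
{
    return add_sat(x, x);
}

/* Two levels of nesting: inlines scale2 twice, hence add_sat twice. */
static inline uint8_t scale4(uint8_t x)
{
    return scale2(scale2(x));
}

/* Each call here expands into three add_sat bodies plus their temporaries. */
static inline uint8_t mix(uint8_t a, uint8_t b)
{
    return add_sat(scale4(a), scale2(b));
}
```

Every level of nesting multiplies the number of expanded bodies and temporaries in the caller, so a function that looks short in the source can become large once inlining has run.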

Discussion

Erik Petrich - 2021-10-01

I don't know of anything specific to inlining, but I am aware of a performance problem in the GCSE optimization for large functions that have lots of branches. Inlining creates a lot of temporary variables that GCSE will try to clean up, and perhaps this is triggering the problem. You might try the --nogcse option when compiling to see whether it has any significant effect on the compile time. I have started on a possible improvement to the GCSE time, but have not had time to complete and test it.

Oleg Endo - 2021-10-01

Yes, it looks like we got a winner.

Recompiling a single file in my project without --nogcse:
real 2m26.212s
user 2m24.992s
sys 0m0.641s

and with --nogcse:
real 0m3.638s
user 0m3.356s
sys 0m0.386s

Something's terribly wrong there ;)
Any suggestions where to look in the code? Maybe I can come up with a quick hack for myself. It's really a showstopper for me at the moment: compile-run cycles take minutes.

Erik Petrich - 2021-10-02

If it's the problem I think it is, then removing these lines from the cseBBlock() function in src/SDCCcse.c should speed it up (these were added in [r9700]):

    if (recomputeDataFlow)
      computeDataFlow (ebbi);

However, that would also reenable bug [#2495].

My current plan is to replace the call to computeDataFlow(), which recomputes the data flow of the entire function, with a call to a new function that only recomputes the data flow of the expressions changed by the GCSE optimization. I have this partially completed and will give finishing it priority when I have time available.
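To illustrate why recomputing everything after each change can look quadratic, here is a toy cost model. It is not SDCC code and does not use SDCC's data structures: `N_EXPR`, `recompute_all`, `recompute_changed` and `run` are hypothetical names, and the model assumes that the number of changes grows with the size of the function and that each change affects a single expression.

```c
#include <stddef.h>

#define N_EXPR 1000   /* hypothetical number of expressions in one function */

static unsigned long visits;   /* counts per-expression recomputations */

/* Stand-in for updating the data-flow information of one expression. */
static void recompute_one(int *info, size_t i)
{
    info[i] += 1;
    visits++;
}

/* Analogue of a whole-function recomputation: touches every expression. */
static void recompute_all(int *info)
{
    for (size_t i = 0; i < N_EXPR; i++)
        recompute_one(info, i);
}

/* Analogue of an incremental update: touches only the listed expressions. */
static void recompute_changed(int *info, const size_t *changed, size_t n)
{
    for (size_t k = 0; k < n; k++)
        recompute_one(info, changed[k]);
}

/* Apply n_changes changes and return how many expression visits they cost. */
static unsigned long run(int incremental, size_t n_changes)
{
    static int info[N_EXPR];
    visits = 0;
    for (size_t c = 0; c < n_changes; c++) {
        size_t changed[1] = { c % N_EXPR };   /* assume one affected expression */
        if (incremental)
            recompute_changed(info, changed, 1);
        else
            recompute_all(info);
    }
    return visits;
}
```

Under these assumptions, `run(0, 1000)` performs 1,000,000 visits while `run(1, 1000)` performs 1,000. The real saving depends on how many expressions each GCSE change actually affects, which this sketch does not model.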

Related

Bugs: ~~#2495~~

Oleg Endo - 2021-10-02

I tried commenting out those lines in SDCCcse.c, and it is a significant improvement. With those lines commented out and GCSE enabled I get

real 0m11.426s
user 0m11.219s
sys 0m0.306s

It's noticeably slower than with GCSE disabled, but it looks like the right direction.
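To put the three "real" timings from this thread side by side, the ratios can be worked out as below. The helper names (`to_seconds`, `slowdown`, `t_gcse_stock`, `t_gcse_patched`, `t_nogcse`) are made up for this illustration; only the numbers come from the measurements quoted above.

```c
/* Convert a "XmY.YYYs" reading from time(1) into seconds. */
static double to_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/* How many times slower `slow` is than `fast`. */
static double slowdown(double slow, double fast)
{
    return slow / fast;
}

/* Wall-clock times reported in this thread. */
static double t_gcse_stock(void)   { return to_seconds(2, 26.212); } /* GCSE on, unmodified */
static double t_gcse_patched(void) { return to_seconds(0, 11.426); } /* GCSE on, lines commented out */
static double t_nogcse(void)       { return to_seconds(0, 3.638); }  /* --nogcse */
```

With these values, the unmodified GCSE build is roughly 40 times slower than the --nogcse build, and commenting out the lines brings that down to roughly 3.1 times, which is about 13 times faster than the unmodified build.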
