
#2296 Compiling of a macros is too slooooooooow

closed-duplicate
None
other
5
2014-10-07
2014-09-26
No

The project is the game "Dark Woods". Its 700 lines of C source used to compile in 10-15 sec.
After I added several lines, the compilation time increased to 4 min!
You will probably tell me that this is not a bug. But for me it is a serious problem, and I am interested in getting it solved.

Time before: 16:05:18

sdcc DarkWoods.c -mz80 --code-loc 45056 --data-loc 63488 --opt-code-size -I "." -I ........\ZXDev\Lib --disable-warning 59 --disable-warning 85 DWRsrc.rel XDev.lib Basic.lib

Time after: 16:09:36

Compiled in: 4 min 18 sec

As you can understand, a jump from 15 sec to 4 min 18 sec is not very helpful for my work. :-(

I have isolated the lines that make compilation slow:

http://zx.oberon2.ru/files/DarkWoods.c

SDCC 3.4.1 #9080

Time before: 16:28:06

sdcc -c -mz80 DarkWoods.c

Time after: 16:31:45

Compiled in: 3 min 39 sec

Please agree with me that 3 min 39 sec for a 10 KB source file is too much.


Discussion

  • Philipp Klaus Krause

    sdcc -c -mz80 DarkWoods.c takes less than 9 seconds on my notebook.

    Philipp

     
  • Oleg N. Cher

    Oleg N. Cher - 2014-09-26

    Hmmm. Maybe we need to test under Windows?

    For a clean experiment, here is what I use (see the archive):

    DarkWoods.bat
    DarkWoods.c
    sdasz80.exe
    sdcc.exe
    sdcpp.exe

    http://zx.oberon2.ru/files/DarkWoods.zip

    Run DarkWoods.bat and you'll see. I tried on my notebook (Windows XP SP3) and desktop (Windows XP 64bit SP2).

     
  • Maarten Brock

    Maarten Brock - 2014-10-04

    When I run this on my i7 with Windows 7 64bit I also see just 8 seconds. But when I try on my 1.6 GHz Atom netbook on Windows XP 32bit I also see 3.5 minutes. And it doesn't seem to be memory size, since task manager says sdcc uses only ~32MB.

    But even on Linux in a virtual machine on this same netbook it takes less than 1.5 minutes. So there must be something with XP going on here.

     
  • Oleg N. Cher

    Oleg N. Cher - 2014-10-04

    And what could be the reason for this delay on XP?
    Are you going to fix it, despite the fact that XP is obsolete?
    I think it still needs to be tested under Windows 7 32 bit.

     
  • Oleg N. Cher

    Oleg N. Cher - 2014-10-04

    Maarten, can you profile SDCC on your Atom netbook with Windows XP 32bit? Then you'll see which code causes this delay.

     
  • Oleg N. Cher

    Oleg N. Cher - 2014-10-04

    Please :)

     
  • Maarten Brock

    Maarten Brock - 2014-10-05

    Some quick profiling indicates it's constantly in malloc. This does not surprise me. And apparently malloc in Windows XP is much less efficient than in Linux or Windows 7. See also
    http://www.nedprod.com/programs/portable/nedmalloc/

    I haven't tried to replace malloc with nedmalloc. And I'm not sure we should.

    You could try the old register allocator, with which it takes only 5 seconds on my netbook to build this file. It doesn't try to brute-force the optimal register usage and thus doesn't call malloc that often. The resulting code will be bigger though. Maybe you can choose between the two allocators based on whether you are doing a debug or a release build of your project.
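
To check the malloc observation on a particular machine, a small micro-benchmark can help. The following is only a sketch: the function name `time_small_allocs`, the batch size and the block size are assumptions made for illustration, and the allocation pattern is not taken from SDCC's register allocator.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical micro-benchmark: allocate and free `count` blocks of `size`
   bytes, in batches, and return the CPU time used in seconds.
   Returns -1.0 if size is 0 or an allocation fails. */
double time_small_allocs(size_t count, size_t size)
{
    enum { BATCH = 256 };          /* blocks kept alive at the same time */
    void *slots[BATCH];
    size_t done = 0;
    clock_t start;

    if (size == 0)
        return -1.0;

    start = clock();
    while (done < count) {
        size_t n = (count - done < BATCH) ? (count - done) : BATCH;
        size_t i;

        for (i = 0; i < n; i++) {
            slots[i] = malloc(size);
            if (slots[i] == NULL) {
                while (i > 0)      /* release what this batch got so far */
                    free(slots[--i]);
                return -1.0;
            }
            /* Touch the block so the allocation is really used. */
            *(volatile unsigned char *)slots[i] = (unsigned char)i;
        }
        for (i = 0; i < n; i++)
            free(slots[i]);
        done += n;
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling, for example, `time_small_allocs(1000000, 24)` from a small test program built with the same toolchain on each system gives a rough comparison of allocator speed. It says nothing about how often sdcc itself calls malloc.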

     
  • Oleg N. Cher

    Oleg N. Cher - 2014-10-06

    On the same machine (my notebook) the same source with --oldralloc gives:

    19:00:56,30
    n:\sdcc_bug>sdcc -c -mz80 DarkWoods.c --oldralloc
    19:00:57,70

    So using the old allocator is a solution for me. Thanks.

    But I still cannot understand how the register allocator affects the processing of the macros. And why is there such a huge time gap between the old and the new allocator? Maybe the algorithm used for processing the macros is inefficient and calls malloc too often?

     
  • Philipp Klaus Krause

    Can you give a link to the code without the macros? I see that for every macro line, we get two 16 bit local variables. That already makes 720 bytes of local variables. I see a total of 1207 bytes of local variables for this function.

    Furthermore, the tree-decomposition we use has width 11, even though the function has tree-width just 2 (i.e. there would be a tree-decomposition of width 2).

    Both of these aspects contribute to a high runtime for the new register allocator.

    Philipp
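
As a rough illustration of the point about macro lines and local variables: the sketch below is not taken from DarkWoods.c or from Ofront's output. The macro `PUT_CELL` and all function names are hypothetical, and the macro declares its two 16-bit temporaries explicitly, which may differ from how the temporaries arise in the real code. If every macro line contributes exactly two 2-byte variables, 720 bytes would correspond to 180 such lines.

```c
#include <stdint.h>

/* Hypothetical stand-in for a translator-generated macro. Each use opens a
   new block with two 16-bit temporaries that belong to the enclosing
   function, so the register allocator has to consider all of them. */
#define PUT_CELL(buf, x, y, v)                  \
    do {                                        \
        int16_t row_ = (int16_t)((y) * 8);      \
        int16_t col_ = (int16_t)(x);            \
        (buf)[row_ + col_] = (uint8_t)(v);      \
    } while (0)

/* Three macro lines: up to six 16-bit temporaries in this one function. */
void draw_with_macro(uint8_t *buf)
{
    PUT_CELL(buf, 1, 0, 10);
    PUT_CELL(buf, 2, 1, 20);
    PUT_CELL(buf, 3, 2, 30);
}

/* Same behaviour with a helper function: the temporaries now live in the
   helper, so the caller's set of local variables stays small. */
static void put_cell(uint8_t *buf, int16_t x, int16_t y, uint8_t v)
{
    int16_t row = (int16_t)(y * 8);
    buf[row + x] = v;
}

void draw_with_helper(uint8_t *buf)
{
    put_cell(buf, 1, 0, 10);
    put_cell(buf, 2, 1, 20);
    put_cell(buf, 3, 2, 30);
}

/* Bytes of locals added by `macro_lines` lines, assuming two 16-bit
   temporaries per line. */
unsigned locals_bytes(unsigned macro_lines)
{
    return macro_lines * 2u * (unsigned)sizeof(int16_t);
}
```

The helper version trades the extra locals for call overhead, so it is not automatically better for code size or speed on the Z80. It only shows why a long run of macro lines can make one function expensive for the new register allocator.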

     
  • Oleg N. Cher

    Oleg N. Cher - 2014-10-06

    Dear Philipp, I do not have the code without the macros. This code was automatically generated by Ofront, an Oberon-2 to C translator.

    If I were writing directly in C, I could get around this problem, for example by rewriting the code entirely without macros. But I write in Oberon, so the problem is a real one for me. Still, even in Oberon I can put this code into another module, compile it once, and afterwards only link the .rel file.

    So, no, never mind. Economical register allocation is more important.

     
  • Philipp Klaus Krause

    The main issue here is the width of the decomposition. Thus I consider this to be a duplicate of bug #1855, even though, in principle, some changes in rematerialization might help here as well (by reducing the number of local variables to be assigned).

    Philipp

     
  • Philipp Klaus Krause

    • status: open --> closed-duplicate
    • assigned_to: Philipp Klaus Krause
     
