#258 Evaluate arithmetic stuff at compile time


Currently sdcc evaluates arithmetic stuff at compile time for ints, but not for pointers.


struct {
    unsigned char b;
    struct {
        unsigned char d;
        unsigned char e;
    } c;
} a;

void test(void)
{
    a.c.e = 1 + 1 + 1;
}

The "1 + 1 + 1" is replaced by 3 at compile time, but &(a.c.e) is calculated at runtime as (&a + 1 + 1). It should be done at compile time.

The above code (which is attached, too) should be compiled with
sdcc -mz80 --no-peep to see the problem.
Some ports, such as hc08, optimize the pointer calculation during code generation. The z80 port does it partly in code generation and partly in the peephole optimizer: one +1 is folded into the load of the constant and is followed by an inc, which the peephole optimizer would remove. I think this should be done at a higher level instead.
Doing it at a higher level would be cleaner and make it work for more complex examples, too. It would ease register pressure since the register allocator would never allocate registers to intermediate results of the pointer calculation.
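To illustrate what "done at compile time" means here, the following is a minimal sketch in plain C, not sdcc internals. The tags outer and inner and the two function names are made up for this example. addr_stepwise mirrors the (&a + 1 + 1) calculation, while addr_folded adds a single constant offset, which is the form the request asks the compiler to arrive at.

```c
#include <stddef.h>

/* Same layout as the struct in the report; the tags are hypothetical. */
struct outer {
    unsigned char b;
    struct inner {
        unsigned char d;
        unsigned char e;
    } c;
};

struct outer a;

/* Address of a.c.e built one member offset at a time,
   like the runtime calculation described above. */
unsigned char *addr_stepwise(void)
{
    unsigned char *p = (unsigned char *)&a;
    p += offsetof(struct outer, c);   /* offset of c inside a */
    p += offsetof(struct inner, e);   /* offset of e inside c */
    return p;
}

/* Same address with both offsets combined into one constant,
   which the compiler can evaluate without any runtime additions. */
unsigned char *addr_folded(void)
{
    return (unsigned char *)&a
         + (offsetof(struct outer, c) + offsetof(struct inner, e));
}
```

Both functions return &a.c.e. Since a is a global, the folded form is a link-time constant and needs no registers for intermediate results.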



  • Philipp Klaus Krause


  • Philipp Klaus Krause

    • priority: 6 --> 7
  • Philipp Klaus Krause

    Logged In: YES
    Originator: YES

    As the compiler comparison at http://sdcc.wiki.sourceforge.net/Philipp%27s+TODO+list has shown, this is currently the main problem for efficient code generation in sdcc (I have therefore increased the priority).

    The problem can be seen at a high level: already in the AST, the three +1 are combined into +3, while the pointer calculations are not:

    FUNCTION (_test=0x877c40) type (void) args (unsigned-char)
    test.c:13: ASSIGN(=) (0x876a10) type (data-unsigned-char)
    test.c:13: PTR_ACCESS (0x875d80) type (data-unsigned-char)
    test.c:13: ADDRESS_OF (0x875ce0) type (struct __00020001 data-near* )
    test.c:13: PTR_ACCESS (0x875680) type (data-struct __00020001)
    test.c:13: ADDRESS_OF (0x8755e0) type (struct __00010000 near* )
    test.c:13: SYMBOL (a=0x874f80 @ 0x874020) type (struct __00010000)
    test.c:13: SYMBOL (c=0x875540 @ 0x875230)
    test.c:13: SYMBOL (e=0x875c40 @ 0x875930)
    test.c:13: ADD (0x876970) type (unsigned-char)
    test.c:13: SYMBOL (x=0x876130 @ 0x8747f0) type (unsigned-char)
    test.c:13: CONSTANT (0x8766b0) value = 3, 0x3, 3.000000 type (literal-unsigned-char)

  • Philipp Klaus Krause

    Since this is currently the single worst problem in terms of quality of the generated code (both size and speed), I've raised the priority a bit.

  • Philipp Klaus Krause

    • priority: 7 --> 8
  • Philipp Klaus Krause

    offsetof is better than .

  • Philipp Klaus Krause

    There is another aspect to this problem, related to addresses of variables on the stack being calculated.
    The attached file stackhell.c is a minimal example. For a larger example with a huge impact on code size see sdcc's multiplication of long variables.
    It is this problem that makes sdcc by far the worst compiler at its own long multiplication routine!
    Code size sdcc: Approx. 700 bytes, code size HITECH-C: Approx 250 bytes.

    For ports that do not produce reentrant code by default, some of the problems mentioned in this feature request are only visible when using --renentrant.
    The solution should take place before common subexpression elimination; otherwise there is a risk that
    x = (y + 4) + 4;
    z = (y + 4) + 6;
    which should be optimized into
    x = y + 8;
    z = y + 10;
    is made into
    tmp = y + 4;
    x = tmp + 4;
    z = tmp + 6;
    (for additions that come from struct accesses).
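The folding step could look roughly like the sketch below. It is not sdcc's iCode; struct add_imm and fold_add_chains are hypothetical names for this illustration. It assumes every temporary is assigned exactly once and that the source of an earlier addition is not modified before a later use.

```c
#include <stddef.h>

/* Hypothetical three-address instruction: dst = src + imm. */
struct add_imm {
    int dst;   /* id of the result */
    int src;   /* id of the operand */
    int imm;   /* constant addend */
};

/* If an instruction adds a constant to the result of an earlier
   constant addition, rewrite it to use that earlier instruction's
   operand and the sum of both constants. Earlier instructions are
   already folded when a later one is visited, so chains of any
   length collapse to a single addition. */
void fold_add_chains(struct add_imm *code, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < i; j++) {
            if (code[j].dst == code[i].src) {
                code[i].src = code[j].src;
                code[i].imm += code[j].imm;
                break;
            }
        }
    }
}
```

Applied to tmp = y + 4; x = tmp + 4; z = tmp + 6; this rewrites the last two instructions to x = y + 8 and z = y + 10, leaving tmp unused so that a later dead code pass can drop it.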


  • Philipp Klaus Krause

    • priority: 8 --> 9
  • Philipp Klaus Krause

    I suppose icode generation from ast would be the place to do this, since it's the place code such as

    const int x = 0;

    int test(void)
    {
        return *((&x)+1);
    }

    which yields the following AST

    FUNCTION (_test=0x829c0d0) type (int fixed) args (void)
    test.c:5: RETURN (0x829bcc8) type (const-int code)
    test.c:5: DEREF (0x829bc70) type (const-int code)
    test.c:5: ADD (0x829bc18) type (const-int code* code)
    test.c:5: ADDRESS_OF (0x829ba28) type (const-int code* code)
    test.c:5: SYMBOL (x=0x829b9d0 @ 0x829ae70) type (const-int code)
    test.c:5: CONSTANT (0x829bbc0) value = 1, 0x1, 1.000000 type (unsigned-char literal)

    is simplified to remove the ADD.


  • Philipp Klaus Krause

    Revision #5528 is a step towards solving this issue. It flattens nested structure accesses. We now combine pointer additions in cse.


  • Philipp Klaus Krause

    Example showing the problem with stack accesses.

  • Philipp Klaus Krause

    • assigned_to: nobody --> spth
    • status: open --> closed
  • Philipp Klaus Krause

    On second thought, the small change in rev #5528 solved this for nested structs and global structs. The only remaining problem is direct access to members when the aggregate resides on the stack. So I'll close this RFE now and open a more specific one for the case of aggregates on the stack.

