
#258 Evaluate arithmetic stuff at compile time

closed
None
9
2009-09-30
2008-03-23
No

Currently sdcc evaluates arithmetic stuff at compile time for ints, but not for pointers.

Example:

struct
{
    unsigned char b;
    struct
    {
        unsigned char d;
        unsigned char e;
    } c;
} a;

void test(void)
{
    a.c.e = 1 + 1 + 1;
}

The "1 + 1 + 1" is replaced by 3 at compile time, but &(a.c.e) is calculated at runtime as (&a + 1 + 1). It should be done at compile time.

The above code (which is attached, too) should be compiled with
sdcc -mz80 --no-peep to see the problem.
Some ports, such as hc08, do optimize the pointer calculation in code generation. The z80 port does it partly in code generation and partly in the peephole optimizer: we see one +1 in the loading of the constant and an inc after that, and the inc would be optimized away by the peephole optimizer. I think, however, that this should be done at a higher level.
Doing it at a higher level would be cleaner and would make it work for more complex examples, too. It would also ease register pressure, since the register allocator would never allocate registers to intermediate results of the pointer calculation.
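
The intended result can be sketched in plain C with offsetof. This is only an illustration, not sdcc code: the struct tags inner and outer, the constant E_OFFSET, and the two helper functions are names assumed for the sketch. It shows that the two member offsets are integer constants, so their sum can be folded into a single constant added to the base address.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the example above. The tags "inner" and "outer" are
   assumptions added so that offsetof can name the types. */
struct inner {
    unsigned char d;
    unsigned char e;
};

struct outer {
    unsigned char b;
    struct inner c;
};

struct outer a;

/* Both offsets are integer constant expressions, so their sum is
   known at compile time. */
enum { E_OFFSET = offsetof(struct outer, c) + offsetof(struct inner, e) };

/* Address of a.c.e computed as base + one folded constant. */
unsigned char *address_of_e(void)
{
    return (unsigned char *)&a + E_OFFSET;
}

/* Checks that the folded form names the same byte as the member access. */
void check_folded_address(void)
{
    assert(address_of_e() == &a.c.e);
}
```

Calling check_folded_address() passes on any conforming compiler, which is the point: the address is the base plus a constant, and nothing needs to be added at runtime.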

Philipp

Discussion

  • Example

     
    • priority: 6 --> 7
     
  • Logged In: YES
    user_id=564030
    Originator: YES

    As the compiler comparison at http://sdcc.wiki.sourceforge.net/Philipp%27s+TODO+list has shown, this is currently the main problem for efficient code generation in sdcc (I have therefore increased the priority).

    The problem can be seen at a high level: already in the AST, the three +1 terms are combined into +3, while the pointer calculations are not:

    FUNCTION (_test=0x877c40) type (void) args (unsigned-char)
    (null):0:{
    test.c:13: ASSIGN(=) (0x876a10) type (data-unsigned-char)
    test.c:13: PTR_ACCESS (0x875d80) type (data-unsigned-char)
    test.c:13: ADDRESS_OF (0x875ce0) type (struct __00020001 data-near* )
    test.c:13: PTR_ACCESS (0x875680) type (data-struct __00020001)
    test.c:13: ADDRESS_OF (0x8755e0) type (struct __00010000 near* )
    test.c:13: SYMBOL (a=0x874f80 @ 0x874020) type (struct __00010000)
    test.c:13: SYMBOL (c=0x875540 @ 0x875230)
    test.c:13: SYMBOL (e=0x875c40 @ 0x875930)
    test.c:13: ADD (0x876970) type (unsigned-char)
    test.c:13: SYMBOL (x=0x876130 @ 0x8747f0) type (unsigned-char)
    test.c:13: CONSTANT (0x8766b0) value = 3, 0x3, 3.000000 type (literal-unsigned-char)
    (null):0:}

     
  • Since this is currently the single worst problem in terms of quality of the generated code (both size and speed), I've raised the priority a bit.

     
    • priority: 7 --> 8
     
  • offsetof is better than .

     
  • There is another aspect to this problem, related to addresses of variables on the stack being calculated.
    The attached file stackhell.c is a minimal example. For a larger example with a huge impact on code size see sdcc's multiplication of long variables.
    It is this problem that makes sdcc by far the worst compiler at its own long multiplication routine!
    Code size sdcc: Approx. 700 bytes, code size HITECH-C: Approx 250 bytes.

    P.S.:
    For ports that do not produce reentrant code by default, some of the problems mentioned in this feature request are only visible when using --renentrant
    P.P.S.:
    The solution should take place before common subexpression elimination; otherwise there's a risk that
    x = (y + 4) + 4;
    z = (y + 4) + 6;
    which should be optimized into
    x = y + 8;
    z = y + 10;
    is made into
    tmp = y + 4;
    x = tmp + 4;
    z = tmp + 6;
    (for additions that come from struct accesses).
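
The rewrite described above can be sketched on a toy expression tree. This is a minimal sketch, not sdcc's actual AST or iCode: the node layout and the function name fold_offsets are hypothetical. It only handles the pattern (e + c1) + c2 and turns it into e + (c1 + c2); running it before any sharing of subexpressions is introduced gives y + 8 and y + 10 directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mini-AST: a symbol, a constant, or an addition. */
enum kind { SYM, CONST, ADD };

struct node {
    enum kind kind;
    int value;            /* used when kind == CONST */
    const char *name;     /* used when kind == SYM */
    struct node *left;    /* used when kind == ADD */
    struct node *right;   /* used when kind == ADD */
};

/* Rewrites (e + c1) + c2 into e + (c1 + c2), bottom-up.
   Only the outer ADD and its own constant are modified, so an inner
   (e + c1) node that is referenced from elsewhere stays intact. */
struct node *fold_offsets(struct node *n)
{
    if (n == NULL || n->kind != ADD)
        return n;

    n->left = fold_offsets(n->left);
    n->right = fold_offsets(n->right);

    if (n->right->kind == CONST &&
        n->left->kind == ADD &&
        n->left->right->kind == CONST) {
        n->right->value += n->left->right->value;  /* c2 += c1 */
        n->left = n->left->left;                   /* skip the inner ADD */
    }
    return n;
}
```

Applied to trees for (y + 4) + 4 and (y + 4) + 6, the function leaves each root as an ADD of y and a single constant (8 and 10), so there is no common y + 4 left for CSE to pull into a temporary.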

    Philipp

     
    • priority: 8 --> 9
     
  • I suppose icode generation from ast would be the place to do this, since it's the place code such as

    const int x = 0;

    int test(void)
    {
        return *((&x)+1);
    }

    which yields the following AST

    FUNCTION (_test=0x829c0d0) type (int fixed) args (void)
    (null):0:{
    test.c:5: RETURN (0x829bcc8) type (const-int code)
    test.c:5: DEREF (0x829bc70) type (const-int code)
    test.c:5: ADD (0x829bc18) type (const-int code* code)
    test.c:5: ADDRESS_OF (0x829ba28) type (const-int code* code)
    test.c:5: SYMBOL (x=0x829b9d0 @ 0x829ae70) type (const-int code)
    test.c:5: CONSTANT (0x829bbc0) value = 1, 0x1, 1.000000 type (unsigned-char literal)
    (null):0:}

    is simplified to remove the ADD.

    Philipp

     
  • Revision #5528 is a step towards solving this issue. It flattens nested structure accesses. We now combine pointer additions in cse.

    Philipp

     
  • Example showing the problem with stack accesses.

     
    • assigned_to: nobody --> spth
    • status: open --> closed
     
  • On second thought, the small change in rev #5528 solved this for nested structs and global structs. The only remaining problem is direct access to members when the aggregate resides on the stack. So I'll close this RFE now and open a more specific one for the case of aggregates on the stack.