Menu

#130 Handling of array initialization is incorrect

Bug
closed-fixed
nobody
5
2012-06-12
2012-06-07
Ed Schwartz
No

Consider an array, int x[50000] = {0, };

In 6.7.8, paragraph 21 of the C99 standard:

If there are fewer initializers in a brace-enclosed list than there are elements or members
of an aggregate, or fewer characters in a string literal used to initialize an array of known
size than there are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage duration.

In other words, when in doubt, initialize to zero.

CIL does this, but only for up to 16 elements. If it has to put an implied zero for more than 16 elements, it does not put any implied zeros. This is wrong, and has lead to incorrect behavior on real code. Specifically, in card_of_complement from tr.c, in coreutils:

bool in_set[N_CHARS] = { 0, };

where N_CHARS = 256.

There seem to be a few ways of solving this problem:
* Remove the 16 element limit. This could add a lot of statements.
* Allow local variables to use initializers. Global variables are allowed initializers, but not local ones... is there any reason for this?
* Add a loop to initialize the array

I don't mind implementing one of these, but I want to make sure it's something worth including in the CIL codebase. I prefer the third option.

Discussion

  • Ed Schwartz

    Ed Schwartz - 2012-06-07

    By the way, here is an example of how CIL hanhdles _Bool x[4] = { 0 }:

    int f(void)
    {
    _Bool x[4] ;

    {
    #line 2
    x[0] = (_Bool)0;
    #line 2
    x[1] = (_Bool)0;
    #line 2
    x[2] = (_Bool)0;
    #line 2
    x[3] = (_Bool)0;
    #line 4
    return ((int )x[1]);
    }
    }

     
  • Ed Schwartz

    Ed Schwartz - 2012-06-07

    And here is how it handles _Bool x[4096] = { 0 };

    int f(void)
    {
    _Bool x[4096] ;

    {
    #line 2
    x[0] = (_Bool)0;
    #line 4
    return ((int )x[1]);
    }
    }

     
  • Gabriel Kerneis

    Gabriel Kerneis - 2012-06-08

    I would gladly accept a patch. I prefer the second option, but it is not realistic to implement it given the current state of CIL (the reason why it was done that way is to be able to move every local variable to function scope).

    So if you feel like coding the third option, please go ahead. A fourth and shorter solution would be to use __builtin_memset, but it wouldn't work in MSVC mode (unless there is some intrisic memset available there too).

     
  • Ed Schwartz

    Ed Schwartz - 2012-06-08

    Proposed patch

     
  • Ed Schwartz

    Ed Schwartz - 2012-06-11

    I attached a patch for this. Let me know if you have any comments..

     
  • Gabriel Kerneis

    Gabriel Kerneis - 2012-06-11

    Looks good to me. However, is there any reason to keep the first 16 initializers in collectInitializers since we add the loop in every case? I think we should get rid of it but I might be missing something. Or use a loop only for longer arrays, but I prefer avoiding "magic numbers". If you agree, I'll amend the patch, test it and apply it.

     
  • Ed Schwartz

    Ed Schwartz - 2012-06-11

    Hi Gabriel,

    The loop is currently only added when there are more than 16 implied initializers. I also don't like magic numbers, but it could simplify program analysis to have the unrolled initializers outside of the loop when there are only a few of them, so I see a case for both sides.

    I do not feel strongly either way. Feel free to amend the patch in whatever way you think is best.

    Thanks,
    Ed

     
  • Gabriel Kerneis

    Gabriel Kerneis - 2012-06-12

    In fact, your patch is incorrect is several places:

    > The loop is currently only added when there are more than 16 implied initializers

    no it's not (and that's one of the reasons why it breaks the test suite), But anyway I'll remove these 16-initializers trick.

    > BinOp(Ge, Lval ctrlval, Const(CInt64(ni, IUInt, None)), uintType)

    Binop(Ge, …) requires intType, not uintType.

    > castTo iet (typeOfLval lv) (Const(CInt64(0L, IUInt, None)))

    This one is the real problem. IIUC, you copy/pasted it from the SingleInit case, but it does not work. You could have an array of any type (bt), the cast is non-sensical and you have no guarantee that a direct init is possibe (it could be an array of struct containing an array, etc.). I think the solution is to call assignInit recursively with makeZeroInit bt, but I have to test it to be sure.

     
  • Gabriel Kerneis

    Gabriel Kerneis - 2012-06-12

    Committed an amended version of your patch with recursive calls to assignInit.

     
  • Gabriel Kerneis

    Gabriel Kerneis - 2012-06-12
    • status: open --> closed-fixed
     

Log in to post a comment.