Menu

#1162 good code rejected with strange error

closed-fixed
5
2013-05-25
2006-07-06
No

Hi.

The following code (attached) doesn't compile:

---
#define offsetof(TYPE, MEMBER) ((int) &((TYPE *)0)->MEMBER)
struct e {
unsigned char a;
};
struct x {
unsigned char b;
struct e c[0];
};
unsigned char r(unsigned char n)
{
return offsetof(struct x, c[n]);
}
unsigned char main()
{
return r(5);
}
---

$ sdcc --std-sdcc99 off.c
off.c:14: error 151: cannot use generic pointer <null>
to initialize <null>

Discussion

  • Stas Sergeev

    Stas Sergeev - 2006-07-06

    test-case

     
  • Maarten Brock

    Maarten Brock - 2006-08-09

    Logged In: YES
    user_id=888171

    Stas,

    According to the standard this macro should be defined in
    stddef.h so I put it there already 2 years ago. AFAIK it
    works perfectly. Only I used another pair of parentheses
    before taking the address. I would have to look up if your
    code should give the same result.

    Maarten

     
  • Stas Sergeev

    Stas Sergeev - 2006-08-10

    Logged In: YES
    user_id=501371

    Thanks for the pointer, however, removing the macro
    from my test example and including stddef.h doesn't
    change anything at all.
    Your macro works perfectly, but sdcc does not. :-(
    This is not a parentheses bug - my macro was also
    taken from a reliable source.

     
  • Stas Sergeev

    Stas Sergeev - 2006-08-10
    • labels: 101552 --> mcs51(8051) target
     
  • Stas Sergeev

    Stas Sergeev - 2006-08-10

    Logged In: YES
    user_id=501371

    Hmm, what category is this - "msc51"?
    Was it meant to be "mcs51"? A bug? :)
    I changed the category, as per my analises it is a
    codegen bug.

     
  • Maarten Brock

    Maarten Brock - 2006-08-10

    Logged In: YES
    user_id=888171

    Stas,

    I haven't looked deeply into your patch yet, but the
    problem seems to be that SDCC does not know which memory
    type to choose when casting a number to a generic pointer.
    For normal compilation that is a good thing I think and we
    should not start assuming anything and using a default
    memory space.

    However in this special case where we always take the
    address of the created pointer and cast it back to some
    integer type it gets in our way. A good solution is to
    include a memory type in the cast to pointer like this:

    #define offsetof(s,m) (size_t)&(((s __code *)0)->m)

    I'll see if this works for all ports and adapt stddef.h
    accordingly.

    Maarten

     
  • Stas Sergeev

    Stas Sergeev - 2006-08-10

    Logged In: YES
    user_id=501371

    > problem seems to be that SDCC does not know which memory
    > type to choose when casting a number to a generic pointer.
    Yes, that's what I've figured out too.

    > For normal compilation that is a good thing I think and we
    > should not start assuming anything and using a default
    > memory space.
    I think it is good to have an error when such a pointer
    is dereferenced, but not when it is only created. I think
    creating such a pointers should be perfectly legal.
    Unfortunately, I am not familiar with the C standard
    enough to say whether or not the constructs like
    &ptr->member are a dereferences or not. I think they
    are not (the value is never read or written), so I think
    sdcc should not complain. But I need to dig the standards
    to say for sure, unless someone else knows this right
    away. :)

    > A good solution is to
    > include a memory type in the cast to pointer like this:
    This is only a good solution if, and I think you imply
    it is, the &ptr->member is really a dereference. If it
    is not a dereference, then I am going to assert that the
    offsetof macro must not be altered, as being perfectly
    correct.
    Besides, adding an explicit storage class simply makes
    no sense there - "code" will work similar to "xdata" etc,
    so after all it gets unused and therefore smells like a
    bad hack.
    But I see your point too. I think we only disagree on
    whether or not it is a dereference.

    By the way, thinking about it from that point of view,
    my patch is not correct. That patch allows a real
    dereferences, which is not good, as you pointed out.
    Whether or not it would be easy for me to code up the
    patch that will allow only that "pseudo-dereferences",
    is what I doubt in. Hmm...

     
  • Stas Sergeev

    Stas Sergeev - 2006-08-10

    Logged In: YES
    user_id=501371

    I've googled out something relevant, that supports
    my point of view:
    http://c0x.coding-guidelines.com/6.5.3.3.html
    ---
    1104 84) Thus, &*E is equivalent to E (even if E is a null
    pointer), and &(E1[E2]) to ((E1)+(E2))
    ---

    It doesn't explicitly say about ptr->memb, but I think
    ptr[offs] is the same thing, and it explicitly says that
    &(E1[E2]) is just a pointer arithmetic, which I read as
    "not a dereference".
    It also explicitly mentions that &*E (is not a deref and)
    is safe even in case E==NULL. So I think that quote
    covers our case pretty well. The current offsetof macro
    does only the pointer arithmetic, which is safe and
    must always work. It doesn't do any unsafe dereferences.
    That's my understanding.

     
  • Stas Sergeev

    Stas Sergeev - 2006-08-10

    Logged In: YES
    user_id=501371

    Actually, the "dereference" was probably not the right
    term here.
    My point is that the pointer should require to have a
    storage class only as soon as the memory is read or
    written via that pointer. As long as the pointer is
    used only for the pointer arithmetic, it can remain
    "generic". From that point of view, I see
    "(int)&gptr->memb" being identical to
    "(int)(gptr + int_var)". So either both should work
    (my preference), or both should be prohibited.
    If you prohibit only "(int)&gptr->memb", then I
    wonder why "(int)(gptr + int_var)" should still work.

     
  • Stas Sergeev

    Stas Sergeev - 2006-08-10

    Logged In: YES
    user_id=501371

    And so I was able to reduce the test-case to this one
    (attached):
    ---
    struct x {
    unsigned char a,b,c;
    };

    unsigned char r(unsigned char n)
    {
    return ((int)(((struct x *)0)+n));
    }

    unsigned char main()
    {
    return r(5);
    }
    ---

    Do you think this code is illegal too? I was assuming
    this works, but it appeared not...

     
  • Stas Sergeev

    Stas Sergeev - 2006-08-10

    new test-case

     
  • Maarten Brock

    Maarten Brock - 2006-08-10

    Logged In: YES
    user_id=888171

    Stas,

    The question is not if it is illegal. The C standard
    assumes Von Neumann architecture and not Harvard and that
    is the source of the problem here.

    If SDCC only generates an error upon dereferencing an
    unknown storage class pointer chances are you get
    overwhelmed with the same error if you do so. But it is a
    choice we can make.

    I want some input from the other developers on this one.
    In the meantime I'll fix/hack the stddef.h so that at
    least works.

    Maarten

     
  • Maarten Brock

    Maarten Brock - 2006-08-10
    • assigned_to: nobody --> maartenbrock
     
  • Maarten Brock

    Maarten Brock - 2006-08-10

    Logged In: YES
    user_id=888171

    Somehow the flexible array member is involved too. If I
    give the array c length 1 it compiles ok. If x is a simple
    struct containing only char's a,b,c it compiles too.
    struct x {
    unsigned char a,b,c;
    }
    Or even:
    struct x {
    unsigned char b;
    unsigned char c[1];
    }

    But not:
    struct x {
    unsigned char b;
    unsigned char c[0];
    }

    This needs a deeper investigation.

     
  • Stas Sergeev

    Stas Sergeev - 2006-08-11

    Logged In: YES
    user_id=501371

    > If x is a simple
    > struct containing only char's a,b,c it compiles too.
    Hmm, no? In my new test-case that struct exactly
    contains only a,b,c and doesn't compile.
    It compiles for me only if it has 1-byte size, i.e.
    contains only "a", or only c[1].
    Flexible arrays do not seem to be involved in my case,
    only the structure size matters it seems.

     
  • Stas Sergeev

    Stas Sergeev - 2006-08-14

    Logged In: YES
    user_id=501371

    A further observation: changing the function r() in
    my examples to a macro makes the problem to go away.
    The following compiles fine:
    ---
    struct x {
    unsigned char b,k,l;
    };

    #define r(n) \ ((int)(((struct x *)0) + n))

    unsigned char main()
    {
    return r(5);
    }
    ---

    Obviously, the difference is that with the macro, the
    expression is evaluates compile-time. AFAICT (correct
    me if I am wrong) sdcc does the constant folding on
    the icode level. So I assume the icode generation is
    correct, and this is just a codegen bug.
    Since this all compiles well when the constant folding
    is possible, but not otherwise, I think it is obviously
    a bug that have to be fixed. Be it intentional, it
    wouldn't compile even being a macro.

    If I manage to code up the patch that will allow such
    a pointer arith but will complain on an access attempts,
    unless someone else is going to work on that, will it
    be considered for inclusion? I myself am quite happy
    with my current fix, but it would not be a good solution
    for me to maintain the custom patches, so I can try to
    code up a new one if there is a hope it will be included.

     
  • Virgilio Eng. Villatora

    Logged In: YES
    user_id=1610032

    Please see my comment on bug #1518279 .

     
  • Borut Ražem

    Borut Ražem - 2010-07-25
    • milestone: --> fixed
    • status: open --> pending-fixed
     
  • Stas Sergeev

    Stas Sergeev - 2010-07-26
    • status: pending-fixed --> open-fixed
     
  • Stas Sergeev

    Stas Sergeev - 2010-07-26

    Hi Borut.

    I tested, and it seems my test-case now
    compiles even without the use of the
    __builtin_offsetof, so I think someone else
    have already fixed it silently before.
    Of course it would be nice to know the name
    of that hero, but as it is, the bug seems to
    be resolved, thanks!
    So, if my understanding is correct, sdcc now
    fully supports the hand-made variations of
    offsetoff together with the __builtin_offsetof?
    If so, what are the advantages of using
    __builtin_offsetof?

     
  • Maarten Brock

    Maarten Brock - 2010-08-04
    • status: open-fixed --> closed-fixed
     
  • Maarten Brock

    Maarten Brock - 2010-08-04

    Indeed this seems fixed in SDCC and I don't know who fixed it or when and what the advantage of __builtin_offsetof is. Still it's time to close the bug.

     
  • Borut Ražem

    Borut Ražem - 2010-08-04

    > If so, what are the advantages of using __builtin_offsetof?

    Probably none if the macro solution works in all cases. But the big brother (gcc) has it, so the little brother (sdcc) has to have it too ;-)

    The macro offsetof in sdcc library is curreltly implemented as a call to __builtin_offsetof. Maybe we should revet it ot the old "pure macro" soution, so that the eventual bugs in poiner arithmetic will be immediatly exposed? On the other hand I think we have a regression test for pointer artithmetics which also tests the "pure macro" version of offsetoff...

    Borut

     
  • Stas Sergeev

    Stas Sergeev - 2010-08-09

    > Probably none if the macro solution works in all cases.
    The thing is that I've seen an assertion from gcc
    devs that this can't work in all cases. That caught
    my curiosity forever, so if someone happened to
    know when it is better to use __builtin_offsetof, please
    tell me for the big thankyou. :)

     

Log in to post a comment.