Hi.
The following code (attached) doesn't compile:
---
#define offsetof(TYPE, MEMBER) ((int) &((TYPE *)0)->MEMBER)
struct e {
unsigned char a;
};
struct x {
unsigned char b;
struct e c[0];
};
unsigned char r(unsigned char n)
{
return offsetof(struct x, c[n]);
}
unsigned char main()
{
return r(5);
}
---
$ sdcc --std-sdcc99 off.c
off.c:14: error 151: cannot use generic pointer <null>
to initialize <null>
test-case
Logged In: YES
user_id=888171
Stas,
According to the standard this macro should be defined in
stddef.h so I put it there already 2 years ago. AFAIK it
works perfectly. Only I used another pair of parentheses
before taking the address. I would have to look up if your
code should give the same result.
Maarten
Logged In: YES
user_id=501371
Thanks for the pointer, however, removing the macro
from my test example and including stddef.h doesn't
change anything at all.
Your macro works perfectly, but sdcc does not. :-(
This is not a parentheses bug - my macro was also
taken from a reliable source.
Logged In: YES
user_id=501371
Hmm, what category is this - "msc51"?
Was it meant to be "mcs51"? A bug? :)
I changed the category, as per my analises it is a
codegen bug.
Logged In: YES
user_id=888171
Stas,
I haven't looked deeply into your patch yet, but the
problem seems to be that SDCC does not know which memory
type to choose when casting a number to a generic pointer.
For normal compilation that is a good thing I think and we
should not start assuming anything and using a default
memory space.
However in this special case where we always take the
address of the created pointer and cast it back to some
integer type it gets in our way. A good solution is to
include a memory type in the cast to pointer like this:
#define offsetof(s,m) (size_t)&(((s __code *)0)->m)
I'll see if this works for all ports and adapt stddef.h
accordingly.
Maarten
Logged In: YES
user_id=501371
> problem seems to be that SDCC does not know which memory
> type to choose when casting a number to a generic pointer.
Yes, that's what I've figured out too.
> For normal compilation that is a good thing I think and we
> should not start assuming anything and using a default
> memory space.
I think it is good to have an error when such a pointer
is dereferenced, but not when it is only created. I think
creating such a pointers should be perfectly legal.
Unfortunately, I am not familiar with the C standard
enough to say whether or not the constructs like
&ptr->member are a dereferences or not. I think they
are not (the value is never read or written), so I think
sdcc should not complain. But I need to dig the standards
to say for sure, unless someone else knows this right
away. :)
> A good solution is to
> include a memory type in the cast to pointer like this:
This is only a good solution if, and I think you imply
it is, the &ptr->member is really a dereference. If it
is not a dereference, then I am going to assert that the
offsetof macro must not be altered, as being perfectly
correct.
Besides, adding an explicit storage class simply makes
no sense there - "code" will work similar to "xdata" etc,
so after all it gets unused and therefore smells like a
bad hack.
But I see your point too. I think we only disagree on
whether or not it is a dereference.
By the way, thinking about it from that point of view,
my patch is not correct. That patch allows a real
dereferences, which is not good, as you pointed out.
Whether or not it would be easy for me to code up the
patch that will allow only that "pseudo-dereferences",
is what I doubt in. Hmm...
Logged In: YES
user_id=501371
I've googled out something relevant, that supports
my point of view:
http://c0x.coding-guidelines.com/6.5.3.3.html
---
1104 84) Thus, &*E is equivalent to E (even if E is a null
pointer), and &(E1[E2]) to ((E1)+(E2))
---
It doesn't explicitly say about ptr->memb, but I think
ptr[offs] is the same thing, and it explicitly says that
&(E1[E2]) is just a pointer arithmetic, which I read as
"not a dereference".
It also explicitly mentions that &*E (is not a deref and)
is safe even in case E==NULL. So I think that quote
covers our case pretty well. The current offsetof macro
does only the pointer arithmetic, which is safe and
must always work. It doesn't do any unsafe dereferences.
That's my understanding.
Logged In: YES
user_id=501371
Actually, the "dereference" was probably not the right
term here.
My point is that the pointer should require to have a
storage class only as soon as the memory is read or
written via that pointer. As long as the pointer is
used only for the pointer arithmetic, it can remain
"generic". From that point of view, I see
"(int)&gptr->memb" being identical to
"(int)(gptr + int_var)". So either both should work
(my preference), or both should be prohibited.
If you prohibit only "(int)&gptr->memb", then I
wonder why "(int)(gptr + int_var)" should still work.
Logged In: YES
user_id=501371
And so I was able to reduce the test-case to this one
(attached):
---
struct x {
unsigned char a,b,c;
};
unsigned char r(unsigned char n)
{
return ((int)(((struct x *)0)+n));
}
unsigned char main()
{
return r(5);
}
---
Do you think this code is illegal too? I was assuming
this works, but it appeared not...
new test-case
Logged In: YES
user_id=888171
Stas,
The question is not if it is illegal. The C standard
assumes Von Neumann architecture and not Harvard and that
is the source of the problem here.
If SDCC only generates an error upon dereferencing an
unknown storage class pointer chances are you get
overwhelmed with the same error if you do so. But it is a
choice we can make.
I want some input from the other developers on this one.
In the meantime I'll fix/hack the stddef.h so that at
least works.
Maarten
Logged In: YES
user_id=888171
Somehow the flexible array member is involved too. If I
give the array c length 1 it compiles ok. If x is a simple
struct containing only char's a,b,c it compiles too.
struct x {
unsigned char a,b,c;
}
Or even:
struct x {
unsigned char b;
unsigned char c[1];
}
But not:
struct x {
unsigned char b;
unsigned char c[0];
}
This needs a deeper investigation.
Logged In: YES
user_id=501371
> If x is a simple
> struct containing only char's a,b,c it compiles too.
Hmm, no? In my new test-case that struct exactly
contains only a,b,c and doesn't compile.
It compiles for me only if it has 1-byte size, i.e.
contains only "a", or only c[1].
Flexible arrays do not seem to be involved in my case,
only the structure size matters it seems.
Logged In: YES
user_id=501371
A further observation: changing the function r() in
my examples to a macro makes the problem to go away.
The following compiles fine:
---
struct x {
unsigned char b,k,l;
};
#define r(n) \ ((int)(((struct x *)0) + n))
unsigned char main()
{
return r(5);
}
---
Obviously, the difference is that with the macro, the
expression is evaluates compile-time. AFAICT (correct
me if I am wrong) sdcc does the constant folding on
the icode level. So I assume the icode generation is
correct, and this is just a codegen bug.
Since this all compiles well when the constant folding
is possible, but not otherwise, I think it is obviously
a bug that have to be fixed. Be it intentional, it
wouldn't compile even being a macro.
If I manage to code up the patch that will allow such
a pointer arith but will complain on an access attempts,
unless someone else is going to work on that, will it
be considered for inclusion? I myself am quite happy
with my current fix, but it would not be a good solution
for me to maintain the custom patches, so I can try to
code up a new one if there is a hope it will be included.
Logged In: YES
user_id=1610032
Please see my comment on bug #1518279 .
I hope this problem is now fixed by implementing __builtin_offsetof (see path #801126: "__builtin_offsetof implementation" at https://sourceforge.net/tracker/index.php?func=detail&aid=2801126&group_id=599&atid=300599\).
Borut
Hi Borut.
I tested, and it seems my test-case now
compiles even without the use of the
__builtin_offsetof, so I think someone else
have already fixed it silently before.
Of course it would be nice to know the name
of that hero, but as it is, the bug seems to
be resolved, thanks!
So, if my understanding is correct, sdcc now
fully supports the hand-made variations of
offsetoff together with the __builtin_offsetof?
If so, what are the advantages of using
__builtin_offsetof?
Indeed this seems fixed in SDCC and I don't know who fixed it or when and what the advantage of __builtin_offsetof is. Still it's time to close the bug.
> If so, what are the advantages of using __builtin_offsetof?
Probably none if the macro solution works in all cases. But the big brother (gcc) has it, so the little brother (sdcc) has to have it too ;-)
The macro offsetof in sdcc library is curreltly implemented as a call to __builtin_offsetof. Maybe we should revet it ot the old "pure macro" soution, so that the eventual bugs in poiner arithmetic will be immediatly exposed? On the other hand I think we have a regression test for pointer artithmetics which also tests the "pure macro" version of offsetoff...
Borut
> Probably none if the macro solution works in all cases.
The thing is that I've seen an assertion from gcc
devs that this can't work in all cases. That caught
my curiosity forever, so if someone happened to
know when it is better to use __builtin_offsetof, please
tell me for the big thankyou. :)