Thread: [GD-Windows] Compiler code gen
Brought to you by:
vexxed72
From: Brett B. <res...@ga...> - 2004-11-29 12:07:20
|
Hello,

I have a really bizarre problem that I'm trying to track down. I got it down to one case that works and one that doesn't, and hopefully somebody here can help shed some light on this for me.

The situation is that when I pass a float to one of my variadic functions, the parameters in the receiving function are not right. If I change the variable to an integer it works fine. After much looking at the code I can see that floats passed to variadic functions are having 8 bytes pushed onto the stack instead of 4 bytes. If I call a non-variadic function only 4 bytes are used. Is this considered correct behavior for Windows? I assume that the variadic function should be popping the 8 bytes if it pushed it, but I can see in my memory dump that all parameters on the stack after floats appear are trashed.

My compiler is CodeWarrior.

Thanks,
Brett |
From: Jon W. <hp...@mi...> - 2004-11-29 17:20:33
|
> After much looking at the code I can see that floats passed to variadic
> functions are having 8 bytes pushed onto the stack instead of 4 bytes. If I
> call a non-variadic function only 4 bytes are used. Is this considered
> correct behavior for Windows?

I believe this is type promotion, as defined by the C/C++ language. You should get the float arguments back out using va_arg(double). Sadly, the C++ standards document just punts to the C standard document, so I can't give you a definitive reference.

This program shows how it works (note: func2 is correct, func1 is incorrect):

    #include <stdarg.h>
    #include <stdio.h>

    void func1( int a, ... )
    {
      va_list vl;
      va_start( vl, a );
      /* incorrect: the caller promoted the floats to double */
      float f1 = va_arg( vl, float );
      float f2 = va_arg( vl, float );
      va_end( vl );
      printf( "%f %f\n", f1, f2 );
    }

    void func2( int a, ... )
    {
      va_list vl;
      va_start( vl, a );
      /* correct: read back the promoted type, then narrow */
      float f1 = va_arg( vl, double );
      float f2 = va_arg( vl, double );
      va_end( vl );
      printf( "%f %f\n", f1, f2 );
    }

    int main()
    {
      func1( 1, 2.0f, 3.0f );
      func2( 1, 2.0f, 3.0f );
      return 0;
    }

Cheers,

/ h+ |
From: Brett B. <res...@ga...> - 2004-11-30 00:20:06
|
>Jon Watte wrote:
> I believe this is type promotion, as defined by the C/C++ language. You
> should get the float arguments back out using va_arg(double). Sadly, the
> C++ standards document just punts to the C standard document, so I can't
> give you a definitive reference.

I changed my va_arg(ap, float) to va_arg(ap, double) and it works now. Thanks! I think it's questionable that CodeWarrior doesn't do the right thing when retrieving a float from va_arg, but maybe they have a good reason for doing it that way...

Cheers,
Brett |
From: Jon W. <hp...@mi...> - 2004-11-30 17:14:58
|
> > I believe this is type promotion, as defined by the C/C++ language. You
> > should get the float arguments back out using va_arg(double). Sadly, the
> > C++ standards document just punts to the C standard document, so I can't
> > give you a definitive reference.
>
> I changed my va_arg(ap, float) to va_arg(ap, double) and it works now.
> Thanks! I think it's questionable that CodeWarrior doesn't do the right
> thing when retrieving a float from va_arg, but maybe they have a good
> reason for doing it that way...

Are you saying it's bad of CodeWarrior to follow the language standard, just like all other compilers?

Cheers,

/ h+ |
From: Javier A. <ja...@py...> - 2004-11-30 18:50:51
|
Jon Watte wrote:
>> I changed my va_arg(ap, float) to va_arg(ap, double) and it works
>> now. Thanks! I think it's questionable that CodeWarrior doesn't do
>> the right thing when retrieving a float from va_arg, but maybe
>> they have a good reason for doing it that way...
>
> Are you saying it's bad of CodeWarrior to follow the language
> standard, just like all other compilers?

I guess his point is that va_arg(ap, float) as defined is always going to do the wrong thing because variable argument lists can never contain floats. I don't know how that could be fixed, considering that va_arg() is essentially a macro hack created to avoid adding language constructs.

CodeWarrior is definitely innocent here.

--
Javier Arevalo
Pyro Studios |
From: Jon W. <hp...@mi...> - 2004-11-30 22:06:46
|
> I guess his point is that va_arg(ap, float) as defined is always going to do
> the wrong thing because variable argument lists can never contain floats. I
> don't know how that could be fixed, considering that va_arg() is essentially
> a macro hack created to avoid adding language constructs.

In C++, you could catch that use with a compile-time assert.

In both C and C++ you can actually pass floats by using unions, and/or type punning. Thus, being able to specify float as the argument for va_arg() isn't entirely useless -- but, as with many things, it may cause holes in the feet of previously inexperienced programmers.

Corollary: experience is directly proportional to the number of holes in your feet, thus more experienced programmers walk slower.

Cheers,

/ h+ |
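A minimal sketch of the compile-time assert Jon mentions, assuming a C++11 compiler with static_assert (not available to most compilers when this thread was written). The names checked_va_arg and sum_floats are hypothetical and not part of <cstdarg>. The va_list is handed over by pointer, which the C standard explicitly allows when the caller keeps using the list afterwards.

```cpp
#include <cassert>
#include <cstdarg>
#include <type_traits>

// Hypothetical helper: reads the next variadic argument as T, but refuses
// at compile time any type that the default argument promotions would
// already have changed on the caller's side.
template <typename T>
T checked_va_arg(va_list* args) {
    static_assert(!std::is_same<T, float>::value,
                  "float is promoted to double through '...'; read a double");
    static_assert(!std::is_integral<T>::value || sizeof(T) >= sizeof(int),
                  "small integer types are promoted to int through '...'");
    return va_arg(*args, T);
}

// Hypothetical example: sums `count` values passed as float or double.
double sum_floats(int count, ...) {
    assert(count >= 0);
    va_list args;
    va_start(args, count);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        // checked_va_arg<float>(&args) would be rejected by the compiler.
        total += checked_va_arg<double>(&args);
    }
    va_end(args);
    return total;
}
```

A call such as sum_floats(2, 2.0f, 3.0f) reads both arguments back as double, which is what the caller actually pushed.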
From: Brett B. <res...@ga...> - 2004-11-30 23:51:15
|
Doh! Not very many holes, but one new one. My gripe with CodeWarrior is that it recognizes the float being passed and promotes it to double from the caller, then I try to retrieve it with the type float and obviously it knows that it must promote that to double but doesn't. I'm not sure it's a standard problem as much as their implementation of the va_arg function.

Brett |
From: Jon W. <hp...@mi...> - 2004-12-01 00:44:25
|
This is the well-defined behavior both of how arguments get passed through ellipsis, and how the va_arg() macro works. It's just one of those things, like "don't read memory after you've free()-ed it" or "don't use single-equals in conditional expressions." It bites you once, and you learn to deal with it.

The type promotion through ellipsis is how ALL arguments got passed to functions back in K&R C -- just be grateful you have the tighter ANSI type scoping for non-ellipsis arguments ;-)

Cheers,

/ h+ |
From: Andras B. <bn...@ma...> - 2004-12-01 22:34:37
|
I don't know about other platforms/compilers, but with MSVC, passing a 4 component floating point vector by value is far more efficient than passing it as a reference (could anyone explain why this is the case btw?).

One more thing that can help performance is correct alignment in memory. Problem is, when my vectors are aligned, then I cannot pass by value anymore, I'm forced to use reference.

These requirements knock each other out, so what can I do? :)

The interesting thing is that I can still create local variables on the stack, and they will be correctly aligned, but it cannot align the passed parameters? Why?

Thanks,

Andras |
From: Dan T. <da...@ar...> - 2004-12-01 23:33:50
|
I'm not an expert by any stretch of the imagination, but here are my thoughts. (*actual* experts encouraged to pipe up, so I know where I'm being stupid)

>I don't know about other platforms/compilers, but with MSVC, passing a 4
>component floating point vector by value is far more efficient than passing
>it as a reference (could anyone explain why this is the case btw?).

Huh. That doesn't seem to make sense. Passing by value means you'll be calling whatever copy constructor you have, and dumping that onto the stack (if I remember this stuff right. Who knows if I am). However a 4 component float vector should be something like 16 bytes - and depending on your system the reference is 4 or 8 bytes. So the difference isn't a whole lot. My money is on the gains from the smaller parameter getting lost in chasing the pointer down however many times you use it - or you are doing something like:

    void MyFunc(vector4* pMyVec)
    {
        // avoid copying onto the stack in the call, only to copy onto the stack in the body,
        // for a grand total of (sizeof(vector4*) + sizeof(vector4)) copied so far, less optimizations.
        vector4 holder = *pMyVec;
    }

Of course, the best way to find out is to take a look at the generated asm for each, and see what is really going on. If this is being called a lot, make sure you aren't busting cache lines... If you are calling by reference and the data is in the cache, then there really shouldn't be any problem with the pointer version as I understand it (provided you don't ask for local copies like in the example).

Post your function if you can... it's kinda hard to figure it out with what you say.

Not sure what you are going for with the aligning stuff. It was my understanding the compiler will align anything you need for speed (had this bite me with a struct a couple times).

-Dan |
From: Andras B. <bn...@ma...> - 2004-12-02 00:22:33
|
> Post your function if you can... its kinda hard to figure it out with what
> you say.

Sure, here's a simple sample :)

    //______________________________________________________________

    template <class T>
    class Vector {
    public:
        union {
            T v[4];
            struct { T x; T y; T z; T w; };
        };

        Vector() {}
        Vector(T f): x(f), y(f), z(f), w(f) {}
        Vector(T ix, T iy, T iz = 0, T iw = 0): x(ix), y(iy), z(iz), w(iw) {}

        operator T* () {return v;}

        __forceinline inline T dprod(const Vector u) const {
            return x*u.x + y*u.y + z*u.z + w*u.w;}
    };

    //______________________________________________________________

    typedef float f32;
    typedef unsigned long int u32;
    typedef Vector<float> v4f;

    //______________________________________________________________

    inline v4f interpolate_CPP_1(const f32 x, const f32 y, const v4f v[])
    {
        f32 xy = x*y;
        v4f c(1-x-y+xy, x-xy, y-xy, xy);
        return v4f( c.dprod(v[0]), c.dprod(v[1]), c.dprod(v[2]), c.dprod(v[3]) );
    }

    //______________________________________________________________

I measured the time of computing 100 million bilinear interpolations using dot products, and changing the dprod operation to accept a const reference instead of the const value makes the test run an order of magnitude slower!!

And I also did the same change in a vector math heavy graphics demo, and the speed difference was remarkable!

Andras |
From: Jon W. <hp...@mi...> - 2004-12-02 00:37:42
|
> I measured the time of computing 100 million bilinear interpolation using
> dot products, and changing the dprod operation to accept a const reference
> instead of the const value makes the test run an order of magnitude slower!!

Is this Evil Aliasing, Part II ? You could find out by disassembling both versions and comparing. If it's re-loading through the reference pointer all the time, then it's probably aliasing related. References are really pointers, so they can introduce (false) aliasing hazards.

Cheers,

/ h+ |
From: Dan T. <da...@ar...> - 2004-12-02 01:02:14
|
Hmm now I'm confused. I'm going to make some statements and see if I can't show my ignorance in such a way that it can be corrected.

It seems weird that aliasing would be a problem - since everything is in 4-byte chunks. From what I understood about aliasing, that only mattered if you broke dword alignment, e.g.

    unsigned char* pData = some_aligned_junk;
    return *(pData + 1); // hit 1 byte off the alignment produced by the compiler.

However everything he is dealing with is already in dwords, effectively, so this shouldn't be a problem...

Could it be that the compiler recognizes the dprod as const, and as such can inline it directly without creating the parameter as a new instance? i.e. the code effectively just drops in without any copy constructors being called at all, whereas with a reference, it still has to chase the pointers, making it a bit slower?

Does this make sense?

-Dan |
From: Jon W. <hp...@mi...> - 2004-12-02 02:07:41
|
> It seems weird that aliasing would be a problem - since everything is in
> 4 bytes chunks. From what I understood about aliasing, that only
> mattered if you broke dword alignment, e.g.

That's CPU register and memory aliasing, which is not the kind of aliasing I'm talking about.

In compilers, any access through a pointer might conceivably change the value of the pointer itself (for example), so each time the pointer is accessed, it has to be re-loaded. There are type-based proofs the compiler can execute to not have to do this all the time, but most compilers are quite conservative.

A simple disassembly of the two cases should tell you whether this is the case.

Cheers,

/ h+ |
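One way to see the kind of hazard Jon describes is a case where the aliasing is real, not just possible. The sketch below is an illustration only, with hypothetical function names, and makes no claim about what MSVC did with Andras's dprod. When the multiplier arrives by reference it may refer to an element of the array being written, so the compiler cannot simply keep it in a register; when it arrives by value it is a private copy.

```cpp
#include <cassert>
#include <cstddef>

// The multiplier is a reference, so it may refer to an element of `data`.
// Every store to data[i] might change `k`, so unless the compiler can
// prove otherwise it has to re-read `k` on each iteration.
void scale_by_ref(float* data, std::size_t n, const float& k) {
    assert(data != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= k;
    }
}

// The multiplier is a private copy; no store through `data` can change it,
// so it can stay in a register for the whole loop.
void scale_by_value(float* data, std::size_t n, float k) {
    assert(data != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= k;
    }
}
```

Calling scale_by_ref(a, 3, a[0]) on {2, 3, 4} gives {4, 12, 16}, because a[0] changes after the first store, while scale_by_value(b, 3, b[0]) on the same input gives {4, 6, 8}. The compiler must preserve the first behavior, which is why it cannot hoist the load in the by-reference version.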
From: Daniel G. <dgl...@cr...> - 2004-12-02 01:10:20
|
Andras Balogh wrote:

>I measured the time of computing 100 million bilinear interpolation using
>dot products, and changing the dprod operation to accept a const reference
>instead of the const value makes the test run an order of magnitude slower!!

Did you look at the disassembly for each function?

cheers
DanG

--
Dan Glastonbury, Senior Programmer, The Creative Assembly
PO Box 883 Fortitude Valley, QLD, Australia. 4006
T: +617 3252 1359
W: http://www.creative-assembly.com.au

`Pour encourjay lays ortras' |
From: Andras B. <bn...@ma...> - 2004-12-01 23:45:59
|
MSDN says that I can use typedef to make aligned versions of my types, eg:

    typedef __declspec(align(16)) Fred FredAligned16;

This works when I declare variables, eg:

    Fred fred1;
    FredAligned16 fred2;

fred2 is 16 bytes aligned, while fred1 is not.

But the strange thing is that sizeof(Fred) == sizeof(FredAligned16) is true, even though Fred's size is not a multiple of 16 bytes...

When I declare different types (ie. not using typedef), then the sizeof will be different because the aligned type will be padded to the next multiple of 16 bytes.

Is this a bug in the compiler??

Thanks,

Andras |
From: Brett B. <res...@ga...> - 2004-12-02 00:07:19
|
Not sure what kind of type Fred is, but Fred and FredAligned16 are the same type, thus the same size. They just _start_ at an aligned location in RAM. If Fred is a struct, your alignment doesn't say anything about the alignment of each element _within_ the struct. If Fred is an element of a struct, then that would affect the size of the struct it is contained within.

If you are trying to align data within structs, it's better to pack them nicely and align the struct start address than peppering code with alignments everywhere. There's a presentation about cache issues you might want to check out:

http://www.gdconf.com/archives/2003/Ericson_Christer.ppt

-Brett |
From: Andras B. <bn...@ma...> - 2004-12-02 00:30:27
|
Let Fred be:

    class Fred {
    public:
        int helloiamfred;
    };

then sizeof(Fred) will be 4

but if I align the same class to 16 bytes:

    class __declspec(align(16)) Fred {
    public:
        int helloiamfred;
    };

then sizeof(Fred) will be 16!!!

The reason is that if you want an array of Freds, each have to be aligned, so it's not just the base address, that's changed, but the padding too!

Look here for more info and examples:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccelng/htm/msmod_18.asp

Andras |
From: Brett B. <res...@ga...> - 2004-12-02 00:48:07
|
Exactly, as I said it's only the start address. If you have an array then it needs to be padded between each start address. |
From: Daniel G. <dgl...@cr...> - 2004-12-02 01:10:18
|
Andras Balogh wrote:

>Let Fred be:
>
>class Fred {
>public:
>    int helloiamfred;
>};
>
>then sizeof(Fred) will be 4
>
>but if I align the same class to 16 bytes:
>
>class __declspec(align(16)) Fred {
>public:
>    int helloiamfred;
>};
>
>then sizeof(Fred) will be 16!!!

No it won't. As others on the list have already said, in both cases sizeof(Fred) will be 4 bytes. It's still an int! In the case of the aligned Fred, the alignment constraint means the compiler will allocate the object on a 16 byte boundary[1]. If you create an array of Freds, then the compiler will allocate the first Fred on the boundary and then add 12 bytes of padding, before the next Fred.

This is the same as:

    struct s
    {
        unsigned char c;
        int i;
    };

The language rules say that s.i needs to be aligned on a 4 byte boundary, so the compiler will add 3 bytes of padding between s.c and s.i. (This can be overridden by the #pragma pack directive.)

>The reason is that if you want an array of Freds, each have to be aligned,
>so it's not just the base address, that's changed, but the padding too!

Exactly. The padding has changed. Not the size of Fred.

Hope this helps to clear things up.

cheers
DanG

[1] Of course heap allocated objects won't adhere to the constraint, unless you use the aligned version of malloc.

--
Dan Glastonbury, Senior Programmer, The Creative Assembly
PO Box 883 Fortitude Valley, QLD, Australia. 4006
T: +617 3252 1359
W: http://www.creative-assembly.com.au

`Pour encourjay lays ortras' |
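The member padding described for struct s can be measured directly. This is a small sketch in standard C++11, assuming default packing (no #pragma pack in effect); the struct is renamed S and the helper name padding_before_i is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Same members as the struct in the message above, with default packing.
struct S {
    unsigned char c;
    int i;
};

// Number of padding bytes the compiler inserted between c and i.
std::size_t padding_before_i() {
    std::size_t offset = offsetof(S, i);
    // i has to start on a multiple of its own alignment.
    assert(offset % alignof(int) == 0);
    return offset - sizeof(unsigned char);
}
```

On a platform where int is 4 bytes with 4-byte alignment this reports 3 bytes of padding, and sizeof(S) comes out as 8.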
From: Daniel G. <dgl...@cr...> - 2004-12-02 01:40:12
|
Daniel Glastonbury wrote:

> Andras Balogh wrote:
>
>> Let Fred be:
>>
>> class Fred {
>> public:
>>     int helloiamfred;
>> };
>>
>> then sizeof(Fred) will be 4
>>
>> but if I align the same class to 16 bytes:
>>
>> class __declspec(align(16)) Fred {
>> public:
>>     int helloiamfred;
>> };
>>
>> then sizeof(Fred) will be 16!!!
>
> No it won't.

I've just been told by a friend that VS.Net spits out the following:

    class Fred
    {
    public:
        int helloiamfred;
    };

    class __declspec(align(16)) FredX {
    public:
        int helloiamfred;
    };

    int s1 = sizeof(Fred);
    int s2 = sizeof(FredX);

s1 is 4, s2 is 16

His explanation was to do with operator new[] failing if it didn't adjust the size. But my question is then, how does operator new cope with the typedef version? Incorrectly?

cheers
DanG

--
Dan Glastonbury, Senior Programmer, The Creative Assembly
PO Box 883 Fortitude Valley, QLD, Australia. 4006
T: +617 3252 1359
W: http://www.creative-assembly.com.au

`Pour encourjay lays ortras' |
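The same distinction can be reproduced with the standard C++11 alignas specifier, used here as a stand-in for MSVC's __declspec(align(16)); that substitution is an assumption, and the sketch says nothing about how the typedef form behaves in VC 7.1. Alignment attached to the type pads sizeof, while alignment attached to a single variable only moves that object. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Fred { int helloiamfred; };

// Alignment attached to the type itself: sizeof is rounded up so that
// every element of an array of FredX lands on a 16-byte boundary.
struct alignas(16) FredX { int helloiamfred; };

static_assert(sizeof(Fred) == sizeof(int), "plain Fred has no padding");
static_assert(sizeof(FredX) == 16, "aligned type is padded to 16 bytes");
static_assert(alignof(FredX) == 16, "alignment requirement is 16 bytes");

// Distance in bytes between consecutive elements of a FredX array.
std::size_t fredx_array_stride() {
    FredX freds[2] = {};
    const char* first = reinterpret_cast<const char*>(&freds[0]);
    const char* second = reinterpret_cast<const char*>(&freds[1]);
    assert(reinterpret_cast<std::uintptr_t>(first) % 16 == 0);
    return static_cast<std::size_t>(second - first);
}

// Alignment attached to one variable only: the object is placed on a
// 16-byte boundary, but its type, and therefore its size, is still Fred's.
bool aligned_variable_keeps_size() {
    alignas(16) Fred fred = {};
    return reinterpret_cast<std::uintptr_t>(&fred) % 16 == 0
        && sizeof(fred) == sizeof(Fred);
}
```

The array stride equals sizeof(FredX), which is why the type-level form has to report 16: sizeof is by definition the distance between consecutive array elements.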
From: Javier A. <ja...@py...> - 2004-12-09 15:24:14
|
The compiler should call operator new with an adjusted size to cope with both the size of the objects and the padding needed to guarantee alignment, and then perform alignment of the pointer. Off the top of my head, something like this:

    EffectiveSize = (DataSize + Alignment-1) & ~(Alignment-1)
    SizeNeeded = EffectiveSize * ArraySize
    AlignedSizeNeeded = SizeNeeded + (Alignment-1)
    Pointer = new(AlignedSizeNeeded)
    AlignedPointer = (Pointer + Alignment-1) & ~(Alignment-1)
    Call the constructor for AlignedPointer and afterwards in EffectiveSize increments

I'm betting the compiler calculates the correct numbers, and inserts the final alignment code after the call to operator new. The call to delete[] will need the original Pointer value, so I would imagine the physical pointer to an aligned type is either bigger (to hold both the returned Pointer and the usable AlignedPointer), or the compiler keeps Pointer but does the conversion every time it is used.

Also my understanding is that sizeof(*p) = (char*)(p+1) - (char*)(p) by definition, so a class with an int is sizeof 4 and a class with an int and an alignment of 16 must be sizeof 16.

Save typos, that should give an idea of what's going on.

--
Javier Arevalo
Pyro Studios |
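The arithmetic above can be tried out by hand. The following is only a sketch with hypothetical helper names (round_up, aligned_alloc_sketch, aligned_free_sketch). It keeps the original malloc pointer in a hidden slot just in front of the aligned block, which is a common library technique and differs from the "bigger pointer" guess above; it is not a claim about what any compiler emits for new[]. It assumes the alignment is a power of two and at least alignof(void*).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Rounds `value` up to the next multiple of `alignment` (a power of two).
std::size_t round_up(std::size_t value, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Over-allocates, rounds the address up, and remembers the original
// malloc pointer in the slot just below the aligned block.
void* aligned_alloc_sketch(std::size_t size, std::size_t alignment) {
    assert(alignment >= alignof(void*));
    assert((alignment & (alignment - 1)) == 0);
    // Payload + worst-case alignment slack + room for the saved pointer.
    void* raw = std::malloc(size + alignment - 1 + sizeof(void*));
    if (raw == nullptr) {
        return nullptr;
    }
    std::uintptr_t start =
        reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    std::uintptr_t aligned = (start + mask) & ~mask;
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Recovers the original pointer from the hidden slot and frees it.
void aligned_free_sketch(void* p) {
    if (p != nullptr) {
        std::free(static_cast<void**>(p)[-1]);
    }
}
```

For ten 4-byte objects aligned to 16, round_up(4, 16) gives an element stride of 16, so the request is aligned_alloc_sketch(10 * 16, 16). On MSVC the CRT already ships _aligned_malloc and _aligned_free for this purpose.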
From: Daniel G. <dgl...@cr...> - 2004-12-09 22:57:52
|
Javier Arevalo wrote:

> Also my understanding is that sizeof(*p) = (char*)(p+1) - (char*)(p)
> by definition, so a class with an int is sizeof 4 and a class with an
> int and an alignment of 16 must be sizeof 16.
>
> Save typos, that should give an idea of what's going on.

Well then in the typedef __declspec(align(16)) Fred AlignedFred, VC 7.1 is broken. It passes through a size of 40 bytes for a new AlignedFred[10].

Contrast with class __declspec(align(16)) AlignedFred, new AlignedFred[10] passes in a size of 160 bytes.

Last time I looked at the implementation for operator new in the CRT it called malloc. I don't see how the declspec is being honoured in such a case.

cheers
DanG

--
Dan Glastonbury, Senior Programmer, The Creative Assembly
PO Box 883 Fortitude Valley, QLD, Australia. 4006
T: +617 3252 1359
W: http://www.creative-assembly.com.au

`Pour encourjay lays ortras' |