Thread: Re: [Algorithms] How to get 3dvector largest coordinate index? (Page 2)
From: Matt J <mjo...@gm...> - 2009-03-02 17:47:23
|
After profiling I realized that eliminating one jump isn't necessarily
worth all the work. On my Core II it takes a lot of overhead before a
jump is noticeable. Adding more bitwise logic to avoid it can even slow
things down, depending on what 'trick' I use.

If you want to use integers, try the FloatInt union trick Glenn
mentioned; that is a much cleaner way of casting.

This whole exercise is probably a great example of premature
optimization =) If you need the speed, you can dip into assembly. But
since I'm working on a math library and exploring more optimization, I
wanted to jump in.

From what I've seen so far, if you really need speed, nothing is going
to beat a.) using a good compiler or b.) using SSE.

With SSE you can square all the components in one instruction, swizzle
and do all 3 compares in a few more, do an AND in one instruction and a
not-AND in another, and best of all it parallelizes super well. The
architectures that have the worst jumps are the ones that benefit most
from SSE, like the P4, and I read they have a super fast SSE path with
high throughput.

> I guess you're right (I'm referring to Marc Hernandez). Assigning the
> result of a float compare to a bool introduces a branch in VC8 as well.
> How about this trick then
>
> template <>
> FORCEINLINE
> int maxAxis(const Vector3<float>& v)
> {
>     const int32_t* a = reinterpret_cast<const int32_t*>(&v[0]);
>     int c0 = a[0] < a[1];
>     int c1 = a[0] < a[2];
>     int c2 = a[1] < a[2];
>     return (c0 & ~c2) | ((c1 & c2) << 1);
> }
>
> If we take the binary values of the floats and do an integer compare on
> them the result should be equal to the float compare, that is, for
> non-negative floats. I still have to check whether this will work using
> gcc version 4 and up, since I'm not sure if I'm breaking the
> strict-aliasing rule here.
>
> Gino
>
>
> Gino van den Bergen wrote:
> > I would like to share my approach. This code is copy-pasted straight
> > from my Vector3 class template so it may look a bit cluttered but I
> > hope the idea comes across:
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int maxAxis(const Vector3<Scalar>& a)
> > {
> >     int c0 = a[0] < a[1];
> >     int c1 = a[0] < a[2];
> >     int c2 = a[1] < a[2];
> >     return (c0 & ~c2) | ((c1 & c2) << 1);
> > }
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int minAxis(const Vector3<Scalar>& a)
> > {
> >     int c0 = a[1] < a[0];
> >     int c1 = a[2] < a[0];
> >     int c2 = a[2] < a[1];
> >     return (c0 & ~c2) | ((c1 & c2) << 1);
> > }
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int closestAxis(const Vector3<Scalar>& a)
> > {
> >     return maxAxis(a * a);
> > }
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int furthestAxis(const Vector3<Scalar>& a)
> > {
> >     return minAxis(a * a);
> > }
> >
> > The functions minAxis and maxAxis return a value 0, 1, or 2, so the
> > result only needs two bits (00, 01, and 10 in base-2). The first term
> > of the "|" operator is bit-0 and the second term (the one with the <<
> > 1) is bit-1. The nice thing about this approach is the fact that it's
> > branchless. Three boolean values are computed but they are never used
> > to branch, so no code-cache misses can happen here.
> >
> > For finding the minimum or maximum *absolute* value I do not use
> > "fabs" but I rather multiply the vector with itself, thus a * a =
> > (a.x * a.x, a.y * a.y, a.z * a.z). "closestAxis" returns the world
> > axis that is closest (as in most parallel) to vector "a".
> > "furthestAxis" returns the most orthogonal world axis.
> >
> > Cheers,
> >
> > Gino
> >
> >
> > Sylvain G. Vignaud wrote:
> >> Hi,
> >>
> >> I need to compute the index (not the actual value) of the largest
> >> coordinate of a normal, for some space hashing.
> >>
> >> I'm not sure how fast you guys usually find this index, but I've just
> >> created the following trick which I think is quite fast:
> >>
> >>> inline uint LargestCoordinate(const Vector3d &v)
> >>> {
> >>>     const float x = fabs(v.x);
> >>>     const float z = fabs(v.z);
> >>>     const float y = Maths::max( fabs(v.y), z );
> >>>     return uint(fabs(y)>fabs(x)) << uint(fabs(z)>=fabs(y));
> >>> }
> >>
> >> I didn't need such function before, so I'm not sure if this is
> >> considered fast or slow. Do you guys have something faster?
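For readers who want to try the quoted branchless selection outside the original Vector3 class, here is a minimal sketch. The `Vec3` alias and the `squared` helper are hypothetical stand-ins (they are not from the thread); the bit logic is the one Gino posted. Whether the compiler really emits branch-free code for the float compares depends on the compiler and target, as discussed above.

```cpp
#include <array>

// Hypothetical stand-in for the thread's Vector3<float>; only indexing is needed.
using Vec3 = std::array<float, 3>;

// Index (0, 1 or 2) of the largest component. The comparison results are
// combined with bit operations instead of being used as branch conditions.
// Ties resolve to the lower index.
inline int maxAxis(const Vec3& a)
{
    const int c0 = a[0] < a[1];
    const int c1 = a[0] < a[2];
    const int c2 = a[1] < a[2];
    return (c0 & ~c2) | ((c1 & c2) << 1);  // bit 0: y wins, bit 1: z wins
}

// Index of the smallest component, same scheme with the compares reversed.
inline int minAxis(const Vec3& a)
{
    const int c0 = a[1] < a[0];
    const int c1 = a[2] < a[0];
    const int c2 = a[2] < a[1];
    return (c0 & ~c2) | ((c1 & c2) << 1);
}

// Component-wise square, used in place of fabs (hypothetical helper).
inline Vec3 squared(const Vec3& a)
{
    return Vec3{{ a[0] * a[0], a[1] * a[1], a[2] * a[2] }};
}

// World axis most parallel to a.
inline int closestAxis(const Vec3& a) { return maxAxis(squared(a)); }

// World axis most orthogonal to a.
inline int furthestAxis(const Vec3& a) { return minAxis(squared(a)); }
```

For a vector such as (0.1, -0.9, 0.2) the squares are (0.01, 0.81, 0.04), so closestAxis picks index 1 and furthestAxis picks index 0.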
From: <chr...@pl...> - 2009-03-02 17:28:35
|
Gino van den Bergen wrote:
> const int32_t* a = reinterpret_cast<const int32_t*>(&v[0]);
>
> [...] I still have to check whether this will work using
> gcc version 4 and up, since I'm not sure if I'm breaking the
> strict-aliasing rule here.

None of the posted "solutions" that access the float bits
using an integer pointer are legit as they all treat an
area of memory as having two different types at the same
time (which is illegal).

Christer Ericson, Director of Tools and Technology
Sony Computer Entertainment, Santa Monica
From: Matt J <mjo...@gm...> - 2009-03-02 21:49:08
|
It was already mentioned using a union fixes the aliasing rules that
come into effect when you are adhering to C99 standards. Why is the
word "solutions" in quotes? Your blog is full of bit twiddling
exercises ;-)

> None of the posted "solutions" that access the float bits
> using an integer pointer are legit as they all treat an
> area of memory as having two different types at the same
> time (which is illegal).
>
> Christer Ericson, Director of Tools and Technology
> Sony Computer Entertainment, Santa Monica
From: Mat N. <mat...@bu...> - 2009-03-02 22:25:04
|
There's a difference between bit-twiddling something that can't alias
to an integer and bit twiddling an integer.

MSN

From: Matt J [mailto:mjo...@gm...]
Sent: Monday, March 02, 2009 1:49 PM
To: Game Development Algorithms
Subject: Re: [Algorithms] How to get 3dvector largest coordinate index?

> It was already mentioned using a union fixes the aliasing rules that
> come into effect when you are adhering to C99 standards. Why is the
> word "solutions" in quotes? Your blog is full of bit twiddling
> exercises ;-)
>
>> None of the posted "solutions" that access the float bits
>> using an integer pointer are legit as they all treat an
>> area of memory as having two different types at the same
>> time (which is illegal).
>>
>> Christer Ericson, Director of Tools and Technology
>> Sony Computer Entertainment, Santa Monica
From: John B. <dif...@gm...> - 2009-03-03 00:06:24
|
Particularly when going from integer registers to floating point
registers...

http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/

On Mon, Mar 2, 2009 at 2:11 PM, Mat Noguchi <mat...@bu...> wrote:
> There’s a difference between bit-twiddling something that can’t alias to
> an integer and bit twiddling an integer.
>
> MSN
>
> *From:* Matt J [mailto:mjo...@gm...]
> *Sent:* Monday, March 02, 2009 1:49 PM
> *To:* Game Development Algorithms
> *Subject:* Re: [Algorithms] How to get 3dvector largest coordinate index?
>
> It was already mentioned using a union fixes the aliasing rules that come
> into effect when you are adhering to C99 standards. Why is the word
> "solutions" in quotes? Your blog is full of bit twiddling exercises ;-)
>
> None of the posted "solutions" that access the float bits
> using an integer pointer are legit as they all treat an
> area of memory as having two different types at the same
> time (which is illegal).
>
> Christer Ericson, Director of Tools and Technology
> Sony Computer Entertainment, Santa Monica
>
> ------------------------------------------------------------------------------
> Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
> -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
> -Strategies to boost innovation and cut costs with open source participation
> -Receive a $600 discount off the registration fee with the source code: SFAD
> http://p.sf.net/sfu/XcvMzF8H
> _______________________________________________
> GDAlgorithms-list mailing list
> GDA...@li...
> https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list
> Archives:
> http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithms-list

--
-John
From: Matt J <mjo...@gm...> - 2009-03-03 01:04:38
|
LOL, as long as we're resorting to strawman arguments, sure, why not =)

But in this particular case the optimizer needed the comparison to be
on integers to remove the jumps from the ternary, since it compared a
float but returned an integer. I'd assume a better compiler would fix
this so you would use conditional moves and not get that load-store
hit.

Finally, treating a float as an int and doing bit operations on a float
isn't that unusual or undefined. How would you take the absolute value
of four components in SSE? The only way I can think of is to use the
SSE AND operation with 7FFFFFFF (stored in each component of a 128-bit
register), which is there precisely to do just that (in SSE2 you can
cmp a register to itself, and shift each component by 1 to get
7FFFFFFF, but you get the idea).

So, as the link says, it is not just the PS3 that is dual purpose; the
SSE registers are dual purpose as well. In fact, SSE2 extends MMX on
these registers, allowing them to be dual purpose. And the 'official'
way of handling conditional operations in SSE in parallel is to
decompose them into a compare followed by an AND and a NAND operation.

As long as we're using strawman arguments to win points, I'd like to
point out the original Q3 reciprocal sqrt code does the same int * cast,
which I'm sure later on was changed to use a union.

Anyway, I assume your next e-mail is to lecture John Carmack for not
using 1.0f / sqrt(....) on machines w/o hardware RSQRT operation =)

> Particularly when going from integer registers to floating point
> registers...
>
> http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/
>
> On Mon, Mar 2, 2009 at 2:11 PM, Mat Noguchi <mat...@bu...> wrote:
>
>> There’s a difference between bit-twiddling something that can’t alias
>> to an integer and bit twiddling an integer.
>>
>> MSN
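To make the 7FFFFFFF mask concrete, here is a scalar sketch of the same idea: clearing the sign bit of one float. It is a portable stand-in, not the SSE code itself; an SSE version would apply the same mask to four lanes at once (for instance with `_mm_and_ps`). The function name `absViaMask` is made up for illustration, the bits are moved with `memcpy` to stay clear of the aliasing problem discussed in this thread, and an IEEE 754 32-bit float is assumed.

```cpp
#include <cstdint>
#include <cstring>

// Absolute value by clearing the sign bit (bit 31) of the float's bit pattern.
// Assumes float is IEEE 754 binary32; absViaMask is a hypothetical name.
inline float absViaMask(float x)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // read the bits without aliasing UB
    bits &= 0x7FFFFFFFu;                   // same mask as the SSE AND
    std::memcpy(&x, &bits, sizeof x);      // write the masked bits back
    return x;
}
```

Compilers commonly turn the two `memcpy` calls into plain register moves, but that is an optimization detail and not something the language guarantees.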
From: <chr...@pl...> - 2009-03-03 03:21:41
|
Matt J wrote:
> It was already mentioned using a union fixes the aliasing rules that
> come into effect when you are adhering to C99 standards. Why is the
> word "solutions" in quotes? Your blog is full of bit twiddling
> exercises ;-)

"Solutions" was in quotes because the only two valid programs
were posted by Jon W and Glenn F (I apologize if someone else
also wrote a valid piece of code). All other code snippets would
likely generate invalid code on e.g. gcc with -O2 as they were
not adhering to the C99 (6.5/7) or C++ (3.10/15) standards.

Aliasing problems is not an academic issue. It's a very real issue
and people need to start writing proper code for type punning.

> Finally, treating a float as an int and doing bit operations on a
> float isn't that unusual or undefined.

It's not unusual, but most attempts to do so are most certainly
undefined. As already witnessed in this thread! ;P

The best solutions, as always, are machine specific.

Christer Ericson, Director of Tools and Technology
Sony Computer Entertainment, Santa Monica
From: Jon W. <jw...@gm...> - 2009-03-03 03:51:46
|
chr...@pl... wrote:
> Aliasing problems is not an academic issue. It's a very real issue
> and people need to start writing proper code for type punning.

Aliasing through types is only allowed through char* these days.
However, I have three questions regarding this, for those lawyer-like
with the language:

1) Does this include unsigned char* ?
2) Does this include void* ?
3) If I do "float flt; long l = *(long*)(char*)&flt" is that good enough?

Btw: if it doesn't include void*, then how can memcpy() be safe, ever?

Sincerely,

jw
From: Charles N. <cha...@gm...> - 2009-03-03 04:17:09
|
On Mon, Mar 2, 2009 at 7:51 PM, Jon Watte <jw...@gm...> wrote:

> Aliasing through types is only allowed through char* these days.
> However, I have three questions regarding this, for those lawyer-like
> with the language:
> 1) Does this include unsigned char* ?
> 2) Does this include void* ?
> 3) If I do "float flt; long l = *(long*)(char*)&flt" is that good enough?
>
> Btw: if it doesn't include void*, then how can memcpy() be safe, ever?

======================================
C++03 3.10/15:

If a program attempts to access the stored value of an object through an
lvalue of other than one of the following types the behaviour is undefined:
- the dynamic type of the object,
- a cv-qualified version of the dynamic type of the object,
- a type that is the signed or unsigned type corresponding to the dynamic
  type of the object,
- a type that is the signed or unsigned type corresponding to a cv-qualified
  version of the dynamic type of the object,
- an aggregate or union type that includes one of the aforementioned types
  among its members (including, recursively, a member of a subaggregate or
  contained union)
- a type that is a (possibly cv-qualified) base class type of the dynamic
  type of the object,
- a char or unsigned char type
======================================

My reading of this is that (optionally unsigned) char* is the only "legal"
way to do it, but gcc allows using void* as an intermediate, even with
optimization >= "-O2" and/or "-fstrict-aliasing". I think that your "float
flt; long l = *(long*)(char*)&flt" example is compliant.

It's also undefined behavior (right?) to write to one field of a union, and
read from another (the "union cast" trick). gcc seems to see this as an
optimization barrier though, and doesn't do TSAA-based optimizations through
the union, so it's a non-portable but safe way to do it, as others have
mentioned. I actually have this awful code in my current project (i've
posted it before but it's relevant once again :)

    template< typename To, typename From >
    inline To union_cast(From from)
    {
        STATIC_ASSERT(sizeof(To) == sizeof(From));
        union { From f; To t; } const u = { from };
        return u.t;
    }

It's useful for writing middleware, in which you host your structures in
user-supplied memory blocks.

As for your void* and memcpy, pointers implicitly convert to void*, but
isn't that different from punning a Foo* to a void*? Calling memcpy() tells
the compiler to make a void* copy of your Foo* to pass to the function, but
it doesn't -reinterpret- your Foo* bits to a void*... right?

-charles
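For comparison with the union_cast above, here is a sketch of the same helper built on `memcpy`, which reads and writes the bytes as character data and so does not depend on how a compiler treats union access. The name `memcpy_cast` is hypothetical, and `static_assert` (C++11) stands in for the STATIC_ASSERT macro, which was not shown in the thread.

```cpp
#include <cstdint>
#include <cstring>

// Copies the object representation of `from` into a new To.
// memcpy_cast is a hypothetical name; both types are assumed to be
// trivially copyable (plain floats, ints, POD structs).
template <typename To, typename From>
inline To memcpy_cast(const From& from)
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    To to;
    std::memcpy(&to, &from, sizeof(To));  // byte-wise copy, no type punning
    return to;
}

// Example use: the bit pattern of a float as an unsigned 32-bit integer.
inline std::uint32_t floatBits(float f)
{
    return memcpy_cast<std::uint32_t>(f);
}
```

On a platform with IEEE 754 floats, `floatBits(1.0f)` yields 0x3F800000, and casting that value back with `memcpy_cast<float>` returns 1.0f.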
From: Jon W. <jw...@gm...> - 2009-03-03 04:34:18
|
Charles Nicholson wrote:
> As for your void* and memcpy, pointers implicitly convert to void*,
> but isn't that different from punning a Foo* to a void*? Calling
> memcpy() tells the compiler to make a void* copy of your Foo* to pass
> to the function, but it doesn't -reinterpret- your Foo* bits to a
> void*... right?

Couldn't conceivably memcpy() be implemented through moving longwords
around internally? And couldn't some architecture keep
pointers/references to float separate from pointers/references to int?
And, if the memcpy() gets aggressively inlined, wouldn't it look like a
copy of longwords to the compiler, such that if the data in question was
float, there may be an aliasing-based ordering hazard? If "void*" is
treated as "char*" according to the quoted standard section, that won't
be a problem, though.

Sincerely,

jw
From: Charles N. <cha...@gm...> - 2009-03-03 05:05:01
|
On Mon, Mar 2, 2009 at 8:34 PM, Jon Watte <jw...@gm...> wrote:

> Charles Nicholson wrote:
> > As for your void* and memcpy, pointers implicitly convert to void*,
> > but isn't that different from punning a Foo* to a void*? Calling
> > memcpy() tells the compiler to make a void* copy of your Foo* to pass
> > to the function, but it doesn't -reinterpret- your Foo* bits to a
> > void*... right?
>
> Couldn't conceivably memcpy() be implemented through moving longwords
> around internally? And couldn't some architecture keep
> pointers/references to float separate from pointers/references to int?
> And, if the memcpy() gets aggressively inlined, wouldn't it look like a
> copy of longwords to the compiler, such that if the data in question was
> float, there may be an aliasing-based ordering hazard? If "void*" is
> treated as "char*" according to the quoted standard section, that won't
> be a problem, though.

Couldn't it simply cast the void* parameters through char* internally, to
whatever it wanted, though? Sorry, I'm not intentionally playing dumb, I
feel like I'm missing you here. At work for too long, maybe :)

-charles
From: Jon W. <jw...@gm...> - 2009-03-03 05:45:14
|
Charles Nicholson wrote:
> Couldn't it simply cast the void* parameters through char* internally,
> to whatever it wanted, though? Sorry, I'm not intentionally playing
> dumb, I feel like I'm missing you here. At work for too long, maybe :)

I changed my mind: no aggressive inlining needed. If void* is not a
sequence point, and float* is something different, then even if memcpy()
is out-of-lined, couldn't an aggressive compiler move the memcpy()
across stores through or loads from float*? If void* and float* can't
alias, then why not? That seems quite similar to the long* store/load
case; you're just calling a function taking void* instead of loading
through a long*.

The problem is this:

    long func(float arg) {
        float f = arg;
        long l = *(long *)(void *)&f;
        return l;
    }

The read through the long* can actually happen before the store to f,
according to this aliasing analysis.

Compare this:

    long func(float arg) {
        float f = arg;
        long l;
        memcpy((void *)&l, (void *)&f, sizeof(l));
        return l;
    }

Typewise, there is no difference. Because there is no char* or float*
reference to f, isn't the compiler allowed to move that store to after
the call to memcpy(), just like it is allowed to move it across the load
through long*? There probably is something in the spec that doesn't
actually allow this, but the section that talks about char* being
special doesn't talk about void* or function calls, so what is it?

Sincerely,

jw
From: Jason H. <jas...@di...> - 2009-03-03 06:33:57
|
Deep sigh. It's hard not to see this thread as evidence of the severe broken-ness of the language, that so many incredibly intelligent people have to worry about such a basic thing. Let's talk about something less depressing, now, like some cool algorithms. :-) JH |
From: James R. <ja...@fu...> - 2009-03-03 09:02:58
|
While I can't find the exact section in the standard, it is illegal to
read from a variable you have not written to. And, therefore, illegal to
read from a union member you have not previously written to. The (only?)
exception to this is where several or all union members start with
identical members, e.g. a header structure. It is then legal to write to
the 'header' of one union member and read from the header of another.

Unfortunately reading from uninitialised union members isn't something a
compiler can usually warn about, although I believe some static code
analysis tools can pick up on this.

And gcc is wrong to allow void* as an intermediate pointer type in C++.
As you mention, it is legal and perfectly possible to implicitly convert
from T* to void*, but not back again. (Not in C++ anyway. It is legal in
C.)

As for: long l = *( long* )( char* )&f;

Firstly there's nothing in the standard which states that sizeof( long )
== sizeof( * ), so you're already on thin ice. But if you (think you)
know what you're doing then:

    long l = *reinterpret_cast< long* >( &f );

should suffice. (And is better defined and clearer than multiple casts
through intermediate types if you ask me.)
From: Matt J <mjo...@gm...> - 2009-03-03 16:13:34
|
> While I can't find the exact section in the standard, it is illegal to
> read from a variable you have not written to. And, therefore, illegal to
> read from a union member you have not previously written to.

Apparently C99 TC3 adds a footnote:

"If the member used to access the contents of a union object is not the
same as the member last used to store a value in the object, the
appropriate part of the object representation of the value is
reinterpreted as an object representation in the new type as described
in 6.2.6 (a process sometimes called "type punning"). This might be a
trap representation."

This makes the union trick documented (although perhaps still
implementation-specific, which is implicit anyway, since we're assuming
a float is IEEE 754 and is 32-bit and the same size as int).

Anyway, like any good coding practice, surround the code with #ifdef's
for the machine type and maybe compiler. C++ != C99, but apparently GCC
applies the same strict aliasing to it.
From: Jon W. <jw...@gm...> - 2009-03-03 18:11:01
|
James Robertson wrote:
> As for: long l = *( long* )( char* )&f;
>
> Firstly there's nothing in the standard which states that sizeof( long
> ) == sizeof( * ), so you're already on thin ice. But if you (think)
> know what you're doing then:

I think you need to re-read that line of code, as it does not treat a
pointer as a long.

> long l = *reinterpret_cast< long* >( &f );
>
> should suffice. (And is better defined and clearer than multiple
> casts through intermediate types if you ask me.)

That doesn't actually go through char*, and thus doesn't actually signal
a memory barrier for the object you are taking the address of, which is
the whole point of this thread.

Sincerely,

jw
From: James R. <ja...@fu...> - 2009-03-03 19:07:23
|
Ah yeah, sorry. I meant sizeof( long ) == sizeof( float ).

And I guess that'll teach me to start replying halfway through a
conversation in a mailing list I've only just subscribed to. :o)

----- Original Message -----
From: "Jon Watte" <jw...@gm...>
To: "Game Development Algorithms" <gda...@li...>
Sent: Tuesday, March 03, 2009 7:10 PM
Subject: Re: [Algorithms] How to get 3dvector largest coordinate index?

> James Robertson wrote:
>> As for: long l = *( long* )( char* )&f;
>>
>> Firstly there's nothing in the standard which states that sizeof( long
>> ) == sizeof( * ), so you're already on thin ice. But if you (think)
>> know what you're doing then:
>
> I think you need to re-read that line of code, as it does not treat a
> pointer as a long.
>
>> long l = *reinterpret_cast< long* >( &f );
>>
>> should suffice. (And is better defined and clearer than multiple
>> casts through intermediate types if you ask me.)
>
> That doesn't actually go through char*, and thus doesn't actually signal
> a memory barrier for the object you are taking the address of, which is
> the whole point of this thread.
>
> Sincerely,
>
> jw
From: Philip T. <ph...@za...> - 2009-03-03 11:46:43
|
Charles Nicholson wrote:
> On Mon, Mar 2, 2009 at 7:51 PM, Jon Watte <jw...@gm...> wrote:
>
>> Aliasing through types is only allowed through char* these days.
>> However, I have three questions regarding this, for those lawyer-like
>> with the language:

(See further down for expanded answers, and for some IANALL caveats that
attempt to give me an excuse in case I'm wrong :-) )

>> 1) Does this include unsigned char* ?

Yes.

>> 2) Does this include void* ?

Question doesn't make sense, since you can't access data from a void*.

>> 3) If I do "float flt; long l = *(long*)(char*)&flt" is that good enough?

No. Casts are irrelevant.

>> Btw: if it doesn't include void*, then how can memcpy() be safe, ever?

memcpy is defined in terms of unsigned char, which is safe. (Its
arguments are void*, but that's simply a matter of casts, and casts are
irrelevant.)

> ======================================
> C++03 3.10/15:
>
> If a program attempts to access the stored value of an object through an
> lvalue of other than one of the following types the behaviour is undefined:
> - the dynamic type of the object,
> - a cv-qualified version of the dynamic type of the object,
> - a type that is the signed or unsigned type corresponding to the dynamic
>   type of the object,
> - a type that is the signed or unsigned type corresponding to a cv-qualified
>   version of the dynamic type of the object,
> - an aggregate or union type that includes one of the aforementioned types
>   among its members (including, recursively, a member of a subaggregate or
>   contained union)
> - a type that is a (possibly cv-qualified) base class type of the dynamic
>   type of the object,
> - a char or unsigned char type
> ======================================
>
> My reading of this is that (optionally unsigned) char* is the only "legal"
> way to do it, but gcc allows using void* as an intermediate, even with
> optimization >= "-O2" and/or "-fstrict-aliasing". I think that your "float
> flt; long l = *(long*)(char*)&flt" example is compliant.

My reading (which fits with other discussions I've heard on this
subject, and which I'm therefore reasonably confident in, though IANALL)
is that it's all about how you "access the stored value of an object".
Casts are never mentioned, and pointers are never mentioned.

That's why GCC says "warning: dereferencing type-punned pointer will
break strict-aliasing rules" - it's the *dereferencing* that may break
the rules, not the pointer casting. The casts themselves are irrelevant
(except to the extent that they tell the compiler you're probably going
to make a mistake later by dereferencing the newly-cast pointer, hence
the warning). If you have four bytes of memory that you dereference as a
float, you can never dereference it as a long too, no matter what clever
tricks you try to pull.

As far as the standard is concerned, void* is irrelevant since you can't
dereference it. All it does is hide the type-punning from the compiler
so it can't reliably warn you about the danger.

In this case, the object referred to as 'flt' has dynamic type (equal to
its static type) of float. So you can access it as if it were:
- float
- const/volatile float
- struct/union { ...; float f; ... } (presumably to permit code like
  "struct s x = y", accessing the members through an lvalue of type 's')
- char, unsigned char

The only way to reinterpret it as anything other than a float is by
going through (signed/unsigned) char, like:

    long float_to_long(float f) {
        char c[4];
        long l;
        c[0] = ((char*)&f)[0];
        c[1] = ((char*)&f)[1];
        ...
        ((char*)&l)[0] = c[0];
        ((char*)&l)[1] = c[1];
        ...
        return l;
        /* (and you could skip the 'c' temporary if you fancy) */
    }

where the char-copying lines are equivalent to calling memcpy.

As far as I can tell, the rule isn't complicated - simply never
dereference anything except as the type it was declared as (modulo
const/volatile/unsigned) or as char, and use memcpy if you want to
reinterpret bytes and you don't want to rely on platform-specific
quirks.

> As for your void* and memcpy, pointers implicitly convert to void*, but
> isn't that different from punning a Foo* to a void*? Calling memcpy() tells
> the compiler to make a void* copy of your Foo* to pass to the function, but
> it doesn't -reinterpret- your Foo* bits to a void*... right?

Pointer casting is irrelevant. memcpy's operation is defined (in C99) in
terms of unsigned char, and it's always fine to access any object as
unsigned char. If the compiler's libraries implement it in any other
way, it's their responsibility to make sure it works equivalently to the
well-defined char-copying method. If you implement it with char-copying,
it's well-defined and it's the compiler's responsibility to make your
code work. If you implement it any other way, it's undefined behaviour
and now it's your problem.

--
Philip Taylor
ph...@za...
From: Philip T. <ph...@za...> - 2009-03-03 14:34:08
|
Philip Taylor wrote:
>> On Mon, Mar 2, 2009 at 7:51 PM, Jon Watte <jw...@gm...> wrote:
>>> [...]
>>> 3) If I do "float flt; long l = *(long*)(char*)&flt" is that good enough?
>
> No. Casts are irrelevant.

I tried actually testing this, and fortunately it doesn't contradict my
claims. http://zaynar.co.uk/docs/float-aliasing.html shows that GCC
generates invalid code in this case (which is fine because it's
undefined behaviour).

GCC warns about type-punning here. If you cast via void*, then GCC 4.2
doesn't warn (probably because it's very common to cast to void* in
legitimate code and it doesn't want a lot of false positives), but GCC
4.3 does some data flow analysis to detect the problem and warn about
it.

--
Philip Taylor
ph...@za...
From: Jon W. <jw...@gm...> - 2009-03-03 18:15:04
|
Philip Taylor wrote:
> Pointer casting is irrelevant. memcpy's operation is defined (in C99) in
> terms of unsigned char, and it's always fine to access any object as
> unsigned char.

OK, that's the bit I was missing. Additionally, to properly type pun the value, you should then:

  float a = 1234;
  int b;
  unsigned char *src = (unsigned char *)&a;
  unsigned char *dst = (unsigned char *)&b;
  dst[0] = src[0]; dst[1] = src[1]; dst[2] = src[2]; dst[3] = src[3];

At this point, I have properly type punned the data (assuming sizeof(int) == sizeof(float)).
And, at that point, any benefit from using bit twiddling instead of just using FPU intrinsics is probably all gone :-)

Sincerely,

jw
|
From: Mat N. <mat...@bu...> - 2009-03-03 18:38:09
|
You would have to do something equivalent to that anyway to be safe. I.e., guaranteeing that it hits memory and doesn't live in a register and all that fun.

MSN

-----Original Message-----
From: Jon Watte [mailto:jw...@gm...]
Sent: Tuesday, March 03, 2009 10:15 AM
To: Game Development Algorithms
Subject: Re: [Algorithms] How to get 3dvector largest coordinate index?

Philip Taylor wrote:
> Pointer casting is irrelevant. memcpy's operation is defined (in C99) in
> terms of unsigned char, and it's always fine to access any object as
> unsigned char.

OK, that's the bit I was missing. Additionally, to properly type pun the value, you should then:

  float a = 1234;
  int b;
  unsigned char *src = (unsigned char *)&a;
  unsigned char *dst = (unsigned char *)&b;
  dst[0] = src[0]; dst[1] = src[1]; dst[2] = src[2]; dst[3] = src[3];

At this point, I have properly type punned the data (assuming sizeof(int) == sizeof(float)).
And, at that point, any benefit from using bit twiddling instead of just using FPU intrinsics is probably all gone :-)

Sincerely,

jw

_______________________________________________
GDAlgorithms-list mailing list
GDA...@li...
https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list
Archives:
http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithms-list
|
From: Philip T. <ph...@za...> - 2009-03-03 19:11:50
|
Jon Watte wrote:
> Philip Taylor wrote:
>> Pointer casting is irrelevant. memcpy's operation is defined (in C99) in
>> terms of unsigned char, and it's always fine to access any object as
>> unsigned char.
>
> OK, that's the bit I was missing. Additionally, to properly type pun the
> value, you should then:
>
> float a = 1234;
> int b;
> unsigned char *src = (unsigned char *)&a;
> unsigned char *dst = (unsigned char *)&b;
> dst[0] = src[0]; dst[1] = src[1]; dst[2] = src[2]; dst[3] = src[3];
>
> At this point, I have properly type punned the data (assuming
> sizeof(int) == sizeof(float)).
> And, at that point, any benefit from using bit twiddling instead of just
> using FPU intrinsics is probably all gone :-)

Fortunately compilers are (sometimes) clever, so it's not necessarily that bad. With code like:

  void sort_positive_finite_floats(float fs[2]) {
    int a, b;
    memcpy(&a, &fs[0], 4);
    memcpy(&b, &fs[1], 4);
    if (b < a) {
      float t = fs[0];
      fs[0] = fs[1];
      fs[1] = t;
    }
  }

(so the floats are initially in memory, not in FP registers), the compilers I tested (GCC 4.1, MSVC 2005, ICC 10.0, on x86) all simply load the values from memory into integer registers and call 'cmp'. The memcpy is optimised out entirely (but still serves the purpose of preventing the compiler generating invalid code), and the conversion has zero cost.

(Manually copying chars appears to be a bit worse than memcpy since it creates a load of individual byte copy instructions.)

--
Philip Taylor
ph...@za...
|
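Tying this back to the original question, here is a hedged sketch of the integer-compare maxAxis with memcpy standing in for the pointer cast. The name max_axis_nonneg and the plain float[3] parameter are made up for this example; the selection expression is the one Gino posted earlier in the thread. It assumes 32-bit IEEE 754 floats that are all non-negative and finite, since only then does the ordering of the bit patterns match the ordering of the values.

```cpp
#include <cstdint>
#include <cstring>

// Assumption of this sketch: float and int32_t have the same size.
static_assert(sizeof(float) == sizeof(std::int32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: index (0, 1 or 2) of the largest component.
// Precondition (assumed, not checked): all three values are
// non-negative, finite IEEE 754 floats.
inline int max_axis_nonneg(const float v[3])
{
    std::int32_t a[3];
    std::memcpy(a, v, sizeof a);   // well-defined copy of the 12 bytes

    const int c0 = a[0] < a[1];
    const int c1 = a[0] < a[2];
    const int c2 = a[1] < a[2];

    // y wins when x < y and not y < z; z wins when x < z and y < z.
    return (c0 & ~c2) | ((c1 & c2) << 1);
}
```

Whether this actually beats the plain float compares is something to measure on the target compiler; the sketch only shows that the aliasing problem and the bit trick can be separated.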
From: Alen L. <ale...@cr...> - 2009-03-03 07:25:59
|
christer wrote at 3/3/2009:
> It's not unusual, but most attempts to do so are most certainly
> undefined. As already witnessed in this thread! ;P

What is your recommendation for a portable way to obtain a float's bits as an integer?

Thanks,
Alen
|
From: Pal-Kristian E. <pal...@na...> - 2009-03-03 19:23:06
|
Try this useful template:

  template<typename T, typename U>
  static T bit_cast(const U val)
  {
    union { U u; T t; } bits = { val };
    return bits.t;
  }

Now:

  float f = ...;
  uint32_t u = bit_cast<uint32_t>(f);

Should work.

Alen Ladavac wrote:
> christer wrote at 3/3/2009:
>
>> It's not unusual, but most attempts to do so are most certainly
>> undefined. As already witnessed in this thread! ;P
>
> What is your recommendation for a portable way to obtain a float's bits
> as an integer?
>
> Thanks,
> Alen

--
Pål-Kristian Engstad (en...@na...), Lead Graphics & Engine Programmer, Naughty Dog, Inc., 1601 Cloverfield Blvd, 6000 North, Santa Monica, CA 90404, USA. Ph.: (310) 633-9112.

"Emacs would be a far better OS if it was shipped with a halfway-decent text editor." -- Slashdot, Dec 13. 2005.
|
From: Alen L. <ale...@cr...> - 2009-03-04 06:08:10
|
Thanks. A template is a nice approach for abstracting this. Btw, I'd add a static_assert(sizeof(U)==sizeof(T)) there.

But isn't union a not-completely-legal way to do this? From the remainder of the thread and from some experimentation, I'd say that the best way would be to use memcpy()...

Alen

Pal-Kristian wrote at 3/3/2009:
> Try this useful template:
>   template<typename T, typename U>
>   static T bit_cast(const U val)
>   {
>     union { U u; T t; } bits = { val };
>     return bits.t;
>   }
> Now:
>   float f = ...;
>   uint32_t u = bit_cast<uint32_t>(f);
> Should work.
> Alen Ladavac wrote:
>> christer wrote at 3/3/2009:
>>
>>> It's not unusual, but most attempts to do so are most certainly
>>> undefined. As already witnessed in this thread! ;P
>>
>> What is your recommendation for a portable way to obtain a float's bits
>> as an integer?
>>
>> Thanks,
>> Alen

--
Alen
|
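A minimal sketch of what the memcpy-based variant suggested above might look like, with the size check folded in. The name bit_cast_memcpy is hypothetical (chosen so it does not collide with the union version quoted above), and the trivially-copyable check is an extra assumption of this sketch rather than something proposed in the thread.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical memcpy-based replacement for the union trick.
template <typename T, typename U>
inline T bit_cast_memcpy(const U& val)
{
    static_assert(sizeof(T) == sizeof(U),
                  "bit_cast_memcpy: source and destination sizes differ");
    static_assert(std::is_trivially_copyable<T>::value &&
                  std::is_trivially_copyable<U>::value,
                  "bit_cast_memcpy: both types must be trivially copyable");

    T result;
    std::memcpy(&result, &val, sizeof(T));  // copy the object representation
    return result;
}
```

Usage mirrors the union version: `std::uint32_t u = bit_cast_memcpy<std::uint32_t>(f);`. Going by the compiler results reported earlier in the thread, the memcpy is likely to be optimised away, but that is worth confirming on the compiler in use.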