From: Mat Noguchi <matthewn@bu...>  2009-03-02 22:25:04

There's a difference between bit-twiddling something that can't alias to an integer and bit-twiddling an integer.

MSN

From: Matt J [mailto:mjohnson2005@...]
Sent: Monday, March 02, 2009 1:49 PM
To: Game Development Algorithms
Subject: Re: [Algorithms] How to get 3dvector largest coordinate index?

It was already mentioned using a union fixes the aliasing rules that come into effect when you are adhering to C99 standards. Why is the word "solutions" in quotes? Your blog is full of bit twiddling exercises ;)

> None of the posted "solutions" that access the float bits
> using an integer pointer are legit as they all treat an
> area of memory as having two different types at the same
> time (which is illegal).
>
> Christer Ericson, Director of Tools and Technology
> Sony Computer Entertainment, Santa Monica
From: Matt J <mjohnson2005@gm...>  2009-03-02 21:49:08

It was already mentioned using a union fixes the aliasing rules that come into effect when you are adhering to C99 standards. Why is the word "solutions" in quotes? Your blog is full of bit twiddling exercises ;)

> None of the posted "solutions" that access the float bits
> using an integer pointer are legit as they all treat an
> area of memory as having two different types at the same
> time (which is illegal).
>
> Christer Ericson, Director of Tools and Technology
> Sony Computer Entertainment, Santa Monica
From: Matt J <mjohnson2005@gm...>  2009-03-02 17:47:23

After profiling I realized that eliminating one jump isn't necessarily worth all the work. On my Core II it takes a lot of overhead before a jump is noticeable. Adding more bitwise logic to avoid it can even slow things down, depending on what 'trick' I use.

If you want to use integers, try the FloatInt union trick Glenn mentioned; that is a much cleaner way of casting.

This whole exercise is probably a great example of premature optimization =) If you need the speed, you can dip into assembly. But since I'm working on a math library and exploring more optimization I wanted to jump in.

From what I've seen, if you really need speed, nothing is going to beat a.) using a good compiler or b.) using SSE.

With SSE you can square all the components in one instruction, swizzle and do all 3 compares in a few more, do an AND in one instruction and a not-AND in another, and best of all it parallelizes super well. The architectures that have the worst jumps are the ones to benefit from SSE, like the P4, and I read they have a super fast SSE path with high throughput.

> I guess you're right (I'm referring to Marc Hernandez). Assigning the
> result of a float compare to a bool introduces a branch in VC8 as well.
> How about this trick then
>
> template <>
> FORCEINLINE
> int maxAxis(const Vector3<float>& v)
> {
>     const int32_t* a = reinterpret_cast<const int32_t*>(&v[0]);
>     int c0 = a[0] < a[1];
>     int c1 = a[0] < a[2];
>     int c2 = a[1] < a[2];
>     return (c0 & ~c2) | ((c1 & c2) << 1);
> }
>
> If we take the binary values of the floats and do an integer compare on
> them the result should be equal to the float compare, that is, for
> non-negative floats. I still have to check whether this will work using
> gcc version 4 and up, since I'm not sure if I'm breaking the
> strict-aliasing rule here.
>
> Gino
>
>
> Gino van den Bergen wrote:
> > I would like to share my approach. This code is copy-pasted straight
> > from my Vector3 class template so it may look a bit cluttered but I
> > hope the idea comes across:
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int maxAxis(const Vector3<Scalar>& a)
> > {
> >     int c0 = a[0] < a[1];
> >     int c1 = a[0] < a[2];
> >     int c2 = a[1] < a[2];
> >     return (c0 & ~c2) | ((c1 & c2) << 1);
> > }
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int minAxis(const Vector3<Scalar>& a)
> > {
> >     int c0 = a[1] < a[0];
> >     int c1 = a[2] < a[0];
> >     int c2 = a[2] < a[1];
> >     return (c0 & ~c2) | ((c1 & c2) << 1);
> > }
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int closestAxis(const Vector3<Scalar>& a)
> > {
> >     return maxAxis(a * a);
> > }
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int furthestAxis(const Vector3<Scalar>& a)
> > {
> >     return minAxis(a * a);
> > }
> >
> > The function minAxis and maxAxis return a value 0, 1, or 2, so the
> > result only needs two bits (00, 01, and 10 in base-2). The first term
> > of the "|" operator is bit 0 and the second term (the one with the <<
> > 1) is bit 1. The nice thing about this approach is the fact that it's
> > branchless. Three boolean values are computed but they are never used
> > to branch, so no code-cache misses can happen here.
> >
> > For finding the minimum or maximum *absolute* value I do not use
> > "fabs" but I rather multiply the vector with itself, thus a * a =
> > (a.x * a.x, a.y * a.y, a.z * a.z). "closestAxis" returns the world
> > axis that is closest (as in most parallel) to vector "a".
> > "furthestAxis" returns the most orthogonal world axis.
> >
> > Cheers,
> >
> > Gino
> >
> >
> > Sylvain G. Vignaud wrote:
> >> Hi,
> >>
> >> I need to compute the index (not the actual value) of the largest
> >> coordinate of a normal, for some space hashing.
> >>
> >> I'm not sure how fast you guys usually find this index, but I've just
> >> created the following trick which I think is quite fast:
> >>
> >>> inline uint LargestCoordinate(const Vector3d &v)
> >>> {
> >>>     const float x = fabs(v.x);
> >>>     const float z = fabs(v.z);
> >>>     const float y = Maths::max( fabs(v.y), z );
> >>>     return uint(fabs(y)>fabs(x)) << uint(fabs(z)>=fabs(y));
> >>> }
> >>
> >> I didn't need such function before, so I'm not sure if this is
> >> considered fast or slow. Do you guys have something faster?
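
A rough sketch of the SSE route described in the message above, assuming an x86 target with SSE and a plain array of three floats (the function name `maxAbsAxisSSE` is made up for illustration and is not from the thread). It squares all components in one multiply, swizzles, and does the three compares with a single packed compare. Instead of the AND / not-AND pair mentioned above, it pulls the compare results out with a movemask and combines them with the same integer expression used elsewhere in the thread.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 only)

// Hypothetical helper: index (0, 1, 2) of the component with the largest
// magnitude, using one packed multiply and one packed compare.
inline int maxAbsAxisSSE(const float v[3]) {
    // Lanes are (x, y, z, 0); _mm_set_ps takes arguments high-to-low.
    const __m128 a  = _mm_set_ps(0.0f, v[2], v[1], v[0]);
    const __m128 sq = _mm_mul_ps(a, a);  // squares stand in for fabs

    // lhs = (x2, x2, y2, 0), rhs = (y2, z2, z2, 0)
    const __m128 lhs = _mm_shuffle_ps(sq, sq, _MM_SHUFFLE(3, 1, 0, 0));
    const __m128 rhs = _mm_shuffle_ps(sq, sq, _MM_SHUFFLE(3, 2, 2, 1));

    // Bit 0: x2 < y2, bit 1: x2 < z2, bit 2: y2 < z2, bit 3: always 0.
    const int m  = _mm_movemask_ps(_mm_cmplt_ps(lhs, rhs));
    const int c0 = m & 1;
    const int c1 = (m >> 1) & 1;
    const int c2 = (m >> 2) & 1;
    return (c0 & ~c2) | ((c1 & c2) << 1);
}
```

Whether this beats the scalar version depends on the compiler and the CPU, so it is worth profiling before adopting it.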
From: <christer_ericson@pl...>  2009-03-02 17:28:35

Gino van den Bergen wrote:
> const int32_t* a = reinterpret_cast<const int32_t*>(&v[0]);
>
> [...] I still have to check whether this will work using
> gcc version 4 and up, since I'm not sure if I'm breaking the
> strict-aliasing rule here.

None of the posted "solutions" that access the float bits using an integer pointer are legit as they all treat an area of memory as having two different types at the same time (which is illegal).

Christer Ericson, Director of Tools and Technology
Sony Computer Entertainment, Santa Monica
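
One way to get at the float bits without the pointer cast objected to above is to copy the bytes with std::memcpy, which is well defined and which compilers typically reduce to a register move for a fixed 4-byte size. The sketch below assumes 32-bit IEEE 754 floats; the names `floatBits` and `maxAxisBits` are made up for illustration. As noted in the thread, the integer compare only matches the float compare when all inputs are non-negative.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the object representation of a float into a 32-bit integer.
inline std::int32_t floatBits(float f) {
    static_assert(sizeof(float) == sizeof(std::int32_t),
                  "sketch assumes 32-bit floats");
    std::int32_t i;
    std::memcpy(&i, &f, sizeof i);
    return i;
}

// Largest-component index via integer compares. Only meaningful when all
// three inputs have a clear sign bit and none of them is NaN.
inline int maxAxisBits(const float v[3]) {
    const std::int32_t a0 = floatBits(v[0]);
    const std::int32_t a1 = floatBits(v[1]);
    const std::int32_t a2 = floatBits(v[2]);
    assert(a0 >= 0 && a1 >= 0 && a2 >= 0);  // precondition: sign bits clear
    const int c0 = a0 < a1;
    const int c1 = a0 < a2;
    const int c2 = a1 < a2;
    return (c0 & ~c2) | ((c1 & c2) << 1);
}
```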
From: Sylvain G. Vignaud <vignsyl@ii...>  2009-03-02 15:56:22

And the LargerCoordinate version, using abs values:

unsigned int LargerCoordinate( const Vector3d &vf )
{
    const unsigned int SignMask = 0x80000000;
    const unsigned long *vi = (const unsigned long*)&vf.x;
    unsigned int x = vi[0] & ~SignMask;
    unsigned int y = vi[1] & ~SignMask;
    unsigned int z = vi[2] & ~SignMask;
#if 1
    unsigned int zx = (x-z)>>31;
    unsigned int zy = (y-z)>>31;
    unsigned int yx = (x-y)>>31;
    unsigned int yz = ~zy; // (z-y)>>31;
#else
    unsigned int zx = z>x;
    unsigned int zy = z>y;
    unsigned int yx = y>x;
    unsigned int yz = ~zy; // y>z;
#endif
    return ((zx & zy)<<1) | (yx&yz);
}

004010CC  mov   ecx,dword ptr [esp+34h]
004010D0  mov   edx,dword ptr [esp+38h]
004010D4  mov   eax,dword ptr [esp+3Ch]
004010D8  and   edx,7FFFFFFFh
004010DE  and   eax,7FFFFFFFh
004010E3  and   ecx,7FFFFFFFh
004010E9  mov   ebx,ecx
004010EB  sub   ebx,eax
004010ED  mov   edi,edx
004010EF  sub   edi,eax
004010F1  shr   edi,1Fh
004010F4  shr   ebx,1Fh
004010F7  and   edi,ebx
004010F9  sub   eax,edx
004010FB  add   edi,edi
004010FD  shr   eax,1Fh
00401100  sub   ecx,edx
00401102  shr   ecx,1Fh
00401105  and   eax,ecx
00401107  or    edi,eax
00401109  push  edi

From: Gino van den Bergen <gino.vandenbergen@...>
> I guess you're right (I'm referring to Marc Hernandez). Assigning the
> result of a float compare to a bool introduces a branch in VC8 as well.
> How about this trick then
>
> template <>
> FORCEINLINE
> int maxAxis(const Vector3<float>& v)
> {
>     const int32_t* a = reinterpret_cast<const int32_t*>(&v[0]);
>     int c0 = a[0] < a[1];
>     int c1 = a[0] < a[2];
>     int c2 = a[1] < a[2];
>     return (c0 & ~c2) | ((c1 & c2) << 1);
> }
>
> If we take the binary values of the floats and do an integer compare on
> them the result should be equal to the float compare, that is, for
> non-negative floats. I still have to check whether this will work using
> gcc version 4 and up, since I'm not sure if I'm breaking the
> strict-aliasing rule here.
>
> Gino
>
>
> Gino van den Bergen wrote:
> > I would like to share my approach. This code is copy-pasted straight
> > from my Vector3 class template so it may look a bit cluttered but I
> > hope the idea comes across:
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int maxAxis(const Vector3<Scalar>& a)
> > {
> >     int c0 = a[0] < a[1];
> >     int c1 = a[0] < a[2];
> >     int c2 = a[1] < a[2];
> >     return (c0 & ~c2) | ((c1 & c2) << 1);
> > }
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int minAxis(const Vector3<Scalar>& a)
> > {
> >     int c0 = a[1] < a[0];
> >     int c1 = a[2] < a[0];
> >     int c2 = a[2] < a[1];
> >     return (c0 & ~c2) | ((c1 & c2) << 1);
> > }
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int closestAxis(const Vector3<Scalar>& a)
> > {
> >     return maxAxis(a * a);
> > }
> >
> > template <typename Scalar>
> > FORCEINLINE
> > int furthestAxis(const Vector3<Scalar>& a)
> > {
> >     return minAxis(a * a);
> > }
> >
> > The function minAxis and maxAxis return a value 0, 1, or 2, so the
> > result only needs two bits (00, 01, and 10 in base-2). The first term
> > of the "|" operator is bit 0 and the second term (the one with the <<
> > 1) is bit 1. The nice thing about this approach is the fact that it's
> > branchless. Three boolean values are computed but they are never used
> > to branch, so no code-cache misses can happen here.
> >
> > For finding the minimum or maximum *absolute* value I do not use
> > "fabs" but I rather multiply the vector with itself, thus a * a =
> > (a.x * a.x, a.y * a.y, a.z * a.z). "closestAxis" returns the world
> > axis that is closest (as in most parallel) to vector "a".
> > "furthestAxis" returns the most orthogonal world axis.
> >
> > Cheers,
> >
> > Gino
> >
> >
> > Sylvain G. Vignaud wrote:
> >> Hi,
> >>
> >> I need to compute the index (not the actual value) of the largest
> >> coordinate of a normal, for some space hashing.
> >>
> >> I'm not sure how fast you guys usually find this index, but I've just
> >> created the following trick which I think is quite fast:
> >>
> >>> inline uint LargestCoordinate(const Vector3d &v)
> >>> {
> >>>     const float x = fabs(v.x);
> >>>     const float z = fabs(v.z);
> >>>     const float y = Maths::max( fabs(v.y), z );
> >>>     return uint(fabs(y)>fabs(x)) << uint(fabs(z)>=fabs(y));
> >>> }
> >>
> >> I didn't need such function before, so I'm not sure if this is
> >> considered fast or slow. Do you guys have something faster?
> >>
> >> Open Source Business Conference (OSBC), March 24-25, 2009, San
> >> Francisco, CA
> >> OSBC tackles the biggest issue in open source: Open Sourcing the
> >> Enterprise
> >> Strategies to boost innovation and cut costs with open source
> >> participation
> >> Receive a $600 discount off the registration fee with the source
> >> code: SFAD
> >> http://p.sf.net/sfu/XcvMzF8H
> >> _______________________________________________
> >> GDAlgorithmslist mailing list
> >> GDAlgorithmslist@...
> >> https://lists.sourceforge.net/lists/listinfo/gdalgorithmslist
> >> Archives:
> >> http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithmslist
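
A hedged, self-contained take on the sign-mask idea from the message above, assuming 32-bit IEEE 754 floats and a plain array of three floats (`absBits` and `largestAbsCoordinate` are illustrative names, not from the thread). It uses std::uint32_t rather than unsigned long, because unsigned long is 64 bits wide on LP64 platforms such as 64-bit Linux, and it copies the bits with std::memcpy instead of casting the pointer.

```cpp
#include <cstdint>
#include <cstring>

// Bit pattern of |f|: copy the float's bytes and clear the sign bit.
inline std::uint32_t absBits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes 32-bit floats");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u & 0x7FFFFFFFu;
}

// Index (0, 1, 2) of the coordinate with the largest magnitude.
inline unsigned largestAbsCoordinate(const float v[3]) {
    const std::uint32_t x = absBits(v[0]);
    const std::uint32_t y = absBits(v[1]);
    const std::uint32_t z = absBits(v[2]);
    // With the sign bit cleared every value is below 2^31, so the top bit
    // of the wrapped difference (a - b) is 1 exactly when a < b.
    const std::uint32_t zx = (x - z) >> 31;  // 1 if z > x
    const std::uint32_t zy = (y - z) >> 31;  // 1 if z > y
    const std::uint32_t yx = (x - y) >> 31;  // 1 if y > x
    const std::uint32_t yz = zy ^ 1u;        // 1 if y >= z
    return ((zx & zy) << 1) | (yx & yz);
}
```

For example, the input (1, -3, 2) yields index 1, since the y coordinate has the largest magnitude.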
From: Gino van den Bergen <gino.vandenbergen@gm...>  2009-03-02 12:29:24

I guess you're right (I'm referring to Marc Hernandez). Assigning the result of a float compare to a bool introduces a branch in VC8 as well. How about this trick then

template <>
FORCEINLINE
int maxAxis(const Vector3<float>& v)
{
    const int32_t* a = reinterpret_cast<const int32_t*>(&v[0]);
    int c0 = a[0] < a[1];
    int c1 = a[0] < a[2];
    int c2 = a[1] < a[2];
    return (c0 & ~c2) | ((c1 & c2) << 1);
}

If we take the binary values of the floats and do an integer compare on them the result should be equal to the float compare, that is, for non-negative floats. I still have to check whether this will work using gcc version 4 and up, since I'm not sure if I'm breaking the strict-aliasing rule here.

Gino


Gino van den Bergen wrote:
> I would like to share my approach. This code is copy-pasted straight
> from my Vector3 class template so it may look a bit cluttered but I
> hope the idea comes across:
>
> template <typename Scalar>
> FORCEINLINE
> int maxAxis(const Vector3<Scalar>& a)
> {
>     int c0 = a[0] < a[1];
>     int c1 = a[0] < a[2];
>     int c2 = a[1] < a[2];
>     return (c0 & ~c2) | ((c1 & c2) << 1);
> }
>
> template <typename Scalar>
> FORCEINLINE
> int minAxis(const Vector3<Scalar>& a)
> {
>     int c0 = a[1] < a[0];
>     int c1 = a[2] < a[0];
>     int c2 = a[2] < a[1];
>     return (c0 & ~c2) | ((c1 & c2) << 1);
> }
>
> template <typename Scalar>
> FORCEINLINE
> int closestAxis(const Vector3<Scalar>& a)
> {
>     return maxAxis(a * a);
> }
>
> template <typename Scalar>
> FORCEINLINE
> int furthestAxis(const Vector3<Scalar>& a)
> {
>     return minAxis(a * a);
> }
>
> The function minAxis and maxAxis return a value 0, 1, or 2, so the
> result only needs two bits (00, 01, and 10 in base-2). The first term
> of the "|" operator is bit 0 and the second term (the one with the <<
> 1) is bit 1. The nice thing about this approach is the fact that it's
> branchless. Three boolean values are computed but they are never used
> to branch, so no code-cache misses can happen here.
>
> For finding the minimum or maximum *absolute* value I do not use
> "fabs" but I rather multiply the vector with itself, thus a * a =
> (a.x * a.x, a.y * a.y, a.z * a.z). "closestAxis" returns the world
> axis that is closest (as in most parallel) to vector "a".
> "furthestAxis" returns the most orthogonal world axis.
>
> Cheers,
>
> Gino
>
>
> Sylvain G. Vignaud wrote:
>> Hi,
>>
>> I need to compute the index (not the actual value) of the largest
>> coordinate of a normal, for some space hashing.
>>
>> I'm not sure how fast you guys usually find this index, but I've just
>> created the following trick which I think is quite fast:
>>
>>> inline uint LargestCoordinate(const Vector3d &v)
>>> {
>>>     const float x = fabs(v.x);
>>>     const float z = fabs(v.z);
>>>     const float y = Maths::max( fabs(v.y), z );
>>>     return uint(fabs(y)>fabs(x)) << uint(fabs(z)>=fabs(y));
>>> }
>>
>> I didn't need such function before, so I'm not sure if this is
>> considered fast or slow. Do you guys have something faster?
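
For readers who want to try the compare-and-combine approach quoted above outside the original class template, here is a minimal sketch. `Vec3` is a hypothetical stand-in for the Vector3 template, which is not shown in the thread. Bit 0 of the result is set when y is the largest component and bit 1 when z is, so ties resolve toward the lower index. Whether the compiler really emits branch-free code for the float compares is not guaranteed and, as the thread notes, varies between compilers.

```cpp
#include <cstddef>

// Hypothetical stand-in for the Vector3 template discussed in the thread.
struct Vec3 {
    float c[3];
    float operator[](std::size_t i) const { return c[i]; }
};

// Index (0, 1, or 2) of the largest component.
inline int maxAxis(const Vec3& a) {
    const int c0 = a[0] < a[1];  // x < y
    const int c1 = a[0] < a[2];  // x < z
    const int c2 = a[1] < a[2];  // y < z
    // Bit 0: y beats x and is not beaten by z. Bit 1: z beats both.
    return (c0 & ~c2) | ((c1 & c2) << 1);
}

// Index of the component with the largest magnitude; squaring each
// component takes the place of fabs.
inline int closestAxis(const Vec3& a) {
    const Vec3 sq = {{a[0] * a[0], a[1] * a[1], a[2] * a[2]}};
    return maxAxis(sq);
}
```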