You can subscribe to this list here.
2000 
_{Jan}

_{Feb}

_{Mar}

_{Apr}

_{May}

_{Jun}

_{Jul}
(390) 
_{Aug}
(767) 
_{Sep}
(940) 
_{Oct}
(964) 
_{Nov}
(819) 
_{Dec}
(762) 

2001 
_{Jan}
(680) 
_{Feb}
(1075) 
_{Mar}
(954) 
_{Apr}
(595) 
_{May}
(725) 
_{Jun}
(868) 
_{Jul}
(678) 
_{Aug}
(785) 
_{Sep}
(410) 
_{Oct}
(395) 
_{Nov}
(374) 
_{Dec}
(419) 
2002 
_{Jan}
(699) 
_{Feb}
(501) 
_{Mar}
(311) 
_{Apr}
(334) 
_{May}
(501) 
_{Jun}
(507) 
_{Jul}
(441) 
_{Aug}
(395) 
_{Sep}
(540) 
_{Oct}
(416) 
_{Nov}
(369) 
_{Dec}
(373) 
2003 
_{Jan}
(514) 
_{Feb}
(488) 
_{Mar}
(396) 
_{Apr}
(624) 
_{May}
(590) 
_{Jun}
(562) 
_{Jul}
(546) 
_{Aug}
(463) 
_{Sep}
(389) 
_{Oct}
(399) 
_{Nov}
(333) 
_{Dec}
(449) 
2004 
_{Jan}
(317) 
_{Feb}
(395) 
_{Mar}
(136) 
_{Apr}
(338) 
_{May}
(488) 
_{Jun}
(306) 
_{Jul}
(266) 
_{Aug}
(424) 
_{Sep}
(502) 
_{Oct}
(170) 
_{Nov}
(170) 
_{Dec}
(134) 
2005 
_{Jan}
(249) 
_{Feb}
(109) 
_{Mar}
(119) 
_{Apr}
(282) 
_{May}
(82) 
_{Jun}
(113) 
_{Jul}
(56) 
_{Aug}
(160) 
_{Sep}
(89) 
_{Oct}
(98) 
_{Nov}
(237) 
_{Dec}
(297) 
2006 
_{Jan}
(151) 
_{Feb}
(250) 
_{Mar}
(222) 
_{Apr}
(147) 
_{May}
(266) 
_{Jun}
(313) 
_{Jul}
(367) 
_{Aug}
(135) 
_{Sep}
(108) 
_{Oct}
(110) 
_{Nov}
(220) 
_{Dec}
(47) 
2007 
_{Jan}
(133) 
_{Feb}
(144) 
_{Mar}
(247) 
_{Apr}
(191) 
_{May}
(191) 
_{Jun}
(171) 
_{Jul}
(160) 
_{Aug}
(51) 
_{Sep}
(125) 
_{Oct}
(115) 
_{Nov}
(78) 
_{Dec}
(67) 
2008 
_{Jan}
(165) 
_{Feb}
(37) 
_{Mar}
(130) 
_{Apr}
(111) 
_{May}
(91) 
_{Jun}
(142) 
_{Jul}
(54) 
_{Aug}
(104) 
_{Sep}
(89) 
_{Oct}
(87) 
_{Nov}
(44) 
_{Dec}
(54) 
2009 
_{Jan}
(283) 
_{Feb}
(113) 
_{Mar}
(154) 
_{Apr}
(395) 
_{May}
(62) 
_{Jun}
(48) 
_{Jul}
(52) 
_{Aug}
(54) 
_{Sep}
(131) 
_{Oct}
(29) 
_{Nov}
(32) 
_{Dec}
(37) 
2010 
_{Jan}
(34) 
_{Feb}
(36) 
_{Mar}
(40) 
_{Apr}
(23) 
_{May}
(38) 
_{Jun}
(34) 
_{Jul}
(36) 
_{Aug}
(27) 
_{Sep}
(9) 
_{Oct}
(18) 
_{Nov}
(25) 
_{Dec}

2011 
_{Jan}
(1) 
_{Feb}
(14) 
_{Mar}
(1) 
_{Apr}
(5) 
_{May}
(1) 
_{Jun}

_{Jul}

_{Aug}
(37) 
_{Sep}
(6) 
_{Oct}
(2) 
_{Nov}

_{Dec}

2012 
_{Jan}

_{Feb}
(7) 
_{Mar}

_{Apr}
(4) 
_{May}

_{Jun}
(3) 
_{Jul}

_{Aug}

_{Sep}
(1) 
_{Oct}

_{Nov}

_{Dec}
(10) 
2013 
_{Jan}

_{Feb}
(1) 
_{Mar}
(7) 
_{Apr}
(2) 
_{May}

_{Jun}

_{Jul}
(9) 
_{Aug}

_{Sep}

_{Oct}

_{Nov}

_{Dec}

2014 
_{Jan}
(14) 
_{Feb}

_{Mar}
(2) 
_{Apr}

_{May}
(10) 
_{Jun}

_{Jul}

_{Aug}

_{Sep}

_{Oct}

_{Nov}
(3) 
_{Dec}

2015 
_{Jan}

_{Feb}

_{Mar}

_{Apr}

_{May}

_{Jun}

_{Jul}

_{Aug}

_{Sep}

_{Oct}
(12) 
_{Nov}

_{Dec}
(1) 
2016 
_{Jan}

_{Feb}
(1) 
_{Mar}
(1) 
_{Apr}
(1) 
_{May}

_{Jun}
(1) 
_{Jul}

_{Aug}
(1) 
_{Sep}

_{Oct}

_{Nov}

_{Dec}

S  M  T  W  T  F  S 

1

2

3

4

5

6
(1) 
7
(3) 
8
(4) 
9
(3) 
10
(11) 
11
(19) 
12
(17) 
13
(5) 
14
(6) 
15
(1) 
16

17
(3) 
18
(7) 
19
(8) 
20

21

22

23

24
(2) 
25
(3) 
26
(4) 
27
(12) 
28
(4) 
From: Matt J <mjohnson2005@gm...>  20090228 23:16:38

I did some profiling on MSVC 2009, and integer methods are ~10% faster: unsigned int x = 0x7FFFFFFF & *(unsigned int *)&n.x; unsigned int y = 0x7FFFFFFF & *(unsigned int *)&n.y; unsigned int z = 0x7FFFFFFF & *(unsigned int *)&n.z; return (x > y) ? (x > z) ? 0 : 2 : (y > z) ? 1 : 2; For some odd reason only when using integers can it simplify the two inner ternary methods using bit 'magic'. If we use floats or use SSE the integer methods are about ~5% faster because this compiler generates a bunch of jumps. The method above (as well as mine) generates 1 jump. I actually eliminated all jumps but since I use one too many registers it slows it down. I think the CPU is smart enough to look ahead if you get down to one conditional ..so yeah I think if you use the stack it can slow things down slightly but you do see a gain if you use integers. I assume conditional jumps would be the best way, assuming the compiler can do it depth first because you need 3 conditionals  the inner two, and then a conditional move on the results of those two Asm: ; _n$ = edx ; 61 : unsigned int x = 0x7FFFFFFF & *(unsigned int *)&n.x; mov eax, DWORD PTR [edx] ; 62 : unsigned int y = 0x7FFFFFFF & *(unsigned int *)&n.y; mov ecx, DWORD PTR [edx+4] ; 63 : unsigned int z = 0x7FFFFFFF & *(unsigned int *)&n.z; mov edx, DWORD PTR [edx+8] and eax, 2147483647 ; 7fffffffH and ecx, 2147483647 ; 7fffffffH and edx, 2147483647 ; 7fffffffH ; 69 : return (x > y) ? (x > z) ? 0 : 2 : (y > z) ? 1 : 2; cmp eax, ecx jbe SHORT $LN3@... cmp edx, eax sbb eax, eax and eax, 2 ; fffffffeH add eax, 2 ; 70 : } ret 0 $LN3@...: ; 69 : return (x > y) ? (x > z) ? 0 : 2 : (y > z) ? 1 : 2; cmp edx, ecx sbb eax, eax add eax, 2 ; 70 : } ret 
From: Jon Watte <jwatte@gm...>  20090228 06:51:49

Matt J wrote: > > Yeah, okay, i get it, i got it backwards. I just said that =) > > Anyway, here another way that only has two conditionals, no shifts, > and no max's and easier to follow. Not sure if it is faster though. > The FP uses a stack either way, right? I think this will work but > never tested x = fabsf(x); y = fabsf(y); z = fabsf(z); if (x > y) if (x > z) return 0; else return 2; else if (y > z) return 1; else return 2; This is the canonical case which I already posted. It only has two conditionals in the path of execution. It also doesn't use indirect addressing, which may cause aliasing confusion, and forces data out of registers. If you compile with intrinsics, then fabsf is a single instruction and keeps data in registers. I think you'll find it hard to beat. Sincerely, jw 
From: Sylvain G. Vignaud <vignsyl@ii...>  20090228 06:43:42

From: Stefan Sandberg <keffo.sandberg@...> > By all means, elaborate on what you're doing, if you don't mind... :) Well I was simply trying to get rid of as many computation as possible in my justforfun homeproject of realtime global illumination approximation. I'm computing that on the CPU, per vertex. I'm of course using an octree to find out which vertexes "see" each other (then computing GI is simply computing vertex colors and propagating them to connected vertexes). The idea behind my previous question was to split each octree cells depending on the normal of the vertexes. Actually I'm now using a very simplified way to do so: > RoundingOf(1+normal.x) + (normal.z<0.0f ? 3 : 0) I don't split my octree cells depending on the normal.y for simplicity. Now I'm down to 1.53 seconds for the Sponza Atrium, and 3.16 seconds for the Sibenik Cathedral  models from Marko Dabrovic available at: http://hdri.cgtechniques.com/~sponza/ (these models are not very clean: lots of duplicated vertexes, not shared edges, holed, inverted faces creating holes...) My project is still in a very early stage, so the results are not that impressive yet ; with only one light in the scene: fakedGI: http://tfpsly.free.fr/Files/SiCath_gi.jpg Classical diffuse+ambient: http://tfpsly.free.fr/Files/SiCath_ambient.jpg Only the diffuse: http://tfpsly.free.fr/Files/SiCath_noambient.jpg > Sylvain G. Vignaud wrote: > > From: Jon Watte <jwatte@...> > > > >> Sylvain G. Vignaud wrote: > >> > >>> I didn't need such function before, so I'm not sure if this is > >>> considered fast or slow. Do you guys have something faster? > >>> > >>> > >> What hardware are you running on? Does it have conditional > moves? > >> Is it > >> inorder or outoforder? Does it have memory prefetch? > >> > > > > PC, current gen. It's for a real time pseuso global illumination > > attempt. I currently need a 10 second preprocess step to compute how > > vertexes "see" each other on the Sponza Atrium or the Sibenik > > Cathedrals from http://hdri.cgtechniques.com/~sponza/ > > Which means I can have dynamic lights, but my models are to be > > static. I'd like to see how much I could speed that up, even if > > dynamic meshes seem quite far away. } 
From: Matt J <mjohnson2005@gm...>  20090228 02:32:35

Yeah, okay, i get it, i got it backwards. I just said that =) Anyway, here another way that only has two conditionals, no shifts, and no max's and easier to follow. Not sure if it is faster though. The FP uses a stack either way, right? I think this will work but never tested const float f[3] = { fabsf(n.x), fabsf(n.y), fabsf(n.z) }; unsigned int i = ( f[1] > f[0] ) ? 1 : 0; return (f[2] > f[i]) ? 2 : i; ..........Finally, an all integer way: const unsigned int u[3] = { 0x7FFFFFFF & *(unsigned int *)&n.x, 0x7FFFFFFF & *(unsigned int *)&n.y, 0x7FFFFFFF & *(unsigned int *)&n.z }; unsigned int i = ( u[1] > u[0] ) ? 1 : 0; return (u[2] > u[i]) ? 2 : i; If (y > x) == 0 then that means x <= y. > nope: > If (y > x) == 0 then that means x >= y. > > 2009/2/27 Matt J <mjohnson2005@...> > >> >> Hmm, I fail to see how this works: >> >> If (y > x) == 0 then that means x <= y. >> >> So, >> >> If x <= y and z < y then 0 << 0 = 0 >> If x <= y and z >= y then 0 << 1 = 0 >> >> In the first case, y is the greatest component In the second case, z is >> the greatest component. >> >> Both are encoded as 0. Something I'm missing? >> >> >> Hi, >>> >>> I need to compute the index (not the actual value) of the largest >>> coordinate of a normal, for some space hashing. >>> >>> I'm not sure how fast you guys usually find this index, but I've just >>> created the following trick which I think is quite fast: >>> >>> > inline uint LargestCoordinate(const Vector3d &v) >>> > { >>> > const float x = fabs(v.x); >>> > const float z = fabs(v.z); >>> > const float y = Maths::max( fabs(v.y), z ); >>> > return uint(fabs(y)>fabs(x)) << uint(fabs(z)>=fabs(y)); >>> > } >>> >> 
From: Richard Fabian <raspo1@gm...>  20090227 21:56:58

If (y > x) == 0 then that means x <= y. nope: If (y > x) == 0 then that means x >= y. 2009/2/27 Matt J <mjohnson2005@...> > > Hmm, I fail to see how this works: > > If (y > x) == 0 then that means x <= y. > > So, > > If x <= y and z < y then 0 << 0 = 0 > If x <= y and z >= y then 0 << 1 = 0 > > In the first case, y is the greatest component In the second case, z is the > greatest component. > > Both are encoded as 0. Something I'm missing? > > > Hi, >> >> I need to compute the index (not the actual value) of the largest >> coordinate of a normal, for some space hashing. >> >> I'm not sure how fast you guys usually find this index, but I've just >> created the following trick which I think is quite fast: >> >> > inline uint LargestCoordinate(const Vector3d &v) >> > { >> > const float x = fabs(v.x); >> > const float z = fabs(v.z); >> > const float y = Maths::max( fabs(v.y), z ); >> > return uint(fabs(y)>fabs(x)) << uint(fabs(z)>=fabs(y)); >> > } >> > > >  > Open Source Business Conference (OSBC), March 2425, 2009, San Francisco, > CA > OSBC tackles the biggest issue in open source: Open Sourcing the > Enterprise > Strategies to boost innovation and cut costs with open source > participation > Receive a $600 discount off the registration fee with the source code: > SFAD > http://p.sf.net/sfu/XcvMzF8H > _______________________________________________ > GDAlgorithmslist mailing list > GDAlgorithmslist@... > https://lists.sourceforge.net/lists/listinfo/gdalgorithmslist > Archives: > http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithmslist >  fabs(); Just because the world is full of people that think just like you, doesn't mean the other ones can't be right. 
From: Matt J <mjohnson2005@gm...>  20090227 17:11:53

Ah, yeah. And !(y > x) is !(y <= x) e.g. x >= y i forgot to double negate in my head. > > > The max. > > That formula has more fabs than sense. A quick rewrite would be: > x = abs(v.x); y = abs(v.y); z = abs(v.z) > return (x < max(y,z)) << (z >= y) > > which is obviously correct. > > OG. > 
From: Matt J <mjohnson2005@gm...>  20090227 16:52:20

oops, ignore the *(float *) part > That's a good trick, something like this? > > unsigned int x = 0x7FFFFFFF & *(unsigned int *)&n.x; > unsigned int y = 0x7FFFFFFF & *(unsigned int *)&n.y; unsigned int z > = 0x7FFFFFFF & *(unsigned int *)&n.z; > > return *(float *)((x > y) ? (x > z) ? 0 : 2 : (y > z) ? 1 : 2); > > Actually, that could technically pipeline well because your mixing ints and > floats. This int trick won't work with NaN's, though, so you might want to > add a debug assertion to see if the float uses NaNs to be more robust > > Finally, compilers don't need cmov to perform a conditional move > optimization, see: http://www.nynaeve.net/?p=178 > > You should check the asm output to see what your optimizing compiler does.. > also are you targeting it with the right instruction set (conditional moves > began with Pentium Pro i think) > > >> >> I've never found this to show up in any profile, and on x86, I believe a >> competent compiler generates cmov. >> I guess you could even do it with ints where you mask away the top bit, >> if I knew that type aliasing wouldn't kill me. >> >> Sincerely, >> >> jw >> >   Matt Johnson http://otowngraphics.blogspot.com 
From: Matt J <mjohnson2005@gm...>  20090227 16:49:19

That's a good trick, something like this? unsigned int x = 0x7FFFFFFF & *(unsigned int *)&n.x; unsigned int y = 0x7FFFFFFF & *(unsigned int *)&n.y; unsigned int z = 0x7FFFFFFF & *(unsigned int *)&n.z; return *(float *)((x > y) ? (x > z) ? 0 : 2 : (y > z) ? 1 : 2); Actually, that could technically pipeline well because your mixing ints and floats. This int trick won't work with NaN's, though, so you might want to add a debug assertion to see if the float uses NaNs to be more robust Finally, compilers don't need cmov to perform a conditional move optimization, see: http://www.nynaeve.net/?p=178 You should check the asm output to see what your optimizing compiler does.. also are you targeting it with the right instruction set (conditional moves began with Pentium Pro i think) > > I've never found this to show up in any profile, and on x86, I believe a > competent compiler generates cmov. > I guess you could even do it with ints where you mask away the top bit, > if I knew that type aliasing wouldn't kill me. > > Sincerely, > > jw > 
From: Stefan Sandberg <keffo.sandberg@gm...>  20090227 16:48:22

By all means, elaborate on what you're doing, if you don't mind... :) Sylvain G. Vignaud wrote: > From: Jon Watte <jwatte@...> > >> Sylvain G. Vignaud wrote: >> >>> I didn't need such function before, so I'm not sure if this is >>> considered fast or slow. Do you guys have something faster? >>> >>> >> What hardware are you running on? Does it have conditional moves? >> Is it >> inorder or outoforder? Does it have memory prefetch? >> > > PC, current gen. It's for a real time pseuso global illumination > attempt. I currently need a 10 second preprocess step to compute how > vertexes "see" each other on the Sponza Atrium or the Sibenik Cathedrals > from http://hdri.cgtechniques.com/~sponza/ > Which means I can have dynamic lights, but my models are to be static. > I'd like to see how much I could speed that up, even if dynamic meshes > seem quite far away. > > >  > Open Source Business Conference (OSBC), March 2425, 2009, San Francisco, CA > OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise > Strategies to boost innovation and cut costs with open source participation > Receive a $600 discount off the registration fee with the source code: SFAD > http://p.sf.net/sfu/XcvMzF8H > _______________________________________________ > GDAlgorithmslist mailing list > GDAlgorithmslist@... > https://lists.sourceforge.net/lists/listinfo/gdalgorithmslist > Archives: > http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithmslist > > 
From: Olivier Galibert <galibert@po...>  20090227 16:47:50

On Fri, Feb 27, 2009 at 11:02:58AM 0500, Matt J wrote: > Hmm, I fail to see how this works: > > If (y > x) == 0 then that means x <= y. > > So, > > If x <= y and z < y then 0 << 0 = 0 > If x <= y and z >= y then 0 << 1 = 0 > > In the first case, y is the greatest component In the second case, z is the > greatest component. > > Both are encoded as 0. Something I'm missing? The max. That formula has more fabs than sense. A quick rewrite would be: x = abs(v.x); y = abs(v.y); z = abs(v.z) return (x < max(y,z)) << (z >= y) which is obviously correct. OG. > > > Hi, > > > > I need to compute the index (not the actual value) of the largest > > coordinate of a normal, for some space hashing. > > > > I'm not sure how fast you guys usually find this index, but I've just > > created the following trick which I think is quite fast: > > > > > inline uint LargestCoordinate(const Vector3d &v) > > > { > > > const float x = fabs(v.x); > > > const float z = fabs(v.z); > > > const float y = Maths::max( fabs(v.y), z ); > > > return uint(fabs(y)>fabs(x)) << uint(fabs(z)>=fabs(y)); > > > } > > >  > Open Source Business Conference (OSBC), March 2425, 2009, San Francisco, CA > OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise > Strategies to boost innovation and cut costs with open source participation > Receive a $600 discount off the registration fee with the source code: SFAD > http://p.sf.net/sfu/XcvMzF8H > _______________________________________________ > GDAlgorithmslist mailing list > GDAlgorithmslist@... > https://lists.sourceforge.net/lists/listinfo/gdalgorithmslist > Archives: > http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithmslist 
From: Sylvain G. Vignaud <vignsyl@ii...>  20090227 16:28:38

From: Jon Watte <jwatte@...> > Sylvain G. Vignaud wrote: > > I didn't need such function before, so I'm not sure if this is > > considered fast or slow. Do you guys have something faster? > > > > What hardware are you running on? Does it have conditional moves? > Is it > inorder or outoforder? Does it have memory prefetch? PC, current gen. It's for a real time pseuso global illumination attempt. I currently need a 10 second preprocess step to compute how vertexes "see" each other on the Sponza Atrium or the Sibenik Cathedrals from http://hdri.cgtechniques.com/~sponza/ Which means I can have dynamic lights, but my models are to be static. I'd like to see how much I could speed that up, even if dynamic meshes seem quite far away. 
From: Sylvain G. Vignaud <vignsyl@ii...>  20090227 16:23:29

Test is: uint(y>x) << uint(z>=y)  don't know why I let all these useless fabs in the previous email. If largest is: * X: 0 << ? = 0 as y<x (and we don't care about the second test) * Y: 1 << 0 = 1 as y>x and z<y * Z: 1 << 1 = 2 as y is replaced with z, so the tests are z>x and z>=z From: Matt J <mjohnson2005@...> > Hmm, I fail to see how this works: > > If (y > x) == 0 then that means x <= y. !!!WRONG!!! it means that x >= y > So, > > If x <= y and z < y then 0 << 0 = 0 > If x <= y and z >= y then 0 << 1 = 0 > > In the first case, y is the greatest component In the second case, > z is the > greatest component. > > Both are encoded as 0. Something I'm missing? > > > Hi, > > > > I need to compute the index (not the actual value) of the largest > > coordinate of a normal, for some space hashing. > > > > I'm not sure how fast you guys usually find this index, but I've > just> created the following trick which I think is quite fast: > > > > > inline uint LargestCoordinate(const Vector3d &v) > > > { > > > const float x = fabs(v.x); > > > const float z = fabs(v.z); > > > const float y = Maths::max( fabs(v.y), z ); > > > return uint(fabs(y)>fabs(x)) << uint(fabs(z)>=fabs(y)); > > > } > > > 
From: Matt J <mjohnson2005@gm...>  20090227 16:03:04

Hmm, I fail to see how this works: If (y > x) == 0 then that means x <= y. So, If x <= y and z < y then 0 << 0 = 0 If x <= y and z >= y then 0 << 1 = 0 In the first case, y is the greatest component In the second case, z is the greatest component. Both are encoded as 0. Something I'm missing? Hi, > > I need to compute the index (not the actual value) of the largest > coordinate of a normal, for some space hashing. > > I'm not sure how fast you guys usually find this index, but I've just > created the following trick which I think is quite fast: > > > inline uint LargestCoordinate(const Vector3d &v) > > { > > const float x = fabs(v.x); > > const float z = fabs(v.z); > > const float y = Maths::max( fabs(v.y), z ); > > return uint(fabs(y)>fabs(x)) << uint(fabs(z)>=fabs(y)); > > } > 
From: Alen Ladavac <alenlml@cr...>  20090227 07:29:43

Jon wrote: > I've never found this to show up in any profile, and on x86, I believe a > competent compiler generates cmov. I got burned with this some time ago, so I'll throw in a warning: A competent compiler (e.g. icc) does indeed, but a very common one (msvc) does not (at least not in VC8.0, which was the latest I tested for that). The compiler team says that it does in some cases, but I haven't been able to make it do so. Ever. It certainly doesn't do so for x<y?x:y or for x<y?0:1. If someone does know how to do that, please let me know. Btw, I've actually had cases where the lack of cmov did show on the profile (BIH raycaster), and manually rewritten code using SSE (one float per register!) was significantly faster, so cache performance was not an issue there. Interestingly enough, in many cases like this, when I try to recompile with icc, it turns out its code is very similar to my handoptimized one. And to add one more idea for the OP, if you feel confident in the normality of the inputs, you can try using floatasinteger comparing tricks, but the speed is better only if you don't generate the input using float operations just before doing the compares (i.e. the input is already in memory). HTH, Alen 
From: Jon Watte <jwatte@gm...>  20090227 05:13:50

Sylvain G. Vignaud wrote: > I didn't need such function before, so I'm not sure if this is > considered fast or slow. Do you guys have something faster? > What hardware are you running on? Does it have conditional moves? Is it inorder or outoforder? Does it have memory prefetch? In general, I would guess that there just won't be a measurable difference compared to the time spent waiting for the normal to arrive out of DRAM. My code to do the same thing does: float x = fabsf(n.x); float y = fabsf(n.y); float z = fabsf(n.z); return (x > y) ? (x > z) ? 0 : 2 : (y > z) ? 1 : 2; I've never found this to show up in any profile, and on x86, I believe a competent compiler generates cmov. I guess you could even do it with ints where you mask away the top bit, if I knew that type aliasing wouldn't kill me. Sincerely, jw 
From: Sylvain G. Vignaud <vignsyl@ii...>  20090227 03:41:22

Hi, I need to compute the index (not the actual value) of the largest coordinate of a normal, for some space hashing. I'm not sure how fast you guys usually find this index, but I've just created the following trick which I think is quite fast: > inline uint LargestCoordinate(const Vector3d &v) > { > const float x = fabs(v.x); > const float z = fabs(v.z); > const float y = Maths::max( fabs(v.y), z ); > return uint(fabs(y)>fabs(x)) << uint(fabs(z)>=fabs(y)); > } I didn't need such function before, so I'm not sure if this is considered fast or slow. Do you guys have something faster? 
From: Jon Watte <jwatte@gm...>  20090226 18:41:48

Ruslan Shestopalyuk wrote: > How should I transform skeletons, animations, meshes etc. during the > export? If you have matrix Q, which transforms from DCC to engine, then you should apply Q at the top of your hierearchy. There are two ways to do this: 1) Add a new "phantom" node on top that has Q as its transform 2) Apply Q individually to each data point If you do 2, however, you have to make very sure to not apply Q twice, or in the wrong coordinate space (such as after other transforms have already been applied)! In general, doing 1 is a lot easier. However, I ended up doing 2) for my Max > X file exporter, which means that I have to actually apply T^ Q to the global vertex data, where T is the transform matrix leading to the mesh in the scene graph. Another gotcha is when exporting skinning, where you have to make sure that your vertex data, your bind pose, and your animated skeleton all end up assuming the same coordinate space. I would highly recommend 1) ! :) Sincerely, jw 
From: Ruslan Shestopalyuk <ruslans@fu...>  20090226 14:30:25

Quite often it happens that coordinate system used in DCC tool does not match the other tool's one, or is different from the one used in the game engine. Particular cases of that, which are encountered often, are left vs righthandedness, scaling the units, changing the direction of the up axis, rebasing the root transform for the model etc. Now, I would like to understand what would be the math behind solving the problem, when it is put in a general way, such as: * Given the transform matrix Q (it may include negative scale), which maps all the points in coordinate system A to the corresponding points in the target coordinate system B, specify the operations, which should be applied to:  world space transform matrices  parent space transform matrices  vertices (normals/tangents) for the meshes (skinned/static)  other possible parameters affected For example, to get the model into the engine's coordinate system, I would like to flip Z and Y axis, and then rotate model 180 degrees around Y axis, and also scale everything by 0.01. Also I'd like to shift the model's root into (0,0,0) from some position. How should I transform skeletons, animations, meshes etc. during the export? How the code could be organized, so if later I decide not to rotate the root 180 degrees, and scale by 0.1, it would affect only the definition of the matrix Q? Thanks, Ruslan Shestopalyuk 
From: Jon Watte <jwatte@gm...>  20090226 00:44:48

Damian Coventry wrote: > The paper is from 2001, so I'm sure many more techniques have come > along since them. What approach did you take to implement colliding > translating, rotating boxes against poly meshes? I just sweep the rotated box, I don't collide the actual "corkscrew" shape, so yes, I'm cheating :) Sincerely, jw 
From: Damian Coventry <damian.coventry@gm...>  20090226 00:31:20

I haven't performed any profiling; currently I have only a frame rate counter and the 'feel' from a player's perspective when loading up a fair amount of moving cylinders. Stan's paper has graphs, statistics and quite indepth testing that reports microsecond timings if you want the actual performance impact of this algorithm. The paper is from 2001, so I'm sure many more techniques have come along since them. What approach did you take to implement colliding translating, rotating boxes against poly meshes? ~Damian(); On Thu, Feb 26, 2009 at 12:48 PM, Jon Watte <jwatte@...> wrote: > Damian Coventry wrote: > > Jon Watte: This technique solves more than just spheres. Yes, for just > > spheres, you don't need to jump through these hoops. With this > > technique you can implement any convex shape for which you can > > calculate a point tangential to a plane. So an AABB, a cylinder, and > > yes a sphere too. I suppose if you had a translating, rotating box > > you collide that too. > > But I already collide translating, rotating boxes against poly meshes. > Is there really a noticeable speed improvement from not having to do the > final box/poly test, assuming the BSP does all the broadphase culling? > > Sincerely, > > jw > > > >  > Open Source Business Conference (OSBC), March 2425, 2009, San Francisco, > CA > OSBC tackles the biggest issue in open source: Open Sourcing the > Enterprise > Strategies to boost innovation and cut costs with open source > participation > Receive a $600 discount off the registration fee with the source code: > SFAD > http://p.sf.net/sfu/XcvMzF8H > _______________________________________________ > GDAlgorithmslist mailing list > GDAlgorithmslist@... > https://lists.sourceforge.net/lists/listinfo/gdalgorithmslist > Archives: > http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithmslist > 
From: Jon Watte <jwatte@gm...>  20090225 23:48:59

Damian Coventry wrote: > Jon Watte: This technique solves more than just spheres. Yes, for just > spheres, you don't need to jump through these hoops. With this > technique you can implement any convex shape for which you can > calculate a point tangential to a plane. So an AABB, a cylinder, and > yes a sphere too. I suppose if you had a translating, rotating box > you collide that too. But I already collide translating, rotating boxes against poly meshes. Is there really a noticeable speed improvement from not having to do the final box/poly test, assuming the BSP does all the broadphase culling? Sincerely, jw 
From: Damian Coventry <damian.coventry@gm...>  20090225 23:23:01

Jay Stelly: You're right, it's not going to nicely map to a sphere's surface, it's just an approximation. It's a fast approximation, which is the real selling point  you get reasonable collisions for not much CPU. Since my last email to this list I have been successful with this technique and have the socalled 'zero volume cells' algorithm working. It kicks butt. I have suffered no noticeable performance hit and I'm colliding spheres and cylinders nicely. Jon Watte: This technique solves more than just spheres. Yes, for just spheres, you don't need to jump through these hoops. With this technique you can implement any convex shape for which you can calculate a point tangential to a plane. So an AABB, a cylinder, and yes a sphere too. I suppose if you had a translating, rotating box you collide that too. ~Damian(); On Wed, Feb 25, 2009 at 6:19 PM, Gregory Junker <gjunker@...> wrote: > @ Jon  the OP is at the bottom; the person you are replying to is not the > OP. > > Greg > > > Original Message > > From: Jon Watte [mailto:jwatte@...] > > Sent: Tuesday, February 24, 2009 3:48 PM > > To: Game Development Algorithms > > Subject: Re: [Algorithms] Dynamic Plane Shifting BSP Traversal Problem > > > > Jay Stelly wrote: > > > The bevelling doesn't completely solve the problem though. In order > > > to shrink a sphere down to a point you need to spherically extend > > > (i.e. convolve with a sphere) the BSP geometry. Or to put it another > > > way  you need to add bevel planes to the BSP for every potential > > > separating axis. > > > > The word for that kind of expansion of geometry is "Minkowski sum". You > > get some Google hits on that, but most of them are not quite realtime > > or implementationoriented enough for games. > > > > Color me confused, though: I have previously collided spheres against > > geometry in a BSP tree, and it works just fine without the plane > > shifting  without any artifacts. Is there some other benefit you are > > trying to get out of the plane shifting than avoiding having to do > > sphere/polygon tests (i e, live with only the slightly faster > > sphere/plane tests)? > > It seems to me that the cost of sphere/poly tests, once culled based on > > a BSP, is such a small cost as to be negligible, and certainly not worth > > all the hours that have been spent on it already even on this list :) > > > > Sincerely, > > > > jw > > > > > > >  > >  > > Open Source Business Conference (OSBC), March 2425, 2009, San Francisco, > > CA > > OSBC tackles the biggest issue in open source: Open Sourcing the > > Enterprise > > Strategies to boost innovation and cut costs with open source > > participation > > Receive a $600 discount off the registration fee with the source code: > > SFAD > > http://p.sf.net/sfu/XcvMzF8H > > _______________________________________________ > > GDAlgorithmslist mailing list > > GDAlgorithmslist@... > > https://lists.sourceforge.net/lists/listinfo/gdalgorithmslist > > Archives: > > > http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithmslist > > > >  > Open Source Business Conference (OSBC), March 2425, 2009, San Francisco, > CA > OSBC tackles the biggest issue in open source: Open Sourcing the > Enterprise > Strategies to boost innovation and cut costs with open source > participation > Receive a $600 discount off the registration fee with the source code: > SFAD > http://p.sf.net/sfu/XcvMzF8H > _______________________________________________ > GDAlgorithmslist mailing list > GDAlgorithmslist@... > https://lists.sourceforge.net/lists/listinfo/gdalgorithmslist > Archives: > http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithmslist > 
From: Gregory Junker <gjunker@da...>  20090225 05:45:52

@ Jon  the OP is at the bottom; the person you are replying to is not the OP. Greg > Original Message > From: Jon Watte [mailto:jwatte@...] > Sent: Tuesday, February 24, 2009 3:48 PM > To: Game Development Algorithms > Subject: Re: [Algorithms] Dynamic Plane Shifting BSP Traversal Problem > > Jay Stelly wrote: > > The bevelling doesn't completely solve the problem though. In order > > to shrink a sphere down to a point you need to spherically extend > > (i.e. convolve with a sphere) the BSP geometry. Or to put it another > > way  you need to add bevel planes to the BSP for every potential > > separating axis. > > The word for that kind of expansion of geometry is "Minkowski sum". You > get some Google hits on that, but most of them are not quite realtime > or implementationoriented enough for games. > > Color me confused, though: I have previously collided spheres against > geometry in a BSP tree, and it works just fine without the plane > shifting  without any artifacts. Is there some other benefit you are > trying to get out of the plane shifting than avoiding having to do > sphere/polygon tests (i e, live with only the slightly faster > sphere/plane tests)? > It seems to me that the cost of sphere/poly tests, once culled based on > a BSP, is such a small cost as to be negligible, and certainly not worth > all the hours that have been spent on it already even on this list :) > > Sincerely, > > jw > > >  >  > Open Source Business Conference (OSBC), March 2425, 2009, San Francisco, > CA > OSBC tackles the biggest issue in open source: Open Sourcing the > Enterprise > Strategies to boost innovation and cut costs with open source > participation > Receive a $600 discount off the registration fee with the source code: > SFAD > http://p.sf.net/sfu/XcvMzF8H > _______________________________________________ > GDAlgorithmslist mailing list > GDAlgorithmslist@... > https://lists.sourceforge.net/lists/listinfo/gdalgorithmslist > Archives: > http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithmslist 
From: Jon Watte <jwatte@gm...>  20090224 23:48:37

Jay Stelly wrote: > The bevelling doesn't completely solve the problem though. In order > to shrink a sphere down to a point you need to spherically extend > (i.e. convolve with a sphere) the BSP geometry. Or to put it another > way  you need to add bevel planes to the BSP for every potential > separating axis. The word for that kind of expansion of geometry is "Minkowski sum". You get some Google hits on that, but most of them are not quite realtime or implementationoriented enough for games. Color me confused, though: I have previously collided spheres against geometry in a BSP tree, and it works just fine without the plane shifting  without any artifacts. Is there some other benefit you are trying to get out of the plane shifting than avoiding having to do sphere/polygon tests (i e, live with only the slightly faster sphere/plane tests)? It seems to me that the cost of sphere/poly tests, once culled based on a BSP, is such a small cost as to be negligible, and certainly not worth all the hours that have been spent on it already even on this list :) Sincerely, jw 
From: Jay Stelly <Jay@va...>  20090224 23:13:56

The bevelling doesn't completely solve the problem though. In order to shrink a sphere down to a point you need to spherically extend (i.e. convolve with a sphere) the BSP geometry. Or to put it another way  you need to add bevel planes to the BSP for every potential separating axis. For a polyhedron (instead of a sphere) this is finite and can be precomputed, but for a sphere you either need something other than a BSP or an infinite number of planes. Quake2 (and HalfLife 2) use this technique for boxes, adding bevel planes for each potential separating axis. You could possibly do the inifinite number of planes thing by adding some procedural bevel planes. So their normals would get computed as a function of the static features and the swept sphere query. I haven't implemented that but I think it would work. But you may end up solving quadratics in order to compute the normals which kind of defeats the simplicity of this technique. The actual collision manifold for the sphere expanded tree is curved along each edge and at each vertex of the original BSP. Anyway, the technique in the paper just adds some average planes. So while that will still give you usable collision detection, the shape that is actually hitting the BSP is some kind of spacevarying polyhedron, not a sphere (meaning it's a polyhedron, but a different polyhedron in different parts of the space because the bevels depend only on the static geometry). But that approximates a sphere and should be workable if you don't need it to be exactly a sphere. Jay ________________________________ From: Damian Coventry [mailto:damian.coventry@...] Sent: Thursday, February 19, 2009 11:12 AM To: Game Development Algorithms Subject: Re: [Algorithms] Dynamic Plane Shifting BSP Traversal Problem Bevelling seems like it's the solution to the problem alright. I think I should have read the paper more thoroughly, because this paragraph from the paper sums it up succintly: "Although it may not always seem necessary, this beveling step is very important, even if the initial boundary representation contains no acutely convex angles. Our initial experiments in altering the plane equations, where we had not included the beveling step, were unsuccessful. The movement of characters in our video game was blocked in the strangest places as if someone had randomly placed invisible walls in the scene. This happens because the BSP compilation process splits polygons and creates cells with sharp angles in unexpected places. Beveling solved this problem and made the dynamic plane shifting technique viable." "randomly placed invisible walls" is definitely what I'm suffering. So off I go to implement bevelling... :) Cheers for the reply Marius, ~Damian(); On Thu, Feb 19, 2009 at 1:18 AM, Marius Elvert <marius.elvert@...<mailto:marius.elvert@...>> wrote: Hello, this seems to be a case of extralong extrusion that Melax is 'fixing' by beveling. In this case, it's the very sharp edge between plane 1 and plane 5 that gets extruded further than you'd intuitively think. Much like the right corner in figure 3 of Melax' paper. This is an inherent problem of the algorithm because it only expands planes, never lines or points  it doesn't 'really' work with distance to volumes this way. Marius Quoting Damian Coventry <damian.coventry@...<mailto:damian.coventry@...>>: > Hi all, > > I've come across a collision detection use case which I cannot get to work > with dynamic plane shifting when recursing a BSP tree. > > The paper I've read which I'm trying to implement is Stan Melax's Dynamic > Plane Shifting BSP Traversal, found here: http://www.melax.com/melaxbsp.pdf. > > The use case is when two planes are close to, but not, perpendicular to one > another. The shifting of the plane results in a plane that formerly was in > front of the other plane when the tree was built, but now is behind it, due > to the shift. This results in a false positive and you collide with "thin > air". > > Apparently this is the algorithm used for collision detection in MDK2 and > NWN, so it must work. I must have misunderstood Stan's paper, or my > implementation is wrong. My implementation works for every other use case > I've thrown at it. > > For a full explanation of the use case I've outlined everything with > diagrams and text on this web page: > http://damiancoventry.com/BSPProblem/DynamicPlaneShiftingBSPTraversalProblem.htm > > Surely this algorithm can handle this use case, and surely others have > encountered this before? > > ~Damian(); >  Open Source Business Conference (OSBC), March 2425, 2009, San Francisco, CA OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise Strategies to boost innovation and cut costs with open source participation Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H _______________________________________________ GDAlgorithmslist mailing list GDAlgorithmslist@...<mailto:GDAlgorithmslist@...> https://lists.sourceforge.net/lists/listinfo/gdalgorithmslist Archives: http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithmslist 