RE: [Algorithms] Current state of shadow maps?
From: <c.s...@ph...> - 2005-10-04 10:01:16
Beware of user clip planes. They're not invariant w.r.t. other clip planes or turning clip planes off, so they're useless for multipass'ing (because the rasterizer gets different vertices, of course, but I had to see it with my own eyes just how ugly it looks).

-----Original Message-----
From: gda...@li... [mailto:gda...@li...] On Behalf Of Tom Forsyth
Sent: Tuesday, October 04, 2005 8:45 AM
To: gda...@li...
Subject: RE: [Algorithms] Current state of shadow maps?

Yep - that's the tricky part. One method is to partition the large objects with user clip planes. Another is to simply decimate them into smaller and smaller chunks of triangles as needed (rather expensive, but it should only happen to a few objects - hopefully!). Another is to use cube-map shadowbuffers, if hardware is available. But I haven't actually tried any of these yet :-) There are some good papers on cube-map shadowbuffers around though - I recall one from some nVidians.

TomF.

-----Original Message-----
From: gda...@li... [mailto:gda...@li...] On Behalf Of David Whatley
Sent: 03 October 2005 23:30
To: gda...@li...
Subject: Re: [Algorithms] Current state of shadow maps?

Tom,

That stuff is great! Just wish it didn't have the one limitation that makes it not work for me... "The only places that do not currently work well are large objects with lights close to or inside them." Any dude carrying a torch on my terrain would describe that... the terrain is a "large object" in that sense. Ah well. But stencil shadows are looking great so far! I hope to combine them with some form of what you are doing for a nice hybrid shadowing approach.

-- David

Tom Forsyth wrote:

I've updated the description of the algorithm and included some pictures. Hopefully it's a bit clearer, but this stuff can be tough to explain.
The odd toroidal topology of StarTopia doesn't help :-)

http://www.eelpi.gotdns.org/papers/shadowbuffer_pseudocode.html

> > The army-with-lots-of-short-range-torches example is an interesting one for shadowbuffers.
>
> Just include it in your demo *g*

I did this - just hacked in a light floating above every person's head. It works pretty well. It's slow, but not absurdly slow, considering what it's doing! I'll try to take some pics some time - it looks pretty goofy.

You're right that to get "perfect" precision you need to render twice as many shadowbuffer texels as pixels, but in practice you need a lot less than this, even with the horrible alpha-test shadows I'm using here (I find that half as many texels as pixels works well). With PCF, you can drop it a bit more, and if you put in soft-edged shadows with something like Smoothies or Willem's smooth-shadows method (http://www.whdeboer.com/writings.html), then you need even fewer texels.

TomF.

From: Christian Schüler

> Creating a lot of frustums, but not necessarily 1 per receiver per light - it's very likely you could merge quite a few of those frustums together, given an army is usually walking in close formation

That'd be lossy compression then ... but this opportunity to short-cut is not restricted to shadowbuffers, it applies to stencil too (each unique "frustum" translates to an extrusion center). Besides, I can see the danger of popping if the merger is inconsistent between frames.

> The army-with-lots-of-short-range-torches example is an interesting one for shadowbuffers.

Just include it in your demo *g*

> Here's some really really rough back-of-the-envelope figures to compare the two. Warning - lots of assumptions ahead!

I don't want to start a war. I just would not equate the overall performance to the # of Z reads/writes.
I have experience with the "army of torches" scenario with stencils, and you can get decent performance if the average screen space area is small enough. So there is little cost associated "per light" and large costs for "screen space covered" and "vertices touched". In the dynamic environment where all the receivers / casters were moving, guess the limiting factor for the CPU work was (for me) ---> the scene database queries to just get the objects for each light! With shadow buffers I can see shifting the cost more towards per light while per pixel and per vertex costs may be smaller, with added penalties of constant costs, like this:

stencil:
n lights = n passes
where n is the # of scene database queries

shadow buffers:
n lights = 2 * n passes (minimum) + n / c * (render target switches + stall penalty for leaving the framebuffer / coming back to the framebuffer, etc etc)
where c is how many buffers you can pack into a shadowbuffer atlas

My experience also says that in order to cover a 100^2 pixel screen area, you need a 200^2 shadow buffer, because on average the projected texels are stretched out due to the light hitting at grazing angles. A 1024'er screen would need a 2048'er shadow map. But that's a minor issue.

-----Original Message-----
From: gda...@li... [mailto:gda...@li...] On Behalf Of Tom Forsyth
Sent: Friday, September 09, 2005 7:38 AM
To: gda...@li...
Subject: RE: [Algorithms] Current state of shadow maps?

Interestingly:

> Stencil volumes win the indoor/urban, night scenarios (think doom3, or neverwinter nights for the record)
> - shadows from vegetation can be neglected.
> - many omnidirectional light sources, or light sources with large frusta, for which shadowbuffer is suboptimal (too many render targets)
> - most light sources have small screen space extent and world extent, so stencil is not expensive

...actually describes your average StarTopia scene moderately well :-)

http://www.eelpi.gotdns.org/startopia/startopia_pictures.html

(yes, I will get the demo version done soon, I promise!)

The army-with-lots-of-short-range-torches example is an interesting one for shadowbuffers. When the range of a light is small compared to the view frustum (as will be the case with >90% of the torches), then my scheme will just reduce to essentially a cube map per light. Actually, it gets slightly better - if there's nothing above the torch in range of it (likely), then that face never gets created, and also the face view angles can be opened up to about 120 degrees and still remain efficient - this typically means you lose another face and only need four frustums per light rather than a cube-map's six.

Here's some really really rough back-of-the-envelope figures to compare the two. Warning - lots of assumptions ahead!

Assume the shadowbuffers are the type that only write to a Z/stencil surface, not a colour buffer as well. Remember that my scheme allocates shadowbuffer texels so that you get 1 texel per screen pixel for the area it covers, if you turn the detail to "max", i.e. pixel-perfect.

Let's also assume that each light's radius sphere covers 10k pixels (a 100x100 pixel area - not unreasonable). Also approximate the shadowbuffer coverage - in practice many pixels in that area won't have receivers, and many others will have multiple receivers. Let's call it even for the sake of argument.
Also assume that in any rendering pass, all the pixels get tested, and half get rejected because of overdraw (an entire scene will have more overdraw, but my experience is that shadowbuffer/volume shadows, because of their limited range, get lower overdraw, and 2x is reasonable).

Shadowbuffers:
Per light, rendering shadowbuffers: 10,000 Z tests + 5,000 Z writes = 15k reads/writes.
Per light, rendering actual scene: 10,000 shadowbuffer reads.
Total = 25k reads/writes.

Volume shadows:
Per light, rendering volumes (remembering that volumes have two sides!): 2*10,000 Z tests + 2*5,000 Z writes = 30k reads/writes.
Per light, rendering actual scene: the stencil tests come free with the Z reads. No extra cost.
Total = 30k reads/writes.

So in terms of fillrate, it's pretty close - shadowbuffering slightly ahead, but I made a lot of assumptions. But shadowbuffering has some big aces up its sleeve:

The first is that I said the quality slider was on "best" - one texel per screen pixel. But you can turn that down - you can easily halve it without any quality loss. In fact, if you have a soft-edged shadow shader, you _want_ to turn it down lots! So that dramatically reduces the fillrate required for shadowbuffers.

The second is that you can render a single receiver with multiple shadowbuffers in one pass - because you're just sampling a texture and doing a comparison. So you can do more than one of these per shader. Let's say you can do two - that's totally realistic for PS2.0 hardware. So you've now halved the number of passes you do when rendering the scene (I didn't list those reads/writes in the above). This can't be done with volume shadows (that I know of) - it can only reject the pixel or accept it, it can't half-shade it. That's a huge win!

Also, the process of extruding volume shadows is far more expensive than the equivalent shadowbuffer thing, which is just rendering the object from a different POV.
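The fillrate figures above are easy to reproduce in a few lines of Python. This is only a rough sketch and not code from the thread: the function names, the `overdraw` parameter and the `texel_ratio` parameter (shadowbuffer texels per screen pixel, standing in for the "quality slider") are hypothetical, and the model is exactly as crude as the assumptions listed in the message.

```python
# Rough per-light fillrate model, following the assumptions in the message.
# All names and parameters are illustrative assumptions, not from the thread.

def shadowbuffer_reads_writes(pixels_per_light, overdraw=2.0, texel_ratio=1.0):
    """Reads/writes per light when using shadowbuffers."""
    texels = pixels_per_light * texel_ratio   # shadowbuffer texels rendered
    z_tests = texels                          # every texel gets Z-tested
    z_writes = texels / overdraw              # the rest is rejected by overdraw
    scene_reads = pixels_per_light            # one shadowbuffer read per lit pixel
    return z_tests + z_writes + scene_reads

def volume_reads_writes(pixels_per_light, overdraw=2.0):
    """Reads/writes per light when using stencil volumes."""
    sides = 2                                 # volumes have two sides
    z_tests = sides * pixels_per_light
    z_writes = sides * pixels_per_light / overdraw
    return z_tests + z_writes                 # stencil test in the scene pass is free

if __name__ == "__main__":
    pixels = 100 * 100                        # the 100x100 pixel light from the message
    print(shadowbuffer_reads_writes(pixels))                   # "best" quality
    print(shadowbuffer_reads_writes(pixels, texel_ratio=0.5))  # quality halved
    print(volume_reads_writes(pixels))
```

With the defaults this gives the 25k and 30k totals quoted above. How the scene-pass reads behave when the texel count is turned down is an assumption of the sketch (they are kept at one read per screen pixel).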
I believe most people using VS-driven extrusion find that they are frequently limited by triangle throughput rather than fillrate. And people using CPU-driven extrusion wish they were doing VS-driven extrusion :-)

TomF.

-----Original Message-----
From: gda...@li... [mailto:gda...@li...] On Behalf Of Megan Fox
Sent: 08 September 2005 13:07
To: gda...@li...
Subject: Re: [Algorithms] Current state of shadow maps?

Well, let's take the army with torches but apply stencil shadows instead (and let's say they're on a field of battle, a heightmap) - how is that still not a nightmare scenario?

With shadow buffers (using Tom's method), you'd end up:
- Creating a lot of frustums, but not necessarily 1 per receiver per light - it's very likely you could merge quite a few of those frustums together, given an army is usually walking in close formation

With stencil, you'd end up:
- Casting your extrusions back for every light/occluder pair. You can't really merge (I don't think?), so that's "it."

Especially after using Tom's handy-dandy frustum merge-o-matic method, it seems like the two methods would be comparable - mind, both would probably keel over and die in a slurry of render passes (and in both cases, you'd probably enable your "oh god we're in trouble start merging nearby lights into single lights" optimization code), but it seems like neither does terribly well.

I'd thought the "big" win scenario for stencil over buffers was more scenes with few occluders and many receivers (that is, your average FPS environment)?

> Stencil volumes win the indoor/urban, night scenarios (think doom3, or neverwinter nights for the record)
> - shadows from vegetation can be neglected.
> - many omnidirectional light sources, or light sources with large frusta, for which shadowbuffer is suboptimal (too many render targets)
> - most light sources have small screen space extent and world extent, so stencil is not expensive
>
> However shadowbuffers have other qualities that make them attractive (image based, soft edges), so it would be desirable to use them for all purposes. It's just a pity that they are so unfeasible for omni lights (I imagine an army with torches here ...).

-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
_______________________________________________
GDAlgorithms-list mailing list
GDA...@li...
https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list
Archives:
http://sourceforge.net/mailarchive/forum.php?forum_ida88
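Christian's pass-count estimate from earlier in the thread (stencil: n passes; shadow buffers: 2 * n passes plus n / c render target switches) can be played with in the same way. Again this is just a sketch under stated assumptions, not anything posted to the list: the function names are made up, `switch_cost` is a hypothetical figure expressed in "pass equivalents" for one render target switch plus stall, and rounding n / c up to whole atlases is an assumption of the sketch. The last helper encodes the rule of thumb that a shadow buffer needs twice the side length of the screen area it covers.

```python
import math

# Pass-count model from the thread; names and units are illustrative assumptions.

def stencil_passes(n_lights):
    """Stencil: one pass (and one scene database query) per light."""
    return n_lights

def shadowbuffer_pass_cost(n_lights, buffers_per_atlas, switch_cost):
    """Shadow buffers: 2 passes per light (minimum) plus atlas switch penalties.

    switch_cost is a hypothetical cost, in pass equivalents, of one render
    target switch plus the stall for leaving and re-entering the framebuffer.
    """
    passes = 2 * n_lights
    atlases = math.ceil(n_lights / buffers_per_atlas)  # assumption: whole atlases
    return passes + atlases * switch_cost

def shadowbuffer_side(screen_side, stretch=2):
    """Rule of thumb from the thread: 100^2 screen pixels need a 200^2 buffer."""
    return screen_side * stretch

if __name__ == "__main__":
    print(stencil_passes(100))
    print(shadowbuffer_pass_cost(100, buffers_per_atlas=16, switch_cost=0.5))
    print(shadowbuffer_side(1024))
```

The point of keeping `switch_cost` as a free parameter is that the thread never puts a number on it; it depends on the hardware and driver.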