Re: [Algorithms] Current state of shadow maps?

Tom,

That stuff is great!  Just wish it didn't have the one limitation that
makes it not work for me... "The only places that do not currently work
well are large objects with lights close to or inside them."  Any dude
carrying a torch on my terrain would fit that description... the terrain
is a "large object" in that sense.  Ah well.  But stencil shadows are looking
great so far!  I hope to combine them with some form of what you are
doing for a nice hybrid shadowing approach.

-- David

Tom Forsyth wrote:

>I've updated the description of the algorithm and included some pictures.
>Hopefully it's a bit clearer, but this stuff can be tough to explain. The
>odd toroidal topology of StarTopia doesn't help :-)
>
>http://www.eelpi.gotdns.org/papers/shadowbuffer_pseudocode.html
>
>>>The army-with-lots-of-short-range-torches example is an interesting one for
>>>shadowbuffers.
>>
>>Just include it in your demo *g*
>
>I did this - just hacked in a light floating above every person's head. It
>works pretty well. It's slow, but not absurdly slow, considering what it's
>doing! I'll try to take some pics some time - it looks pretty goofy.
>
>You're right that to get "perfect" precision you need to render twice as
>many shadowbuffer texels as pixels, but in practice you need a lot less than
>this, even with the horrible alpha-test shadows I'm using here (I find that
>half as many texels as pixels works well). With PCF, you can drop it a bit
>more, and if you put in soft-edged shadows with something like Smoothies or
>Willem's smooth-shadows method (http://www.whdeboer.com/writings.html), then
>you need even fewer texels.
>
>TomF.
>
>>From: Christian Schüler
>>
>>>Creating a lot of frustums, but not necessarily 1 per receiver per
>>>light - it's very likely you could merge quite a few of those
>>>frustums together, given an army is usually walking in close
>>>formation
>>
>>That'd be lossy compression then ... but this opportunity to
>>short-cut is not restricted to shadowbuffers, it applies to
>>stencil too (each unique "frustum" translates to an extrusion
>>center). Besides, I can see the danger of popping if the
>>merger is inconsistent between frames.
>>
>>>The army-with-lots-of-short-range-torches example is an interesting one for
>>>shadowbuffers.
>>
>>Just include it in your demo *g*
>>
>>>Here's some really really rough back-of-the-envelope figures to compare the
>>>two. Warning - lots of assumptions ahead!
>>
>>I don't want to start a war. I just would not equate the
>>overall performance to the # of Z reads/writes.
>>I have experience with the "army of torches" scenario with
>>stencils, and you can get decent performance if the average
>>screen space area is just small enough.
>>So there is little cost associated "per light" and large
>>costs for "screen space covered" and "vertices touched". In
>>the dynamic environment where all the receivers / casters
>>were moving, guess the limiting factor for the CPU work was
>>(for me) ---> the scene database queries to just get the
>>objects for each light! With shadow buffers I can see
>>shifting the cost more towards per light while per pixel and
>>per vertex costs may be smaller, with added penalties of
>>constant costs, like this:
>>
>>
>>stencil:
>>n lights = n passes
>>where n is the # of scene database queries
>>
>>shadow buffers:
>>n lights = 2 * n passes (minimum) + n / c * (render target
>>switches + stall penalty for leaving the framebuffer / coming
>>back to the framebuffer, etc etc)
>>where c is how many buffers you can pack into a shadowbuffer atlas
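
[A rough sketch of the pass-count model just quoted, in Python, in case it helps to plug numbers in. The function names, the rounding of n / c up to whole atlases, and the idea of measuring the switch/stall penalty in "pass equivalents" are all assumptions made for illustration; none of them come from the thread.]

```python
import math


def stencil_passes(n_lights: int) -> int:
    """Stencil volumes: one pass (and one scene database query) per light."""
    return n_lights


def shadowbuffer_cost(n_lights: int, atlas_capacity: int,
                      switch_penalty: float) -> float:
    """Shadow buffers: at least 2 passes per light, plus one render target
    switch (with its stall) per atlas needed to hold all the buffers.

    atlas_capacity is "c" in the quoted formula. switch_penalty is a
    made-up unit ("pass equivalents"), an assumption for illustration only.
    """
    if atlas_capacity < 1:
        raise ValueError("atlas_capacity must be at least 1")
    passes = 2 * n_lights
    # n / c, rounded up to whole atlases (the rounding is an assumption)
    atlas_fills = math.ceil(n_lights / atlas_capacity)
    return passes + atlas_fills * switch_penalty
```

With 100 lights, 16 buffers per atlas and a penalty of one pass per switch, this model gives 100 for stencil against 207 for shadow buffers. It only counts passes and switches; it says nothing about how expensive each pass is.
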
>>
>>
>>My experience also says that in order to cover a 100^2 pixel
>>screen area, you need a 200^2 shadow buffer, because on
>>average the projected texels are stretched out due to the
>>light hitting at grazing angles. A 1024'er screen would need
>>a 2048'er shadow map. But that's a minor issue.
>>
>>
>>-----Original Message-----
>>From: gda...@li...
>>[mailto:gda...@li...] On Behalf Of Tom Forsyth
>>Sent: Friday, September 09, 2005 7:38 AM
>>To: gda...@li...
>>Subject: RE: [Algorithms] Current state of shadow maps?
>>
>>
>>Interestingly:
>>
>>>Stencil volumes win the indoor/urban, night scenarios
>>>(think doom3, or neverwinter nights for the record)
>>>- shadows from vegetation can be neglected.
>>>- many omnidirectional light sources, or lightsources with
>>>large frusta, for which shadowbuffer is suboptimal (too many render
>>>targets)
>>>- most light sources have small screen space extent
>>>and world extent, so stencil is not expensive
>>
>>...actually describes your average StarTopia scene moderately well :-)
>>http://www.eelpi.gotdns.org/startopia/startopia_pictures.html
>>
>>(yes, I will get the demo version done soon, I promise!)
>>
>>
>>
>>The army-with-lots-of-short-range-torches example is an interesting one for
>>shadowbuffers. When the range of a light is small compared to the view
>>frustum (as will be the case with >90% of the torches), then my scheme will
>>just reduce to essentially a cube map per light. Actually, it gets slightly
>>better - if there's nothing above the torch in range of it (likely), then
>>that face never gets created, and also the face view angles can be opened up
>>to about 120 degrees and still remain efficient - this typically means you
>>lose another face and only need four frustums per light rather than a
>>cube-map's six.
>>
>>
>>Here's some really really rough back-of-the-envelope figures to compare the
>>two. Warning - lots of assumptions ahead!
>>
>>Assume the shadowbuffers are the type that only write to a Z/stencil
>>surface, not a colour buffer as well. Remember that my scheme allocates
>>shadowbuffer texels so that you get 1 texel per screen pixel for the area it
>>covers, if you turn the detail to "max", i.e. pixel-perfect.
>>
>>Let's also assume that each light's radius sphere covers 10k pixels (a
>>100x100 pixel area - not unreasonable). Also approximate the shadowbuffer
>>coverage - in practice many pixels in that area won't have receivers, and
>>many others will have multiple receivers. Let's call it even for the sake of
>>argument. Also assume that in any rendering pass, all the pixels get tested,
>>and half get rejected because of overdraw (an entire scene will have more
>>overdraw, but my experience is that shadowbuffer/volume shadows, because of
>>their limited range, get lower overdraw, and 2x is reasonable).
>>
>>Shadowbuffers:
>>
>>Per light, rendering shadowbuffers: 10,000 Z tests + 5,000 Z writes = 15k
>>reads/writes.
>>
>>Per light, rendering actual scene: 10,000 shadowbuffer reads.
>>
>>Total = 25k reads/writes.
>>
>>
>>Volume shadows:
>>
>>Per light, rendering volumes (remembering that volumes have two sides!):
>>2*10,000 Z tests + 2*5,000 Z writes = 30k reads/writes.
>>
>>Per light, rendering actual scene: the stencil tests come free with the Z
>>reads. No extra cost.
>>
>>Total = 30k reads/writes.
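
[The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the quoted back-of-the-envelope model: the function and parameter names are invented here, and texels_per_pixel is an assumed knob standing in for the detail setting, with 1.0 meaning the "max" setting of one texel per screen pixel.]

```python
def shadowbuffer_traffic(pixels: int, texels_per_pixel: float = 1.0,
                         overdraw: float = 2.0) -> float:
    """Per-light reads/writes for shadowbuffers under the quoted assumptions."""
    texels = pixels * texels_per_pixel
    z_tests = texels              # every shadowbuffer texel gets Z-tested
    z_writes = texels / overdraw  # half are rejected at 2x overdraw
    scene_reads = pixels          # one shadowbuffer read per lit screen pixel
    return z_tests + z_writes + scene_reads


def volume_traffic(pixels: int, overdraw: float = 2.0) -> float:
    """Per-light reads/writes for volume shadows under the quoted assumptions."""
    sides = 2                     # volumes have a front and a back
    z_tests = sides * pixels
    z_writes = sides * pixels / overdraw
    # stencil tests in the scene pass ride along with the Z reads: no extra cost
    return z_tests + z_writes
```

For the 100x100 pixel light, shadowbuffer_traffic(10_000) gives 25,000 and volume_traffic(10_000) gives 30,000, matching the totals above. Under the same model, dropping to half a texel per pixel brings the shadowbuffer figure down to 17,500.
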
>>
>>
>>So in terms of fillrate, it's pretty close - shadowbuffering slightly ahead,
>>but I made a lot of assumptions. But shadowbuffering has some big aces up
>>its sleeve:
>>
>>The first is that I said the quality slider was on "best" - one texel per
>>screen pixel. But you can turn that down - you can easily halve it without
>>any quality loss. In fact, if you have a soft-edged shadow shader, you
>>_want_ to turn it down lots! So that dramatically reduces the fillrate
>>required for shadowbuffers.
>>
>>The second is that you can render a single receiver with multiple
>>shadowbuffers in one pass - because you're just sampling a texture and doing
>>a comparison. So you can do more than one of these per shader. Let's say you
>>can do two - that's totally realistic for PS2.0 hardware. So you've now
>>halved the number of passes you do when rendering the scene (I didn't list
>>those reads/writes in the above). This can't be done with volume shadows
>>(that I know of) - it can only reject the pixel or accept it, it can't
>>half-shade it. That's a huge win!
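
[To put a number on the second point: a small Python sketch of the scene-pass count for one receiver, assuming the lights touching it can be grouped freely into shader passes. The names and the free-grouping assumption are illustrative, not from the thread.]

```python
import math


def scene_passes_shadowbuffers(n_lights: int, buffers_per_pass: int = 2) -> int:
    """Scene passes for one receiver when each shader pass can sample
    several shadowbuffers (two is the figure quoted for PS2.0 hardware)."""
    if buffers_per_pass < 1:
        raise ValueError("buffers_per_pass must be at least 1")
    return math.ceil(n_lights / buffers_per_pass)


def scene_passes_volumes(n_lights: int) -> int:
    """Volume shadows: the stencil test only accepts or rejects a pixel,
    so each light needs its own scene pass."""
    return n_lights
```

A receiver lit by 8 torches would take 4 scene passes with two shadowbuffers per shader, against 8 with volumes; an odd count such as 5 rounds up to 3.
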
>>
>>
>>Also, the process of extruding volume shadows is far more expensive than the
>>equivalent shadowbuffer thing, which is just rendering the object from a
>>different POV. I believe most people using VS-driven extrusion find that
>>they are frequently limited by triangle throughput rather than fillrate. And
>>people using CPU-driven extrusion wish they were doing VS-driven extrusion
>>:-)
>>
>>
>>
>>TomF.
>>
>>>-----Original Message-----
>>>From: gda...@li...
>>>[mailto:gda...@li...] On Behalf Of Megan Fox
>>>Sent: 08 September 2005 13:07
>>>To: gda...@li...
>>>Subject: Re: [Algorithms] Current state of shadow maps?
>>>
>>>
>>>Well, let's take the army with torches but apply stencil shadows
>>>instead (and let's say they're on a field of battle, a heightmap) -
>>>how is that still not a nightmare scenario?
>>>
>>>With shadow buffers (using Tom's method), you'd end up:
>>>
>>>- Creating a lot of frustums, but not necessarily 1 per receiver per
>>>light - it's very likely you could merge quite a few of those
>>>frustums together, given an army is usually walking in close
>>>formation
>>>
>>>With stencil, you'd end up:
>>>
>>>- Casting your extrusions back for every light/occluder pair.  You
>>>can't really merge (I don't think?), so that's "it."
>>>
>>>
>>>Especially after using Tom's handy-dandy frustum merge-o-matic method,
>>>it seems like the two methods would be comparable - mind, both would
>>>probably keel over and die in a slurry of render passes (and in both
>>>cases, you'd probably enable your "oh god we're in trouble start
>>>merging nearby lights into single lights" optimization code), but it
>>>seems like neither does terribly well.
>>>
>>>
>>>I'd thought the "big" win scenario for stencil over buffers was more
>>>scenes with few occluders and many receivers (that is, your average
>>>FPS environment)?
>>>
>>>>Stencil volumes win the indoor/urban, night scenarios
>>>>(think doom3, or neverwinter nights for the record)
>>>>- shadows from vegetation can be neglected.
>>>>- many omnidirectional light sources, or lightsources with
>>>>large frusta, for which shadowbuffer is suboptimal (too many
>>>>render targets)
>>>>- most light sources have small screen space extent and
>>>>world extent, so stencil is not expensive
>>>>
>>>>However shadowbuffers have other qualities that make them
>>>>attractive (image based, soft edges), so it would be
>>>>desirable to use them for all purposes. It's just a pity
>>>>that they are so unfeasible for omni lights (I imagine an
>>>>army with torches here ...).
>>>
>>>
>>>-------------------------------------------------------
>>>SF.Net email is Sponsored by the Better Software Conference & EXPO
>>>September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
>>>Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
>>>Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
>>>
>>>_______________________________________________
>>>GDAlgorithms-list mailing list
>>>GDA...@li...
>>>https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list
>>>Archives:
>>>http://sourceforge.net/mailarchive/forum.php?forum_ida88
>>>