RE: [Algorithms] Current state of shadow maps?

Beware of user clip planes. They're not invariant w.r.t. other clip
planes or turning clip planes off, so they're useless for multipass
rendering. (That's because the rasterizer gets different vertices, of
course, but I had to see with my own eyes just how ugly it looks.)

-----Original Message-----
From: gda...@li... [mailto:gda...@li...] On Behalf Of Tom Forsyth
Sent: Tuesday, October 04, 2005 8:45 AM
To: gda...@li...
Subject: RE: [Algorithms] Current state of shadow maps?

Yep - that's the tricky part. One method is to partition the large objects
with user clip planes. Another is to simply decimate them into smaller and
smaller chunks of triangles as needed (rather expensive, but it should only
happen to a few objects - hopefully!). Another is to use cube-map
shadowbuffers, if hardware is available. But I haven't actually tried any of
these yet :-) There are some good papers on cube-map shadowbuffers around
though - I recall one from some nVidians.

TomF.

-----Original Message-----
From: gda...@li...
[mailto:gda...@li...] On Behalf Of David Whatley
Sent: 03 October 2005 23:30
To: gda...@li...
Subject: Re: [Algorithms] Current state of shadow maps?

Tom,

That stuff is great!  Just wish it didn't have the one limitation that makes
it not work for me... "The only places that do not currently work well are
large objects with lights close to or inside them."  Any dude carrying a
torch on my terrain would describe that... the terrain is a "large object"
in that sense.  Ah well.  But stencil shadows are looking great so far!  I
hope to combine them with some form of what you are doing for a nice hybrid
shadowing approach.

-- David

Tom Forsyth wrote:
I've updated the description of the algorithm and included some pictures.
Hopefully it's a bit clearer, but this stuff can be tough to explain. The
odd toroidal topology of StarTopia doesn't help :-)

http://www.eelpi.gotdns.org/papers/shadowbuffer_pseudocode.html

The army-with-lots-of-short-range-torches example is an interesting one for
shadowbuffers.

Just include it in your demo *g*

I did this - just hacked in a light floating above every person's head. It
works pretty well. It's slow, but not absurdly slow, considering what it's
doing! I'll try to take some pics some time - it looks pretty goofy.

You're right that to get "perfect" precision you need to render twice as
many shadowbuffer texels as pixels, but in practice you need a lot less than
this, even with the horrible alpha-test shadows I'm using here (I find that
half as many texels as pixels works well). With PCF, you can drop it a bit
more, and if you put in soft-edged shadows with something like Smoothies or
Willem's smooth-shadows method (http://www.whdeboer.com/writings.html), then
you need even fewer texels.

TomF.

From: Christian Schüler

Creating a lot of frustums, but not necessarily 1 per receiver per
light - it's very likely you could merge quite a few of those
frustums together, given an army is usually walking in close
formation

That'd be lossy compression then ... but this opportunity to
short-cut is not restricted to shadowbuffers; it applies to
stencil too (each unique "frustum" translates to an extrusion
center). Besides, I can see the danger of popping if the
merger is inconsistent between frames.

The army-with-lots-of-short-range-torches example is an interesting one for
shadowbuffers.

Just include it in your demo *g*

Here's some really really rough back-of-the-envelope figures to compare the
two. Warning - lots of assumptions ahead!

I don't want to start a war. I just would not equate the
overall performance to the # of Z reads/writes.
I have experience with the "army of torches" scenario with
stencils, and you can get decent performance if the average
screen space area is small enough.
So there is little cost associated "per light" and large
costs for "screen space covered" and "vertices touched". In
the dynamic environment where all the receivers / casters
were moving, guess what the limiting factor for the CPU work was
(for me): the scene database queries just to get the
objects for each light! With shadow buffers I can see
the cost shifting more towards per light while per-pixel and
per-vertex costs may be smaller, with added penalties of
constant costs, like this:

stencil:
n lights = n passes
where n is also the # of scene database queries

shadow buffers:
n lights = 2 * n passes (minimum) + n / c * (render target
switches + stall penalty for leaving the framebuffer / coming
back to the framebuffer, etc.)
where c is how many buffers you can pack into a shadowbuffer atlas
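
The pass-count model above can be sketched in Python as follows. This is
only an illustration under stated assumptions: the function names are
hypothetical, and the "n / c" term is rounded up to a whole number of atlas
fills, each of which is assumed to cost one trip away from the framebuffer
and back. The actual cost of a switch or stall is not modelled.

```python
import math


def stencil_passes(n_lights: int) -> int:
    """Stencil model: one pass (and one scene database query) per light."""
    return n_lights


def shadowbuffer_cost(n_lights: int, atlas_capacity: int) -> tuple[int, int]:
    """Shadowbuffer model: returns (passes, render_target_switches).

    passes   -- 2 per light, the stated minimum
    switches -- assumed to be one framebuffer round trip per atlas fill,
                i.e. ceil(n / c) where c is the atlas capacity
    """
    if atlas_capacity < 1:
        raise ValueError("atlas_capacity must be at least 1")
    passes = 2 * n_lights
    switches = math.ceil(n_lights / atlas_capacity)
    return passes, switches


if __name__ == "__main__":
    n = 50
    print("stencil passes:", stencil_passes(n))
    print("shadowbuffer (passes, switches):", shadowbuffer_cost(n, 16))
```

With 50 lights and a hypothetical atlas that holds 16 shadowbuffers, this
gives 50 stencil passes against 100 shadowbuffer passes plus 4 render target
round trips.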

My experience also says that in order to cover a 100^2 pixel
screen area, you need a 200^2 shadow buffer, because on
average the projected texels are stretched out due to the
light hitting at grazing angles. A 1024-pixel-wide screen would need
a 2048-texel-wide shadow map. But that's a minor issue.
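
As a small worked example of that rule of thumb, here is a Python sketch.
The function name and the `stretch` parameter are hypothetical; the factor
of 2 per side is the empirical figure quoted above, not a derived constant.

```python
import math


def shadowbuffer_side(screen_side_px: int, stretch: float = 2.0) -> int:
    """Side length (in texels) of a square shadowbuffer needed to cover a
    square screen area of the given side, assuming texels are stretched by
    `stretch` on average because of grazing light angles."""
    return math.ceil(screen_side_px * stretch)


if __name__ == "__main__":
    for side in (100, 1024):
        texels = shadowbuffer_side(side)
        print(f"{side}^2 pixels -> {texels}^2 texels")
```

Note that doubling the side length quadruples the texel count: 200^2 is
40,000 texels for 10,000 pixels.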

-----Original Message-----
From: gda...@li... [mailto:gda...@li...] On Behalf Of Tom Forsyth
Sent: Friday, September 09, 2005 7:38 AM
To: gda...@li...
Subject: RE: [Algorithms] Current state of shadow maps?

Interestingly:

Stencil volumes win the indoor/urban, night scenarios
(think doom3, or neverwinter nights for the record)
- shadows from vegetation can be neglected.
- many omnidirectional light sources, or light sources with
large frusta, for which shadowbuffer is suboptimal (too many render
targets)
- most light sources have small screen space extent
and world extent, so stencil is not expensive

...actually describes your average StarTopia scene moderately well :-)
http://www.eelpi.gotdns.org/startopia/startopia_pictures.html

(yes, I will get the demo version done soon, I promise!)

The army-with-lots-of-short-range-torches example is an interesting one for
shadowbuffers. When the range of a light is small compared to the view
frustum (as will be the case with >90% of the torches), then my scheme will
just reduce to essentially a cube map per light. Actually, it gets slightly
better - if there's nothing above the torch in range of it (likely), then
that face never gets created, and also the face view angles can be opened up
to about 120 degrees and still remain efficient - this typically means you
lose another face and only need four frustums per light rather than a
cube-map's six.

Here's some really really rough back-of-the-envelope figures to compare the
two. Warning - lots of assumptions ahead!

Assume the shadowbuffers are the type that only write to a Z/stencil
surface, not a colour buffer as well. Remember that my scheme allocates
shadowbuffer texels so that you get 1 texel per screen pixel for the area it
covers, if you turn the detail to "max", i.e. pixel-perfect.

Let's also assume that each light's radius sphere covers 10k pixels (a
100x100 pixel area - not unreasonable). Also approximate the shadowbuffer
coverage - in practice many pixels in that area won't have receivers, and
many others will have multiple receivers. Let's call it even for the sake of
argument. Also assume that in any rendering pass, all the pixels get tested,
and half get rejected because of overdraw (an entire scene will have more
overdraw, but my experience is that shadowbuffer/volume shadows, because of
their limited range, get lower overdraw, and 2x is reasonable).

Shadowbuffers:

Per light, rendering shadowbuffers: 10,000 Z tests + 5,000 Z writes = 15k
reads/writes.

Per light, rendering actual scene: 10,000 shadowbuffer reads.

Total = 25k reads/writes.

Volume shadows:

Per light, rendering volumes (remembering that volumes have two sides!):
2*10,000 Z tests + 2*5,000 Z writes = 30k reads/writes.

Per light, rendering actual scene: the stencil tests come free with the Z
reads. No extra cost.

Total = 30k reads/writes.
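
The tally above can be reproduced with a short Python sketch. The function
and parameter names are hypothetical; the model is just the stated
assumptions (every covered pixel is Z-tested, half are rejected by overdraw,
volumes have two sides, stencil tests are free).

```python
def shadowbuffer_rw(pixels: int, reject_fraction: float = 0.5) -> int:
    """Per-light reads/writes for the shadowbuffer approach."""
    z_tests = pixels                                  # rendering the shadowbuffer
    z_writes = int(pixels * (1.0 - reject_fraction))  # survivors of overdraw
    scene_reads = pixels                              # shadowbuffer lookups in the scene pass
    return z_tests + z_writes + scene_reads


def volume_rw(pixels: int, reject_fraction: float = 0.5) -> int:
    """Per-light reads/writes for stencil volume shadows."""
    sides = 2                                         # front and back faces of the volume
    z_tests = sides * pixels
    z_writes = sides * int(pixels * (1.0 - reject_fraction))
    return z_tests + z_writes                         # stencil test in the scene pass is free


if __name__ == "__main__":
    covered = 100 * 100  # each light's sphere covers a 100x100 pixel area
    print("shadowbuffers:", shadowbuffer_rw(covered))
    print("volume shadows:", volume_rw(covered))
```

For the 10,000-pixel example this yields 25,000 and 30,000 respectively,
matching the totals listed above.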

So in terms of fillrate, it's pretty close - shadowbuffering slightly ahead,
but I made a lot of assumptions. But shadowbuffering has some big aces up
its sleeve:

The first is that I said the quality slider was on "best" - one texel per
screen pixel. But you can turn that down - you can easily halve it without
any quality loss. In fact, if you have a soft-edged shadow shader, you
_want_ to turn it down lots! So that dramatically reduces the fillrate
required for shadowbuffers.
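
To put a rough number on the quality slider, here is a Python sketch. It
assumes, as a simplification not stated in the thread, that the cost of
rendering the shadowbuffer scales linearly with its texel count while the
scene-pass lookups stay at one per screen pixel. All names are hypothetical.

```python
def shadowbuffer_rw_at_density(pixels: int,
                               texels_per_pixel: float = 1.0,
                               reject_fraction: float = 0.5) -> float:
    """Per-light reads/writes when the shadowbuffer holds
    `texels_per_pixel` texels for every covered screen pixel."""
    texels = pixels * texels_per_pixel
    z_tests = texels                             # every shadowbuffer texel is tested
    z_writes = texels * (1.0 - reject_fraction)  # half rejected by overdraw
    scene_reads = pixels                         # one lookup per screen pixel regardless
    return z_tests + z_writes + scene_reads


if __name__ == "__main__":
    covered = 100 * 100
    for density in (1.0, 0.5, 0.25):
        total = shadowbuffer_rw_at_density(covered, density)
        print(f"{density} texels/pixel -> {total:.0f} reads/writes")
```

Under these assumptions, halving the density takes the per-light figure from
25,000 to 17,500 reads/writes.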

The second is that you can render a single receiver with multiple
shadowbuffers in one pass - because you're just sampling a texture and doing
a comparison. So you can do more than one of these per shader. Let's say you
can do two - that's totally realistic for PS2.0 hardware. So you've now
halved the number of passes you do when rendering the scene (I didn't list
those reads/writes in the above). This can't be done with volume shadows
(that I know of) - it can only reject the pixel or accept it, it can't
half-shade it. That's a huge win!
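
The effect on scene pass count can be sketched in Python as below. The
function name is hypothetical, and the model simply batches lights into
groups of however many shadowbuffers one shader can sample; with volume
shadows the batch size is effectively 1.

```python
import math


def scene_passes(n_lights: int, shadowbuffers_per_pass: int = 1) -> int:
    """Number of scene-rendering passes for a receiver lit by n_lights,
    when one shader can sample `shadowbuffers_per_pass` shadowbuffers."""
    if shadowbuffers_per_pass < 1:
        raise ValueError("shadowbuffers_per_pass must be at least 1")
    return math.ceil(n_lights / shadowbuffers_per_pass)


if __name__ == "__main__":
    lights = 10
    print("one light per pass:", scene_passes(lights, 1))
    print("two shadowbuffers per pass:", scene_passes(lights, 2))
```

For 10 lights this gives 10 passes at one light per pass and 5 passes at two
shadowbuffers per pass.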

Also, the process of extruding volume shadows is far more expensive than the
equivalent shadowbuffer thing, which is just rendering the object from a
different POV. I believe most people using VS-driven extrusion find that
they are frequently limited by triangle throughput rather than fillrate. And
people using CPU-driven extrusion wish they were doing VS-driven extrusion
:-)

TomF.

-----Original Message-----
From: gda...@li... [mailto:gda...@li...] On Behalf Of Megan Fox
Sent: 08 September 2005 13:07
To: gda...@li...
Subject: Re: [Algorithms] Current state of shadow maps?

Well, let's take the army with torches but apply stencil shadows
instead (and let's say they're on a field of battle, a heightmap) -
how is that still not a nightmare scenario?

With shadow buffers (using Tom's method), you'd end up:

- Creating a lot of frustums, but not necessarily 1 per receiver per
light - it's very likely you could merge quite a few of those
frustums together, given an army is usually walking in close
formation

With stencil, you'd end up:

- Casting your extrusions back for every light/occluder pair.  You
can't really merge (I don't think?), so that's "it."

Especially after using Tom's handy-dandy frustum merge-o-matic method,
it seems like the two methods would be comparable - mind, both would
probably keel over and die in a slurry of render passes (and in both
cases, you'd probably enable your "oh god we're in trouble start
merging nearby lights into single lights" optimization code), but it
seems like neither does terribly well.

I'd thought the "big" win scenario for stencil over buffers was more
scenes with few occluders and many receivers (that is, your average
FPS environment)?

Stencil volumes win the indoor/urban, night scenarios
(think doom3, or neverwinter nights for the record)
- shadows from vegetation can be neglected.
- many omnidirectional light sources, or light sources with
large frusta, for which shadowbuffer is suboptimal (too many
render targets)
- most light sources have small screen space extent and
world extent, so stencil is not expensive

However shadowbuffers have other qualities that make them
attractive (image based, soft edges), so it would be
desirable to use them for all purposes. It's just a pity
that they are so infeasible for omni lights (I imagine an
army with torches here ...).

-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf

_______________________________________________
GDAlgorithms-list mailing list
GDA...@li...
https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list
Archives:
http://sourceforge.net/mailarchive/forum.php?forum_id=6188

-------------------------------------------------------
This SF.Net email is sponsored by:
Power Architecture Resource Center: Free content, downloads, discussions,
and more. http://solutions.newsforge.com/ibmarch.tmpl
_______________________________________________
GDAlgorithms-list mailing list
GDA...@li...
https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list
Archives:
http://sourceforge.net/mailarchive/forum.php?forum_id=6188