Thread: [Algorithms] General-purpose shadowbuffer implementation.
From: Tom F. <tom...@ee...> - 2004-08-25 16:46:48
|
So a few days ago I finished my StarTopia "patch" that properly and fairly robustly implements the stuff I talked about in my GDC 2004 talk (which was a fairly blatant hack^H^H^H^H proof of concept with a horrible bug I only discovered later).

If you want to actually run it, you need a copy of the game, and then download both patches from my site (link below). Probably almost impossible to find in the shops, but plenty on Ebay and P2Ps and suchlike (for *ahem* evaluation purposes, obviously)

http://www.eelpi.gotdns.org/startopia/startopia.html

There's some pretty pictures and daft text as well. If anyone wants an even slower version with lots of debugging info on the shadowbuffers, just yell and I'll see what I can rustle up for you.

Sadly, StarTopia's engine is rather inefficient at drawing a single object multiple times, so performance is rather underwhelming because of that, and the renderstates are being thrashed mercilessly because I had to turn off all the caching to get things to work - it's a bit of a mess in there. Ah well - Mr. Moore is helping quite a lot.

The algorithm is basically the "non-persistent" one I talked about at GDC, and it works pretty well (couldn't use the "persistent" version - StarTopia has no concept of persistent lights to hang it off!). The bits I'm not that happy about are:

(1) the geometric objects I use to figure out my shadowbuffer frustums are cones. But shadowbuffers are square. And the cone must fit entirely within the shadowbuffer, or stuff will fall off the edges. So by definition at least 1-(PI/4) of my texels are wasted, and usually more. Possibly using square-based pyramids instead of cones would work better, but it feels like that's going to be a lot of work. The other trick might be to use cones to _find_ frustums, but then once you go to actually allocate the shadowbuffer, find the square bounds of the objects it contains, rather than using the bounding cone.
(2) I still don't have a tried-and-tested solution for when a single object covers an area that is too big for a single frustum, i.e. it's too large and/or close to the light. I have a fairly good idea that chopping the object up with user clip planes into the six "cubemap" parts and then treating each chunk separately is going to work (90 degree shadowbuffer FOVs are fine - the current implementation will push FOVs up to 120 degrees before rejecting objects and just drawing them unshadowed), but I haven't tried it out because the pain of retrofitting that scheme to StarTopia is too much for me to bear.
(3) This one uses a vanilla unsorted ID buffer, and (because I'm limited to DX7 tech) doesn't do multiple samples and/or PCF. So each object has nasty shadowed borders around the edges because of sampling errors. I have a DX9 demo that solves this with multiple samples, and I have also solved it by roughly sorting IDs and using a <= test rather than an == test - that algorithm works really well and is still very cheap to render. I suspect I could fairly easily retro-fit this sorting method into StarTopia (it's a regular-grid-based game - sorting should be pretty easy!), but I haven't actually done it (yet), so I still declare that "unproven" in a real game. Maybe I'll fix that shortly - who knows.
(4) StarTopia is embarrassingly slow at rendering a single object multiple times, as I said, AND I had to turn off all the renderstate caching to get stuff working (although I wrote the system, I can't remember - three years later - how it all works). It's not exactly the best demo for the tech because of this - all I can do is splutter and say that it's not the algorithm that's slow, it's the shonky graphics engine around it :-)
(5) For my shadowbuffers, I'm allocating RGBA8888 buffers, each with their own 32-bit Zbuffer, when I actually only need the 8-bit alpha channel, and could share the Z buffer between all of them.
This is because so many cards don't support A8 textures that I couldn't be bothered with the compatibility hassles. And there's trouble on some cards with sharing Z buffers between different-sized render targets. So the easy thing was to give each a unique one. So it's using 8x the memory and using 4x the fillrate it could do on something like a console where I know exactly what the hardware is capable of (or on the PC if I had the time to spend in a compatibility lab). So if you "only" have a 64MB card, you're going to run out of memory at the higher quality settings (it does fail gracefully though - you'll just see shadows pop randomly in and out of existence). Sorry.

Note that these are standard 2D flat-projected shadowbuffers. No PSMs, no trapezoidal stuff, no cubemaps. And it's all vanilla fixed-function dual-texture DX7 tech. It's very easy to add PSMs and other fancy projections into this scheme to reduce memory and fillrate costs, and the best thing is that when they don't work because of annoying camera angles and the divide-by-zero plane being in some awkward place, you know you always have something that will work. Obviously using some decent shaders to do the actual shadowbuffering would make them look a lot prettier as well.

If anyone prods me hard enough, I'll expand on exactly what the algorithm is - pseudo code and all that good stuff. Although it's broadly what I talked about at GDC, there's only so much detail you can go into in an hour-long 7x7 slide deck, and the devil is always in the details. Especially for shadowbuffers it seems :-)

TomF.
|
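For a rough sense of the numbers in points (1) and (5) above, here is a small arithmetic sketch. It is only one reading of where the figures come from: it treats the cone's footprint as a circle inscribed in the square buffer, and it ignores the cost of the single shared Z buffer in the "ideal" case. The 512-texel side length and all function names are hypothetical, chosen for illustration rather than taken from the patch.

```python
import math


def wasted_fraction_circle_in_square() -> float:
    """Share of a square buffer left uncovered by the largest circle that fits inside it."""
    return 1.0 - math.pi / 4.0


def buffer_bytes(side: int, colour_bytes: int, depth_bytes: int) -> int:
    """Memory for one square render target plus any depth buffer private to it."""
    return side * side * (colour_bytes + depth_bytes)


SIDE = 512  # hypothetical shadowbuffer resolution, not taken from the patch

# As described in (5): RGBA8888 colour plus a private 32-bit Z buffer per target.
as_shipped = buffer_bytes(SIDE, colour_bytes=4, depth_bytes=4)
# What is actually needed: an 8-bit alpha channel, with Z shared between targets
# (the shared Z buffer's own cost is ignored here, which is an assumption).
alpha_only = buffer_bytes(SIDE, colour_bytes=1, depth_bytes=0)

print(f"wasted texels: at least {wasted_fraction_circle_in_square():.1%}")
print(f"memory ratio: {as_shipped // alpha_only}x")
print(f"colour bytes written per texel: {4 // 1}x")
```

Under those assumptions the wasted share comes out at about 21.5% of the texels, and the per-buffer memory ratio is 8 regardless of the side length chosen.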
From: Tom F. <tom...@ee...> - 2004-08-27 09:13:36
|
Goodness - loads of emails off-list, none on-list. Come on people - don't be shy! Anyway, due to popular demand I will do a web page or a post about the implementation details.

So I was reading the Trapezoidal Shadow Map paper that everyone was talking about ages ago. http://www.comp.nus.edu.sg/~tants/tsm.html Yeah, I'm slow.

It's interesting reading - it's basically a smarter version of PSM from what I can see - selecting a better warping for the shadowmap that minimises texel wastage. They take the camera frustum and draw it in the light's space (clipping as necessary). Then they approximate that with a trapezoid. And then they find a 4x4 projection matrix that maps that to the unit square. And that's your shadowbuffer projection.

So by definition it doesn't have the divide-by-zero problems that PSMs have because all your thinking is done in the light's frustum, and by definition if something is outside the light's frustum, it's not lit anyway and so can't be shadowed or cast shadows. So that solves that.

It's also nice that it's continuous - a small movement of light, camera or objects doesn't cause a large change in the trapezoidal approximation, and therefore the mapping always changes smoothly, which solves some of the abrupt popping of most shadowbuffer methods. (actually, objects are irrelevant, unlike PSM, because TSM doesn't consider them at all - just camera frustum and light).

(The paper also has some interesting (but orthogonal) comments about picking a depth epsilon to try to reduce surface acne and peterpanning, and how a trapezoidal projection makes this a lot worse and how to make it not so bad by de-projecting the epsilon in the shader. But I'm an ID/priority fanboy, so I just skimmed that bit)

So it's all great for distant lights and shortish view distances. But it seems like there's a pretty glaring hole in the algorithm.
If your view is any decent distance compared to the light, and the light is not a nice narrow-cone spotlight (so you can just clip off the large bits of the view frustum), then your view frustum in light space is huge. So your trapezoid approximation doesn't help you much. Also, it doesn't seem to help the duelling frustum case at all that I can see (because I don't believe there _is_ a way to solve that with a single buffer projected with a simple 4x4 matrix).

However, there's the stuff about the 80/20 mix that I don't fully understand - maybe that somehow compensates for large far-clip-plane distances. But who uses a far clip plane that isn't basically infinite these days? The only case I can think of is corridor shooters, and 90% of your lights there are very close to the view frustum, so again - minimal gain from TSM (in a corridor shooter, apart from the omnidirectional-light problem, which is a different kettle of fish altogether, you can pretty easily get away with dumb BB shadowbuffers because the change in texel density is fairly moderate, unlike something like the sun outdoors).

So it seems like it's better than PSM for some cases, but doesn't solve any of the real-world cases (duelling frustums), and as far as I can see, pushing the far clip plane out to sensible distances causes it lots of trouble.

I also don't understand the videos - their PSM implementation seems to be performing terribly - far worse than I'd expect. There's no reason I can see that it should ever be _worse_ than the relatively dumb BB case. Another (rather more minor) quibble I have is that if one of the Cool Features about TSM is that the changes in mapping are continuous, so you might get aliasing, but you never get pops, then pick a buffer resolution that is low enough to show me the texels not-popping. They've picked one where I can't actually see any texels, so they could be popping every single frame and you'd never know it.
Has anyone actually tried this stuff in a real game? Or has the experience of PSMs put everyone off wacky projections for life? I know I'm a lot more wary about investing time in this stuff after that (hey - that's why I'm posting this message :-).

TomF.
|
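As a rough illustration of the "map a trapezoid to the unit square" step described in the message above, here is a minimal sketch of a generic 2D projective mapping. It is not the TSM paper's construction (which builds its matrix in a specific way and targets the full 4x4 light-space projection); it uses the standard closed-form square-to-quadrilateral homography and inverts it with the adjugate. The function names and the example trapezoid are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Mat3 = List[List[float]]


def square_to_quad(quad: Sequence[Point]) -> Mat3:
    """Projective matrix taking (0,0), (1,0), (1,1), (0,1) to the quad corners, in that order."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = quad
    dx1, dy1 = x1 - x2, y1 - y2
    dx2, dy2 = x3 - x2, y3 - y2
    sx, sy = x0 - x1 + x2 - x3, y0 - y1 + y2 - y3
    det = dx1 * dy2 - dx2 * dy1
    if det == 0.0:
        raise ValueError("degenerate quadrilateral")
    g = (sx * dy2 - dx2 * sy) / det
    h = (dx1 * sy - sx * dy1) / det
    return [
        [x1 - x0 + g * x1, x3 - x0 + h * x3, x0],
        [y1 - y0 + g * y1, y3 - y0 + h * y3, y0],
        [g, h, 1.0],
    ]


def adjugate(m: Mat3) -> Mat3:
    """Adjugate of a 3x3 matrix: the inverse up to scale, which is enough for a homography."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]


def quad_to_square(quad: Sequence[Point]) -> Mat3:
    """Projective matrix taking the quad corners to the unit square corners."""
    return adjugate(square_to_quad(quad))


def apply(m: Mat3, p: Point) -> Point:
    """Transform a 2D point by a 3x3 projective matrix, including the divide by w."""
    x, y = p
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (
        (m[0][0] * x + m[0][1] * y + m[0][2]) / w,
        (m[1][0] * x + m[1][1] * y + m[1][2]) / w,
    )


# Hypothetical trapezoid in light space: wide edge first, narrow edge last.
TRAPEZOID = [(-4.0, 0.0), (4.0, 0.0), (1.0, 3.0), (-1.0, 3.0)]
UNIT_SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

to_square = quad_to_square(TRAPEZOID)
for corner in TRAPEZOID:
    print(corner, "->", apply(to_square, corner))
```

The narrow edge of the trapezoid ends up spanning the full width of the square, which is the sense in which such a warp spends more texels on that end. Embedding this in a 4x4 matrix and handling depth is a separate step that this sketch does not attempt.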
From: J. W. R. <jra...@in...> - 2004-08-27 16:41:37
|
Here is a shameless request *on list*.

Tom, can you make a small sample application that demonstrates the technique in a general purpose way using DX9 and the .FX file interface? Or, is it just the same thing as what already comes with the DirectX 9 SDK?

I want to revise my physics demo app to have general purpose shadow maps at very high frame rate. Shadows are essential to dynamics simulations to make things look correct. Whenever I am running a demo on a flat ground plane it's cheaper to just do hack shadows. However, in more complex environments (which are what I am working on now) it would be most useful.

I previously implemented the shadow map demo that comes with DirectX 9C but I got terrible aliasing artifacts and the frame rate generally sucked ass. Since then I have run demos on my machine which do shadow mapping that ran at extremely high frame rate, so I am assuming there are other approaches and techniques available.

Since I plan to make my entire physics demo application available as free open source, this would benefit the community as a whole.
|
From: Guido de H. <gu...@gu...> - 2004-08-30 12:00:05
|
Have you seen the Light Space PSMs by Michael Wimmer et al.?

http://www.cg.tuwien.ac.at/research/vr/lispsm/

Looks promising. It doesn't solve your jitter problem, but it doesn't need all the special-case PSM stuff.

Guido
|