Thread: Re: [Algorithms] Terrain performance comparisons
From: De B. <be...@ii...> - 2003-07-24 11:39:01
Does anyone know of a paper where a number of terrain algorithms are compared objectively in terms of speed/quality? Most of the time I see a paper on, say, an extension to an algorithm, and it states how much extra fps the extension gained over the old algorithm. Quality usually isn't even measured. Are there any papers dedicated just to comparing terrain algorithms as objectively as possible? Although opinions on "which algorithms are best" would be useful, I am really looking for papers with objective measurements (opinions seem to vary way too much).

Here is the old original message from 2000 if you don't remember it:

> Concerning the speed of terrain LOD algorithms:
> If I made performance claims on my site, people would certainly complain that the results are entirely based on context (type of dataset, amount of camera motion) and that their implementation is really fast if I had only tested it their way :-)
> In fact, it's even harder than that to make meaningful comparisons, since some algorithms (e.g. LK, TV) enforce a global error metric, while others (ROAM, SM) enforce a polygon count target. The first kind will have framerates that vary widely, while the second will produce relatively stable performance.
> A lot of the discussion on the Algorithms list has been about how many triangles/second each implementation gets through the rendering pipeline, which seems to me a rather silly metric - it's fps at a level of perceived quality, NOT raw throughput, that is the value of a LOD algorithm.
> -Ben
> http://vterrain.org/

-Ryan De Boer
From: De B. <be...@ii...> - 2003-07-25 09:10:36
Basically I want to do real-time visualization of detailed landscapes from human height off the ground, like an FPS. I decided on heightmaps due to their simplicity, and am implementing a few terrain algorithms to compare their speed/quality in my situation. The terrain implementations are from Trent Polack's book at the moment. Plants are very important, and I have had much success with them as billboards that alpha-blend to a high-LOD model up close. I have also had success with multitexture splatting.

I may be calling each terrain segment a node and joining multiple of these together, hopefully obtaining a very large-scale terrain where each node can be designed independently. I got this idea originally from Dungeon Siege (http://www.dungeonsiege.com/su_216.shtml) and am looking forward to getting Game Programming Gems 2 for that article Tom mentioned. The only paper I could find on this was http://www.merl.com/papers/TR95-16a/ "Locales and Beacons: Efficient and Precise Support For Large Multi-User Virtual Environments" - a very good read, and many papers cite it.

Anyhow, now you know where my research is. I suspect ROAM 2 will give the best speed/quality since I don't see any pops in Trent's implementation, but I will try to take objective measurements. I know this is highly specific to my framework, but hopefully afterwards I can be more sure of which terrain algorithms perform better, and it might help me get a job.

OK, I may as well show you a picture. I know this shows more plants than terrain, but it will give you an idea of what I have achieved: http://members.iinet.net.au/~bertdb/ryan/trees/mygrass.jpg (Using Tom's ideas I can make the plants look even better, but with more passes; I will probably include that in my LOD system later.)

-Ryan De Boer
From: Chris B. \(BUNGIE\) <cbu...@mi...> - 2003-07-25 16:17:20
> -----Original Message-----
> From: gda...@li... [mailto:gdalgorithms-lis...@li...] On Behalf Of Jonathan Blow
> Sent: Friday, July 25, 2003 2:10 AM
> To: gda...@li...
> Subject: Re: [Algorithms] Terrain performance comparisons
>
> In fact, I am still waiting for people to understand that frame coherence
> is to be avoided in games whenever possible. I am going to submit a siggraph
> presentation on that (and some other things) for next year.

I would restate that a little. It seems to me that temporal coherence is extremely important in games, but the nature of the tasks being performed each frame varies extremely rapidly for anything _related to rendering the current view_.

We use temporal coherence successfully in lots of ways in Halo 2, but not for anything view-dependent. For example, all of your simulation tasks can take great advantage of temporal coherence, as can certain rendering tasks as well, like recomputing lighting coefficients for objects. But visibility from frame to frame clearly must be computed from scratch.

--
Chris Butcher
Networking & Simulation Lead
Halo 2 | Bungie Studios
bu...@bu...
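(To illustrate the split Chris describes, here is a minimal C++ sketch of exploiting temporal coherence for a view-independent quantity while rebuilding view-dependent work from scratch each frame. All names, types, and helper functions are hypothetical placeholders, not Halo 2 code.)

#include <vector>

struct Camera;   // view description -- details irrelevant here
struct Object {
    bool  lighting_dirty = true;  // set when the object or nearby lights move
    float sh_coeffs[9]  = {};     // cached lighting coefficients, view-independent
};

void recompute_lighting(Object&);                                              // placeholder, expensive
std::vector<Object*> compute_visibility(const Camera&, std::vector<Object>&);  // placeholder
void render(const std::vector<Object*>&);                                      // placeholder

void update_frame(std::vector<Object>& objects, const Camera& camera) {
    // Temporally coherent: lighting coefficients don't depend on the camera,
    // so reuse them and only recompute for objects flagged as changed.
    for (Object& obj : objects)
        if (obj.lighting_dirty) {
            recompute_lighting(obj);
            obj.lighting_dirty = false;
        }
    // View-dependent: the visible set is rebuilt from scratch every frame.
    render(compute_visibility(camera, objects));
}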
From: Jonathan B. <jo...@nu...> - 2003-07-25 18:49:35
> We use temporal coherence successfully in lots of ways in Halo 2, but
> not for anything view-dependent. For example all of your simulation
> tasks can take great advantage of temporal coherence, as can certain
> rendering tasks as well, like recomputing lighting coefficients for
> objects. But visibility from frame-to-frame clearly must be computed
> from scratch.

Well, there are two different levels to this. The visibility aspect definitely is more severe, and I agree with you on the "clearly" part there, and it annoys me that e.g. a lot of people working on terrain algorithms don't even think about this. The reason why I say "avoid frame coherence whenever possible" probably goes a lot like the viewpoint-specific case, but I figure, hey, I am here typing, I will explain it for anyone on the list who is wondering what we are talking about.

The effectiveness of frame coherence in general depends on how close the two frames are in whatever state space a specific algorithm cares about. Since a game is a simulation, these states almost always diverge over time. (For a viewpoint moving around, it's just that you have some velocity that you multiply by dt, so you have moved further; but for physics it might be that all your objects have moved, or that your points of contact are more likely to be different, or whatever.)

So there is a dependency between the frame time and the effectiveness of all frame-coherent algorithms. If there is a hiccup that causes one frame to go more slowly, the next frame is going to suffer and go more slowly as well (and this will keep propagating, through n frames). In the worst case, the dependency is so strong that the frame after the glitch is even *slower* than the glitch frame, in which case you get a sort of downward spiral effect and your frame rate plummets for a while.

This is pretty severe though, and much more common is just that these dependencies react to small changes in your frame rate, amplifying them. So the upshot is, you increase the variance of your frame rate. That is still a very bad thing, because we like level, solid-feeling frame rates. Frame-coherent algorithms, by definition, work against that.

In general, I characterize frame-coherent algorithms as having "interactive volatility" -- because of the realtime nature of the sim, they behave badly. Whereas for something like an offline batch renderer, they're fine; there is no constraint between realtime dt and simulation dt, so frame-coherent algorithms are a 100% win there. And that's the area in which the tradition of frame-coherent stuff arose: offline simulation and rendering, back before we had computers that would even do this stuff in realtime. But the problem is that people working on new algorithms today follow in this blind tradition of "frame coherence is 100% good". In fact it's one of the first weapons they pull from their holster when coming up with a new algorithm to do whatever. And that's a bad thing.

I'm not saying frame coherence is 100% bad, because the fact is that we don't have viable non-coherent alternatives for a lot of tasks. Physics is a good example. I just think people need to look carefully at their options and understand what they're doing.

-Jonathan.
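(A toy illustration of the dt-dependency described above - a hypothetical sketch, not from any real engine: if the amount of frame-coherent catch-up work grows with how far the state diverged during the previous frame, a single hiccup propagates over the following frames and frame-time variance increases.)

// Toy model of a frame-coherent workload: each frame's cost is a fixed base
// cost plus catch-up work proportional to how much the state diverged during
// the previous frame (i.e. proportional to the previous dt).  The feedback
// coefficient k decides whether a hiccup damps out or spirals.
#include <cstdio>

int main() {
    const double base_cost = 10.0;          // ms of coherence-free work per frame
    const double k         = 0.4;           // catch-up work per ms of previous frame time
    double dt = base_cost / (1.0 - k);      // steady-state frame time

    for (int frame = 0; frame < 10; ++frame) {
        double cost = base_cost + k * dt;   // coherent work depends on the last dt
        if (frame == 3) cost += 30.0;       // a one-off hiccup (disk seek, OS stall, ...)
        dt = cost;
        std::printf("frame %d: %.1f ms\n", frame, dt);
    }
    // With k < 1 the spike decays over several frames (increased variance);
    // with k >= 1 it never recovers -- the "downward spiral" case.
}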
From: Thatcher U. <tu...@tu...> - 2003-07-25 20:05:34
On Jul 25, 2003 at 01:49 -0500, Jonathan Blow wrote:

> I'm not saying frame coherence is 100% bad, because the fact is that
> we don't have viable non-coherent alternatives for a lot of tasks.
> Physics is a good example. I just think people need to look
> carefully at their options and understand what they're doing.

Yeah, I agree with this last bit. And I agree that fine-grained frame-coherence a la ROAM is mostly obsolete.

"Frame coherence is generally bad" might be misleading though, IMO. Modern high-performance computing depends entirely on memory hierarchies that require temporal and spatial coherence to work acceptably. Coherence in games is getting *more* important (although at a coarser grain), due to hard disks. CPUs/GPUs have gotten faster, disks have gotten much bigger (but not much faster), and DVDs have become a common distribution format. A realistic game needs to strictly limit the speed and mobility of the viewer in order to work at all, by dividing the world into levels separated by discrete loading events, or streaming sectors in the background, etc. No game can afford to randomly show a different part of the whole game-world in successive frames, unless that game-world is unacceptably small.

Anyway, I suspect I'm railing against something you didn't exactly say. I just want to make sure coherence (at a coarser level) gets its due, because IMO it's just about the most important aspect of terrain LOD.

As I see it, choosing a terrain LOD algorithm boils down to:

* how much detail?
* how big?
* how fast can the viewpoint move?
* how are you going to manage that amount of data?
* does the terrain change?

and everything else falls out of that. My cookbook in a nutshell: if the amount of data is relatively small (i.e. fits in RAM), then brute force, ROAM or its relatives, or geomipmapping are probably what you want. The minute you start positing expensive procedural stuff, or large disk-based datasets, then coherence becomes extremely important and you ought to think about chunking. If you can tolerate limited view distance, then forget LOD; just string together portal-connected areas, or use a wall of fog within the radius of a couple of chunks. If you need big aerial vistas then you should think about hierarchical chunking.

--
Thatcher Ulrich
http://tulrich.com
From: Jonathan B. <jo...@nu...> - 2003-07-25 21:18:59
> A realistic game needs to
> strictly limit the speed and mobility of the viewer in order to work
> at all, by dividing the world into levels separated by discrete
> loading events, or streaming sectors in the background, etc. No game
> can afford to randomly show a different part of the whole game-world
> in successive frames, unless that game-world is unacceptably small.
>
> Anyway, I suspect I'm railing against something you didn't exactly
> say. I just want to make sure coherence (at a coarser level) gets its
> due, because IMO it's just about the most important aspect of terrain
> LOD.

Right, and I agree with that. At least in the current computing paradigm, streamed loading is here to stay.

I think there's a big stability difference, though, between the kinds of cases we're talking about. Something like Chunk LOD requires a certain kind of coherence that translates into a cap on the viewpoint speed. But as long as you stay within that cap, it's extremely stable. I can insert a function into your Chunk LOD demo that says Sleep(rand() % 50) and your demo doesn't give a crap, it still runs like a rock (aside from, of course, the jitteriness caused by the sleep). That kind of thing I have absolutely no problem with. Also, the speed cap is very, very high.

But if you put Sleep(rand() % 50) into something like full ROAM, you are really asking for trouble. And in general the speed cap is a whole lot lower, which isn't cool.

So maybe "frame coherence" isn't a precise enough term to differentiate between these two cases. Maybe a different phrase, like "timestep sensitivity" or something. I think it's true to say "a lot of popularly espoused frame coherence optimizations are dangerously timestep-sensitive." So for something like terrain LOD, I would say, there are known timestep-insensitive ways of doing it, and if you choose something that has a timestep sensitivity, you really ought to have a good reason for that. In something like physics, we just don't know how to solve the problem in a way that isn't timestep-sensitive, and that really kind of sucks.

It's a difficult issue because if you drill down far enough, everything has a timestep sensitivity. Somewhere, though, there's a very wide, very fuzzy line, and everything on one side of the line is stable, and everything on the other side is openly questionable, and everything in the middle is, well, I don't know.

-Jonathan.
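(A concrete version of the stress test described above - a hypothetical sketch; the frame loop and engine hooks are illustrative placeholders, not code from either demo. The idea is simply to inject random stalls into the main loop and watch whether the LOD system merely shows the stall or amplifies it over the following frames.)

// Stress test for timestep sensitivity: add random dead time each frame.
// A timestep-insensitive renderer (e.g. a chunked scheme) shows only the
// stall itself; a strongly timestep-sensitive one gets worse afterwards.
#include <cstdlib>
#include <windows.h>   // for Sleep(); use usleep()/nanosleep() on other platforms

void update_simulation(double dt);   // assumed engine hooks -- placeholders only
void update_terrain_lod(double dt);  // the system under test
void render();

void run_frame_with_random_hiccup(double dt) {
    Sleep(rand() % 50);        // simulate a glitch: 0-49 ms of dead time

    update_simulation(dt);
    update_terrain_lod(dt);
    render();
}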
From: Lucas A. <ack...@ll...> - 2003-07-25 22:36:05
Thatcher Ulrich wrote:

> On Jul 25, 2003 at 01:49 -0500, Jonathan Blow wrote:
>
>> I'm not saying frame coherence is 100% bad, because the fact is that
>> we don't have viable non-coherent alternatives for a lot of tasks.
>> Physics is a good example. I just think people need to look
>> carefully at their options and understand what they're doing.
>
> Yeah, I agree with this last bit. And I agree that fine-grained
> frame-coherence a la ROAM is mostly obsolete.

Agreed, no one is advocating fine-grained approaches anymore, and chunk-based output is one of the things that ROAM 2.0 utilizes. It is still a good underlying methodology though. The incremental optimization approach works beautifully on coarser chunks, and bumping the chunk resolution up is trivial.

I think the important observation is that on short enough intervals the view-dependent contribution to the refinement metric diminishes, so local view-independent chunks fit well. There are more sophisticated ways to make chunk-internal geometry, however, such as using VIPM on the chunk interiors (one of the many things Mark D is trying).

-Lucas
From: Mark D. <duc...@ll...> - 2003-07-26 00:07:36
I will toss in one message to the terrain opinion fest:

-- Jonathan's idea about coherence-exploitation leading to a cycle of death completely forgets the idea of "split-merge until time's up" that was in the original ROAM paper.

-- If you want 60Hz absolutely solid, the ONLY known method (that scales to big terrain and has any claims of accuracy) is ROAM-like systems with the "time's up" option, as far as I know (any other contenders?). I don't count "threshold feedback" schemes -- they are not guaranteed solid (without, ehem, guarantees of frame coherence!).

-- Chunking is critical to eliminating CPU work. The latest ROAMs spend all their time creating fractal detail, not doing CLOD "thinking".

-- Chunking can be dynamic. The Linux ROAM 2 work in progress produces detail on the fly, and presumably could re-produce it as needed if you change the terrain. Of course tossing something like ocean waves into a terrain engine will be a little awkward... but the computation of the waves will likely dominate the CPU anyway (do it on the GPU???). You still need chunk CLOD for a big ocean. BTW, is "chunk CLOD" an oxymoron? ;-)

-- Progressive vertex arrays are way cool as replacements for ROAM "triangles". I'm getting 100M tri/sec with PVAs as the primitives in my latest experimental ROAM-like renderer with sliding progression on a Radeon 9700 Pro.

-- Split-merge dual queues and PVAs are great for very wacky geometry, e.g. isosurfaces of 3D fully developed turbulent fluid flow. Maybe you can toss your whole game world into a 3D hierarchy (think tetrahedron bintree) with PVAs per tet. That's where I'm headed...

Cheers,

--Mark D
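(A minimal sketch of the "split-merge until time's up" idea from the ROAM paper as described above - the names and structure here are illustrative, not from any ROAM release: refinement works from priority queues, most important change first, and simply stops when the frame budget is spent, so the frame cost stays bounded even when coherence breaks down.)

// Time-budgeted split/merge refinement with a priority queue: process the
// highest-priority splits until the per-frame budget runs out; whatever is
// left simply waits for later frames.
#include <chrono>
#include <queue>
#include <vector>

struct Node { float priority; /* bintree triangle / diamond -- details omitted */ };
struct ByPriority {
    bool operator()(const Node* a, const Node* b) const { return a->priority < b->priority; }
};
using SplitQueue = std::priority_queue<Node*, std::vector<Node*>, ByPriority>;

void split(Node*);   // placeholder: perform the split, push children, update queues

void refine_until_times_up(SplitQueue& split_queue, double budget_ms) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    auto elapsed_ms = [&] {
        return std::chrono::duration<double, std::milli>(clock::now() - start).count();
    };
    // Largest screen-space error first; stop when the frame budget is spent.
    // A symmetric merge queue keeps the triangle count bounded and is
    // processed the same way (omitted here).
    while (!split_queue.empty() && elapsed_ms() < budget_ms) {
        Node* worst = split_queue.top();
        split_queue.pop();
        split(worst);
    }
}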
From: Jonathan B. <jo...@nu...> - 2003-07-26 00:26:00
> -- Jonathan's idea about coherence-exploitation leading to
> a cycle of death completely forgets the idea of
> "split-merge until time's up" that was in the original
> ROAM paper.

I was not forgetting; I just don't accept the answer. But again, the cycle of death is not my main point; it's a dramatic extreme-case illustration of my main point, which is just the dt-dependency.
From: Tom F. <tom...@bl...> - 2003-07-26 12:33:02
> -- progressive vertex arrays are way cool as replacements for
> ROAM "triangles". I'm getting 100M tri/sec with PVAs as the
> primitives in my latest experimental ROAM-like renderer with
> sliding progression on a radeon 9700 pro.

If anyone can hear me over the noise :-), I'd love to hear the gritty details of this.

TomF
From: Andras B. <bn...@ma...> - 2003-07-26 18:04:48
For really large and very detailed terrains, data access will always be the bottleneck. One solution is adding procedural detail on the fly. Considering that procedural detail is usually some kind of low-amplitude noise, maintaining a small projected error for it doesn't seem to be a huge problem. Do we really need to throw millions of triangles at it?

It seems to me that everybody is concerned only about geometry, trying to push as many triangles to the GPU as possible, but I think the real problem lies in correctly lighting this detailed geometry at every pixel! If your terrain is precisely shaded in a way that reflects the true geometry, then you wouldn't need nearly as many triangles.

Just my 2c,
Bandi
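(One way to read Bandi's point, sketched under assumptions of my own - the noise function and scales are made up: if the added detail is procedural, its normals can be derived directly from the same function and baked into a normal map, so per-pixel shading reflects the detail geometry without tessellating it.)

// Bake a detail normal map from a procedural height function using central
// differences, so lighting reflects the fine geometry per pixel rather than
// per vertex.  detail_height() stands in for whatever noise you actually use.
#include <cmath>
#include <vector>

float detail_height(float x, float y);   // assumed low-amplitude noise function

struct Vec3 { float x, y, z; };

std::vector<Vec3> bake_detail_normals(int size, float world_step) {
    std::vector<Vec3> normals(size * size);
    const float e = world_step;           // sample spacing for the differences
    for (int j = 0; j < size; ++j)
        for (int i = 0; i < size; ++i) {
            float x = i * world_step, y = j * world_step;
            float dx = (detail_height(x + e, y) - detail_height(x - e, y)) / (2 * e);
            float dy = (detail_height(x, y + e) - detail_height(x, y - e)) / (2 * e);
            Vec3 n = { -dx, -dy, 1.0f };                       // heightfield gradient -> normal
            float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
            normals[j * size + i] = { n.x / len, n.y / len, n.z / len };
        }
    return normals;
}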
From: Mark D. <duc...@ll...> - 2003-07-26 21:08:16
Hi Tom,

Erhm, this really needs a whole paper to explain (which I'm working on), but here's an outline of the ideas...

I should first note that what I'm about to describe is for the "wacky general geometry" case... terrain with triangle bintrees is much simpler.

The progressive vertex array implementation I'm putting together is based on many of the ideas you and Charles Bloom and others on this list have tossed around about VIPMs with sliding windows, dealing with vertex cache coherence, and dealing with seams nicely when chunking. My first implementation assumes a completely static set of geometry and takes a long time to make really coherent, high-quality progressions into a single vertex+index array per chunk. I use queue-based edge collapse a la memoryless simplification (Lindstrom's work) but allow edge collapses that create non-manifold geometry (in effect allowing arbitrary amounts of collapse, not limited by the genus of the surface).

Collapses are organized into a spatial 3D (volumetric) hierarchy (specifically the 3D extension of ROAM "diamonds"), where the collapses can affect only the triangles/vertices/edges on the interior of the 3D cells. To avoid problems like in Hoppe's vis98 terrain paper, which keeps the mesh on cell boundaries at fullest resolution, the cells don't nest like octrees but rather alternate as with ROAM diamonds. In 3D there are three phases of diamond cells, so you get collapses for any edge in at least two out of three phases of collapses; thus you end up with no edges getting "stuck" at full resolution. For 2D cell hierarchies that are not too deep (e.g. Hoppe's terrain examples), keeping edges at full resolution sort of works okay. But in 3D, for general geometry, you are hosed (I tried it...). The alternating boundaries are critical.

Within each 3D diamond you get a progression that is put into a single vertex+index array (not stripped, but very vertex-cache coherent). There is a kind of priority queue to order the triangles in cache-coherent order, while at the same time re-ordering the progression a bit (in essence clustering some of the collapses into a single neighborhood collapse involving a user-specified maximum number of triangles, and also walking these "clusters" in vertex-cache-coherent order). The collapse dependencies are used to figure out the clustering. Lots more details here.

At this point you have a hierarchy of sliding-window vertex arrays (PVAs), with ROAM-like continuity at the diamond boundaries and no crack fixups/flanges/etc. needed. I generally have thousands to tens of thousands of vertices per 3D diamond at the moment (haven't really tuned this yet). Vertex counts are reduced by a modest factor at each phase (dividing the count by the cube root of four is a good theoretical choice for smooth geometry).

On the macro scale, ROAM split-merge is performed on the diamonds. Frustum culling is done a la the ROAM paper but only on the octree cell phases, where you have proper nesting. The interesting details here are how to quickly get the slider (progression window start/end indices) in the split-merge process.

I use the OpenGL ATI object buffer extensions under Linux on the Radeon 9700 Pro. For a huge computational fluid dynamics isosurface representing the mixing boundary of two gases undergoing violent (explosion-induced) turbulence, i.e. a REALLY nasty surface, I am getting a solid 100M tri/sec with essentially no CPU work per frame (from the app anyway -- I can't speak for the drivers). This surface has depth complexity 50 on average, so I really need to minimize the window size to get this performance (we haven't gotten the occlusion culling for this finished yet -- that's another big project!).

My goal is to get this written up by the end of the year with a full demo. The code is largely all there, enough to get numbers and show early demos.

Cheers,

--Mark D.

Tom Forsyth wrote:

> > -- progressive vertex arrays are way cool as replacements for
> > ROAM "triangles". I'm getting 100M tri/sec with PVAs as the
> > primitives in my latest experimental ROAM-like renderer with
> > sliding progression on a radeon 9700 pro.
>
> If anyone can hear me over the noise :-), I'd love to hear the gritty
> details of this.
>
> TomF
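(A rough sketch of what "sliding-window progressive vertex array" rendering can look like at draw time - my own illustrative code, not Mark's implementation; the buffer-binding and draw calls are placeholders for whatever API is in use, e.g. the ATI object buffer extensions he mentions, or plain VBOs.)

// Each chunk/diamond stores one static vertex buffer and one index buffer
// ordered by the collapse progression; selecting a LOD just means choosing a
// window (first index, index count, active vertex count) into those buffers,
// so there is no per-frame mesh editing on the CPU.
#include <cstdint>
#include <vector>

struct LodWindow {
    uint32_t first_index;   // where this level's window starts in the index buffer
    uint32_t index_count;   // indices drawn at this level
    uint32_t vertex_count;  // active vertices at this level (a prefix of the VB)
};

struct PvaChunk {
    unsigned vbo = 0, ibo = 0;          // GPU buffer handles, created at load time
    std::vector<LodWindow> levels;      // coarse -> fine
};

void bind_buffers(unsigned vbo, unsigned ibo, uint32_t vertex_count);       // placeholder
void draw_indexed_triangles(uint32_t first_index, uint32_t index_count);    // placeholder

void draw_chunk(const PvaChunk& chunk, int level) {
    const LodWindow& w = chunk.levels[level];
    bind_buffers(chunk.vbo, chunk.ibo, w.vertex_count);
    draw_indexed_triangles(w.first_index, w.index_count);
}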
From: Tom F. <tom...@bl...> - 2003-07-26 22:50:35
> From: Mark Duchaineau
>
> Hi Tom,
>
> Erhm, this really needs a whole paper to explain (which I'm working
> on), but here's an outline of the ideas...

I'll try to keep up :-)

> I should first note that what I'm about to describe is for the
> "wacky general geometry" case... terrain with triangle
> bintrees is much simpler.

Ah, explosion isosurfaces. Yep, wacky alright. I don't think us game guys have nearly that sort of nastiness. Hmmm... gives me an idea for a game :-)

> The progressive vertex array implementation I'm putting together is
> based on many of the ideas you and Charles Bloom and others on this
> list have tossed around about VIPMs with sliding windows, dealing
> with vertex cache coherence, and dealing with seams nicely when
> chunking. My first implementation assumes a completely static set
> of geometry and takes a long time to make really coherent and high
> quality progressions into a single vertex+index array per chunk.
> I use queue-based edge collapse a la memoryless simplification
> (Lindstrom's work) but allowing edge collapses that create
> non-manifold geometry (in effect allowing arbitrary amounts of
> collapse, not limited by the genus of the surface).

Daring! But then you can't simply smack your artists around until they clean up the geometry.

> Collapses are organized into some kind of spatial 3D (volumetric)
> hierarchy (specifically the 3D extension of ROAM "diamonds"),
> where the collapses can affect only the triangles/vertices/edges
> on the interior of the 3D cells. To avoid problems like in
> Hoppe's vis98 terrain paper, which keeps the mesh on cell
> boundaries at fullest resolution, the cells don't nest like
> octrees but rather alternate as with ROAM diamonds. In 3D there
> are three phases of diamond cells, so you get collapses for any
> edge in at least two out of three phases of collapses, thus you
> end up with no edges getting "stuck" at full resolution. For 2D
> cell hierarchies that are not too deep (e.g. Hoppe's terrain
> examples), keeping edges at full resolution sort-of works okay.
> But in 3D for general geometry you are hosed (I tried it...).
> The alternating boundaries are critical.

Now this IS a cool insight. This could work just as well on 2D grids, and avoids all the mucking around with flanges and fins. I still like the "no crosstalk" purity of flanges/fins, which makes each chunk completely isolated, but in practice I don't think it's actually that big a win as long as your chunks are sensibly large, and it doesn't create the sort of topological constraints that flanges/fins have.

> Within each 3D diamond you get a progression that is put
> into a single vertex+index array (not stripped, but
> very vertex-cache coherent).

Strips are overrated anyway. Modern hardware does indexed lists just as well. I use whichever uses the fewest indices, since a lot of hardware has to have indices copied over by the CPU, and even if they are handled natively, it's still often got to go over the AGP bus. If it's at all a hassle to use strips, I don't - life is too short and branches too easy to mispredict. :-)

> There is a kind of priority
> queue to order the triangles in cache coherent order,
> while at the same time re-ordering the progression a
> bit (in essence clustering some of the collapses into
> a single neighborhood collapse involving a user-specified
> maximum number of triangles, and also walking these "clusters"
> in vertex cache coherent order). The collapse dependencies
> are used to figure out the clustering. Lots more details
> here.

I can guess some of them. Using a sliding window places restrictions on what you can collapse when, while avoiding switching buffers completely - I can see that adding a vertex-coherency measure in there could tweak the order even more. Though I'm getting 140Mtris/sec on my R9700 here, so I'm not displeased with the results so far.

[snip]

> I generally have thousands to tens of thousands of vertices
> per 3D diamond at the moment (haven't really tuned this yet).

Feels about right.

> Vertex counts are reduced by a modest factor at each
> phase (dividing the count by the cube root of four is a good
> theoretical choice for smooth geometry).

I get about this factor when doing manifold collapses - it's a decent balance between doing out-of-optimal-order collapses and having to switch index sub-buffers. One slightly worrying thing is that my index buffers are now about the size of my vertex buffers for simple vertices, but if you start to put any sort of tangent-space information in the vertices it's back up to little-and-large levels again, so I don't think it's too significant.

> On the macro scale, ROAM split-merge is performed on the
> diamonds. Frustum culling is done ala the ROAM paper
> but only on the octree cell phases, where you have
> proper nesting. The interesting details here are how
> to quickly get the slider (progression window start/end
> indices) in the split-merge process quickly.

I just use a big array - it's only a 64-bit entry: [32-bit start of indices (my index buffers get big!), 16-bit vertex count, 16-bit index count] for each level. It's a cache miss, but doing a lot of maths is a branch predict miss. There's life in the old LUT yet :-)

> I use OpenGL ATI object buffer extensions under linux
> on the radeon 9700 pro. For a huge computational
> fluid dynamics isosurface representing the mixing boundary
> of two gases undergoing violent (explosion-induced)
> turbulence, i.e. a REALLY nasty surface, I am getting
> a solid 100M tri/sec with essentially no CPU work
> per frame (from the app anyway -- I can't speak for
> the drivers).

The drivers on the 9700 should be happily throwing commands straight at the card with very little work unless you're changing other pipeline state. Other cards will need to copy the indices from your buffer to the card's command buffer, so there will be some work there, but still not too heavy - really just a memcpy to AGP memory.

> This surface has depth complexity 50
> on average, so I really need to minimize the window size
> to get this performance (we haven't gotten the
> occlusion culling for this finished yet -- that's
> another big project!).
>
> My goal is to get this written up by the end of the year
> with a full demo. The code is largely all there,
> enough to get numbers and show early demos.

Cool. Many thanks for the info. I like that alternating-pattern idea.

> Cheers,
>
> --Mark D.
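(Tom's per-level lookup entry, written out as a minimal sketch - the field names are mine and 16-bit counts obviously cap each level at 65535 vertices/indices per chunk, which is consistent with the chunk sizes discussed above.)

// One 64-bit LUT entry per LOD level:
// [32-bit start of indices][16-bit vertex count][16-bit index count].
#include <cstdint>

struct LodEntry {
    uint32_t index_start;    // offset into the (large) index buffer
    uint16_t vertex_count;   // vertices referenced at this level
    uint16_t index_count;    // indices to draw at this level
};
static_assert(sizeof(LodEntry) == 8, "one 64-bit entry per level");

// Per-frame selection is then a plain array lookup -- one potential cache
// miss, but no branchy arithmetic:
inline LodEntry select_level(const LodEntry* table, int level) {
    return table[level];
}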
From: Jonathan B. <jo...@nu...> - 2003-07-30 02:44:27
Mark Duchaineau wrote:

> For a huge computational fluid dynamics isosurface representing the
> mixing boundary of two gases undergoing violent (explosion-induced)
> turbulence, i.e. a REALLY nasty surface, I am getting a solid 100M
> tri/sec with essentially no CPU work per frame (from the app anyway --
> I can't speak for the drivers). This surface has depth complexity 50
> on average, so I really need to minimize the window size to get this
> performance (we haven't gotten the occlusion culling for this finished
> yet -- that's another big project!).

That sounds really cool, actually. Are you able to share this data set? I'd like to do some experiments with something like that.

-Jonathan.
From: Mark D. <duc...@ll...> - 2003-08-01 07:29:09
Hi Jonathan,

The dataset I mention is available to the public, in the sense that it has been approved for release. It is too big to put on the little servers we have accessible to the public, however, at least for all the timesteps (total data is 274*2048*2048*1920 bytes, i.e. 2.4 terabytes). A single timestep gzips down to about 2 gig. I'll see if I can get that onto our web site. I can ftp it to whoever asks, but I can't handle a lot of requests by hand like that.

If someone has a lot of server disk space and is willing to host the dataset, that might be a good solution. The company hosting my cognigraph.com site has severe limits on disk space (I may put up my own server eventually to overcome this).

The easiest thing to hand out (size-wise) is the raw brick of field data and some code to extract an isocontour surface. The surfaces end up bigger than the original brick data! Beware that you end up with about half a billion triangles for a single isocontour at the final timestep. All our code works in chunks, never on the whole dataset at once in memory.

Here is some info on the dataset:

snapshot at late time, volume rendered:
http://www.llnl.gov/CASC/asciturb/pubs/html-ppt-134237/sld012.htm

slides on the science and the computational effort:
http://www.llnl.gov/CASC/asciturb/pubs/html-ppt-134237/sld043.htm

Cheers,

--Mark

Jonathan Blow wrote:

> That sounds really cool, actually. Are you able to share this
> data set? I'd like to do some experiments with something
> like that.
>
> -Jonathan.
From: Jonathan B. <jo...@nu...> - 2003-08-01 17:57:22
> The dataset I mention is available to the public, in the sense
> that it has been approved for release. It is too big to
> put on the little servers we have accessible to the public, however,
> at least for all the timesteps (total data is 274*2048*2048*1920
> bytes, i.e. 2.4 terabytes). A single timestep gzips
> down to about 2 gig. I'll see if I can get that on
> our web site. I can ftp to whoever asks, but I can't
> handle a lot of requests by hand like that.

Some 2-gig files I can put on my server. 2.4TB is too much for me, though. So if you can ftp me an interesting frame or two, I will make a little web page that exports the data to everyone. I'll talk to you off-list about the ftp.

> The easiest thing to hand out (size-wise)
> is the raw brick of field data and some code to
> extract an isocontour surface.
> The surfaces end up bigger than the original brick data!

Not too surprising. I think everyone is realizing that triangle meshes are actually somewhat inefficient for high-res data, not that anyone has a clearly better idea though.

-Jonathan.
From: Lucas A. <ack...@ll...> - 2003-07-30 22:13:17
Jonathan Blow wrote:

> Mark Duchaineau wrote:
>
>> For a huge computational fluid dynamics isosurface representing the
>> mixing boundary of two gases undergoing violent (explosion-induced)
>> turbulence, i.e. a REALLY nasty surface, I am getting a solid 100M
>> tri/sec with essentially no CPU work per frame (from the app anyway --
>> I can't speak for the drivers). This surface has depth complexity 50
>> on average, so I really need to minimize the window size to get this
>> performance (we haven't gotten the occlusion culling for this finished
>> yet -- that's another big project!).
>
> That sounds really cool, actually. Are you able to share this
> data set? I'd like to do some experiments with something
> like that.
>
> -Jonathan.

I'm not informed as to its availability (and have learned not to ask certain kinds of questions around here). There are a couple of pages with some background info and lots of pictures:

http://www.llnl.gov/CASC/asciturb/simulations.shtml

and movies:

http://www.llnl.gov/CASC/asciturb/movies.html

It's one monster of an isosurface. Mark does have a butchered version that can be rendered (static LOD chunk hierarchy) in real time on a laptop.

-Lucas
From: Lucas A. <ack...@ll...> - 2003-07-25 22:18:56
I received Jon's second comment immediately after sending my reply. Funny, that.

Jonathan Blow wrote:

[snip]

> So there is a dependency between the frame time and the effectiveness
> of all frame-coherent algorithms. If there is a hiccup that causes one
> frame to go more slowly, the next frame is going to suffer and go
> more slowly as well (and this will keep propagating, through n frames).
> In the worst case, your dependency is so strong that the frame after
> the glitch is even *slower* than the glitch frame, in which case you
> get a sort of downward spiral effect and your frame rate plummets
> for a while.
>
> This is pretty severe though, and much more common is just that
> these dependencies react to small changes in your frame rate, amplifying
> them. So the upshot is, you increase the variance of your frame rate.
> That is still a very bad thing, because we like level, solid-feeling frame
> rates. Frame-coherent algorithms, by definition, work against that.
>
> In general, I characterize frame-coherent algorithms as having
> "interactive volatility" -- because of the realtime nature of the sim, they
> behave badly. Whereas for something like an offline batch renderer,
> they're fine; there is no constraint between realtime dt and simulation
> dt, so frame coherent algorithms are a 100% win there. And that's
> the area in which the tradition of frame-coherent stuff arose: offline
> simulation and rendering, back before we had computers that would
> even do this stuff realtime. But the problem is that people working
> on new algorithms today follow in this blind tradition of "frame
> coherence is 100% good". In fact it's one of the first weapons they
> pull from their holster when coming up with a new algorithm to do
> whatever. And that's a bad thing.

I was expecting a reappearance of the "cycle of death" argument, and I just don't buy it. Incremental algorithms do not require that you always process all the possible work in a given frame. As I noted:

> ROAM can quite easily accommodate per-frame time constraints due to the
> nature of the dual-queue optimizing process: the work happens in order
> of most important to least important changes, so stopping the process
> early is OK (as the linear cost of incremental work brings diminishing
> returns), and successive frames will pick up where it left off to fill
> in the details when the view stabilizes again and coherence increases.

If you need framerate stability, that's no reason to ditch coherence altogether. The incremental approach can still be a big win. Some of Mark D.'s more recent 2.0-flavor implementations don't even update all priorities every frame (in addition to queuing pending mesh edits, client-server update style), so the queue updates are themselves queued. It makes for a tight pipeline, and it's really slick.

-Lucas
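(A sketch of the amortized-update idea Lucas mentions - purely illustrative, not Mark D.'s code: only a slice of the queue priorities is recomputed each frame, so priority maintenance itself has a fixed per-frame cost and slightly stale priorities are corrected incrementally over the following frames.)

// Round-robin priority recomputation: walk a fixed number of queue entries
// per frame instead of recomputing every priority every frame.
#include <cstddef>
#include <vector>

struct QueueEntry { float priority; /* plus whatever the mesh element needs */ };

class AmortizedPriorityUpdater {
public:
    explicit AmortizedPriorityUpdater(std::size_t per_frame) : per_frame_(per_frame) {}

    void update_some(std::vector<QueueEntry>& entries, const float* view_params) {
        if (entries.empty()) return;
        for (std::size_t n = 0; n < per_frame_; ++n) {
            cursor_ = (cursor_ + 1) % entries.size();
            entries[cursor_].priority = compute_priority(entries[cursor_], view_params);
        }
    }

private:
    static float compute_priority(const QueueEntry&, const float*); // placeholder error metric
    std::size_t per_frame_;
    std::size_t cursor_ = 0;
};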
From: Tom F. <to...@mu...> - 2003-07-29 14:01:02
If time is short, try Greg Snook's version in Game Programming Gems 2. It is startlingly simple, and gets you easily 90% of the benefit of any of the other routines over having no LOD, and it's very quick. Because it is so simple (i.e. dumb) it works fine with dynamic terrains.

Tom Forsyth - Muckyfoot bloke and Microsoft MVP.

This email is the product of your deranged imagination, and does not in any way imply existence of the author.

> -----Original Message-----
> From: Pierre Terdiman [mailto:p.t...@wa...]
> Sent: 29 July 2003 23:17
> To: gda...@li...
> Subject: Re: [Algorithms] Terrain performance comparisons
>
> > I'm personally in the Chunked-LOD camp myself. It's flexible,
> > easy-to-implement and speedy. The only problem I can foresee with the
> > algorithm is that I think using a dynamic dataset (ala TreadMarks) may not
> > be feasible. I haven't played around with it much myself, so there's not
> > much to back that statement up with other than mere speculation. I think it
> > would be possible to use a dynamic dataset, but it would probably require a
> > rather heavily modified implementation of the algorithm. This is though, as
> > I said, mere speculation.
>
> I would be interested to hear more opinions about this (esp. Thatcher's one,
> of course). I need a dynamic terrain for a project. Time is short. I'm a bit
> familiar with Thatcher's chunk-LOD implementation, but maybe not enough to
> imagine all required modifications.
>
> It looks like it's possible since the code already performs vertex-morphing
> all the time, but maybe I'm missing one or two difficulties.
>
> Pierre
From: Pierre T. <p.t...@wa...> - 2003-07-29 16:55:19
> If time is short, try Greg Snook's version in Game Programming Gems 2. It
> is startlingly simple, and gets you easily 90% of the benefit of any of the
> other routines over having no LOD, and it's very quick. Because it is so
> simple (i.e. dumb) it works fine with dynamic terrains.

Going to try that ASAP.

> On the other hand, if you're talking localized craters and stuff like that,

Exactly.

> one approach would be to modify verts in memory, and remember
> where your craters are, so that when you re-load a chunk from disk,
> you can re-apply any vert mods.

Yes, that's what I wanted to do. I just wonder how it's going to scale when there are craters all over the place. Hmm.

> There are some problems with this as well; e.g. if you drop a bomb
> onto a flat area, there may not be many verts there, so the crater
> would look bad.

I didn't even think of this! OK, seems like I'm in for the easier GPG2 solution then. (Not much time to investigate.)

Thanks,
Pierre
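(A sketch of the "remember the craters, re-apply on chunk reload" approach quoted above - names and the crater shape are illustrative, not from ChunkLOD: each deformation is stored as a small record, and when a chunk streams in, the records that overlap it are replayed before its render mesh is rebuilt.)

// Persistent terrain deformation by replay.  The scaling concern raised in
// the thread is real: cost grows with craters-per-chunk, so in practice you
// would query a per-chunk spatial index rather than the full crater list.
#include <cmath>
#include <vector>

struct Crater { float cx, cz, radius, depth; };

struct Chunk {
    float origin_x, origin_z, spacing;   // world placement of the height grid
    int   size;                          // vertices per side
    std::vector<float> heights;          // size * size samples
};

void apply_crater(Chunk& c, const Crater& cr) {
    for (int j = 0; j < c.size; ++j)
        for (int i = 0; i < c.size; ++i) {
            float x = c.origin_x + i * c.spacing;
            float z = c.origin_z + j * c.spacing;
            float d = std::sqrt((x - cr.cx) * (x - cr.cx) + (z - cr.cz) * (z - cr.cz));
            if (d < cr.radius) {
                float t = 1.0f - d / cr.radius;            // 1 at the centre, 0 at the rim
                c.heights[j * c.size + i] -= cr.depth * t * t;
            }
        }
}

void on_chunk_loaded(Chunk& c, const std::vector<Crater>& all_craters) {
    for (const Crater& cr : all_craters)   // in practice, only the overlapping ones
        apply_crater(c, cr);
    // ...then rebuild this chunk's LOD meshes and normals.
}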
From: Stefan M. <me...@sk...> - 2003-07-29 18:50:42
> > If time is short, try Greg Snook's version in Game Programming Gems 2. It
> > is startlingly simple, and gets you easily 90% of the benefit of any of the
> > other routines over having no LOD, and it's very quick. Because it is so
> > simple (i.e. dumb) it works fine with dynamic terrains.
>
> Going to try that ASAP.

Hi,

we've used that LOD for terrain in Race Tracks Unlimited (unpublished, but very nice looking ;) ). But I would recommend writing some code for the index table generation. In Gems 2, static tables are used. As I needed to be able to quickly test the speed hit of different chunk sizes, I wrote automatic table generation for that algorithm. If you drop me a line, I could send you the (ugly) code.

My email: metron at skynet dot be

Kind regards,
Stefan
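(For readers without the book: the Gems 2 scheme renders each terrain chunk from precomputed index tables, one per LOD step, that simply skip vertices. Below is a sketch of generating such a table for a regular-grid chunk - my own illustrative code, not Snook's or Stefan's, and it ignores the crack-fixing needed between chunks at different levels.)

// Generate an index table for one LOD level of a square heightmap chunk:
// at step s, only every s-th vertex in each direction is used, so the
// triangle count drops by roughly s*s.
#include <cstdint>
#include <vector>

std::vector<uint16_t> build_chunk_indices(int verts_per_side, int step) {
    std::vector<uint16_t> indices;
    for (int z = 0; z + step < verts_per_side; z += step)
        for (int x = 0; x + step < verts_per_side; x += step) {
            uint16_t i00 = static_cast<uint16_t>( z         * verts_per_side + x);
            uint16_t i10 = static_cast<uint16_t>( z         * verts_per_side + x + step);
            uint16_t i01 = static_cast<uint16_t>((z + step) * verts_per_side + x);
            uint16_t i11 = static_cast<uint16_t>((z + step) * verts_per_side + x + step);
            // two triangles per quad
            indices.insert(indices.end(), { i00, i01, i10 });
            indices.insert(indices.end(), { i10, i01, i11 });
        }
    return indices;
}

// Usage: for a 17x17 chunk, precompute tables for steps 1, 2, 4, 8 once, then
// pick one per chunk per frame based on distance or screen-space error.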
From: Tony C. <to...@mi...> - 2003-07-29 16:59:41
This is all true, but the guy who is working on the renderer wants to make the best renderer they can. They're not in a position to fix many of the other issues you mention. So it seems perfectly reasonable to have the discussion about how to make that renderer as good as you can make it.

For example, I know that the biggest problem with my game right now is some issues in AI. Does that mean that I tell my graphics dev that he should stop tuning the renderer and get up to speed on the AI code? Of course not.

It's definitely true that you should focus your resources on what gives you the biggest customer impact, but it's also true that in a reasonably sized team there is going to be some partitioning of skills and expertise. So, if you can't bring all of your resources to bear on your top problem, you do the next best thing, and bring resources to bear on the problems you can do something about.

Tony Cox - Development Lead, Hockey
Microsoft Games Studios - Sports

-----Original Message-----
From: gda...@li... [mailto:gda...@li...] On Behalf Of Charles Bloom
Sent: Tuesday, July 29, 2003 8:28 AM
To: gda...@li...
Subject: Re: [Algorithms] Terrain performance comparisons

In fact, I would say that all this concern for terrain LOD is absolutely ridiculous. Rendering the terrain is just about the easiest aspect of any system that would make use of it. Say, for example, that I want to do a huge continuous MMORPG with detailed terrain. What kind of problems do I have (that are directly related to the large world and the desire to have far view distances and seamless movement)?

1) seamless paging/streaming of lots of data
2) LOD/locality of network data
3) network prediction and lag compensation for many/far entities
4) server distribution and seamless links
5) LOD for all the game logic in the distance
6) LOD for all the characters, plants, shaders, etc.
7) LOD for the client-side physics engine
8) LOD for the terrain rendering

Hmm, terrain rendering is just about at the bottom. As a concrete example, have a look at Planetside or Asheron's Call 2. Sure, both could be served by a better terrain renderer, but in terms of the failings that need improvement, terrain rendering is way down the list.

At 01:17 AM 7/29/2003 -0400, Thatcher Ulrich wrote:

> 2. In my opinion, the *real* reasons people don't always use LOD (of
>    whatever flavor) are based on practical engineering economics, not
>    anything social or purely technical. The fact is that except for a
>    few game genres, like flight sims, there are way more important
>    things on the agenda than scalable LOD.
>
>    Take shading. Maybe 50% of the traffic on this list concerns
>    shading techniques. And that's because quality of shading has huge
>    leverage over the player experience, and is still heavily
>    resource-constrained. Whereas LOD generally isn't making or
>    breaking anybody's game nowadays, due to faster hardware. So
>    anything (e.g. LOD) that takes developer time and makes shading
>    more complicated has to have an extra big payoff, over and above
>    any intrinsic benefit.
From: Jonathan B. <jo...@nu...> - 2003-07-29 18:56:59
I think Charles has the right viewpoint here...

Tony Cox wrote:

> This is all true, but the guy who is working on the renderer wants to
> make the best renderer they can. They're not in a position to fix many
> of the other issues you mention.
...
> It's definitely true that you should focus your resources on what gives
> you the biggest customer impact, but it's also true that in a reasonably
> sized team there is going to be some partitioning of skills and
> expertise. So, if you can't bring all of your resources to bear on your
> top problem, you do the next best thing, and bring resources to bear on
> the problems you can do something about.

If game development were modular, this would be true. But game development isn't like that. When you tell your graphics guy to go work on a more complicated LOD algorithm, all of the following things happen; many of the effects are small, but put them all together and they can't be ignored:

* You increase the load on QA
* You increase the number of bugs that slip through QA/code-fixing into the final distribution
* You increase the load on your designers, who need to adapt their work practices to the character of the LOD algorithm
* You increase build times, thus decreasing productivity for all programmers
* You increase your executable size
* You probably increase your heap memory usage / fragmentation
* You decrease the longevity of your engine prior to re-writes (maybe not an issue for console guys)
* You remove the ability of that programmer to perform other graphics speed / quality work (i.e. there is an opportunity cost)
* You make the graphics code harder for other graphics programmers to understand and modify, creating a dependency bottleneck that will crop up later in the development cycle.

These things (QA load, designer load, runtime memory) are all finite resources, and they are resources that are inevitably stretched to their limits during every game development cycle. These resources are definitely contended for by all of the items that Charles listed in his "big terrain" example. So where are you going to spend them?

Aside from all this is the basic fact that the technical components in a game are not orthogonal. Your LOD system *will* interact with the rest of your engine, and the more complicated it is, the more problems you will have. Once you decide to add some complexity anywhere in an engine, you end up paying for that complexity every single day until you ship. A failure to appreciate these facts is, I think, a big contributor to why so many developers flounder.

Let's look at this in a slightly different way: most graphics LOD algorithms are focused on decreasing the number of triangles/sec needed to display the scene. But triangles/sec is the single most abundant, uncontended-for resource in all of game development. So WTF are we putting all this energy there?

-Jonathan.
From: Lucas A. <ack...@ll...> - 2003-07-29 21:36:38
Jonathan Blow wrote:

> Let's look at this in a slightly different way: most graphics LOD algorithms
> are focused on decreasing the number of triangles/sec needed to display
> the scene. But triangles/sec is the single most abundant,
> uncontended-for resource in all of game development. So WTF are
> we putting all this energy there?

LOD is not about quantity, it's about quality. Quantity can increase with hardware speed and software efficiency, but quality improves by increasing the intelligence with which you allocate your limited quantity. If triangles/sec were the fundamental measure of game quality, LOD would be pointless.

Decreasing the quantity of triangles required to display part of a scene without decreasing the quality is only half of the equation. The other half is to increase the quantity used to display the parts of the scene that will contribute most to the quality.

It is also possible, of course, to use the flexibility in quality as a means to put more stuff in a given scene, or to look at a scene in ways you couldn't before.

The reason a lot of time and energy goes into LOD is that it's not an easy problem to solve all aspects of in a way that satisfies all people.

-Lucas
From: Trent P. <tr...@po...> - 2003-07-29 22:14:44
I worked a lot with Mark D. on developing a "teaching" version of ROAM 2.0 during the time I spent writing the book last year. As complex as ROAM 2.0 can be to implement, the results are more than worthwhile. A simple unoptimized demo was able to display such detail in a seemingly randomly-generated mesh that it had my jaw on the floor.

Having a terrain engine that can push an incredible number of triangles per second doesn't mean shit unless those triangles are being placed in such a way that they actually DO add visible quality to the end render of a terrain mesh. When you say CLOD may be "useless" due to the amazing number of triangles a card can push per second, just think about WHERE a CLOD engine puts all those triangles. I can show you a nice flat quad that consists of millions of triangles, but does that mean the visual quality of the quad changes? No.

Basically, I'm backing up what Lucas stated: "LOD [or CLOD] is not about quantity, it's about quality." Your average CLOD terrain algorithm does a nice job of making sure that triangles that aren't being seen aren't wasting any CPU/GPU power. However, your best CLOD algos make sure that if a triangle isn't being seen (or isn't adding any amount of visual quality to something that is being seen), then that triangle can be placed on that hill the player is ascending, or that mountain range to be ascended. That is why CLOD was used in the past, and is still useful today.

That said, sometimes an uber-kickass CLOD algorithm just is NOT needed for some games. As Charles mentioned, Asheron's Call 2 could have a much better terrain engine, but it's still playable, and pretty, in its current form, so I don't much care that they decided to put their resources to work elsewhere. Terrain is purely a case-dependent thing. Games like TreadMarks need the best terrain algorithm the programmer can code up. Games like Asheron's Call, while based around terrain, don't need an incredibly realistic engine; just something good enough to get the job done.

---
Trent Polack
tr...@po...
www.polycat.net