Re: [Algorithms] Terrain LOD algos & HW friendliness
From: Mark D. <duc...@ll...> - 2000-07-21 18:40:34
Jonathan, I hope this message reaches you before you head to SIGGRAPH, as
I think there are critical points you should be aware of.

Jonathan Blow wrote:
> Mark Duchaineau wrote:
>
> > In a full-blown ROAM implementation, only a few percent of the
> > triangles change each frame.
>
> I find that, actually, this is not so true. (Not just for ROAM, but for
> any CLOD implementation.) This statement is highly dependent on the
> density of your terrain samples, and on the speed of viewpoint motion
> relative to that density. Now, in games, we are aiming for higher
> detail, and at the same time we want to be able to have fast motion. I
> have found that under these conditions quite a number of polygons can
> change each frame. Now this of course also depends on the frame rate:
> the lower the frame rate, the more polygons you need to change each
> frame to maintain reasonable quality. Suppose you're cruising along at
> some acceptable frame rate and then you hit a frame that is a bit more
> expensive than usual. Your viewpoint moves a bit further during that
> frame, so you need to change more polygons than you were previously, to
> keep up. Because you're changing more polygons, you're eating more CPU,
> so the frame rate dives lower... it's a vicious cycle. This is a
> fundamental problem of simulation (physics guys are familiar with this
> phenomenon). There is a catastrophe point in CLOD rendering: on one side
> of it, you're cruising along fine at decent frame rates; on the other
> side, you crash and burn without hope of recovery.

I think you misunderstand our main points in the ROAM paper. You can stop
the ROAM optimization at any point in time. For example, if you want to
maintain 85 frames/sec *consistently*, you can just watch the clock and
stop when time's up. ROAM will do the best possible incremental work
during the time available, i.e. it will give the best quality possible
given the amount of time you had to spend. It is easy to know ahead of
time whether it is best to skip the incremental work and start doing
split-only from the base mesh.

Also, the amount of work done by the ROAM optimizer is proportional to
the number of splits and merges you end up doing, which is never worse
than proportional to the number of triangles you output. No algorithm
besides ROAM is able to optimize the full output mesh every frame in time
less than some multiple of the output size. Hoppe's idea of optimizing,
say, 1/10th of the output mesh each frame does give a 10x speedup, but it
fails to optimize the full mesh each frame, which can lead to arbitrarily
bad results.

Here's another way of looking at this: suppose you think that in some
cases it is better to just do recursive splits, or just a static
view-independent mesh. How much time does it take, and what quality do
you get? If you are in an unusually incoherent situation where split-only
would win, split-only takes time proportional to the output size. ROAM
would do exactly what split-only would do, taking the same *theoretical*
amount of time, because we can easily detect these cases. ROAM has more
overhead in current implementations than simple split-only methods, but
there is no fundamental reason why this has to be the case.
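To make the "watch the clock and stop when time's up" point concrete,
here is a minimal sketch of a frame-budgeted pass over split/merge
priority queues keyed on projected error. The queue entries and the
doSplit/doMerge bodies are placeholders (a real implementation works on
bintree triangles and merge diamonds, and splits can force neighbor
splits), so treat this as an illustration of the control flow, not code
from any ROAM release:

    // Frame-budgeted greedy optimization over split/merge priority queues.
    // Queue entries pair a projected screen-space error with a placeholder
    // element id; doSplit/doMerge stand in for the real bintree operations.
    #include <chrono>
    #include <functional>
    #include <queue>
    #include <utility>
    #include <vector>

    using PQEntry = std::pair<float, int>;   // (projected error, element id)

    static int  triCount = 0;                       // triangles in the mesh
    static void doSplit(int /*tri*/)     { triCount += 2; }  // placeholder
    static void doMerge(int /*diamond*/) { triCount -= 2; }  // placeholder

    void roamOptimize(std::priority_queue<PQEntry>& splitQ,   // max-heap
                      std::priority_queue<PQEntry, std::vector<PQEntry>,
                                          std::greater<PQEntry> >& mergeQ,
                      int triBudget, double secondsAvailable)
    {
        using Clock = std::chrono::steady_clock;
        const auto deadline = Clock::now() +
            std::chrono::duration<double>(secondsAvailable);

        while (Clock::now() < deadline) {
            if (triCount >= triBudget) {
                if (mergeQ.empty()) break;
                doMerge(mergeQ.top().second);   // over budget: coarsen the
                mergeQ.pop();                   // least-needed detail
            } else {
                // Stop early once no split is worth more than the cheapest merge.
                if (splitQ.empty() ||
                    (!mergeQ.empty() && splitQ.top().first <= mergeQ.top().first))
                    break;
                doSplit(splitQ.top().second);   // refine worst projected error
                splitQ.pop();
            }
        }
        // The triangulation is valid after every split or merge, so stopping
        // when the clock runs out still leaves a usable, near-best mesh.
    }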
If you think just having a static mesh is better, again the theory says
no, but the practice has to catch up. Here's why. If ROAM were allowed to
operate until convergence, it would give the optimal mesh, meaning it
could never be worse than the static mesh if you wait half a second.
Suppose you managed a static mesh that matched ROAM after a half-second
startup. Now start moving. If you do no work at all in ROAM, the mesh
will remain static and you will be no worse than that static mesh. If you
do any work at all in ROAM each frame, you will beat the static mesh. So
every microsecond you get to spend ROAMing helps, and *in theory* is
better than a pre-optimized static mesh. Again, there is no fundamental
reason that ROAM implementations can't match this theory, but it may take
a lot of hard work to do so.

> Static LOD is not so susceptible to this problem, which is another
> reason why it's a lot more comfortable to use.
>
> > This gives you a great opportunity to lock geometry on the graphics
> > card, with the big IF: you need to be able to make a few scattered
> > changes to the vertex and triangle arrays. If this is not possible in
> > current hardware or APIs, I don't see why it couldn't be. Any OpenGL
> > gurus have an opinion on this?
>
> I do not know much about the hardware so I can't venture a guess on the
> feasibility of updating partial vertex buffers. I suppose that it could
> be done (the same way there are API hooks for updating partial textures)
> but I am not sure that it's a high priority to implement in hardware.
> (Does anyone's implementation of glTexSubImage actually refrain from
> copying the whole texture back into hardware?)

I've just done some tests to help answer this. Under Linux with the
Nvidia 0.9-4 drivers on a 450MHz PIII AGP2x GeForce DDR, I was able to
get 14.9M tris/sec with static geometry and around 5M tris/sec with 10%
of the geometry changing each frame. This is without the special Nvidia
OpenGL extensions for memory allocation/priorities; when I get access to
those functions I'll report what I get again. But even without them, it
is still worthwhile to change some geometry each frame. The cost is
higher than we would like, but we can just take this into account and
still do the optimal thing.
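In the same spirit as those tests, here is a minimal sketch of making "a
few scattered changes" to otherwise-resident geometry using the later
OpenGL buffer-object call glBufferSubData, which is the vertex-array
analogue of glTexSubImage. The buffer-object API postdates the NV
extensions discussed above, and the Vertex/DirtyRange types are
illustrative assumptions:

    // Upload only the vertices the optimizer touched this frame; the rest
    // of the buffer stays resident on the card.  Assumes a GL loader such
    // as GLEW has been initialized and the VBO was created once with
    // glBufferData(GL_ARRAY_BUFFER, ..., GL_DYNAMIC_DRAW).
    #include <GL/glew.h>
    #include <cstddef>
    #include <vector>

    struct Vertex     { float x, y, z; };
    struct DirtyRange { std::size_t firstVertex, vertexCount; };

    void uploadChangedVertices(GLuint vbo,
                               const std::vector<Vertex>& cpuVerts,
                               const std::vector<DirtyRange>& dirty)
    {
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        for (std::size_t i = 0; i < dirty.size(); ++i) {
            const DirtyRange& r = dirty[i];
            glBufferSubData(GL_ARRAY_BUFFER,
                            (GLintptr)(r.firstVertex * sizeof(Vertex)),   // byte offset
                            (GLsizeiptr)(r.vertexCount * sizeof(Vertex)), // byte count
                            &cpuVerts[r.firstVertex]);                    // source data
        }
        // Batching adjacent dirty vertices into ranges keeps the number of
        // driver calls small, which matters as much as the bytes moved.
    }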
> > > * Top-down terrain rendering algorithm that performs LOD
> > > computations faster than the published algorithms (Lindstrom-Koller,
> > > ROAM, Rottger etc)
> >
> > That isn't hard ;-). We were all publishing research prototypes. I'm
> > finding it possible to tune my new ROAM code to get about 20x faster
> > just through careful data structure design and re-ordering operations
> > to be cache friendly.
>
> Yep, although I am speaking of algorithmically faster, not
> implementation-detail faster.

Look at my explanation above: ROAM is the theoretical best. You can't
beat it *theoretically*. In practice you might for now. If you managed to
tie ROAM theoretically I would be very pleased--I haven't yet seen
another approach that gets time O(output change), and that is rather
boring ;-).

> > Whoa! First let's see if we agree on quality: the most common but
> > imperfect definition is the max of the projected error bounds in
> > screen space... This is imperfect because it does not take into
> > account temporal effects ("popping"), occlusion, color gradients, and
> > other perceptual issues. But it is fast and practical to measure and
> > gives good results.
>
> Yes, we agree there. I do not think projected error is the "right"
> answer but I do not have anything better, so that is what I use.

Good... so far...

> > Given this definition, a correct bottom-up and top-down give exactly
> > the same results!
>
> I do not believe this is true. The basic nature of ROAM and other
> top-down algorithms is that they place a bounding volume around an area
> of terrain and use that volume to determine subdivision. In order for
> ROAM to be conservative, it must take the maximal length of the
> projection of this bounding volume onto the viewport (again, with the
> usual conventions of ignoring certain things like perspective distortion
> to make the computations simpler).

Not even close! ROAM projects the *error vector from the linear
approximation* onto the screen. If a huge (in screen space) coarse-level
triangle has a tiny error relative to each finest-resolution grid point,
then the "thickness segment" will be tiny and project to a tiny length on
the screen. We describe a conservative bound in the paper based on this,
but the level of over-conservativeness is small on average in real
applications. Measure it. Furthermore, it is not strictly bad to be
over-conservative--the critical thing is to be *consistent* in how
over-conservative you are. If each projected error is exactly twice what
it should have been, then you get *exactly* the same optimal mesh! It is
only the *inconsistencies* in how conservative the bounds are that cause
sub-optimal meshing per frame.
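To spell out what actually gets projected, here is a rough sketch of a
wedgie's screen-space error bound using the common isotropic
simplification (thickness over distance, scaled into pixels). It is not
the exact directional bound from the paper, and the vector type and
parameter names are illustrative:

    // Project the wedgie's thickness (the error segment), not the triangle
    // or its bounding volume, onto the screen.
    #include <cmath>

    struct Vec3 { float x, y, z; };

    // thickness : wedgie thickness e_T (world-space length of the error segment)
    // p         : a representative point of the wedgie (e.g. its apex)
    // eye       : viewpoint position
    // pixelScale: viewportHeight / (2 * tan(fovY / 2))
    float projectedError(float thickness, Vec3 p, Vec3 eye, float pixelScale)
    {
        const float dx = p.x - eye.x, dy = p.y - eye.y, dz = p.z - eye.z;
        const float dist = std::sqrt(dx * dx + dy * dy + dz * dz);
        // A screen-filling triangle with a tiny thickness still gets a tiny
        // priority; the triangle's own screen size never enters the bound.
        return thickness * pixelScale / dist;
        // Scaling this by any fixed constant (consistent over-conservativeness)
        // leaves the split/merge ordering unchanged, so a fixed triangle
        // budget yields the same mesh.
    }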
> A ROAM wedgie is as thick as it is because some interior terrain points
> pushed the ceiling upward or the floor downward. (Actually in the ROAM
> paper you have only one thickness value, but significant gains can be
> had if you allow the floor and ceiling to move independently, and this
> is really a minor change.)

We worked out that a two-sided bound gives consistently half the
thickness values of a one-sided bound. So from the logic above, you are
doing twice as much work and using twice the storage for nothing...

> Now when you conservatively project this wedgie onto the viewport, you
> are making the assumption that the terrain points that forced the wedgie
> to be as thick as it is are located at that pessimal point. This is
> almost never true; they usually lie somewhere inside the wedgie.

Yes, but where? We were going for a conservative bound. You can always
relax this if your app isn't so picky.

> Because they lie somewhere inside the wedgie, they are in general
> further from the viewpoint than the part of the wedgie that is being
> used for projection. Because they are further from the viewpoint, they
> project smaller (because their z value is larger).
>
> Thus in general the actual projection of the error displacements within
> a wedgie will be less than the actual projection of the wedgie. The
> wedgie is overconservative. This means that in many cases ROAM will
> subdivide a wedgie, creating extra polygons, when in fact there was not
> enough error inside the wedgie to justify that. Once I coded up a test
> in my ROAM renderer that iterated over the currently instantiated
> terrain and tried to gauge which of the polygons *really* needed to be
> there (it did a brute-force iteration that actually projected each
> vertex to the viewport using the same error metric) and, depending on
> the scene, 25% to 50% of the polygons were extraneous.

Well, try this again when you've gotten the logic right. I think you may
get a slightly sub-optimal mesh because the conservativeness is
inconsistent, but not by more than a few percent.

> Lindstrom-Koller does not exhibit this behavior. That algorithm will
> never instantiate a vertex unless the actual projection of the error
> value, from the exact position of the vertex, exceeds the error
> threshold. They do pay a CPU cost for this -- it's what their whole
> region-of-uncertainty thing is all about. If you think of a Lindstrom
> block as being analogous to a ROAM wedge, then you will see that where
> Lindstrom-Koller has a mechanism for the uncertainty region (and
> delta-max and delta-min), ROAM just assumes that any wedge exceeding
> delta-min must be subdivided.

ROAM is optimal and correct. It is possible to spend extra work to get
less over-conservative, but this isn't what Lindstrom et al. do for their
final bottom-up pass as far as I recall... but my recollections on that
paper are fuzzy now (Peter?).

> I am viewing this as being separate from the issue of "do we need to
> consider the total accumulated error or just the error of one pop". I
> agree with you that considering the total accumulation is a good idea.
> It is not too hard to do this even with the Lindstrom-Koller algorithm;
> it's just a preprocessing issue. ROAM certainly does it, but the way it
> does it is overconservative.

I'd be happy to hear any ideas on making the ROAM bound less
over-conservative *and* more consistently over-conservative *and* that
are not too slow. I haven't yet ;-).

Hope to see you at SIGGRAPH.

--Mark D.
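On the "preprocessing issue" above, here is a rough sketch of one way the
accumulated (nested) error bounds can be built bottom-up over a triangle
bintree, once, before any rendering. The implicit-tree layout and field
names are illustrative assumptions, not code from any ROAM release, and
the leaf bounds (deviation of the raw heightfield from each leaf
triangle's plane) are assumed to be filled in already:

    // Bottom-up accumulation of nested error bounds: a parent's thickness
    // covers both children's thickness plus the height deviation introduced
    // at the split vertex, so a coarse bound already accounts for everything
    // its descendants could pop.
    #include <algorithm>
    #include <cmath>
    #include <vector>

    struct BintreeTri {
        float z0, z1;     // heights at the two base-edge vertices
        float zc;         // height at the midpoint vertex created by splitting
        float thickness;  // nested error bound, filled in below for parents
    };

    // tris is an implicit binary tree: children of node i are 2*i+1 and
    // 2*i+2; nodes at index >= firstLeaf are leaves whose 'thickness' was
    // precomputed from the raw heightfield.
    void computeNestedBounds(std::vector<BintreeTri>& tris, std::size_t firstLeaf)
    {
        for (std::size_t i = firstLeaf; i-- > 0; ) {        // parents, bottom-up
            const BintreeTri& c0 = tris[2 * i + 1];
            const BintreeTri& c1 = tris[2 * i + 2];
            const float localError =
                std::fabs(tris[i].zc - 0.5f * (tris[i].z0 + tris[i].z1));
            tris[i].thickness =
                std::max(c0.thickness, c1.thickness) + localError;
        }
    }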