Re: [Algorithms] Terrain LOD algos & HW friendliness
From: Mark D. <duc...@ll...> - 2000-07-21 18:40:34
Jonathan, I hope this message reaches you before you head to SIGGRAPH, as
I think there are critical points you should be aware of.

Jonathan Blow wrote:
> Mark Duchaineau wrote:
>
> > In a full-blown ROAM implementation, only a few percent of the
> > triangles change each frame.
>
> I find that, actually, this is not so true. (Not just for ROAM, but for
> any CLOD implementation.) This statement is highly dependent on the
> density of your terrain samples, and on the speed of viewpoint motion
> relative to that density. Now, in games, we are aiming for higher
> detail, and at the same time we want to be able to have fast motion. I
> have found that under these conditions quite a number of polygons can
> change each frame. Now this of course also depends on the frame rate:
> the lower the frame rate, the more polygons you need to change each
> frame to maintain reasonable quality. Suppose you're cruising along at
> some acceptable frame rate and then you hit a frame that is a bit more
> expensive than usual. Your viewpoint moves a bit further during that
> frame, so you need to change more polygons than you were previously, to
> keep up. Because you're changing more polygons, you're eating more CPU,
> so the frame rate dives lower... it's a vicious cycle. This is a
> fundamental problem of simulation (physics guys are familiar with this
> phenomenon). There is a catastrophe point in CLOD rendering: on one side
> of it, you're cruising along fine at decent frame rates; on the other
> side, you crash and burn without hope of recovery.

I think you misunderstand our main points in the ROAM paper. You can stop
the ROAM optimization at any point in time. For example, if you want to
maintain 85 frames/sec *consistently*, you can just watch the clock and
stop when time's up. ROAM will do the best possible incremental work
during the time available, i.e. it will give the best quality possible
given the amount of time you had to spend. It is easy to know ahead of
time whether it is best to skip the incremental work and start doing
split-only from the base mesh.

Also, the amount of work done by the ROAM optimizer is proportional to
the number of splits and merges you end up doing, which is never worse
than proportional to the number of triangles you output. No algorithm
besides ROAM is able to optimize the full output mesh every frame in time
less than some multiple of the output size. Hoppe's idea of optimizing,
say, 1/10th of the output mesh each frame does give a 10x speedup, but it
fails to optimize the full mesh each frame, which can lead to arbitrarily
bad results.

Here's another way of looking at this: suppose you think that in some
cases it is better to just do recursive splits, or just a static
view-independent mesh. How much time does it take, and what quality do
you get? If you are in an unusually incoherent situation where split-only
would win, split-only takes time proportional to the output size. ROAM
would do exactly what split-only would do, taking the same *theoretical*
amount of time, because we can easily detect these cases. ROAM has more
overhead in current implementations than simple split-only methods, but
there is no fundamental reason why this has to be the case.
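To make the "watch the clock and stop when time's up" point concrete,
here is a minimal sketch of a frame-budgeted pass over split/merge
priority queues keyed on projected error. The queue entries and the
doSplit/doMerge bodies are placeholders (a real implementation works on
bintree triangles and merge diamonds, and splits can force neighbor
splits), so treat this as an illustration of the control flow, not code
from any ROAM release:

    // Frame-budgeted greedy optimization over split/merge priority queues.
    // Queue entries pair a projected screen-space error with a placeholder
    // element id; doSplit/doMerge stand in for the real bintree operations.
    #include <chrono>
    #include <functional>
    #include <queue>
    #include <utility>
    #include <vector>

    using PQEntry = std::pair<float, int>;   // (projected error, element id)

    static int  triCount = 0;                       // triangles in the mesh
    static void doSplit(int /*tri*/)     { triCount += 2; }  // placeholder
    static void doMerge(int /*diamond*/) { triCount -= 2; }  // placeholder

    void roamOptimize(std::priority_queue<PQEntry>& splitQ,   // max-heap
                      std::priority_queue<PQEntry, std::vector<PQEntry>,
                                          std::greater<PQEntry> >& mergeQ,
                      int triBudget, double secondsAvailable)
    {
        using Clock = std::chrono::steady_clock;
        const auto deadline = Clock::now() +
            std::chrono::duration<double>(secondsAvailable);

        while (Clock::now() < deadline) {
            if (triCount >= triBudget) {
                if (mergeQ.empty()) break;
                doMerge(mergeQ.top().second);   // over budget: coarsen the
                mergeQ.pop();                   // least-needed detail
            } else {
                // Stop early once no split is worth more than the cheapest merge.
                if (splitQ.empty() ||
                    (!mergeQ.empty() && splitQ.top().first <= mergeQ.top().first))
                    break;
                doSplit(splitQ.top().second);   // refine worst projected error
                splitQ.pop();
            }
        }
        // The triangulation is valid after every split or merge, so stopping
        // when the clock runs out still leaves a usable, near-best mesh.
    }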
If you think just having a static mesh is better, again the theory says
no, but the practice has to catch up. Here's why. If ROAM were allowed to
operate until convergence, it would give the optimal mesh, meaning it
could never be worse than the static mesh if you wait half a second.
Suppose you managed a static mesh that matched ROAM after a half-second
startup. Now start moving. If you do no work at all in ROAM, the mesh
will remain static and you will be no worse than that static mesh. If you
do any work at all in ROAM each frame, you will beat the static mesh. So
every microsecond you get to spend ROAMing helps, and *in theory* is
better than a pre-optimized static mesh. Again, there is no fundamental
reason that ROAM implementations can't match this theory, but it may take
a lot of hard work to do so.

> Static LOD is not so susceptible to this problem, which is another
> reason why it's a lot more comfortable to use.
>
> > This gives you a great opportunity to lock geometry on the graphics
> > card, with the big IF: you need to be able to make a few scattered
> > changes to the vertex and triangle arrays. If this is not possible in
> > current hardware or APIs, I don't see why it couldn't be. Any OpenGL
> > gurus have an opinion on this?
>
> I do not know much about the hardware so I can't venture a guess on the
> feasibility of updating partial vertex buffers. I suppose that it could
> be done (the same way there are API hooks for updating partial textures)
> but I am not sure that it's a high priority to implement in hardware.
> (Does anyone's implementation of glTexSubImage actually refrain from
> copying the whole texture back into hardware?)

I've just done some tests to help answer this. Under Linux with the
Nvidia 0.9-4 drivers on a 450MHz PIII AGP2x GeForce DDR, I was able to
get 14.9M tris/sec with static geometry and around 5M tris/sec with 10%
of the geometry changing each frame. This is without the special Nvidia
OpenGL extensions for memory allocation/priorities; when I get access to
those functions I'll report what I get again. But even without them, it
is still worthwhile to change some geometry each frame. The cost is
higher than we would like, but we can just take this into account and
still do the optimal thing.
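In the same spirit as those tests, here is a minimal sketch of making "a
few scattered changes" to otherwise-resident geometry using the later
OpenGL buffer-object call glBufferSubData, which is the vertex-array
analogue of glTexSubImage. The buffer-object API postdates the NV
extensions discussed above, and the Vertex/DirtyRange types are
illustrative assumptions:

    // Upload only the vertices the optimizer touched this frame; the rest
    // of the buffer stays resident on the card.  Assumes a GL loader such
    // as GLEW has been initialized and the VBO was created once with
    // glBufferData(GL_ARRAY_BUFFER, ..., GL_DYNAMIC_DRAW).
    #include <GL/glew.h>
    #include <cstddef>
    #include <vector>

    struct Vertex     { float x, y, z; };
    struct DirtyRange { std::size_t firstVertex, vertexCount; };

    void uploadChangedVertices(GLuint vbo,
                               const std::vector<Vertex>& cpuVerts,
                               const std::vector<DirtyRange>& dirty)
    {
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        for (std::size_t i = 0; i < dirty.size(); ++i) {
            const DirtyRange& r = dirty[i];
            glBufferSubData(GL_ARRAY_BUFFER,
                            (GLintptr)(r.firstVertex * sizeof(Vertex)),   // byte offset
                            (GLsizeiptr)(r.vertexCount * sizeof(Vertex)), // byte count
                            &cpuVerts[r.firstVertex]);                    // source data
        }
        // Batching adjacent dirty vertices into ranges keeps the number of
        // driver calls small, which matters as much as the bytes moved.
    }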
> > > * Top-down terrain rendering algorithm that performs LOD
> > > computations faster than the published algorithms (Lindstrom-Koller,
> > > ROAM, Rottger etc)
> >
> > That isn't hard ;-). We were all publishing research prototypes. I'm
> > finding it possible to tune my new ROAM code to get about 20x faster
> > just through careful data structure design and re-ordering operations
> > to be cache friendly.
>
> Yep, although I am speaking of algorithmically faster, not
> implementation-detail faster.

Look at my explanation above: ROAM is the theoretical best. You can't
beat it *theoretically*. In practice you might for now. If you managed to
tie ROAM theoretically I would be very pleased--I haven't yet seen
another approach that gets time O(output change), and that is rather
boring ;-).

> > Whoa! First let's see if we agree on quality: the most common but
> > imperfect definition is the max of the projected error bounds in
> > screen space... This is imperfect because it does not take into
> > account temporal effects ("popping"), occlusion, color gradients, and
> > other perceptual issues. But it is fast and practical to measure and
> > gives good results.
>
> Yes, we agree there. I do not think projected error is the "right"
> answer but I do not have anything better, so that is what I use.

Good... so far...

> > Given this definition, a correct bottom-up and top-down give exactly
> > the same results!
>
> I do not believe this is true. The basic nature of ROAM and other
> top-down algorithms is that they place a bounding volume around an area
> of terrain and use that volume to determine subdivision. In order for
> ROAM to be conservative, it must take the maximal length of the
> projection of this bounding volume onto the viewport (again, with the
> usual conventions of ignoring certain things like perspective distortion
> to make the computations simpler).

Not even close! ROAM projects the *error vector from the linear
approximation* onto the screen. If a huge (in screen space) coarse-level
triangle has a tiny error relative to each finest-resolution grid point,
then the "thickness segment" will be tiny and project to a tiny length on
the screen. We describe a conservative bound in the paper based on this,
but the level of over-conservativeness is small on average in real
applications. Measure it. Furthermore, it is not strictly bad to be
over-conservative--the critical thing is to be *consistent* in how
over-conservative you are. If each projected error is exactly twice what
it should have been, then you get *exactly* the same optimal mesh! It is
only the *inconsistencies* in how conservative the bounds are that cause
sub-optimal meshing per frame.
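To spell out what actually gets projected, here is a rough sketch of a
wedgie's screen-space error bound using the common isotropic
simplification (thickness over distance, scaled into pixels). It is not
the exact directional bound from the paper, and the vector type and
parameter names are illustrative:

    // Project the wedgie's thickness (the error segment), not the triangle
    // or its bounding volume, onto the screen.
    #include <cmath>

    struct Vec3 { float x, y, z; };

    // thickness : wedgie thickness e_T (world-space length of the error segment)
    // p         : a representative point of the wedgie (e.g. its apex)
    // eye       : viewpoint position
    // pixelScale: viewportHeight / (2 * tan(fovY / 2))
    float projectedError(float thickness, Vec3 p, Vec3 eye, float pixelScale)
    {
        const float dx = p.x - eye.x, dy = p.y - eye.y, dz = p.z - eye.z;
        const float dist = std::sqrt(dx * dx + dy * dy + dz * dz);
        // A screen-filling triangle with a tiny thickness still gets a tiny
        // priority; the triangle's own screen size never enters the bound.
        return thickness * pixelScale / dist;
        // Scaling this by any fixed constant (consistent over-conservativeness)
        // leaves the split/merge ordering unchanged, so a fixed triangle
        // budget yields the same mesh.
    }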
> A ROAM wedgie is as thick as it is because some interior terrain points
> pushed the ceiling upward or the floor downward. (Actually in the ROAM
> paper you have only one thickness value, but significant gains can be
> had if you allow the floor and ceiling to move independently, and this
> is really a minor change.)

We worked out that a two-sided bound gives consistently half the
thickness values of a one-sided bound. So from the logic above, you are
doing twice as much work and using twice the storage for nothing...

> Now when you conservatively project this wedgie onto the viewport, you
> are making the assumption that the terrain points that forced the wedgie
> to be as thick as it is are located at that pessimal point. This is
> almost never true; they usually lie somewhere inside the wedgie.

Yes, but where? We were going for a conservative bound. You can always
relax this if your app isn't so picky.

> Because they lie somewhere inside the wedgie, they are in general
> further from the viewpoint than the part of the wedgie that is being
> used for projection. Because they are further from the viewpoint, they
> project smaller (because their z value is larger).
>
> Thus in general the actual projection of the error displacements within
> a wedgie will be less than the actual projection of the wedgie. The
> wedgie is overconservative. This means that in many cases ROAM will
> subdivide a wedgie, creating extra polygons, when in fact there was not
> enough error inside the wedgie to justify that. Once I coded up a test
> in my ROAM renderer that iterated over the currently instantiated
> terrain and tried to gauge which of the polygons *really* needed to be
> there (it did a brute-force iteration that actually projected each
> vertex to the viewport using the same error metric) and, depending on
> the scene, 25% to 50% of the polygons were extraneous.

Well, try this again when you've gotten the logic right. I think you may
get a slightly sub-optimal mesh because the conservativeness is
inconsistent, but not by more than a few percent.

> Lindstrom-Koller does not exhibit this behavior. That algorithm will
> never instantiate a vertex unless the actual projection of the error
> value, from the exact position of the vertex, exceeds the error
> threshold. They do pay a CPU cost for this -- it's what their whole
> region-of-uncertainty thing is all about. If you think of a Lindstrom
> block as being analogous to a ROAM wedge, then you will see that where
> Lindstrom-Koller has a mechanism for the uncertainty region (and
> delta-max and delta-min), ROAM just assumes that any wedge exceeding
> delta-min must be subdivided.

ROAM is optimal and correct. It is possible to spend extra work to get
less over-conservative, but this isn't what Lindstrom et al. do for their
final bottom-up pass as far as I recall... but my recollections on that
paper are fuzzy now (Peter?).

> I am viewing this as being separate from the issue of "do we need to
> consider the total accumulated error or just the error of one pop". I
> agree with you that considering the total accumulation is a good idea.
> It is not too hard to do this even with the Lindstrom-Koller algorithm;
> it's just a preprocessing issue. ROAM certainly does it, but the way it
> does it is overconservative.

I'd be happy to hear any ideas on making the ROAM bound less
over-conservative *and* more consistently over-conservative *and* that
are not too slow. I haven't yet ;-).

Hope to see you at SIGGRAPH.

--Mark D.
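On the "preprocessing issue" above, here is a rough sketch of one way the
accumulated (nested) error bounds can be built bottom-up over a triangle
bintree, once, before any rendering. The implicit-tree layout and field
names are illustrative assumptions, not code from any ROAM release, and
the leaf bounds (deviation of the raw heightfield from each leaf
triangle's plane) are assumed to be filled in already:

    // Bottom-up accumulation of nested error bounds: a parent's thickness
    // covers both children's thickness plus the height deviation introduced
    // at the split vertex, so a coarse bound already accounts for everything
    // its descendants could pop.
    #include <algorithm>
    #include <cmath>
    #include <vector>

    struct BintreeTri {
        float z0, z1;     // heights at the two base-edge vertices
        float zc;         // height at the midpoint vertex created by splitting
        float thickness;  // nested error bound, filled in below for parents
    };

    // tris is an implicit binary tree: children of node i are 2*i+1 and
    // 2*i+2; nodes at index >= firstLeaf are leaves whose 'thickness' was
    // precomputed from the raw heightfield.
    void computeNestedBounds(std::vector<BintreeTri>& tris, std::size_t firstLeaf)
    {
        for (std::size_t i = firstLeaf; i-- > 0; ) {        // parents, bottom-up
            const BintreeTri& c0 = tris[2 * i + 1];
            const BintreeTri& c1 = tris[2 * i + 2];
            const float localError =
                std::fabs(tris[i].zc - 0.5f * (tris[i].z0 + tris[i].z1));
            tris[i].thickness =
                std::max(c0.thickness, c1.thickness) + localError;
        }
    }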