RE: [Algorithms] VIPM Strips

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Hmmmm... interesting. And got me thinking.

I have a bit of a knee-jerk response to strips, because of their
worse-than-list vertex caching (up to twice as bad). However, since normal
VIPM pretty much destroys the vertex cache anyway, this argument doesn't
hold much water!

I think the fundamental difference this approach takes is abandoning the
attempt to ditch collapsed tris as early as possible, which is what forces
the re-ordering of the tris that destroys the vertex cache performance of
"standard" VIPM. And thinking about it, that's fair enough. This still bins
tris, but by making them degenerate, rather than have them fall off the end
of the list and not feeding them to the chip at all. So you still incur the
cost of sending the indices, but not the clipping, setup costs, etc
associated with a triangle.

But of course the bandwidth cost of the indices is tiny compared to the
bandwidth of the vertices, and if by sending (say) double the number of
indices, you can reduce the number of vertex resends (i.e. how well the
vertex cache is working) by even a small amount, you win! The rejection of
degenerate tris is lightning-fast (done right at the front-end I would
assume - three 16-bit compares), and in parallel with all the "expensive"
stuff, so doubling the number of times it does that is not going to affect
performance in any measurable way I would guess.

So having stated that I dislike strips, I see no reason not to switch the
algorithm back to using indexed lists. Arrange your lists optimally for
cache use, and do as the algorithm below says - make the binned ones
degenerate, and move the others, just like a normal VIPM rout. The only
difference from the routine that Charles has in his tutorial is that you
don't reduce the number of tris, and when generating the change list, you
need to include an index change that makes the binned triangle degenerate.
Wow - that's pretty easy, and vertex cache efficiency is nicely maintained
all the way through the collapses.

A few points that need mulling over:

(1) Using indexed lists, you need to make more index changes per collapse
than for strips. Three times as many, I think - not 100% sure. This means
slightly larger collapse information, slightly more of a cache hit. However,
the collapse info is linear, and so very cache-friendly, and since the aim
is to be vertex-cache friendly with our tris, the indices that need changing
are likely to be close together as well, or at least in just two parts of
the index list, so again fairly cache-friendly (remember - it's stepping
outside the cache _once_ that hurts - subsequent accesses to that area are
very cheap). So actually not much more expensive than strips I think. And
better vertex cache performance (using the "fractal gasket" traversal method
that I still can't figure out for arbitrary meshes).

(2) Once a tri has been made degenerate, does it need to be involved in
further collapses? For example, once a tri 1,2,3 has been made into 1,1,3,
do we need to update it when a subsequent collapse bins vertex 3? On some
hardware, the degenerate rejection may be done right up-front on the
indices, and that triangle never makes it any further, so even though it
references vertex 3, because no non-degenerate tri ever references it,
vertex 3 is never loaded. However, some hardware may not do the rejection
until after the vertex cache fetch, which means that vertex 3 gets fetched,
and possible T&L'd, even though it is then never used. If the latter, then
you may be better off doing slightly more index edits to "fix"
already-degenerate tris so that they only reference used vertices, and
having more collapse info. Tricky - I can't decide which option has the
worse worst-case performance. Maybe having both versions and runtime testing
at load time would be the best option.

Compared to "my" two-part method, I think it's a lot simpler. The more I
think about the two-part scheme, the more I spot gotchas, and tris that need
including in the "slow" pool, the more fragmented the remaining "static"
tris get (more fragmentation = less vertex-cache coherency).

The advantage that my two-part method retains is the ability to share lots
of the index data between instances of objects, which will be a definate
advantage in some cases. But it is more complex, and my feeling is that it
will achieve poorer vertex caching than this new method.

Can I just mention that I love this stuff? It's changing so fast, and there
are so many ideas being thrown into the melting-pot. And it just takes one
idea (in this case "why do we need to not send the tri? Can't we just make
it degenerate?") to make us rethink things and get a better variation of the
algorithm kicked off. Thanks Charles.

Tom Forsyth - Muckyfoot bloke.
Whizzing and pasting and pooting through the day.

> -----Original Message-----
> From: Charles Bloom [mailto:cb...@cb...]
> Sent: 17 August 2000 07:20
> To: gda...@li...
> Cc: cb...@su...
> Subject: [Algorithms] VIPM Strips
> 
> 
> 
> I just read the paper "Skip Strips" by El-Sana et.al.
> which seems to be a much-overlooked work on VDPM.  I
> found it as part of the course notes in the simplification
> course on the Siggraph 2k CD.
> 
> It's an interesting read, though quite archaic.
> They make some clever data structures to speed up the
> inherently slow algorithm of VDPM.
> 
> Anyway, it gave me ideas about how to strip VIPM.  In
> fact, it's so easy, it's ridiculous.  Generate your
> vertices in collapse order.  Create an optimal Indexed
> Triangle Strip over those vertices, using duplicated
> vertex indices to make it one single strip over the
> entire mesh.  When you do a collapse, do it just like
> normal VIPM, except that rather than changing indices
> in a list of indexed triangles, you're changing them
> in a list of indexed strips; it's just the same operation: 
> 	IndexList[ vsplit->changeIndex ] = newVertIndex;
> You also can't reduce the number of triangles by two,
> because the two collapsed triangles may not be at the
> end of the list, and their verts are needed to continue
> the strip on either side.
> 
> After you collapse some number of verts, you switch down
> to a new strip which is optimized on that number of
> verts.  This takes a cue from Tom Forsyth, and something
> like power-of-2 sets of triangle strips are recommended.
> For example, a mesh with 5000 triangles at full LOD might
> have strip lists for {5000,3000,2000,1000,500,250,125}
> triangles.
> 
> Now note that the memory usage of this scheme is almost
> identical to that of normal VIPM (!!).  If you use power-
> of-2 triangle strip list changes, then you only duplicate
> the number of indices stored, and by using strips instead
> of lists, you roughly half the storage needed.
> 
> The disadvantage of this scheme is obvious : your triangle
> count stays the same until you swap down to a shorter strip.
> In that sence, it's a lot like discrete LOD.  The difference
> is that you're still doing collapses one by one, and reducing the
> vertex count one by one, and virtually eliminating popping.
> 
> Finally, what's the advantage?  I'm not sure.  This method
> is surely faster than conventional VIPM.  On the other hand,
> I'm not sure it compares terribly well to Tom's idea of
> stripped prefixes and tri-list suffixes.
> 
> -------------------------------------------------------
> Charles Bloom    cb...@cb...    http://www.cbloom.com