RE: [Algorithms] VIPM Strips
Brought to you by:
vexxed72
From: Tom F. <to...@mu...> - 2000-08-17 09:39:16
|
Hmmmm... interesting. And got me thinking. I have a bit of a knee-jerk response to strips, because of their worse-than-list vertex caching (up to twice as bad). However, since normal VIPM pretty much destroys the vertex cache anyway, this argument doesn't hold much water! I think the fundamental difference this approach takes is abandoning the attempt to ditch collapsed tris as early as possible, which is what forces the re-ordering of the tris that destroys the vertex cache performance of "standard" VIPM. And thinking about it, that's fair enough. This still bins tris, but by making them degenerate, rather than have them fall off the end of the list and not feeding them to the chip at all. So you still incur the cost of sending the indices, but not the clipping, setup costs, etc associated with a triangle. But of course the bandwidth cost of the indices is tiny compared to the bandwidth of the vertices, and if by sending (say) double the number of indices, you can reduce the number of vertex resends (i.e. how well the vertex cache is working) by even a small amount, you win! The rejection of degenerate tris is lightning-fast (done right at the front-end I would assume - three 16-bit compares), and in parallel with all the "expensive" stuff, so doubling the number of times it does that is not going to affect performance in any measurable way I would guess. So having stated that I dislike strips, I see no reason not to switch the algorithm back to using indexed lists. Arrange your lists optimally for cache use, and do as the algorithm below says - make the binned ones degenerate, and move the others, just like a normal VIPM rout. The only difference from the routine that Charles has in his tutorial is that you don't reduce the number of tris, and when generating the change list, you need to include an index change that makes the binned triangle degenerate. Wow - that's pretty easy, and vertex cache efficiency is nicely maintained all the way through the collapses. A few points that need mulling over: (1) Using indexed lists, you need to make more index changes per collapse than for strips. Three times as many, I think - not 100% sure. This means slightly larger collapse information, slightly more of a cache hit. However, the collapse info is linear, and so very cache-friendly, and since the aim is to be vertex-cache friendly with our tris, the indices that need changing are likely to be close together as well, or at least in just two parts of the index list, so again fairly cache-friendly (remember - it's stepping outside the cache _once_ that hurts - subsequent accesses to that area are very cheap). So actually not much more expensive than strips I think. And better vertex cache performance (using the "fractal gasket" traversal method that I still can't figure out for arbitrary meshes). (2) Once a tri has been made degenerate, does it need to be involved in further collapses? For example, once a tri 1,2,3 has been made into 1,1,3, do we need to update it when a subsequent collapse bins vertex 3? On some hardware, the degenerate rejection may be done right up-front on the indices, and that triangle never makes it any further, so even though it references vertex 3, because no non-degenerate tri ever references it, vertex 3 is never loaded. However, some hardware may not do the rejection until after the vertex cache fetch, which means that vertex 3 gets fetched, and possible T&L'd, even though it is then never used. If the latter, then you may be better off doing slightly more index edits to "fix" already-degenerate tris so that they only reference used vertices, and having more collapse info. Tricky - I can't decide which option has the worse worst-case performance. Maybe having both versions and runtime testing at load time would be the best option. Compared to "my" two-part method, I think it's a lot simpler. The more I think about the two-part scheme, the more I spot gotchas, and tris that need including in the "slow" pool, the more fragmented the remaining "static" tris get (more fragmentation = less vertex-cache coherency). The advantage that my two-part method retains is the ability to share lots of the index data between instances of objects, which will be a definate advantage in some cases. But it is more complex, and my feeling is that it will achieve poorer vertex caching than this new method. Can I just mention that I love this stuff? It's changing so fast, and there are so many ideas being thrown into the melting-pot. And it just takes one idea (in this case "why do we need to not send the tri? Can't we just make it degenerate?") to make us rethink things and get a better variation of the algorithm kicked off. Thanks Charles. Tom Forsyth - Muckyfoot bloke. Whizzing and pasting and pooting through the day. > -----Original Message----- > From: Charles Bloom [mailto:cb...@cb...] > Sent: 17 August 2000 07:20 > To: gda...@li... > Cc: cb...@su... > Subject: [Algorithms] VIPM Strips > > > > I just read the paper "Skip Strips" by El-Sana et.al. > which seems to be a much-overlooked work on VDPM. I > found it as part of the course notes in the simplification > course on the Siggraph 2k CD. > > It's an interesting read, though quite archaic. > They make some clever data structures to speed up the > inherently slow algorithm of VDPM. > > Anyway, it gave me ideas about how to strip VIPM. In > fact, it's so easy, it's ridiculous. Generate your > vertices in collapse order. Create an optimal Indexed > Triangle Strip over those vertices, using duplicated > vertex indices to make it one single strip over the > entire mesh. When you do a collapse, do it just like > normal VIPM, except that rather than changing indices > in a list of indexed triangles, you're changing them > in a list of indexed strips; it's just the same operation: > IndexList[ vsplit->changeIndex ] = newVertIndex; > You also can't reduce the number of triangles by two, > because the two collapsed triangles may not be at the > end of the list, and their verts are needed to continue > the strip on either side. > > After you collapse some number of verts, you switch down > to a new strip which is optimized on that number of > verts. This takes a cue from Tom Forsyth, and something > like power-of-2 sets of triangle strips are recommended. > For example, a mesh with 5000 triangles at full LOD might > have strip lists for {5000,3000,2000,1000,500,250,125} > triangles. > > Now note that the memory usage of this scheme is almost > identical to that of normal VIPM (!!). If you use power- > of-2 triangle strip list changes, then you only duplicate > the number of indices stored, and by using strips instead > of lists, you roughly half the storage needed. > > The disadvantage of this scheme is obvious : your triangle > count stays the same until you swap down to a shorter strip. > In that sence, it's a lot like discrete LOD. The difference > is that you're still doing collapses one by one, and reducing the > vertex count one by one, and virtually eliminating popping. > > Finally, what's the advantage? I'm not sure. This method > is surely faster than conventional VIPM. On the other hand, > I'm not sure it compares terribly well to Tom's idea of > stripped prefixes and tri-list suffixes. > > ------------------------------------------------------- > Charles Bloom cb...@cb... http://www.cbloom.com |