RE: [Algorithms] VIPM console implementations
From: Tom F. <to...@mu...> - 2000-08-18 00:00:57
I've been thinking about this a bit, and probably for exactly the same hypothetical equipment you're thinking of. It's certainly a real pain having to feed your T&L through push-mode streaming DMA, rather than a demand-loaded vertex cache.

My current thoughts are to (offline) take the fully-expanded stream of indices (whether list or strip, doesn't really matter). Then simulate a vertex cache of around the size that will fit - you mention 100 verts, which is hooooj by vertex cache standards. Anyway, you will of course get very few cache misses, but you will get a few. While doing this, write out a "command buffer" sort of thing that contains the indices, plus the vertex data whenever the cache misses (replicated for each miss), along with the position in the cache that each loaded vertex will go to. The indices are these cache positions, not their original values (so in this case you can shrink them to bytes, not words, which is a bonus). So you are basically implementing a vertex cache, but where the behaviour of the cache is calculated beforehand, offline, so that you can use the push model.

Now, to use VIPM, you do standard VIPM on this modified command buffer with the CPU (the vertex data in the command stream is ignored), and then throw the buffer at the T&L unit, up to the last triangle you want to draw. And of course you have precompiled versions for different powers of two of vertices, which will bin stuff earlier as well. To be clear, I'm talking about the old-style VIPM, where indices are stored in reverse collapse order, not the stuff Charles was talking about very recently. This is just a rough thought-sketch, but it's essentially the same as normal VIPM, but with push mode instead.
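A minimal sketch of the offline cache simulation described above, assuming a simple FIFO replacement policy (a real implementation would match whatever the hardware's demand-loaded cache would have done, and would inline the actual vertex data at each miss - the `Command` struct and names here are hypothetical, not anyone's actual microcode format):

```cpp
#include <cstdint>
#include <vector>
#include <deque>
#include <algorithm>

// Hypothetical command-buffer entry: either "load this vertex's data into
// cache slot N" (emitted on a miss, vertex data replicated inline in the
// real stream) or "emit index N", where N is a cache slot, not an original
// vertex index - so it fits in a byte for caches of up to 256 entries.
struct Command {
    enum Type { LoadVertex, EmitIndex } type;
    uint8_t slot;    // cache slot this command refers to
    int vertexId;    // original vertex index (for fetching data on a load)
};

// Offline simulation of a vertex cache over the fully-expanded index
// stream: the cache behaviour is computed here, beforehand, so the
// runtime can be pure push-mode DMA.
std::vector<Command> BuildCommandBuffer(const std::vector<int>& indices,
                                        int cacheSize)
{
    std::vector<int> cache(cacheSize, -1); // cache[slot] = vertex id
    std::deque<uint8_t> fifo;              // slots in load order, oldest at front
    std::vector<Command> cmds;

    for (int v : indices) {
        uint8_t slot;
        auto it = std::find(cache.begin(), cache.end(), v);
        if (it != cache.end()) {
            // Hit: the vertex is already resident in some slot.
            slot = (uint8_t)(it - cache.begin());
        } else {
            // Miss: fill the next empty slot, or evict the oldest one,
            // and record the load so the DMA stream carries the data.
            if ((int)fifo.size() < cacheSize) {
                slot = (uint8_t)fifo.size();
            } else {
                slot = fifo.front();
                fifo.pop_front();
            }
            fifo.push_back(slot);
            cache[slot] = v;
            cmds.push_back({Command::LoadVertex, slot, v});
        }
        cmds.push_back({Command::EmitIndex, slot, v});
    }
    return cmds;
}
```

With a 100-entry cache and locality-friendly strips, nearly every index becomes a one-byte `EmitIndex` with no vertex data attached, which is where the bandwidth win comes from.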
It is slightly less efficient: if there is a triangle that references vertex 3, and then that vertex gets binned through an edge collapse, but the triangle is still drawn (because it now uses the other vertex on that edge), then the data for vertex 3 will still be in the "cache", and will still use DMA space. But as soon as all the tris that use this vertex at full rez are binned, that vertex will also be binned, so vertices do eventually vanish from the DMA stream - just not as soon as in the demand-loaded case. I'm sure someone will point out the flaws in this - I haven't really worked it through properly yet.

Note that although the collapse-order indexed list method is now officially Old And Slow (and without a line of code being written :-), that is because of poor vertex cache performance. But in this case, we have a hooooooooj cache, so that really shouldn't matter. And the powers-of-two versions will help a lot - they need to, as the cache use isn't dynamic, so as you do more collapses, you effectively have a smaller cache (since entries are taken up by vertices that are no longer referenced).

Tom Forsyth - Muckyfoot bloke.
Whizzing and pasting and pooting through the day.

> -----Original Message-----
> From: Nicolas Serres [mailto:nic...@ch...]
> Sent: 18 August 2000 00:23
> To: gda...@li...
> Subject: [Algorithms] VIPM console implementations
>
> I'm rather curious about VIPM; I don't use this technique right now, and
> I'd like to know if some of you have already implemented it in an
> efficient way as microcode for a next-gen console having a DMA? I don't
> think indexed stuff is very efficient in that case, but I might be
> wrong. I'd like to hear about your implementation issues.
>
> You tend to have many, many vertices, which can be efficiently
> int-quantized and unpacked after the DMA transfer, so you save lots of
> bandwidth, but you still have to store the unpacked stuff in the small
> "T&L-like" unit memory.
> To be efficient you also have to double-buffer this memory. This leads
> to a model split into relatively small chunks.
>
> Let's say my "hypothetical" console lets me have 8k in a half-buffer. A
> typical simple unpacked vertex (with normal, mapping, and rgb) is about
> 64 bytes in temporary local memory. This leads to about 100 vertices
> plus index data, which is relatively small. Of course I can't have an
> index that references something outside of the current chunk. I think
> this is a very strong limitation for VIPM.
>
> Is VIPM still efficient done that way, with such small chunks? Did I
> miss something?
>
> Nicolas
>
> _______________________________________________
> GDAlgorithms-list mailing list
> GDA...@li...
> http://lists.sourceforge.net/mailman/listinfo/gdalgorithms-list
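For reference, the "standard VIPM on the CPU" step that the reply above layers on top of the command buffer can be sketched like this. This assumes the old-style representation Tom mentions: collapses are precomputed and applied in order, each one patching the index-list entries that referenced the binned vertex and dropping some triangles off the end of the list; the record layout and names here are illustrative, not a specific implementation:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical precomputed edge-collapse record: binning one vertex
// rewrites the surviving index-list entries that referenced it and
// removes some triangles from the end of the list.
struct Collapse {
    std::vector<size_t> patchPositions; // index-list slots to rewrite
    int replacementVertex;              // the kept end of the collapsed edge
    int trianglesRemoved;               // dropped off the end of the list
};

// Apply the first numCollapses collapses to a copy of the full-detail
// index list; returns the patched indices and new triangle count.
// The T&L unit is then fed only the first triangleCount*3 indices -
// "up to the last triangle you want to draw".
std::pair<std::vector<int>, int>
ApplyVipm(std::vector<int> indices, int triangleCount,
          const std::vector<Collapse>& collapses, int numCollapses)
{
    for (int c = 0; c < numCollapses; ++c) {
        for (size_t pos : collapses[c].patchPositions)
            indices[pos] = collapses[c].replacementVertex;
        triangleCount -= collapses[c].trianglesRemoved;
    }
    indices.resize((size_t)triangleCount * 3); // draw only this prefix
    return {indices, triangleCount};
}
```

In the push-mode scheme, the same patching is done on the slot-valued indices inside the precompiled command buffer rather than on raw vertex indices, and the precompiled powers-of-two versions simply restart this process from a smaller full-detail buffer.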