RE: [Algorithms] VIPM console implementations
From: Tom F. <to...@mu...> - 2000-08-18 00:00:57
I've been thinking about this a bit, and probably for exactly the same hypothetical equipment you're thinking of. It's certainly a real pain having to feed your T&L through push-mode streaming DMA, rather than a demand-loaded vertex cache.

My current thoughts are to (offline) take the fully-expanded stream of indices (whether list or strip, doesn't really matter). Then simulate a vertex cache of around the size that will fit - you mention 100 verts, which is hooooj by vertex cache standards. Anyway, you will of course get very few cache misses, but you will get a few. While doing this, write out a "command buffer" sort of thing that contains the indices, plus the vertex data whenever the cache misses (replicated for each miss), along with the position in the cache that each loaded vertex will go to. The indices are these cache positions, not their original values (so in this case you can shrink them to bytes, not words, which is a bonus). So you are basically implementing a vertex cache, but where the behaviour of the cache is calculated beforehand, offline, so that you can use the push model.

Now, to use VIPM, you do standard VIPM on this modified command buffer with the CPU (the vertex data in the command stream is ignored), and then throw the buffer at the T&L unit, up to the last triangle you want to draw. And of course you have precompiled versions for different powers of two of vertices, which will bin stuff earlier as well. To be clear, I'm talking about the old-style VIPM, where indices are stored in reverse collapse order, not the stuff Charles was talking about very recently. This is just a rough thought-sketch, but it's essentially the same as normal VIPM, but with push mode instead.
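A minimal sketch of the offline cache simulation described above, assuming a simple FIFO replacement policy (a real implementation would match whatever the hardware's demand-loaded cache would have done, and would inline the actual vertex data at each miss - the `Command` struct and names here are hypothetical, not anyone's actual microcode format):

```cpp
#include <cstdint>
#include <vector>
#include <deque>
#include <algorithm>

// Hypothetical command-buffer entry: either "load this vertex's data into
// cache slot N" (emitted on a miss, vertex data replicated inline in the
// real stream) or "emit index N", where N is a cache slot, not an original
// vertex index - so it fits in a byte for caches of up to 256 entries.
struct Command {
    enum Type { LoadVertex, EmitIndex } type;
    uint8_t slot;    // cache slot this command refers to
    int vertexId;    // original vertex index (for fetching data on a load)
};

// Offline simulation of a vertex cache over the fully-expanded index
// stream: the cache behaviour is computed here, beforehand, so the
// runtime can be pure push-mode DMA.
std::vector<Command> BuildCommandBuffer(const std::vector<int>& indices,
                                        int cacheSize)
{
    std::vector<int> cache(cacheSize, -1); // cache[slot] = vertex id
    std::deque<uint8_t> fifo;              // slots in load order, oldest at front
    std::vector<Command> cmds;

    for (int v : indices) {
        uint8_t slot;
        auto it = std::find(cache.begin(), cache.end(), v);
        if (it != cache.end()) {
            // Hit: the vertex is already resident in some slot.
            slot = (uint8_t)(it - cache.begin());
        } else {
            // Miss: fill the next empty slot, or evict the oldest one,
            // and record the load so the DMA stream carries the data.
            if ((int)fifo.size() < cacheSize) {
                slot = (uint8_t)fifo.size();
            } else {
                slot = fifo.front();
                fifo.pop_front();
            }
            fifo.push_back(slot);
            cache[slot] = v;
            cmds.push_back({Command::LoadVertex, slot, v});
        }
        cmds.push_back({Command::EmitIndex, slot, v});
    }
    return cmds;
}
```

With a 100-entry cache and locality-friendly strips, nearly every index becomes a one-byte `EmitIndex` with no vertex data attached, which is where the bandwidth win comes from.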
It is slightly less efficient: if there is a triangle that references vertex 3, and then that vertex gets binned through an edge collapse, but the triangle is still drawn (because it now uses the other vertex on that edge), then the data for vertex 3 will still be in the "cache", and will still use DMA space. But as soon as all the tris that use this vertex at full rez are binned, that vertex will also be binned, so vertices do eventually vanish from the DMA stream - just not as soon as in the demand-loaded case. I'm sure someone will point out the flaws in this - I haven't really worked it through properly yet.

Note that although the collapse-order indexed list method is now officially Old And Slow (and without a line of code being written :-), that is because of poor vertex cache performance. But in this case, we have a hooooooooj cache, so that really shouldn't matter. And the powers-of-two versions will help a lot - they need to, as the cache use isn't dynamic, so as you do more collapses, you effectively have a smaller cache (since entries are taken up by vertices that are no longer referenced).

Tom Forsyth - Muckyfoot bloke.
Whizzing and pasting and pooting through the day.

> -----Original Message-----
> From: Nicolas Serres [mailto:nic...@ch...]
> Sent: 18 August 2000 00:23
> To: gda...@li...
> Subject: [Algorithms] VIPM console implementations
>
> I'm rather curious about VIPM; I don't use this technique right now, and
> I'd like to know if some of you have already implemented it in an
> efficient way as microcode for a next-gen console having a DMA? I don't
> think indexed stuff is very efficient in that case, but I might be
> wrong. I'd like to hear about your implementation issues.
>
> You tend to have many, many vertices, which can be efficiently
> int-quantized and unpacked after the DMA transfer, so you save lots of
> bandwidth, but you still have to store the unpacked stuff in the small
> "T&L-like" unit memory.
> To be efficient you also have to double-buffer this memory. This leads
> to a model split into relatively small chunks.
>
> Let's say my "hypothetical" console lets me have 8k in a half-buffer. A
> typical simple unpacked vertex (with normal, mapping, and rgb) is about
> 64 bytes in temporary local memory. This leads to about 100 vertices
> plus index data, which is relatively small. Of course I can't have an
> index that references something outside of the current chunk. I think
> this is a very strong limitation for VIPM.
>
> Is VIPM still efficient done that way, with such small chunks? Did I
> miss something?
>
> Nicolas
>
> _______________________________________________
> GDAlgorithms-list mailing list
> GDA...@li...
> http://lists.sourceforge.net/mailman/listinfo/gdalgorithms-list
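For reference, the "standard VIPM on the CPU" step that the reply above layers on top of the command buffer can be sketched like this. This assumes the old-style representation Tom mentions: collapses are precomputed and applied in order, each one patching the index-list entries that referenced the binned vertex and dropping some triangles off the end of the list; the record layout and names here are illustrative, not a specific implementation:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical precomputed edge-collapse record: binning one vertex
// rewrites the surviving index-list entries that referenced it and
// removes some triangles from the end of the list.
struct Collapse {
    std::vector<size_t> patchPositions; // index-list slots to rewrite
    int replacementVertex;              // the kept end of the collapsed edge
    int trianglesRemoved;               // dropped off the end of the list
};

// Apply the first numCollapses collapses to a copy of the full-detail
// index list; returns the patched indices and new triangle count.
// The T&L unit is then fed only the first triangleCount*3 indices -
// "up to the last triangle you want to draw".
std::pair<std::vector<int>, int>
ApplyVipm(std::vector<int> indices, int triangleCount,
          const std::vector<Collapse>& collapses, int numCollapses)
{
    for (int c = 0; c < numCollapses; ++c) {
        for (size_t pos : collapses[c].patchPositions)
            indices[pos] = collapses[c].replacementVertex;
        triangleCount -= collapses[c].trianglesRemoved;
    }
    indices.resize((size_t)triangleCount * 3); // draw only this prefix
    return {indices, triangleCount};
}
```

In the push-mode scheme, the same patching is done on the slot-valued indices inside the precompiled command buffer rather than on raw vertex indices, and the precompiled powers-of-two versions simply restart this process from a smaller full-detail buffer.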