RE: [Algorithms] VIPM With T&L - what about roam
From: Tom F. <to...@mu...> - 2000-09-15 11:16:27
> From: Charles Bloom [mailto:cb...@cb...]
>
> [snip]
>
> >2) In the simple VIPM scheme, just send three indices
> >per tri, and verts and tris are listed in the arrays in
> >progressive split order, the indices need to be updated
> >for triangles whose vertices moved in any splits
> >introduced in the frame, which you precompute as a kind
> >of table lookup of state-change lists. These changes
> >will generally be scattered through index memory,
> >leading to bad PC-memory cache behavior. But you
> >hope that only a tiny fraction of the indices need
> >to be updated per frame (this is a very ROAM-like
> >hope ;-).
>
> This is actually not a problem for the CPU, because you
> never ever read from the index list. That means all you
> ever do is writes, and you can just fire them off, and
> they get retired asynchronously, and the CPU never stalls.
> If you want to get fancy you can even tell the CPU cache
> not to mirror that memory in cache, just write straight
> to main memory.

In DX8, there are "index buffers" that can be (and usually will be)
placed in AGP memory, so this behaviour will apply automagically - AGP
memory is uncached, and is written to using the writeback queue. I am
sure there will be similar optimisations under OpenGL if possible.

Yes, the data structures being read (the collapse/expand data) are
strictly linear, and so well-cached. At the moment, the destination is
in system memory and fairly random, and so is poorly cached (sadly on
x86 architectures, the memory is read into the cache, even if you only
ever write to it), but this is a feature of the API rather than the
hardware, and once they move to AGP memory (DX8 is released in about a
month or two), this will be sorted.

> The wonderful Athlon chip can have 32
> outstanding queue'd stores, so this is no problemo. The
> problem would only arise if you write and read from the same
> data structure, which we would never do.
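To make the write-only point concrete, here is a minimal C++ sketch of applying precomputed collapse/expand records to an index list. It is an illustration under assumptions, not code from any engine or from the DX8 API: the names IndexFixup, CollapseRecord, VipmMesh, collapse_one and expand_one are all made up for this example, 16-bit indices are assumed, and a std::vector stands in for the real index buffer. Each fix-up carries both the collapsed and the expanded value, so the index list is only ever stored to and never loaded from, while the records themselves are walked linearly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: one precomputed "write this value at this offset".
struct IndexFixup {
    std::uint32_t index_offset;  // position in the index list
    std::uint16_t collapsed;     // value to store when the edge collapses
    std::uint16_t expanded;      // value to store when the edge is split again
};

// Hypothetical layout: one collapse, stored in collapse order.
struct CollapseRecord {
    std::uint32_t first_fixup;   // first entry in VipmMesh::fixups
    std::uint32_t fixup_count;   // number of fix-ups for this collapse
    std::uint32_t tris_removed;  // triangles dropped from the end of the list
};

struct VipmMesh {
    std::vector<CollapseRecord> records;  // read linearly
    std::vector<IndexFixup> fixups;       // read linearly
    std::uint32_t current = 0;            // collapses currently applied
    std::uint32_t tri_count = 0;          // triangles to draw
};

// Store the fix-ups of one record into the index list. Only stores are
// issued to `indices`; the old values are never loaded.
static void apply_record(const VipmMesh& m, const CollapseRecord& r,
                         std::vector<std::uint16_t>& indices, bool collapsing) {
    for (std::uint32_t i = 0; i < r.fixup_count; ++i) {
        const IndexFixup& f = m.fixups[r.first_fixup + i];
        assert(f.index_offset < indices.size());
        indices[f.index_offset] = collapsing ? f.collapsed : f.expanded;
    }
}

// Apply the next collapse. Returns false if the mesh is fully collapsed.
bool collapse_one(VipmMesh& m, std::vector<std::uint16_t>& indices) {
    if (m.current == m.records.size()) return false;
    const CollapseRecord& r = m.records[m.current];
    apply_record(m, r, indices, true);
    m.tri_count -= r.tris_removed;
    ++m.current;
    return true;
}

// Undo the most recent collapse. Returns false if nothing is collapsed.
bool expand_one(VipmMesh& m, std::vector<std::uint16_t>& indices) {
    if (m.current == 0) return false;
    --m.current;
    const CollapseRecord& r = m.records[m.current];
    apply_record(m, r, indices, false);
    m.tri_count += r.tris_removed;
    return true;
}
```

For an index list 0,1,3, 1,2,3 where vertex 3 collapses into vertex 2, a single fix-up (offset 2, collapsed 2, expanded 3) plus dropping the last triangle is all that is needed.
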
> Note that you must,
> however, wait a while before rendering after you make your
> index changes, or your AGP DMA will stall on the CPU stores
> finishing. If this were the bottleneck, we'd be golden.

Since drivers typically do around a frame's worth of data buffering
anyway, this is almost never a problem. The writes will be finished
well before the chip needs them.

[snip]

> For example, consider the "skip strips" of El-Sana et. al.

Whoops - just realised that in my previous mails, I've been using
"skipstrips" in an ambiguous way. What I mean is "VIPM skipstrips",
i.e. VIPM strips that just keep the existing strip, but make some tris
in the middle of it degenerate when they collapse an edge. I don't mean
that the V_D_PM part of El-Sana et al. is used, just the trick of using
strips.

[snip]

> Charles Bloom www.cbloom.com

Tom Forsyth - Muckyfoot bloke.
Whizzing and pasting and pooting through the day.
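As a small illustration of the "VIPM skipstrips" idea described above, here is a C++ sketch. It is only a sketch under assumptions, not anyone's actual implementation: apply_strip_collapse and count_live_triangles are hypothetical names, the offsets to overwrite are assumed to be precomputed, and a std::vector stands in for the strip's index buffer. Collapsing an edge overwrites a few strip entries so that the triangles sharing that edge repeat an index and become degenerate, while the strip itself keeps its length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Overwrite the strip entries at the given (precomputed) offsets with the
// vertex that the collapsed vertex folds into. The strip length is unchanged.
void apply_strip_collapse(std::vector<std::uint16_t>& strip,
                          const std::vector<std::size_t>& offsets,
                          std::uint16_t target) {
    for (std::size_t off : offsets) {
        assert(off < strip.size());
        strip[off] = target;  // store only, the old value is not needed
    }
}

// Count the triangles in a strip whose three indices are all distinct.
// This is for inspection only; triangles with a repeated index have zero
// area and draw nothing.
std::size_t count_live_triangles(const std::vector<std::uint16_t>& strip) {
    std::size_t live = 0;
    for (std::size_t i = 0; i + 2 < strip.size(); ++i) {
        const std::uint16_t a = strip[i];
        const std::uint16_t b = strip[i + 1];
        const std::uint16_t c = strip[i + 2];
        if (a != b && b != c && a != c) ++live;
    }
    return live;
}
```

For the strip 0,1,2,3,4, collapsing vertex 3 into vertex 2 means writing 2 at offset 3. The two triangles that shared edge 2-3 go degenerate, and one live triangle remains out of the original three.
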