Re: [Algorithms] VIPM With T&L - what about roam
From: Mark D. <duc...@ll...> - 2000-09-14 18:07:09
Tom,

So the picture you and Charles are painting is this (see if I get this right):

1) Send vert attribs (x,y,z,u,v,...) and index arrays across the AGP bus each frame, and let the textures and frame buffer dominate the on-card memory bandwidth. If all your lighting is done in textures/normal maps, and if you use tri bintree meshes per surface "patch", then the info per vert is (x,y,z): 3 floats=12 bytes, plus (u,v): 2 shorts=4 bytes, total=16 bytes. Since you are sending each vert across the AGP bus, and there is only a dinky little cache on the other end, you have to be very careful to arrange the order the verts are indexed to avoid sending them multiple times. This is of course dependent on what the hardware's replacement strategy is and what the cache size is.

2) In the simple VIPM scheme, just send three indices per tri, with verts and tris listed in the arrays in progressive split order. The indices need to be updated for triangles whose vertices moved in any splits introduced in the frame, which you precompute as a kind of table lookup of state-change lists. These changes will generally be scattered through index memory, leading to bad PC-memory cache behavior. But you hope that only a tiny fraction of the indices need to be updated per frame (this is a very ROAM-like hope ;-). The state-change info takes 14 bytes per vert according to Charles' web page, so you are almost doubling the mem per vert. If your progressive scheme was tri bintree split-only order, then there is no additional storage for index changes; you just know what they are based on which diamonds (which correspond one-to-one with verts) are split (Charles alluded to this on his VIPM page).

Per-frame index transmission across the AGP bus per tri is 3 shorts=6 bytes. So if you are very lucky and send each vertex (16 bytes) once, then on average you have 2 tris per vert and so 16+12=28 bytes per vert including indices.
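The byte accounting above can be sketched in a few lines of Python. This is only a back-of-envelope check of the figures quoted in the message; the function and parameter names are made up for illustration and do not come from any graphics API.

```python
# Back-of-envelope byte accounting for the per-vertex figures above.
# All names here are illustrative assumptions, not a real API.

FLOAT_BYTES = 4  # 32-bit float
SHORT_BYTES = 2  # 16-bit short

def vertex_bytes(num_floats=3, num_shorts=2):
    """Size of one vertex: (x,y,z) as floats plus (u,v) as shorts."""
    return num_floats * FLOAT_BYTES + num_shorts * SHORT_BYTES

def index_bytes_per_tri(indices_per_tri=3):
    """Size of one indexed triangle: three 16-bit indices."""
    return indices_per_tri * SHORT_BYTES

def bus_bytes_per_vert(times_sent=1, tris_per_vert=2):
    """Bytes crossing the bus per vertex, including its share of index data."""
    return times_sent * vertex_bytes() + tris_per_vert * index_bytes_per_tri()

print(vertex_bytes())         # 16
print(index_bytes_per_tri())  # 6
print(bus_bytes_per_vert())   # 28 (each vertex sent exactly once)
```

The `times_sent` parameter is there because the whole estimate hinges on how often the small on-card cache forces a vertex to be re-sent.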
If you are at "infinite strip optimum" you get each vertex sent twice, leading to 2*16+12=44 bytes sent per vert including indices. Let's imagine you are using a graphics chip capable of 30M tris/sec, and you want to actually achieve this (ha!): this would mean pumping 840-1320M bytes/sec over the bus. Okay, the bus can handle this in theory on AGP4x (1GB/sec) on the wildly optimistic side, but not in any real situation. Also, this is sucking up a big chunk of your PC memory system bandwidth *continuously*, so the rest of your app is going to take a performance nose-dive. So...does it make sense to put some geometry info into graphics-card mem?

Of course the optimistic scenario requires extreme care in the order the tris are listed and indexed. Since this is not a single static mesh, you have to come up with index orders that are best for the whole range of surfaces you get, not just one. If you really want to minimize the number of times verts are sent, you need to allow much more index manipulation per frame to optimize, whether via precomputed state-change lists or through some yet unknown on-the-fly technique.

3) In the "stripped" version of VIPM, cover the chunk with strips (chosen in a particular way?) and fiddle with the indices just as in case (2). But keep drawing the same strips. This means you send the index data for the whole chunk at full res. This limits how big a swing in resolution you can have in a chunk before this cost dominates. If you are at full res then the index cost is 1/3 of case (2). If you are at 1/3 res then the cost is the same as case (2). Since you are trying to force strip order, you will generally do no better than case (2) for vertex on-card cache coherence, and probably worse. The card has to expend some effort in theory to eliminate the degenerate tris, although this could be negligible for a good card/driver. I don't see this as either a big win or big loss versus the simple scheme, so I would tend to go simple.
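A rough Python sketch of the two estimates above: the bus load in case (2) and the strip-versus-list index cost in case (3). The names are invented for illustration. Two assumptions are worth flagging. First, long strips are taken to cost about one index per full-res tri, ignoring strip start-up indices, which is what the "1/3 of case (2)" figure implies. Second, the 840-1320M range falls out when the per-vert cost is applied 30M times per second; at 2 tris per vert, 30M tris/sec works out to 15M verts/sec, so the rate is left as an explicit parameter.

```python
# Rough check of the bus-bandwidth and index-cost estimates above.
# Function and parameter names are illustrative assumptions only.

SHORT_BYTES = 2    # one 16-bit index
VERTEX_BYTES = 16  # (x,y,z) floats + (u,v) shorts

def bus_bytes_per_vert(times_sent, tris_per_vert=2):
    """Vertex data sent times_sent times, plus 3 indices per triangle."""
    return times_sent * VERTEX_BYTES + tris_per_vert * 3 * SHORT_BYTES

def bus_bytes_per_sec(bytes_per_vert, verts_per_sec):
    """Sustained bus traffic for a given per-vertex cost and vertex rate."""
    return bytes_per_vert * verts_per_sec

def list_index_bytes(current_tris):
    """Case (2): three indices per triangle at the current resolution."""
    return 3 * SHORT_BYTES * current_tris

def strip_index_bytes(full_res_tris):
    """Case (3): assumed ~one index per full-res triangle, always sent."""
    return SHORT_BYTES * full_res_tris

def strip_to_list_ratio(full_res_tris, current_tris):
    """Index cost of the stripped scheme relative to the simple list."""
    return strip_index_bytes(full_res_tris) / list_index_bytes(current_tris)

lucky = bus_bytes_per_vert(times_sent=1)      # 28 bytes per vert
strip_opt = bus_bytes_per_vert(times_sent=2)  # 44 bytes per vert

# Per-vert cost applied 30M times per second, as in the quoted range.
print(bus_bytes_per_sec(lucky, 30_000_000))      # 840000000
print(bus_bytes_per_sec(strip_opt, 30_000_000))  # 1320000000

# Same costs at 15M verts/sec (30M tris/sec at 2 tris per vert).
print(bus_bytes_per_sec(lucky, 15_000_000))      # 420000000
print(bus_bytes_per_sec(strip_opt, 15_000_000))  # 660000000

# Stripped vs. list index cost for a hypothetical 9000-tri chunk.
print(strip_to_list_ratio(9000, 9000))  # full res: about 1/3
print(strip_to_list_ratio(9000, 3000))  # 1/3 res: 1.0
```

Below 1/3 res the ratio climbs above 1, which is the point where the fixed full-res strip index cost starts to dominate.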
Of course, you could use the incremental stripping idea from the ROAM paper, which works on any locally updating mesh including PM. Since you are clearly hoping for coherence in the index-update step, this is a cheap way to make pretty good strips. Plus it avoids the issue of losing vertex-cache coherence for any but the one mesh you optimized for.

--Mark D.

Tom Forsyth wrote:
> Oops. Yes.
>
> Tom Forsyth - Muckyfoot bloke.
> Whizzing and pasting and pooting through the day.
>
> > From: Tony Cox [mailto:to...@mi...]
> >
> > >In practice the extra bandwidth of the indices is pretty tiny. One DWORD per
> > >tri? Peanuts, considering your average fairly efficient vertex-caching
> > >scheme will need to load a whole vertex (around 32 DWORDS) per tri.
> >
> > You mean 32 BYTEs not DWORDs, right Tom? Still much bigger than the index
> > data though. Your other comments about lists versus strips for VIPM seem
> > right on the money.
> >
> > Tony Cox - DirectX Luminary