From: Keith W. <ke...@pr...> - 2000-04-21 22:34:56
OK, for anyone who's interested: I've committed (to the DRI tree under the mga-0-0-3-branch tag) a new, even faster path for MGA setupdma (indexed vertex buffers). I see a 10-15% speedup at the q3dm1 spawn point at 640x480 (more benchmarks later).

The path is pretty simple, and looks something like this:

- do obj->clip transform
- cliptest and project
- walk the clipmask array
  - for each unclipped vertex, emit that vertex to a waiting dma buffer
- walk the element list and identify triangles to render
  - if triangle is unclipped, emit 3 indices to dma
  - if triangle is clipped:
    - build 3 clipspace vertices
    - perform clipping
    - project and emit any newly created vertices to dma
    - emit indices.

Buffers are organized so that indices are emitted from low addresses up (the way you would expect them to be), and vertices are emitted from the other end of the buffer, growing downwards.

There are a couple of restrictions in the current implementation, which may be difficult to remove:

- indices are emitted to hardware as the physical addresses of the referenced vertices in agp space. In order to calculate these, I emit unclipped vertices to a contiguous piece of agp space, allowing a simple relationship between element and physical address. In effect this requires that all the unclipped vertices fit in a single dma buffer. Quake3 calls this path on arrays of up to 1024 vertices; vertices are 10 dwords but must be 4-dword aligned, thus effectively 12 dwords/vertex. This means that dma buffers must be 64k or so in size for this path to be useful for q3.
- there is no way to perform the vertex manipulations necessary to draw accelerated lines or points, or two-sided, flat-shaded, or other exotic triangles with this path.
- mundane assembly issues add a requirement that the ModelProject matrix be "general" -- typically requiring a perspective transform.

These are fairly small limitations, but do justify the continued existence of the fastpath mga hardware.
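To make the buffer layout and the 64k figure concrete, here is a rough C sketch of a double-ended dma buffer: indices grow up from the low end, vertex slots are claimed from the high end. This is not the committed MGA code. The names (dma_buf, slot_phys_addr, and so on) are hypothetical, and it assumes one dword per index and one reserved slot per array element, so that element i maps to a fixed physical address. With a 12-dword stride, 1024 vertices take 1024 * 12 * 4 = 49152 bytes, which leaves 4096 dwords for indices in a 64k buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VERTEX_DWORDS       10u  /* vertex payload, from the post */
#define VERTEX_ALIGN_DWORDS 4u   /* alignment requirement, from the post */

/* Round the vertex size up to the alignment: 10 dwords -> 12 dwords. */
static size_t vertex_stride_dwords(void)
{
    return (VERTEX_DWORDS + VERTEX_ALIGN_DWORDS - 1u)
           / VERTEX_ALIGN_DWORDS * VERTEX_ALIGN_DWORDS;
}

/* Hypothetical buffer descriptor; field names are illustrative only. */
typedef struct {
    uint32_t *base;        /* CPU-visible mapping of the buffer */
    uint32_t  phys_base;   /* assumed physical (agp) address of base */
    size_t    size_dwords; /* total size in dwords */
    size_t    index_head;  /* next free dword at the low end, grows up */
    size_t    vertex_tail; /* lowest dword claimed by vertices, grows down */
} dma_buf;

static void dma_buf_init(dma_buf *b, uint32_t *mem, size_t size_dwords,
                         uint32_t phys_base)
{
    b->base = mem;
    b->phys_base = phys_base;
    b->size_dwords = size_dwords;
    b->index_head = 0;
    b->vertex_tail = size_dwords;
}

/* Claim `count` vertex slots from the high end. Returns 1 on success, or 0
 * if the vertex region would collide with the indices already emitted. */
static int dma_buf_alloc_vertices(dma_buf *b, size_t count)
{
    size_t need = count * vertex_stride_dwords();
    if (need > b->vertex_tail - b->index_head)
        return 0;
    b->vertex_tail -= need;
    return 1;
}

/* Dword offset of vertex slot `slot`; slot 0 sits at the top of the buffer. */
static size_t slot_offset(const dma_buf *b, size_t slot)
{
    size_t used = (slot + 1u) * vertex_stride_dwords();
    assert(used <= b->size_dwords - b->vertex_tail); /* slot must be claimed */
    return b->size_dwords - used;
}

/* The "simple relationship" between an element and its physical address. */
static uint32_t slot_phys_addr(const dma_buf *b, size_t slot)
{
    return b->phys_base + (uint32_t)(slot_offset(b, slot) * sizeof(uint32_t));
}

/* Copy one 10-dword vertex into its slot; the 2 padding dwords stay unused. */
static void write_vertex(dma_buf *b, size_t slot,
                         const uint32_t v[VERTEX_DWORDS])
{
    memcpy(b->base + slot_offset(b, slot), v,
           VERTEX_DWORDS * sizeof(uint32_t));
}

/* Emit one index (a vertex's physical address) at the low end.
 * Returns 0 if the index region has run into the vertex region. */
static int emit_index(dma_buf *b, uint32_t phys_addr)
{
    if (b->index_head >= b->vertex_tail)
        return 0;
    b->base[b->index_head++] = phys_addr;
    return 1;
}
```

In this sketch, a vertex created by clipping would be handled by calling dma_buf_alloc_vertices(b, 1) and writing it at the new vertex_tail. The two regions share one buffer, so either side can use whatever space the other leaves free.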
I've been able to reuse the assembly support from the "main" mesa path (the slowpath?) in this code. By writing some new assembly a small additional speedup might be gained.

Keith