From: James S. <arr...@gm...> - 2006-09-04 22:05:32
I have a dual-head ATI system (one GPU, two Render SPUs) that performs very poorly. After much profiling I have determined that it spends nearly 99.4% of its time in fglrx.so, and that the functions in this library are mostly invoked while processing unrolled glDrawElements commands (the glArrayElement calls are what cause this CPU usage). I wrote a simple test application, and my results show poor performance from the ATI GPU when two concurrent processes use VBOs and glArrayElement to draw objects.

Since I only need to render to one of the ATI heads at a time, I think a possible solution is to filter out unneeded glDrawElements commands. This could be done by checking the rendering window's rectangle against the rectangle of each monitor. If the rectangles intersect, we would set the pack buffer to thread->buffer[current_server] and then do what we normally do to translate and pack the command for that server, repeating for each server. When done, we would set the pack buffer back to thread->geometry_buffer (this is what it was before, right?). This would prevent the glDrawElements command from affecting servers that do not have the GL rendering window on them.

Is dropping glDrawElements commands for render SPUs whose monitors don't intersect the OpenGL output window acceptable practice? Will it cause problems for downstream SPUs? What is the best method for integrating such optimizations into Chromium? I am only aware of two types of pack buffers in the tilesort SPU, the geometry_buffer and the server-specific buffers. Are there any others I should know about?

Thank you for your time,

James Steven Supancic III
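P.S. Here is a rough, self-contained sketch of the intersection filter I have in mind, so the idea is concrete. The monitor layout, window rectangle, and the rects_intersect helper are made up for illustration; in the real tilesort SPU the printf calls would be replaced by the pack-buffer switching on thread->buffer[i] and thread->geometry_buffer described above.

    /* Untested sketch of the per-server glDrawElements filter.
     * The rectangles and helper below are stand-ins, not Chromium API. */
    #include <stdio.h>

    typedef struct { int x1, y1, x2, y2; } Rect;

    /* Non-empty overlap test between the GL window and a monitor. */
    static int rects_intersect(const Rect *a, const Rect *b)
    {
        return a->x1 < b->x2 && b->x1 < a->x2 &&
               a->y1 < b->y2 && b->y1 < a->y2;
    }

    int main(void)
    {
        /* Hypothetical layout: two 1280x1024 heads side by side. */
        const Rect monitors[2] = { {0, 0, 1280, 1024},
                                   {1280, 0, 2560, 1024} };
        const Rect window = {100, 100, 900, 700}; /* GL window on head 0 */
        int i;

        for (i = 0; i < 2; i++) {
            if (!rects_intersect(&window, &monitors[i])) {
                /* Server i never sees the window: drop the unrolled
                 * glDrawElements instead of packing it. */
                printf("server %d: skip glDrawElements\n", i);
                continue;
            }
            /* Here we would set the pack buffer to thread->buffer[i],
             * pack the command for server i, and afterwards restore
             * thread->geometry_buffer. */
            printf("server %d: pack glDrawElements\n", i);
        }
        return 0;
    }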