I've looked at that crossbar patch for r200 again and improved it a bit.
It will now
- disable texture sampling of units if the result is not used
- reorder tex env instructions so they are always in order on the gpu
(according to earlier tests, this can make a performance difference.
I've yet to find an app which doesn't enable the units in order; the
only real-world case I've found is the marbleblast demo, and that is
only because it fails the texture completeness test, not because it
actually skips enabling the unit...)
- try to optimize away env instructions. This is not a general
optimizer, which would be very hard to do anyway and more or less
impossible since OpenGL requires the results to be clamped after
each stage, but it will try to ditch a tex env if it is GL_REPLACE
(for both rgb and alpha) by replacing the args in the next tex env.
Seems to work, for instance ut2003 sometimes uses tex envs with 4 units
enabled, and the optimizer reduces this to 3 sampled textures, and 2 env
instructions. Impressive, isn't it? Unfortunately this makes absolutely
no difference in performance... (ut2003 is horribly limited by vertex
throughput with the current state of the driver, and anything which
causes more cpu cycles to be used will probably make it slower, no
matter how many gpu cycles this might save, plus I believe these tex
envs which can be optimized are only used for small parts of the screen).
It MIGHT make more of a performance difference with radeon 8500/9100, as
those can sample more textures per pass (at least under some
circumstances afaik), but have the same amount of arithmetic resources.
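
To illustrate the GL_REPLACE folding described above, here is a rough
sketch in C. It is not the code from the patch: the struct, the enum
names, fold_replace() and run_example() are all made up for this
example, operands are ignored, and rgb and alpha are assumed to share
one mode and one set of args. The 4-stage setup in run_example() is a
hypothetical one, not what ut2003 actually programs.

```c
#include <assert.h>

#define MAX_UNITS 8            /* arbitrary limit for this sketch */

/* Hypothetical arg sources; values >= 0 name an explicit texture unit,
 * crossbar-style. */
enum { SRC_PREVIOUS = -1, SRC_PRIMARY = -2, SRC_OWN_TEXTURE = -3 };

enum mode { MODE_REPLACE, MODE_MODULATE, MODE_ADD };

struct stage {
    enum mode mode;            /* assumed identical for rgb and alpha */
    int arg[2];                /* MODE_REPLACE only uses arg[0] */
};

static int nargs(enum mode m) { return m == MODE_REPLACE ? 1 : 2; }

/* Drop every REPLACE stage that has a successor and forward its arg
 * into the successor's SRC_PREVIOUS args. Returns the number of env
 * instructions left in out[]; *sampled gets one bit per texture unit
 * that a surviving instruction still reads. */
int fold_replace(const struct stage *in, int n,
                 struct stage *out, unsigned *sampled)
{
    int n_out = 0, have_fwd = 0, fwd = SRC_PRIMARY;

    assert(n >= 0 && n <= MAX_UNITS);
    *sampled = 0;
    for (int i = 0; i < n; i++) {
        struct stage s = in[i];

        for (int a = 0; a < nargs(s.mode); a++) {
            if (s.arg[a] == SRC_OWN_TEXTURE) {
                s.arg[a] = i;              /* make the unit explicit */
            } else if (s.arg[a] == SRC_PREVIOUS) {
                if (have_fwd)
                    s.arg[a] = fwd;        /* arg of the folded stage */
                else if (n_out == 0)
                    s.arg[a] = SRC_PRIMARY; /* nothing precedes us */
            }
        }
        if (s.mode == MODE_REPLACE && i + 1 < n) {
            fwd = s.arg[0];                /* fold into the next stage */
            have_fwd = 1;
            continue;
        }
        have_fwd = 0;
        for (int a = 0; a < nargs(s.mode); a++)
            if (s.arg[a] >= 0)
                *sampled |= 1u << s.arg[a];
        out[n_out++] = s;
    }
    return n_out;
}

/* Hypothetical setup with 4 enabled units. */
int run_example(unsigned *sampled)
{
    const struct stage in[4] = {
        { MODE_REPLACE,  { SRC_OWN_TEXTURE, 0 } },
        { MODE_MODULATE, { SRC_PREVIOUS, SRC_OWN_TEXTURE } },
        { MODE_REPLACE,  { SRC_PREVIOUS, 0 } },   /* pure pass-through */
        { MODE_ADD,      { SRC_PREVIOUS, SRC_OWN_TEXTURE } },
    };
    struct stage out[4];

    return fold_replace(in, 4, out, sampled);
}
```

Calling run_example() leaves 2 env instructions and a sampled mask of
0xB (units 0, 1 and 3), so unit 2 never needs to be sampled. The sketch
does not remove stages whose result is simply never read later; that
would need a separate pass.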
Does this look somewhat reasonable? The code is a bit ugly (especially
the GL_REPLACE env optimize stuff), I don't like that the env args have
to be parsed twice, and it does spend some more cpu cycles (roughly
2.5 times as much as previously in the driver's tex env functions
according to some quick profiling, though that was still only 0.2
percent or so). But there doesn't seem to be a good way to clean it up
(without making it quite a bit slower at least).