From: Marek O. <ma...@gm...> - 2010-04-03 23:09:58
On Sat, Apr 3, 2010 at 9:31 AM, Tom Stellard <tst...@gm...> wrote:

> 1. Enable branch emulation for Gallium drivers:
> The goal of this task will be to create an optional "optimization" pass
> over the TGSI code to translate branch instructions into instructions
> that are supported by cards without hardware branching. The basic
> strategy for doing this translation will be:
>
> A. Copy values of in scope variables
> to a temporary location before executing the conditional statement.
>
> B. Execute the "if true" branch.
>
> C. Test the conditional expression. If it evaluates to false, roll back
> all values that were modified in the "if true" branch.
>
> D. Repeat step B with the "if false" branch, and then step C, but this
> time only roll back if the conditional expression evaluates to true.
>
> The TGSI instructions SLT, SNE, SGE, SEQ will be used to test the
> conditional expression and the instruction CND will be used to roll back
> the values.
>
> There will be two phases to this task. For phase 1, I will implement a
> simple translator that will be able to translate the branch instructions
> with only one pass through the TGSI code. This simple translator will
> copy all in scope variables to a temporary location before executing the
> conditional statement, even if those variables will not be modified
> in either of the branches.
>
> Phase 2 will add a preliminary pass before the code translation
> pass that will mark variables that might be modified by the conditional
> statement. Then, during the translation pass, only the variables that
> could potentially be modified inside either of the conditional branches
> will be copied before the conditional statement is executed.
>

First, I really appreciate that you're looking into this. I'd like to propose something doable in the GSoC timeframe. Since Nicolai has already implemented the branch emulation and some other optimizations, it would be nice to take over his work.
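
For concreteness, steps A through D of the quoted proposal can be sketched roughly as below. This is only an illustration in Python over a made-up mini-IR; it is not TGSI and not Nicolai's code. The opcode tuples, the `SEL` select op (standing in for CND, with an operand order chosen for this sketch), and the helper names `run`, `written`, and `flatten_if` are all assumptions. It saves only the registers a block writes, which corresponds to the phase 2 refinement.

```python
# Hypothetical mini-IR (NOT TGSI): an instruction is (opcode, dst, *srcs),
# where each source is a register name (str) or a float literal.

def run(instrs, regs):
    """Interpret a straight-line (branch-free) instruction list."""
    regs = dict(regs)
    for op, dst, *srcs in instrs:
        # Undefined registers read as 0.0 in this sketch.
        vals = [regs.get(s, 0.0) if isinstance(s, str) else s for s in srcs]
        if op == "MOV":
            regs[dst] = vals[0]
        elif op == "ADD":
            regs[dst] = vals[0] + vals[1]
        elif op == "MUL":
            regs[dst] = vals[0] * vals[1]
        elif op == "SLT":  # set on less than: 1.0 if a < b else 0.0
            regs[dst] = 1.0 if vals[0] < vals[1] else 0.0
        elif op == "SEL":  # dst = a if cond != 0 else b (stand-in for CND)
            regs[dst] = vals[1] if vals[0] != 0.0 else vals[2]
        else:
            raise ValueError("unknown opcode: " + op)
    return regs


def written(block):
    """Registers a block writes, in first-write order (the phase 2 set)."""
    seen = []
    for instr in block:
        if instr[1] not in seen:
            seen.append(instr[1])
    return seen


def flatten_if(cond, then_block, else_block):
    """Turn if/else into straight-line code using save, execute, roll back."""
    out = []
    then_regs = written(then_block)
    # A. Save the registers the "if true" block may modify.
    out += [("MOV", "save_" + r, r) for r in then_regs]
    # B. Execute the "if true" block unconditionally.
    out += then_block
    # C. Roll back if the condition is false.
    out += [("SEL", r, cond, r, "save_" + r) for r in then_regs]
    # D. Same for the "if false" block, rolling back if the condition is true.
    else_regs = written(else_block)
    out += [("MOV", "save_" + r, r) for r in else_regs]
    out += else_block
    out += [("SEL", r, cond, "save_" + r, r) for r in else_regs]
    return out


# if (a < b) { y = a + b; } else { y = a * b; z = y + 1; }
then_block = [("ADD", "y", "a", "b")]
else_block = [("MUL", "y", "a", "b"), ("ADD", "z", "y", 1.0)]
flat = [("SLT", "c", "a", "b")] + flatten_if("c", then_block, else_block)

taken = run(flat, {"a": 1.0, "b": 2.0, "y": 0.0, "z": 0.0})
not_taken = run(flat, {"a": 3.0, "b": 2.0, "y": 0.0, "z": 0.0})
print(taken["y"], taken["z"], not_taken["y"], not_taken["z"])
```

The sketch assumes the condition register is not written inside either block. The flattened code always executes both branches, which is the price of emulation on hardware without branching.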
I tried to use the branch emulation on vertex shaders and it did not work correctly; I guess it needs a little fixing. See this branch in his repo:

http://cgit.freedesktop.org/~nh/mesa/log/?h=r300g-glsl

This commit in particular implements exactly what you propose (see the comments in the code):

http://cgit.freedesktop.org/~nh/mesa/commit/?h=r300g-glsl&id=71c8d4c745da23b0d4f3974353b19fad89818d7f

Reusing this code for Gallium seems more reasonable to me than reinventing the wheel and doing basically the same thing elsewhere. I recommend implementing a TGSI backend in the r300 compiler, which would make it possible to use the compiler with TGSI shaders. Basically, a TGSI shader would be converted to the RC representation the way it's done in r300g right now, and the code that converts RC -> hw code would be replaced by an RC -> TGSI conversion. RC and TGSI are very similar, so this should be pretty straightforward.

With a TGSI backend in place, the next step would be to put a nice hw-independent and configurable interface on top of it, which should go into util. Up to that point it's simple; then comes the real work: fixing the branch emulation and continuing from (2) in your list. After that, it'll be up to the developers of other drivers whether they want to implement their own hw-specific optimization passes and lowering transformations.

Even linking shaders would be much easier with the compiler (and more efficient, thanks to its elimination of code made dead by removed shader outputs/inputs). This is used in classic r300, and I recall Luca wanted such a feature in the nouveau drivers. There is also emulation of shadow samplers, WPOS, and various instructions, so this is a nice and handy tool. (I would do it myself, but I have a lot of more important stuff to do.) This may really help Gallium drivers until a real optimization framework emerges.

-Marek