From: Nicolai H. <nha...@gm...> - 2010-04-05 15:06:34
On Sun, Apr 4, 2010 at 10:42 PM, Luca Barbieri <luc...@gm...> wrote:
>> Way back I actually looked into LLVM for R300. I was totally
>> unconvinced by their vector support back then, but that may well have
>> changed. In particular, I'm curious about how LLVM deals with
>> writemasks. Writing to only a select subset of components of a vector
>> is something I've seen in a lot of shaders, but it doesn't seem to be
>> too popular in CPU-bound SSE code, which is probably why LLVM didn't
>> support it well. Has that improved?
>>
>> The trouble with writemasks is that it's not something you can just
>> implement one module for. All your optimization passes, from simple
>> peephole to the smartest loop modifications, need to understand the
>> meaning of writemasks.
>
> You should be able to just use
> shufflevector/insertelement/extractelement to mix the new computed
> values with the previous values in the vector register (as well as
> doing swizzles).

Okay, that looks good.

> There is also the option of immediately scalarizing, optimizing the
> scalar code, and then revectorizing.
> This risks pessimizing the input code, but might turn out to work well.

This might depend on the target: R600+, for example, is quite
scalar-oriented anyway (modulo a lot of subtle limitations), so just
pretending that everything is scalar could work well there since
revectorizing is almost unnecessary.

cu,
Nicolai
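
For illustration, here is a minimal sketch in plain C of the idea discussed above: a writemasked store expressed as a shuffle that mixes the newly computed vector with the previous register contents. This is not LLVM code. The names vec4, shuffle2 and write_masked are hypothetical, and shuffle2 only imitates the index convention of LLVM's shufflevector for two 4-wide inputs (indices 0..3 pick from the first operand, 4..7 from the second).

```c
#include <assert.h>

/* Hypothetical 4-component register, as used by shader ISAs. */
typedef struct { float c[4]; } vec4;

/* Imitates shufflevector on two 4-wide inputs:
 * index 0..3 selects a component of a, index 4..7 selects from b. */
static vec4 shuffle2(vec4 a, vec4 b, const int idx[4])
{
    vec4 r;
    for (int i = 0; i < 4; i++) {
        assert(idx[i] >= 0 && idx[i] < 8);
        r.c[i] = idx[i] < 4 ? a.c[idx[i]] : b.c[idx[i] - 4];
    }
    return r;
}

/* Writemask as a shuffle: if bit i of mask is set, component i comes
 * from new_val, otherwise the old value is kept. */
static vec4 write_masked(vec4 old, vec4 new_val, unsigned mask)
{
    int idx[4];
    for (int i = 0; i < 4; i++)
        idx[i] = (mask & (1u << i)) ? 4 + i : i;
    return shuffle2(old, new_val, idx);
}
```

With mask 0x5 (the x and z components), write_masked keeps y and w from the old register and takes x and z from the new value. A swizzle such as .wzyx is the same primitive with both operands equal and the index list {3, 2, 1, 0}. Since the writemask is expressed through the ordinary shuffle operation, passes that already understand shuffles need no separate notion of writemasks.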