From: Roland S. <sr...@tu...> - 2007-02-12 15:52:07
Roland Scheidegger wrote:
>>> Rune Petersen
>>>
>> Ok commited.
>
> I didn't look too closely at this but I've a couple of comments.
> - COS looks too complicated & broken. If you'd want to get 2 with a
> LOG2, you'd need 0.25 as source. But even using RCP instead, that's 5
> instructions before performing the sine, for something you can easily do
> in two, using another constant (just 1 add + 1 cmp needed, if you use
> the right constants for the add). Maybe it's not that bad though, I
> don't know how many rgb and a slots it will actually consume, but still,
> are constant slots that rare?
> Second, you'd really need to do range reduction of the input, otherwise
> results will be very wrong for inputs outside [-pi, pi]. This would be
> true for taylor approximation too, of course, unless you do an infinite
> series :-). You wouldn't need to do that for SCS.

Oh, and I forgot to mention, you probably really want to use the higher
precision variant by default. 12% max relative error (and even absolute
it's still 6%) will likely be visible in some cases, depending on what the
shader is doing. Even the enhanced version seems to miss OpenGL
conformance (accurate to "about 1 part in 10^5") by roughly a factor of
10, which probably already stretches the meaning of "about" a bit.

You could also rely on the precision hint for fragment programs to switch
to the faster version instead of a dri conf option (note though that the
spec explicitly states implementations are discouraged, even in this case,
from performing optimizations which could have a significant impact on the
output).

Roland
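
For reference, here is a rough Python sketch of the two points above: range reduction of the input to [-pi, pi], and getting COS from the sine path with one add plus one compare. This is not the driver code. The function names are made up for illustration, and `parabola_sin` is only a stand-in for whatever approximation the shader actually uses; the only thing it shares with the real one is that it is valid on [-pi, pi] and nowhere else.

```python
import math

TWO_PI = 2.0 * math.pi


def reduce_to_pi(x):
    """Map x to an equivalent angle in [-pi, pi) (hypothetical helper)."""
    return x - TWO_PI * math.floor(x / TWO_PI + 0.5)


def parabola_sin(x):
    """Stand-in sine approximation; only meaningful for x in [-pi, pi]."""
    return (4.0 / math.pi) * x - (4.0 / math.pi ** 2) * x * abs(x)


def cos_via_sin(x, sin_fn=parabola_sin):
    """cos(x) = sin(x + pi/2), keeping the shifted angle inside [-pi, pi]."""
    x = reduce_to_pi(x)
    shifted = x + math.pi / 2      # the "1 add"
    if shifted > math.pi:          # the "1 cmp": wrap back into range
        shifted -= TWO_PI
    return sin_fn(shifted)


if __name__ == "__main__":
    # Without range reduction the approximation is far outside [-1, 1].
    print(parabola_sin(10.0), math.sin(10.0))
    # With range reduction it is back in the right neighbourhood.
    print(parabola_sin(reduce_to_pi(10.0)), math.sin(10.0))
    print(cos_via_sin(2.5), math.cos(2.5))
```

In a fragment program the compare would presumably select between two precomputed sums rather than branch, which is my reading of "the right constants for the add"; treat that as an assumption.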