Another option, even if you can’t find a much faster
approximation, is to unroll your loop 4 times and calculate 4 pows at the same
time using SSE/VMX.

**From:** Jeff Russell
[mailto:jeffdr@8monkeylabs.com]

**Sent:** 19 August 2010 2:16am

**To:** Game Development Algorithms

**Subject:** [Algorithms] fast pow() for limited inputs

So I need to speed up the CRT pow() function a bit, but I
have some restrictions on the input which hopefully should give me some room to
optimize:

I need to compute pow(x,y), where x is on [0,1], and y is guaranteed positive
and could have an upper bound in the neighborhood of 1000.

I need "reasonable" accuracy (could be a little looser than the
standard pow() implementation). I've searched online and found some bit
twiddling approaches that claim to be very fast, but they seem to be too
inaccurate for my purposes. I've tried implementing pow() as exp( log(x), y ),
with my own cheap Taylor series in place of the natural log function. It did
produce good output but wasn't very fast (slightly slower than the CRT pow()).
It is probably worth mentioning before anyone asks that yes I have confirmed
pow() as the bottleneck with a profiling tool ;-)

I would also love to just see a sample implementation of pow(), log(), and
exp() somewhere, even that might be helpful.

Thanks,

Jeff

--

Jeff Russell

Engineer, 8monkey Labs

www.8monkeylabs.com