Another option, even if you can’t find a much faster
approximation, is to unroll your loop 4 times and calculate 4 pows at the same
time using SSE/VMX.
From: Jeff Russell
Sent: 19 August 2010 2:16am
To: Game Development Algorithms
Subject: [Algorithms] fast pow() for limited inputs
So I need to speed up the CRT pow() function a bit, but I
have some restrictions on the input which hopefully should give me some room to
I need to compute pow(x,y), where x is on [0,1], and y is guaranteed positive
and could have an upper bound in the neighborhood of 1000.
I need "reasonable" accuracy (could be a little looser than the
standard pow() implementation). I've searched online and found some bit
twiddling approaches that claim to be very fast, but they seem to be too
inaccurate for my purposes. I've tried implementing pow() as exp( log(x), y ),
with my own cheap Taylor series in place of the natural log function. It did
produce good output but wasn't very fast (slightly slower than the CRT pow()).
It is probably worth mentioning before anyone asks that yes I have confirmed
pow() as the bottleneck with a profiling tool ;-)
I would also love to just see a sample implementation of pow(), log(), and
exp() somewhere, even that might be helpful.
Engineer, 8monkey Labs