Re: [Algorithms] fast pow() for limited inputs

Depending on accuracy needs, a look-up table with interpolation (say,
cubic interpolation) can be perfectly adequate. Consider the 256x256
(0-255/0-255) texture typically used to approximate pow() for specular
lighting back in the day; if you use bytes to represent that 0..1
range, it is 65,536 bytes (64 KiB) and thus fits in L2 on a modern CPU.
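
A minimal sketch of such a table in C, under some assumptions that are
not from the thread: the names pow_table, pow_table_init and pow_lookup
are made up for illustration, the exponent index is taken to be the
integer exponent itself (a real specular table might map 0-255 to some
other exponent range), and it interpolates linearly along the base axis
rather than cubically to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

/* 256 x 256 bytes = 65,536 bytes. Row = exponent, column = base.
   Assumption: row e holds x^e for the integer exponent e. */
static uint8_t pow_table[256][256];

/* Fill the table once at startup. Each column is built by repeated
   multiplication, so no call to pow() is needed. */
static void pow_table_init(void)
{
    for (int b = 0; b < 256; ++b) {
        double base = b / 255.0;
        double y = 1.0;                      /* base^0 */
        for (int e = 0; e < 256; ++e) {
            pow_table[e][b] = (uint8_t)(y * 255.0 + 0.5);
            y *= base;                       /* now base^(e+1) */
        }
    }
}

/* Approximate pow(x, e) for x in [0, 1] and integer e in [0, 255] by
   blending linearly between the two nearest base entries. */
static float pow_lookup(float x, int e)
{
    assert(e >= 0 && e <= 255);
    if (x <= 0.0f) return pow_table[e][0] / 255.0f;
    if (x >= 1.0f) return pow_table[e][255] / 255.0f;

    float pos = x * 255.0f;          /* fractional column index */
    int i0 = (int)pos;               /* left neighbour */
    if (i0 > 254) i0 = 254;          /* guard against rounding up to 255 */
    float t = pos - (float)i0;       /* blend weight between neighbours */
    float y0 = pow_table[e][i0];
    float y1 = pow_table[e][i0 + 1];
    return (y0 + t * (y1 - y0)) / 255.0f;
}
```

Call pow_table_init() once, then pow_lookup(x, e) wherever pow() was
used. With byte entries the result can never be more precise than about
1/255, and the error grows for high exponents near x=1, where the curve
is steepest.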

Another, even faster, and even worse, approximation is simply to make
a line from (1,1) that intersects y=0 somewhere between x=0 and x=1,
clamp it to zero left of that intersection, and move the intersection
farther to the right for higher exponents. It all depends on what kind
of precision you need this for.
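
One way to sketch that line in C. The choice of where the line crosses
y=0 is my assumption, not something stated above: here it crosses at
x = 1 - 1/n, which makes the line the tangent of x^n at x=1. The name
pow_line is made up for illustration.

```c
#include <assert.h>

/* Straight line through (1,1), clamped to zero where it goes negative.
   Assumption: the slope is n, so the line crosses y = 0 at x = 1 - 1/n
   and that crossing moves right as the exponent n grows. */
static float pow_line(float x, float n)
{
    assert(n >= 1.0f);               /* keeps the crossing inside [0, 1) */
    float y = 1.0f + n * (x - 1.0f);
    return y > 0.0f ? y : 0.0f;
}
```

For n=1 this is exact (y=x). For larger n it only resembles x^n close
to x=1 and is zero over most of the range, which may be acceptable for
something like a tight specular highlight and not much else.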

Sincerely,

jw

--
Americans might object: there is no way we would sacrifice our living
standards for the benefit of people in the rest of the world.
Nevertheless, whether we get there willingly or not, we shall soon
have lower consumption rates, because our present rates are
unsustainable.

On Thu, Aug 19, 2010 at 11:17 AM, Fabian Giesen <ry...@gm...> wrote:
> On 19.08.2010 10:57, Robin Green wrote:
>> On Wed, Aug 18, 2010 at 11:35 PM, Fabian Giesen<ry...@gm...>  wrote:
>>>
>>>> I would also love to just see a sample implementation of pow(), log(),
>>>> and exp() somewhere, even that might be helpful.
>>>
>>> glibc math implementations are in sysdeps/ieee754 for generic IEEE-754
>>> compliant platforms, with optimized versions for all relevant
>>> architectures in sysdeps/<arch>. If you really want to know how it's
>>> implemented :)
>>
>>
>> What he said.
>>
>> Also, take a look at the CEPHES library for platform-agnostic
>> reference implementations of the C math functions and some extras like
>> cotangent, cuberoot and integer powers:
>>
>>      http://www.netlib.org/cephes/
>>
>> And here's an x86-specific implementation of powf() that claims to be
>> faster (than what, it doesn't say):
>>
>>     http://www.xyzw.de/c190.html
>
> Now that's interesting :). I wrote most of that header file, around 2000
> or so. It's faster than what used to be the standard pow()
> implementation on x86 (as in the VC++ 6.0 runtime library), using fscale
> (that method is still used for sFExp below). This is all code for 64k
> intros so it was optimized for size originally, but pow was a bottleneck
> during texture generation, and Agner Fog's version was 20-30% faster if I
> recall correctly. (This was back when P3s were the norm though, no idea
> how it looks now). The main change is to replace the fscale (which used
> to be very slow on some processors) with a longer code sequence that's
> faster.
>
> The original code sequence used to be commented out before the "//
> faster pow" comment, but I guess that got removed at some point :).
>
> Since VS2002 or 2003, the C library contains a much better pow()
> implementation (using SSE on processors that support it) that should be
> faster than this code. It's also a lot bigger though.
>
> -Fabian