Re: [Algorithms] fast pow() for limited inputs
From: Fabian G. <ry...@gm...> - 2010-08-19 18:17:42
On 19.08.2010 10:57, Robin Green wrote:
> On Wed, Aug 18, 2010 at 11:35 PM, Fabian Giesen <ry...@gm...> wrote:
>>
>>> I would also love to just see a sample implementation of pow(), log(),
>>> and exp() somewhere, even that might be helpful.
>>
>> glibc math implementations are in sysdeps/ieee754 for generic IEEE-754
>> compliant platforms, with optimized versions for all relevant
>> architectures in sysdeps/<arch>. If you really want to know how it's
>> implemented :)
>
>
> What he said.
>
> Also, take a look at the CEPHES library for platform agnostic
> reference implementations of the C math functions and some extras like
> cotangent, cuberoot and integer powers:
>
> http://www.netlib.org/cephes/
>
> And here's an X86 specific implementation of powf() that claims to be
> faster (than what, it doesn't say):
>
> http://www.xyzw.de/c190.html

Now that's interesting :). I wrote most of that header file, around 2000
or so.

It's faster than what used to be the standard pow() implementation on
x86 (as in the VC++ 6.0 runtime library), which used fscale (that method
is still used for sFExp below). This is all code for 64k intros, so it
was optimized for size originally, but pow was a bottleneck during
texture generation, and Agner Fog's version was 20-30% faster if I
recall correctly. (This was back when P3s were the norm though; no idea
how it looks now.)

The main change is to replace the fscale (which used to be very slow on
some processors) with a longer code sequence that's faster. The original
code sequence used to be commented out before the "// faster pow"
comment, but I guess that got removed at some point :).

Since VS2002 or 2003, the C library contains a much better pow()
implementation (using SSE on processors that support it) that should be
faster than this code. It's also a lot bigger though.

-Fabian