From: David K. <de...@ko...> - 2001-03-26 16:03:08
Klaus Niederkrueger wrote:
> Hi,
>
> I have been thinking about making a faster 'log_2' function by programming
> it myself instead of using the one in libc.
> I started by looking at the gl_sqrt function that is being used in the
> Mesa-library in the file 'src/mmath.c'.
>
> I made a small program to test both the speed and the precision and at
> home on a K6-2 with Debian-2.1 I had a rather terrible result: If I did
> not use the compiler-option '-fstrict-aliasing' the libc-sqrt is almost
> double as fast as the 'gl_sqrt', while the precision is o.k. 1% error or
> so.

There are a number of issues here. You can't just play with compiler
optimization flags willy-nilly. Furthermore, I think you need to be a bit
more careful with your numerical analysis (1% error in sqrt is NOT
acceptable). You don't mean "precision", you mean "accuracy", and for those
of us to whom high image quality matters, the accuracy of square root is
very important.

>
> But with '-fstrict-aliasing', which was used in my auto-generated Makefile
> for Mesa, the results for sqrt are for some numbers up to 10 times greater
> than the once generated by gl_sqrt, while the speed for normal sqrt is
> still higher.
>

Did you read the man page for gcc and learn what -fstrict-aliasing does? I
wouldn't use that feature for math-intensive routines.

>
> So I wanted to write you and tell this but then I just tried my program
> here on a PIII whith RH7.0 I got complete different results: gl_sqrt is a
> bit faster than sqrt, and 'strict-aliasing' has no effect on speed or
> results. The error in the results is always up to 20%. If I use
> '-mcpu=pentium-pro -march=pentium-pro' the precision is completely wrong
> (20 or 30 digit number instead of 1.000).
>

RH7.0 has a different version of the compiler, modified by Red Hat. And
there were a lot of changes in the compiler since the version in Debian 2.1.
A LOT of changes.
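On the -fstrict-aliasing question above: the flag lets gcc assume that pointers of different types never refer to the same object. A fast sqrt routine that reads a float's bits through an integer pointer breaks that assumption, which would be one possible explanation for results that change with the flag. Whether gl_sqrt actually does this is an assumption here, not something checked against 'src/mmath.c'. A minimal sketch in C of the well-defined alternative, assuming 32-bit IEEE-754 floats; float_bits and bits_to_float are hypothetical helper names, not Mesa functions:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from Mesa): return the raw bits of a float.
   memcpy is well defined under strict aliasing; a cast such as
   *(uint32_t *)&x is not, so the compiler may reorder or drop the
   accesses around it when -fstrict-aliasing is in effect. */
uint32_t float_bits(float x)
{
    uint32_t bits;
    assert(sizeof bits == sizeof x);  /* assumes a 32-bit float */
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Inverse of float_bits: rebuild a float from its raw bits. */
float bits_to_float(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Any bit manipulation on the exponent or mantissa can then go between the two calls without relying on a particular aliasing setting.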

What do you mean "the precision is completely wrong" when you use
"-mcpu=pentium-pro -march=pentium-pro"? So it's a 20 or 30 digit number ...
what's the error? Binary representation of floating point is a very, very
ugly numerical situation, so it's not surprising to me that you get some
garbage off to the right. What really matters is the raw error between the
numerically correct answer and your approximate solution.

You should also remove -ffast-math from your compile line. It's a dubious
option, and I'm sure there is at least one place where Mesa violates the
rules that -ffast-math assumes you're following.
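
One way to get at the raw error described above is to sample the input range and record the worst relative difference between the reference and the approximation. The following is a sketch only: max_rel_error, heron_sqrt, ref_sqrt and crude_sqrt are all hypothetical names invented for illustration. ref_sqrt stands in for libc's sqrt so that the sketch has no link dependency, and crude_sqrt stands in for whatever routine is under test.

```c
#include <assert.h>

typedef double (*unary_fn)(double);

/* Largest relative error |approx(x) - ref(x)| / |ref(x)| over n evenly
   spaced samples in [lo, hi]. Samples where ref(x) == 0 are skipped. */
double max_rel_error(unary_fn ref, unary_fn approx,
                     double lo, double hi, int n)
{
    double worst = 0.0;
    int i;

    assert(n >= 2 && hi > lo);
    for (i = 0; i < n; i++) {
        double x = lo + (hi - lo) * i / (n - 1);
        double r = ref(x);
        double err;

        if (r == 0.0)
            continue;
        err = (approx(x) - r) / r;
        if (err < 0.0)
            err = -err;
        if (err > worst)
            worst = err;
    }
    return worst;
}

/* Heron's (Newton's) iteration for sqrt, starting from the guess y = x. */
double heron_sqrt(double x, int iters)
{
    double y = x;
    int i;

    if (x <= 0.0)
        return 0.0;
    for (i = 0; i < iters; i++)
        y = 0.5 * (y + x / y);
    return y;
}

/* Stand-in reference: enough iterations to converge for moderate x. */
double ref_sqrt(double x)   { return heron_sqrt(x, 30); }

/* Stand-in for the routine under test: deliberately stops early. */
double crude_sqrt(double x) { return heron_sqrt(x, 2); }
```

Calling max_rel_error(ref_sqrt, crude_sqrt, 1.0, 4.0, 301) reports the worst case over [1, 4]. In a real test you would pass sqrt from <math.h> as the reference (linking with -lm) and a double-valued wrapper around the routine being measured as the approximation.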