Re: [atlas-devel] -mfpmath=387 (snrm2 miscompiled on x86_64)
From: Alexey T. <at...@al...> - 2006-12-20 17:05:54
On Wed, Dec 13, 2006 at 08:16:45AM -0600, Clint Whaley wrote:
> >So... If you don't want to break LAPACK test suite, always use
> >-mfpmath=387.
>
> Let me ask if maybe you used the arch defaults, but changed the compiler
> flags? If so, you can get this problem. The arch def use the x87 for

Yes, I changed the default and removed -mfpmath=387. Actually I've got a
generic Make.inc that takes a few arguments from RPM (like
TOPdir = $(RPM_BUILD_DIR)/ATLAS ). So this was my own fault... But I
managed to resolve that myself. So... this is not a support request, I'm
just trying various things...

> AMDs. Not only is it as fast as scalar SSE (faster in practice), but it
> also gets 80 instead of 64-bit accuracy. Now, ATLAS has some kernels that
> do NRM2 w/o special overflow detections, and these only work when implemented
> in higher precision (as in 80-bit x87). They are **much** faster than the
> versions implemented in normal precision (which must use computationally
> expensive tricks such as sum of squares), so the ATLAS NRM2 tuner tries
> them, and only uses them when no overflow occurs. However, if you take the
> arch defaults, this tuning does not occur, and the x87-tuned kernels are
> used. If you then switch the compiler flags, of course the library will
> now fail.

Thanks for the explanation. I think I've seen that long thread about the
i387 performance drop on bugs.gcc.org or somewhere. I just did not realise
that it could actually break something on x86_64.