Coming soon: Gerard's ~30% speed optimization of the SSE code for newer CPUs

Gerard has optimized the SSE calculation routine using AMD's Pipeline analyzer.
The improvements make FFFF ~30% faster on P4s and newer Athlons.
My tests on a P4 @ 2.4 GHz give 375 MegaIters/sec (about 6.4 clock cycles per iteration). On my P3, the speed improvement is ~5%.
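
For anyone who wants to check the cycles-per-iteration figure, here is a minimal sketch of the arithmetic. The function and variable names are made up for illustration and are not part of FFFF; the only inputs are the clock speed and throughput quoted above. Dividing one by the other comes out at roughly 6.4 cycles per iteration, which agrees with the rough figure in the post.

```python
def cycles_per_iteration(clock_hz: float, iters_per_sec: float) -> float:
    """Average number of clock cycles spent on one iteration."""
    return clock_hz / iters_per_sec


# Figures taken from the post; the names are hypothetical.
p4_clock_hz = 2.4e9        # P4 @ 2.4 GHz
p4_iters_per_sec = 375e6   # 375 MegaIters/sec

cycles = cycles_per_iteration(p4_clock_hz, p4_iters_per_sec)
print(f"{cycles:.1f} clock cycles per iteration")
```

The same function can be used to compare other CPUs by plugging in their clock speed and measured MegaIters/sec.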

Posted by Daniele Paccaloni 2002-10-07
