From: Development l. f. F. <fas...@li...> - 2007-04-17 17:31:48
I've now also run the profiles on my old G3 laptop. It really screams:
typically 30-40% faster than the old code, and there are a number of
regions where it's consistently 60-100% faster (i.e. 1.6-2.0x faster).
There don't appear to be any regions at all where it's consistently
slower.

david

On Apr 16, 2007, at 11:41 PM, David Harvey wrote:

> I've been running some profiles of some of the new development ssfft
> code (in the ssfft3 branch) against the trunk version.
>
> It's generally looking pretty good. I've done profiles on sage
> (= sage.math), martinj (= Jason Martin's machine), bsd (= William
> Stein's Xeon), and my G5 (haven't got results for that yet). The
> target function is ssfft_fft_iterative(), which is designed to handle
> L1-sized transforms using a plain iterative FFT. In particular, it's
> intended for use with short coefficients, where bitshift factoring is
> inappropriate, and with transforms small enough that FFT factoring is
> not necessary. I've been running it for transform lengths M = 16, 32,
> 64, 128, 256, 512, 1024, with a range of truncation parameters, and
> coefficient lengths n = 1, 2, 3, 4, 6, 8 limbs.
>
> For lengths >= 256, it is unconditionally faster than the old code on
> all platforms (modulo a few random data points). The speedups are
> typically:
>
> sage: 5-15%
> bsd: 15-25%
> martinj: 15-25%
>
> Some combinations get speedups of up to 40%.
>
> For lengths 64 and 128, there are a few problem areas, particularly
> for n = 3 on all platforms, although it's still mostly ahead of the
> old code.
>
> Length 32 and below is really a mixed bag. I find this surprising,
> since this code should work particularly well on small problems.
>
> On all platforms, apart from a few outliers, the new code was never
> worse than 10% slower than the old code.
>
> I still have some investigation to do to figure out what's going on
> in the slower regions. But generally I'm pretty happy so far.
>
> david
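For context on the "plain iterative FFT" mentioned above: for small,
power-of-two transform lengths the usual structure is a bit-reversal
permutation followed by log2(M) passes of in-order butterflies. The
sketch below shows that general structure over complex doubles purely
for illustration; it is not the ssfft code, which operates on n-limb
coefficients modulo 2^N + 1 (where multiplications by roots of unity
reduce to bit shifts), and whose actual interface is not reproduced
here.

/*
 * Minimal sketch of a plain in-place iterative radix-2 FFT, for
 * illustration only.  The real ssfft_fft_iterative() works on
 * multi-limb coefficients modulo 2^N + 1; complex doubles are used
 * here purely to keep the example self-contained.
 */
#include <complex.h>
#include <math.h>
#include <stdio.h>

/* In-place forward FFT of x[0..m-1], where m is a power of two. */
static void fft_iterative(double complex *x, unsigned long m)
{
    /* Bit-reversal permutation so the butterflies can run in order. */
    for (unsigned long i = 1, j = 0; i < m; i++)
    {
        unsigned long bit = m >> 1;
        for (; j & bit; bit >>= 1)
            j ^= bit;
        j ^= bit;
        if (i < j)
        {
            double complex t = x[i];
            x[i] = x[j];
            x[j] = t;
        }
    }

    /* Butterfly passes over blocks of length 2, 4, ..., m. */
    const double pi = acos(-1.0);
    for (unsigned long len = 2; len <= m; len <<= 1)
    {
        double complex wlen = cexp(-2.0 * pi * I / (double) len);
        for (unsigned long i = 0; i < m; i += len)
        {
            double complex w = 1.0;
            for (unsigned long k = 0; k < len / 2; k++)
            {
                double complex u = x[i + k];
                double complex v = w * x[i + k + len / 2];
                x[i + k] = u + v;
                x[i + k + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
}

int main(void)
{
    /* Transform a length-8 impulse; every output should equal 1. */
    double complex x[8] = { 1.0 };
    fft_iterative(x, 8);
    for (int i = 0; i < 8; i++)
        printf("%.2f%+.2fi\n", creal(x[i]), cimag(x[i]));
    return 0;
}

Compile with e.g. "gcc -std=c99 -O2 fft_sketch.c -lm"; the impulse test
in main() should print eight entries equal to 1. In the actual
Schoenhage-Strassen setting the twiddle multiplications in the inner
loop become cheap bit shifts, which is why this iterative form is
attractive for the short-coefficient, L1-sized transforms described in
the message above.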