|
From: Bill H. <goo...@go...> - 2009-03-02 23:26:51
|
David tells me that in fact zn_poly *does* reuse the FFT. I got mixed up there. I tried some larger divisions and more of them to minimise the effect of random generation. But it is basically inconclusive. There isn't a lot of difference in speed between FLINT and zn_poly for this algorithm now that everything else has been made faster. Sometime FLINT appears to be a few seconds faster (in a 100 or 200 second test) and sometimes zn_poly is faster. This is not very scientific of me, and zn_poly probably does beat FLINT for polynomials of certain sizes, so I'll just leave zn_poly in for now for this algorithm. As far as I am concerned, that brings the inclusion of zn_poly into FLINT to a close. It makes a pretty impressive improvement. Here are the best time improvements (some times - not listed - appear to have gone up slightly, but that is surely to do with random seeds - some of the times below are obviously decreased far too much to be to do with randomness): Testing zmod_poly_mul()... Cpu = 2660 ms Wall = 2668 ms ... ok Testing zmod_poly_mul()... Cpu = 2310 ms Wall = 2318 ms ... ok Testing zmod_poly_mul_trunc_left_n()... Cpu = 2600 ms Wall = 2601 ms ... ok Testing zmod_poly_mul_trunc_left_n()... Cpu = 2520 ms Wall = 2511 ms ... ok Testing zmod_poly_scalar_mul()... Cpu = 220 ms Wall = 218 ms ... ok Testing zmod_poly_scalar_mul()... Cpu = 170 ms Wall = 167 ms ... ok Testing zmod_poly_make_monic()... Cpu = 440 ms Wall = 440 ms ... ok Testing zmod_poly_make_monic()... Cpu = 330 ms Wall = 331 ms ... ok Testing zmod_poly_divrem_classical()... Cpu = 1640 ms Wall = 1645 ms ... ok Testing zmod_poly_divrem_classical()... Cpu = 1020 ms Wall = 1028 ms ... ok Testing zmod_poly_div_classical()... Cpu = 1870 ms Wall = 1877 ms ... ok Testing zmod_poly_div_classical()... Cpu = 1180 ms Wall = 1186 ms ... ok Testing zmod_poly_div()... Cpu = 5760 ms Wall = 5759 ms ... ok Testing zmod_poly_div()... Cpu = 5340 ms Wall = 5337 ms ... ok Testing zmod_poly_divrem_divconquer()... Cpu = 1720 ms Wall = 1722 ms ... ok Testing zmod_poly_divrem_divconquer()... Cpu = 1340 ms Wall = 1338 ms ... ok Testing zmod_poly_div_divconquer()... Cpu = 1950 ms Wall = 1949 ms ... ok Testing zmod_poly_div_divconquer()... Cpu = 1460 ms Wall = 1461 ms ... ok Testing zmod_poly_div_divconquer_recursive()... Cpu = 1940 ms Wall = 1935 ms ... ok Testing zmod_poly_div_divconquer_recursive()... Cpu = 1490 ms Wall = 1489 ms ... ok Testing zmod_poly_half_gcd_iter()... Cpu = 2310 ms Wall = 2307 ms ... ok Testing zmod_poly_half_gcd_iter()... Cpu = 1910 ms Wall = 1913 ms ... ok Testing zmod_poly_gcd_hgcd()... Cpu = 6670 ms Wall = 6667 ms ... ok Testing zmod_poly_gcd_hgcd()... Cpu = 5390 ms Wall = 5390 ms ... ok Testing zmod_poly_half_gcd()... Cpu = 1970 ms Wall = 1974 ms ... ok Testing zmod_poly_half_gcd()... Cpu = 1650 ms Wall = 1645 ms ... ok Testing zmod_poly_gcd()... Cpu = 3830 ms Wall = 3835 ms ... ok Testing zmod_poly_gcd()... Cpu = 3130 ms Wall = 3137 ms ... ok Testing zmod_poly_gcd_invert()... Cpu = 5190 ms Wall = 5186 ms ... ok Testing zmod_poly_gcd_invert()... Cpu = 4590 ms Wall = 4585 ms ... ok Testing zmod_poly_xgcd()... Cpu = 4410 ms Wall = 4413 ms ... ok Testing zmod_poly_xgcd()... Cpu = 3800 ms Wall = 3797 ms ... ok Testing zmod_poly_powmod()... Cpu = 4470 ms Wall = 4470 ms ... ok Testing zmod_poly_powmod()... Cpu = 4030 ms Wall = 4032 ms ... ok Testing zmod_poly_isirreducible()... Cpu = 4700 ms Wall = 4704 ms ... ok Testing zmod_poly_isirreducible()... Cpu = 3790 ms Wall = 3791 ms ... ok Testing zmod_poly_factor_berlekamp()... Cpu = 1720 ms Wall = 1721 ms ... ok Testing zmod_poly_factor_berlekamp()... Cpu = 1420 ms Wall = 1412 ms ... ok Testing zmod_poly_factor()... Cpu = 2910 ms Wall = 2925 ms ... ok Testing zmod_poly_factor()... Cpu = 2680 ms Wall = 2681 ms ... ok Testing zmod_poly_2x2_mat_mul_classical_strassen()... Cpu = 790 ms Wall = 783 ms ... ok Testing zmod_poly_2x2_mat_mul_classical_strassen()... Cpu = 450 ms Wall = 453 ms ... ok Testing zmod_poly_2x2_mat_mul()... Cpu = 2280 ms Wall = 2285 ms ... ok Testing zmod_poly_2x2_mat_mul()... Cpu = 1540 ms Wall = 1535 ms ... ok Bill. 2009/3/2 Bill Hart <goo...@go...>: > I've tried some different cutoffs and the cutoff used seems OK. > > The test timings are somewhat dependent on the random seed, which > changes when the code is recompiled. It might be that. > > Or it might be that FLINT uses a better algorithm for the inversion. > It can use the same FFT for truncated multiply and middle product, > which zn_poly can't. That may account for it. > > I'll experiment with some longer runs later tonight. Gotta go out now. > > Bill. > > 2009/3/2 Bill Hart <goo...@go...>: >> I've now switched to zn_poly's newton inversion for polynomials of >> length > 2000. >> >> Times: >> >>> BEFORE: >> >>> Testing zmod_poly_divrem_newton()... Cpu = 2480 ms Wall = 2487 ms ... ok >>> Testing zmod_poly_divrem()... Cpu = 8510 ms Wall = 8506 ms ... ok >>> Testing zmod_poly_div_classical()... Cpu = 1160 ms Wall = 1166 ms ... ok >>> Testing zmod_poly_div()... Cpu = 4560 ms Wall = 4561 ms ... ok >>> Testing zmod_poly_divrem_divconquer()... Cpu = 1300 ms Wall = 1307 ms ... ok >>> Testing zmod_poly_div_divconquer()... Cpu = 1430 ms Wall = 1431 ms ... ok >>> Testing zmod_poly_div_divconquer_recursive()... Cpu = 1420 ms Wall = >>> 1421 ms ... ok >>> Testing zmod_poly_newton_invert_basecase()... Cpu = 470 ms Wall = 468 ms ... ok >>> Testing zmod_poly_newton_invert()... Cpu = 9700 ms Wall = 9699 ms ... ok >>> Testing zmod_poly_div_series()... Cpu = 1200 ms Wall = 1202 ms ... ok >>> Testing zmod_poly_div_newton()... Cpu = 1090 ms Wall = 1088 ms ... ok >>> Testing zmod_poly_gcd_euclidean()... Cpu = 6300 ms Wall = 6300 ms ... ok >>> Testing zmod_poly_half_gcd_iter()... Cpu = 1870 ms Wall = 1868 ms ... ok >>> Testing zmod_poly_gcd_hgcd()... Cpu = 5240 ms Wall = 5243 ms ... ok >>> Testing zmod_poly_half_gcd()... Cpu = 1630 ms Wall = 1630 ms ... ok >>> Testing zmod_poly_gcd()... Cpu = 3070 ms Wall = 3068 ms ... ok >>> Testing zmod_poly_gcd_invert()... Cpu = 4410 ms Wall = 4410 ms ... ok >>> Testing zmod_poly_xgcd()... Cpu = 3700 ms Wall = 3704 ms ... ok >>> Testing zmod_poly_resultant_euclidean()... Cpu = 840 ms Wall = 839 ms ... ok >>> Testing zmod_poly_mulmod()... Cpu = 2340 ms Wall = 2339 ms ... ok >>> Testing zmod_poly_powmod()... Cpu = 3630 ms Wall = 3629 ms ... ok >>> Testing zmod_poly_isirreducible()... Cpu = 3730 ms Wall = 3731 ms ... ok >>> Testing zmod_poly_factor_berlekamp()... Cpu = 1390 ms Wall = 1390 ms ... ok >>> Testing zmod_poly_factor_square_free()... Cpu = 2740 ms Wall = 2734 ms ... ok >>> Testing zmod_poly_factor()... Cpu = 2360 ms Wall = 2359 ms ... ok >>> >> >> AFTER: >> >> Testing zmod_poly_divrem_newton()... Cpu = 2970 ms Wall = 2970 ms ... ok >> Testing zmod_poly_divrem()... Cpu = 9650 ms Wall = 9655 ms ... ok >> Testing zmod_poly_div_classical()... Cpu = 1170 ms Wall = 1164 ms ... ok >> Testing zmod_poly_div()... Cpu = 5240 ms Wall = 5242 ms ... ok >> Testing zmod_poly_divrem_divconquer()... Cpu = 1310 ms Wall = 1316 ms ... ok >> Testing zmod_poly_div_divconquer()... Cpu = 1440 ms Wall = 1438 ms ... ok >> Testing zmod_poly_div_divconquer_recursive()... Cpu = 1430 ms Wall = >> 1430 ms ... ok >> Testing zmod_poly_newton_invert_basecase()... Cpu = 470 ms Wall = 469 ms ... ok >> Testing zmod_poly_newton_invert()... Cpu = 8470 ms Wall = 8466 ms ... ok >> Testing zmod_poly_div_series()... Cpu = 1430 ms Wall = 1430 ms ... ok >> Testing zmod_poly_div_newton()... Cpu = 1400 ms Wall = 1404 ms ... ok >> Testing zmod_poly_gcd_euclidean()... Cpu = 6300 ms Wall = 6297 ms ... ok >> Testing zmod_poly_half_gcd_iter()... Cpu = 1860 ms Wall = 1862 ms ... ok >> Testing zmod_poly_gcd_hgcd()... Cpu = 5270 ms Wall = 5272 ms ... ok >> Testing zmod_poly_half_gcd()... Cpu = 1620 ms Wall = 1621 ms ... ok >> Testing zmod_poly_gcd()... Cpu = 3090 ms Wall = 3090 ms ... ok >> Testing zmod_poly_gcd_invert()... Cpu = 4520 ms Wall = 4520 ms ... ok >> Testing zmod_poly_xgcd()... Cpu = 3700 ms Wall = 3693 ms ... ok >> Testing zmod_poly_resultant_euclidean()... Cpu = 830 ms Wall = 837 ms ... ok >> Testing zmod_poly_mulmod()... Cpu = 2920 ms Wall = 2915 ms ... ok >> Testing zmod_poly_powmod()... Cpu = 3960 ms Wall = 3966 ms ... ok >> Testing zmod_poly_isirreducible()... Cpu = 3760 ms Wall = 3754 ms ... ok >> Testing zmod_poly_factor_berlekamp()... Cpu = 1400 ms Wall = 1398 ms ... ok >> Testing zmod_poly_factor_square_free()... Cpu = 3090 ms Wall = 3091 ms ... ok >> Testing zmod_poly_factor()... Cpu = 2640 ms Wall = 2637 ms ... ok >> >> I guess the FLINT crossover is wrong for zn_poly. The newton_invert >> time went down, but the time for everything else that relies on it >> went up. I'll have to have a fiddle this evening. >> >> Bill. >> > |