|
From: William H. <ha...@ya...> - 2008-11-01 23:28:13
|
Half GCD ======== The half-gcd algorithm has been implemented for Z/pZ[x] in zmod_poly. I did some timings for large polynomials and it seems to keep up with Magma. I also modified the modular gcd algorithm in fmpz_poly to use the new half-gcd. But this didn't speed things up much. I discovered this was due to not using a modular division algorithm. I also coded that up. Here is the current graph comparing GCD in Z[x] with Magma (ignore the title, it's wrong): http://sage.math.washington.edu/home/wbhart/flint-trunk/graphing/gcd6.png I've linked it against the new eMPIRe package to obtain this graph. I believe the red region on the left (where Magma wins) is probably due to their implementation of the heuristic GCD algorithm. I do not know the cause of the red region on the right of the graph (where the polynomials are very long). There are still some inefficiencies in the code (for example I do one scalar multiplication in the code which is unnecessary). However it may just be due to Magma packing their polynomials with more than one coefficient per limb to improve cache performance. I will need to take the graph out much further to see if things continue to get worse or whether there is just some constant factor difference as things get larger. F_mpz_poly ========== I have realise that I have implemented F_mpz_poly (and F_mpz_mat) incorrectly. Currently each polynomial is implemented as an array of limbs. Each limb either represents a signed integer of 62 bits of less, or it represents an index into an array of mpz_t's if the coefficient is larger. The problem is that I implemented it so that there is a different array for every polynomial. But this is silly as it means that to multiply two coefficients together for example, the interface has to be: F_mpz_poly_coeff_mul(poly_result, coeff_result, poly1, coeff1, poly2, coeff2) It's even worse for matrices: F_mpz_mat_entry_mul(mat_res, row_res, col_res, mat1, row1, col1, mat2, row2, col2) Instead I should abstract out an F_mpz module, where a coefficient is just a limb. If the second most significant bit is set it represents not a signed integer of 62 bits, but an index into a single flint wide array of fmpz_t's. Then the above functions become the single: F_mpz_mul(res, F_mpz_1, F_mpz_2) I still marvel at the fact that I didn't think of doing this from the start. And what is worse, this will be the 8th time I have tried to implement a FLINT large integer format!! Bill. |