From: Bill H. <goo...@go...> - 2008-11-07 03:08:08
|
I implemented the heuristic gcd for polynomials whose bitsize is up to 63 bits. The result looks like this: http://sage.math.washington.edu/home/wbhart/flint-trunk/graphing/gcd9.png Interestingly, although it makes the bottom of the graph nice and blue, the big red patch on the left is still there. I think perhaps this is just a highly efficient Euclidean algorithm, or perhaps a Lehmer GCD algorithm. The red on the right is not due to FLINT; it is due to a lack of integer half-gcd in GMP. We're still working on fixing that. Bill. |
|
From: Bill H. <goo...@go...> - 2008-11-06 13:23:32
|
I plotted a new graph of FLINT fmpz_poly_gcd vs Magma. I'm using an older version of Magma, 2-13.5, and for lengths 2-16 Magma goes into an infinite loop for polynomials over a certain bit size. Hence there is a cutout in the graph to avoid this problem area. http://sage.math.washington.edu/home/wbhart/flint-trunk/graphing/gcd8.png

The point of this graph is to see what happens a bit further out. On the right hand side, one can clearly see that Magma is not asymptotically faster, but simply a constant factor faster for small bit sizes. This is purely a caching effect to do with the way FLINT and Magma store polynomials. In fmpz_poly, FLINT currently stores polynomials using a minimum of 2 limbs per coefficient (one sign/size limb and one data limb); Magma uses half a limb. In zmod_poly, FLINT currently uses a minimum of 1 limb per coefficient and does not pack multiple coefficients into a limb as Magma does. It is probably the second issue that is really costing us here, as about 4.5 of 5.5 seconds for a very large poly gcd is taken up by the hgcd algorithm in Z/pZ[x]. So we need to pack some limbs.

I'm now going to concentrate on the big red region to the left. I think that is due to Magma's implementation of the heuristic gcd. If practical I will implement this algorithm.

The red at the top of the first column is due to a bug in eMPIRe which prevents us from taking advantage of the new integer half gcd code written by Niels Moller. Jason Martin, myself and Brian Gladman are working on fixing that. This is not a FLINT issue.

Bill. |
|
From: Bill H. <goo...@go...> - 2008-11-02 17:52:49
|
Then again, I don't particularly understand why the red stripes appear in the middle of the graph without the extra scalar multiply. Bill. |
|
From: Bill H. <goo...@go...> - 2008-11-02 17:51:22
|
I removed the additional scalar multiplication in the fmpz_poly_gcd algorithm. But it actually worked out to be slower: http://sage.math.washington.edu/home/wbhart/flint-trunk/graphing/gcd7.png than with the multiply: http://sage.math.washington.edu/home/wbhart/flint-trunk/graphing/gcd6.png

The reason is almost certainly that the additional scalar multiply avoids having two copies of the polynomial in memory at once (the scalar division which the multiply is undoing is done in place). Thus it looks like things are very cache sensitive for long polynomials.

This probably explains the extra red at the right side of the graph. So Magma's packed format is making a difference for them.

The new F_mpz_poly format does not have this drawback.

Bill. |
|
From: William H. <ha...@ya...> - 2008-11-01 23:28:13
|
Half GCD
========

The half-gcd algorithm has been implemented for Z/pZ[x] in zmod_poly. I did some timings for large polynomials and it seems to keep up with Magma.

I also modified the modular gcd algorithm in fmpz_poly to use the new half-gcd. But this didn't speed things up much. I discovered this was due to not using a modular division algorithm. I also coded that up.

Here is the current graph comparing GCD in Z[x] with Magma (ignore the title, it's wrong): http://sage.math.washington.edu/home/wbhart/flint-trunk/graphing/gcd6.png

I've linked it against the new eMPIRe package to obtain this graph.

I believe the red region on the left (where Magma wins) is probably due to their implementation of the heuristic GCD algorithm.

I do not know the cause of the red region on the right of the graph (where the polynomials are very long). There are still some inefficiencies in the code (for example I do one scalar multiplication in the code which is unnecessary). However it may just be due to Magma packing their polynomials with more than one coefficient per limb to improve cache performance.

I will need to take the graph out much further to see if things continue to get worse or whether there is just some constant factor difference as things get larger.

F_mpz_poly
==========

I have realised that I implemented F_mpz_poly (and F_mpz_mat) incorrectly. Currently each polynomial is implemented as an array of limbs. Each limb either represents a signed integer of 62 bits or less, or it represents an index into an array of mpz_t's if the coefficient is larger.

The problem is that I implemented it so that there is a different array for every polynomial. But this is silly, as it means that to multiply two coefficients together, for example, the interface has to be:

F_mpz_poly_coeff_mul(poly_result, coeff_result, poly1, coeff1, poly2, coeff2)

It's even worse for matrices:

F_mpz_mat_entry_mul(mat_res, row_res, col_res, mat1, row1, col1, mat2, row2, col2)

Instead I should abstract out an F_mpz module, where a coefficient is just a limb. If the second most significant bit is set it represents not a signed integer of 62 bits, but an index into a single FLINT-wide array of fmpz_t's. Then the above functions become the single:

F_mpz_mul(res, F_mpz_1, F_mpz_2)

I still marvel at the fact that I didn't think of doing this from the start. And what is worse, this will be the 8th time I have tried to implement a FLINT large integer format!!

Bill. |
|
From: Bill H. <goo...@go...> - 2008-10-30 15:55:56
|
I switched on the new half-gcd code (it's in zmod_poly, for polys in Z/pZ[x]) and connected it up to the gcd code in fmpz_poly (for polys in Z[x]). It didn't appear to make much difference, so I did some timings and found that the division code was too slow (it needs to do trial division to check whether the purported gcd actually divides both the original polynomials). I timed Magma and realised that its division code is actually too slow to be using that, so they must be doing something special (as usual), i.e. the trial division is taking many times longer than the GCD itself.

I realised that a multimodular fmpz_poly_divides function would be much faster (there is no coefficient explosion with the multimodular approach). Essentially it works as follows. I want to know if g(x) divides f(x), where f and g are in Z[x]:

0) If f is zero I return true and the quotient 0.

1) If g is zero I raise an exception.

2) If g has trailing zero coefficients, I check that f also has at least that many trailing zero coefficients. If not I return false.

3) I deal with the special case where the only nonzero coefficient of g is the leading coefficient, by calling another division algorithm (multimodular division cannot handle this case as the remainder is always zero in Z/pZ[x]).

4) I then compute the quotient q of the leading coefficients of f and g. If there is a nonzero remainder, I return false.

5) I find a prime p which does not divide either of the leading coefficients of f and g, of either 53 or 62 bits (30 bits on a 32 bit machine) depending on the sizes of f and g, reduce f and g mod p and do the division in Z/pZ[x]. If there is a nonzero remainder in Z/pZ[x], I return false.

6) If the number of bits of the maximum coefficient of g plus the number of bits of the prime is more than the number of bits of f, we may be able to lift the quotient in Z/pZ[x] to a quotient in Z[x]. I check whether we are done by multiplying the purported quotient by g and seeing if I get f. If so, I return true and the quotient.

7) If we are not done yet, I find another prime p and do the same thing, except that now I do Chinese remainder recombination to build up the quotient Q in Z[x] that we are after.

8) Once the Chinese remaindering stabilises, I check whether the purported quotient is really the quotient by multiplying out.

The only thing I may not be doing correctly is that I currently do not stop once one reaches the Mignotte bound. I am presuming that one will always eventually find a prime p for which the quotient in Z/pZ[x] will have a nonzero remainder if g doesn't divide f. This is certainly not the case for the polynomials I deal with in step 3, but perhaps there are other examples where this can happen. I have not thought of a complete proof yet.

Anyhow, replacing the trial division in the gcd algorithm with this new multimodular fmpz_poly_divides_modular() algorithm results in the following comparison with Magma. The blue patches are where FLINT wins. The bottom axis gives log_2(length) and the vertical axis gives log_2(bits). I take three polynomials f, g and h in Z[x] with the given number of bits per coefficient and with the given length, and compute gcd(f*g, f*h) (so that the GCD is generically f). http://sage.math.washington.edu/home/wbhart/flint-trunk/graphing/gcd.png

Some tuning and tightening up of the code may eliminate some of the remaining red and strengthen some of the blue. I also have not implemented Lehmer's gcd for the base case, which should speed everything up fairly dramatically by a constant factor. I also need the heuristic gcd algorithm to get rid of one of the big patches of red on the left. It is not clear whether Magma's packed polynomial representation may also be helping when the coefficients are less than 16 bits in f, g and h. Bill. |
|
From: Bill H. <goo...@go...> - 2008-10-28 20:51:40
|
I think I figured out why my gcd code for Z[x] is still slow compared with Magma, even with half-gcd implemented for Z/pZ[x]. I timed the gcd in Z/pZ[x] on its own and it is now faster than Magma. So the problem is with the modular gcd algorithm in Z[x].

I did some profiling and it spends most of its time checking by division that the gcd actually divides both the input polynomials. I'm pretty sure the reason this is so slow is that Magma uses a modular algorithm for this division. It must compute the quotient modulo a certain number of primes, until it stabilises. Then it multiplies the quotient by the divisor to see if it equals the dividend. I guess if it is not equal then this doesn't prove anything; in that situation one might be able to prove that it doesn't divide some other way. However the generic case will of course be that it does divide, since almost all the time the gcd algorithm gets things right when the Chinese remaindering stabilises.

I'll have to code up this modular division in the next few days. That should dramatically speed things up. Bill. |
|
From: Bill H. <goo...@go...> - 2008-10-27 13:33:03
|
Hi Burcin and Andy,

Welcome to the list. We'll take flint development discussions on list again. There are also numerous other people signed up who are probably wondering what is happening with FLINT recently. For their benefit I'll give a summary here.

* FLINT 1.1 should be released within a few weeks. The list of new features is rather immense, and I'll post it here and to the Sage list when the release is announced. But in short, we will have polynomial factoring over Z/pZ (due to Richard Howell-Peak), polynomial composition and evaluation over Z, derivative and half GCD (asymptotically fast GCD).

* The current status is that only half GCD remains to be implemented. This is what I am currently working on. I have begun the implementation by coding up 2x2 polynomial matrix multiplication, including Strassen, which is faster for matrices whose entries are polynomials with more than 20 terms. After implementing half GCD, a handful of test functions need to be written, and general cleaning up, build testing, valgrinding etc. needs to be done before the release. Half gcd is not just one algorithm however. There are numerous half gcd functions to code up, and the same tricks can be used for computing the resultant. I hope to have a basic half gcd algorithm working by the end of today.

* Andy and I are currently working on implementing his factoring algorithm in Z[x]. To do this we need LLL, p-adic arithmetic and Zassenhaus' factoring algorithm.

* The status of this is that I have completed a basic matrix module (F_mpz_mat) and rewritten the "fast" implementation of LLL in Damien Stehle's fpLLL to use F_mpz_mat. For small integers this is faster than using mpz_t's. It passes the "test" code, but a handful of functions still don't have test functions.

* Peter Shrimpton and I are currently "working" on writing code to test vast numbers of integers below 10^17 for primality using various primality tests. This code will run on a cluster for many months, looking for a counterexample to a certain conjecture (which is worth a modest sum of money) and searching for a faster unconditional primality test for integers below the limit (about 10^17).

* I have been "working" on a new polynomial library F_mpz_poly (which will eventually be renamed fmpz_poly and replace the existing module of that name - the interface will be kept as close to the same as possible). This uses a different integer format than fmpz_poly. Essentially, coefficients of 62 bits or below are stored in a single machine word. Larger coefficients are stored as mpz_t's. This works well and is actually faster than fmpz_poly for polynomials of more than about length 12. This work is about 1/3 of the way through (including writing of test code) and will form the basis of FLINT 2.0. I'm still unhappy that there isn't something between fmpz_poly and F_mpz_poly, but I've thought about it for months now and there just isn't a more efficient way to do it that I can think of. Perhaps I will end up leaving both fmpz_poly and F_mpz_poly in FLINT. I'm not sure yet.

Bill. |
|
From: William H. <ha...@ya...> - 2008-08-25 15:50:32
|
Can you take a look at the current code and test code, because when I use anything other than a = 1 and b = 1, it doesn't declare all primes to be pseudoprime. I did try to fix some minor issues with the original code, but maybe I screwed something up.

As for your question: if I want to do a/2 mod p, then if a is even I just do a/2; if a is odd I just do (a+p)/2. No inverse mod p is necessary. Therefore all I need to do is:

1) let beg = beginning mod p
2) let s = p - beg (let s = 0 if beg = 0)
3) compute s/2 mod p.

Bill. |
|
From: Peter S. <ps...@gm...> - 2008-08-25 12:18:48
|
Hello. Sorry for the late response.

The first version should work for everything except for a=1,b=1 and a=-1,b=1, although I am not sure how 'good' it is for large values of a and b (i.e. how many composites get through). I noticed that the "bug" I thought was in the code was actually me misreading the $620 problem. There appear to be two different versions:

1) a composite congruent to 3 or 7 mod 10 which passes both a 2-Fermat pseudoprime test and a Lucas test with parameters a=1,b=-1

2) any composite which passes a 2-strong pseudoprime test and the second version of the Lucas test.

I think the first is significantly quicker, especially when a and b are fixed, because we can avoid doing the gcd and issquare tests.

Also I did some more tests on the Montgomery powering that I wrote, with mixed results. I did random 2-Fermat pseudoprime tests on numbers of various lengths and found that it was quicker, but only when the numbers reached 60 bits in length. We are only dealing with numbers between 53 and 57 bits.

I am currently working on optimizing the sieve so that it only looks at odd numbers but still factorizes the even numbers. I need to calculate the smallest j such that: beginning + 2*j = 0 mod p. I have tried using z_invert and modular arithmetic but it seems to be slower than a simple while (beginning + 2*j mod p != 0) j++; and there is no guarantee that they will give the smallest j which is required. Is there a quicker way of doing it?

Peter |
|
From: Bill H. <goo...@go...> - 2008-08-23 13:32:53
|
Hi Peter,

I merged your Lucas pseudoprime tests and wrote test code. However I don't know what values of a and b are allowed in the first version of the test.

I removed some minor bugs, and it also seems that P was being set to 1 in the second test and therefore not being used, so I removed it.

The second version works beautifully!!

Bill.

2008/7/25 Peter Shrimpton <ps...@gm...>:
> Hello
>
> Attached is the code that I have written to be attached to long_extras along
> with the test code. Everything is passing the tests and I am confident that
> everything is working. I am fairly certain that some things can be optimized
> more which I will look into. Also could I ask for gcc to be upgraded on the
> server so that I can begin programming in OpenMP.
>
> Peter
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> flint-devel mailing list
> fli...@li...
> https://lists.sourceforge.net/lists/listinfo/flint-devel
> |
|
From: Bill H. <goo...@go...> - 2008-08-20 17:15:19
|
Sorry, I mean the numbers can exceed the size of a double, not an unsigned long.
Bill.
2008/8/20 William Hart <ha...@ya...>:
> Hi Peter,
>
> This is very interesting. There is a fairly dramatic
> jump at 1122004669633. This is the point where 5
> pseudoprime tests are used instead of 4, and not 10^16
> where Miller Rabin is used.
>
> The reason for the jump is that SPRP64 has to be used
> instead of SPRP since the numbers can exceed the size
> of an unsigned long.
>
> The Lucas test seems to be very good, and given that a
> counterexample to the conjecture that it plus a 2-SPRP
> test is a primality test is worth money, we may as
> well use it in FLINT for our probably prime testing up
> to 2^64.
>
> We'll use Pocklington-Lehmer for z_isprime after
> 10^16, I think. I'll be very interested to see what
> the time comparison is like for that, not that it
> matters, because we currently don't have any test in
> FLINT which is guaranteed to separate composites from
> primes after 10^16.
>
> Bill.
>
> --- Peter Shrimpton <ps...@gm...> wrote:
>
>> Hello
>>
>> Attached are spreadsheets (one excel the other
>> openoffice) showing the
>> relative speeds of z_isprime and a Baillie-PSW test.
>> In contradiction to my
>> earlier conversation with Bill (I was getting the
>> colours confused) the
>> Baillie-PSW test does perform a lot better than
>> z_isprime as soon as
>> z_isprime has to do 5 pseudoprime tests. However
>> whilst testing this I found
>> that there is a bug in my lucas test code which
>> makes it return a handful
>> of composites as prime. It only seems to affect
>> about 10 or so numbers which
>> are all 'quite small' (about 8 digits long).
>>
>> Also I have managed to get Montgomery powering
>> working for unsigned longs
>> (see attached). I realised that as you only need the
>> first few bits for some
>> of the multiplications and so the whole operation
>> can fit into 2 unsigned
>> longs. Currently it is slightly slower than z_powmod
>> in most places although
>> I can occasionally get improvements. I think it
>> depends on the relative
>> sizes of the parameters and/or how many 1's or 0's
>> they have but as there
>> are 3 of them I have no idea how to test it
>> completely to see which cases it
>> is faster.
>>
>> Peter
>> > #include <gmp.h>
>> #include "flint.h"
>> #include "long_extras.h"
>> #include <stdio.h>
>> #include "longlong.h"
>>
>> unsigned long mprod_precomp(unsigned long x,
>> unsigned long y, unsigned long n, unsigned long
>> nprime, unsigned long r1, unsigned long bits)
>> {
>> unsigned long z,high,low,xyh,xyl;
>>
>> umul_ppmm(xyh,xyl,x,y);
>> z=(xyl)&r1;
>> z=(z*(nprime))&r1;
>> umul_ppmm(high,low,z,n);
>> add_ssaaaa(high,low,xyh,xyl,high,low);
>> z=(low>>bits)+(high<<(FLINT_BITS-bits));
>>
>>
>> if (z>=n){ /* reduction result lies in [0, 2n), so z can equal n */
>> return z-n;
>> }
>> return z;
>> }
>>
>>
>> unsigned long mpow(unsigned long const x, unsigned
>> long const y, unsigned long const n)
>> {
>> int bits=FLINT_BIT_COUNT(n);
>> unsigned long r,xhat,phat,power,nprime;
>> pre_inv2_t ninv;
>> r=(1L<<bits);
>> power=1L<<FLINT_BIT_COUNT(y);
>> ninv=z_precompute_inverse(n);
>> nprime=r-z_invert(n,r);
>> phat=z_mod2_precomp(r,n,ninv);
>> xhat=z_mulmod2_precomp(x,phat,n,ninv);
>> r=r-1;
>>
>> while (power)
>> {
>> phat=mprod_precomp(phat,phat,n,nprime,r,bits);
>> if ((y&power)>0){
>> phat=mprod_precomp(phat,xhat,n,nprime,r,bits);
>> }
>> power=power>>1;
>> }
>> return mprod_precomp(phat,1L,n,nprime,r,bits);
>>
>> }
>>
>> int main()
>> {
>> unsigned long x,y,n;
>> int i,lengthx,lengthy,lengthn,minlen,maxlen;
>> minlen=60;
>> maxlen=63;
>> //x=2L;
>>
>> for (lengthx=minlen; lengthx<maxlen; lengthx++)
>> {
>> for (lengthy=minlen; lengthy<maxlen; lengthy++)
>> {
>> for (lengthn=minlen; lengthn<maxlen; lengthn++)
>> {
>> //printf("%d^%d mod
>> %d\n",lengthx,lengthy,lengthn);
>> for (i=0; i<100000; i++)
>> {
>> x=z_randbits(lengthx)+1;
>> //y=z_randbits(lengthy)+1;
>> n=2*z_randbits(lengthn-1)+1;
>> y=n-1;
>> //mpow(x,y,n);
>> z_powmod2(x,y,n);
>> }
>> }
>> }
>> }
>> }
>>
>> >
>>
>
>
>
>
>
>
|
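For reference, SPRP here means one strong probable prime (Miller-Rabin) round to a single base: write n - 1 = 2^s * d with d odd and accept if a^d ≡ 1 or a^(2^r * d) ≡ -1 mod n for some r < s. This is a generic sketch using 128-bit intermediates, not FLINT's SPRP/SPRP64 routines.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;

static uint64_t mulmod64(uint64_t a, uint64_t b, uint64_t n)
{
    return (uint64_t)((u128)a * b % n);
}

/* binary exponentiation: a^e mod n */
static uint64_t powmod64(uint64_t a, uint64_t e, uint64_t n)
{
    uint64_t r = 1 % n;
    a %= n;
    while (e)
    {
        if (e & 1) r = mulmod64(r, a, n);
        a = mulmod64(a, a, n);
        e >>= 1;
    }
    return r;
}

/* one strong pseudoprime round: returns 1 if odd n > 2 passes base a */
static int is_sprp(uint64_t a, uint64_t n)
{
    uint64_t d = n - 1;
    int s = 0;
    while ((d & 1) == 0) { d >>= 1; s++; }   /* n - 1 = 2^s * d, d odd */
    uint64_t x = powmod64(a, d, n);
    if (x == 1 || x == n - 1) return 1;
    for (int r = 1; r < s; r++)
    {
        x = mulmod64(x, x, n);
        if (x == n - 1) return 1;
    }
    return 0;
}
```

2047 = 23 * 89 is the smallest composite passing base 2, which is why a battery of bases (the "5 pseudoprime tests" above) is needed.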
|
From: William H. <ha...@ya...> - 2008-08-20 17:08:10
|
Hi Peter,
This is very interesting. There is a fairly dramatic
jump at 1122004669633. This is the point where 5
pseudoprime tests are used instead of 4, and not 10^16
where Miller Rabin is used.
The reason for the jump is that SPRP64 has to be used
instead of SPRP since the numbers can exceed the size
of an unsigned long.
The Lucas test seems to be very good, and given that a
counterexample to the conjecture that it plus a 2-SPRP
test is a primality test is worth money, we may as
well use it in FLINT for our probably prime testing up
to 2^64.
We'll use Pocklington-Lehmer for z_isprime after
10^16, I think. I'll be very interested to see what
the time comparison is like for that, not that it
matters, because we currently don't have any test in
FLINT which is guaranteed to separate composites from
primes after 10^16.
Bill.
--- Peter Shrimpton <ps...@gm...> wrote:
> Hello
>
> Attached are spreadsheets (one excel the other
> openoffice) showing the
> relative speeds of z_isprime and a Baillie-PSW test.
> In contradiction to my
> earlier conversation with Bill (I was getting the
> colours confused) the
> Baillie-PSW test does perform a lot better than
> z_isprime as soon as
> z_isprime has to do 5 pseudoprime tests. However
> whilst testing this I found
> that there is a bug in my lucas test code which
> makes it return a handful
> of composites as prime. It only seems to affect
> about 10 or so numbers which
> are all 'quite small' (about 8 digits long).
>
> Also I have managed to get Montgomery powering
> working for unsigned longs
> (see attached). I realised that as you only need the
> first few bits for some
> of the multiplications and so the whole operation
> can fit into 2 unsigned
> longs. Currently it is slightly slower than z_powmod
> in most places although
> I can occasionally get improvements. I think it
> depends on the relative
> sizes of the parameters and/or how many 1's or 0's
> they have but as there
> are 3 of them I have no idea how to test it
> completely to see which cases it
> is faster.
>
> Peter
> > #include <gmp.h>
> #include "flint.h"
> #include "long_extras.h"
> #include <stdio.h>
> #include "longlong.h"
>
> unsigned long mprod_precomp(unsigned long x,
> unsigned long y, unsigned long n, unsigned long
> nprime, unsigned long r1, unsigned long bits)
> {
> unsigned long z,high,low,xyh,xyl;
>
> umul_ppmm(xyh,xyl,x,y);
> z=(xyl)&r1;
> z=(z*(nprime))&r1;
> umul_ppmm(high,low,z,n);
> add_ssaaaa(high,low,xyh,xyl,high,low);
> z=(low>>bits)+(high<<(FLINT_BITS-bits));
>
>
> if (z>=n){ /* reduction result lies in [0, 2n), so z can equal n */
> return z-n;
> }
> return z;
> }
>
>
> unsigned long mpow(unsigned long const x, unsigned
> long const y, unsigned long const n)
> {
> int bits=FLINT_BIT_COUNT(n);
> unsigned long r,xhat,phat,power,nprime;
> pre_inv2_t ninv;
> r=(1L<<bits);
> power=1L<<FLINT_BIT_COUNT(y);
> ninv=z_precompute_inverse(n);
> nprime=r-z_invert(n,r);
> phat=z_mod2_precomp(r,n,ninv);
> xhat=z_mulmod2_precomp(x,phat,n,ninv);
> r=r-1;
>
> while (power)
> {
> phat=mprod_precomp(phat,phat,n,nprime,r,bits);
> if ((y&power)>0){
> phat=mprod_precomp(phat,xhat,n,nprime,r,bits);
> }
> power=power>>1;
> }
> return mprod_precomp(phat,1L,n,nprime,r,bits);
>
> }
>
> int main()
> {
> unsigned long x,y,n;
> int i,lengthx,lengthy,lengthn,minlen,maxlen;
> minlen=60;
> maxlen=63;
> //x=2L;
>
> for (lengthx=minlen; lengthx<maxlen; lengthx++)
> {
> for (lengthy=minlen; lengthy<maxlen; lengthy++)
> {
> for (lengthn=minlen; lengthn<maxlen; lengthn++)
> {
> //printf("%d^%d mod
> %d\n",lengthx,lengthy,lengthn);
> for (i=0; i<100000; i++)
> {
> x=z_randbits(lengthx)+1;
> //y=z_randbits(lengthy)+1;
> n=2*z_randbits(lengthn-1)+1;
> y=n-1;
> //mpow(x,y,n);
> z_powmod2(x,y,n);
> }
> }
> }
> }
> }
>
> >
>
|
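Peter's mprod_precomp above works with R = 2^bits chosen just above n. A common alternative on 64-bit machines fixes R = 2^64 and obtains n' = -n^{-1} mod 2^64 by Newton iteration instead of z_invert. The following is a hedged sketch of that variant (inv64, redc and mulmod_mont are illustrative names, not FLINT functions), valid for odd n < 2^63; it is checked against a plain 128-bit reduction.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;

/* inverse of odd n modulo 2^64: each step x -> x*(2 - n*x) doubles the
   number of correct low bits; x = n is already correct to 3 bits */
static uint64_t inv64(uint64_t n)
{
    uint64_t x = n;
    for (int i = 0; i < 5; i++)
        x *= 2 - n * x;
    return x;
}

/* REDC: given T < n * 2^64, return T * 2^-64 mod n */
static uint64_t redc(u128 T, uint64_t n, uint64_t nprime)
{
    uint64_t m = (uint64_t)T * nprime;              /* T * (-n^-1) mod 2^64 */
    uint64_t t = (uint64_t)((T + (u128)m * n) >> 64);
    return t >= n ? t - n : t;                      /* t lies in [0, 2n) */
}

/* a*b mod n via Montgomery form, for odd n < 2^63 */
static uint64_t mulmod_mont(uint64_t a, uint64_t b, uint64_t n)
{
    uint64_t nprime = (uint64_t)0 - inv64(n);           /* -n^-1 mod 2^64 */
    uint64_t rmod   = (uint64_t)(((u128)1 << 64) % n);  /* R mod n        */
    uint64_t r2     = (uint64_t)((u128)rmod * rmod % n);/* R^2 mod n      */
    uint64_t ar  = redc((u128)(a % n) * r2, n, nprime); /* a*R mod n      */
    uint64_t br  = redc((u128)(b % n) * r2, n, nprime); /* b*R mod n      */
    uint64_t abr = redc((u128)ar * br, n, nprime);      /* a*b*R mod n    */
    return redc((u128)abr, n, nprime);                  /* a*b mod n      */
}
```

Five Newton steps suffice because the precision grows 3, 6, 12, 24, 48, 96 bits, passing 64 on the fifth step.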
|
From: William H. <ha...@ya...> - 2008-08-18 14:57:13
|
The timings certainly sound encouraging! I should mention that sometimes you can be misled about a timing due to optimisations the compiler is making. It might decide that it can rearrange things in a different order, and then you end up timing something you didn't intend to. I've certainly had this happen. Usually when you are making a call to a function in another file or module this is not a problem, but it is definitely worth checking by timing everything then timing just the bit you are not interested in, and taking the difference, to make sure the result is what you expect. Regarding primes, Magma has special code for primes p such that p-1 can be represented in 1, 2, 4, (possibly also 8 and 16), 22/23, 30 or 64 bits I believe. Strangely I've not been able to find variances at 53 bits which I found odd, since this is the size of a double. I would try a small prime, e.g. p = 5, one at 22 bits, one at 30 and one at about 62 or 63 bits. I suspect you'll be way ahead for large primes. Bill. --- Richard Howell-Peak <ric...@gm...> wrote: > Well I'm pretty sure I don't time how long it takes > to generate the > polynomial anyway since I start timing just before > and after I call > Factorisation. I had the iterations low just so it > would run at a reasonable > speed on my laptop, I was thinking maybe 10 or so > would be sufficient. > > I was wondering what data ranges would be good. I > was thinking of doing a > few primes, a small one, a medium one and a large > and comparing the output, > then maybe later in the week I'll do one that runs > through a whole load of > primes and makes a 2x2 array for each one which I > can make into a nice > picture. At the moment I've only been going up to > polynomials of length 250 > and it looks quite promising - sometimes flint is > way ahead, sometimes magma > has the edge and sometimes they're pretty close so > we'll see what happens. 
> > -- > Richard > |
|
From: Richard Howell-P. <ric...@gm...> - 2008-08-18 14:29:57
|
Well I'm pretty sure I don't time how long it takes to generate the polynomial anyway since I start timing just before and after I call Factorisation. I had the iterations low just so it would run at a reasonable speed on my laptop, I was thinking maybe 10 or so would be sufficient. I was wondering what data ranges would be good. I was thinking of doing a few primes, a small one, a medium one and a large and comparing the output, then maybe later in the week I'll do one that runs through a whole load of primes and makes a 2x2 array for each one which I can make into a nice picture. At the moment I've only been going up to polynomials of length 250 and it looks quite promising - sometimes flint is way ahead, sometimes magma has the edge and sometimes they're pretty close so we'll see what happens. -- Richard |
|
From: Bill H. <goo...@go...> - 2008-08-18 11:26:46
|
Well, apart from the fact that you'll get a much nicer graph the other
way, and the fact that you are being a little unfair to yourself by
not subtracting the time it takes to generate the random polynomials,
this looks like it should work.
It's up to you how you'd like to present your data. You definitely
want to use more than 3 iterations, otherwise the timings will be very
inconsistent. Also, you want to use a number of different *sized*
primes and length polynomials. I suspect you'll find Magma is way
faster for very small primes.
I have in the past profiled by hand on occasion, as it is "quicker".
The time function is alright I find down to times of about 10ms. Of
course you can increase the amount of time by putting everything in a
loop and timing the whole loop. You will of course get microsecond
timing if you use the FLINT code. Magma uses millisecond timing, but
the example profile code is set up to do all the loops for you and
figure out the individual times.
Fortunately factorisation takes a long time on average, so it
shouldn't be that hard to time.
Bill.
2008/8/18 Richard Howell-Peak <ric...@gm...>:
> ok, I will look into that. Before I read your reply I wrote this code which
> just does timings in Flint, then I was going to do some magma timings and do
> the graph in Excel. I guess proper timings are important officially for the
> project but if this way is quicker then I'd do it instead so I can get
> something on my poster. Is this a reasonable way to time the factorisation
> function? It seems to work for me but I'm not sure I trust the std C time
> library very much.
>
> here goes:
>
> #include <time.h>
> #include <sys/times.h>
> #include <unistd.h>
> /**
> * Main timing function for zmod_poly_factorise
> */
> #define MIN_LENGTH 250
> #define MAX_LENGTH 270 //Maximum length of polynomial to be timed
> #define NUMBER 3 //Number of random poly's we test to get the average time
> void timing()
> {
> //Variables
> clock_t start, end;
> double total_time;
> double timings[MAX_LENGTH - MIN_LENGTH]; //store the average time of poly length [i]
> zmod_poly_t input;
> zmod_poly_factor_t factors;
>
> printf("Computing Timings \n");
> for(unsigned long k = MIN_LENGTH; k < MAX_LENGTH; ++k)
> {
> total_time = 0;
> for(unsigned long i = 0; i < NUMBER; ++i)
> {
> //Set up a polynomial
> zmod_poly_init(input, PRIME);
> randpoly(input, k, PRIME);
> zmod_poly_make_monic(input, input);
> //set up factors
> zmod_poly_factor_init(factors);
>
> //Begin
> printf(".");
> start = clock();
> //do stuff
> zmod_poly_factorise(factors, input);
> //End
> end = clock();
> printf("/");
> fflush(stdout);
> //Calc timing
> total_time += ((double) (end - start)) / CLOCKS_PER_SEC; //accumulate over NUMBER runs
>
> //clear memory
> zmod_poly_clear(input);
> zmod_poly_factor_clear(factors);
> }
> total_time = ((double) total_time ) / NUMBER;
> timings[k - MIN_LENGTH] = total_time;
> }
>
> //output the data
> printf("\n\t \t Average Timings: \n");
> printf("Length \t Average Time \n");
> for(unsigned long k=0; k < MAX_LENGTH - MIN_LENGTH; ++k)
> printf("%lu \t %f \n", k + MIN_LENGTH, timings[k]);
> }
>
>
>
> --
> Richard
>
>
>
|
|
From: Richard Howell-P. <ric...@gm...> - 2008-08-18 10:43:28
|
ok, I will look into that. Before I read your reply I wrote this code which
just does timings in Flint, then I was going to do some magma timings and do
the graph in Excel. I guess proper timings are important officially for the
project but if this way is quicker then I'd do it instead so I can get
something on my poster. Is this a reasonable way to time the factorisation
function? It seems to work for me but I'm not sure I trust the std C time
library very much.
here goes:
#include <time.h>
#include <sys/times.h>
#include <unistd.h>
/**
* Main timing function for zmod_poly_factorise
*/
#define MIN_LENGTH 250
#define MAX_LENGTH 270 //Maximum length of polynomial to be timed
#define NUMBER 3 //Number of random poly's we test to get the average time
void timing()
{
//Variables
clock_t start, end;
double total_time;
double timings[MAX_LENGTH - MIN_LENGTH]; //store the average time of poly length [i]
zmod_poly_t input;
zmod_poly_factor_t factors;
printf("Computing Timings \n");
for(unsigned long k = MIN_LENGTH; k < MAX_LENGTH; ++k)
{
total_time = 0;
for(unsigned long i = 0; i < NUMBER; ++i)
{
//Set up a polynomial
zmod_poly_init(input, PRIME);
randpoly(input, k, PRIME);
zmod_poly_make_monic(input, input);
//set up factors
zmod_poly_factor_init(factors);
//Begin
printf(".");
start = clock();
//do stuff
zmod_poly_factorise(factors, input);
//End
end = clock();
printf("/");
fflush(stdout);
//Calc timing
total_time += ((double) (end - start)) / CLOCKS_PER_SEC; //accumulate over NUMBER runs
//clear memory
zmod_poly_clear(input);
zmod_poly_factor_clear(factors);
}
total_time = ((double) total_time ) / NUMBER;
timings[k - MIN_LENGTH] = total_time;
}
//output the data
printf("\n\t \t Average Timings: \n");
printf("Length \t Average Time \n");
for(unsigned long k=0; k < MAX_LENGTH - MIN_LENGTH; ++k)
printf("%lu \t %f \n", k + MIN_LENGTH, timings[k]);
}
--
Richard
|
|
From: Bill H. <goo...@go...> - 2008-08-18 10:10:46
|
Hi Richard, You could just obtain some timings in both and compare. If you wanted to do a more comprehensive timing then in the magma-profiles directory of the source tree are some example profiles and a template which you can modify. Basically you'll see in the examples that in one part of the code the entire thing is timed, e.g. suppose we are timing multiplication of polynomials, we'd time the entire thing, including multiplying the polynomials, random generation of the polynomials and setting them up. In the second part, we time just generating the random polynomials and setting them up. The latter time is subtracted from the former, leaving the time it took to multiply the polynomials. This is done for a range of different values and output to a file. For example to run the polynomial multiplication example, you have to type: magma -b poly-mult.m > magma.prof & To time FLINT is slightly trickier, but not much. You have to add a file called zmod_poly-profile.c, which you can get by modifying fmpz_poly-profile.c say. That file contains numerous examples and you should throw all of them away except one, which you can use as a template for ideas. Replace the randpoly function with one that generates random zmod_poly's with a modulus with a given number of bits and of a given length. You can probably throw away the run_triangle function. I'd replace it with a run_rectangle function which just runs from bits = 2 .. 63 and length from 1 .. 250 (there isn't much point going beyond 250 since we don't have asymptotically fast GCD yet). Keep just one specific example, e.g. the functions: sample_fmpz_poly_mul_KS, profDriverString_fmpz_poly_mul_KS, profDriverDefaultParams_fmpz_poly_mul_KS, profDriver_fmpz_poly_mul_KS. Obviously you'll have to modify each, including all the names and text strings and constants so it does what you want and times your zmod_poly_factor function (which is the equivalent of what Magma does).
You'll have to add some stuff to the makefile so it builds zmod_poly-profile.o and zmod_poly-profile. Finally to run the profile do: ./zmod_poly-profile -t 1 > flint.prof & You can of course see the last few lines of the profile files being generated using tail flint.prof or tail magma.prof. Once they are done, it is actually possible to create a colour graph in an automated way to compare the two. I'll describe how to do that once you are up to that stage. Most importantly, please retain all profiling code. It is very important when publicly comparing FLINT and Magma that people can see the precise profiling code used. This is very important for fairness and so others can check it. Bill. 2008/8/18 Richard Howell-Peak <ric...@gm...>: > I was wondering how best to do the time comparisons with magma for the > factorisation algorithm. Is there an interface in flint for calling magma > functions, or shall I get some timings in magma, then do some timings in > flint and compare them like that? > > -- > Richard > |
|
From: Richard Howell-P. <ric...@gm...> - 2008-08-18 09:19:34
|
I was wondering how best to do the time comparisons with magma for the factorisation algorithm. Is there an interface in flint for calling magma functions, or shall I get some timings in magma, then do some timings in flint and compare them like that? -- Richard |
|
From: Richard Howell-P. <ric...@gm...> - 2008-08-15 15:10:50
|
I include here 3 things, the improved square-free algorithm, some
documentation for it and some test code which it seems to pass ok. I have
noticed however that some polynomials cause it to put in way too many powers
in the exponents part, I think it might be to do with this part of the code:
if (res->num_factors) res->exponents[res->num_factors-1] *= i;
i++;
which I am not even sure what it is doing or why. I think it is a problem
when you input a polynomial with factors of high multiplicity, where maybe
some of the exponents get multiplied when they shouldn't, but I'm not really
sure. I've also written a zmod_poly_factorise function which takes care of
factoring and seems to work whenever square-free does.
/**
 * This function takes an arbitrary polynomial and factorises it. It first
performs a square-free factorisation, then
 * factorises all of the square free polynomials and returns the leading
coefficient, all the factors will be monic.
*/
unsigned long zmod_poly_factorise(zmod_poly_factor_t result, zmod_poly_t
input)
{
//This will store all of the square-free factors
zmod_poly_factor_t sqfree_factors;
zmod_poly_factor_init(sqfree_factors);
zmod_poly_t monic_input;
zmod_poly_init(monic_input, zmod_poly_modulus(input));
//Now we must make sure the input polynomial is monic. Get the highest coeff and store it, then call make monic
unsigned long leading_coeff = zmod_poly_get_coeff_ui(input,
zmod_poly_degree(input));
zmod_poly_make_monic(monic_input,input);
//Now we do the square-free factorisation of the input polynomial
zmod_poly_factor_square_free(sqfree_factors, monic_input);
//space to store factors
zmod_poly_factor_t* factors;
factors = (zmod_poly_factor_t*) malloc (sizeof (*factors) *
sqfree_factors->num_factors );
//Run berlekamp on each of the square-free factors
for(unsigned long i = 0; i < sqfree_factors->num_factors; ++i)
{
zmod_poly_factor_init(factors[i]);
zmod_poly_get_coeff_ui(sqfree_factors->factors[i],
zmod_poly_degree(sqfree_factors->factors[i]));
zmod_poly_make_monic(sqfree_factors->factors[i],sqfree_factors->factors[i]);
zmod_poly_factor_berlekamp(factors[i], sqfree_factors->factors[i]);
zmod_poly_factor_pow(factors[i], sqfree_factors->exponents[i]);
//Now add all the factors to the final array
zmod_poly_factor_concat(result, factors[i]);
//clean up the memory being used
zmod_poly_factor_clear(factors[i]);
}
//clean up memory
zmod_poly_factor_clear(sqfree_factors);
free(factors);
return leading_coeff;
}
/**
* Computes the square free factorisation of the given polynomial. Input
must be a monic polynomial. All the factors will be monic.
* @param f The polynomial to factorise.
* @param fac The factorisation into monic square free factors of
<code>f</code>.
*/
/**
 * Square-Free Algorithm, takes an arbitrary polynomial in F_p[X] and returns an array of square free factors
* LOW MULTIPLICITIES
*/
void zmod_poly_factor_square_free(zmod_poly_factor_t res, zmod_poly_t f)
{
if (f->length <= 1)
{
res->num_factors = 0;
return;
}
if (f->length == 2)
{
zmod_poly_factor_add(res, f);
return;
}
unsigned long p = zmod_poly_modulus(f); //order of the field
unsigned long deg = zmod_poly_degree(f); //degree of the polynomial
//Step 1, look at f', if it is zero then we are done since f = h(x)^p
//for some particular h(x), clearly f(x) = sum a_k x^kp, k <= deg(f)
zmod_poly_t f_d, g, g_1;
zmod_poly_init(g_1, p);
zmod_poly_init(f_d, p);
zmod_poly_init(g, p);
zmod_poly_derivative(f_d, f);
//CASE 1:
if(zmod_poly_is_zero(f_d))
{
zmod_poly_t h;
zmod_poly_init(h, p);
for(unsigned long i = 0; i <= deg/p; ++i) //this will be an integer since f'=0
{
zmod_poly_set_coeff_ui(h, i, zmod_poly_get_coeff_ui(f,
i*p));
}
//now run square-free on h, and return it to the pth power
zmod_poly_factor_t new_res;
zmod_poly_factor_init(new_res);
//h will be monic because the input is monic
zmod_poly_factor_square_free(new_res, h);
//now raise it to the power of p
zmod_poly_factor_pow(new_res, p);
zmod_poly_factor_concat(res, new_res); //note, concatenating is equivalent to multiplying
}
else
{
zmod_poly_gcd(g, f, f_d);
zmod_poly_div(g_1, f, g);
unsigned long i = 1;
//CASE 2:
while (!zmod_poly_is_one(g_1))
{
zmod_poly_t h, z;
zmod_poly_init(h, p);
zmod_poly_init(z, p);
zmod_poly_gcd(h, g_1, g);
zmod_poly_div(z, g_1, h);
//Make sure the factor is monic
zmod_poly_make_monic(z,z);
//record z with multiplicity i; skip a trivial (constant) z so an
//earlier factor's exponent is never scaled by mistake
if (z->length > 1)
{
zmod_poly_factor_add(res, z);
res->exponents[res->num_factors-1] = i;
}
i++;
zmod_poly_set(g_1, h);
zmod_poly_div(g, g, h);
}
if(!zmod_poly_is_one(g))
{
//so now we multiply res with square-free(g^1/p) ^ p
zmod_poly_t g_p; //g^(1/p)
zmod_poly_init(g_p, p);
for(unsigned long i = 0; i <= zmod_poly_degree(g)/p; i++)
zmod_poly_set_coeff_ui(g_p, i, zmod_poly_get_coeff_ui(g,
i*p));
zmod_poly_factor_t new_res_2;
zmod_poly_factor_init(new_res_2);
//make sure g_p is monic
zmod_poly_make_monic(g_p,g_p);
//square-free(g^(1/p))
zmod_poly_factor_square_free(new_res_2, g_p);
//now raise it to the power of p
zmod_poly_factor_pow(new_res_2, p);
zmod_poly_factor_concat(res, new_res_2);
}
}
}
int test_zmod_poly_factor_square_free()
{
int result = 1;
zmod_poly_t pol1,prod;
zmod_poly_factor_t res;
unsigned long bits;
unsigned long modulus;
for (unsigned long count1 = 0; (count1 < 10000) && (result == 1);
count1++)
{
bits = randint(FLINT_BITS-2)+2;
do {modulus = randprime(bits);} while (modulus < 2);
#if DEBUG
printf("bits = %ld, modulus = %ld\n", bits, modulus);
#endif
zmod_poly_init(pol1, modulus);
zmod_poly_init(prod, modulus);
zmod_poly_factor_init(res);
zmod_poly_set_coeff_ui(pol1, randint(20), 1L);
unsigned long k = zmod_poly_get_coeff_ui(pol1, zmod_poly_degree(pol1));
zmod_poly_factor_square_free(res, pol1);
//check the result
zmod_poly_factor_product(prod, res);
zmod_poly_scalar_mul(prod, prod, k);
result &= zmod_poly_equal(prod, pol1);
zmod_poly_clear(pol1);
zmod_poly_clear(prod);
zmod_poly_factor_clear(res);
}
return result;
}
--
Richard
|
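The CASE 1 branch above rests on a fact worth spelling out: over F_p the derivative of x^(kp) is kp*x^(kp-1) ≡ 0, so f' = 0 exactly when every exponent carrying a nonzero coefficient is a multiple of p, and then f(x) = h(x^p) = h(x)^p by the Frobenius map. A tiny standalone sketch of the derivative on a bare coefficient array (my own illustration, not the zmod_poly API):

```c
#include <assert.h>
#include <stddef.h>

/* d[i] = (i+1)*a[i+1] mod p for a polynomial a of length len (degree len-1);
   returns the length of the derivative, 0 meaning the zero polynomial */
static size_t poly_derivative_mod(unsigned long *d, const unsigned long *a,
                                  size_t len, unsigned long p)
{
    size_t dlen = 0;
    for (size_t i = 1; i < len; i++)
    {
        d[i - 1] = ((i % p) * (a[i] % p)) % p;  /* coefficient of x^(i-1) */
        if (d[i - 1]) dlen = i;                 /* track highest nonzero term */
    }
    return dlen;
}
```

For example, over F_5 the derivative of x^5 + 1 is 5x^4 ≡ 0, exactly the situation the recursion on h handles.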
|
From: Bill H. <goo...@go...> - 2008-08-15 12:13:20
|
Yep that's it. Thanks Peter. Note that you can't divide by zero mod p. Bill. On 15/08/2008, Peter Shrimpton <ps...@gm...> wrote: > if you are using unsigned longs then you can use z_invert(i,p) to get you > i^-1 mod p and then do c*(i^-1) mod p to get k. I think that should work > > Peter > > 2008/8/15 Richard Howell-Peak <ric...@gm...> > >> Is there any way in FLINT to divide by a number mod p. For instance if I >> have i*k = c mod p and I know i and c, how can I compute k? >> >> -- >> Richard >> > |
|
From: Peter S. <ps...@gm...> - 2008-08-15 11:03:18
|
if you are using unsigned longs then you can use z_invert(i,p) to get you i^-1 mod p and then do c*(i^-1) mod p to get k. I think that should work Peter 2008/8/15 Richard Howell-Peak <ric...@gm...> > Is there any way in FLINT to divide by a number mod p. For instance if I > have i*k = c mod p and I know i and c, how can I compute k? > > -- > Richard > > |
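Spelled out, dividing by i mod p means multiplying by its inverse, which z_invert presumably computes with something like the extended Euclidean algorithm. A self-contained sketch (hypothetical helper names, and assuming p is small enough that p^2 fits in an unsigned long):

```c
#include <assert.h>

/* inverse of a mod p for gcd(a, p) = 1, by the extended Euclidean algorithm */
static unsigned long inv_mod(unsigned long a, unsigned long p)
{
    long long t = 0, newt = 1;
    long long r = (long long)p, newr = (long long)(a % p);
    while (newr != 0)
    {
        long long q = r / newr, tmp;
        tmp = t - q * newt; t = newt; newt = tmp;  /* update Bezout coefficient */
        tmp = r - q * newr; r = newr; newr = tmp;  /* update remainder */
    }
    return (unsigned long)(t < 0 ? t + (long long)p : t);
}

/* given i*k = c mod p with gcd(i, p) = 1, recover k */
static unsigned long solve_for_k(unsigned long i, unsigned long c, unsigned long p)
{
    return (c % p) * inv_mod(i, p) % p;
}
```

For example, with i = 4, c = 2, p = 7 this gives k = 4, and indeed 4*4 = 16 ≡ 2 mod 7.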
|
From: Richard Howell-P. <ric...@gm...> - 2008-08-15 11:00:16
|
Is there any way in FLINT to divide by a number mod p. For instance if I have i*k = c mod p and I know i and c, how can I compute k? -- Richard |
|
From: Richard Howell-P. <ric...@gm...> - 2008-08-15 08:32:20
|
I reckon that's probably the best way, I'll try and get it working. -- Richard |
|
From: Bill H. <goo...@go...> - 2008-08-14 13:04:10
|
Hi Richard, Sorry I must have misunderstood what you meant when you mentioned this before. I personally don't have a problem with the extra factor, as you *have* returned factors after all. But some users may become confused by it. There are two ways to go. One is to make sure all factors are monic, then return the leading coefficient of the original polynomial as an extra unit factor (as an unsigned long). The other way is to multiply the leading coefficients of all the factors, invert the product mod p and multiply by the leading coefficient of the original polynomial and return this extra unit factor as an unsigned long. To me the former is the most logical. I think trying to get the algorithm to return the "right" factors in the first place is probably a lost cause, since this is not well-defined. Bill. On 14/08/2008, Richard Howell-Peak <ric...@gm...> wrote: > I'm currently trying to fix the bug in the square_free algorithm which > results in returning factors which when multiplied together give a scalar > multiple of the original polynomial. My idea is to try and catch this > factor, and have the square-free algorithm return it as an unsigned long. > I know the input for square-free has to be monic (although I'm not entirely > sure why) so I wonder if when the recursion is called it might not be monic > and that could be causing a problem. The only other thing I could see as > being a problem is when gcd is called, is there something funny going on > since any scalar multiple of the gcd is also gcd? This shouldn't be a > problem because you divide by it anyway but I just wondered how the function > decides what polynomial to return. > > -- > Richard > |