From: <lutz.euler@fr...>  20120607 12:47:56

Hi, I wrote: > I have prepared two patches: The first marks the test skippedon nonx86 > and adds a comment as to why, and the second one adds another test, > skippedon x86, that tests the libm rangereduction. I have just committed these two. I am curious to learn if any combinations of backend and operating system fail the second test (the one added with commit 4de15256dd6387e5) as I tested it only on x8664 under Linux. Greetings, Lutz 
From: <lutz.euler@fr...>  20120528 15:39:28

Hi, the mentioned test is tagged as "Expected failure", currently only on x8664 (but also fails on PPC, see lp#964348  I agree with Nikodemus that that is expected). I am getting earnestly annoyed by this, so I investigated. It turns out things are completely different. The test is about range reduction for trigonometric functions on large float arguments, like (sin (expt 2d0 70)). It was introduced with commit ad9090dc91fc922a with a fix for lp#327192 where Gabor had noticed that on x86 the above expression returns 0d0 and Paul committed a change introducing range reduction into the relevant VOPs on x86. In short: The test is bogus. Even on x86 it only doesn't fail due to an (un)lucky choice of test values. Trigonometric functions on large float arguments on x86 still deliver completely wrong results. The x8664 port gets everything right but the test punishes that. Basically, x86 gets things wrong because it does range reduction (even for arguments below (expt 2 63), it seems) modulo some limitedprecision approximation of 2 pi while libm uses an algorithm that delivers results as if the infinitely precise pi was used to reduce mod 2 pi. So what do we want:  Document that x86 is imprecise.  Completely revert the fix for lp#327192 on the account that delivering 0 or some random value are both equally bad (the bug that "tan" crashes can be fixed independently; there is only an "fldz" missing).  Change the test to test for correct results and mark x86 as failing.  Drop the test as nobody would be using arguments this large to trigonometric functions anyway.  Implement a better range reduction on x86 (in Lisp, presumably) and use it for arguments above (expt 2 20) or so.  Convert x86 to use libm for these functions. Any comments appreciated! Some details: Here are some results of (sin (expt 2d0 n)) on SBCL x86, respectively (coerce (sin (expt 2l0 n)) 'doublefloat) on Clisp with the longfloat precision set to 3000 bits (the results of the first expression on SBCL x8664 are equal to Clisp's results and I assume that both are mathematically correct within one ULP or so): n = 20: SBCL x86: 0.33049314002173596d0 Clisp: 0.3304931400217347d0 n = 40: SBCL x86: 0.4057050128266265d0 Clisp: 0.40570501153282873d0 n = 70: SBCL x86: 0.936042960761405d0 Clisp: 0.9981794021933068d0 So, with n = 70, x86 does not even get the sign of the result right. Here is why the test only accidentally works on x86: For several values it compares for example (sin value) with (sin (mod value (* 2 pi))). It turns out that for the only value large enough to have a chance to make "almost=" in the test fail (in light of the size of the errors seen above), namely (* pi (expt 2d0 70)), the modular reductions by mod (Lisp) and by fprem/fsin (x87) yield the same, namely 0d0: (All following expressions are evaluated on x86) * (sin (* pi (expt 2d0 70))) 0.0d0 * (mod (* pi (expt 2d0 70)) (* 2 pi)) 0.0d0 * (sin (mod (* pi (expt 2d0 70)) (* 2 pi))) 0.0d0 But for some other values these are not equal (or "almost=") anymore (and still both completely wrong, I need not mention): * (let* ((value (* 1.625d0 pi (expt 2d0 70))) (mod (mod value (* 2 pi)))) (values (sin value) mod (sin mod))) 0.1676180272408028d0 0.0d0 0.0d0 * (let* ((value (* 1.6d0 (expt 2d0 70))) (mod (mod value (* 2 pi)))) (values (sin value) mod (sin mod))) 0.5720626950150568d0 262144.0d0 0.08410702780950105d0 Greetings, Lutz 
From: Nikodemus Siivola <nikodemus@ra...>  20120528 19:52:54

On 28 May 2012 18:36, Lutz Euler <lutz.euler@...> wrote: >  Document that x86 is imprecise. To the degree that we can't fix it (or haven't yet fixed tis), we should do this. >  Completely revert the fix for lp#327192 on the account that delivering > 0 or some random value are both equally bad (the bug that "tan" > crashes can be fixed independently; there is only an "fldz" missing). Adding the missing fldz sounds right TRT. >  Change the test to test for correct results and mark x86 as failing. Sounds right. >  Drop the test as nobody would be using arguments this large to > trigonometric functions anyway. Someone ran across it... so I think there's some experimental proof to the contrary. >  Implement a better range reduction on x86 (in Lisp, presumably) and > use it for arguments above (expt 2 20) or so. >  Convert x86 to use libm for these functions. I think I would prefer to use libm, but I foresee that even if we use libm elsewhere, we may need to do something else on Windows  but let's not hope so... Cheers,  Nikodemus 
From: Paul Khuong <pvk@pv...>  20120529 00:25:57

On Mon, May 28, 2012 at 5:36 PM, Lutz Euler <lutz.euler@...> wrote: > The test is about range reduction for trigonometric functions on large > float arguments, like (sin (expt 2d0 70)). It was introduced with commit > ad9090dc91fc922a with a fix for lp#327192 where Gabor had noticed that > on x86 the above expression returns 0d0 and Paul committed a change > introducing range reduction into the relevant VOPs on x86. > > In short: The test is bogus. Even on x86 it only doesn't fail due to an > (un)lucky choice of test values. Trigonometric functions on large float > arguments on x86 still deliver completely wrong results. The x8664 port > gets everything right but the test punishes that. > Basically, x86 gets things wrong because it does range reduction (even > for arguments below (expt 2 63), it seems) modulo some limitedprecision > approximation of 2 pi. The thing is, branching on C2 and going through FPREM/2Pi is exactly what Intel recommends in their ISA reference: "It is up to the program to check the C2 flag for outofrange conditions. Source values outside the range 263 to +263 can be reduced to the range of the instruction by subtracting an appropriate integer multiple of 2π or by using the FPREM instruction with a divisor of 2π." (I can't tell what section 8.3.8 of volume 1 means when we use FLDPI's 66 bit constant directly). If it's good enough for the official docs, I believe that's good enough for SBCL as well. I think any issue might stem from our using x87 in 64 bit mode. >  Completely revert the fix for lp#327192 on the account that delivering > 0 or some random value are both equally bad (the bug that "tan" > crashes can be fixed independently; there is only an "fldz" missing). Disagree. There's a lot of quibbling to be had wrt whether a FP value denotes a point or a range, but it's pretty clear to me that I should be able to depend on sin^2 + cos^2 = 1. >  Change the test to test for correct results and mark x86 as failing. It's a different test than what was originally intended, but sure. Paul Khuong 
From: <lutz.euler@fr...>  20120529 18:36:35

Hi Paul, > On Mon, May 28, 2012 at 5:36 PM, Lutz Euler <lutz.euler@...> wrote: > > The test is about range reduction for trigonometric functions on large > > float arguments, like (sin (expt 2d0 70)). It was introduced with commit > > ad9090dc91fc922a with a fix for lp#327192 where Gabor had noticed that > > on x86 the above expression returns 0d0 and Paul committed a change > > introducing range reduction into the relevant VOPs on x86. > > > > In short: The test is bogus. Even on x86 it only doesn't fail due to an > > (un)lucky choice of test values. Trigonometric functions on large float > > arguments on x86 still deliver completely wrong results. The x8664 port > > gets everything right but the test punishes that. > > Basically, x86 gets things wrong because it does range reduction (even > > for arguments below (expt 2 63), it seems) modulo some limitedprecision > > approximation of 2 pi. > > The thing is, branching on C2 and going through FPREM/2Pi is exactly > what Intel recommends in their ISA reference: "It is up to the program > to check the C2 flag for outofrange conditions. Source values > outside the range 263 to +263 can be reduced to the range of the > instruction by subtracting an appropriate integer multiple of 2π or by > using the FPREM instruction with a divisor of 2π." (I can't tell what > section 8.3.8 of volume 1 means when we use FLDPI's 66 bit constant > directly). I understand. > If it's good enough for the official docs, I believe that's > good enough for SBCL as well. That is fine with me. Unfortunately, x8664, using libm, calculates completely different results for large inputs, which have IMO at least the same right to be considered good enough as what x87 calculates. (I might even say: Unfortunately, time has passed, the restrictions necessary when x87 was conceived no longer apply (like wanting to use only a coarse approximation to pi), so libm takes a much improved approach and yields correct results even where x87 is completely off.) I started this thread saying that I am annoyed. That I am only about the fact that x8664 fails this test. I don't mind so much what x86 does. > >  Completely revert the fix for lp#327192 on the account that delivering > > 0 or some random value are both equally bad (the bug that "tan" > > crashes can be fixed independently; there is only an "fldz" missing). > > Disagree. There's a lot of quibbling to be had wrt whether a FP value > denotes a point or a range, but it's pretty clear to me that I should > be able to depend on sin^2 + cos^2 = 1. Fine with me. The list of possible changes I sent was not meant to mean "I want all this changed" but more "Which subset of these changes would be sensible?". > >  Change the test to test for correct results and mark x86 as failing. > > It's a different test than what was originally intended, but sure. In light of what you wrote I don't insist on regarding x86's behaviour as "failing". Maybe we should just make the expected test results dependent on whether the platform is x86 or not. With a bit of multiprecision arithmetic I can build a reducer modulo 2 pi with sufficient precision to replicate the results of the libm rangereduction and thus test that. Just now I see that you enabled the test originally only for x86. So another solution would be just to make the test x86only again? Greetings, Lutz 
From: Nathan Froyd <froydnj@gm...>  20120529 18:50:22

On Tue, May 29, 2012 at 2:32 PM, Lutz Euler <lutz.euler@...> wrote: > Maybe we should just make the expected test results dependent on whether > the platform is x86 or not. With a bit of multiprecision arithmetic I > can build a reducer modulo 2 pi with sufficient precision to replicate > the results of the libm rangereduction and thus test that. > > Just now I see that you enabled the test originally only for x86. > > So another solution would be just to make the test x86only again? I think marking the test as requiring x86only weirdness by whatever means would be fine. Longterm, +1 to just relying on libm everywhere. If some sufficiently motivated person wants to enable fsin and friends for appropriatelyDECLARE'd code, that would be acceptable. (glibc has basically done this.) If we run into imprecise platform libm's, it should be more obvious with such a scheme. (Until some sufficiently motivated person writes the appropriate routines in pure CL, of course. ;) Nathan 
From: <lutz.euler@fr...>  20120601 19:09:21

Nathan Froyd wrote, > On Tue, May 29, 2012 at 2:32 PM, Lutz Euler <lutz.euler@...> wrote: > > Maybe we should just make the expected test results dependent on whether > > the platform is x86 or not. With a bit of multiprecision arithmetic I > > can build a reducer modulo 2 pi with sufficient precision to replicate > > the results of the libm rangereduction and thus test that. > > > > Just now I see that you enabled the test originally only for x86. > > > > So another solution would be just to make the test x86only again? > > I think marking the test as requiring x86only weirdness by whatever > means would be fine. I have prepared two patches: The first marks the test skippedon nonx86 and adds a comment as to why, and the second one adds another test, skippedon x86, that tests the libm rangereduction. If no one objects, I will commit both in a few days. Kind regards, Lutz 
From: <lutz.euler@fr...>  20120607 12:47:56

Hi, I wrote: > I have prepared two patches: The first marks the test skippedon nonx86 > and adds a comment as to why, and the second one adds another test, > skippedon x86, that tests the libm rangereduction. I have just committed these two. I am curious to learn if any combinations of backend and operating system fail the second test (the one added with commit 4de15256dd6387e5) as I tested it only on x8664 under Linux. Greetings, Lutz 