From: Daniel B. <da...@te...> - 2002-01-28 01:47:43
|
I have at least one problem - possibly two - with floating point on PPC:

1) When compiling cross-float, it can't compute
LEAST-POSITIVE-SINGLE-FLOAT. The error is

  debugger invoked on condition of type SB-KERNEL:FLOATING-POINT-EXCEPTION:
    An arithmetic error SB-KERNEL:FLOATING-POINT-EXCEPTION was signalled.
  No traps are enabled? How can this be?

and the backtrace goes

  0: (SB-VM:SIGFPE-HANDLER
      #<SB-DEBUG::UNPRINTABLE-OBJECT unavailable argument>
      #<SB-DEBUG::UNPRINTABLE-OBJECT unavailable argument>
      #<SB-DEBUG::UNPRINTABLE-OBJECT unavailable argument>)
  1: ("foreign function call land")
  2: (EXPT 2.0 -127)
  3: (MAKE-SINGLE-FLOAT 1)
  4: (EVAL (SINGLE-FROM-BITS 0 0 1))
  5: (EVAL (SB!C::%DEFCONSTANT 'LEAST-POSITIVE-SINGLE-FLOAT # 'NIL))

The same error can be produced by calling (expt 2.0 -127) interactively

I did look at the cmucl-imp messages referred to, but I don't believe
it's the same problem, as interactively calling
(sb-c::safe-expt 2.0 -127) returns NIL

2) 'No traps are enabled' above seems to be because the floating point
modes are reset when signals happen, and not reinitialized by the
signal handlers. I'm not sure if this is a real problem, because
nobody has yet complained that x86 does exactly the same thing -

  * (sb-int:get-floating-point-modes)
  (:TRAPS (:OVERFLOW :INVALID :DIVIDE-BY-ZERO) :ROUNDING-MODE :NEAREST
   :CURRENT-EXCEPTIONS (:INEXACT) :ACCRUED-EXCEPTIONS (:INEXACT) :FAST-MODE NIL)
  * ^C
  debugger invoked on condition of type SIMPLE-CONDITION:
    interrupted at #X40111E1E
  0] (sb-int:get-floating-point-modes)
  (:TRAPS NIL :ROUNDING-MODE :NEAREST :CURRENT-EXCEPTIONS ...)

For what it's worth, the traps get restored correctly if I take the
CONTINUE restart, but stick at NIL if I ':r toplevel'

Any ideas? (1) is probably a bigger problem, because it's between me
and a PPC SBCL that can build itself.

-dan

--
http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources
|
From: Daniel B. <da...@te...> - 2002-01-28 06:17:31
|
William Harold Newman <wil...@ai...> writes:

> It's not completely clear which system you're referring to that has
[...]
> itself". So I'm guessing the exception is on the PPC.

Sorry, yes. This is on the PPC itself.

I've done a bit more prodding at it myself and realised that the exact
same problem can be made to happen on x86 if underflow traps are
enabled:

  * (sb-int:set-floating-point-modes :traps '(:overflow :underflow :invalid :divide-by-zero))
  * (sb-int:get-floating-point-modes)
  (:TRAPS (:UNDERFLOW :OVERFLOW :INVALID :DIVIDE-BY-ZERO) :ROUNDING-MODE :NEAREST
   :CURRENT-EXCEPTIONS (:INEXACT) :ACCRUED-EXCEPTIONS (:INEXACT) :FAST-MODE NIL)
  * (expt 2.0 -127)
  debugger invoked on condition of type SB-KERNEL:FLOATING-POINT-EXCEPTION:
  [etc]

so my workaround is to disable the underflow trap on PPC, on the
grounds that (1) it will I hope Make Things Work, (2) at least we then
have consistent behaviour between ports.

> I have no immediate idea how to fix the "no traps enabled" broken
> error-reporting aspect of the bug. It would probably benefit from some
> quality time spent with a Pentium architecture manual and the source
> to one or more of the *nix kernels that SBCL runs on. It's unlikely to
> going to get that from me anytime in the foreseeable future.

The list of enabled traps, when queried from the debugger, is NIL. I
guess that the kernel is resetting the fpu control word when calling
our signal handlers, then Lisp sigfpe-handler is looking at it and
saying "how did I get here?" Possibly something akin to the old x86
CMUCL restore-the-fpu-state-on-signal-handler-entry code needs doing.

> Actually, if it's correct that 1.17549434e-38 this is really the least
> positive normalized IEEE single float, then exactly half of that (i.e.
> (EXPT 2.0 -127), apparently) seems like a funny value to be constructing.
> Is it possible that you want (EXPT 2.0 -126)?

Well, we're calculating LEAST-POSITIVE-SINGLE-FLOAT, so we can
probably expect the final answer to be denormalized. I think it's
going to be tricky to get to that answer using only normal floats.

Maybe disabling underflow traps is the right answer, not just the
workaround.

Sigh. I've learnt more about IEEE floating point this evening than
I'd forgotten in a month of a Numerical Analysis course all those
years ago at university.

-dan

--
http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources
|
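To make the arithmetic in the exchange above concrete, here is a minimal C sketch (C only because the thread's one source excerpt is C; this is not SBCL code). It assumes `float` is an IEEE 754 single, default round-to-nearest, and no flush-to-zero mode. The names `pow2f` and `is_positive_subnormal` are hypothetical helpers introduced for illustration. It shows that 2^-126 is `FLT_MIN`, the least positive normalized single, while 2^-127 is exactly half of that and is representable only as a denormal, which is the kind of result the thread reports tripping an enabled underflow trap.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Hypothetical helper: 2^e as a float, built by repeated doubling or
 * halving. Every intermediate value is a power of two that a single
 * float can hold exactly, provided -149 <= e <= 127. */
float pow2f(int e)
{
    assert(e >= -149 && e <= 127);
    float x = 1.0f;
    for (; e > 0; e--)
        x *= 2.0f;
    for (; e < 0; e++)
        x *= 0.5f;
    return x;
}

/* Hypothetical helper: true when x is positive but smaller than the
 * least positive normalized float, i.e. a positive denormal. */
bool is_positive_subnormal(float x)
{
    return x > 0.0f && x < FLT_MIN;
}
```

With these definitions, `pow2f(-126)` compares equal to `FLT_MIN` and is not subnormal, whereas `pow2f(-127)` and `pow2f(-149)` both are.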
From: Raymond T. <to...@rt...> - 2002-01-28 14:48:04
|
>>>>> "Daniel" == Daniel Barlow <da...@te...> writes:

    Daniel> I've done a bit more prodding at it myself and realised that the exact same
    Daniel> problem can be made to happen on x86 if underflow traps are enabled:

    Daniel> * (sb-int:set-floating-point-modes :traps '(:overflow :underflow :invalid :divide-by-zero))
    Daniel> * (sb-int:get-floating-point-modes)
    Daniel> (:TRAPS (:UNDERFLOW :OVERFLOW :INVALID :DIVIDE-BY-ZERO) :ROUNDING-MODE :NEAREST
    Daniel> :CURRENT-EXCEPTIONS (:INEXACT) :ACCRUED-EXCEPTIONS (:INEXACT) :FAST-MODE NIL)
    Daniel> * (expt 2.0 -127)
    Daniel> debugger invoked on condition of type SB-KERNEL:FLOATING-POINT-EXCEPTION:
    Daniel> [etc]

What does the backtrace say? On my CMUCL x86, the backtrace shows
unix:write in there, but no expt. (Last time I saw this, there was a
missing fwait somewhere.)

    >> Actually, if it's correct that 1.17549434e-38 this is really the least
    >> positive normalized IEEE single float, then exactly half of that (i.e.
    >> (EXPT 2.0 -127), apparently) seems like a funny value to be constructing.
    >> Is it possible that you want (EXPT 2.0 -126)?

least-positive-normalized-single-float is (expt 2.0 -126) on sparc,
and my Sparc manual says the same.

    Daniel> Well, we're calculating LEAST-POSITIVE-SINGLE-FLOAT, so we can
    Daniel> probably expect the final answer to be denormalized. I think it's
    Daniel> going to be tricky to get to that answer using only normal floats.

Yes. On sparc, this is (expt 2.0 -149), and I think this is right
since the smallest exponent is -126 and fraction part should be
2^(-23) for this denormalized number.

    Daniel> Maybe disabling underflow traps is the right answer, not
    Daniel> just the workaround.

Do you mean during the calculation or in general? Doesn't SBCL create
these values just by smashing the appropriate bits into a single
float? If that's true, how does that help?

Ray
|
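The two routes Ray contrasts above can be sketched in C. This is an illustration under assumptions, not SBCL's or CMUCL's implementation: it assumes `float` is a 32-bit IEEE 754 single, and both function names are hypothetical. `single_from_raw_bits` "smashes the bits" into a float with no arithmetic at all. `single_from_fields` takes the arithmetic route, loosely in the spirit of the (SINGLE-FROM-BITS 0 0 1) frame in the backtrace, and has to scale through denormal values to reach 2^-149 = 2^-126 * 2^-23.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a single float; no floating-point
 * arithmetic is performed, so no underflow can be signalled here. */
float single_from_raw_bits(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical arithmetic route: compute the value of a finite single
 * from its sign, biased exponent, and 23-bit fraction fields. */
float single_from_fields(unsigned sign, unsigned exponent, uint32_t fraction)
{
    assert(sign <= 1 && exponent < 0xFF && fraction < (1u << 23));
    /* Denormals (exponent field 0) have no hidden bit and use the same
     * scale as exponent field 1, namely 2^(1 - 127 - 23) = 2^-149. */
    uint32_t significand = exponent ? (fraction | (1u << 23)) : fraction;
    int scale = (exponent ? (int)exponent : 1) - 127 - 23;
    float x = (float)significand;   /* exact: fits in 24 bits */
    for (; scale > 0; scale--)
        x *= 2.0f;
    for (; scale < 0; scale++)
        x *= 0.5f;                  /* passes through tiny values */
    return sign ? -x : x;
}
```

Both routes agree on the result, for example `single_from_raw_bits(1u) == single_from_fields(0, 0, 1)`. Only the arithmetic one produces tiny intermediate results along the way, which is where an enabled underflow trap could plausibly interfere.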
From: Daniel B. <da...@te...> - 2002-01-28 06:23:38
|
William Harold Newman <wil...@ai...> writes:

> The "No traps are enabled? How can this be?" thing was, in my mind
> anyway, a known bug. However, I never realized it didn't appear in
> BUGS. (It's bug 146 now. (EXPT 2.0 -127) doesn't suffice to exercise
> the bug on x86/OpenBSD, but (EXPT 2.0 123456) does.)

I note also bug 45 ("a slew of floating point errors reported by
Peter Van Eynde"). Peter, do you have the tests that these arose
from?

(I strongly suspect that fp handling on the alpha is also suboptimal;
while I have signal handlers and the rudiments of ieee 754 in neural
cache, this might be a good time to take a look at it all)

-dan

--
http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources
|
From: Peter V. E. <pva...@de...> - 2002-01-29 20:51:56
|
On Mon, Jan 28, 2002 at 06:24:56AM +0000, Daniel Barlow wrote:
> William Harold Newman <wil...@ai...> writes:
>
> > The "No traps are enabled? How can this be?" thing was, in my mind
> > anyway, a known bug. However, I never realized it didn't appear in
> > BUGS. (It's bug 146 now. (EXPT 2.0 -127) doesn't suffice to exercise
> > the bug on x86/OpenBSD, but (EXPT 2.0 123456) does.)
>
> I note also bug 45 ("a slew of floating point errors reported by
> Peter Van Eynde"). Peter, do you have the tests that these arose
> from?

IIRC they are results from an older version of ansi-test (part of
clocc).

Please note that some versions of Linux were/are so dumb as to clear
the FPU control word on creating a signal handler, thus nuking your
reason to have the signal.

See:
cmucl/src/lisp/x86-arch.c:

  void
  sigtrap_handler(HANDLER_ARGS)
  {
  ...
  #if defined(__linux__) && defined(i386)
      /*
       * Restore the FPU control word, setting the rounding mode to nearest.
       */

      if (contextstruct.fpstate)
          setfpucw(contextstruct.fpstate->cw & ~0xc00);
  #endif
  ...

Maybe something similar is happening?

Groetjes, Peter

--
It's logic Jim, but not as we know it. | pva...@de...
"God, root, what is difference?" - Pitr|
"God is more forgiving." - Dave Aronson| http://cvs2.cons.org/~pvaneynd/
|
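For readers unfamiliar with the mask in the excerpt above: `& ~0xc00` clears bits 10 and 11 of the x87 control word, the rounding-control field, and a value of zero there selects round-to-nearest. The low six bits are the exception masks, where a set bit means the exception is masked and does not trap. The following is a small sketch of that layout as described in the Intel architecture manuals. The enum and function names are hypothetical, and the code only manipulates integers; it does not read or write the real FPU state.

```c
#include <assert.h>
#include <stdint.h>

/* x87 control word fields. A set mask bit suppresses the trap. */
enum {
    X87_MASK_INVALID   = 0x0001,
    X87_MASK_DENORMAL  = 0x0002,
    X87_MASK_ZERODIV   = 0x0004,
    X87_MASK_OVERFLOW  = 0x0008,
    X87_MASK_UNDERFLOW = 0x0010,
    X87_MASK_PRECISION = 0x0020,
    X87_ROUNDING_FIELD = 0x0c00   /* bits 10-11; 00 = round to nearest */
};

/* Same masking as in the excerpt: keep everything except the rounding
 * field, which is forced to round-to-nearest. */
uint16_t cw_round_to_nearest(uint16_t cw)
{
    return (uint16_t)(cw & ~X87_ROUNDING_FIELD);
}

/* Non-zero when the underflow exception is unmasked, i.e. would trap. */
int cw_underflow_traps(uint16_t cw)
{
    return (cw & X87_MASK_UNDERFLOW) == 0;
}

/* Usage: 0x037F is the power-on default, with all exceptions masked. */
void control_word_demo(void)
{
    assert(cw_round_to_nearest(0x0F7F) == 0x037F); /* truncate -> nearest */
    assert(!cw_underflow_traps(0x037F));
    assert(cw_underflow_traps(0x037F & ~X87_MASK_UNDERFLOW));
}
```

Note that the masking leaves the exception-mask bits of the saved control word untouched, so whichever traps were enabled before the signal stay enabled after the restore.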