From: Bruce O'N. <ec...@pc...> - 2010-05-02 11:25:13
|
Hi, Thanks very much for all of the PPC patches! That's very nice and now nicely explains why I was having so many problems with my programs that used big hash tables. Excellent! I also have OpenBSD/PPC working, the patches to 1.0.38.5 are at: http://www.pckswarms.ch/sbcl/ppc-openbsd-4.6-patch-20100430.tar.gz cheers bruce |
From: Alastair B. <ala...@gm...> - 2010-05-02 15:31:46
|
Hello,

On Sun, May 2, 2010 at 7:24 AM, Bruce O'Neel <ec...@pc...> wrote:
> Hi,
>
> Thanks very much for all of the PPC patches! That's very nice
> and now nicely explains why I was having so many problems with my
> programs that used big hash tables. Excellent!

Not a problem. I just got a G5 about a fortnight ago, and have been trying to clear up the "obvious" problems that I've run into. Mostly failures in the test-suite, although the GC was fairly obvious as well.

Do you have a list of PPC-related problems that could do with looking at?

> I also have OpenBSD/PPC working, the patches to 1.0.38.5 are at:
>
> http://www.pckswarms.ch/sbcl/ppc-openbsd-4.6-patch-20100430.tar.gz

I'm afraid that these have two obvious problems:

1. The accepted patch format is a "unified context diff", not replacement files.

2. These changes very obviously break a number of other platforms, such as every non-openbsd gencgc ppc target. And that's just from src/compiler/ppc/parms.lisp. The other changed source files have similar breakages.

An additional problem is that I cannot run OpenBSD on my system (it supports neither my hard drive controller nor my network adaptor), so I can't really clean these up myself.

> cheers
>
> bruce

-- Alastair Bridgewater |
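[Editor's sketch, not part of the original message.] For readers unfamiliar with the patch format requested above: a unified diff records only the changed lines plus a few lines of context, rather than shipping whole replacement files. On the command line this is normally produced with `diff -u`. The Python sketch below shows the same format using the standard-library `difflib` module. The file name and file contents are made up for illustration and have nothing to do with the actual SBCL sources.

```python
import difflib


def make_unified_diff(old_text, new_text, path):
    """Return a unified diff that turns old_text into new_text for one file."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="a/" + path,
            tofile="b/" + path,
        )
    )


# Hypothetical file contents, for illustration only.
old = "line one\nline two\nline three\n"
new = "line one\nline 2\nline three\n"

patch = make_unified_diff(old, new, "example/file.txt")
print(patch)
```

The printed patch contains the `---`/`+++` file headers, an `@@` hunk header, and the changed line marked with `-` and `+`, which is what makes such a patch reviewable without the full files.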
From: Bruce O'N. <ec...@pc...> - 2010-05-02 19:28:26
|
Hi, Thanks, I'll clean up the patches. This was more of a quick hack than something as careful as necessary. I'll also make them diffs. I have them as replacement files since that's easier for my build. cheers bruce |
From: Josh E. <jo...@el...> - 2010-05-02 17:57:30
|
On Sun, May 02, 2010 at 01:24:59PM +0200, Bruce O'Neel wrote:
> I also have OpenBSD/PPC working, the patches to 1.0.38.5 are at:
>
> http://www.pckswarms.ch/sbcl/ppc-openbsd-4.6-patch-20100430.tar.gz

It's good to see someone else interested. I got this self-hosting some time ago but was unable to resolve some issues with floating point exceptions.

Taking a quick look at your patch, a number of things jump out at me.

In Config.ppc-openbsd, is there a reason you're building undefineds.c? Or not linking against libutil?

In parms.lisp, perhaps you could include a comment explaining why you chose those particular addresses. Given that those numbers are highly magic, a little explanation never hurts.

I didn't look at any of the others. Perhaps you could reroll that as a unified diff for readability.

It's been a little while so I don't remember exactly what did and didn't work, but did you have any trouble with the floating point tests? |
From: Bruce O'N. <ec...@pc...> - 2010-05-02 19:31:36
|
Hi, This came heavily from staring at the OpenBSD x86 port and looking at the NetBSD/PPC port. I figured those were the closest. And yes, the magic numbers were from the OpenBSD x86 port. Some poking at header files led me to believe that the memory layout was quite similar. I've never had floating point problems, but I've not pushed it too hard. I don't run the test suite since traditionally a chunk of it seemed to randomly fail on ppc. Alastair seems to have fixed a lot of these so I'll start trying the tests. Thanks! cheers bruce |
From: Alastair B. <ala...@gm...> - 2010-05-02 21:11:22
|
Hello,

On Sun, May 2, 2010 at 3:31 PM, Bruce O'Neel <ec...@pc...> wrote:
> I've never had floating point problems, but, I've not pushed it too
> hard. I don't run the test suite since traditionally a chunk of it
> seemed to randomly fail on ppc. Alastair seems to have fixed
> a lot of these so I'll start trying the tests.

If you're going to start working with the test suite, I recommend learning what the current baselines are for test results on various platforms. x86/linux, for example, should only be failing three tests, and x86-64/linux failing four. ppc/linux... That's another story.

> On Sun, May 02, 2010 at 10:57:23AM -0700, Josh Elsasser wrote:
>> It's good to see someone else interested, I got this self-hosting some
>> time ago but was unable to resolve some issues with floating point
>> exceptions.
<snip>
>> It's been a little while so I don't remember exactly what did and
>> didn't work, but did you have any trouble with the floating point
>> tests?

I still get two float.pure.lisp failures on ppc/linux:

Expected failure: float.pure.lisp / (SCALE-FLOAT-OVERFLOW BUG-372)
Expected failure: float.pure.lisp / (ADDITION-OVERFLOW BUG-372)

And I think there might be intermittent floaty exception failures on one or both of the x86oid linux backends, but don't remember which or how to trigger them. Were these what you were seeing, or something more alarming?

-- Alastair Bridgewater |
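[Editor's sketch, not part of the original message.] The advice above about knowing the per-platform baseline can be mechanised: keep the status lines from a known-good run and report only the entries that differ in a new run. The Python sketch below assumes status lines have the shape shown in the message ("Expected failure: file / test-name"). The baseline uses the two ppc/linux lines quoted above; the "current" run and its extra test name are invented purely for illustration.

```python
import re

# Matches status lines such as
#   "Expected failure: float.pure.lisp / (ADDITION-OVERFLOW BUG-372)"
STATUS_RE = re.compile(
    r"^\s*(Unexpected success|Expected failure|Failure):\s+(\S+)\s+/\s+(.+?)\s*$"
)


def parse_status(report):
    """Map (test file, test name) -> outcome for each status line in report."""
    results = {}
    for line in report.splitlines():
        match = STATUS_RE.match(line)
        if match:
            outcome, test_file, test_name = match.groups()
            results[(test_file, test_name)] = outcome
    return results


def not_in_baseline(baseline, current):
    """Return the entries of current whose outcome differs from the baseline."""
    return sorted(
        (key, outcome)
        for key, outcome in current.items()
        if baseline.get(key) != outcome
    )


# Baseline: the two ppc/linux lines quoted in the message above.
baseline = parse_status("""
Expected failure: float.pure.lisp / (SCALE-FLOAT-OVERFLOW BUG-372)
Expected failure: float.pure.lisp / (ADDITION-OVERFLOW BUG-372)
""")

# Hypothetical new run, invented for this example.
current = parse_status("""
Expected failure: float.pure.lisp / (SCALE-FLOAT-OVERFLOW BUG-372)
Expected failure: float.pure.lisp / (ADDITION-OVERFLOW BUG-372)
Failure: some-test.impure.lisp / HYPOTHETICAL-TEST
""")

for (test_file, test_name), outcome in not_in_baseline(baseline, current):
    print(f"{outcome}: {test_file} / {test_name}")
```

Only the made-up third line is reported, since the two expected failures already appear in the baseline.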
From: Josh E. <jo...@el...> - 2010-05-02 21:27:38
|
On Sun, May 02, 2010 at 05:11:15PM -0400, Alastair Bridgewater wrote:
> >> It's been a little while so I don't remember exactly what did and
> >> didn't work, but did you have any trouble with the floating point
> >> tests?
>
> I still get two float.pure.lisp failures on ppc/linux:
> Expected failure: float.pure.lisp / (SCALE-FLOAT-OVERFLOW BUG-372)
> Expected failure: float.pure.lisp / (ADDITION-OVERFLOW BUG-372)
> And I think there might be intermittent floaty exception failures on
> one or both of the x86oid linux backends, but don't remember which or
> how to trigger them. Were these what you were seeing, or something
> more alarming?

If they failed on linux then I probably wasn't too concerned about making them pass on openbsd. What I recall is that floating point exceptions were never signaled at all, but I don't remember any details. I'll see if I can take a look at it again sometime soon. |
From: Alastair B. <ala...@gm...> - 2010-05-07 03:52:01
|
Hello,

[re-ccing sbcl-devel, because this really should be discussed there.]

On Thu, May 6, 2010 at 10:32 PM, Josh Elsasser <jo...@el...> wrote:
> I cleaned up and did some quick testing of my old PPC branch, it
> doesn't look hugely different than your changes. The most significant
> change would probably be the address space locations. I don't remember
> the exact reason I used the addresses I did, but I remember being
> concerned that someone might build a kernel with MAXDSIZ bumped up to
> 1GB like it is on i386.
>
> Here's a diff of my changes against 1.0.38.5:
>
> http://www.elsasser.org/misc/sbcl-obsd-ppc.diff

This, I mostly like. Two questions, though. First, would you mind resolving the comment "XXX JRE test this with a 1GB MAXDSIZ kernel" for the heap space parameters? Even just re-wording it to explain what the concern is would be an improvement over the rather cryptic note there. Second, can you explain the logic behind the test/foreign.test.sh change?

Beyond that, I'd be inclined to commit this if you and Bruce can agree that it works.

> There are several dynamic-extent test failures which I don't remember
> from before, I had to disable one of them to avoid dropping into the
> debugger during the test run. The failing timer tests are normal for
> OpenBSD.

I can confirm that everything but the timer and float errors are normal for PPC/linux. I'm not sure about the float tests, as I haven't looked deeply enough at floating point in general to start diagnosing it.

As far as the other test results go, you might like to know why they fail:

> Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-346)
> Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-353)

These two tests fail because the undefined-function trampoline function is not considered a valid lisp pointer by the debugger, thus causing the frame to show up as "bogus stack frame" instead of "undefined function" in the backtrace. The bug-346 case also fails because the heuristic used for detecting an incompletely-set-up stack frame doesn't work reliably and causes truncated backtraces when it fails.

> Expected failure: debug.impure.lisp / (TRACE ENCAPSULATE NIL)
> Expected failure: debug.impure.lisp / (TRACE-RECURSIVE ENCAPSULATE NIL)

The breakpoint functionality that underlies these two tests was never properly implemented for PPC.

> Failure: dynamic-extent.impure.lisp / (NO-CONSING DX-RAW-INSTANCES)
> Failure: dynamic-extent.impure.lisp / DX-COMPILER-NOTES
> Failure: dynamic-extent.impure.lisp / HANDLER-CASE-EATING-STACK
> Failure: dynamic-extent.impure.lisp / RECHECK-NESTED-DX-BUG

I suspect that these, and the disabled bogus-compiler-note test, are from changes made for x86oid dynamic-extent that were never implemented for PPC (or for other non-x86oid platforms, most likely). The dropping-to-debugger thing is at least partly from 1.0.37.47, when impure tests were run via run-program instead of fork() across the board (I have a partial fix in one of my trees that causes an unhandled exception instead of dropping to debugger, which at least prevents the test suite from hanging).

> Expected failure: packages.impure.lisp / USE-PACKAGE-CONFLICT-SET
> Expected failure: packages.impure.lisp / IMPORT-SINGLE-CONFLICT

These are both set :fails-on :sbcl. They are testing for the behavior of name-conflicts between more than two symbols with the same name at once, and expect a slightly nicer user experience than SBCL provides.

> Expected failure: run-program.impure.lisp / (RUN-PROGRAM INHERIT-STDIN)

This also dates back to 1.0.37.47, something about SIGTTIN and process groups, and other stuff that I've been able to make neither heads nor tails of.

Anyway, that's my impression of the proposed patch and an explanation of what's what with the test suite.

-- Alastair Bridgewater |
From: Josh E. <jo...@el...> - 2010-05-07 06:08:14
|
On Thu, May 06, 2010 at 11:51:53PM -0400, Alastair Bridgewater wrote:
> Hello,
>
> [re-ccing sbcl-devel, because this really should be discussed there.]
>
> On Thu, May 6, 2010 at 10:32 PM, Josh Elsasser <jo...@el...> wrote:
> > I cleaned up and did some quick testing of my old PPC branch, it
> > doesn't look hugely different than your changes. The most
> > significant change would probably be the address space
> > locations. I don't remember the exact reason I used the addresses
> > I did, but I remember being concerned that someone might build a
> > kernel with MAXDSIZ bumped up to 1GB like it is on i386.
> >
> > Here's a diff of my changes against 1.0.38.5:
> >
> > http://www.elsasser.org/misc/sbcl-obsd-ppc.diff
>
> This, I mostly like. Two questions, though. First, would you mind
> resolving the comment "XXX JRE test this with a 1GB MAXDSIZ kernel"
> for the heap space parameters? Even just re-wording it to explain
> what the concern is would be an improvement over the rather cryptic
> note there.

That was a note to me to test with a kernel built with the 512MB hard data size limit bumped to 1GB, which affects how virtual address space is laid out. It's probably silly to worry about this though.

> Second, can you explain the logic behind the test/foreign.test.sh
> change?

On OpenBSD that test fails spectacularly if the library isn't built with -fPIC. Is this not necessary on other PPC platforms?

> Beyond that, I'd be inclined to commit this if you and Bruce can
> agree that it works.
>
> > There are several dynamic-extent test failures which I don't
> > remember from before, I had to disable one of them to avoid
> > dropping into the debugger during the test run. The failing timer
> > tests are normal for OpenBSD.
>
> I can confirm that everything but the timer and float errors are
> normal for PPC/linux. I'm not sure about the float tests, as I
> haven't looked deeply enough at floating point in general to start
> diagnosing it.

I'll see if I can look into the float problems again. I remember it had something to do with float-related exceptions never being signaled, perhaps because SIGFPE was never delivered.

> As far as the other test results go, you might like to know why they fail:
>
> > Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-346)
> > Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-353)
>
> These two tests fail because the undefined-function trampoline
> function is not considered a valid lisp pointer by the debugger,
> thus causing the frame to show up as "bogus stack frame" instead of
> "undefined function" in the backtrace. The bug-346 case also fails
> because the heuristic used for detecting an incompletely-set-up
> stack frame doesn't work reliably and causes truncated backtraces
> when it fails.
>
> > Expected failure: debug.impure.lisp / (TRACE ENCAPSULATE NIL)
> > Expected failure: debug.impure.lisp / (TRACE-RECURSIVE ENCAPSULATE NIL)
>
> The breakpoint functionality that underlies these two tests was never properly
> implemented for PPC.
>
> > Failure: dynamic-extent.impure.lisp / (NO-CONSING DX-RAW-INSTANCES)
> > Failure: dynamic-extent.impure.lisp / DX-COMPILER-NOTES
> > Failure: dynamic-extent.impure.lisp / HANDLER-CASE-EATING-STACK
> > Failure: dynamic-extent.impure.lisp / RECHECK-NESTED-DX-BUG
>
> I suspect that these, and the disabled bogus-compiler-note test, are
> from changes made for x86oid dynamic-extent that were never
> implemented for PPC (or for other non-x86oid platforms, most
> likely). The dropping-to-debugger thing is at least partly from
> 1.0.37.47, when impure tests were run via run-program instead of
> fork() across the board (I have a partial fix in one of my trees
> that causes an unhandled exception instead of dropping to debugger,
> which at least prevents the test suite from hanging).

Avoiding the debugger would be nice, especially for someone who does automated build and test runs.

> > Expected failure: packages.impure.lisp / USE-PACKAGE-CONFLICT-SET
> > Expected failure: packages.impure.lisp / IMPORT-SINGLE-CONFLICT
>
> These are both set :fails-on :sbcl. They are testing for the
> behavior of name-conflicts between more than two symbols with the
> same name at once, and expect a slightly nicer user experience than
> SBCL provides.
>
> > Expected failure: run-program.impure.lisp / (RUN-PROGRAM INHERIT-STDIN)
>
> This also dates back to 1.0.37.47, something about SIGTTIN and
> process groups, and other stuff that I've been able to make neither
> heads nor tails of.
>
> Anyway, that's my impression of the proposed patch and an
> explanation of what's what with the test suite.

Thanks for the rundown. I was mostly concerned about OpenBSD-specific test failures, but I suppose it would also be nice to try and fight the PPC bitrot a bit and fix some of those other test failures. |
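[Editor's sketch, not part of the original message.] One way to read the MAXDSIZ concern discussed above: if the kernel sets aside MAXDSIZ bytes of address space for data segment growth, then a Lisp heap mapped at a fixed address is only safe while it lies outside that range, and raising the limit from 512MB to 1GB could turn a safe address into a conflicting one. That reading is an assumption on the editor's part, and every address below is a made-up placeholder, not a value from either patch or from any OpenBSD kernel.

```python
MB = 1024 * 1024


def overlaps(start_a, size_a, start_b, size_b):
    """True if the half-open ranges [start, start + size) intersect."""
    return start_a < start_b + size_b and start_b < start_a + size_a


# Hypothetical placeholders, chosen only to make the arithmetic easy to follow.
DATA_START = 0x10000000   # assumed start of the data segment (256MB)
HEAP_START = 0x40000000   # made-up fixed heap base, 768MB above DATA_START
HEAP_SIZE = 512 * MB      # made-up heap size

for maxdsiz in (512 * MB, 1024 * MB):
    clash = overlaps(DATA_START, maxdsiz, HEAP_START, HEAP_SIZE)
    print(f"MAXDSIZ={maxdsiz // MB}MB overlap={clash}")
```

With these placeholder numbers the 512MB limit leaves the heap clear of the data range, while the 1GB limit makes the two ranges intersect, which is the kind of situation the XXX comment appears to be guarding against.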
From: Bruce O'N. <ec...@pc...> - 2010-05-07 07:43:17
|
Hi, I think this looks good. I'll test it tonight my time (UTC+2), but as Josh already said, his changes and my changes are basically the same except for the memory layout, so I don't think there are any problems. Thanks! cheers bruce |
From: Bruce O'N. <ec...@pc...> - 2010-05-08 10:59:41
|
Hi, I've tested this on my OpenBSD/PPC system and three different Linux PPC systems running two different distros and it looks good. Thanks very much! cheers bruce |
From: Bruce O'N. <ec...@pc...> - 2010-05-21 07:51:52
|
Hi,

Using Josh's patch below:

http://www.elsasser.org/misc/sbcl-obsd-ppc.diff

produces a working sbcl on OpenBSD/PPC 4.7.

I've still not gotten back to the float test oddities yet.

Thanks!

cheers

bruce

----- Original message -----
From: Josh Elsasser <jo...@el...>
Date: Thu, 6 May 2010 19:32:16 -0700
Subject: Re: [Sbcl-devel] Thanks very much for all the PPC patches, plus OpenBSD
To: "Bruce O'Neel" <ec...@pc...>
Cc: Alastair Bridgewater <ala...@gm...>

On Tue, May 04, 2010 at 10:19:48PM +0200, Bruce O'Neel wrote:
> Hi,
>
> On Sun, May 02, 2010 at 11:31:38AM -0400, Alastair Bridgewater wrote:
> >
> > > I also have OpenBSD/PPC working, the patches to 1.0.38.5 are at:
> > >
> > > http://www.pckswarms.ch/sbcl/ppc-openbsd-4.6-patch-20100430.tar.gz
> >
> > I'm afraid that these have two obvious problems:
> >
> > 1. The accepted patch format is a "unified context diff", not
> > replacement files.
>
> I have made a much cleaned up patch set with just patches, and one
> complete new file.
>
> http://www.pckswarms.ch/sbcl/ppc-openbsd-4.6-patch-20100504-small.tar.gz
>
> >
> > 2. These changes very obviously break a number of other platforms,
> > such as every non-openbsd gencgc ppc target. And that's just from
> > src/compiler/ppc/parms.lisp. The other changed source files have
> > similar breakages.
>
> These patches should be much better. I've tested it on both Linux/PPC
> and OpenBSD/PPC with the same patches. I also documented the magic
> numbers.
>
> Would you both mind looking them over and see if they look ok?

I cleaned up and did some quick testing of my old PPC branch, it doesn't look hugely different than your changes. The most significant change would probably be the address space locations. I don't remember the exact reason I used the addresses I did, but I remember being concerned that someone might build a kernel with MAXDSIZ bumped up to 1GB like it is on i386.

Here's a diff of my changes against 1.0.38.5:

http://www.elsasser.org/misc/sbcl-obsd-ppc.diff

There are several dynamic-extent test failures which I don't remember from before, I had to disable one of them to avoid dropping into the debugger during the test run. The failing timer tests are normal for OpenBSD.

Finished running tests.
Status:
 Unexpected success: float.pure.lisp / (SCALE-FLOAT-OVERFLOW BUG-372)
 Expected failure: float.pure.lisp / (ADDITION-OVERFLOW BUG-372)
 Failure: float.pure.lisp / NAN-COMPARISONS
 Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-346)
 Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-353)
 Expected failure: debug.impure.lisp / (TRACE ENCAPSULATE NIL)
 Expected failure: debug.impure.lisp / (TRACE-RECURSIVE ENCAPSULATE NIL)
 Failure: dynamic-extent.impure.lisp / (NO-CONSING DX-RAW-INSTANCES)
 Failure: dynamic-extent.impure.lisp / DX-COMPILER-NOTES
 Failure: dynamic-extent.impure.lisp / HANDLER-CASE-EATING-STACK
 Failure: dynamic-extent.impure.lisp / RECHECK-NESTED-DX-BUG
 Expected failure: packages.impure.lisp / USE-PACKAGE-CONFLICT-SET
 Expected failure: packages.impure.lisp / IMPORT-SINGLE-CONFLICT
 Expected failure: run-program.impure.lisp / (RUN-PROGRAM INHERIT-STDIN)
 Failure: timer.impure.lisp / (TIMER STRESS)
 Failure: timer.impure.lisp / (WITH-TIMEOUT TIMEOUT)
test failed, expected 104 return code, got 1 |
From: Josh E. <jo...@el...> - 2010-05-21 14:16:42
|
On Fri, May 21, 2010 at 09:51:39AM +0200, Bruce O'Neel wrote:
> Hi,
>
> Using Josh's patch below:
>
> http://www.elsasser.org/misc/sbcl-obsd-ppc.diff
>
> produces a working sbcl on OpenBSD/PPC 4.7.
>
> I've still not gotten back to the float test oddities yet.

The failing NaN tests at least appear to be a problem with OpenBSD and need to be fixed there instead of in SBCL. I had been working on a fix but was sidetracked; I'll see if I can get it working soon.

I suppose as far as changes to SBCL are concerned, that patch is pretty good. The XXX comment in src/compiler/ppc/parms.lisp can be removed; if it ever becomes an issue then I or someone else can fix it then.

Something better should probably be done about that handler-case-bogus-compiler-note test in dynamic-extent.impure.lisp which hangs SBCL, unless the recent test infrastructure changes already took care of it. |
From: Alastair B. <ala...@gm...> - 2010-05-24 02:56:17
|
On Fri, May 21, 2010 at 10:16 AM, Josh Elsasser <jo...@el...> wrote:
> On Fri, May 21, 2010 at 09:51:39AM +0200, Bruce O'Neel wrote:
>> I've still not gotten back to the float test oddities yet.
>
> The failing NaN tests at least appear to be a problem with
> OpenBSD and need to be fixed there instead of in SBCL. I had been
> working on a fix but was sidetracked, I'll see if I can get it working
> soon.

Okay, so here's the current status on all this: It all works on Linux. The only bit that doesn't work on OpenBSD is SIGFPE. SIGFPE is possibly completely broken on OpenBSD. As of 1.0.38.12, there should be one expected float.pure.lisp failure.

Josh, I found your patch for fixing up SIGFPE, and I have an observation for you and a question. My observation is that the signal delivery mechanism does not appear, to my (admittedly hasty) reading, to consume any stack or other process resources until the stub in sys/arch/macppc/macppc/locore.S (sigcode) has executed at least one instruction. My question is, did you remember to clear the accrued exception fields in the fpscr before returning to userland? If they (or the trap enables) aren't cleared, then you can obtain an infinite loop that way.

Other than that, if things still work as of 1.0.38.12, I think we're done.

--Alastair Bridgewater |
From: Alastair B. <ala...@gm...> - 2010-05-21 23:30:09
|
Hello,

I've just put together the version of the patch that I'm planning to commit. Unfortunately, I can't seem to access repo.or.cz right now, so I put the patch at <http://www.lisphacker.com/temp/sbcl-openbsd-ppc.diff>. I made a few tweaks here and there, but nothing too egregious.

Barring being told "no, no, it doesn't work now" or code-freeze, I intend to commit this at some point on Sunday, the 23rd. If there's something else I should be doing beyond adding a NEWS snippet, now would be the time for another SBCL maintainer to mention it.

On Fri, May 21, 2010 at 10:16 AM, Josh Elsasser <jo...@el...> wrote:
> The failing NaN tests at least appear to be a problem with
> OpenBSD and need to be fixed there instead of in SBCL. I had been
> working on a fix but was sidetracked, I'll see if I can get it working
> soon.

I looked at these briefly earlier today, and it looks like the traps are being disabled for some strange reason on my linux system, which worries me, but indicates to me that it might be more an SBCL problem than a Linux problem. Don't let my conclusions stop you from approaching it as an OpenBSD problem, though.

> I suppose as far as changes to SBCL are concerned, that patch is
> pretty good. The XXX comment in src/compiler/ppc/parms.lisp can be
> removed, if it ever becomes an issue then I or someone else can fix it
> then.

I rewrote it as a FIXME comment with a bit of the background explanation.

> Something better should probably be done about that
> handler-case-bogus-compiler-note test in dynamic-extent.impure.lisp
> which hangs SBCL, unless the recent test infrastructure changes
> already took care of it.

They did. The problem was that a COMPILER-NOTE isn't an ERROR, so it slipped through the usual net.

What I didn't apply was the -fPIC thing for foreign.test.sh, as it doesn't seem to be necessary for PPC/Linux. I'd be happier with a special case similar to what is done for Darwin.

--Alastair Bridgewater |
From: Bruce O'N. <ec...@pc...> - 2010-05-23 11:51:54
|
Hi, On Fri, May 21, 2010 at 07:29:58PM -0400, Alastair Bridgewater wrote: > Hello, > > I've just put together the version of the patch that I'm planning to commit. > Unfortunately, I can't seem to access repo.or.cz right now, so I put the > patch at <http://www.lisphacker.com/temp/sbcl-openbsd-ppc.diff>. I made > a few tweaks here and there, but nothing too egregious. This looks great. Thanks to both of you, this is very nice. cheers bruce |
From: Josh E. <jo...@el...> - 2010-05-22 07:42:48
|
On Fri, May 21, 2010 at 07:29:58PM -0400, Alastair Bridgewater wrote: > Hello, > > I've just put together the version of the patch that I'm planning to commit. > Unfortunately, I can't seem to access repo.or.cz right now, so I put the > patch at <http://www.lisphacker.com/temp/sbcl-openbsd-ppc.diff>. I made > a few tweaks here and there, but nothing too egregious. Thanks, looks fine to me aside from foreign.test.sh > Barring being told "no, no, it doesn't work now" or code-freeze, I intend to > commit this at some point on Sunday, the 23rd. If there's something else > I should be doing beyond adding a NEWS snippet, now would be the time > for another SBCL maintainer to mention it. > > On Fri, May 21, 2010 at 10:16 AM, Josh Elsasser <jo...@el...> wrote: > > The failing NaN tests at least appear to be to be a problem with > > OpenBSD and need to be fixed there instead of in SBCL. I had been > > working on a fix but was sidetracked, I'll see if I can get it working > > soon. > > I looked at these briefly earlier today, and it looks like the traps are being > disabled for some strange reason on my linux system, which worries me, > but indicates to me that it might be more an SBCL problem than a Linux > problem. Don't let my conclusions stop you from approaching it as an > OpenBSD problem, though. There certainly is an OpenBSD problem, but of course that doesn't mean there isn't also an SBCL problem. > > I suppose as far as changes to SBCL are concerned, that patch is > > pretty good. The XXX comment in src/compiler/ppc/parms.lisp can be > > removed, if it ever becomes an issue then I or someone else can fix it > > then. > > I rewrote it as a FIXME comment with a bit of the background explanation. Works for me, it's not like anyone will ever run into it unless the default macppc MAXDSIZ changes. 
> > Something better should probably be done about that > > handler-case-bogus-compiler-note test in dynamic-extent.impure.lisp > > which hangs SBCL, unless the recent test infrastructure changes > > already took care of it. > > They did. The problem was that a COMPILER-NOTE isn't an ERROR, so > it slipped through the usual net. > > What I didn't apply was the -fPIC thing for foreign.test.sh, as it doesn't > seem to be necessary for PPC/Linux. I'd be happier with a special case > similar to what is done for Darwin. Unpatched, foreign.test.sh fails quite spectacularly on OpenBSD/macppc. When you say special case, do you mean something like this: http://www.elsasser.org/misc/jre-sbcl-ppc-foreign-test.diff For the record, here are my test results with sbcl-openbsd-ppc.diff and jre-sbcl-ppc-foreign-test.diff applied: Finished running tests. Status: Unexpected success: float.pure.lisp / (SCALE-FLOAT-OVERFLOW BUG-372) Expected failure: float.pure.lisp / (ADDITION-OVERFLOW BUG-372) Failure: float.pure.lisp / NAN-COMPARISONS Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-346) Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-353) Expected failure: debug.impure.lisp / (TRACE ENCAPSULATE NIL) Expected failure: debug.impure.lisp / (TRACE-RECURSIVE ENCAPSULATE NIL) Expected failure: dynamic-extent.impure.lisp / (NO-CONSING DX-RAW-INSTANCES) Expected failure: dynamic-extent.impure.lisp / HANDLER-CASE-BOGUS-COMPILER-NOTE Expected failure: dynamic-extent.impure.lisp / DX-COMPILER-NOTES Expected failure: dynamic-extent.impure.lisp / HANDLER-CASE-EATING-STACK Expected failure: dynamic-extent.impure.lisp / RECHECK-NESTED-DX-BUG Expected failure: packages.impure.lisp / USE-PACKAGE-CONFLICT-SET Expected failure: packages.impure.lisp / IMPORT-SINGLE-CONFLICT Failure: timer.impure.lisp / (TIMER STRESS) Failure: timer.impure.lisp / (WITH-TIMEOUT TIMEOUT) test failed, expected 104 return code, got 1 |
From: Bruce O'N. <ec...@pc...> - 2010-05-26 14:07:21
|
Hi Josh, I saw your (sadly, unanswered) message on the OpenBSD ppc list. I had a few minutes to poke at it today and I wonder if this document might help. https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF778525699600741775/$file/prg.pdf Down in section 1.4 (FPSCR) they say that in general the status bits are sticky and you have to clear them. I'll poke around with this and see if it helps. I wonder whether your test simply started looping, delivering SIGFPEs over and over again, because the bits weren't cleared. I do have this on my list of things to do, but I've not gotten there yet. cheers bruce ----- Original Message ----- From: Josh Elsasser <jo...@el...> Date: Sat, 22 May 2010 00:42:40 -0700 Subject: Re: [Sbcl-devel] Thanks very much for all the PPC patches, plus OpenBSD To: Alastair Bridgewater <ala...@gm...> Cc: "Bruce O'Neel" <ec...@pc...>, sbc...@li... On Fri, May 21, 2010 at 07:29:58PM -0400, Alastair Bridgewater wrote: > Hello, > > I've just put together the version of the patch that I'm planning to commit. > Unfortunately, I can't seem to access repo.or.cz right now, so I put the > patch at . I made > a few tweaks here and there, but nothing too egregious. Thanks, looks fine to me aside from foreign.test.sh > Barring being told "no, no, it doesn't work now" or code-freeze, I intend to > commit this at some point on Sunday, the 23rd. If there's something else > I should be doing beyond adding a NEWS snippet, now would be the time > for another SBCL maintainer to mention it. > > On Fri, May 21, 2010 at 10:16 AM, Josh Elsasser wrote: > > The failing NaN tests at least appear to be to be a problem with > > OpenBSD and need to be fixed there instead of in SBCL. I had been > > working on a fix but was sidetracked, I'll see if I can get it working > > soon. 
> > I looked at these briefly earlier today, and it looks like the traps are being > disabled for some strange reason on my linux system, which worries me, > but indicates to me that it might be more an SBCL problem than a Linux > problem. Don't let my conclusions stop you from approaching it as an > OpenBSD problem, though. There certainly is an OpenBSD problem, but of course that doesn't mean there isn't also an SBCL problem. > > I suppose as far as changes to SBCL are concerned, that patch is > > pretty good. The XXX comment in src/compiler/ppc/parms.lisp can be > > removed, if it ever becomes an issue then I or someone else can fix it > > then. > > I rewrote it as a FIXME comment with a bit of the background explanation. Works for me, it's not like anyone will ever run into it unless the default macppc MAXDSIZ changes. > > Something better should probably be done about that > > handler-case-bogus-compiler-note test in dynamic-extent.impure.lisp > > which hangs SBCL, unless the recent test infrastructure changes > > already took care of it. > > They did. The problem was that a COMPILER-NOTE isn't an ERROR, so > it slipped through the usual net. > > What I didn't apply was the -fPIC thing for foreign.test.sh, as it doesn't > seem to be necessary for PPC/Linux. I'd be happier with a special case > similar to what is done for Darwin. Unpatched, foreign.test.sh fails quite spectacularly on OpenBSD/macppc. When you say special case, do you mean something like this: http://www.elsasser.org/misc/jre-sbcl-ppc-foreign-test.diff For the record, here are my test results with sbcl-openbsd-ppc.diff and jre-sbcl-ppc-foreign-test.diff applied: Finished running tests. 
Status: Unexpected success: float.pure.lisp / (SCALE-FLOAT-OVERFLOW BUG-372) Expected failure: float.pure.lisp / (ADDITION-OVERFLOW BUG-372) Failure: float.pure.lisp / NAN-COMPARISONS Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-346) Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-353) Expected failure: debug.impure.lisp / (TRACE ENCAPSULATE NIL) Expected failure: debug.impure.lisp / (TRACE-RECURSIVE ENCAPSULATE NIL) Expected failure: dynamic-extent.impure.lisp / (NO-CONSING DX-RAW-INSTANCES) Expected failure: dynamic-extent.impure.lisp / HANDLER-CASE-BOGUS-COMPILER-NOTE Expected failure: dynamic-extent.impure.lisp / DX-COMPILER-NOTES Expected failure: dynamic-extent.impure.lisp / HANDLER-CASE-EATING-STACK Expected failure: dynamic-extent.impure.lisp / RECHECK-NESTED-DX-BUG Expected failure: packages.impure.lisp / USE-PACKAGE-CONFLICT-SET Expected failure: packages.impure.lisp / IMPORT-SINGLE-CONFLICT Failure: timer.impure.lisp / (TIMER STRESS) Failure: timer.impure.lisp / (WITH-TIMEOUT TIMEOUT) test failed, expected 104 return code, got 1 |
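The "sticky" status bits mentioned at the top of this message can be observed from Lisp through what SBCL calls accrued exceptions. The following is a rough sketch, assuming an SBCL in which `sb-int:get-floating-point-modes`, `sb-int:set-floating-point-modes`, and `sb-int:with-float-traps-masked` are available; `square`, `accrued`, and `sticky-flag-demo` are hypothetical helpers, and what is actually reported on OpenBSD/macppc may well differ.

```lisp
;; Hypothetical helper; NOTINLINE so the multiplication happens at run time.
(declaim (notinline square))
(defun square (x) (* x x))

(defun accrued ()
  "Return the list of accrued (sticky) floating-point exception flags."
  (getf (sb-int:get-floating-point-modes) :accrued-exceptions))

(defun sticky-flag-demo ()
  "Return the accrued flags after an overflow, after a later harmless
operation, and after clearing them explicitly."
  (sb-int:with-float-traps-masked (:overflow :inexact)
    ;; Start from a clean slate.
    (sb-int:set-floating-point-modes :accrued-exceptions '())
    (square most-positive-double-float)      ; overflows with traps masked
    (let ((after-overflow (accrued)))
      (square 2d0)                           ; exact, raises nothing new
      (let ((after-harmless-op (accrued)))
        ;; Sticky flags stay set until someone clears them.
        (sb-int:set-floating-point-modes :accrued-exceptions '())
        (list after-overflow after-harmless-op (accrued))))))

(sticky-flag-demo)
```

If the flags are sticky in the sense the manual describes, the first two elements of the result should still mention the overflow and only the third should be empty.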
From: Alastair B. <ala...@gm...> - 2010-05-22 12:56:37
|
On Sat, May 22, 2010 at 3:42 AM, Josh Elsasser <jo...@el...> wrote: > On Fri, May 21, 2010 at 07:29:58PM -0400, Alastair Bridgewater wrote: >> Hello, >> >> I've just put together the version of the patch that I'm planning to commit. >> Unfortunately, I can't seem to access repo.or.cz right now, so I put the >> patch at <http://www.lisphacker.com/temp/sbcl-openbsd-ppc.diff>. I made >> a few tweaks here and there, but nothing too egregious. > > Thanks, looks fine to me aside from foreign.test.sh Okay, so this much goes in, even if we don't have something we agree on for foreign.test.sh. >> Barring being told "no, no, it doesn't work now" or code-freeze, I intend to >> commit this at some point on Sunday, the 23rd. If there's something else >> I should be doing beyond adding a NEWS snippet, now would be the time >> for another SBCL maintainer to mention it. >> >> On Fri, May 21, 2010 at 10:16 AM, Josh Elsasser <jo...@el...> wrote: >> > The failing NaN tests at least appear to be to be a problem with >> > OpenBSD and need to be fixed there instead of in SBCL. I had been >> > working on a fix but was sidetracked, I'll see if I can get it working >> > soon. >> >> I looked at these briefly earlier today, and it looks like the traps are being >> disabled for some strange reason on my linux system, which worries me, >> but indicates to me that it might be more an SBCL problem than a Linux >> problem. Don't let my conclusions stop you from approaching it as an >> OpenBSD problem, though. > > There certainly is an OpenBSD problem, but of course that doesn't mean > there isn't also an SBCL problem. Quite right, PPC/Linux has only expected failures in the test suite at this point, indicating an OpenBSD problem... And an OpenBSD unproblem (SCALE-FLOAT-OVERFLOW BUG-372). >> What I didn't apply was the -fPIC thing for foreign.test.sh, as it doesn't >> seem to be necessary for PPC/Linux. I'd be happier with a special case >> similar to what is done for Darwin. 
> > Unpatched, foreign.test.sh fails quite spectacularly on > OpenBSD/macppc. When you say special case, do you mean something like > this: > > http://www.elsasser.org/misc/jre-sbcl-ppc-foreign-test.diff Like that, but probably not that. At the same time, that should be enough information for me to be able to come up with something I'd be happier with. > For the record, here are my test results with sbcl-openbsd-ppc.diff > and jre-sbcl-ppc-foreign-test.diff applied: > > Finished running tests. > Status: > Unexpected success: float.pure.lisp / (SCALE-FLOAT-OVERFLOW BUG-372) > Expected failure: float.pure.lisp / (ADDITION-OVERFLOW BUG-372) > Failure: float.pure.lisp / NAN-COMPARISONS > Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-346) > Expected failure: debug.impure.lisp / (UNDEFINED-FUNCTION BUG-353) > Expected failure: debug.impure.lisp / (TRACE ENCAPSULATE NIL) > Expected failure: debug.impure.lisp / (TRACE-RECURSIVE ENCAPSULATE NIL) > Expected failure: dynamic-extent.impure.lisp / (NO-CONSING DX-RAW-INSTANCES) > Expected failure: dynamic-extent.impure.lisp / HANDLER-CASE-BOGUS-COMPILER-NOTE > Expected failure: dynamic-extent.impure.lisp / DX-COMPILER-NOTES > Expected failure: dynamic-extent.impure.lisp / HANDLER-CASE-EATING-STACK > Expected failure: dynamic-extent.impure.lisp / RECHECK-NESTED-DX-BUG > Expected failure: packages.impure.lisp / USE-PACKAGE-CONFLICT-SET > Expected failure: packages.impure.lisp / IMPORT-SINGLE-CONFLICT > Failure: timer.impure.lisp / (TIMER STRESS) > Failure: timer.impure.lisp / (WITH-TIMEOUT TIMEOUT) > test failed, expected 104 return code, got 1 So, float.pure.lisp is still different, the debug.impure.lisp, dynamic-extent.impure.lisp and packages.impure.lisp stuff is at current PPC-normal, timer.impure.lisp is at what is apparently OpenBSD-normal, and the room.test.sh failure isn't there, for which we may as well blame differing heap locations or something. Looks good, though. --Alastair Bridgewater |
From: Alastair B. <ala...@gm...> - 2010-05-23 18:54:01
|
On Sat, May 22, 2010 at 8:56 AM, Alastair Bridgewater <ala...@gm...> wrote: > On Sat, May 22, 2010 at 3:42 AM, Josh Elsasser <jo...@el...> wrote: >> On Fri, May 21, 2010 at 07:29:58PM -0400, Alastair Bridgewater wrote: >>> Hello, >>> >>> I've just put together the version of the patch that I'm planning to commit. >>> Unfortunately, I can't seem to access repo.or.cz right now, so I put the >>> patch at <http://www.lisphacker.com/temp/sbcl-openbsd-ppc.diff>. I made >>> a few tweaks here and there, but nothing too egregious. >> >> Thanks, looks fine to me aside from foreign.test.sh > > Okay, so this much goes in, even if we don't have something we agree on for > foreign.test.sh. It's committed as 1.0.38.10. Hopefully the foreign.test.sh stuff works for everyone. >>> Barring being told "no, no, it doesn't work now" or code-freeze, I intend to >>> commit this at some point on Sunday, the 23rd. If there's something else >>> I should be doing beyond adding a NEWS snippet, now would be the time >>> for another SBCL maintainer to mention it. >>> >>> On Fri, May 21, 2010 at 10:16 AM, Josh Elsasser <jo...@el...> wrote: >>> > The failing NaN tests at least appear to be to be a problem with >>> > OpenBSD and need to be fixed there instead of in SBCL. I had been >>> > working on a fix but was sidetracked, I'll see if I can get it working >>> > soon. >>> >>> I looked at these briefly earlier today, and it looks like the traps are being >>> disabled for some strange reason on my linux system, which worries me, >>> but indicates to me that it might be more an SBCL problem than a Linux >>> problem. Don't let my conclusions stop you from approaching it as an >>> OpenBSD problem, though. >> >> There certainly is an OpenBSD problem, but of course that doesn't mean >> there isn't also an SBCL problem. > > Quite right, PPC/Linux has only expected failures in the test suite at this > point, indicating an OpenBSD problem... And an OpenBSD unproblem > (SCALE-FLOAT-OVERFLOW BUG-372). 
So, further investigation: If I add (sb-vm::set-floating-point-modes :traps '(:overflow :divide-by-zero)) just before the scale-float overflow test, I get an unexpected success for (SCALE-FLOAT-OVERFLOW BUG-372). If I further add :invalid to the list of traps I get a failure for NAN-COMPARISONS. I also note that somewhere along the line (in host-2, most likely) the traps are being disabled, possibly during some sort of signal handling. Going one step further, it appears that even when overflow traps are enabled the traps are not being executed when running (addition-overflow bug-372), at least on Linux. At this point, I would suggest looking at src/code/float-trap.lisp, line 161, and changing #!-netbsd :invalid to #!-(or netbsd (and openbsd ppc)) :invalid, and seeing if that helps. >> For the record, here are my test results with sbcl-openbsd-ppc.diff >> and jre-sbcl-ppc-foreign-test.diff applied: >> >> Unexpected success: float.pure.lisp / (SCALE-FLOAT-OVERFLOW BUG-372) >> Expected failure: float.pure.lisp / (ADDITION-OVERFLOW BUG-372) >> Failure: float.pure.lisp / NAN-COMPARISONS |
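The experiment described in this message can be tried at a REPL roughly as follows. This is a sketch rather than the test-suite code: `try-scale-float-overflow` is a hypothetical helper, the exported `sb-int:` names are used in place of the `sb-vm::` spelling above, and the outcome depends on the platform, which is the very thing under investigation.

```lisp
(defun try-scale-float-overflow (traps &optional (x 1.0) (n most-positive-fixnum))
  "Enable exactly TRAPS, attempt an overflowing SCALE-FLOAT, and report
whether FLOATING-POINT-OVERFLOW was signalled. X and N are parameters so
that the call is evaluated at run time. The previous modes are restored."
  (let ((saved (sb-int:get-floating-point-modes)))
    (unwind-protect
         (progn
           (sb-int:set-floating-point-modes :traps traps)
           (handler-case
               (progn (scale-float x n) :no-condition)
             (floating-point-overflow () :overflow-signalled)))
      ;; GET-FLOATING-POINT-MODES returns a plist in the keyword format
      ;; that SET-FLOATING-POINT-MODES accepts, so APPLY restores it.
      (apply #'sb-int:set-floating-point-modes saved))))

;; The two trap sets tried in the message above:
(try-scale-float-overflow '(:overflow :divide-by-zero))
(try-scale-float-overflow '(:overflow :divide-by-zero :invalid))
```

Saving and restoring the modes keeps the experiment from leaking trap settings into whatever runs next, which matters when the question is where the traps are being disabled.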