Thread: [Sbcl-devel] NetBSD 2.0_BETA build of sbcl-0.8.14 success (?)

[Sbcl-devel] NetBSD 2.0_BETA build of sbcl-0.8.14 success (?)

From: Russell M. <rus...@ya...> - 2004-10-05 12:47:00

I was able to build sbcl-0.8.14 on NetBSD 2.0_BETA from source using
CLISP 2.33.  It seems to work (woohoo!).

Also, I ran the tests, and they seemed to get pretty far, but
eventually died like so:

//running exhaust.impure.lisp test
; in: LAMBDA NIL
;     (SB-KERNEL:FLOAT-WAIT)
; 
; note: deleting unreachable code
; compilation unit finished
;   printed 1 note
unhandled SIMPLE-ERROR in thread 3819: segmentation violation at #X0


unhandled condition in --disable-debugger mode, quitting
test exhaust.impure.lisp failed, expected 104 return code, got 1

Not sure if this is a known issue, so I report it here.

I plan to keep using sbcl on NetBSD going forward, building from
source, reporting bugs, fixing what I can (probably not much).  Also,
I would be willing to build and make available NetBSD binaries going
forward.  Would this be useful?

-russ

Re: [Sbcl-devel] NetBSD 2.0_BETA build of sbcl-0.8.14 success (?)

From: Christophe R. <cs...@ca...> - 2004-10-05 13:36:58

Russell McManus <rus...@ya...> writes:

> I was able to build sbcl-0.8.14 on NetBSD 2.0_BETA from source using
> CLISP 2.33.  It seems to work (woohoo!).

Excellent.

> Also, I ran the tests, and they seemed to get pretty far, but
> eventually died like so:
>
> //running exhaust.impure.lisp test
> unhandled SIMPLE-ERROR in thread 3819: segmentation violation at #X0
> unhandled condition in --disable-debugger mode, quitting
> test exhaust.impure.lisp failed, expected 104 return code, got 1
>
> Not sure if this is a known issue, so I report it here.

I think at least Richard Kreuter knows about it.  (I'm not sure that
he knows how to solve it).  It's a little painful to debug, but the
idea is that the system mprotect()s various different pages on the
regular stack at different times, to ensure that stack exhaustion is
detected.  It is of course possible that you're finding something
which doesn't quite work in NetBSD's SA_SIGINFO signal handling, or it
could be a subtle bug in sbcl's memory_fault_handler() itself.
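
A minimal sketch of the guard-page idea described above, for
illustration only. It is not SBCL's code: the names fault_handler,
guard_page and demo_guard_page are hypothetical, and the protected page
lives in an ordinary mmap()ed region rather than on the control stack.
It shows a page being protected with mprotect(), a SA_SIGINFO handler
recognising the fault address, and the faulting store being retried once
the page is writable again.

```c
#define _DEFAULT_SOURCE            /* MAP_ANONYMOUS, sigaction on glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON     /* older BSD spelling */
#endif

/* Hypothetical names throughout; none of this is SBCL source. */
static char *guard_page;                  /* start of the protected page */
static size_t page_size;
static volatile sig_atomic_t guard_hits;  /* faults seen on the guard page */

/* SA_SIGINFO handler: info->si_addr carries the faulting address. */
static void fault_handler(int sig, siginfo_t *info, void *context)
{
    (void)context;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t lo = (uintptr_t)guard_page;

    if (addr >= lo && addr < lo + page_size) {
        guard_hits++;
        /* Unprotect the page so the faulting store can be retried.
           mprotect() is not on POSIX's async-signal-safe list; calling
           it here mirrors common practice, not a guarantee. */
        mprotect(guard_page, page_size, PROT_READ | PROT_WRITE);
        return;
    }
    /* Unrelated fault: fall back to the default action on retry. */
    signal(sig, SIG_DFL);
}

/* Returns 1 if exactly one fault was caught and the store then succeeded. */
int demo_guard_page(void)
{
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    char *region = mmap(NULL, 2 * page_size, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(region != MAP_FAILED);
    guard_page = region;
    int rc = mprotect(guard_page, page_size, PROT_NONE);
    assert(rc == 0);

    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);  /* some systems raise SIGBUS here */

    guard_hits = 0;
    volatile char *p = guard_page;
    p[0] = 'x';                        /* faults once, then is retried */
    int ok = (guard_hits == 1 && p[0] == 'x');

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(region, 2 * page_size);
    return ok;
}
```

Calling demo_guard_page() from a main() of your own exercises the whole
path. A real stack guard additionally needs the handler to run on an
alternate signal stack, since the normal stack is the thing that has
run out.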

> I plan to keep using sbcl on NetBSD going forward, building from
> source, reporting bugs, fixing what I can (probably not much).  Also,
> I would be willing to build and make available NetBSD binaries going
> forward.  Would this be useful?

Having binaries available would be good, I think.

Cheers,

Christophe

[Sbcl-devel] Re: NetBSD 2.0_BETA build of sbcl-0.8.14 success (?)

From: Richard M K. <kr...@pr...> - 2004-10-07 00:13:45

Christophe Rhodes <cs...@ca...> writes:
> Russell McManus <rus...@ya...> writes:
>
>> I was able to build sbcl-0.8.14 on NetBSD 2.0_BETA from source using
>> CLISP 2.33.  It seems to work (woohoo!).

Note: the particular bug (whatever it is) is involved with signal
handling code that's changed somewhat from SBCL 0.8.14 to 0.8.15,
described below. The same sort of error occurs in SBCL 0.8.15, though.

>> Also, I ran the tests, and they seemed to get pretty far, but
>> eventually died like so:
>>
>> //running exhaust.impure.lisp test
>> unhandled SIMPLE-ERROR in thread 3819: segmentation violation at #X0
>> unhandled condition in --disable-debugger mode, quitting
>> test exhaust.impure.lisp failed, expected 104 return code, got 1
>>
>> Not sure if this is a known issue, so I report it here.
>
> I think at least Richard Kreuter knows about it.  (I'm not sure that
> he knows how to solve it).  

Both of these are true, but I'm still working on it. This is the only
repeatable failure I've found in the SBCL tests on NetBSD, fwiw.

> It's a little painful to debug, but the idea is that the system
> mprotect()s various different pages on the regular stack at
> different times, to ensure that stack exhaustion is detected.  It is
> of course possible that you're finding something which doesn't quite
> work in NetBSD's SA_SIGINFO signal handling, or it could be a subtle
> bug in sbcl's memory_fault_handler() itself.

The following things work for me: (1) the signal handler setup
succeeds and the handler runs on the alternate signal stack, (2)
mprotect is being called at the right times and is working, according
to pmap. However, I always get segfaults somewhere after the end of
call_into_lisp (in src/runtime/x86-assem.S) before reaching the Lisp
debugger. The page containing the fault address is the then-writable
control_stack_guard_page.
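
For anyone wanting to repeat check (1) outside SBCL, here is a minimal
sketch of one way to confirm that a handler really runs on the alternate
signal stack. It is an assumption-laden illustration, not the method
used above: probe_handler and handler_uses_alt_stack are hypothetical
names, the signal is SIGUSR1 delivered with raise() rather than a real
fault, and the 64 KiB stack size is an arbitrary choice.

```c
#define _DEFAULT_SOURCE            /* sigaltstack, stack_t on glibc */
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names; not taken from the SBCL runtime. */
static char *alt_base;                          /* alternate stack memory */
static size_t alt_size;
static volatile sig_atomic_t ran_on_alt_stack = -1;

/* Compare the address of a local variable with the alternate stack range. */
static void probe_handler(int sig)
{
    (void)sig;
    char marker;
    uintptr_t here = (uintptr_t)&marker;
    uintptr_t lo = (uintptr_t)alt_base;
    ran_on_alt_stack = (here >= lo && here < lo + alt_size);
}

/* Returns 1 if the handler ran on the alternate stack, 0 if it did not,
   and -1 if the setup itself failed. */
int handler_uses_alt_stack(void)
{
    alt_size = 64 * 1024;          /* arbitrary, comfortably above the minimum */
    alt_base = malloc(alt_size);
    if (alt_base == NULL)
        return -1;

    stack_t ss, old_ss;
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_base;
    ss.ss_size = alt_size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, &old_ss) != 0) {
        free(alt_base);
        return -1;
    }

    struct sigaction sa, old_sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = probe_handler;
    sa.sa_flags = SA_ONSTACK;      /* ask for delivery on the alternate stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old_sa) != 0) {
        sigaltstack(&old_ss, NULL);
        free(alt_base);
        return -1;
    }

    ran_on_alt_stack = -1;
    raise(SIGUSR1);                /* handler has run by the time this returns */

    sigaction(SIGUSR1, &old_sa, NULL);
    sigaltstack(&old_ss, NULL);    /* put back whatever was installed before */
    free(alt_base);
    return ran_on_alt_stack;
}
```

Dropping SA_ONSTACK from sa_flags should make the function return 0,
which is a quick way to see that the check is measuring something.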

I think the problem is in the setup for the call into Lisp: the eax
register is zero at the end of call_into_lisp on NetBSD, but contains
a non-zero address at the same point of execution on GNU/Linux (it's
supposed to contain a pointer to the lexenv for the Lisp function to
be called, IIUC). Tracing backwards, the word that's copied into eax
at/around line 216 of x86-assem.S is zero, but I'm not sure who's
supposed to be stashing the desired value into that address. 

Hope that helps,
Richard

[Sbcl-devel] Re: NetBSD 2.0_BETA build of sbcl-0.8.14 success (?)

From: Richard M K. <kr...@pr...> - 2004-10-07 00:25:29

Richard M Kreuter <kr...@pr...> writes:
> Christophe Rhodes <cs...@ca...> writes:
>> Russell McManus <rus...@ya...> writes:
>
> Note: the particular bug (whatever it is) is involved with signal
> handling code that's changed somewhat from SBCL 0.8.14 to 0.8.15,
> described below.

Oops. I wrote a description of the intended stack exhaustion handling
behavior, but then cut it out. In case it's useful to somebody, I
added it to the sbcl-internals wiki:

http://sbcl-internals.cliki.net/stack%20exhaustion

--
Richard

Re: [Sbcl-devel] Re: NetBSD 2.0_BETA build of sbcl-0.8.14 success (?)

From: Nikodemus S. <tsi...@cc...> - 2004-10-10 14:21:39

On Wed, 6 Oct 2004, Richard M Kreuter wrote:

> I think the problem is in the setup for the call into Lisp: the eax
> register is zero at the end of call_into_lisp on NetBSD, but contains
> a non-zero address at the same point of execution on GNU/Linux (it's
> supposed to contain a pointer to the lexenv for the Lisp function to
> be called, IIUC). Tracing backwards, the word that's copied into eax
> at/around line 216 of x86-assem.S is zero, but I'm not sure who's
> supposed to be stashing the desired value into that address.

Looking at the commit message for 0.8.15.7

  "arrange_return_to_lisp_function wasn't restoring esp
   properly.  Not sure it ever makes a difference in practice,
   but fix it anyway."

makes me guess arrange_r_t_l_f is the culprit, or alternatively we're not 
getting the right/all info out of the signal context (see x86-bsd-os.h and 
x86-bsd-os.c).

This is just guessing, though.

Cheers,

  -- Nikodemus              Schemer: "Buddha is small, clean, and serious."
                   Lispnik: "Buddha is big, has hairy armpits, and laughs."

[Sbcl-devel] Re: NetBSD 2.0_BETA build of sbcl-0.8.14 success (?)

From: Richard M K. <kr...@pr...> - 2004-10-16 03:46:17

> On Wed, 6 Oct 2004, Richard M Kreuter wrote:
>
>> I think the problem is in the setup for the call into Lisp...

I think I figured it out: NetBSD restores esp from the uesp mcontext
slot. Dunno if that's a bug in NetBSD, but the following small patch
fixes the stack overflow handling here, and a build of tonight's
anoncvs with this patch passes all the tests.

Regards,
Richard
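
To make the reasoning concrete, here is a toy model of why writing the
ESP slot has no effect if the kernel reloads the stack pointer from
UESP. Everything in it is hypothetical: toy_context is not the real
NetBSD mcontext layout, and toy_resume_sp only models the behaviour
reported above. Index 16 for UESP matches the patch that follows; index
8 for ESP is assumed from the even-numbered pattern in x86-lispregs.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a saved machine context; NOT the real NetBSD layout. */
struct toy_context {
    uintptr_t esp;    /* slot the unpatched code writes to */
    uintptr_t uesp;   /* slot the kernel is reported to restore from */
};

/* Register indices: 16 is from the patch, 8 is an assumption. */
enum { TOY_REG_ESP = 8, TOY_REG_UESP = 16 };

/* Same shape as a register-index switch: index in, slot address out. */
static uintptr_t *toy_register_addr(struct toy_context *ctx, int reg)
{
    switch (reg) {
    case TOY_REG_ESP:
        return &ctx->esp;
    case TOY_REG_UESP:
        return &ctx->uesp;
    default:
        return NULL;
    }
}

/* Models returning from the signal on a system that takes the stack
   pointer from uesp and ignores esp. */
static uintptr_t toy_resume_sp(const struct toy_context *ctx)
{
    return ctx->uesp;
}

/* Write new_sp into the chosen slot, then report where execution would
   resume with its stack pointer. */
uintptr_t resume_after_write(int reg, uintptr_t new_sp)
{
    struct toy_context ctx = { .esp = 0x1000, .uesp = 0x1000 };
    uintptr_t *slot = toy_register_addr(&ctx, reg);
    if (slot != NULL)
        *slot = new_sp;
    return toy_resume_sp(&ctx);
}
```

In this model resume_after_write(TOY_REG_ESP, 0x2000) leaves the resumed
stack pointer at the old value 0x1000, while
resume_after_write(TOY_REG_UESP, 0x2000) moves it to 0x2000.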

--- sbcl/src/runtime/interrupt.c	2004-10-02 20:57:14.000000000 -0400
+++ sbcl-0.8.15.11/src/runtime/interrupt.c	2004-10-15 22:24:38.000000000 -0400
@@ -679,7 +679,11 @@
     *os_context_pc_addr(context) = call_into_lisp;
     *os_context_register_addr(context,reg_ECX) = 0; 
     *os_context_register_addr(context,reg_EBP) = sp-2;
+#ifdef __NetBSD__   /* NetBSD restores esp from uesp, evidently. */
+    *os_context_register_addr(context,reg_UESP) = sp-14;
+#else
     *os_context_register_addr(context,reg_ESP) = sp-14;
+#endif
 #else
     /* this much of the calling convention is common to all
        non-x86 ports */
--- sbcl/src/runtime/x86-bsd-os.c	2004-07-20 16:20:15.000000000 -0400
+++ sbcl-0.8.15.11/src/runtime/x86-bsd-os.c	2004-10-15 17:30:43.000000000 -0400
@@ -69,6 +69,8 @@
 	return CONTEXT_ADDR_FROM_STEM(ESI);
     case 14:
 	return CONTEXT_ADDR_FROM_STEM(EDI);
+    case 16:
+	return CONTEXT_ADDR_FROM_STEM(UESP);
     default:
 	return 0;
     }
--- sbcl/src/runtime/x86-lispregs.h	2000-10-20 19:30:35.000000000 -0400
+++ sbcl-0.8.15.11/src/runtime/x86-lispregs.h	2004-10-15 17:30:52.000000000 -0400
@@ -34,8 +34,9 @@
 #define reg_EBP REG(10)
 #define reg_ESI REG(12)
 #define reg_EDI REG(14)
+#define reg_UESP REG(16)
 
-#define REGNAMES "EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"
+#define REGNAMES "EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI", "UESP"
 
 /* classification of registers
  *