From: Matt K. <kau...@cs...> - 2006-02-06 18:09:32
|
Hello -- I have encountered several bugs with SBCL 0.9.7 running on on Mac OS 10.4 (Darwin PPC), when attempting to run ACL2 Version 2.9.4. These are liste= d below. Another ACL2 user encountered problems with SBCL on Mac OS X as w= ell. I've had no problems at all with SBCL on Linux/x86. Some bugs are intermittent, but you can probably reproduce most or all of them as follo= ws. Fetch: http://www.cs.utexas.edu/users/moore/acl2/v2-9/new/v2-9-4/acl2.tar.gz Create acl2-sources/ directory: tar xfz acl2.tar.gz Build ACL2 in acl2-sources/ directory (use full path to sbcl if necessary= ): make LISP=3Dsbcl To run regression in acl2-sources/ directory: make regression-fresh [Just run 'make regression' if you don't want to clean up previous attemp= t.] Feel free to send me email if I can be of assistance. Although I doubt y= ou'll need the following info, I should mention that you can get lots of detail= s on ACL2, including detailed installation instructions, from the ACL2 home pa= ge (http://www.cs.utexas.edu/users/moore/acl2/). By the way, if you want a more comprehensive test, you can additionally d= o the following before "make regression" or "make regression-fresh". Fetch: http://www.cs.utexas.edu/users/moore/acl2/v2-9/new/v2-9-4/workshops.tar= .gz Create acl2-sources/books/workshops/ directory: cd acl2-sources/books/ tar xfz workshops.tar.gz Run regression: cd acl2-sources/ make regression-fresh Anyhow, here are the problems I observed. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D Scary warning with characters that are a problem for utf-8 encoding: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D This first complaint is minor and independent of ACL2 (but see the next i= tem for a more serious, related issue). It's perhaps unfortunate that we can't have arbitrary characters in comme= nts, regardless of the locale, without seeing a warning. bbh:~ kaufmann$ cat foo.lisp ; The comment reader cannot handle this character: ; =E1 (hi there) bbh:~ kaufmann$ sbcl This is SBCL 0.9.7, an implementation of ANSI Common Lisp. More information about SBCL is available at <http://www.sbcl.org/>. SBCL is free software, provided as is, with absolutely no warranty. It is mostly in the public domain; some portions are provided under BSD-style licenses. See the CREDITS and COPYING files in the distribution for more information. * (with-open-file (str "foo.lisp") (read str)) STYLE-WARNING: Character decoding error in a ;-comment at position 52 reading sourc= e file #<FD-STREAM for "file /Users/kaufmann/foo.lisp" {100248F1}>, resyn= cing. (HI THERE) *=20 =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D More serious issue with characters that are a problem for utf-8 encoding: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D With make regression while certifying books/defexec/dag-unification/subsumption-subst.lisp: debugger invoked on a SIMPLE-ERROR: segmentation violation at #X1F2EBF0 =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D Problem with make -j 2 regression: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D I got the following error: Making /Users/kaufmann/acl2/current/books/arithmetic/natp-posp.cert on = Tue Jan 24 09:50:17 CST 2006 Making /Users/kaufmann/acl2/current/books/arithmetic/mod-gcd.cert on Tu= e Jan 24 09:50:17 CST 2006 debugger invoked on a SIMPLE-ERROR: breakpoint/trap at #X91F1418 Another user got this error: fatal error encountered in SBCL pid 5247: no transport function for object 0x10c00fe7 (widetag 0x58) The system is too badly corrupted or confused to continue at the Lisp level. If the system had been compiled with the SB-LDB feature, we'd dr= op into the LDB low-level debugger now. But there's no LDB in this build, = so we can't really do anything but just exit, sorry. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D Problem: "mprotect: Invalid argument" and "unknown immediate object" =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D I temporarily added the following to ACL2 source file acl2.lisp. ............................................................ ; The following was sent by Juho Snellman as a workaround for files that ; contain characters that are a problem for utf-8 encoding, which is the ; default on Mac OS 10.4 with sbcl 0.9.7. He points out that this is not= a ; currently exported interface, so it could change in the future. #+(and sbcl ppc darwin) (setf sb-impl::*default-external-format* :ISO-8859-1) ............................................................ But then we get this: Making /Users/kaufmann/acl2/current/books/meta/meta-plus-lessp.cert on Tu= e Jan 24 11:56:26 CST 2006 mprotect: Invalid argument Which came from this: bbh:~/acl2/current kaufmann$ (time nice make regression-fresh -j 3 ACL2= =3D/Users/kaufmann/acl2/current/sbcl-isofix-saved_acl2) >& make-regressio= n-sbcl-isofix.log& [1] 14343 bbh:~/acl2/current kaufmann$=20 Type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL. restarts (invokable by number or by possibly-abbreviated name): 0: [ABORT] Exit debugger, returning to top level. ((FLET SB-IMPL::FAIL) #<unknown immediate object, lowtag=3D#b10, widetag=3D#x3A {13A}>) 0] =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D Thanks -- -- Matt Kaufmann |
From: Christophe R. <cs...@ca...> - 2006-02-06 19:19:56
|
Matt Kaufmann <kau...@cs...> writes: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D > Scary warning with characters that are a problem for utf-8 encoding: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D > > This first complaint is minor and independent of ACL2 (but see the next i= tem > for a more serious, related issue). > > It's perhaps unfortunate that we can't have arbitrary characters in comme= nts, > regardless of the locale, without seeing a warning. This isn't true. What is true is that, because of Mac OS X bugs -- specifically the fact that nl_langinfo() is essentially unimplemented -- the default encoding for textual input/output is UTF-8, no matter what the locale. However, one of the points of the UTF-8 encoding is that all Unicode characters can be encoded in it. What you may have meant is that files encoded under a one-byte-per-character encoding are not read back silently under a multibyte encoding; this is as it should be: the warning is there to tell you that something is wrong. > bbh:~ kaufmann$ cat foo.lisp > ; The comment reader cannot handle this character: > ; =C3=A1 When I run SBCL on a file with these contents, no complaint will occur, but that is because I create this file with a UTF-8 encoding of the a-grave character. It is important to understand what is happening here. The file contains a sequence of bytes, only; the question is how should the lisp interpret these bytes as text. The one-byte-per-character encodings have the merit of simplicity but the disadvantage of severe ambiguity; I presume that you have iso-8859-1-encoded files, which will look significantly different to, say, Russian users, unless they happen to guess which encoding you have used. > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D > More serious issue with characters that are a problem for utf-8 encoding: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D > > With > make regression > while certifying books/defexec/dag-unification/subsumption-subst.lisp: > > debugger invoked on a SIMPLE-ERROR: segmentation violation at #X1F2EBF0 I will attempt to reproduce some of this tomorrow, but in the meantime, could you say firstly why you think this is a character encoding issue, and secondly whether you have in ACL2 any Lisp code which is compiled under non-safe optimization policies? > I temporarily added the following to ACL2 source file acl2.lisp. > > ............................................................ > ; The following was sent by Juho Snellman as a workaround for files that > ; contain characters that are a problem for utf-8 encoding, which is the > ; default on Mac OS 10.4 with sbcl 0.9.7. He points out that this is not= a > ; currently exported interface, so it could change in the future. > #+(and sbcl ppc darwin) > (setf sb-impl::*default-external-format* :ISO-8859-1) > ............................................................ A more principled workaround might be to specify=20 :external-format #+sbcl :iso-8859-1 #-sbcl :default to calls to OPEN / WITH-OPEN-FILE. Your mail is titled "SBCL problems on a Mac". Do you know whether the problems exist under linux/ppc, or have you only tried Mac OS X? Cheers, Christophe |
From: Harald Hanche-O. <ha...@ma...> - 2006-02-06 20:28:37
|
+ Christophe Rhodes <cs...@ca...>: | A more principled workaround might be to specify | :external-format #+sbcl :iso-8859-1 #-sbcl :default | to calls to OPEN / WITH-OPEN-FILE. And a less principled one might be to (setf sb-impl::*default-external-format* :iso-8859-1) if that works. Slightly less unprincipled, rebind the variable around the form loading the code. Of course, this advice should come with all the usual health warnings related to double colons and relying on implementation details and so forth, but it could be a useful workaround anyway. A third way might be to convert all them thar files by running them to something like [untested] (with-open-file (in in-filename :direction :input :external-format :iso-8859-1) (with-open-file (out out-filename :direction :output :external-format :utf-8) (loop for char = (read-char in :eof-error nil) while char do (write-char char out)))) - Harald |
From: Harald Hanche-O. <ha...@ma...> - 2006-02-06 20:31:55
|
+ Christophe Rhodes <cs...@ca...>: | > #+(and sbcl ppc darwin) | > (setf sb-impl::*default-external-format* :ISO-8859-1) Uh, I didn't see that on my first reading, so I suggested it all over again. Sorry about the noise. - Harald |
From: Matt K. <kau...@cs...> - 2006-02-06 19:33:34
|
Hi -- Thanks for your quick reply. Please see comments below. From: Christophe Rhodes <cs...@ca...> Cc: sbc...@li... Date: Mon, 06 Feb 2006 19:18:52 +0000 Sender: Christophe Rhodes <cs...@ca...> X-SpamAssassin-Status: No, hits=3D-2.1 required=3D5.0 X-UTCS-Spam-Status: No, hits=3D-292 required=3D180 Matt Kaufmann <kau...@cs...> writes: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D > Scary warning with characters that are a problem for utf-8 encoding: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D > > This first complaint is minor and independent of ACL2 (but see the n= ext item > for a more serious, related issue). > > It's perhaps unfortunate that we can't have arbitrary characters in = comments, > regardless of the locale, without seeing a warning. This isn't true. What is true is that, because of Mac OS X bugs -- specifically the fact that nl_langinfo() is essentially unimplemented -- the default encoding for textual input/output is UTF-8, no matter what the locale. However, one of the points of the UTF-8 encoding is that all Unicode characters can be encoded in it. What you may have meant is that files encoded under a one-byte-per-character encoding are not read back silently under a multibyte encoding; this is as it should be: the warning is there to tell you that something is wrong. > bbh:~ kaufmann$ cat foo.lisp > ; The comment reader cannot handle this character: > ; =E1 When I run SBCL on a file with these contents, no complaint will occur, but that is because I create this file with a UTF-8 encoding of the a-grave character. It is important to understand what is happening here. The file contains a sequence of bytes, only; the question is how should the lisp interpret these bytes as text. The one-byte-per-character encodings have the merit of simplicity but the disadvantage of severe ambiguity; I presume that you have iso-8859-1-encoded files, which will look significantly different to, say, Russian users, unless they happen to guess which encoding you have used. Thanks for the explanations. My point was only that as a dumb user, I wa= s surprised to see the warnings. I'm not seeing those on any other lisp on= any other platform, but I think I understand your point that at a deeper leve= l, the warnings result from a combination of Mac OS X bugs and (if I understand correctly) the fact that sbcl supports more encodings than other lisps. I think the file was created by someone in Spain, so you're probably righ= t about the encoding. > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D > More serious issue with characters that are a problem for utf-8 enco= ding: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D > > With > make regression > while certifying books/defexec/dag-unification/subsumption-subst.lis= p: > > debugger invoked on a SIMPLE-ERROR: segmentation violation at #X1F= 2EBF0 I will attempt to reproduce some of this tomorrow, but in the meantime, could you say firstly why you think this is a character The file in question is one of the few that had characters that created problems for the comment reader. encoding issue, and secondly whether you have in ACL2 any Lisp code which is compiled under non-safe optimization policies? Yep, ACL2 is compiled thus: (proclaim (quote (optimize #+(or cmu sbcl lucid lispworks) (compilation-s= peed 0) ;; (debug 0) ; could consider for Lispworks, A= llegro #+cmu (extensions:inhibit-warnings 3) #+sbcl (sb-ext:inhibit-warnings 3) (speed 3) (space #+cmu 1 #-cmu 0) ; no longer necessary? (safety 0)))) > I temporarily added the following to ACL2 source file acl2.lisp. > > ............................................................ > ; The following was sent by Juho Snellman as a workaround for files = that > ; contain characters that are a problem for utf-8 encoding, which is= the > ; default on Mac OS 10.4 with sbcl 0.9.7. He points out that this i= s not a > ; currently exported interface, so it could change in the future. > #+(and sbcl ppc darwin) > (setf sb-impl::*default-external-format* :ISO-8859-1) > ............................................................ A more principled workaround might be to specify=20 :external-format #+sbcl :iso-8859-1 #-sbcl :default to calls to OPEN / WITH-OPEN-FILE. Thanks. That would probably require significantly more ACL2 code changes= , and/or be harder to maintain; for example, GCL doesn't currently support :external-format for with-open-file. So I'm hoping to get by with the so= lution above. If that's a grave mistake, I'd be interested in understanding the= issue better. Your mail is titled "SBCL problems on a Mac". Do you know whether the problems exist under linux/ppc, or have you only tried Mac OS X? Sorry about the imprecise Subject. I have only tried Mac OS X on a Mac. = I've used sbcl on linux/x86 without problems. Cheers, Christophe Thanks -- -- Matt |
From: Christophe R. <cs...@ca...> - 2006-02-06 20:10:24
|
Matt Kaufmann <kau...@cs...> writes: > I think the file was created by someone in Spain, so you're probably right > about the encoding. > > > ============================== > > More serious issue with characters that are a problem for utf-8 encoding: > > ============================== > > > > With > > make regression > > while certifying books/defexec/dag-unification/subsumption-subst.lisp: > > > > debugger invoked on a SIMPLE-ERROR: segmentation violation at #X1F2EBF0 > > I will attempt to reproduce some of this tomorrow, but in the > meantime, could you say firstly why you think this is a character > > The file in question is one of the few that had characters that created > problems for the comment reader. > > encoding issue, and secondly whether you have in ACL2 any Lisp code > which is compiled under non-safe optimization policies? > > Yep, ACL2 is compiled thus: > > (proclaim (quote (optimize #+(or cmu sbcl lucid lispworks) (compilation-speed 0) > ;; (debug 0) ; could consider for Lispworks, Allegro > #+cmu (extensions:inhibit-warnings 3) > #+sbcl (sb-ext:inhibit-warnings 3) > (speed 3) > (space #+cmu 1 #-cmu 0) ; no longer necessary? > (safety 0)))) Could you please alter this, for the sake of debugging, to (speed 1) (safety 3) and recompile? (I understand that this might take a while, but the extra safety might show up something interesting) > A more principled workaround might be to specify > :external-format #+sbcl :iso-8859-1 #-sbcl :default > to calls to OPEN / WITH-OPEN-FILE. > > Thanks. That would probably require significantly more ACL2 code changes, > and/or be harder to maintain; for example, GCL doesn't currently support > :external-format for with-open-file. So I'm hoping to get by with the solution > above. If that's a grave mistake, I'd be interested in understanding the issue > better. Fair enough. The *default-external-format* solution might well eventually turn into a supported format, and is unlikely to vanish in the near future in any case. > Your mail is titled "SBCL problems on a Mac". Do you know whether the > problems exist under linux/ppc, or have you only tried Mac OS X? > > Sorry about the imprecise Subject. I have only tried Mac OS X on a Mac. I've > used sbcl on linux/x86 without problems. Right. Tomorrow I will attempt to reproduce some of these failures on a ppc with linux; I do have access to a ppc with OS X, but it is rather slow, so if someone here with access to a faster one has time to try things that might be good. Cheers, Christophe |
From: Christophe R. <cs...@ca...> - 2006-02-07 14:13:49
|
Matt Kaufmann <kau...@cs...> writes: > It's perhaps unfortunate that we can't have arbitrary characters in comments, > regardless of the locale, without seeing a warning. I think I finally see what you're driving at here: it is probably worthwhile resignalling the decoding error from the handler in the comment reader, so that outer handlers for decoding errors get a chance to run (and possibly handle the error silently). > With > make regression > while certifying books/defexec/dag-unification/subsumption-subst.lisp: > > debugger invoked on a SIMPLE-ERROR: segmentation violation at #X1F2EBF0 I have reproduced this on a dual G5 running PPC/linux. The offending function is indeed SB-IMPL::RESYNC/UTF-8; the faulting instruction itself appears to be the one doing the SAP-REF-8 dereference in that function: (defun ,resync-function (stream) (loop (input-at-least stream 1) (incf (fd-stream-ibuf-head stream)) (unless (block decode-break-reason (let* ((sap (fd-stream-ibuf-sap stream)) (head (fd-stream-ibuf-head stream)) (byte (sap-ref-8 sap head)) (size ,in-size-expr)) ,in-expr) nil) (return)))) ; (input-at-least stream 1) ; 9F2C: L2: 8074006B LWZ $NL0,107($CNAME) ; 9F30: 80940067 LWZ $NL1,103($CNAME) ; 9F34: 7C841850 SUBF $NL1,$NL1,$NL0 ; 9F38: 2C040004 CMPWI CR0,$NL1,4 ; 9F3C: 418004A8 BT LT,L25 ; (incf (fd-stream-ibuf-head stream)) + overflow check ; 9F40: 80940067 LWZ $NL1,103($CNAME) ; 9F44: 38840004 ADDI $NL1,$NL1,4 ; 9F48: 3C607FFF ADDIS $NL0,$ZERO,32767 ; 9F4C: 6063FFF8 ORI $NL0,$NL0,65528 ; 9F50: 7C041800 CMPW CR0,$NL1,$NL0 ; 9F54: 41810480 BT GT,L24 ; 9F58: 90940067 STW $NL1,103($CNAME) ; 9F5C: 7C8A2378 MR $FDEFN,$NL1 ; (fd-stream-ibuf-sap stream) ; 9F60: 8154005F LWZ $FDEFN,95($CNAME) ; (fd-stream-ibuf-head stream) ; 9F64: 80B40067 LWZ $NL2,103($CNAME) ; is SAP NULL? ; 9F68: 7C0A9000 CMPW CR0,$FDEFN,$NULL ; -> error "not a SYSTEM-AREA-POINTER" ; 9F6C: 41820454 BT EQ,L23 ; get the actual address that the sap points to ; 9F70: 806AFFFD LWZ $NL0,-3($FDEFN) ; untag HEAD ; 9F74: 7CA41670 SRAWI $NL1,$NL2,2 ; get the byte at sap+head ; 9F78: 7C6320AE LBZX $NL0,$NL0,$NL1 Can anyone see anything wrong (not just in this disassembly, but in the whole code)? I can post more disassembly and do more experiments if people can think of anything. Note, however, that simply repeatedly doing READ on a UTF-8 stream opened to subsumption-subst.lisp doesn't cause a segmentation fault here. Cheers, Christophe |
From: Christophe R. <cs...@ca...> - 2006-02-07 14:56:31
|
Christophe Rhodes <cs...@ca...> writes: > I have reproduced this on a dual G5 running PPC/linux. The offending > function is indeed SB-IMPL::RESYNC/UTF-8; the faulting instruction > itself appears to be the one doing the SAP-REF-8 dereference in that > function: > > (defun ,resync-function (stream) > (loop (input-at-least stream 1) > (incf (fd-stream-ibuf-head stream)) > (unless (block decode-break-reason > (let* ((sap (fd-stream-ibuf-sap stream)) > (head (fd-stream-ibuf-head stream)) > (byte (sap-ref-8 sap head)) > (size ,in-size-expr)) > ,in-expr) > nil) > (return)))) Teemu Kalvas (on #lisp) has diagnosed the problem: this will fail to do the right thing if the bad octet sequence is at an octet position just below an integer multiple of 4096. At that point, the SAP-REF-8 may well attempt to read into unallocated memory. Most of the time, this is harmless (in some sense) because we are usually inside a comment reader, so reading garbage bytes doesn't matter unless the garbage happens to contain a 10 (which would be interpreted as #\Newline). In my failed reproduction of this case, running sbcl normally, this is presumably what happens. If I READ from subsumption-subst.lisp having run sbcl from make(1), however, I get the segmentation fault from reading at position 12285 (which is 3*4096-3); I think that the behaviour is different because under make(1) the heap from which fd-stream buffers are allocated is at a different virtual address, and adjoins memory which is read-protected. I hope that a fix for this will not take too long. Cheers, Christophe |
From: Matt K. <kau...@cs...> - 2006-02-07 15:31:35
|
Great! I appreciate your efforts and that of the SBCL developer community, including Juho Snellman <js...@ik...> who has helped a lot in earlier email discussions. In my original bug report, I mentioned other errors in "make -j 2 regression" and "make -j 3 regression" that occur at other (earlier) points in the regression than the one you've been looking at. If there is an SBCL test suite, perhaps you'll want to add the ACL2 2.9.4 regression to it (instructions are in my original bug email), and perhaps later releases as well. There's an ACL2 mailing list with links from the ACL2 home page, or I can just make a note to email sbcl-devel upon new ACL2 releases, generally every few months, if you like. Thanks -- -- Matt From: Christophe Rhodes <cs...@ca...> Cc: sbc...@li... Date: Tue, 07 Feb 2006 14:52:49 +0000 Sender: Christophe Rhodes <cs...@ca...> X-SpamAssassin-Status: No, hits=-2.1 required=5.0 X-UTCS-Spam-Status: No, hits=-292 required=180 Christophe Rhodes <cs...@ca...> writes: > I have reproduced this on a dual G5 running PPC/linux. The offending > function is indeed SB-IMPL::RESYNC/UTF-8; the faulting instruction > itself appears to be the one doing the SAP-REF-8 dereference in that > function: > > (defun ,resync-function (stream) > (loop (input-at-least stream 1) > (incf (fd-stream-ibuf-head stream)) > (unless (block decode-break-reason > (let* ((sap (fd-stream-ibuf-sap stream)) > (head (fd-stream-ibuf-head stream)) > (byte (sap-ref-8 sap head)) > (size ,in-size-expr)) > ,in-expr) > nil) > (return)))) Teemu Kalvas (on #lisp) has diagnosed the problem: this will fail to do the right thing if the bad octet sequence is at an octet position just below an integer multiple of 4096. At that point, the SAP-REF-8 may well attempt to read into unallocated memory. Most of the time, this is harmless (in some sense) because we are usually inside a comment reader, so reading garbage bytes doesn't matter unless the garbage happens to contain a 10 (which would be interpreted as #\Newline). In my failed reproduction of this case, running sbcl normally, this is presumably what happens. If I READ from subsumption-subst.lisp having run sbcl from make(1), however, I get the segmentation fault from reading at position 12285 (which is 3*4096-3); I think that the behaviour is different because under make(1) the heap from which fd-stream buffers are allocated is at a different virtual address, and adjoins memory which is read-protected. I hope that a fix for this will not take too long. Cheers, Christophe |
From: Christophe R. <cs...@ca...> - 2006-02-09 13:33:27
|
Matt Kaufmann <kau...@cs...> writes: > Great! I appreciate your efforts and that of the SBCL developer community, > including Juho Snellman <js...@ik...> who has helped a lot in earlier email > discussions. The utf-8 segmentation fault in subsumption-subst.lisp should now be fixed (there was a buffer overread in the resyncing code for variable-width external formats); the acl2 regression test just ran to completion here on a Linux/PPC dual G5 with sbcl-0.9.9.24. > In my original bug report, I mentioned other errors in "make -j 2 > regression" and "make -j 3 regression" that occur at other (earlier) > points in the regression than the one you've been looking at. Right, investigation of these is next on the list. > If there is an SBCL test suite, perhaps you'll want to add the ACL2 > 2.9.4 regression to it (instructions are in my original bug email), > and perhaps later releases as well. There is an SBCL test suite, but the various distributed developers don't all have access to very fast hardware -- and indeed even on a very recent G5 the ACL2 regression tests take upwards of 3 hours, which is impractical to run on every commit. > There's an ACL2 mailing list with links from the ACL2 home page, or > I can just make a note to email sbcl-devel upon new ACL2 releases, > generally every few months, if you like. This might be useful, or indeed (since sbcl is simple, we hope, to build) regressions could be tested for by an interested ACL2 user or developer. Thanks, Christophe |
From: Christophe R. <cs...@ca...> - 2006-02-09 15:23:08
|
Matt Kaufmann <kau...@cs...> writes: > ============================== > Problem with make -j 2 regression: > ============================== > > I got the following error: > > Making /Users/kaufmann/acl2/current/books/arithmetic/natp-posp.cert on Tue Jan 24 09:50:17 CST 2006 > Making /Users/kaufmann/acl2/current/books/arithmetic/mod-gcd.cert on Tue Jan 24 09:50:17 CST 2006 > > debugger invoked on a SIMPLE-ERROR: breakpoint/trap at #X91F1418 > > Another user got this error: > > fatal error encountered in SBCL pid 5247: > no transport function for object 0x10c00fe7 (widetag 0x58) I just ran the regression tests again, on my dual G5 running linux/ppc and sbcl-0.9.9.24, with make -j 3; system load varied between 1 and 3 for 100 minutes (wall clock time), and saw no failures. This means that the problem is unlikely to lie directly with SBCL on the PPC architecture (unfortunately, because if it did the problem would be easier to solve), which might have been predictable in any case, given that multiple sbcl processes should be completely independent of each other. The two most likely sources of this kind of problem that I can think of are: * a bug in the Mac OS X kernel in the translation layer between Mach exceptions and POSIXoid signals, or possibly between POSIXish system calls and whatever it is OS X actually uses; * bad hardware which develops faults under load. The fact that you have two reports of this problem under OS X (and I believe we have a third report of something similar, again under OS X, but less easily repeatable) makes the bad hardware hypothesis a bit less likely, though I still wouldn't rule it out. To resolve this, we're likely to need someone who has a better knowledge of OS X internals than I do, I'm afraid. A reduced test case for the problem would certainly help if we need to go outside the lisp-using community; if I'm guessing right about signals, it could be possible to reproduce similar problems by running several instances of heavy signal-using programs concurrently. Cheers, Christophe |
From: Matt K. <kau...@cs...> - 2006-02-15 21:46:47
|
Hi -- Thanks for making the fix (and for your SBCL work in general). OK, I've made a note to send future release emails to sbc...@li.... I intend to test SBCL on (ubuntu) Linux and Mac OS X (Darwin) just before each ACL2 release. We tend to do these releases about every four months, though this varies. If you have a particular sbcl version that you think is ready to test and you want me to test it on Mac OS X (Darwin), or Linux for that matter, feel free to ask. By the way, if for ACL2 you run "make certify-books" instead of "make regression", you'll get a significantly shorter but still non-trivial test. -- Matt From: Christophe Rhodes <cs...@ca...> Cc: sbc...@li... Date: Thu, 09 Feb 2006 13:32:50 +0000 Sender: Christophe Rhodes <cs...@ca...> X-SpamAssassin-Status: No, hits=-2.1 required=5.0 X-UTCS-Spam-Status: No, hits=-282 required=180 Matt Kaufmann <kau...@cs...> writes: > Great! I appreciate your efforts and that of the SBCL developer community, > including Juho Snellman <js...@ik...> who has helped a lot in earlier email > discussions. The utf-8 segmentation fault in subsumption-subst.lisp should now be fixed (there was a buffer overread in the resyncing code for variable-width external formats); the acl2 regression test just ran to completion here on a Linux/PPC dual G5 with sbcl-0.9.9.24. > In my original bug report, I mentioned other errors in "make -j 2 > regression" and "make -j 3 regression" that occur at other (earlier) > points in the regression than the one you've been looking at. Right, investigation of these is next on the list. > If there is an SBCL test suite, perhaps you'll want to add the ACL2 > 2.9.4 regression to it (instructions are in my original bug email), > and perhaps later releases as well. There is an SBCL test suite, but the various distributed developers don't all have access to very fast hardware -- and indeed even on a very recent G5 the ACL2 regression tests take upwards of 3 hours, which is impractical to run on every commit. > There's an ACL2 mailing list with links from the ACL2 home page, or > I can just make a note to email sbcl-devel upon new ACL2 releases, > generally every few months, if you like. This might be useful, or indeed (since sbcl is simple, we hope, to build) regressions could be tested for by an interested ACL2 user or developer. Thanks, Christophe |
From: Matt K. <kau...@cs...> - 2006-06-02 19:10:50
|
Hello -- A few days ago I successfully built ACL2 Version 3.0 on top of SBCL 0.9.12 on Mac OS 10.4.6 (Darwin Kernel Version 8.6.0). However, the "-j" option for "make" doesn't yet work for me; just thought I'd let you know in case this is news to you, given your "next on the list" remark below. But there's no hurry as far as I'm concerned. SBCL has impressive performance. You might find interesting the Performance Comparisons at http://www.cs.utexas.edu/users/moore/acl2/v3-0/new.html. Regards, Matt From: Christophe Rhodes <cs...@ca...> Cc: sbc...@li... Date: Thu, 09 Feb 2006 13:32:50 +0000 Sender: Christophe Rhodes <cs...@ca...> X-SpamAssassin-Status: No, hits=-2.1 required=5.0 X-UTCS-Spam-Status: No, hits=-282 required=180 Matt Kaufmann <kau...@cs...> writes: > Great! I appreciate your efforts and that of the SBCL developer community, > including Juho Snellman <js...@ik...> who has helped a lot in earlier email > discussions. The utf-8 segmentation fault in subsumption-subst.lisp should now be fixed (there was a buffer overread in the resyncing code for variable-width external formats); the acl2 regression test just ran to completion here on a Linux/PPC dual G5 with sbcl-0.9.9.24. > In my original bug report, I mentioned other errors in "make -j 2 > regression" and "make -j 3 regression" that occur at other (earlier) > points in the regression than the one you've been looking at. Right, investigation of these is next on the list. > If there is an SBCL test suite, perhaps you'll want to add the ACL2 > 2.9.4 regression to it (instructions are in my original bug email), > and perhaps later releases as well. There is an SBCL test suite, but the various distributed developers don't all have access to very fast hardware -- and indeed even on a very recent G5 the ACL2 regression tests take upwards of 3 hours, which is impractical to run on every commit. > There's an ACL2 mailing list with links from the ACL2 home page, or > I can just make a note to email sbcl-devel upon new ACL2 releases, > generally every few months, if you like. This might be useful, or indeed (since sbcl is simple, we hope, to build) regressions could be tested for by an interested ACL2 user or developer. Thanks, Christophe |