Christophe Rhodes <csr21@...> wrote on Sat, 13 Aug 2011:
> Don Geddis <don@...> writes:
>> decoding error on stream
>> #<SB-SYS:FD-STREAM for "file /home/geddis/www/don/lisp/asciify.lisp"
>> {AA7C181}>
>> (:EXTERNAL-FORMAT :ASCII):
>> the octet sequence (225) cannot be decoded.
> Your external format for this stream is :ASCII, meaning that code
> points (bytes) 0-127 are mapped to characters, and everything else is
> an error: ASCII is a 128-character repertoire. The first step in
> fixing this problem is going to be opening asciify.lisp in a non-ASCII
> external format. (I'm surprised that you're getting ASCII as the
> default, even after you did something explicitly to select a UTF-8
> locale; checking with "locale -k LC_CTYPE | grep charmap" might be
> worthwhile). SBCL takes its default external format from the
> environment; you can check that it's doing that correctly by looking
> at sb-impl::*default-external-format*, once your Unix environment is
> sorted out.
Just to complete this report:
unix:~> echo $LANG
en_US.UTF-8
unix:~> locale -k LC_CTYPE | grep charmap
charmap="ANSI_X3.4-1968"
unix> sbcl
This is SBCL 1.0.50.0.debian, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.
SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
* sb-impl::*default-external-format*
:ANSI_X3.4-1968
* (quit)
> The second step is going to be to identify the actual encoding of your
> source file: is it stored on the disk in ISO-8859-1, UTF-8 or something
> else? From looking at the file that's served up from your webserver, I
> suspect ISO-8859-1 (an 8-bit encoding that I think we can just about
> call "legacy" these days), which means that even if you have a UTF-8
> locale you will have to tell sbcl explicitly to load it with that
> external format -- or else you'll have to set up your Unix to use an
> ISO-8859-1 LC_CTYPE. (I don't recommend this unless you really know
> what you're doing).
And, obviously, I barely know what I'm doing.
It seems like I would have more luck if my source file were UTF-8
instead of ISO-8859-1. I think it came from typing a text file into
emacs, perhaps copying particular characters from usenet postings read
in GNUS (again, with emacs). Or possibly from a web browser, into an
emacs text file.
Doesn't seem to help, though. I did
iconv -f ISO-8859-1 -t UTF-8 asciify.lisp > a2.lisp
but an attempt to (load "a2.lisp") in sbcl fails in the same way, for
the obvious reason: as you noted above, my sbcl seems to default to
:external-format :ascii
so all of these encodings are going to fail (by default).
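A small sketch of why the iconv step alone cannot help, assuming SBCL (SB-EXT:STRING-TO-OCTETS is SBCL-specific). Character code 225 is taken from the error message above; the helper names OCTETS-OF and ASCII-ONLY-P are made up for this illustration. It shows the same character as octets in each encoding, and that neither form fits in ASCII's 0-127 range.

```lisp
;; Sketch only: assumes SBCL, whose SB-EXT package provides STRING-TO-OCTETS.
;; Character code 225 (LATIN SMALL LETTER A WITH ACUTE) is the octet from the
;; error message, interpreted as ISO-8859-1.
(defun octets-of (string external-format)
  "Return the octets of STRING encoded in EXTERNAL-FORMAT, as a list."
  (coerce (sb-ext:string-to-octets string :external-format external-format)
          'list))

(defun ascii-only-p (octets)
  "True when every octet lies in ASCII's 0-127 range."
  (every (lambda (octet) (< octet 128)) octets))

(defparameter *sample* (string (code-char 225)))

;; The same character, encoded two ways.
(defparameter *latin-1-octets* (octets-of *sample* :latin-1))
(defparameter *utf-8-octets* (octets-of *sample* :utf-8))

(format t "ISO-8859-1: ~S  ASCII-only: ~S~%"
        *latin-1-octets* (ascii-only-p *latin-1-octets*))
(format t "UTF-8:      ~S  ASCII-only: ~S~%"
        *utf-8-octets* (ascii-only-p *utf-8-octets*))
```

The single octet 225 becomes the pair 195 161 after conversion to UTF-8. Both are above 127, so a stream opened with :ASCII rejects the converted file just as it rejected the original.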
> I hope that the above gives you some ideas; the important thing is to
> understand character encodings, at which point it should all become
> blindingly obvious.
Thanks for everybody's help. I now understand this topic (a little)
better.
It still seems ... unhelpful? ... for sbcl to default to :ASCII as an
external format. You seem surprised by this. But I have the same
behavior on both a Debian box and an Ubuntu box (running different
sbcl versions), so it seems intentional.
Nonetheless, I have a workaround: I can add an :EXTERNAL-FORMAT argument
to all my LOADs and COMPILE-FILEs.
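A minimal sketch of that workaround, assuming SBCL and a source file that really is UTF-8 on disk. The temporary path and the variable *ACCENTED* are invented for the example; they are not part of asciify.lisp.

```lisp
;; Sketch only: assumes SBCL and a writable /tmp directory.
;; The path and the variable *ACCENTED* are hypothetical, for illustration.
(defparameter *demo-file* "/tmp/external-format-demo.lisp")

;; Write a one-form source file containing a non-ASCII character, as UTF-8.
(with-open-file (out *demo-file*
                     :direction :output
                     :if-exists :supersede
                     :external-format :utf-8)
  (format out "(defparameter *accented* ~S)~%" (string (code-char 225))))

;; Name the encoding explicitly instead of relying on the locale default.
(load *demo-file* :external-format :utf-8)

;; COMPILE-FILE accepts the same keyword argument.
(compile-file *demo-file* :external-format :utf-8)
```

For a file that is still stored as ISO-8859-1, as the original asciify.lisp is suspected to be, the keyword would be :latin-1 rather than :utf-8.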
Thanks, all.
-- Don
_______________________________________________________________________________
Don Geddis http://don.geddis.org/ don@...
Parents seem to think their kids are like clay, that you mould them into the
right shape when they're wet. A better metaphor is that kids are like
flexible plastic -- they respond to pressure, but when you release the
pressure they tend to pop back to their original shape. -- Bryan Caplan