Don Cohen writes:
> Pascal J.Bourguignon writes:
> > Imagine you copy your lisp program to three different systems: a
> > Macintosh system, a MS-Windows system and a unix system, and that with
> > your same lisp program on these three systems, you try to read a TEXT
> > file named example.txt transferred through FTP as a text file between
> > the three systems.
> Right, because ftp happens to be changing the file contents.
> If, on the other hand, I use scp, or ftp in binary mode,
But once you copy your files in BINARY mode, they're no longer TEXT
files. They are now binary files, and you cannot read CHARACTERS from
them, only BYTES. If you want the exact same behavior on the three
systems BYTE-FOR-BYTE, you must read and write BYTES, not LINES of
CHARACTERS.
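To make that concrete, here is a minimal sketch of reading a "line"
byte-for-byte. The file is opened as binary, and the line terminator
(ASCII LF = 10) is chosen by the program, not by the host's newline
convention, so every implementation sees exactly the same bytes:

```lisp
(defun read-ascii-line (stream)
  "Read bytes from STREAM (an (unsigned-byte 8) stream) up to LF (10).
Returns a vector of bytes, or NIL at end of file."
  (loop with buffer = (make-array 0 :element-type '(unsigned-byte 8)
                                    :adjustable t :fill-pointer 0)
        for byte = (read-byte stream nil nil)
        until (or (null byte) (= byte 10))
        do (vector-push-extend byte buffer)
        finally (return (if (and (null byte) (zerop (fill-pointer buffer)))
                            nil
                            buffer))))

;; Usage: open the file with a binary element type, so no newline or
;; character-set translation can happen behind your back:
;; (with-open-file (s "example.txt" :element-type '(unsigned-byte 8))
;;   (read-ascii-line s))
```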
> then I get the same results with the program that I propose, which
> I consider to be system independent, and different results with
> the program that I must write in clisp now, which I regard as
> system dependent. I don't mind that I *can* write a program that
> is system dependent. I just want to be *able* to write one that is
> system independent.
Then don't use CHARACTERS and CHAR-CODE/CODE-CHAR, since these
functions ARE IMPLEMENTATION DEPENDENT! That is even worse than
system dependent, since different implementations on the same system
can have different ideas of what CHAR-CODE or CODE-CHAR should
return. Even the SAME implementation can be compiled with options
giving different results, such as clisp compiled with 8-bit chars or
with unicode support!
system independent <=> BYTE, READ-BYTE, WRITE-BYTE
implementation dependent <=> CHAR, READ-CHAR, WRITE-CHAR
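If you do need characters in the program, the way to stay on the BYTE
side of that line is to carry your own encoding table instead of
trusting CHAR-CODE. A hedged illustration (ASCII-TO-CHAR and
CHAR-TO-ASCII are hypothetical helpers, not standard functions):

```lisp
(defparameter *ascii*
  " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
  "The printable ASCII characters, codes 32..126, in order.")

(defun ascii-to-char (byte)
  "Map an ASCII code (32..126) to a character, independently of CODE-CHAR."
  (when (<= 32 byte 126)
    (char *ascii* (- byte 32))))

(defun char-to-ascii (character)
  "Map a character back to its ASCII code, or NIL if not printable ASCII."
  (let ((position (position character *ascii*)))
    (when position (+ 32 position))))
```

The point of the table is that the byte<->character mapping is now
fixed by the program itself, so it is the same whether the
implementation's native character set is ASCII, Latin-1, Unicode or
EBCDIC.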
> > > How code-char/char-code get away without an encoding:
> > > I gather the point is that the encoding relates bytes on a file to
> > > characters, and code-char/char-code relates characters to integers,
> > > which evidently do NOT have to be the same integers that you would
> > > get if you read or wrote those bytes to/from a file!
> > > I assume that (= x (char-code (code-char x))) for x in the appropriate
> > > range. I've also been assuming that the ascii characters correspond
> > > to the "right" codes. That's probably all that matters to me so far.
> > You're assuming too much. The "ASCII" characters do not have the "right"
> > codes on an EBCDIC system. And EBCDIC systems are far from dead, they
> > even do web CGI on CICS...
> Ok, so if I want to use ebcdic then I should use some other encoding.
> As it turns out, the Internet protocols pretty much stick to ascii,
> so that's what I mostly want to use.
My point, and what you don't understand, is that when you are
expecting TEXT data, your program could be running on an EBCDIC system
and receive URLs and HTML textual data in EBCDIC! Of course, what
runs on the wire is always ASCII, but what code you find in core
memory can be anything the system likes. Read the standard again
and you'll see that the only thing that is prescribed is a minimum set
of characters, and that the corresponding codes can be anything as
long as some constraints are respected.
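A sketch of what those constraints actually buy you: the standard only
requires certain ORDERINGS among the standard characters' codes, never
particular values. Any conforming implementation passes these checks,
yet none of them tells you that (char-code #\A) is 65:

```lisp
;; Guaranteed by the standard's character-ordering rules:
;; digits and letters have strictly increasing codes within each group.
(assert (apply #'< (map 'list #'char-code "0123456789")))
(assert (apply #'< (map 'list #'char-code "ABCDEFGHIJKLMNOPQRSTUVWXYZ")))
(assert (apply #'< (map 'list #'char-code "abcdefghijklmnopqrstuvwxyz")))

;; By contrast, this holds on ASCII-based implementations but is NOT
;; required by the standard -- on an EBCDIC system it is false:
;; (= (char-code #\A) 65)
```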
You should really study the whole chapter 2 and chapter 13 of CLHS.
There is no worse tyranny than to force a man to pay for what he doesn't
want merely because you think it would be good for him.--Robert Heinlein