Re: Another request - character streams without newline translation

  Bruno Haible
  Don Cohen writes:
  > It seems to me that this would be easy to implement - a new line
  > terminator mode that reads CR's as CR's and LF's as LF's - I don't
  > care what #\newline is.

  It would be easy, yes, but what does it bring you or any other user?

  > I have files containing strings with these characters.  The files are
  > to be read with read.  They might have things like
  > (123 "asdf^M
  > zxcv")
  > 
  > (where the ^M is ascii 13 and the newline is ascii 10)
  > 
  > I want to keep those characters.

  Why? What is this strange text/data which would be corrupted if ^Ms
  are translated to #\Newlines in the same way as ^M^J? This would be
  text which cannot be ported from Unix to Windows or vice versa.

In this case, there's a client who sends me this data, I store it, and
I send it back when he asks for it.  If he sends me CR's he wants to
get them back.  In fact he's actually looking for them.
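
To make the round-trip problem concrete, here is a minimal sketch in
Python (purely an illustration, not clisp code).  The payload is the
example from above; the helper names read_translated and read_preserved
are made up for this sketch.  It only shows how a translating character
stream loses the CR while a non-translating one keeps it.

```python
import os
import tempfile

# The client's payload: a CR (ASCII 13) followed by an LF (ASCII 10)
# inside a string literal.
PAYLOAD = b'(123 "asdf\r\nzxcv")'


def read_translated(path):
    # Default text mode: CR, LF and CR LF are all reported as "\n".
    with open(path, "r", encoding="ascii") as f:
        return f.read()


def read_preserved(path):
    # newline="" switches the translation off, so CR and LF come back
    # exactly as they were stored.
    with open(path, "r", encoding="ascii", newline="") as f:
        return f.read()


def demo():
    # Store the payload as raw bytes, then read it back both ways.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(PAYLOAD)
        return read_translated(path), read_preserved(path)
    finally:
        os.remove(path)


translated, preserved = demo()
```

With translation on, the client would get back a different string from
the one he sent; with it off, the bytes survive the round trip.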

  Isn't it really binary data, possibly with embedded text?

To me this whole distinction between binary and [what's the name of
the alternative?] data is not so clear cut.  You can look at it
however you like.  From my point of view you're preventing me from
porting my data between Windows and Unix.

  I'm recalling Gilbert's suggestion to provide functions for converting
  an array of bytes to a string (relative to a given encoding) and vice
  versa. After looking at Python, Java, and P*rl, I'm now convinced such
  a facility is missing in clisp. Would it help you in this case?

That conversion, of course, already carries a pretty significant cost.
It's the same one I pay now: in order to read from a file I have to
read the entire file into a data structure and then convert it.  Right
now at least I get to convert it one byte at a time.  What would be
much more friendly, I think, would be to allow READ (and related
functions) to read from binary streams.  You could even pass the
encoding.  But in that case I'd still want to be able to pass an
encoding that preserves the CR's and LF's.
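
Roughly what I have in mind, sketched in Python rather than clisp (so
none of this is an existing clisp interface): a character stream is
layered over a binary stream, the encoding is passed in, and newline
translation is turned off.  The names char_stream and read_first_string
are hypothetical, and read_first_string is only a toy stand-in for READ
that pulls out the first string literal one character at a time.

```python
import io


def char_stream(binary_stream, encoding="latin-1"):
    # Layer a character stream over a byte stream.  newline="" disables
    # newline translation, so CR and LF pass through untouched.
    return io.TextIOWrapper(binary_stream, encoding=encoding, newline="")


def read_first_string(chars):
    # Toy stand-in for READ: return the contents of the first "..."
    # literal, consuming the stream one character at a time.
    while True:
        c = chars.read(1)
        if c == "":
            raise EOFError("no string literal found")
        if c == '"':
            break
    out = []
    while True:
        c = chars.read(1)
        if c == "":
            raise EOFError("unterminated string literal")
        if c == "\\":
            # A backslash quotes the next character.
            nxt = chars.read(1)
            if nxt == "":
                raise EOFError("unterminated string literal")
            out.append(nxt)
        elif c == '"':
            return "".join(out)
        else:
            out.append(c)


# The same example as before, held as raw bytes.
data = io.BytesIO(b'(123 "asdf\r\nzxcv")')
text = read_first_string(char_stream(data))
```

Nothing here reads the whole file into a data structure first; the
reader pulls characters as it needs them and the CR and LF both end up
in the resulting string.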