From: <do...@ni...> - 2000-05-12 18:47:14
Bruno Haible
Don Cohen writes:
> It seems to me that this would be easy to implement - a new line
> terminator mode that reads CR's as CR's and LF's as LF's - I don't
> care what #\newline is.

It would be easy, yes, but what does it bring you or any other user?

> I have files containing strings with these characters. The files are
> to be read with read. They might have things like
> (123 "asdf^M
> zxcv")
>
> (where the ^M is ascii 13 and the newline is ascii 10)
>
> I want to keep those characters.

Why? What is this strange text/data which would be corrupted if ^Ms are translated to #\Newlines in the same way as ^M^J? This would be text which cannot be ported from Unix to Windows or vice versa.

In this case, there's a client who sends me this data, I store it, and I send it back when he asks for it. If he sends me CR's he wants to get them back. In this case he's actually looking for them.

Isn't it really binary data, possibly with embedded text?

To me this whole distinction between binary and [what's the name of the alternative?] data is not so clear cut. You can look at it however you like. From my point of view you're preventing me from porting my data between windows and unix.

I'm recalling Gilbert's suggestion to provide functions for converting an array of bytes to a string (relative to a given encoding) and vice versa. After looking at Python, Java, and P*rl, I'm now convinced such a facility is missing in clisp. Would it help you in this case?

This, of course, is already a pretty significant cost. It's the same one I pay now. For instance, in order to read from a file I have to read the entire file into a data structure and then convert it. Right now at least I get to convert it one byte at a time.

What would be much more friendly, I think, would be to allow READ (and related functions) to read from binary streams. You could even pass that encoding. But in that case I'd still want to be able to pass an encoding that preserves the CR's and LF's.
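
For illustration only, here is a minimal sketch in Python (one of the languages named above as already having such a facility) of the two ideas discussed in this thread: converting a byte array to a string relative to an encoding, and reading through a text stream with and without line-terminator translation. The sample payload and the helper names `bytes_to_string` and `string_to_bytes` are made up for this example; nothing here describes clisp's behaviour.

```python
import io

# Hypothetical payload: a datum whose string contains CR (13) then LF (10).
raw = b'(123 "asdf\r\nzxcv")'


def bytes_to_string(data: bytes, encoding: str = "latin-1") -> str:
    """Decode a byte array relative to an encoding; CR and LF pass through."""
    return data.decode(encoding)


def string_to_bytes(text: str, encoding: str = "latin-1") -> bytes:
    """Inverse conversion, so the client gets back exactly what was sent."""
    return text.encode(encoding)


# Plain conversion never touches line terminators.
text = bytes_to_string(raw)
round_tripped = string_to_bytes(text)

# A text stream with default settings translates CR LF to a single newline...
translated = io.TextIOWrapper(io.BytesIO(raw), encoding="latin-1").read()

# ...while newline="" asks the stream to hand the characters back untranslated.
preserved = io.TextIOWrapper(
    io.BytesIO(raw), encoding="latin-1", newline=""
).read()
```

In this sketch the line-terminator handling is a property of the stream, separate from the character encoding, which is roughly the kind of "mode that reads CR's as CR's and LF's as LF's" being asked for.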