1 - I guess I've got to retract what I said about most Lisps handling ~CRLF in FORMAT, since I just tried "most Lisps" and, in fact, found that only CCL does this. (Most Lisps, for my purposes, are Windows versions of Clozure Common Lisp (CCL), Lispworks, Franz ACL, CLisp, and SBCL.) CCL handles ~CRLF by virtue of treating ~<Return> as a sort of extended directive that is equivalent to ~<Newline>. I think that's a nice extension to Common Lisp. But it's not really the right place to treat the original problem.
2 - I now view the original problem a little differently. It's not that ~CRLF should be treated specially in Lisp code. It's that ~CRLF should not normally be present in Lisp format strings at all, i.e., in the usual case of Lisp strings created by normal reading through LOAD or COMPILE-FILE. Such Lisp code normally resides in plain text files stored on the local file system with line-ending conventions per the local operating system. My original complaint was:
> If you have code that looks like
> (format stream "foo ... ~
> it compiles fine in emacs/slime, but when you save it out and
> compile it in SBCL, you get warnings about how ~^M is not a
> valid format
Now I understand that the problem really is more general than handling ~CRLF. What's going on here is that code like the above gets saved out in plain text files on Windows, which means the files have CRLFs at the ends of lines. So the problem is just that LOAD and COMPILE-FILE do not handle line-ending conventions correctly on Windows. When a line ending appears in a string to be read via LOAD or COMPILE-FILE, the resulting string object in Lisp should represent that line ending with the single character #\Newline. SBCL fails to do this on Windows. (So does CCL, by the way.) This is a bug. A very clear description of the requirements and associated problems is in CLtL2, section 2.2.2. "Line Divisions", online here:
3. To test if your Lisp has the bug, write the following code into a Lisp file in a plain text editor (e.g., Notepad) on Windows and then LOAD the resulting file.
> (format t "~:c" (char "
> " 0)) ; Should print Newline
It should print Newline. Otherwise, it's a bug (in which case, you're probably going to see it print Return). On SBCL it prints Return. Same with CCL, by the way. CLisp, ACL, and Lispworks print Newline, which is correct.
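If you'd rather not depend on what your editor writes out, here is a minimal sketch of the same test that forces the CRLF bytes into the file itself. The function names and the file name crlf-test.lisp are made up for illustration, and it uses READ on a file stream opened with the default external format, on the assumption that LOAD reads the file the same way.

```lisp
;; Hypothetical helper: write raw octets to PATH, bypassing any
;; line-ending translation the implementation might do on output.
(defun write-bytes-to-file (path bytes)
  (with-open-file (out path :direction :output
                            :element-type '(unsigned-byte 8)
                            :if-exists :supersede)
    (write-sequence bytes out))
  path)

;; Hypothetical probe: the file holds exactly  " CR LF "  i.e. a string
;; literal whose only content is a Windows-style line ending.
(defun first-char-of-crlf-string (path)
  (write-bytes-to-file path (vector 34 13 10 34))
  (with-open-file (in path)            ; default external format (assumption:
    (char (read in) 0)))               ; LOAD would use the same one)

;; Prints the name of the first character of the string that was read.
(format t "~:c~%" (first-char-of-crlf-string "crlf-test.lisp"))
```

As with the Notepad version, an implementation that handles the Windows convention should print Newline here.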
4. There are two ways to approach this problem.
(1) Interpret any of the following sequences as Newline: CR, LF, or CRLF (and for about the same money, throw in LFCR). This might be called the liberal approach. This is the approach recommended in CLtL2 (see ref. above). It's also consistent with the Robustness Principle (http://en.wikipedia.org/wiki/Robustness_principle), which is widely believed in the IETF community to apply to the handling of the mess that is line-ending conventions in the modern computer world.
(2) Extend your implementation's definition of EXTERNAL-FORMAT so that it allows different line-ending conventions, and create such a format appropriate for each platform (e.g., CRLF convention on Windows), and for sure have that format be the default on Windows.
I think both approaches are fine. And they're not in conflict with each other. External formats, when used for input, should ideally support the approach in (1), that is, a line-ending convention that's "any". (It looks like Franz ACL supports this with :E-CRLF external format.)
But you could do just (1) or just (2) to fix the bug with LOAD and COMPILE-FILE with respect to strings in Lisp plain-text source files. Doing (2) lets users get access to it in places besides LOAD and COMPILE-FILE, i.e., anywhere :external-format args are supported.
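To make the liberal approach in (1) concrete, here is a rough sketch of the translation applied to a string that has already been read. The function name normalize-line-endings is my own, not anything an implementation provides, and a real fix would do this at the stream level rather than after the fact.

```lisp
(defun normalize-line-endings (string)
  "Return a copy of STRING in which each CR, LF, CRLF, or LFCR is one #\\Newline."
  (flet ((line-end-p (ch)
           (or (char= ch #\Return) (char= ch #\Linefeed))))
    (with-output-to-string (out)
      (let ((i 0)
            (n (length string)))
        (loop while (< i n)
              do (let ((c (char string i)))
                   (cond ((line-end-p c)
                          (write-char #\Newline out)
                          ;; A two-character ending (CRLF or LFCR) counts once:
                          ;; skip the partner if it is the *other* line-end char.
                          (when (< (1+ i) n)
                            (let ((d (char string (1+ i))))
                              (when (and (line-end-p d) (char/= c d))
                                (incf i)))))
                         (t (write-char c out)))
                   (incf i)))))))
```

For example, (normalize-line-endings (format nil "foo~c~cbar" #\Return #\Linefeed)) gives back a 7-character string with a single #\Newline in the middle. Note that accepting LFCR makes some inputs ambiguous: LF CR LF comes out as two line endings, which may or may not be what the writer meant.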
5. Anyhow, in terms of SBCL, it has external formats documented here:
There, it's noted that an external format can be a list whose car is a charset name and whose cdr is a plist (of which, currently, only :REPLACEMENT is a supported key). The obvious way forward is to define a new plist key, say, :line-ending-convention, with one possible value corresponding to the Windows/DOS ("CRLF") convention (and ideally, also, something with the effect of :ANY, i.e., that implements the "liberal" approach (1) above for input).
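Just to show the shape of such a designator, here is a small sketch in portable Common Lisp. Both the :line-ending-convention key and the function external-format-line-ending are hypothetical; nothing here calls into SBCL's actual external-format machinery, it only picks the proposed key out of the plist.

```lisp
;; Hypothetical accessor for the proposed key.  EXTERNAL-FORMAT is either
;; a bare charset name (e.g. :utf-8) or a list (charset . plist).
(defun external-format-line-ending (external-format &optional (default :lf))
  (if (consp external-format)
      (getf (rest external-format) :line-ending-convention default)
      default))

;; A designator of the proposed form, alongside the existing :replacement key.
(external-format-line-ending
 '(:utf-8 :replacement #\? :line-ending-convention :crlf))
```

A bare charset name such as :utf-8, or a list without the new key, falls back to the default, so existing designators would keep their current meaning.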
Then, just make such an external format be the default on Windows, i.e., such that LOAD or COMPILE-FILE end up using such a format when processing Lisp files. I recommend this be done as soon as possible for SBCL (and CCL, by the way). This could be the way to resolve the following SBCL bugs:
Stas Boukarev wrote:
> "Mark H. David"<mhd@...> writes:
>> Seems like a bug, or overly pedantic, to treat ~CRLF differently than ~LF. I've not seen it done in any other Lisp that I can recall.
> SBCL doesn't treat CRLF at all.