| Geez. That's twice misunderstood.
Sorry for avoiding the topic. This one is a
pain in the arse. There are two options I
know of, and each one has its problems.
In most languages (C, Python, Java), \n is line feed LF 0x0A,
\r is carriage return CR 0x0D, \r\n is the carriage return line
feed pair, CRLF 0x0D 0x0A. Thus, these are our escape sequences.
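To make the byte values concrete, here is a small Python check. Python is
just one of the languages named above, and the constant and function names
are mine, not anything from the spec.

```python
# Escape sequences and the code points they denote.
LF = "\n"      # line feed, 0x0A
CR = "\r"      # carriage return, 0x0D
CRLF = "\r\n"  # carriage return followed by line feed


def code_points(s):
    """Return the code points of s as a list of ints."""
    return [ord(ch) for ch in s]


print([hex(n) for n in code_points(CRLF)])
```

The same values should hold for the C and Java escapes.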
Furthermore, the specification currently says that CRLF pairs
and lone LFs both come out as LF when reading. Hence, in your
example, the line breaks would, according to the specification,
only be LF and not CRLF as you would prefer.
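Roughly, the reading rule amounts to something like the sketch below. It
is not the spec's wording, only my reading of it, and it assumes the input
uses nothing but CRLF and LF; normalize_on_read is a made-up name.

```python
def normalize_on_read(text):
    """Fold each CRLF pair into a single LF; lone LFs pass through."""
    return text.replace("\r\n", "\n")


# A CRLF-terminated line and an LF-terminated line look the same after loading.
loaded = normalize_on_read("first\r\nsecond\nthird")
```

After this step the loader has no way to tell which convention the file
used, which is exactly why you do not get your CRLF back.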
Note that this convention was copied from the XML specification,
so it is not "new" behavior. I've not heard of Perl people
having trouble with this behavior, so I figured adopting it
from XML without too much question wouldn't be such a bad idea.
However, for the record, IBM fellas have the same gripe...
They have NEL (0x85 unicode) which is similar to CRLF. When
it gets normalized to LF everything gets screwed up for them.
Thus, they can round-trip NEL or have readability, but not both.
Right now, the normalization to LF makes things pretty
easy in most cases, and still enables round-tripping,
since \r\n or \0x85 can be used in escaped scalars to
store specific carriage return requirements. However,
it doesn't give the preferred behavior you described.
We can fix this, but then we give up the ability
to round-trip various forms of new lines using
escape sequences. Here is a proposal. Let NL stand
for new line and be one of LF, CRLF, NEL depending
on the platform. Then, when loading, LF, CR, CRLF,
NEL will all be converted to NL. Furthermore, we
drop \r from the escape sequences and only have \n
which is synonymous to NL. Also, to be on the safe
side, the hex equivalents of LF, CR, CRLF, and NEL
become errors since they will not round-trip.
This allows YAML line endings to work as expected
for all platforms and allows for line-endings to
be automagically converted to their appropriate
version on different platforms.
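A minimal sketch of what loading would look like under this proposal,
assuming the loader is simply told which NL the platform wants. The
function name and the nl parameter are hypothetical, not part of any
existing implementation.

```python
import re

# Any of CRLF, CR, LF, or NEL (U+0085). CRLF is listed first so the
# pair is consumed as one line break rather than two.
_LINE_BREAK = re.compile(r"\r\n|\r|\n|\x85")


def load_with_floating_nl(text, nl="\n"):
    """Convert every recognized line break in text to the platform NL."""
    # A function is used as the replacement so nl is inserted verbatim.
    return _LINE_BREAK.sub(lambda _match: nl, text)


mixed = "a\r\nb\rc\nd\x85e"
on_unix = load_with_floating_nl(mixed, nl="\n")
on_windows = load_with_floating_nl(mixed, nl="\r\n")
```

All four conventions collapse to whatever NL is, so the same file loads
"natively" everywhere.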
The difficulty with this proposal is that it
prevents applications from distinguishing
between CR, LF, CRLF, or NEL. If they could
distinguish, then we could not normalize.
For example, in Windows, people often use
LF for a "soft" return that doesn't start
a new paragraph (where CRLF does). Thus,
this particular application distinguishes
between these two characters. Furthermore,
as our spec stands, one can completely
store a binary value using our escaped
scalar... with this proposal, we give up
the ability to store arbitrary binary data
in this manner.
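Here is a small illustration of what gets lost, assuming a normalizer that
folds CRLF, CR, LF and NEL into one character. The normalize function and
the sample strings are made up for the example.

```python
import re


def normalize(text, nl="\n"):
    """Fold CRLF, CR, LF and NEL into a single new line form."""
    return re.sub(r"\r\n|\r|\n|\x85", lambda _match: nl, text)


# LF as a "soft" return, CRLF as a paragraph break.
soft_and_hard = "line one\nstill paragraph one\r\nparagraph two"
flattened = normalize(soft_and_hard)

# Binary-looking data that happens to contain CR and LF values.
binary_like = "\x00\r\n\xff\r"
damaged = normalize(binary_like)
```

In flattened the soft and hard returns are indistinguishable, and damaged
no longer matches the original data.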
Unfortunately, we can't have it both ways. Either
we pick a "basis", such as LF and use it as the
standard new line representation, allowing escaping
of other new line conventions. Or we let the
new line representation "float" -- in this
case, round-tripping of the new line characters
between platforms using different line endings
will not be reliable, and thus escaping of new
lines should be forbidden.
I hope this helps explain the compromise.
I've opted for LF standardization, so that
escaping will round-trip as expected.
So, using the spec as written, you'd have to add
the \r yourself in your program. Or, you could
use escaped scalars and use \r\n at the end of each
line to add the carriage returns explicitly. Neither
option is great, but this is what is required... and
by the way, it is how XML works.
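The first option, adding the \r in your own program, could look something
like this. It assumes the loaded text is already LF-normalized, so there
are no CRLF pairs left to double up; to_crlf is just a name I picked.

```python
def to_crlf(text):
    """Turn every LF in LF-normalized text into a CRLF pair."""
    return text.replace("\n", "\r\n")


windows_text = to_crlf("first\nsecond\n")
```

If the text could still contain CRLF pairs, you would want to normalize
them to LF first; otherwise each pair would end up with two carriage returns.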
Or... if we went with the proposal above, there
would be no way to encode binary values which
happened to have \r, \n in them using the
escaped scalar. I guess I'm open to either
proposal. Given that we will natively support the
base64 encoding, I don't mind the normalization.
So... I think it comes down to the use cases,
which ones are more popular. I hope this
explained it well... sorry for all of the