From: Clark C . E. <cc...@cl...> - 2001-12-02 15:19:23
|
| About the rest (NEL, PS and LS) - there's a lovely little document in
| http://www.unicode.org/unicode/reports/tr13/ which describes what we (or
| anyone else) should be doing

Great. So CR, LF, CRLF, and NEL are all normalized to LF.

| Outside scalars we throw away the line break characters anyway
| so there's no issue of what PS/LS should map to.
|
| Inside text scalars I suggest that we never convert PS/LS into LF or fold them
| into a space. If someone is using them presumably he has a good reason to,
| and he's aware that notepad wouldn't handle it well.

Offhand, I think we should normalize PS/LS just like the others,
unless, of course, they are escaped. ...

The problem with the scalar treatment is the edge case.
(PS = line ended using PS instead of CR/LF/CRLF/NEL)

  one: bing PS
  two: \\PS
  bop PS
  foo PS
  three: bar PS

So, if we treat them differently inside and outside, bing works.
However, bop and foo become a single one-line thingy. Now,
what happens to bar? Is it part of the map "two"? ...

Clark
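
For illustration only, here is a minimal Python sketch of the two policies being weighed in this thread: normalizing only CR, LF, CRLF and NEL to LF, versus normalizing LS and PS as well. The function name normalize_breaks and the include_ls_ps flag are hypothetical, not part of any YAML implementation. The sketch does not handle the "unless they are escaped" case, since the escape syntax is not spelled out here.

```python
import re

# Line breaks the thread agrees on: CRLF, CR, LF and NEL (U+0085).
# CRLF comes first so it is consumed as one break rather than two.
_BASIC_BREAKS = re.compile("\r\n|\r|\n|\u0085")

# The same set plus LS (U+2028) and PS (U+2029), the contested pair.
_ALL_BREAKS = re.compile("\r\n|\r|\n|\u0085|\u2028|\u2029")


def normalize_breaks(text, include_ls_ps=True):
    """Replace every recognized line break in text with a single LF.

    include_ls_ps=True models "normalize PS/LS just like the others";
    include_ls_ps=False models "never convert PS/LS".
    """
    pattern = _ALL_BREAKS if include_ls_ps else _BASIC_BREAKS
    return pattern.sub("\n", text)


sample = "one: bing\u2029two: x\r\nthree: bar\u0085"
print(repr(normalize_breaks(sample)))
print(repr(normalize_breaks(sample, include_ls_ps=False)))
```

With include_ls_ps=False the PS after "bing" survives untouched, so a line-oriented parser that splits on LF would see "one: bing" and "two: x" as a single line. That is the same kind of ambiguity as the bop/foo/bar edge case.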