From: Thomas W. <wi...@ac...> - 2005-12-29 17:15:11
Sean,

On Dec 28, 2005, at 10:44 PM, Sean Parent wrote:

> The parsers are all line ending agnostic. For the Begin app - I'd
> assume (I haven't looked at that part of the code) it converts into
> platform line endings on the way in - and probably leaves them that
> way.

Hmm… to me this seems like a complicated and brittle way to deal with
this. Let me explain…

> On Windows this would convert /n to /r/n - which may be what you
> are seeing?
> I _really_ dislike any iostream libraries which try to auto-convert
> line endings

IIUC this is what every standard-conforming iostream library does by
default. Well, to be precise, it is what it asks the OS to do. I.e.
the OS is responsible for the mapping between the in-memory
representation and the on-disk representation of file contents. On
Windows this will mean mapping "\r\n" to "\n". I haven't seen a
platform where the in-memory representation uses anything but a
single '\n' for line endings. Come to think of it, does the C
standard actually specify this?

Given this, there seem to be two strategies to deal with this.

a) Require every file to be in the platform's native encoding and let
the OS do the work. The program logic will always assume line endings
to be '\n'.

b) Try to detect the line ending encoding on input and do the mapping
"native-encoding" -> '\n' yourself on input (experience shows that
this is complicated and a frequent source of bugs). The rest of the
program logic will still assume line endings to be '\n'. When writing
the file you can write either the original encoding (only if you
remembered it) or the native encoding.

In my experience b) is rarely worth the added complexity, but this
pretty much depends on your use case. I've frequently used a) in
cross-platform projects without problems, even with users working on
two platforms simultaneously.

To get back to my first sentence.
To me the key part of both solutions is to only use the "native" C++
encoding '\n' internally, as it simplifies the code and ensures
interoperability with third-party code like widget libraries.

What do you think?

> - Often these will just do things like convert /r to /n... which
> would double your line endings. We'll look into it.
>
> I did fix some code related to line endings awhile ago - are you
> running the 1.0.11 release?

Unless I am missing something it's CVS Head.

Regards

Thomas

Thomas Witt
wi...@ac...
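
For illustration, strategy b) from the message above might look roughly like the sketch below. It assumes the text was read in binary mode, so the runtime has not translated anything yet. Both function names, normalize_line_endings and naive_cr_to_lf, are hypothetical and are not taken from ASL or any other library. The second function shows the failure mode mentioned in the quoted text, where mapping every '\r' to '\n' doubles the line endings of a "\r\n" file.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: map "\r\n", a lone '\r', and '\n' to a single '\n'.
// Assumes `in` holds the raw bytes of the file (opened in binary mode).
std::string normalize_line_endings(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '\r') {
            // Treat "\r\n" as one line ending: skip the '\n' that follows.
            if (i + 1 < in.size() && in[i + 1] == '\n') ++i;
            out += '\n';
        } else {
            out += in[i];
        }
    }
    return out;
}

// Hypothetical counter-example: replacing each '\r' with '\n' on its own
// turns "\r\n" into "\n\n", i.e. it doubles the line endings.
std::string naive_cr_to_lf(std::string s)
{
    std::replace(s.begin(), s.end(), '\r', '\n');
    return s;
}
```

After a step like this the rest of the program can assume '\n' everywhere. Writing the file back in its original encoding would additionally require remembering which form was seen on input, which this sketch does not do.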