From: Clark C . E. <cc...@cl...> - 2002-06-25 00:50:27
I'm forwarding this and another thread to the list. My apologies that
it didn't make it directly.

----- Forwarded message from Oren Ben-Kiki <or...@ri...> -----

From: Oren Ben-Kiki <or...@ri...>
To: "'Clark C . Evans '" <cc...@cl...>
Cc: 'Brian Ingerson ' <in...@tt...>
Subject: RE: [Yaml-core] meeting minutes 20 jun 2002
Date: Fri, 21 Jun 2002 12:01:05 -0400

I just spent a bit of time thinking about the simple scalar (from a
productions point of view). Combined with your rationale:

> summary: >
>   unquoted flow scalars used as keys and used inside a nested
>   collection will not be allowed to span multiple lines.
> rationale: >
>   Doing otherwise isn't very clear or pleasing to the eye.
>   Furthermore, it complicates comment handling. For example...
>
>   --- [ "Quoted string # this is not a comment, it is content
>         that continues on the second line" # this is a comment,
>         Unquoted strings # comment? or no?
>         that continue are a bit more problematic # another
>
>   The ambiguity above caused by third line above, "# comment? or no?"
>   caused ambiguity which would be hard to explain. Thus the
>   limitation of unquoted in-line scalars to a single line.
> note: >
>   How to address this comment ambuguity within the top-level flow
>   (non nested) was not discussed; Clark thinks off-hand that the
>   comment should be allowed to trail the flow scalar, but not
>   appear in the middle.

I think this is wrong. Following is my case for it.

We start with agreeing on the following:

    this: should be legal # comment
    and: "this" # comment
    or: 'this' # comment

We got good feedback on this. This is "the thin point of the wedge".
Once we start going this way, it is impossible to stop and be
consistent. You have to go all the way.

Now, in in-line, we agree we should allow:

    this: [ '123', # comment
            "456", # comment
            789, # comment
            etc. ]

I think so far this is rather uncontroversial.

Moving on, we agree that unquoted keys should always be single line.
So, we inevitably have to allow:

    this: { key: # comment
              value, # comment
            and: # comment
              so on. }

Next, in in-line collections, anywhere there's a space, we are allowed
to wrap to the next line. Especially outside the scalar value.
Therefore we allow:

    this: [ value
          , value , value
          , value , value ]

So we must allow:

    this: [ value # comment
          , value , value # comment?
          , value , value ]

On the face of it, there's no reason to say it isn't legal. It is
almost exactly the same as:

    this: value # comment

Note that we allow optional space between a key and the ':'. This
means that:

    this: { key # comment
            : value }

would be legal. Not too pretty, I guess, but acceptable.

OK. So far, all is "reasonable". Now the problems.

First, with the multi-line top-level form. I think people would expect
this to work:

    this: one line # comment
    and: simple # comment
         flow value

Disallowing it, I believe, would make things very surprising to
people. You suggested that they only be able to write:

    this: one line # comment
    and: simple
         flow value # comment

I think this would be "surprising" to people and hence we should be
more liberal there (being more DWIM).

The second problem is what happens to values in in-line collections.
If this is OK:

    this: [ value # comment?
          , value
          , "flow
             value" ]

then I think people would be surprised to learn that:

    this: [ flow
            value ]

isn't legal. (Note I'm not promoting multi-line unquoted keys; I'm
talking about values.)

You have pointed out that:

    this: [ confusing because # comment
            this, "isn't a # comment
            text", but this is # comment
          ]

I concur that this is surprising. But I also think that these would
be:

    this: [ "not a # comment text" ] # comment
    this: > # comment
      text is not a # comment
      great example of # comment
      readability. # comment

Note the above would be confusing even if one used a '|' instead of
'>'. It is deliberately written to confuse the reader.
On the other hand, I don't see this as confusing:

    this: [ simple # comment
            flow, # comment
            "tel # 1 800 123456", # comment
            etc. ] # comment

It is all in the formatting, really.

Summary:

- I think the least confusing overall set of rules would be that
  comments are allowed anywhere where they are not "protected
  against", meaning inside quotes or a nested block.

- I definitely think this should apply to top-level multi-line
  unquoted flow values (as I've demonstrated above).

- Disallowing in-line multi-line flow values does not prevent cases
  where "what looks like a comment isn't". All such cases are in
  quoted/protected scalar forms, and there are many of these (not just
  multi-line quoted forms in in-line collections).

- The only way to consistently eliminate such confusion is to return
  to our original rule (comments can't trail lines, they can only be
  standalone lines). However, that is too restrictive (we *want*
  trailing comments).

- Taken together, I think there's no reason to limit in-line values to
  a single line and I hereby move that they be restored.

Have fun,

    Oren Ben-Kiki
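
For illustration only: the rule proposed in the summary (a '#' starts a comment unless it is "protected" by quotes) can be sketched in a few lines of Python. This is not taken from any YAML draft or implementation. The function name `strip_comment` is hypothetical, and the sketch assumes that a comment '#' must be the first character or follow whitespace, that a quote only opens at the start of a token, and that each line is examined on its own (no escapes, no multi-line quotes, no nested blocks).

```python
def strip_comment(line: str) -> str:
    """Return `line` with any unprotected trailing comment removed.

    Assumptions (not from the original message): '#' begins a comment
    only outside quotes and only at line start or after whitespace;
    a quote character opens a quoted scalar only at the start of a
    token, so the apostrophe in "isn't" is plain text.
    """
    quote = None  # quote character we are currently inside, if any
    for i, ch in enumerate(line):
        prev = line[i - 1] if i > 0 else " "
        if quote:
            if ch == quote:
                quote = None  # closing quote ends the protected region
        elif ch in "\"'" and (prev.isspace() or prev in "[{,"):
            quote = ch  # opening quote protects any following '#'
        elif ch == "#" and prev.isspace():
            return line[:i].rstrip()  # unprotected '#': drop the comment
    return line.rstrip()


examples = [
    'this: should be legal # comment',
    '"tel # 1 800 123456", # comment',
    'this: [ "not a # comment text" ] # comment',
]
for text in examples:
    print(strip_comment(text))
```

Under these assumptions the '#' inside the quoted telephone number survives, while the trailing "# comment" on each line is dropped. A real parser would also have to carry the quote state across lines, which is exactly where the multi-line cases discussed above come in.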