RE: [Yaml-core] Re: New Spec Version


Clark C. Evans [mailto:cc...@cl...] wrote:
> | There's also:
> | 
> | D) Do not allow comments inside nested leaf values.
> |    Allow them *following* a nested leaf value.
> 
> I left D out because it does not solve the unbounded
> look-ahead problem.  Comments are easily discarded,
> the problem has not much to do with comments; it has
> to do with temporary buffer of new lines till the
> end of the scalar is found.  This buffer is required
> even in cases where a comment isn't present.

Sorry, I wasn't clear about this proposal. The idea is
that *everything* up to the first '#' is content.
Chomping would only apply to the very final line break:

> --- ]-
>  This is a folded scalar that is chomped.
> 
> 
> 
>  Thought the scalar had ended... didn't ya?

Nope. I haven't seen a less-indented line yet.

>  You had to store those three New Lines in
>  a temp buffer since you didn't know this
>  paragraph was on its way.

Nope. They were content...

> 
> 
> 
> # The last three lines have to be tossed.

No, only the very final line break is tossed.
The above value would end with just two line
breaks. If it wasn't chomped it would have
ended with three.

> # And this has *nothing* to do with the
> # comment here...

No, the final line break can be tossed as
soon as the parser sees the '#' character.

> # ... But, since this may not
> # have been comment, they'd have to be
> # stored in a buffer... just in case.
> ...

No, since this '#' is known to be a comment
(it is less indented), no lookahead buffer
is required.
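
To make the "no lookahead buffer" point concrete, here is a
minimal Python sketch of reading a nested leaf value under (D).
It is an illustration only: the name leaf_chunks, its parameters,
and the decision to ignore folding entirely are assumptions made
for the example, not part of any spec draft. Every line is either
emitted at once as content or ends the value, and the only thing
ever held back is a single pending line break.

```python
def leaf_chunks(lines, indent, chomp):
    """Yield the pieces of a nested leaf value, one input line at a time.

    Hypothetical helper: `lines` are the lines following the indicator,
    without their line breaks; `indent` is the indentation of the value.
    Folding is deliberately not modeled.
    """
    pending = ""  # at most one held-back line break, never a run of them
    for line in lines:
        line_indent = len(line) - len(line.lstrip(" "))
        if line.strip() and line_indent < indent:
            # First non-empty, less-indented line (e.g. a '#' comment):
            # the value is over, and no buffered lines need revisiting.
            break
        # Empty lines are content, exactly like any other line.
        yield pending + line[indent:]
        pending = "\n"
    if not chomp:
        # Unchomped values keep the very final line break.
        yield pending


example = [" first", "", "", " second", "", "", "# comment"]
chomped = "".join(leaf_chunks(example, indent=1, chomp=True))
unchomped = "".join(leaf_chunks(example, indent=1, chomp=False))
```

In this sketch the chomped and unchomped results differ by exactly
one trailing line break, whatever the number of empty lines before
the comment.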

What (D) doesn't give you is the ability to chomp a
series of trailing line breaks. If you want that,
you'd have to write them this way:

key: ]
     some value, empty lines

     are always content lines.
#

# empty lines are comments after
# the first explicit comment line.
...

Note that the explicit comment line may be a single
'#' indicating the end of the value. The '#' works
as a sort of "close block" indicator - it makes
visual sense, actually, as demonstrated by the above.

> | That said, C is an interesting notion...
> 
> BTW, Option "C" aligns with our notion of
> a double NL in a folded scalar becoming a
> single NL.   Brian?  How are you on option "C"?

Well, chomping also applies to blocks... but
it is a good point that a series of NL is
already a special case in folded, so giving
it a special case in a block isn't that bad.

I still think option (D) is better - the file
is more "structured": one can break it into
a sequence of "chunks", where each chunk is
either a complete node (without the content,
for a branch node) or a comment... That would
make the syntax model that much simpler, and
hence make editing tools easier. Allowing
content lines to mix with comment lines seems
so *messy*, somehow...
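
As a rough illustration of the "chunks" idea, the Python sketch
below splits a file into node chunks and comment chunks. It rests
on simplifying assumptions that are not in the proposal itself:
a comment is any line starting with '#' in column 0, a node starts
at any other non-empty unindented line, and the name split_chunks
is hypothetical. Empty lines simply join whatever chunk is open,
so they are content after a value and comment after a '#'.

```python
def split_chunks(lines):
    """Group lines into (kind, lines) chunks; kind is 'node' or 'comment'."""
    chunks = []
    for line in lines:
        open_kind = chunks[-1][0] if chunks else None
        if line.startswith("#"):
            # A comment line extends an open comment chunk or starts one.
            kind, extend = "comment", open_kind == "comment"
        elif not line.strip():
            # Empty lines belong to whichever chunk is currently open.
            kind, extend = open_kind or "comment", open_kind is not None
        else:
            # Indented lines continue an open node; anything else starts one.
            kind, extend = "node", line.startswith(" ") and open_kind == "node"
        if extend:
            chunks[-1][1].append(line)
        else:
            chunks.append((kind, [line]))
    return chunks


document = [
    "key: ]",
    "     some value, empty lines",
    "",
    "     are always content lines.",
    "#",
    "",
    "# empty lines are comments after",
    "# the first explicit comment line.",
]
result = split_chunks(document)
```

On the "key: ]" example this gives two chunks, a node followed by
a comment, with the lone '#' acting as the boundary between them.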

In our tradition of erring on the side of
human convenience rather than processing
simplicity, I'm willing to go with C if
both of you feel it is worth the complexity.

Have fun,

	Oren Ben-Kiki