On Tue, May 21, 2002 at 10:13:08AM -0400, Oren Ben-Kiki wrote:
| Brian Ingerson [mailto:ingy@...] wrote:
| > I think we're missing the whole point of what I really want here.
| > Clark alluded to it but, what I really want is the ability to extend
| > the multiline leaf formats that we support. One very useful format
| > would be a mix of block and folded. Lines beginning with whitespace
| > (as content) would be block. Lines beginning with non-whitespace would
| > be folded. I see this pattern in textual data all the time. (If you
| > know Perl, think POD)
Ok. First, there are three things we can't move into the format.
The first is the "scalar" indicator, since we must have a way to
know that the given node is not a mapping. Second, we need
the escaped flag. YAML has a limited character set and thus
escaping is needed so that the full Unicode range can be represented.
Third, we need chomp, since there is otherwise no way to signal
whether the trailing newline is significant.
Both escape and chomp are flags that need to apply to every
scalar value and be handled at the parser level. Since a format is
handled by the loader and is specific to given scalar values, we
cannot express escape and chomp within the format.
The question then becomes: is folded a style, or can it be moved
into a format? As a format it would be handled by the loader
and not by the parser (unlike escape and chomp), and it would apply
only to one family, !str, by default. Other families would have to have
some application-specific knowledge to know that folded is valid.
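
To make the parser/loader split concrete, here is a rough sketch in
Python. It is only an illustration under my own assumptions, not
anything from the spec: the names parse_scalar, load_scalar and FORMATS
are made up, the escape set is a small subset, and "folded" is assumed
to mean that each line break becomes one space.

```python
import re

# Hypothetical subset of escapes, for illustration only.
_SIMPLE = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}
_ESCAPE = re.compile(r'\\(x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|[nt\\"])')


def unescape(text: str) -> str:
    """Decode the escape sequences listed above into real characters."""
    def repl(match):
        body = match.group(1)
        if body[0] in "xu":
            return chr(int(body[1:], 16))
        return _SIMPLE[body]
    return _ESCAPE.sub(repl, text)


def parse_scalar(raw: str, *, escaped: bool, chomp: bool) -> str:
    """Parser level: flags that apply to every scalar, whatever its type."""
    # Chomp first, so an escaped "\n" at the end of the text survives.
    text = raw[:-1] if chomp and raw.endswith("\n") else raw
    return unescape(text) if escaped else text


def fold(text: str) -> str:
    """Assumed folding rule: every line break becomes a single space."""
    return " ".join(text.split("\n"))


# Loader level: a format is registered per type family.
FORMATS = {("!str", "folded"): fold}


def load_scalar(text: str, family: str, fmt: str) -> str:
    """Apply a format only if this family is known to define it."""
    try:
        handler = FORMATS[(family, fmt)]
    except KeyError:
        raise ValueError(f"{family} does not define format {fmt!r}") from None
    return handler(text)
```

The parser step works without knowing the family at all, while the
loader step fails for any family (say a hypothetical !html) that has
not registered "folded" itself.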
The use case for folded is rather simple. In much business data,
paragraphs (say a warranty description) are used without explicit
carriage returns to let the form wrap the content as appropriate.
This is also typical of HTML. In HTML carriage returns are not
significant and can be inserted into the stream to enhance
readability. Now, we don't strictly need folded; one could just
put in these paragraphs/HTML blocks as one very long line, or could
even use the escape mechanism to break the long line into several
smaller lines. Using the escape in this manner is good enough for
me, but it isn't the most readable.
So, given that my use case is _primarily_ the ability to break
a line at any given point so that it fits in 76 columns for
"pretty printing" and given that most content of this sort won't
typically have embedded \'s which have to be escaped, this is
perfectly acceptable to me. In other words,
This paragraph which is
This paragraph which is \
So, I'm willing to just "strike" folded from the styles, since
escaping handles my use case (just not as readable).
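
For what it's worth, the equivalence I have in mind can be sketched in
Python as below. This is a toy under stated assumptions: unfold and
join_escaped are hypothetical helper names, folding is assumed to turn
each line break into one space, a trailing backslash is assumed to
swallow the line break after it, and the sample text is invented.

```python
def unfold(text: str) -> str:
    """Folded reading: each line break becomes a single space (assumed rule)."""
    return " ".join(text.split("\n"))


def join_escaped(text: str) -> str:
    """Escaped reading: a trailing backslash removes the line break after it.

    Simplification: a line ending in an escaped backslash ("\\\\") is not
    treated specially here.
    """
    logical, current = [], ""
    for line in text.split("\n"):
        if line.endswith("\\"):
            current += line[:-1]          # drop the backslash, stay on this logical line
        else:
            logical.append(current + line)
            current = ""
    if current:
        logical.append(current)           # text left over after a final backslash
    return "\n".join(logical)


# Invented sample text, written both ways.
folded_source = "This warranty covers\nparts and labour\nfor one year."
escaped_source = "This warranty covers \\\nparts and labour \\\nfor one year."

print(unfold(folded_source))
print(join_escaped(escaped_source))
```

Both calls print the same single line; the escaped form just needs the
space and backslash at the end of every broken line, which is the
readability cost.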
| Hmmm. Even if you make this a format, extending it won't be trivial (it
| affects parsers/loaders everywhere). And IMVHO str is too basic a type to
| extend it like that later on.
If we make it a format, it will be impossible for the parser to
know that it also applies to an HTML type (as I said earlier).
Thus, a filter which doesn't know about the type's ability to
have that format can't use folded to reformat the text for pretty
printing. Thus, it is kinda useless. Let's just let application
specific types define it in this case, no point in even having
it in the YAML specification if it doesn't apply to everything.
| I, for one, would be happy if we could have *less* styles. The trouble is, I
| see the sense in all our three variables (chomping, folding and escaping).
Ok. Let's just have chomp and escape then.
| That means that we can only apply these formats to !str. On the up side, it
| means that we would have just one "style" of multi-line scalars (block). For
| type families such as binary, it is easy enough to specify that all white
| space is ignored.
This reduces the power of the folded scalar so that it just
isn't worth it for me.
| Possible solution: How about we allow !|<format> as a shorthand for
| !<implicit>|<format> (or at least as a shorthand for !str|<format>)?
This won't work since the parser needs to do the escaping due
to the possibility of illegal characters.
Clark C. Evans Axista, Inc.
XCOLLA Collaborative Project Management Software