Nice summary of the issues.
On Fri, Sep 10, 2004 at 05:49:26PM -0400, trans. (T. Onoma) wrote:
|There seems to be some confusion about "phases". Let's see if we can clear
|that up. What you are referring to as "implicit typing phase" I am
|calling more generally resolution, and it is an after phase, obviously.
I plead guilty for having helped with this confusion.
But yes. Since 'resolution' (or the replacement word Oren uses,
'specification') is under the Application's control, at a high level
the only thing that matters is that it uses only information in the
Graph Representation (i.e. it ignores Presentation- and
Serialization-specific items).
|I do not know if tag specification (fulfilling TAG directives) is
|supposed to be the first part of resolution or a part of parsing (or
|on its own for that matter). But I leave that aside b/c what I am
|generally referring to as resolution is the _act of "binding" a
|native type to a YAML representation_.
Yes, you got it.
|Resolution is a given. It happens presently according to the spec:
|"When resolving tags, a YAML processor must only rely upon
|representation details, with one notable exception. It may consider
|whether a scalar was written in the plain style when resolving the
|scalar's tag. Other than this exception, the processor must not rely
|upon presentation or serialization details. In particular, it must not
|consider key order, anchors, styles, spacing, indentation or comments."
|-- 3.3.2. Resolved
And with proposal #4, this "notable exception" goes away.
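To make the difference concrete, here is a minimal sketch in Python.
The function names and the two regexes are made up for illustration;
they are not from the spec or from any YAML implementation, and the
style-blind variant is only one possible reading of what dropping the
exception means.

```python
import re

# Hypothetical, heavily simplified implicit-typing rules; a real type
# repository has many more patterns than these two.
INT_RE = re.compile(r"^[-+]?[0-9]+$")
BOOL_RE = re.compile(r"^(true|false)$")


def resolve_by_content(value):
    """Pick a tag from the scalar's content alone."""
    if INT_RE.match(value):
        return "!!int"
    if BOOL_RE.match(value):
        return "!!bool"
    return "!!str"


def resolve_with_exception(value, plain):
    """With the exception: only plain scalars get implicit typing."""
    return resolve_by_content(value) if plain else "!!str"


def resolve_style_blind(value):
    """Without the exception: the style never reaches the resolver."""
    return resolve_by_content(value)
```

With the exception, a plain 12 and a quoted "12" resolve differently;
without it, the resolver cannot tell them apart, so anything that must
stay a string needs an explicit tag or a typing rule that comes from
the application.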
|So the question is simply this: what information appropriately belongs
|to the representation. My assertion is the scalar style (the forms of
|quoting, folding, literalizing) is rightfully a part of the
|representation.
The impacts of this are huge, so please spend some time thinking
it through. Basically, what the Representation Model describes
is what various transformations, filters, pretty-printers,
and other applications would be required to preserve -- lacking
specific instructions to the contrary. This would, effectively,
prevent compliant tools from changing _how_ a given chunk of
text was written to a YAML file. It's a poor decision, IMHO.
|And that the source of the trouble with the plain-scalar
|exception, among others, is the fact that this has been artificially
|extracted from the notion.
Certainly; but your prescription is worse than the illness, and
perhaps the same could be said for proposal #4, but I've yet to
see some good use cases that make this clear. ;)
|I think you have the idea, but the terms are a bit rough, and I may have
|not been clear enough. The "implicit typing", as you refer to it here is
|really nothing more than the default behavior of resolution, i.e. when
|the application doesn't do anything special. So "implicit typing" can
|happen in both cases --and of course the application can say, "don't do
|that" to whatever degree it likes.
|The only difference between (a) and
|(b) is how the style is conveyed to the resolution phase: either by not
|stripping off the style characters, or by stripping them off and passing
|a style-variable (similar to kind and tag). Again, implicit typing can
|happen in either case. These are just the two different ways in which
|_our_ proposal can be achieved.
Well, passing along a style flag is a lot cleaner.
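Roughly what I have in mind for option (b), as a sketch only: the
ScalarNode class, its field names, and parse_flow_scalar are all
hypothetical, and escape processing is left out to keep it short.

```python
from dataclasses import dataclass


@dataclass
class ScalarNode:
    kind: str   # always "scalar" in this sketch
    tag: str    # "?" stands for "not yet resolved"
    style: str  # "plain", "single" or "double" (block styles omitted)
    value: str  # content with the style characters already stripped


def parse_flow_scalar(text):
    """Strip the quoting characters and record them as a style flag."""
    if len(text) >= 2 and text[0] == text[-1] == '"':
        return ScalarNode("scalar", "?", "double", text[1:-1])
    if len(text) >= 2 and text[0] == text[-1] == "'":
        return ScalarNode("scalar", "?", "single", text[1:-1])
    return ScalarNode("scalar", "?", "plain", text)
```

The resolver then gets a clean value plus a separate style field,
instead of having to re-parse the quote characters out of the text.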
|> This isn't what I'm advocating. The style should remain with the nodes
|> during schema application, and the schema can decide how to transform the
|> node. Schema application is something loader does right before it passes
|> the representation to the application.
|Okay, so here again: schema application (YASLT, not YASL) is the same
|thing --it is resolution. It is not really something that happens right
|before the representation is passed to the application, it _is_ the
|representation becoming a part of the application as native data types.
Since schema application is under Application's direction, it
properly belongs on the Application side of the line; correct.
|> The only thing I'm saying that seems close to what you're describing is:
|> I don't see quotes as a "style." I see them either a) as an escape
|> mechanism, to be completely used and discarded during parsing or b) as a
|> way for the implicit typing mechanism to detect !!str types, where it
|> will then eat the quotes and tag the node as a string.
And the single quoted style is there for something similar: strings
with lots of slashes, quotes, and special characters. It is a
_different_ method of escaping. The difference between | and > is
likewise quite subtle: it lies only in how newline characters are
handled.
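A simplified sketch of those differences, in Python. The function
names are hypothetical, the double-quoted escape table is a small
subset, and the block-scalar functions ignore indentation, chomping
indicators, and more-indented lines.

```python
def unquote_single(body):
    """Single quoted: the only escape is '' for a literal quote."""
    return body.replace("''", "'")


# A small, incomplete subset of the double-quoted escapes.
_DOUBLE_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def unquote_double(body):
    """Double quoted: a backslash introduces an escape sequence."""
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            # Escapes missing from the table just lose their backslash here.
            out.append(_DOUBLE_ESCAPES.get(body[i + 1], body[i + 1]))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


def load_literal(lines):
    """Literal (|): every line break is kept."""
    return "\n".join(lines) + "\n"


def load_folded(lines):
    """Folded (>): a break between two text lines becomes a space,
    and each empty line contributes one line break."""
    out = ""
    prev_text = False
    for line in lines:
        if line == "":
            out += "\n"
            prev_text = False
        else:
            if prev_text:
                out += " "
            out += line
            prev_text = True
    return out + "\n"
```

The same two lines of text come out as "a\nb\n" under | and as "a b\n"
under >, and a backslash is ordinary content inside single quotes but
an escape character inside double quotes.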
|> I do, however, see > and | as a style. But I think the scalar with these
|> styles should still be loaded very plainly, untransformed, and let the
|> implicit typing phase transform them.
Why even have styles in this case? If the application directs the
parsing here, there is no reason for YAML to define such things; it
should have 'scalars', and that's it. All other "semantics" should be
left to the application. IMHO, if you wanted this approach, the YAML
parser should just pass on any string it gets without any 'escaping'
methods, and in this case, styles should just be removed from the spec.
There are lots of downsides here:
- applications may do it differently, causing incompatibilities
- it's burdensome to implement it yourself, or call someone to do it
- oh dear, need I go on?
|> They should only be tagged as
|> styled "plain", "folded" or "literal", but type tagged as "scalar". The
|> implicit typing phase can perform all transformations and all typing in
|> one shot.
|According to the spec, all of these are called styles:
|"YAML provides a rich set of scalar style variants. Scalar block styles
|include the literal and folded styles; scalar flow styles include the
|plain, single quoted and double quoted styles. These styles offer a
|range of tradeoffs between expressive power and readability." --
|The whole point of this is to _not_ distinguish between these styles in
|any arbitrary manner. Either they are what they are, as I propose; or
|they have no distinction as Clark proposes. Making a distinction is the
|current "wart" between plain-scalar and the rest.
Styles emerged because people wanted to _write_ or _present_ the same
information differently so that it fit the constraints of a
serialization format. The idea is that they are not 'content', they are
window-dressing. That's why they are called styles. And yes, the wart
of letting applications 'see' a plain window dressing, but not the other
kinds is hackish, and we don't need to describe how we got there. ;)
|> 1) since the loader can still apply the | and > transformations and still
|> does all the implicit typing, and still recognizes strings as scalars
|> wrapped by quotes, you pretty much get all the same features you have now.
|Clark is against this. He suggests that all scalars be treated
|uniformly such that style information plays no role whatsoever, i.e.
|is not at all to be considered part of the representation, including
|plain vs. non-plain.
Correct. In 99% of the cases, you can determine the type of an
unspecified node by its context; no need to use the presentation style.
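As an illustration of typing by context, here is a minimal sketch. The
SCHEMA table, the key names in it, and the construct function are all
hypothetical; a real schema mechanism would work on paths in the
graph, not just on top-level mapping keys.

```python
# Hypothetical application schema: the expected tag for each mapping key.
SCHEMA = {"name": "!!str", "age": "!!int", "active": "!!bool"}

CONSTRUCTORS = {
    "!!str": str,
    "!!int": int,
    "!!bool": lambda text: text == "true",
}


def construct(mapping):
    """Type each untagged scalar by the key it sits under (its context)."""
    result = {}
    for key, text in mapping.items():
        tag = SCHEMA.get(key, "!!str")  # unknown keys default to string
        result[key] = CONSTRUCTORS[tag](text)
    return result
```

Here the text 12 under "name" stays a string and the same text under
"age" becomes an integer, whether or not either one was quoted.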
|> 2) all transformations happen during implicit typing, so it's VERY simple
|> to plug in a schema mechanism later and allow applications to completely
|> change how scalars are transformed implicitly.
|That's the question, does the phase allow style to be taken into
|consideration or has all indication of scalar style been stripped away
|before this can occur.
And dramatically changing the 'meaning' of the document by
switching writing styles is, well, quite unexpected and runs exactly
against the trend of separating content from presentation.
|> 3) since all transformations happen in one place, an application may even
|> request from the loader that NO transformations occur.
|> It's just simpler, more consistent, easier to comprehend and it provides a
|> door for some pretty powerful schema mechanism(s) later on.
|Agreed. That is what _we_ have proposed. My proposal is essentially a
|refinement of your earlier (a) proposal. We are essentially in
|agreement. And with common terms I think that becomes clear.
If you are all talking about writing an _editor_ that works at the
Presentation level, I'd completely agree. The styles are different,
because you're doing a UI that needs to differentiate between them. But
if you are talking about a general application, well, I do sincerely
think you are really, really chopping down the wrong tree.
Could I ask you to drum up some serious use cases where this
proposal would actually help? Thanks.