From: Clark C. E. <cc...@cl...> - 2004-02-05 02:26:14
Howdy Sean!

On Wed, Feb 04, 2004 at 05:12:40PM -0800, Sean O'Dell wrote:
| On Wednesday 04 February 2004 04:26 pm, Clark C. Evans wrote:
| > In this new and brave YAML land, there isn't such a thing as a 'str'
| > kind, it is a mapping, sequence or scalar.
|
| This, to me, makes a lot of sense...I have wondered in the past why !str was
| the default. It's a good move leaving untyped scalars simply scalars.

| > implicit:
| >   Implicit (or untagged) nodes are those with an empty tag,
| >   but have a flag saying if it is a plain scalar or not.
| >   Note, a true YAML representation won't be able to distinguish
| >   between plain scalar vs non-plain scalars.
|
| By this, do you mean (could one also say): implicit nodes are nodes with no
| typing "!tag"

Yes, ones which do not have !tags on them in the character
stream ("presentation").

| but may be typed or left as untyped scalars?

Well, the resolver would put types on them, and during this 'typing'
process, choosing a tag for a scalar may use one piece of 'presentation'
level information -- whether the plain scalar style was used or not (a
boolean flag). It is an ugly wart, but makes for very readable YAML
documents.

Bear with us here, this 'rethinking' happened a day or so before we
left, and I'm sure the spec is not completely consistent with its
impacts. If you don't mind, let me try an explanation on for size...

| > tagged:
| >   Tagged nodes are those which appear in the YAML
| >   stream with a non-empty "!tag" or have been tagged
| >   by the resolver. Tagged nodes do not record if they
| >   were created from plain scalars or not, this 'hack flag'
| >   is only used during the tagging process.
|
| Is the resolver a new step in the loading process? Parser->Loader->Resolver?
The three stage breakdown in the spec is:

  representation -- modeling your native data structures in a
                    language independent manner

  serialization  -- flattening these representations so that they
                    can pass through a sequential-access medium
                    such as a series of event calls.

  presentation   -- making the serializations look pretty

Orthogonal to this breakdown are two processes that kinda go in the
reverse direction:

  resolution -- this takes nodes which do not yet have a tag (we call
                these implicitly typed) plus a plain scalar flag and
                produces a tagged node without the plain scalar flag

  binding    -- this takes a tagged node and produces either a
                canonical form (for equality comparison) or a
                native data object.

I say kinda, because resolution and binding could happen at any of the
stages. One could resolve from the serialization or the representation.
The spec goes into this somewhat, but it needs a bit more work. We call
a representation which has all of its tags resolved and bound a
'complete representation'. The YAML schema tools and such will be
defined at this level, on complete representations.

In short: It can happen in any of the three places!

| > In particular, a YAML representation graph, or a tree serialization
| > uses only tagged nodes. The YAML presentation nodes can be either
| > implicit or tagged. Somehow in the process of going from a
| > presentation to a serialization/representation this "resolution"
| > process must be carried out. Does this make sense?
|
| This is one of those statements I have trouble with. What is a
| representation graph?

When you have native data, you need to "fit" it into the YAML abstract
model for interoperability reasons. This abstract model is the "YAML
representation" of your native data. It may or may not appear as a
physical component of your system; if it is part of your YAML toolset,
it will be a generic random access node API, or DOM.
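The resolution and binding steps described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not anything the spec prescribes: the class names, the two tag strings, and the "all digits means integer" rule are hypothetical choices made for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical tag names, chosen for illustration only.
INT_TAG = "tag:yaml.org,2002:int"
STR_TAG = "tag:yaml.org,2002:str"


@dataclass(frozen=True)
class ImplicitScalar:
    """An untagged scalar as it leaves the presentation layer."""
    value: str
    plain: bool  # True if the plain (unquoted) style was used


@dataclass(frozen=True)
class TaggedScalar:
    """A resolved scalar: it has a tag and no longer carries the plain flag."""
    tag: str
    value: str


def resolve(node: ImplicitScalar) -> TaggedScalar:
    """Resolution: choose a tag; the plain flag is consulted only here."""
    # Assumed rule: only plain scalars made of digits become integers.
    if node.plain and re.fullmatch(r"[-+]?[0-9]+", node.value):
        return TaggedScalar(INT_TAG, node.value)
    return TaggedScalar(STR_TAG, node.value)


def bind(node: TaggedScalar):
    """Binding: turn a tagged node into a native object."""
    if node.tag == INT_TAG:
        return int(node.value)
    return node.value


# The same characters resolve differently depending on the plain flag.
plain_result = bind(resolve(ImplicitScalar("12", plain=True)))
quoted_result = bind(resolve(ImplicitScalar("12", plain=False)))
```

Note that `TaggedScalar` has no `plain` field at all, which mirrors the point that the flag is dropped once a node has been tagged.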
The abstract model is necessary since this is where a 'structural
schema' would be defined, and it is the model upon which language
independent YAML tools such as YPATH would be based.

| What are presentation nodes?

By presentation, we mean human presentation. Presentation nodes can be
represented as characters on a page, or as a tree in a YAML text editor
that has such things as scalar style, etc. In the spec we define
presentation not so much for what it is, but rather for what it
isn't... it is used when we have aspects of YAML which are not
considered part of a language-independent representation of your native
data.

Serialization nodes are somewhere between the two; they are
representation nodes that have been 'flattened' to fit onto a
sequential access interface.

| Assuming "representation graph" means the conceptual data structure...
| then, nodes are tagged (explicitly or through the resolver)

Good so far... and a representation which is fully tagged, where each
tag is known by the processor, is called a 'complete representation'.

| or left untagged (marked as a plain scalar).

Well, the spec leaves this a bit vague (for now). But yes, you could
have something very similar to a complete representation having tags
which are blank. We don't have a good word for this case yet;
"incomplete" doesn't quite say enough because it can be incomplete due
to a failure to resolve implicit types, or to bind the types to make a
canonical form or native objects.

| If "presentation nodes" means the physical YAML document, then the nodes are
| either tagged, implicit or left untagged (plain scalar). Correct?

In the presentation layer there are definitely two distinct questions:

  - tagged vs untagged
  - plain scalar or not

| What's the difference between a scalar, and a plain scalar? Isn't something
| either typed or left as a scalar?
An untagged node would have:

  - a kind (scalar, mapping, sequence)
  - a plain flag (applies only to scalars)

Mixing plain into kind makes things confusing, and this is what I was
hoping to avoid. While it may be an implementation decision how to
model a 3-state enum plus a flag that applies to only one of the enum
possibilities, there is a conceptual difference. Kind (the three-state
enum) is part of the YAML representation, while the plain flag is not.
So, merging them into a four-state enum is confusing at best; but may
be the cleanest API option. ;(

Don't say we didn't call this plain scalar thingy a hack. ;)

Clark

--
Clark C. Evans                      Prometheus Research, LLC
Chief Technology Officer            Turning Data Into Knowledge
cc...@pr...                         www.prometheusresearch.com
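To make the kind-versus-flag trade-off above concrete, here is a minimal Python sketch of both modelling options: a three-state kind with a separate plain flag, and the merged four-state enum. All names here (`Kind`, `UntaggedNode`, `MergedKind`, `to_merged`) are hypothetical and exist only for this example; no YAML toolkit is assumed to expose them.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The three-state kind, which is part of the representation."""
    SCALAR = "scalar"
    MAPPING = "mapping"
    SEQUENCE = "sequence"


@dataclass(frozen=True)
class UntaggedNode:
    """Option 1: keep the kind and the presentation-only flag separate."""
    kind: Kind
    plain: bool = False  # meaningful only when kind is SCALAR

    def __post_init__(self):
        # Reject the combination that has no meaning.
        if self.plain and self.kind is not Kind.SCALAR:
            raise ValueError("the plain flag applies only to scalars")


class MergedKind(Enum):
    """Option 2: fold the flag into the kind as a fourth state."""
    PLAIN_SCALAR = "plain scalar"
    SCALAR = "scalar"
    MAPPING = "mapping"
    SEQUENCE = "sequence"


def to_merged(node: UntaggedNode) -> MergedKind:
    """Convert the two-field model into the four-state enum."""
    if node.kind is Kind.SCALAR:
        return MergedKind.PLAIN_SCALAR if node.plain else MergedKind.SCALAR
    # Mapping and sequence carry over by name.
    return MergedKind[node.kind.name]
```

The first option needs a runtime check to rule out a plain mapping or sequence, whereas the four-state enum cannot express that combination in the first place. That may be why the merged form looks like the cleaner API even though it blurs the line between representation and presentation.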