From: Clark C. E. <cc...@cl...> - 2004-09-07 20:37:17
On Tue, Sep 07, 2004 at 10:58:14PM +0300, Oren Ben-Kiki wrote:
| - Scalar nodes that have no explicit tag, and that are written in any
|   style except the "plain" style, are reported by the parser as if they
|   were associated with the tag "tag:yaml.org,2002:str".
|
| - All other nodes that have no explicit tag (both plain scalars and
|   untagged collection nodes) are reported by the parser as "having no
|   tag" (having a NULL tag).
|
| - A node tagged with an explicit "null tag" is also reported as "having
|   no tag":
|
|     ---
|     this has no tag: ! "23"
|     just like this: 23
|     ...

    ---
    - "string"
    - "null"
    - []
    - {}
    ...

is equivalent to

    ---
    - !!str 'string'
    - ! 'null'
    - ! []
    - ! {}

|   This allows non-plain scalar types to be "untagged". Note that the
|   semantics of "!" cannot be overridden, and that using a double "!!" is
|   invalid:
|
|     ---
|     this is invalid: !! 23
|     ...
|
| - If a YAML document is loaded "all the way" to what is called a
|   "complete representation", each node is converted to an appropriate
|   native object (what Brian likes to call a "cave drawing").
|
| - For each node, this inevitably requires two steps: (1) deciding on the
|   type of the "cave drawing" to use; (2) converting the node to the "cave
|   drawing" of that type.

A complete representation happens if and only if:

  - there is a partial representation,
  - every NULL tag is provided, and
  - each tag is recognized (i.e. a canonical form is possible)

| - For nodes with tags, deciding on the type of "cave drawing" is based
|   on the tag. For nodes without tags, deciding on the type of "cave
|   drawing" is based on the value of the node.
|
|   Specifically, the decision must not depend on any syntactical details,
|   and must not depend on the node's location in the document or the value
|   of any other node other than the one being "typed". Note the node may
|   be a collection, in which case all its content is available as input.
|   (But see below)
|
| - It is required that each type of "cave drawing" used by the
|   application have a name (== tag). Note this name (tag) may be private
|   ("!foo") or global (a URI). It just needs to have _some_ sort of name.
|
| - Therefore, deciding on the type of "cave drawing" for an untagged node
|   can be expressed as "deciding on the tag of the node". Hence, this
|   process is called "tag resolution".

In short, a tag resolution is a simple transformation where only NULL
tags are replaced, using only the content of the given node. This word
is nice; it gives us a special fuzzy feeling that we haven't really
changed the intent of the document (which a transform would imply) but
instead are simply filling in the missing details.

| That's it. Implications:
|
| - An implementation is free to go directly from an untagged node to a
|   "cave drawing". An implementation may go directly from syntax to "cave
|   drawing", for that matter. The spec places no limitations on the
|   specific APIs or implementation details. "What is not explicitly
|   forbidden is allowed".

With the simple resolution rule:

    (SCALAR, NULL)   -> '!my-special-variant-tag'
    (MAPPING, NULL)  -> 'tag:yaml.org,2002:map'
    (SEQUENCE, NULL) -> 'tag:yaml.org,2002:seq'

You can consider resolution "completely optional", or at least a simple
process that can be skipped. ;)

| - An implementation is free to use any type of "cave drawing" for any
|   node. For example, it can load all scalar nodes into the integer 12.
|   However, if the "cave drawing" chosen does not obey the semantics set
|   down by the node's tag, the application is said to have "transformed"
|   the document rather than merely "loaded" it. Note this is perfectly OK
|   in some contexts.
|
| - In contrast, the action of "filling in the blanks" done by the
|   (possibly implicit, hidden) "tag resolution" step does not transform
|   the document. The "complete representation" of the document is _not_
|   modified by this step.
|   Naturally, the representation is changed from
|   being a "partial" one to a "complete" one. Round-tripping
|   may/should/not reverse this process as appropriate (this isn't
|   different from the rest of the round-tripping issues, including
|   indentation, comments, anchors, tag prefixes etc.)

The complete representation is not modified by this step, because until
all NULL tags are provided, a complete representation does not exist.

| - Why restrict "tag resolution" to considering the value of a node?
|
|   For the purpose of comparing tags for equality, it must be that 'NULL'
|   == 'NULL'. For example, the following must be invalid due to a
|   duplicate key:
|
|     ---
|     a : foo
|     a : bar
|     ...

This prevents the unexpected resolution,

    ---
    !foo a: foo
    !bar a: bar

|   If the tag resolution is restricted to examining the value of each node,
|   then the normal rule of comparing nodes (tag == tag && value == value)
|   just keeps on working (where NULL == NULL).

Note that NULL semantics here are equivalent to an empty string, '';
therefore an implementation can use an empty string for this purpose.
Or, really, any special string that does not occur in the wild.

|   An implementation MAY use a more complex way to decide on the type of
|   "cave drawing" to load each untagged node into. However, using a more
|   complex way is considered "transforming" the document rather than
|   "loading" it. Again, this isn't forbidden, and it makes sense in
|   certain contexts. It is just a different operation.
|
|   Of course, this means that:
|
|     ---
|     "a" : foo
|     a : foo
|     ...
|
|   Would not be caught as a duplicate key prior to tag resolution. There's
|   really no helping it... This is the same problem as:
|
|     ---
|     !!int 10 : foo
|     !!int 012 : bar
|     ...

Similar, not the same. In the former case, after resolution the
document could be equivalent to,

    ---
    !!str 'a': foo
    !!str 'a': foo
    ...

Which would not be well formed. Thus, tag recognition is not even
needed here, where it is in the prior case.
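
To make the comparison rule concrete, here is a rough Python sketch of
the three key examples above. Everything in it is an assumption made
for illustration: the names Scalar, resolve and duplicate_keys are
hypothetical, None stands in for the NULL tag, and the integer pattern
is a toy rule rather than anything the spec defines. The only point is
that resolution looks at nothing but the node's own value, so plain
(tag == tag && value == value) comparison keeps working.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

STR = "tag:yaml.org,2002:str"
INT = "tag:yaml.org,2002:int"


@dataclass(frozen=True)
class Scalar:
    """A scalar node as reported by a parser (hypothetical shape)."""
    tag: Optional[str]  # None stands in for the NULL tag
    value: str


def resolve(node: Scalar) -> Scalar:
    """Fill in a NULL tag using only the node's own value."""
    if node.tag is not None:
        return node  # explicit tags are never replaced
    # Toy rule (an assumption): digits resolve to int, the rest to str.
    tag = INT if re.fullmatch(r"[-+]?[0-9]+", node.value) else STR
    return Scalar(tag, node.value)


def duplicate_keys(keys: List[Scalar]) -> List[Scalar]:
    """Return keys that repeat under tag == tag && value == value."""
    seen, dups = set(), []
    for key in keys:
        if key in seen:
            dups.append(key)
        seen.add(key)
    return dups


# a : foo / a : bar -- both keys are plain, so NULL == NULL applies.
plain_pair = [Scalar(None, "a"), Scalar(None, "a")]

# "a" : foo / a : foo -- the quoted key is reported as !!str, the plain
# key as NULL, so they differ until the NULL tag is resolved.
mixed_pair = [Scalar(STR, "a"), Scalar(None, "a")]

# !!int 10 / !!int 012 -- same tag, different text; resolution does not
# help, only recognizing the tag (a canonical form) would.
int_pair = [Scalar(INT, "10"), Scalar(INT, "012")]

print(duplicate_keys(plain_pair))
print(duplicate_keys(mixed_pair))
print(duplicate_keys([resolve(k) for k in mixed_pair]))
print(duplicate_keys([resolve(k) for k in int_pair]))
```

The first and third calls report the repeated 'a' key; the second and
fourth report nothing, which is the difference between the two cases
described above.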
|
|   Some duplicate keys can only be caught in the process of constructing
|   the "cave drawings", regardless of the issue of implicit tags.

Exactly. Sometimes only applications know what duplicates are.

...

This proposal works for both groups:

  a) those that want to consider missing tags as just a
     !implicit-mapping, without special significance

  b) those who want to consider a missing tag as something very special
     with its own resolution mechanism; a limited one that doesn't
     change intent, but does have an effect

I fit into the former, Oren into the latter.

Clark