From: Clark C . E. <cc...@cl...> - 2002-07-05 22:07:06
Yamlers,

I have quite a bit of YAML data at Axista, and we have a few customers who want XML (buzz-word compliance). Thus, I wanted to run an XML format for YAML by you, which perhaps we could discuss. The goal of this format is to have the information model of YAML but serialized as XML for those who love pointy brackets and brand-name recognition. Most of my data is nested maps, although I have a few sequences here and there.

Note that I'm not talking about how XML could map into YAML; that is a much more complicated discussion and there are too many possibilities. My goal is to be able to serialize any YAML structure in a specific XML format so that (a) it is somewhat readable (as readable as XML can be), (b) it can be converted back to YAML if necessary, and (c) it is, for the most part, intuitive.

Anyway, without further ado, here is a text version; I took the liberty of putting the HTML version at yaml.org/xml.html for now, pending approval/mods from y'all.

YAXML, the (draft) XML Binding for YAML

For those who love YAML, but require buzz-word compliance or the aesthetically displeasing angle brackets of XML, there is a clean option -- a subset of XML, YAXML, which has YAML's information model but XML's syntax. YAML implementations are encouraged, although not required, to implement a conversion to/from this XML subset so that those using YAML can be compliant with XML.

This binding boils down to a simple set of rules:

* Mappings are expressed as elements, where the mapping key is represented using the element's name and the mapping value as the element's content. This restriction implies that a given tag name may not occur twice within a given context and that the order of the tag names is not significant.

* Lists are expressed as an ordered sequence of elements having a special name, the underscore. In this case, and only in this case, the ordering of elements is significant and duplicate element names (just the underscore) are permitted.

With these two rules, the bulk of YAML can be serialized.

* Scalars are modeled as a single text node child of a given element. Thus, mixed content is not allowed.

* Unlike YAML, XML requires a root element with a name. Since this isn't part of the YAML information model, the element name can be arbitrarily chosen and should be discarded by the YAXML->YAML conversion utility. The default element name can be 'yaml' to make things clear.

* In this binding, an anchor on a node is modeled via the "anchor" attribute. Subsequent occurrences of this node can then be referenced by an empty element with an "alias" attribute whose value matches a previous anchor.

* Implicit typing is handled on each element by applying the family regular expressions.

* Explicit typing can be done by registering every required family URI as an attribute of the root node, in a manner similar to XML namespaces: yaml:xxx=family-uri. Then, on each element, the abbreviation xxx can be used as the value of the "type" attribute to apply the given type. To avoid conflicts, the abbreviation (xxx) cannot be type, anchor, alias, or stream.

* If a YAML stream contains more than one document, then the top-level XML element must have an attribute "stream" with value "yes". In this case, each second-level element acts as an isolated document (although all family URIs must still hang from the root element).

* Nested keys, typed keys, or keys not matching the XML name production are emulated through a pair of elements, one for the key and one for the value, following each other sequentially with the names "_key" and "_value". All XML names beginning with an underscore are reserved so that this hack is clearly a hack.

* YAML comments are modeled directly as XML comments, except that they cannot be used to break up a text node or occur in an order different from that of the associated YAML file.

* If XML namespaces are to be used, then the above-mentioned attributes shall be in the "http://yaml.org" namespace.
  [Musing: perhaps the family URIs should be XML namespaces, and the type attribute give way to using element prefixing... in this way, the yaml namespace can be 'implicitly' typed. Hmm.]

* Unrecognized attributes should be ignored.

* Escaping via XML character entities is allowed, but all other forms of entities are not.

* All other features of XML are forbidden. Intermediate whitespace between elements, used for readability, is ignored.

An example of this mapping is as follows:

    ---
    one: value
    two: 239
    typed: '1923'
    explicit: !clarkevans.com/boogle |
      Howdy, scalars like this break indentation. But this is why we have YAML, no?
    saved: &001 value
    referenced: *001
    [ nested, key ]: ugly, but works
    "\t": illegal qname
    sequence:
      - one
      - two

... becomes ...

    <yaml xmlns:yaml="http://yaml.org"
          yaml:str="http://yaml.org/str"
          yaml:cce="http://clarkevans.com/boogle" >
      <one>value</one>
      <two>239</two>
      <typed yaml:type="str">1923</typed>
      <explicit yaml:type="cce"
      >Howdy, scalars like this break indentation. But this is why we have YAML, no?</explicit>
      <saved yaml:anchor="001">value</saved>
      <referenced yaml:alias="001" />
      <_key>
        <_>nested</_>
        <_>key</_>
      </_key>
      <_value>ugly, but works</_value>
      <_key>&#9;</_key>
      <_value>illegal qname</_value>
      <sequence>
        <_>one</_>
        <_>two</_>
      </sequence>
    </yaml>

Icky, but it works. Any suggestions, refinements, or improvements would be great, but I don't have a lot of time to spend here. I just need a simple dump, so I've implemented a subset of the above for my own purposes.

Best,

Clark

--
Clark C. Evans
Axista, Inc.
http://www.axista.com
800.926.5525
XCOLLA Collaborative Project Management Software
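
As a rough illustration of the core rules above (mappings as named elements, sequences as repeated "_" elements, scalars as a single text node, and the "_key"/"_value" fallback), here is a minimal Python sketch. It is not the subset implemented at Axista; the function names `fill` and `to_yaxml` are hypothetical, the name check is a simplified ASCII-only stand-in for the XML name production, and anchors, aliases, typing, streams, and comments are left out entirely. Nested keys are represented as Python tuples because dict keys must be hashable.

```python
import re
import xml.etree.ElementTree as ET

# Simplified, ASCII-only stand-in for the XML name production (an assumption).
# Names beginning with an underscore are reserved by the draft, so they are
# deliberately not accepted here.
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.\-]*\Z")


def fill(elem, node):
    """Store one node (dict, list/tuple, or scalar) as the content of elem."""
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(key, str) and _NAME.match(key):
                # Mapping rule: the key becomes the element name.
                fill(ET.SubElement(elem, key), value)
            else:
                # Nested keys or keys that are not usable names:
                # emit a _key element followed by a _value element.
                fill(ET.SubElement(elem, "_key"), key)
                fill(ET.SubElement(elem, "_value"), value)
    elif isinstance(node, (list, tuple)):
        # Sequence rule: one "_" element per entry, in order.
        for item in node:
            fill(ET.SubElement(elem, "_"), item)
    else:
        # Scalar rule: a single text node, no mixed content.
        elem.text = str(node)


def to_yaxml(document, root_name="yaml"):
    """Serialize one already-loaded document under an arbitrary root name."""
    root = ET.Element(root_name)
    fill(root, document)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_yaxml({"one": "value", "two": 239, "sequence": ["one", "two"]}))
```

Scalars are written with `str()`, so a round trip back to YAML would have to rely on the implicit-typing rule (or on explicit "type" attributes, which this sketch does not emit) to recover anything other than strings.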