From: Clark C . Evans <cce@cl...>  20011122 18:05:18

An ASCII version of the draft information model for your review. A better word for "dumper" and "binder" would be great.

...

2 Information Model

An information model is an abstraction that codifies the expected behavior of processing systems. Such a model describes which chunks of data are important and the relationships between those chunks. Implicitly, these relationships form invariants which must be enforced for information to remain consistent between processing systems.

This section introduces key YAML concepts and presents the YAML information model. This model details how a YAML processor must treat YAML texts to enable information portability between various systems. Processors which fail to preserve the model described are noncompliant.

2.1 Layers

A YAML system consists of three layers: a serialization format, a transfer representation, and a native binding. There are four basic processing components: a parser, a binder, a dumper, and an emitter. These components translate YAML information between the layers as shown below.

    [serialization format] -> [transfer representation] -> [native binding]
                      (parser)                     (binder)

    [native binding] -> [transfer representation] -> [serialization format]
                (dumper)                     (emitter)

Each layer above has its own information model. The model for the serialization layer distinguishes between content that is informational and content that is a property of the serialization itself. The model for the transfer representation describes the core invariants which all YAML systems must enforce. The model for the binding layer depends on the native language and platform being used. This specification partially describes such a model, but leaves the complete definition to the particular binding.

2.2 Transfer Model

The transfer model serves as the core abstraction for YAML processing. It consists of the node concept, the identity and similarity equivalence relations, and scalar and collection nodes.
YAML implementations may vary significantly, but they must be conceptually grounded in this formalism.

node
    An abstract object having the following properties:

    kind
        A node may be one of two kinds, a collection or a scalar.

    method
        A node is associated with a transfer method.

    identity
        An equivalence relation which can be used to determine if a given object is a particular node.

nodeset
    An unordered association of zero or more nodes. A node may participate in many nodesets without restriction. A node with a particular identity may appear only once in a given nodeset.

scalar
    A scalar is a node that has an additional string property.

string
    An ordered sequence of Unicode characters. This sequence of characters must remain constant and is inseparable from the node.

transfer method
    A transfer method is an abstract object which is used by the binding layer for whatever purposes it requires. A transfer method has a string representation, the transfer string.

transfer string
    An ordered sequence of Unicode characters. This sequence of characters must remain constant and is inseparable from the transfer method.

collection
    A collection is a triplet consisting of two nodesets, the domain and the range, together with a function from the domain onto the range. The domain is restricted such that at most one node from any given similarity equivalence class may be a member. The term function has its usual meaning: for any given node in the domain, there exists exactly one node in the range with which it is associated.

similar
    An equivalence relation which can be used to determine if two given objects have similar structure. The meaning of this relation is determined by the kind of object.

similar scalar
    Two scalars are similar when they have exactly the same ordered sequence of Unicode characters and similar transfer methods.

similar transfer method
    Two transfer methods are similar when their corresponding transfer strings have exactly the same ordered sequence of Unicode characters.

similar nodeset
    Two nodesets are similar when, for each node in one, there exists at least one node in the other such that the two nodes are similar.

similar collection
    Two collections X and Y are similar when the domains of X and Y are similar, the ranges of X and Y are similar, and for every node x in the domain of X with a similar node y in the domain of Y, the image of x under the function for X is similar to the image of y under the function for Y.

2.3 Serialization Model

The serialization model is an extension of the transfer model in that it includes support for node styles, comments, and other goodies. It also defines two subsets of collections: mappings and sequences. A sequence is a collection whose domain is a contiguous subset of the integers starting with 0.

2.4 Native Model

The native model describes how the transfer method could be used to construct a hierarchy of types. In particular, it describes the properties of integers, floats, dates, and other built-in types.
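
To make the transfer model of section 2.2 more concrete, here is a rough Python sketch of nodes, transfer methods, and the "similar" relation. It is only an illustration under several assumptions that are not part of the draft: the class and function names (TransferMethod, Scalar, Collection, similar, and so on) are invented here, Python object identity stands in for node identity, a collection's function is stored as a list of (key, value) pairs, the nodeset check is applied in both directions so that the relation is symmetric, and collections are assumed to be acyclic so the recursion terminates.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass(frozen=True)
class TransferMethod:
    # The transfer string is constant and inseparable from the method.
    transfer_string: str


@dataclass(eq=False)  # eq=False: Python object identity acts as node identity
class Scalar:
    method: TransferMethod
    string: str


@dataclass(eq=False)
class Collection:
    method: TransferMethod
    # The function from the domain onto the range, kept as (key, value) pairs.
    pairs: List[Tuple["Node", "Node"]] = field(default_factory=list)

    def add(self, key, value):
        # Domain restriction: at most one key per similarity class.
        if any(similar(key, k) for k, _ in self.pairs):
            raise ValueError("a similar key is already in the domain")
        self.pairs.append((key, value))


Node = Union[Scalar, Collection]


def similar_methods(a, b):
    # Transfer methods are similar when their transfer strings match.
    return a.transfer_string == b.transfer_string


def similar_nodesets(xs, ys):
    # Every node on each side needs at least one similar node on the other.
    return (all(any(similar(x, y) for y in ys) for x in xs)
            and all(any(similar(x, y) for x in xs) for y in ys))


def similar(a, b):
    if isinstance(a, Scalar) and isinstance(b, Scalar):
        return a.string == b.string and similar_methods(a.method, b.method)
    if isinstance(a, Collection) and isinstance(b, Collection):
        # The draft's definition of similar collections does not mention the
        # collection's own transfer method, so it is not compared here.
        dom_a, rng_a = [k for k, _ in a.pairs], [v for _, v in a.pairs]
        dom_b, rng_b = [k for k, _ in b.pairs], [v for _, v in b.pairs]
        if not (similar_nodesets(dom_a, dom_b) and similar_nodesets(rng_a, rng_b)):
            return False
        # Images of similar keys must themselves be similar.
        return all(similar(va, vb)
                   for ka, va in a.pairs
                   for kb, vb in b.pairs
                   if similar(ka, kb))
    return False  # a scalar is never similar to a collection


# Two distinct collections with the same structure are similar, not identical.
STR, MAP = TransferMethod("str"), TransferMethod("map")
first, second = Collection(MAP), Collection(MAP)
first.add(Scalar(STR, "name"), Scalar(STR, "YAML"))
second.add(Scalar(STR, "name"), Scalar(STR, "YAML"))
print(similar(first, second), first is second)
```

The choice to reject a key in add() when a similar key is already present mirrors the domain restriction in the collection definition; a real binding might instead enforce this when the collection is constructed.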