From: Clark C. E. <cc...@cl...> - 2002-09-12 03:27:31
Ok. With a revised model that allows key/value pairs to be ordered and to have duplicate keys, we have the following:

  A node is either a sequence, pairing, scalar, or alias; we call the variety of a node its "kind".

  A pairing is a series of zero or more pairs, where each pair consists of two nodes, a key and a value.

  A sequence is a series of zero or more nodes.

  A scalar is a series of zero or more characters. Scalar nodes may have one of the following "styles": block/literal, block/folded, quoted/double, quoted/single, unquoted.

  Both pairing and sequence nodes also have a "style" property, which is "flow" or "block", etc.

There is another option: sequences (-) could be treated in the information model as a pairing where the key is "null". Then the difference between a pairing (mapping key:value syntax) and a sequence (bulleted - syntax) could be treated as a style, much like there are quoted and block scalar styles. In this case, the model would read:

  A node is either a collection, scalar, or alias.

  A collection is a series of zero or more pairs, where each pair consists of two or more nodes.

  A scalar is a series of zero or more characters. Scalar nodes may have one of the following "styles": block/literal, block/folded, quoted/double, quoted/single, unquoted.

  Collection nodes also have a style, which can be one of "flow/pairing", "flow/sequence", "block/pairing", "block/sequence", etc.

In this case, the difference between the sequence and the pairing syntax would be the "style" attribute attached to each collection (just as the difference between the block and quoted syntax for scalars is a style). For sequence style, the "key" would be null. Thus, the only difference between

  ---
  !null: one
  !null: two

and

  ---
  - one
  - two

is that the style of the former would be "pairing", while the style of the latter would be "sequence". Just like the only difference between:

  --- "x x"

and

  --- \
   x x

is that the former style is "double quoted" and the latter style is "block folded".
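To make the two options concrete, here is a rough Python sketch of both models side by side. It is not any existing parser's API. The class and field names (Scalar, Alias, Sequence, Pairing, Collection, chars, target, pairs, style) are made up for illustration, a pair is assumed to hold exactly one key and one value, and representing the "null" key as Python's None is also an assumption. Style is left out of equality so that two nodes that differ only in style compare equal.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union


@dataclass
class Scalar:
    chars: str  # a series of zero or more characters
    # Style is presentation only, so it is excluded from equality.
    style: str = field(default="unquoted", compare=False)


@dataclass
class Alias:
    target: str  # hypothetical: name of the node this alias refers to


# First model: sequence and pairing are two distinct node kinds.
@dataclass
class Sequence:
    items: List["Node"] = field(default_factory=list)
    style: str = field(default="block", compare=False)  # "flow" or "block"


@dataclass
class Pairing:
    # A list of tuples keeps the pairs ordered and allows duplicate keys.
    pairs: List[Tuple["Node", "Node"]] = field(default_factory=list)
    style: str = field(default="block", compare=False)  # "flow" or "block"


# Alternative model: one collection kind; sequence vs. pairing is a style.
@dataclass
class Collection:
    # Assumption: the "null" key of a sequence entry is modelled as None.
    pairs: List[Tuple[Optional["Node"], "Node"]] = field(default_factory=list)
    style: str = field(default="block/pairing", compare=False)

    def keys(self) -> List[Optional["Node"]]:
        """Return the key of every pair, in document order."""
        return [key for key, _ in self.pairs]


Node = Union[Scalar, Alias, Sequence, Pairing, Collection]

# The "!null: one / !null: two" document and the "- one / - two" document.
as_pairing = Collection(
    [(None, Scalar("one")), (None, Scalar("two"))], style="block/pairing"
)
as_sequence = Collection(
    [(None, Scalar("one")), (None, Scalar("two"))], style="block/sequence"
)

# Under the alternative model they are the same node; only the style differs.
assert as_pairing == as_sequence
assert as_pairing.style != as_sequence.style

# Likewise, "x x" double quoted and "x x" block folded are the same scalar.
assert Scalar("x x", style="quoted/double") == Scalar("x x", style="block/folded")
```

Under the first model, Sequence([Scalar("one")]) and a Pairing never compare equal because they are different kinds. Under the alternative model, asking a sequence-style Collection for its keys simply gives back a list of None values.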
The advantage of this alternative model is that it is simpler. The "sequence" concept is merged into the "pairing" concept, with the difference being delegated to a "style". This could make the parser APIs simpler or more consistent.

...

Anyway, which way we choose to model this is important, since every implementation's parser API should probably reflect it. If we choose the former, there would be two sorts of node kinds: one for key/value pairs and one for just values. If we choose the latter, then only one node kind exists, and if you asked for the "key" of a list entry... it'd just return null.

So. Some thought would be nice here. For greater XML compatibility, we should choose the latter. Although I think that the former works better for a serialization use case with hashtable/array as primary data types. But since we aren't going to assume an application, perhaps normalizing it into a single "collection" is a better bet.

This, hopefully, is a clear example of why a Model is important. It sets our vocabulary for how we want to talk about parts of a YAML document (is it a pairing kind, or a pairing style of a collection?) and how our APIs should look. If we didn't have a model, people would use different words, and the APIs in different implementations might do it differently...

Clark