From: Tim P. <ti...@po...> - 2004-09-02 18:13:08
|
After spending some time looking around at the spec, the list and various other sources, I still don't understand why ordered mappings have to use the following (IMHO ugly) syntax:

    var: !omap
      - one: 1
      - two: 2
      - three: 3

If, as figure 3.2 in the spec suggests, the post-tokenized representation of a mapping is a list of key:value pairs, then shouldn't it be possible, and possibly more yamlish, if the following were allowed?

    var: !omap
      one: 1
      two: 2
      three: 3

This wouldn't cause problems with readability (it should be obvious that it's not a standard mapping) and a standard mapping would still be 'unordered'.

What problems would this cause?

Tim |
From: Clark C. E. <cc...@cl...> - 2004-09-02 18:51:19
|
Hello Tim. This belongs in a FAQ. ;)

One of the express goals of YAML is to have a very simple set of 'core' data types:

  - a map (unordered list of keys to values)
  - a list (ordered list of values)
  - and scalar.

These three things have a direct representation in YAML; all other types have to be 'simulated' as permutations of these items plus tags. While we could have included !omap in this 'assembler', it would make things more complicated for simple cases: any generic API or schema language would then have to worry about omaps as well as maps, lists, and scalars.

There are a few possible representations:

  - list of maps, where each map has only one pair
  - list of lists, where the first item in each list is a key, and the second item is a value
  - a mapping, plus a list of keys

We chose a list of maps because it expressed the intent the best; it's clearly ordered (witness the -).

...

Q: Why didn't we just use a map syntax? After all, the keys in the file are ordered, aren't they?

A: Well, most mappings in YAML aren't ordered. So someone reading the YAML file is free to "reorder" the keys without having to worry about it 'changing' the document.

A: If you required the mapping syntax to be 'ordered', then all native language bindings would have to try to 'preserve' the order, since it is considered significant. This would not allow the usage of native hashtables in the typical case.

In short, it gets ugly if you try to bend the rules. The line between what is in the syntax file gets blurry... do you report, for example, which style of list was used? If an application used this distinction to drive processing, then one could not hope to have universal "YAML" editors... in the extreme case, the editor would have to find a schema file to know what's important and what isn't -- the approach SGML took.
Q: Why not just fix the information model? People can always discard the order if they don't want it.

A: The original goal of YAML was to provide a nice syntax to represent sound, commonly used data structures like hashtables, records, lists, etc. If we modified the model to have omaps, we'd be requiring most languages to have an ordered dictionary. Python and Perl don't have ordered dictionaries. Classes in Java and other OO languages also aren't ordered. So, to keep with the primary goal, we're sticking with a mapping.

A: Further, the two collections, lists and mappings, happen to be functions with very well defined characteristics. Very nice schema, path, and transformation utilities can be made if all collections in YAML remain functions. An ordered map is not a function... it's a monkeywrench.

I hope this helps. A note to anyone out there -- this point is _not_ open for debate. It is clearly stated in the specification; it is a pillar design decision. If you don't like it, start your own language.

Sincerely Yours,

Clark |
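The three candidate representations listed in the message above can be sketched with plain Python values. This is only an illustration built from the core kinds (mapping, sequence, scalar); the helper names and the "keys"/"values" layout of the third form are assumptions made for the example, not anything defined by the YAML spec.

```python
# Three ways to model the ordered pairs one: 1, two: 2, three: 3.
# All helper names below are hypothetical, chosen for this sketch.

# 1. List of maps, each holding exactly one pair (the form !omap uses).
list_of_maps = [{"one": 1}, {"two": 2}, {"three": 3}]

# 2. List of lists: first item is the key, second item is the value.
list_of_lists = [["one", 1], ["two", 2], ["three", 3]]

# 3. An unordered mapping plus a separate list of keys giving the order.
#    The mapping is deliberately written in a different order.
mapping_plus_keys = {
    "keys": ["one", "two", "three"],
    "values": {"two": 2, "three": 3, "one": 1},
}


def pairs_from_list_of_maps(seq):
    # Each entry has a single item; take it.
    return [next(iter(entry.items())) for entry in seq]


def pairs_from_list_of_lists(seq):
    return [(key, value) for key, value in seq]


def pairs_from_mapping_plus_keys(doc):
    # Order comes only from the key list, never from the mapping itself.
    return [(key, doc["values"][key]) for key in doc["keys"]]


expected = [("one", 1), ("two", 2), ("three", 3)]
```

All three forms recover the same ordered pairs; in each case the order is carried by a sequence, never by a mapping.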
From: Tim P. <ti...@po...> - 2004-09-02 19:25:07
|
On Thu, 2004-09-02 at 19:51, Clark C. Evans wrote:
> One of the express goals of YAML is to have a very simple set of 'core' data types:
>
> - a map (unordered list of keys to values)
> - a list (ordered list of values)
> - and scalar.
>
> These three things have a direct representation in YAML; all other types have to be 'simulated' as permutations of these items plus tags. While we could have included !omap in this 'assembler', it would make things more complicated for simple cases, any generic API or schema language would then have to worry about omaps as well as maps, lists, and scalars.

but the layer that would construct the omap would be well above generic API's. Wouldn't it be

> There are a few possible representations:
> - list of maps, where each map has only one pair
> - list of lists, where the first item in each list is a key, and the second item is a value
> - a mapping, plus a list of keys
> We chose a list of maps because it expressed the intent the best, it's clearly ordered (witness the -).
>
> ...
>
> Q: Why didn't we just use a map syntax? After all, the keys in the file are ordered, aren't they?
>
> A: Well, most mappings in YAML aren't ordered. So someone reading the YAML file is free to "reorder" the keys without having to worry about it 'changing' the document.
>
> A: If you required the mapping syntax to be 'ordered', then all native language bindings would have to try to 'preserve' the order, since it is considered significant. This would not allow the usage of native hashtables in the typical case.
>
> In short, it gets ugly if you try to bend the rules. The line between what is in the syntax file gets blurry... do you report, for example, which style of list was used? If an application used this distinction to drive processing, then one could not hope to have universal "YAML" editors... in the extreme case, the editor would have to find a schema file to know what's important and what isn't -- the approach SGML took.
>
> Q: Why not just fix the information model, people can always discard the order if they don't want it
>
> A: The original goal of YAML was to provide a nice syntax to represent sound, commonly used data structures like hashtables, records, lists, etc. If we modified the model to have omaps, we'd be requiring most languages to have an ordered dictionary. Python and Perl don't have ordered dictionaries. Classes in Java and other OO languages also aren't ordered. So, to keep with the primary goal, we're sticking with a mapping.
>
> A: Further, the two collections, lists and mappings, happen to be functions with very well defined characteristics. Very nice schema, path, and transformation utilities can be made if all collections in YAML remain functions. An ordered map is not a function... it's a monkeywrench.
>
> I hope this helps. A note to anyone out there -- this point is _not_ open for debate. It is clearly stated in the specification, it is a pillar design decision, if you don't like it, start your own language.

I'm going to be a bit of a devil's advocate here so please bear with me.

Why don't you make the point by expressly stipulating that YAML readers and emitters purposely randomise the order of mappings just so that people don't get confused? Do you think this might annoy some people?

I think a lot of people, even though they are using unordered mappings, would like to see roundtripping preserving mapping order. The only way to do this is for the parser to preserve order. If this is the case, what is the harm in a userland type making use of this order in a very specific case? This wouldn't mean any special APIs or processing.
I don't understand why it would change anything beyond a final level where the data is parsed into an object and there aren't any preconditions on how this is done.

Isn't this along the same lines as the fact that comments should be discarded BUT people will inevitably desire a parser to roundtrip comments as well?

Finally, as a non-devil's advocate, what would happen if I wrote my own !omap type that did this? Obviously no one else's parsers would be able to build my omap objects without the corresponding implementation, but this is the same situation as any other types I might wish to write. I suppose I'm asking the question 'If I were to preserve order and make it available in my parser and write a local type that made use of it, would I be excommunicated?' :-)

Tim |
From: Oren Ben-K. <or...@be...> - 2004-09-02 19:43:54
|
> Why don't you make the point by expressly stipulating that YAML readers
> and emitters purposely randomise the order of mappings just so that
> people don't get confused. Do you think this might annoy some people?

1. This is exactly what happens when you read a mapping to a Perl hash table and then write it back.
2. Since it _is_ annoying, there are flags that help you prevent that.

> I think a lot of people, even though they are using unordered mappings,
> would like to see roundtripping preserving mapping order.

Sure. They'd also want their comments to be kept, as well as their indentation style, the way they folded long lines, which variant of escape sequence was used, and a zillion other things to be preserved. All these things are known as "syntax".

> The only way
> to do this is for the parser to preserve order.

No. Another way is for "editor"-like programs to preserve as much syntax as possible. It does not mean that a "loader"-style program also needs to preserve it.

> Isn't this along the same lines as the fact that comments should be
> discarded BUT people will inevitably desire a parser to roundtrip
> comments aswell?

Exactly.

> Finally as a non-devils advocate, what would happen if I wrote my own
> !omap type that did this, obviously noone elses parsers would be able to
> build my omap objects without the corresponding implementation, but this
> is the same situation as any other types I might wish to write?

The problem is that when I run my YAML-pretty-printer on your files, it will trash your key order. I like to have *my* files use alphabetic key order, so I ask my pretty-printer to print them this way.

> I
> suppose I'm asking the question 'If I were to preserve order and make it
> available in my parser and write a local type that made use of it, would
> I be excommunicated?'

If I were to insert random Klingon words into my E-mails, would you be excommunicated? Maybe not outright, but nobody will understand you :-)

Have fun,

Oren Ben-Kiki |
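The pretty-printer point can be made concrete with a small sketch. The two emitter functions are hypothetical stand-ins for a real YAML pretty-printer and handle only flat data; the assumption is simply that the printer may sort mapping keys but must keep sequence order, since only the latter is data.

```python
def emit_mapping(mapping):
    """Emit a flat mapping with keys in alphabetical order (the printer's choice)."""
    return [f"{key}: {mapping[key]}" for key in sorted(mapping)]


def emit_omap(seq):
    """Emit a sequence of one-pair mappings; sequence order is data, so keep it."""
    lines = []
    for entry in seq:
        (key, value), = entry.items()  # exactly one pair per entry
        lines.append(f"- {key}: {value}")
    return lines


plain = {"one": 1, "two": 2, "three": 3}
omap = [{"one": 1}, {"two": 2}, {"three": 3}]

plain_out = emit_mapping(plain)  # ['one: 1', 'three: 3', 'two: 2']
omap_out = emit_omap(omap)       # ['- one: 1', '- two: 2', '- three: 3']
```

A file that relied on key order in the plain mapping form would be silently changed by such a tool, while the list-of-maps form survives because its order lives in a sequence.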
From: Clark C. E. <cc...@cl...> - 2004-09-02 20:01:20
|
On Thu, Sep 02, 2004 at 08:24:55PM +0100, Tim Parkin wrote:
| Why don't you make the point by expressly stipulating that YAML readers
| and emitters purposely randomise the order of mappings just so that
| people don't get confused. Do you think this might annoy some people?

The Syck parser completely avoids this issue; it provides the mapping pairs in a single function call via post-order traversal. I recommend this for all loader interfaces; it guarantees compliance with the spec. (I'd make the lower-level interface a 'lexer' reporting far more detail, such as where the line breaks are, etc.)

| I think a lot of people, even though they are using unordered mappings,
| would like to see roundtripping preserving mapping order. The only way
| to do this is for the parser to preserve order. If this is the case,
| what is the harm in a userland type making use of this order in a very
| specific case. This wouldn't mean any special APIs or processing.

There is _nothing_ in the specification that says you couldn't create a 'shadow' object storing "Presentation" level attributes so that printing the document back out won't mangle all of the user's work. In fact, one could even add markers where line endings on block scalars happened. However, these 'style' objects should be very clearly marked in the API.

There are also use cases of applications which only use the Presentation model, namely editors. As long as the models are clearly spelled out by the API, it's sufficient. If someone is going to break the model, I'd like to make sure that they realize what they are doing. Preventing them from breaking the model is unwise and impractical.

| don't understand why it would change anything beyond a final level where
| the data is parsed into an object and there aren't any preconditions on
| how this is done?

Well, for one thing, your omap wouldn't work with syck's parser, because by the time the loader gets the keys, they are in a hashtable.
You actually need to do some sort of hashing at the parser level to detect duplicate keys and report them as errors.

| Isn't this along the same lines as the fact that comments should be
| discarded BUT people will inevitably desire a parser to roundtrip
| comments aswell?

Hope the above item about having a 'separate interface' makes sense. For example, node.getPresentation().indentation would be a nice way to provide how indented a node is. It makes it clear that one is dealing with the presentation.

| Finally as a non-devils advocate, what would happen if I wrote my own
| !omap type that did this, obviously none elses parsers would be able to
| build my omap objects without the corresponding implementation, but this
| is the same situation as any other types I might wish to write? I
| suppose I'm asking the question 'If I were to preserve order and make it
| available in my parser and write a local type that made use of it, would
| I be excommunicated?'

I'd prefer if the loader used post-order traversal, presenting a mapping as a dict(). Note, someone could just do:

    def makeObject(aDict):
        ob = MyObject()
        ob.__dict__ = aDict   # reuse the loaded dict as the instance's attributes
        return ob

So, it's not like this has to be hugely inefficient. ;)

Cheers!

Clark |
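One way to read the 'shadow object' and duplicate-key remarks together is sketched below. Everything here is hypothetical: the class names, the `get_presentation()` accessor (modelled loosely on the `node.getPresentation().indentation` idea in the message) and the pair-list input are assumptions for illustration, not Syck's or any other library's API.

```python
from dataclasses import dataclass, field


@dataclass
class Presentation:
    """Syntax-level details, kept apart from the data (hypothetical)."""
    key_order: list = field(default_factory=list)
    indentation: int = 0


@dataclass
class MappingNode:
    value: dict                  # the data: an ordinary unordered mapping
    presentation: Presentation   # the shadow: how it looked in the file

    def get_presentation(self):
        return self.presentation


class DuplicateKeyError(ValueError):
    pass


def load_mapping(pairs, indentation=0):
    """Build a mapping node from (key, value) pairs as a parser might report them."""
    value = {}
    order = []
    for key, item in pairs:
        if key in value:  # the hash lookup is what catches duplicates
            raise DuplicateKeyError(f"duplicate key: {key!r}")
        value[key] = item
        order.append(key)
    return MappingNode(value, Presentation(order, indentation))


node = load_mapping([("two", 2), ("one", 1)], indentation=2)
```

An application that reads `node.value` sees only the mapping; an editor that wants to write the file back unchanged has to ask for `node.get_presentation()` explicitly, which makes it visible that it is relying on presentation rather than data.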
From: Brian I. <in...@tt...> - 2004-09-02 22:31:44
|
On 02/09/04 14:51 -0400, Clark C. Evans wrote:
> I hope this helps. A note to anyone out there -- this point is _not_
> open for debate. It is clearly stated in the specification, it is a
> pillar design decision, if you don't like it, start your own language.

+1

This innocent question was argued to the point of torture for 3 long months in the Fall of 2002. It almost divided YAML permanently.

I was arguing your side Tim, and it took a long time for me to finally come around.

I believe the key assertion was:

If Steve creates YAML with key order X, and Brian processes it and returns it to Steve with key order Y, is this OK?

The answer is "yes".

This is not to say that applications cannot be created to do such things, but key order preservation is not an expected property of YAML mappings, and thus it is better to use omaps.

YAML.pm has a dumper/emitter interface that enables key order if you want it. It takes more setup of course but you can't avoid that. It also has a parser/loader method of shadowing that you can turn on such that YNY key order will be preserved. It's expensive application-wise, but sometimes you want that. It's a nice feature to have.

But if I am going to process a mapping from a PHP program, am I always expected to use this feature? No way. PHP internal hashes are omaps, not YAML mappings. So if a PHP application really cares about that order, it needs to serialize its hashes with omaps.

Does this make PHP and YAML not be as good a fit as Python and YAML? I would say, to a degree, yes. But Java is probably a whole lot worse :)

Here is the one example that crystallizes it for me. In Perl I would always write a hash assignment for months like this:

    %months = (
        Jan => 'January',
        Feb => 'February',
        Mar => 'March',
        Apr => 'April',
        ...
    );

It read nicely. But I know, and everyone else knows: "That there hash don't preserve no order once it been compiled!".

Cheers, Brian |
From: Tim P. <ti...@po...> - 2004-09-02 22:45:29
|
On Thu, 2004-09-02 at 23:23, Brian Ingerson wrote:
> On 02/09/04 14:51 -0400, Clark C. Evans wrote:
> > I hope this helps. A note to anyone out there -- this point is _not_
> > open for debate. It is clearly stated in the specification, it is a
> > pillar design decision, if you don't like it, start your own language.
>
> +1
>
> This innocent question was argued to the point of torture for 3 long months in the Fall of 2002. It almost divided YAML permanently.
>
> I was arguing your side Tim, and it took a long time for me to finally come around.
>
> I believe the key assertion was:
>
> If Steve creates YAML with key order X, and Brian processes it and returns it to Steve with key order Y, is this OK?
>
> The answer is "yes".

I just sent Clark a message with the sentence "if the YAML standard says mappings don't have to be ordered a roundtrip could remove meaning from a document that uses any 'special' mapping". I can see both sides of the coin now and don't necessarily have to agree to be able to carry on using YAML :-) I'm more than happy with the decision although I can see why it was a difficult one. I would have voted with you Brian but I wouldn't have been vociferous about it as there are more important things in life (and YAML).

I assume the two sides were:

i) mappings are internally ordered. By default they are emitted as unordered mappings. If a mapping is preceded by a !omap then they are ordered. If a language doesn't have a native ordered mapping then a representation is used that simulates one. This keeps roundtripping and is self consistent.

ii) mappings are internally unordered. An ordered mapping needs to be built using ordered lists and unordered mappings.

The first hides a layer of complexity in order to facilitate ordered mappings which are rarely used in order to make the syntax of ordered mapping slightly nicer.
The second is more consistent with most languages and uses the base atomic units in a consistent fashion. As such it is the 'cleanest' at the expense of some contrivance in the representation of ordered mappings.

I'm only posting this in order to understand why the decision was made, not to challenge it. If anybody cares to add to this that would be cool and I'll pass the response on to my colleagues who have raised questions.

Sorry about the distraction, I didn't want to interrupt the riveting discussion currently taking place so please carry on :-)

Tim

ps I could argue that mappings are, at their most fundamental, lists of key-value pairs (which are themselves lists) and hence ordered but if I did that I might get shot and so won't ;-) |
From: Clark C. E. <cc...@cl...> - 2004-09-02 22:55:19
|
On Thu, Sep 02, 2004 at 11:45:14PM +0100, Tim Parkin wrote:
| i) mappings are internally ordered. By default they are emitted as
| unordered mappings. If a mapping is preceded by a !omap then they are
| ordered. If a language doesn't have a native ordered mapping then a
| representation is used that simulates one. This keeps roundtripping and
| is self consistent.
|
| ii) mappings are internally unordered. To construct an ordered mapping
| needs to be built using ordered lists and unordered mappings.
|
| The first hides a layer of complexity in order to facilitate ordered
| mappings which are rarely used in order to make the syntax of ordered
| mapping slightly nicer.
|
| The second is more consistent with most languages and uses the base
| atomic units in a consistent fashion. As such it is the 'cleanest' at
| the expense of some contrivance in the representation of ordered
| mappings.

Yes. It is a trade-off, but the impact is quite vicious: it ripples through the API and it imposes burdens on languages that don't order their mappings -- perl and python.

The (perhaps more important) consideration for Oren and myself is that mappings/sequences are both functions (a sequence is a function on integers). And this is a very nice model for defining tools that work on YAML documents, such as schemas, and other fun nerdly stuff. YPath expressions, for example, would be quite hard to write if you had to represent lists, unordered dictionaries, sets, ordered maps, etc.

Our position emerged more as an objection to the "mindless" complexity of XML's model; we were looking for something that was "sufficient" to store most data cleanly, but only including items that were "necessary". ;)

| ps I could argue that mappings are, at their most fundamental, lists of
| key-value pairs (which are themselves lists) and hence ordered but if I
| did that I might get shot and so won't ;-)

No, you'd just be talking about something that isn't a mapping (a mapping is just another name for function...), you'd be talking about a structure with two mappings, one from the integers onto the keys, and one from the keys onto values. The resulting item is quite a complex object... and not a good 'assembler' instruction, IMHO. Besides, Python mappings don't preserve key order. ;)

*bings*

--
Clark C. Evans
Prometheus Research, LLC.
http://www.prometheusresearch.com/
office: +1.203.777.2550
mobile: +1.203.444.0557
Prometheus Research: Transforming Data Into Knowledge
  - Research Exchange Database
  - Survey & Assessment Technologies
  - Software Tools for Researchers |
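The 'collections are functions' argument can be illustrated with a short sketch. The `lookup` helper is a hypothetical stand-in for a path tool (it is not YPath), and the two-structure view of an ordered map follows the description in the message above: one mapping from integers onto keys, one from keys onto values.

```python
def lookup(node, path):
    """Apply each path step as a function argument.

    Works the same way for sequences (integer -> value) and
    mappings (key -> value), because both behave as functions.
    """
    for step in path:
        node = node[step]
    return node


doc = {"var": [{"one": 1}, {"two": 2}, {"three": 3}]}
found = lookup(doc, ["var", 1, "two"])  # mapping step, sequence step, mapping step

# An ordered map seen as two functions rather than one:
position_to_key = ["one", "two", "three"]           # integers -> keys
key_to_value = {"one": 1, "two": 2, "three": 3}     # keys -> values


def omap_items(position_to_key, key_to_value):
    """Compose the two functions to recover the ordered pairs."""
    return [(key, key_to_value[key]) for key in position_to_key]


ordered_pairs = omap_items(position_to_key, key_to_value)
```

With only sequences and mappings, a path tool needs a single rule per step. A native ordered-map node would need both an index rule and a key rule on the same node, which is the extra complexity the message objects to.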
From: David H. <dav...@bl...> - 2004-09-02 18:54:04
|
Tim Parkin wrote:
> After spending some time looking around at the spec, the list and
> various other sources, I still don't understand why ordered mappings
> have to use the following (IMHO ugly) syntax :-
>
> var: !omap
>   - one: 1
>   - two: 2
>   - three: 3
>
> If, as figure 3.2 in the spec suggests, the post tokenized
> representation of mappings are lists of key:value pairs, then shouldn't
> it be possible, and possibly more yamlish, if the following were
> allowed.
>
> var: !omap
>   one: 1
>   two: 2
>   three: 3
>
> This wouldn't cause problems with readability (it should be obvious that
> it's not a standard mapping) and a standard mapping would still be
> 'unordered'.
>
> What problems would this cause?

IIUC, the conversion of a document to the information model doesn't depend on the tags, only on the syntax (if it depended on the tag, then there would be difficulties in adding other tags that require an ordered mapping). Because it is ordered, an !omap is defined as a sequence of one-key mappings in the information model, therefore it needs the '-' syntax.

There are alternative possible designs, such as defining all mappings to be ordered with duplicates, and then treating some as unordered depending on context, but they don't map as well to the data models of some of the languages that were considered important in the design of YAML.

--
David Hopwood <dav...@bl...> |
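David's description suggests a two-step picture: the syntax alone yields a sequence of one-key mappings, and only afterwards does the tag turn that node into a native ordered type. A minimal sketch of the second step follows; `construct_omap` is a hypothetical helper, and rejecting duplicate keys is an assumption about how an omap type would be expected to behave.

```python
from collections import OrderedDict


def construct_omap(node):
    """Turn an already-parsed sequence of one-key mappings into an OrderedDict."""
    if not isinstance(node, list):
        # A plain mapping node carries no order to build from.
        raise TypeError("!omap expects a sequence of one-key mappings")
    result = OrderedDict()
    for entry in node:
        if not isinstance(entry, dict) or len(entry) != 1:
            raise ValueError("each !omap entry must be a mapping with exactly one key")
        (key, value), = entry.items()
        if key in result:
            raise ValueError(f"duplicate key in !omap: {key!r}")
        result[key] = value
    return result


parsed = [{"one": 1}, {"two": 2}, {"three": 3}]  # what the '-' syntax produces
omap = construct_omap(parsed)
```

Because the constructor only ever sees a sequence, the parser needs no knowledge of the tag, and other tags that want ordered pairs can reuse the same node shape.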
From: Oren Ben-K. <or...@be...> - 2004-09-02 19:25:13
|
I put both David's succinct answer and Clark's detailed one in the wiki:

http://yaml.kwiki.org/index.cgi?OrderedMaps

One day we'll collect all these and make a proper FAQ.

Have fun,

Oren Ben-Kiki |