From: Tim P. <ti...@po...> - 2006-04-15 11:39:21
Clark C. Evans wrote:
>
> Possible Resolutions
> ====================
>
> Uncle! Just Make it Illegal
> ---------------------------
>
> One solution is just to forbid stuff like {key:value}, and [key:value].
> This philosophy may break some existing documents, although when they
> break, they won't have a (subversive!) different interpretation.
>
> What this does is enforce the best practice of separating out key/value
> pairs with a ": " for readability and it is consistent with the block
> style. What this does do, however, is make the following illegal:
>
>   availability: [ 10:00, 12:30, 14:30 ]
>   fun-websites: [ http://yaml.org, http://htsql.org, http://slashdot.org ]
>   dependencies: [ Data::Dumper, Go::Fetch, My::Favorite::Package ]
>
> At first it seems that the solution is to quote the values and be done
> with it. But this isn't exactly correct. Unfortunately, by quoting a
> string, you turn off implicit typing. While 10:00 is a time object,
> "10:00" is a string value. There is a big difference. What you have to
> do in this case is either type each node, use block syntax, or use "!"
> to force implicit typing.
>
>   availability: [ ! "10:00", ! "12:30" , ! "14:30" ]
>
> While the first version is perfectly clear to a secretary who is looking
> at time availabilities for a meeting; the second one... it is unusable.
> The typing information gets in the way. The only option is then to have
> some sort of path-based typing mechanism; and tossing out the convenience
> of implicit typing within flow collections.
>
> gains:
>   - No Chance of mis-interpretation
>
> losses:
>   - more verbosity in times, timestamps, urls, and urns.
>   - for all practical purposes, implicit typing on the above
>   - current data that is non-compliant must be fixed
>
> The Javascript Way, Baby!
> -------------------------
>
> Another solution is to treat {bing:false} as it would be treated in
> Javascript, as a key value mapping 'bing' onto the ``false`` boolean
> value. This is a relatively simple proposition; it has implications
> very similar to above.
>
> gains:
>   - compatibility with Javascript and expectations
>
> losses:
>   - Possible mis-interpretation, e.g. 10:20 is a time in a block
>   - Current data which stores lists like above, will break with
>     no warning.
>
>
> Allow, but Complain if not obvious
> ----------------------------------
>
> This solution keeps the current behavior, {key:value} remains equivalent
> to { ! "key:value": ! "" }, where ! forces implicit typing on a quoted
> scalar value. However, it sets-aside a set of Regular Expressions to
> check. If the value contains a ":" but does not match a RE, then a
> complaint is issued.
>
> gains:
>   - documents currently following the spec remain valid
>   - we keep unquoted time/dates, and urls in flow collections
>
> losses:
>   - Javascript people might get a bit confused
>   - possible (but unlikely) chance of mis-interpretation
>
> I want to assert that mis-interpretation in this case is unlikely. In
> Javascript, keys cannot be integers; although they can in Python;
> although Python has a style guide that recommends using ": ". Even if
> someone gets the rule wrong in one or two spots, chances are most of
> these spots _won't_ pass the regular expression test so they will be
> reported.
>
> The regular expressions would be, necessarily, quite strict. For
> time/dates, we would follow the ISO standard, including requiring
> leading 0's, for clarity. For URLs and such, we would only allow
> well-known prefixes, such as "http", "https", "mailto", etc. Hence,
> anything that doesn't clearly look like a time or a URL would, by
> default, be flagged as a possible error.
>
> What we mean by ``complain`` is left to be explored. I'm thinking a
> warning; but it could be an error that requires the user to explicitly
> register a regular expression to allow his/her particular use case. I do
> think, however, that the warnings should be configurable (but perhaps
> this is debatable).
>
> NOTE: this error would probably be a load error, and would not
>       be a parse error
>
> Conclusion
> ==========
>
> Thus far, we have tentative agreement among Ingy, Oren and myself to
> go with this last solution, permitting colons in flow-scalars, but
> complaining if they don't match a regular expression.
>
> However, this is a big decision; and perhaps a controversial one,
> so I want to make sure that we have broad community support. As
> I stated before, it's not an easy choice... I'm looking forward
> to each of your statements on the problem and potential solutions.
>
>
> Kind Regards,
>
> Clark C. Evans
>
> P.S. Please be advised that many, many users of YAML are _not_
> programmers; they are business people, medical researchers, and other
> power users who are just trying to *read* the data. If you want an
> opaque serialization language that is perfectly free of ambiguity, use
> XML, JSON, or, quite soon a JSON++ which will add YAML aliases/anchors
> and type library support to JSON with an extended syntax (a very
> restricted, JSON compatible flow syntax subset of YAML).

As one who thinks implicit typing as implemented in yaml is an admirable
goal but ultimately confusing, succinct explicit typing is more important
to me than clever implicit typing. Hence the set of times above would be
better implemented as part of a schema or explicit type, e.g.

  [ !time 10:00, !time 10:30 ]

or

  !times [ 10:20, 10:30 ]

This means that I'd like everything to be supplied as strings. This
means that the 'type' warnings mentioned above would be checking types
and then discarding that information (as long as it can do this then I'm
ok with it).

The only way we could end up using implicit typing is if we checked the
values that were parsed to make sure they had been correctly
'implicitly' typed. If we're doing this, we might as well apply our own
explicit, but external, typing.

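To make the 'explicit, but external' idea a bit more concrete, here is a rough Python sketch. It assumes the loader has already handed back every scalar as a plain string; the names `SCHEMA`, `parse_time` and `apply_schema` are made up for illustration and are not part of any YAML library.

```python
from datetime import time


def parse_time(text):
    """Convert an 'HH:MM' string into a datetime.time."""
    hours, minutes = text.split(":")
    return time(int(hours), int(minutes))


# Hypothetical external schema: top-level key -> converter applied to
# each item of the list stored under that key.
SCHEMA = {
    "availability": parse_time,  # a list of times
    "fun-websites": str,         # URLs stay as plain strings
}


def apply_schema(document, schema):
    """Return a copy of document with every list item converted per schema."""
    typed = {}
    for key, values in document.items():
        convert = schema.get(key, str)  # keys without a schema entry stay strings
        typed[key] = [convert(value) for value in values]
    return typed


# What a loader would return if nothing were implicitly typed.
raw = {
    "availability": ["10:00", "12:30", "14:30"],
    "fun-websites": ["http://yaml.org", "http://htsql.org"],
}

typed = apply_schema(raw, SCHEMA)
```

With this approach the document itself stays free of type tags, and a value such as 10:00 only becomes a time because the schema says so, not because it happens to look like one.
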
Of course I might be missing the critical use of implicit typing, so I'll
ask if anybody can tell me why it's of so much importance? (This isn't
meant as a challenge btw; I have to explain yaml's features to people
when I'm evangelising, and the implicit typing is one of its features
that seems to get in the way more than it sells.)

In summary, I think I'd be happiest with the third option as long as I
can still read everything in as strings unless explicitly typed.

Tim Parkin