From: Oren Ben-K. <or...@be...> - 2006-04-11 16:41:56
|
Hi people,

Sorry for the long absence - personal reasons. I hope to get more YAML time from now on.

Brian, Clark and I discussed the JSON compatibility issue. We basically see two options:

1. Only require a space after the ":" if the key is plain. So:

    "foo":123             # Becomes valid
    12:34: Meeting        # Stays valid
    bar: http://yaml.org  # Stays valid

This is backward compatible, and compatible with machine-generated JSON. It is _not_ compatible with hand-written "JSON-like" data like {foo:bar} - JSON insists on quotes, but hand-written data often lacks them.

2. Disallow ":" in plain keys.

    "foo":123             # Becomes valid
    12:34: Meeting        # Becomes error
    bar: http://yaml.org  # Stays valid

This is not backward compatible, as keys with ":" in them will break. To minimize this, we can specify that the above rule only holds in flow context (inside {}). Presumably all YAML that contains : in keys has it in block context (outside {}). So {12:34: meeting} will be an error but the above example will stay valid.

Personally, I feel that (1) is a better way, as it is fully backward compatible and fully JSON compatible. AFAIK the vast majority of JSON is written by machines and not by people, and the JSON spec is clear that " are mandatory so all libraries emit them. I also think that having such a subtle difference between block and flow keys would be too confusing. Finally, option (1) is such a minor relaxation of the spec that I doubt it's worth bumping YAML's version number for it. Option (2) is definitely a YAML 1.2 thing, which I feel is overkill.

Brian has been... persuaded... to accept option (1) <grin>, which leaves Clark to get a consensus. Clark?

Have fun,

Oren Ben-Kiki |
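Option 1 above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the message, not code from any YAML parser: `split_pair` is a hypothetical helper, it ignores escapes inside quoted keys, and it does not strip trailing `#` comments.

```python
def split_pair(entry):
    """Split one mapping entry into (key, value) under option 1.

    Returns None when the entry is not a key/value pair under that rule.
    Hypothetical helper: no escape handling, no comment stripping.
    """
    entry = entry.strip()
    if entry.startswith('"'):
        # Quoted key: a ":" directly after the closing quote is enough.
        close = entry.find('"', 1)
        if close != -1 and entry[close + 1:close + 2] == ":":
            return entry[1:close], entry[close + 2:].strip()
        return None
    # Plain key: the separator is still ": " (colon followed by a space).
    key, sep, value = entry.partition(": ")
    if not sep:
        return None
    return key, value.strip()
```

With this reading, `"foo":123` splits into a key and a value, `12:34: Meeting` keeps `12:34` as its key, and `foo:bar` is not a pair at all, which is exactly the case the message says option 1 leaves unsolved.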
From: Clark C. E. <cc...@cl...> - 2006-04-11 20:19:42
|
Another issue before we start is permitting TABs within a flow collection. This is absolutely needed for JSON compatibility as some "pretty-printing" emitters use tabs and they are perfectly permitted there.

On Tue, Apr 11, 2006 at 09:41:50AM -0700, Oren Ben-Kiki wrote:
| 1. Only require a space after the ":" if the key is plain. So:
|
|     "foo":123             # Becomes valid
|     12:34: Meeting        # Stays valid
|     bar: http://yaml.org  # Stays valid
|
| This is backward compatible, and compatible with machine-generated
| JSON. It is _not_ compatible with hand written "JSON-like" data like
| {foo:bar} - JSON insists on quotes, but hand-written data often lacks
| them.

Well, luckily, ``{foo:bar}`` in Javascript assumes that ``bar`` is a previously declared variable. Hence, in JSON-like data that doesn't use variables, this is a ``ReferenceError: bar not declared``.

That said, it is still... confusing. Mostly because one single space in a bracketed context makes a huge difference in semantics. {foo:bar} is interpreted in current YAML as {"foo:bar": ""} and this just is not intuitive. I have a few very smart friends using YAML, and this has come up more than once. They like not having to use quotes... but the current interpretation of this example is profoundly unobvious.

| 2. Disallow ":" in plain keys.
|
|     "foo":123             # Becomes valid
|     12:34: Meeting        # Becomes error
|     bar: http://yaml.org  # Stays valid
|
| This is not backward compatible, as keys with ":" in them will break.

I don't mind this option, ``12:34: Meeting`` is not all that clear; it is also rare enough that its impact will be minimal.

| To minimize this, we can specify that the above rule only holds in
| flow context (inside {}). Presumably all YAML that contains : in keys
| has it in block context (outside {}). So {12:34: meeting} will be an
| error but the above example will stay valid.

Yea, let's call this option #3. It follows the example of not permitting a comma in flow constructs.

A more draconian version, #4, simply forbids a colon and comma in flow constructs. This might break a few more documents, but I doubt it. It also is more rule-like.

| Personally, I feel that (1) is a better way, as it is fully backward
| compatible and fully JSON compatible. AFAIK the vast majority of JSON
| is written by machines and not by people, and the JSON spec is clear
| that " are mandatory so all libraries emit them.

Well, as I stated above; {key:value} isn't even valid Javascript.

| I also think that having such a subtle difference between block
| and flow keys would be too confusing.

Don't we already make a subtle difference here with regard to commas? I'm all for expanding this subtle difference to make it much more clear: don't use any indicators within plain style when you're in-flow.

| Finally, option (1) is such a minor relaxation of the
| spec that I doubt it's worth bumping YAML's version number for it.
| Option (2) is definitely a YAML 1.2 thing, which I feel is
| overkill.

A. None of the current parsers (except Kirill's recent PyYaml) really support 1.2, and this is the only place where Kirill's parser deviates from the specification and returns what I would expect:

    >>> yaml.load_document("--- {key:value}")
    {'key': 'value'}

Note that Syck returns the correct result as specified, but it is quite unexpected to all but the most "inner circle" YAMLers:

    >>> syck.load("--- {key:value}")
    {'key:value': None}

B. I'm very comfortable making this change if it *breaks* existing documents by making things illegal (rather than changing their interpretation).

My vote then is for option #4: Ban all indicators in in-flow plain scalars.

| Brian has been... persuaded... to accept option (1) <grin>, which
| leaves Clark to get a consensus. Clark?

Well... I attempted to change his mind after you left. I'm simply not comfortable with {key:value} having a totally unobvious interpretation.

Best,

Clark |
From: Oren Ben-K. <or...@be...> - 2006-04-11 21:14:39
|
Clark:

> Another issue before we start is permitting TABs within a flow collection.
> This is absolutely needed for JSON compatibility as some "pretty-printing"
> emitters use tabs and they are perfectly permitted there.

Ugh! We can allow them in separation spaces. But not in indentation. That will work, but still... UGH!

> A more draconian version, #4
> simply forbids a colon and comma in flow constructs. This might break
> a few more documents, but I doubt it. It also is more rule-like.

This kills unquoted URL, time and namespace values (Perl modules etc.) - in flow context. But it is a trivial change to the spec. Note that if we do this, we don't need to require a space after the colon _anywhere_, because plain keys are subject to all the restrictions of flow keys (for a good reason). I must say I'm tempted... but see below.

> I'm all for expanding this subtle difference to make it much more clear:
> don't use any indicators within plain style when you're in-flow.

Well, that's not actually the case; you can have ! * & in plain scalars, as long as they aren't the first char. And you can have # as long as there's no space before it. BTW, under option #4, the # would be the only space-sensitive character we have in the whole spec. It's needed to allow plain URLs in block context.

> | Finally, option (1) is such a minor relaxation of the
> | spec that I doubt its worth to bump YAML's version number for it.

This holds even if we allow tabs in separation spaces - it's just another relaxation.

> | Option (2) is definitely a YAML 1.2 thing, which I feel is an
> | overkill.

#4 is also very much YAML 1.2.

> A. None of the current parsers (except Kirill's recent PyYaml) really
> support 1.2,

You mean 1.1? There's no 1.2 yet.

> B. I'm very comfortable making this change if it *breaks* existing
> documents by making things illegal (and not changes their
> interpretation).

"This turns out not to be the case". Proposal #4 will silently change the meaning of [12:34] to [ "12": "34" ] - and so will _any_ proposal addressing your {foo:bar} concern.

Bottom line, do you think solving {foo:bar} is worth breaking [12:34]? If yes, #4 is fine. Otherwise, it's #1.

Have fun,

Oren Ben-Kiki |
From: Oren Ben-K. <or...@be...> - 2006-04-12 17:21:49
|
After a long chat in #yaml about this, we came up with the following. We modify two rules:

- In all plain scalars, ":" must be followed by an ASCII non-space, non-letter (could be a 2nd ":"). So 10:30, Perl::Module and http://foo are valid unquoted plain scalars. Anything else (e.g., foo:bar) isn't.

- Hash ":" will no longer require a space after it. So JSON {"foo":"bar"} would work fine, and so will {foo:bar}.

This is JSON compatible and also backward compatible with (almost?) all existing uses of ":" in plain scalars.

Does anyone see compatibility issues with this? Places where : is used that would break if we switch to these rules?

Have fun,

Oren Ben-Kiki |
From: TRANS <tra...@gm...> - 2006-04-12 18:19:00
|
On 4/12/06, Oren Ben-Kiki <or...@be...> wrote: > After a long chat in #yaml about this, we came up with the following. > We modify two rules: > > - In all plain scalars, ":" must be followed by an ASCII non-space, > non-letter (could be a 2nd ":"). So 10:30, Perl::Module and http://foo > are valid unquoted plain scalars. Anything else (e.g., foo:bar) isn't. > > - Hash ":" will no longer require a space after it. So JSON > {"foo":"bar"} would work fine, and so will {foo:bar}. > > This is JSON compatible and also backward compatible with all(most?) > all existing uses of ":" in plan scalars. > > Does anyone see compatibility issues with this? Places where : is used > that would break if we switch to these rules? I guess I don't understand why you wouldn't just make ':' illegal in plain keys? T. |
From: Oren Ben-K. <or...@be...> - 2006-04-12 19:02:13
|
> After a long chat in #yaml about this, we came up with the following.
> We modify two rules:

It seems I jumped the gun on this. The problem is things like foo:1. Under the rules I posted, it would be a valid single value - not what users would expect.

We have several options, none of which look great. Our problem is that we want to achieve all of the following:

- Allow the use of date/time values in flow collections (10:20).
- Allow people to type foo:bar and foo:2 as key/value pairs (as a side effect, this solves JSON compatibility, but that's a fringe benefit; we can do JSON compatibility in other, simpler ways).
- Allow Perl::Module as a key, at least in block mappings, because a ton of CPAN .yml files use that.
- Allow unquoted URLs, especially "http://whatever"

One way to achieve this would be to say that a ":" is never allowed in keys, and that it is only allowed in values if it "looks like a content character", judged by looking before and after it. The problem is there are always strange edge cases; for example, it's hard to allow 10:20 as a time, but interpret a10:20 as a key/value pair.

So it is still TBD. Input is welcome.

Have fun,

Oren Ben-Kiki |
From: <in...@tt...> - 2006-04-13 07:57:48
|
On 12/04/06 18:01 -0100, Oren Ben-Kiki wrote:
> > After a long chat in #yaml about this, we came up with the following.
> > We modify two rules:
>
> It seems I jumped the gun on this. The problem is things like foo:1.
> Under the rules I posted, it would be a valid single value - not what
> users would expect.
>
> We have several options, none of which look great. Our problem is that
> we want to achieve all of the following:
> - Allow the use of date/time values in flow collections (10:20).
> - Allow people to type foo:bar and foo:2 as key/value pairs (as a side
> effect, this solves JSON compatibility, but that's a fringe benefit; we
> can do JSON compatibility in other, simpler ways).
> - Allow Perl::Module as a key, at least in block mappings, because a
> ton of CPAN .yml files use that.
> - Allow unquoted URLs, especially "http://whatever"

I tend to disagree with the spirit of these goals. Let's get a little more specific. The thing that started this discussion is that we want to tweak YAML's *flow* collections style, such that:

1) Valid JSON streams (or at least common JSON streams) are valid YAML.

From this it follows that:

2) In *flow* collections only, we need to allow a key/value pair to be separated by a colon without a following space.

This is actually being generous. YAML supports key/value pairs in both flow mappings and sequences. JSON only has them in mappings.

It also follows that:

3) We don't need to change *block* collections at all.

Block collection rules are perfect as is. At least in this context. I don't want to monkey with them. Specifically, block key/value pairs require ': ' as a separator. They will always require this. So there is no need to second guess a ':' in a key here.

This needs to be reiterated. There are many standard applications using YAML to store metadata. I'm willing to bet that they all use block collections only. I would further bet that the creators of these standards don't even know YAML *has* flow collections. Let's face it, when people think of YAML, they think of indented, colon separated data. The cutesy flow stuff, while well specified, is the redheaded stepchild of YAML. Nobody, outside the inner circle, cared about it until JSON came along. AFAIK, no emitters emit a mixture of block and flow collections.

So what I am getting at is, let's take time to get flow collections right, before usage of it becomes widespread like it is for block. And let's leave block completely alone.

3a) The Perl module as a mapping key use case is not important, since it only needs to work in block mappings, as it already does, and we aren't changing block mappings, right?

The remaining issues involve:

4) What to do about ':' in *plain* *key* *flow* collection context (possibly in mappings only).

4a) The use cases are http://example.com (url), 16:20 (time), foo:bar (other). Where folks want 'other' to mean a key value pair.

I would rather that whatever rule we come up with be one that is simple and easy to remember. One such rule is:

5) Forbid a colon in a plain key in a flow collection.

This is simple as far as rules go. It means that using times and urls as keys (in flow collections ONLY) requires quoting them, and that foo:bar is just an error. An error that keeps you from shooting yourself in the foot. Do we really need plain times and urls as flow keys? If so, why?

It also prevents:

    [10:20, 10:30, 10:45]

So:

5a) Forbid a colon in a plain key in a flow *mapping*.

This allows the list of times.

Another simple rule is:

6) You can omit the space after the colon in a key/value flow collection separator, IF AND ONLY IF the key is quoted.

This rule is pretty simple, is just a tiny deviation from current YAML, is totally backwards compatible (since it is just a relaxation), and serves the original purpose of JSON compatibility. The only thing it doesn't solve is {foo:bar} ambiguity.

Thoughts?

Cheers, Ingy

> One way to achieve this would be to say that a ":" is never allowed in
> keys, and that it is only allowed in values if is "looks like a
> content character" by looking before and above it. The problem is
> thare are always strange edge cases, for example its hard to allow
> 10:20 as a time, but interpret a10:20 as a key/value pair.
>
> So it is still TBD. Input is welcome.
>
> Have fun,
>
> Oren Ben-Kiki |
From: Oren Ben-K. <or...@be...> - 2006-04-13 16:22:55
|
On 4/13/06, Ingy dot Net <in...@tt...> wrote:
> I tend to disagree with the spirit of these goals. Let's get a little
> more specific.

Never mind the spirit... You seem to agree with the letter :-)

Let's agree to disagree about the importance of the flow style (other than JSON). I use it a lot, and I believe that when people write config files etc. in YAML, they'll quickly discover that embedding snippets of flow in the file does wonders for readability. YAML was never about emitters, it was about readability. Otherwise it would have been JSON :-)

> 5a) Forbid a colon in a plain key in a flow *mapping*.

Today we have three sets of restrictions on plain scalars - outside flow, inside flow, and keys. What you suggest is that we bump it up to four - flow-key, flow-value, block-key, block-value.

The least restrictive rule we can specify without ambiguity is:

    block-value: Can't start with an indicator, can't contain ": ", can't contain " #".
    flow-value:  As above, but also can't contain [ { , } ]
    block-key:   Same as block-value, but one-line and limited to 1K chars
    flow-key:    Same as flow-value, but can't contain any ":", one-line and limited to 1K chars.

However, today block-key is restricted as much as flow-value. This allows us to make the rules be a simple progression of restrictions, which I like better. Call this #5b:

    block-value: Can't start with an indicator, can't contain ": ", can't contain " #".
    flow-value:  As above, but also can't contain [ { , } ]
    block-key:   As above, but one-line and limited to 1K chars
    flow-key:    As above, but can't contain any ":".

Either one allows all the current ":" usage in values (in both flow and block context). It would seem that (almost?) all affected usage would become an error rather than change semantics. Good.

> 6) You can omit the space after the colon in a key/value flow collection
> separator, IF AND ONLY IF the key is quoted.

Yes, that was my #1 proposal. Combined with 5a/5b, it seems to solve everything:

> - Allow the use of date/time values in flow collections (10:20).

As values - check.

> - Allow people to type foo:bar and foo:2 as key/value pairs

Becomes an error ({a:b} -> {"a:b":""}, there's ":" in the key). Good enough. Check.

> - Allow Perl::Module as a key, at least in block mappings, because a
> ton of CPAN .yml files use that.

Check.

> - Allow unquoted URLs, especially "http://whatever"

As values - check.

Unless I missed something, you cracked this one. Good job! So, do we agree on the combination of #5b and #1/#6?

Have fun,

Oren Ben-Kiki |
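The #5b progression in this message reads naturally as a cumulative checklist, which the following Python sketch spells out. It is an illustration only: `violations`, the indicator set and the reading of "1K chars" as 1024 are assumptions, and the message's wording is taken literally (the actual spec is more lenient about a leading `-`, `?` or `:`).

```python
# Contexts from least to most restrictive, as listed under #5b.
CONTEXTS = ["block-value", "flow-value", "block-key", "flow-key"]

# Assumption: simplified indicator set; the real spec has exceptions.
INDICATORS = set("-?:,[]{}#&*!|>'\"%@`")
FLOW_INDICATORS = set("[]{},")
MAX_KEY_LENGTH = 1024  # assumption: "1K chars" read as 1024


def violations(text, context):
    """List the #5b restrictions a plain scalar breaks in the given context."""
    level = CONTEXTS.index(context)
    found = []
    # block-value rules apply in every context.
    if text and text[0] in INDICATORS:
        found.append("starts with an indicator")
    if ": " in text:
        found.append('contains ": "')
    if " #" in text:
        found.append('contains " #"')
    # Each later context adds one more restriction.
    if level >= 1 and FLOW_INDICATORS.intersection(text):
        found.append("contains a flow indicator")
    if level >= 2 and ("\n" in text or len(text) > MAX_KEY_LENGTH):
        found.append("is multi-line or longer than 1K")
    if level >= 3 and ":" in text:
        found.append('contains ":"')
    return found
```

Under these assumptions `10:20` passes as a flow value but is rejected as a flow key, while `Perl::Module` passes as a block key, matching the "check" list in the message.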
From: Kirill S. <xi...@ga...> - 2006-04-13 16:36:11
|
> block-value: Can't start with an indicator, can't contain ": ", can't
> contain " #".
> flow-value: As above, but also can't contain [ { , } ]
> block-key: As above, but one-line and limited to 1K chars
> flow-key: As above, but can't contain any ":".

I believe any rules like

    a key cannot contain any ":"

are ambiguous. After all, keys are defined as something that is followed by ":", so saying that ":" is forbidden in keys is a tautology.

For instance, how will you parse

    [1:2]

It can be correctly parsed as

    [ { !!int "1": !!int "2" } ]   # neither key nor value contain ":"

and

    [ !!int "1:2" ]                # it's a value, so it can contain ":"

Both interpretations are valid according to the above rules. |
From: Oren Ben-K. <or...@be...> - 2006-04-13 18:59:48
|
On 4/13/06, Kirill Simonov <xi...@ga...> wrote:
>
> I believe any rules like
>     a key cannot contain any ":"
> are ambiguous. After all, keys are defined as something that is followed
> by ":", so saying that ":" is forbidden in keys is tautology.

Not so. Plain keys are defined as something followed by a ": ". We'll only allow the space after the colon to be omitted when the key is quoted - like this: {"1":2} - for JSON compatibility.

> For instance, how will you parse
>     [1:2]
>
> It can be correctly parsed as
>     [ { !!int "1": !!int "2" } ]   # neither key nor value contain ":"
> and
>     [ !!int "1:2" ]                # it's a value, so it can contain ":"
>
> Both interpretations are valid according to above rules.

Only the 2nd interpretation is valid.

Have fun,

Oren Ben-Kiki |
From: Clark C. E. <cc...@cl...> - 2006-04-13 23:20:27
|
Although JSON compatibility is nice, most JSON parsers accept the following content,

    {"one":"1",two:"2","three":3,four:4}

and hence, making sure we don't have any hiccups in these four cases is important. In particular, we have two cases, 10:20 and a:scalar, which although they do not conflict with the above uses, could be confusing if they have a different meaning than what is expected.

On Thu, Apr 13, 2006 at 12:57:41AM -0700, Ingy dot Net wrote:
| On 12/04/06 18:01 -0100, Oren Ben-Kiki wrote:
| > - Allow the use of date/time values in flow collections (10:20).

I'm OK with still allowing 10:20 as a valid unquoted scalar (as an in-flow sequence value or mapping key). But it isn't essential.

| > - Allow people to type foo:bar and foo:2 as key/value pairs (as a side
| > effect, this solves JSON compatibility, but that's a fringe benefit; we
| > can do JSON compatibility in other, simpler ways).

This is a nice goal, although I'm happy with foo:bar becoming illegal due to its historical difference.

| > - Allow Perl::Module as a key, at least in block mappings, because a
| > ton of CPAN .yml files use that.
| > - Allow unquoted URLs, especially "http://whatever"

Ok; but really neither of the above must necessarily be unquoted within a flow collection.

| 1) Valid JSON streams (or at least common JSON streams) are valid YAML.

Ok. See above though: {"one":"1",two:"2","three":3,four:4}

| 2) In *flow* collections only, we need to allow a key/value pair to be
| separated by a colon without a following space.

It would be nice, due to the above cases; however, I don't care about key:value as much, but Oren pointed out that implicit typing of times in this context would be icky: [! "10:20" ]

ISO dates such as 2005-12-06T12:32:45 are another case.

| This is actually being generous. YAML supports key/value pairs in both
| flow mapping and sequences. JSON only has them in mappings.

Well, this isn't true due to map-in-seq format.

| 3) We don't need to change *block* collections at all.

Agreed.

...

| 5) Forbid a colon in a plain key in a flow collection.
|
| This is simple as far as rules go. It means that using times and urls
| as keys (in flow collections ONLY) requires quoting them, and that
| foo:bar is just an error. An error that keeps you from shooting
| yourself in the foot.

I like this option.

| It also prevents:
| [10:20, 10:30, 10:45]

Yep.

| 6) You can omit the space after the colon in a key/value flow collection
| separator, IF AND ONLY IF the key is quoted.

The problem is, there is a lot of Javascript out there that passes as JSON... i.e., most JSON parsers accept the mapping above.

| The only thing it doesn't solve is {foo:bar} ambiguity.

Yea. Make it illegal.

On Thu, Apr 13, 2006 at 09:22:48AM -0700, Oren Ben-Kiki wrote:
| > 5a) Forbid a colon in a plain key in a flow *mapping*.
|
| Today we have three sets of restrictions on plain scalars - outside
| flow, inside flow, and keys. What you suggest is that we bump it up to
| four - flow-key, flow-value, block-key, block-value.

Can't we make it simpler.... not worse? I'd rather give up some usability to make it more regular; although this conflicts with Brian's good observation that we don't require flow and block to use the same productions.

| block-value: Can't start with an indicator, can't contain ": ",
| can't contain " #".
|
| flow-value: As above, but also can't contain [ { , } ]
|
| block-key: As above, but one-line and limited to 1K chars
|
| flow-key: As above, but can't contain any ":".

Under these proposed rules:

- {key:value} is illegal
- [key: value] means [{"key": "value"}]

Good.

- [key:value,10:20] means ["key:value","10:20"]

I think this is problematic: splitting hairs.

To be compatible with JSON we need:

- {"key":3}
- {"key":"value"}
- {"key":true}

To be compatible with common Javascript usage that claims to be JSON (and just about every other JSON parser handles):

- {key:3}
- {key:true}
- {key:"value"}
- {"key":3}

I'm still undecided.

Clark |
From: Clark C. E. <cc...@cl...> - 2006-04-13 21:36:47
|
I'd like to point out two relevant complications that we introduced (perhaps incorrectly, but it is a bit late) that make things a bit more complicated.

Mapping-In-Sequence aka Ordered Mapping

We wanted a mechanism to provide an "ordered mapping", and hence we have a syntax which permits map key/value pairs within a sequence, such as:

    [ "food": "tuna", "food": "tofu" ]

This creates a sequence of two mappings, each with one key/value pair.

Key-Only-Mapping aka Set Notation

We wanted it to be easy to omit key values to provide a nice syntax for an unordered set of values.

    { "tuna", "chicken", "tofu" }

I'll address the current suggestions in another thread, but this in particular makes the statement about an "in-flow mapping" vs an "in-flow sequence" nonsensical. The only distinction we can reasonably make here is separating mapping *values* from mapping keys or sequence values.

Best,

Clark |
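For readers less familiar with these two shorthands, here is roughly what each would look like once loaded, written out by hand as Python literals. Nothing here is produced by a YAML library; the variable names are made up, and representing a missing value as `None` follows the Syck output quoted earlier in the thread.

```python
# [ "food": "tuna", "food": "tofu" ]
# A sequence of single-pair mappings, so order and repeated keys survive.
ordered_mapping = [{"food": "tuna"}, {"food": "tofu"}]

# { "tuna", "chicken", "tofu" }
# A mapping whose keys carry no values, i.e. a set.
key_only_mapping = {"tuna": None, "chicken": None, "tofu": None}

# Both shapes put key/value syntax where the other collection kind is
# expected, which is why "flow mapping" vs "flow sequence" is a weak
# place to hang a colon rule.
repeated_keys = [key for pair in ordered_mapping for key in pair]
```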
From: Clark C. E. <cc...@cl...> - 2006-04-14 00:04:34
|
On Thu, Apr 13, 2006 at 07:35:59PM +0300, Kirill Simonov wrote:
| I believe any rules like
| a key cannot contain any ":"
| are ambiguous. After all, keys are defined as something that is
| followed by ":", so saying that ":" is forbidden in keys is tautology.

lol, you're new here!

| For instance, how will you parse
| [1:2]

I'm actually happy with just making this illegal; I share the same level of discomfort. But there is a compelling reason, a list of times:

    [ 10:30, 11:00, 12:30 ]

If this is illegal, to invoke implicitly typed times, you would have to use:

    [ ! "10:30", ! "11:00", ! "12:30" ]

| It can be correctly parsed as
| [ { !!int "1": !!int "2" } ] # neither key nor value contain ":"
| and
| [ !!int "1:2" ] # it's a value, so it can contain ":"

I don't like this exception. Would it help to have an alternative definition, one that only permits a ":" in a plain-scalar value if it is preceded by a "non-word"?

    - [ 10:20, 2007-01-01T22:00:00]   # ok, implicit typed scalars

For example, the following items are treated such that the : separates the key and the value, since the key matches the ``^\w+`` regex.

    - [ a10:20, bingles:hi ]

Then, we can either decide to allow these to be legal key/value pairs or a syntax error. Assuming legality:

    [ 10:20 ]               => [ ! "10:20" ]
    [ 2007-01-01T22:00:00 ] => [ ! "2007-01-01T22:00:00" ]
    { 10:20 }               => { ! "10:20": ! "" }
    { 2007-01-01T22:00:00 } => { ! "2007-01-01T22:00:00": ! "" }
    [ a10:20 ]              => [ { ! "a10": ! "20" } ]
    [ key:val ]             => [ { ! "key": ! "val" } ]
    { a10:20 }              => { ! "a10": ! "20" }
    { key:val }             => { ! "key": ! "val" }

This has the advantage of parsing Javascript (which is not-quite-JSON) properly in all cases, but still permitting times and dates to be unquoted and implicitly typed. Instead of having rules that differ by the use of [] or {}, they are the same... albeit a bit more complicated. This is more Perl like, DWIM, and has an advantage of being compatible with *both* Syck and current PyYaml in almost every real-life case.
Best, Clark |
From: Oren Ben-K. <or...@be...> - 2006-04-14 01:02:23
|
On Friday 14 April 2006 03:03, Clark C. Evans wrote: > Would it help to have an alternative > definition, one that only permits a ":" in a plain-scalar value if it > is preceded by a "non-word"? It would be perfect... if it weren't for the question of what is a "non-word", exactly. Must a word start with a letter? Can it contain '@'? etc. > This is more Perl like, DWIM, The problem is, who is the "I" in the DWIM? We need "Do What the *User* Means", not "Do What *We* Mean" :-) If we can find a "non-word" rule that would be immediately accepted by 99% of the users, fine, but I seriously doubt that. On the other hand, the rule that key/value pairs in YAML are always separated by ": " is much simpler (and enhances readability, which was our number one concern). We basically tell the user "we'll consider anything a single value unless we absolutely must accept it as a key". Under this view, the current rules are fine. Allowing {"key":value} (proposal #1/#6) is still in the spirit of this rule; there's no other way to read it. This leaves {key:value}, which is currently confusing. Making it an error (proposal #5a/#5b) eliminates this confusion. This only leaves [key:value] as a possible cause for confusion, but I expect that ordered mappings would be much more rare than normal mappings; by the time a user would get to these, he should be in the habit of using ": " for key/value pairs anyway. Either way, IMO it is worth it to allow ISO date and time values, and as a fringe benefit even unquoted URL values. Have fun, Oren Ben-Kiki |
From: Clark C. E. <cc...@cl...> - 2006-04-14 03:06:22
|
On Fri, Apr 14, 2006 at 03:55:35AM +0300, Oren Ben-Kiki wrote:
| On Friday 14 April 2006 03:03, Clark C. Evans wrote:
| > Would it help to have an alternative
| > definition, one that only permits a ":" in a plain-scalar value if it
| > is preceded by a "non-word"?
|
| It would be perfect... if it weren't for the question of what is a
| "non-word", exactly. Must a word start with a letter?

What I mean by a name is anything starting with a letter, an underscore, or (for ECMA-262 compatibility) the dollar sign. The remaining part of the name would permit "." and "-" and any other characters permitted by YAML's productions, including the space and @ sign.

I think this definition of word is "intuitive enough" for most even casual programmers to immediately grok. There is a big difference between strings like 10:20, 2004-01-01T23:20 and the like, vs. items such as age:23, $var:foo, delivered:true

| This leaves {key:value}, which is currently confusing. Making it an
| error (proposal #5a/#5b) eliminates this confusion. This only leaves
| [key:value] as a possible cause for confusion

I don't find this option compelling; it seems forced. While the item I'm suggesting is a wallop of a production and not necessarily easy to implement, it is intuitive in its result. I'm proposing what is already well-established: identifiers start with a letter.

This proposal also has an immediate advantage over the others: just about every "javascript notation" object structure is valid YAML, strict JSON or not. I've got lab technicians making "JSON" data, and it does not always perfectly follow the JSON specification; often it uses anything that happens to ``eval`` in a browser.

Best,

Clark |
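The "name" rule described in this message can be approximated with one regular expression. The sketch below is an assumption-laden illustration, not a proposed production: `split_on_name` is a hypothetical helper, "letter" is read as an ASCII letter, and the rest of the name is taken to be anything up to the first colon.

```python
import re

# Assumption: a name starts with an ASCII letter, "_" or "$" and runs
# up to the first ":"; everything after that colon is the value.
NAME_PAIR = re.compile(r"^([A-Za-z_$][^:]*):(.*)$")


def split_on_name(entry):
    """Return (key, value) if the entry starts with a name, else None."""
    match = NAME_PAIR.match(entry.strip())
    if match is None:
        return None  # e.g. a time or ISO date: stays a single scalar
    return match.group(1), match.group(2).strip()
```

Times and dates such as `10:20` or `2004-01-01T23:20` come back as `None`, while `age:23`, `$var:foo` and `delivered:true` split into pairs. Note that this naive version also splits `http://www.yaml.org` after `http`, so a real rule would need some carve-out for URLs; the message itself does not say how that case would be handled.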
From: Oren Ben-K. <or...@be...> - 2006-04-14 04:41:34
|
Clark and I had yet another IRC session, kicking around ideas. It doesn't seem like there's any solution that doesn't involve some pain. At this point, Brian's proposal (as formalized by me) seems to be the least painful one.

Again, the idea is:

- The space after the colon can be omitted for quoted keys. This is a relaxation so there's no backward compatibility issue.

- There are four sets of restrictions on plain scalars, depending on context (instead of the original three sets of restrictions). These are:

    block-value:  # Same as today
      Can't start with an indicator, can't contain ": ", can't contain " #".
    flow-value:   # Same as today
      As above, but also can't contain [ { , } ]
    block-key:    # Same as today
      As above, but one-line and limited to 1K chars
    flow-key:     # More restricted than today
      As above, but can't contain any ":".

This is a minor additional restriction to the current rules. Backward compatibility concerns are minimal since the additional restriction only applies to flow collections, which "nobody" uses.

The good:

    - block collections: no changes whatsoever   # Maximal compatibility
    - {"abc":123}                     # JSON - Works as expected.
    - {key:value}                     # Error. Bad style anyway - error is good.
    - [10:20,http://www.yaml.org]     # Time and URL entries in flow seq

The bad:

    { 10:20 }   # Error, not set with one time.

This may be seen as overly restrictive by some people. But sets are a rare use case, the input triggers an error, and there's a simple workaround: {? 10:20}.

The ugly:

    [ abc:def, http://www.yaml.org ]   # 1st entry is "abc:def", not error.

Here there is the possibility of misinterpretation. However, ordered maps are a rare use case, there is a trivial workaround [ abc: def ], and the behavior is consistent with the rest of the system - _always_ place a space after a ":" for a plain key.
It is the last problem that is the worst, but it seems that it is less bad than the alternatives we considered (breaking backward compatibility, giving up on ISO dates as unquoted values, etc.). Clark has posted a proposal that the parser _may_ emit a warning on seeing values that might be a key/value pair lacking a space, unless they look like one of the common cases (URLs, dates, etc.). I agree that the spec should present the above issue, and that such a recommendation in the spec makes sense. It would mitigate a lot of the potential for misinterpretation. I would expect the parser to allow the user to turn this warning off, however ;-)

At any rate, while none of us is exactly happy with this, it seems we really have little choice in the matter. So unless someone has a better notion... (anyone?), we'll go with this.

Have fun,

Oren Ben-Kiki
|
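The four sets of restrictions listed in this message can be sketched as a single Python check. This is a rough illustration under stated assumptions, not a YAML parser: the function name `plain_scalar_allowed` is hypothetical, each "as above" is read as adding to the previous context's rules, the indicator set is a simplified guess, and "1K chars" is taken to mean 1024 characters.

```python
# Simplified indicator set; this list is an assumption of the sketch.
INDICATORS = set(",[]{}#&*!|>'\"%@`")
FLOW_INDICATORS = set("[]{},")
CONTEXTS = ("block-value", "flow-value", "block-key", "flow-key")


def plain_scalar_allowed(text: str, context: str) -> bool:
    """Check a plain scalar against the proposed per-context restrictions."""
    if context not in CONTEXTS:
        raise ValueError(f"unknown context: {context}")

    # block-value rules, shared by every context.
    if not text or text[0] in INDICATORS:
        return False
    # Assumption: "-", "?" and ":" only act as indicators at the start
    # when they stand alone or are followed by a space.
    if text[0] in "-?:" and (len(text) == 1 or text[1] == " "):
        return False
    if ": " in text or " #" in text:
        return False
    if context == "block-value":
        return True

    # flow-value adds: no flow indicators anywhere in the scalar.
    if any(ch in FLOW_INDICATORS for ch in text):
        return False
    if context == "flow-value":
        return True

    # block-key adds: a single line of at most 1K characters.
    if "\n" in text or len(text) > 1024:
        return False
    if context == "block-key":
        return True

    # flow-key adds: no ":" at all.
    return ":" not in text


if __name__ == "__main__":
    print(plain_scalar_allowed("12:34", "block-key"))
    print(plain_scalar_allowed("12:34", "flow-key"))
    print(plain_scalar_allowed("http://www.yaml.org", "flow-value"))
```

The only new behavior relative to the other three contexts is the final line: a key such as 12:34 is accepted in block context and rejected inside {}.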
From: Clark C. E. <cc...@cl...> - 2006-04-14 04:24:42
|
| This leaves {key:value}, which is currently confusing. Making it an
| error (proposal #5a/#5b) eliminates this confusion. This only leaves
| [key:value] as a possible cause for confusion, but I expect that
| ordered mappings would be much more rare than normal mappings; by the
| time a user would get to these, he should be in the habit of using ": "
| for key/value pairs anyway. Either way, IMO it is worth it to allow ISO
| date and time values, and as a fringe benefit even unquoted URL values.

Ok. My primary problem with {key:value} and [key:value] is that the results *may* be unexpected due to precedent with Javascript, C, and other languages where a space isn't required. But, perhaps my issue here can be solved by a *suggestion* in the specification:

  It is recommended that all plain scalar values (especially those containing a colon) be checked against the YAML Type Registry via regular expression checks. If a match is not found, then a warning should be issued. In particular, well known URI schemes and ISO date/time formats should not result in warnings; where values such as "foo:true" or "foo:32" should.

  As a policy, the YAML type library will not contain types registered with things that could be easily confused with key/value pairs.

What this really is about is the use of implicitly typed scalar values that don't have a pre-registered REGEX. It is about a FAQ, warnings, and proper documentation as well as smart use of the type library.

This leaves the other two cases for JSON (and Javascript) that are currently illegal to be made legal:

  { "key":"value", key:"value" }

Advantages:

- minor change to the spec, no breaking backwards compatibility
- consistency between flow and block forms, as much as possible
- proper respect for precedent to avoid "confusion" with new users

Best,

Clark
|
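The suggested warning could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the function name `should_warn`, the short list of URI schemes, and the two date/time patterns are hypothetical stand-ins, and the sketch does not consult the YAML Type Registry.

```python
import re

# Hypothetical patterns for colon-containing values that should pass
# silently; a real parser would choose its own list.
KNOWN_COLON_FORMS = [
    # A few well-known URI schemes (assumed list).
    re.compile(r"^(?:https?|ftp|mailto|file|urn):\S+$"),
    # Times such as 10:20 or 10:20:30.
    re.compile(r"^\d{1,2}:\d{2}(?::\d{2})?$"),
    # ISO-style timestamps such as 2004-01-01T23:20.
    re.compile(
        r"^\d{4}-\d{2}-\d{2}[Tt ]\d{1,2}:\d{2}"
        r"(?::\d{2}(?:\.\d+)?)?(?:Z|[+-]\d{2}(?::\d{2})?)?$"
    ),
]


def should_warn(scalar: str) -> bool:
    """Return True for a plain scalar that contains ':' but matches no known form."""
    if ":" not in scalar:
        return False
    return not any(pattern.match(scalar) for pattern in KNOWN_COLON_FORMS)


if __name__ == "__main__":
    for value in ["foo:true", "foo:32", "http://www.yaml.org", "10:20"]:
        print(value, should_warn(value))
```

Keeping the patterns in a plain list makes it easy for a caller to extend or empty it, which is one way to let the user turn the warnings off.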
From: Oren Ben-K. <or...@be...> - 2006-04-14 05:11:21
|
On Friday 14 April 2006 07:23, Clark C. Evans wrote:
> Ok. My primary problem with {key:value} and [key:value] is that the
> results *may* be unexpected due to precedent with Javascript, C, and
> other languages where a space isn't required. But, perhaps my issue
> here can be solved by a *suggestion* in the specification:

To clarify, the idea is:

- We make the following changes to the spec to support JSON data:
  - Allow tab characters in separation spaces.
  - If a key is quoted, we don't require a space after the colon.
  - Add "\/" as a valid escape sequence (JSON uses it for some reason).

  This makes {"aa":123} valid, as well as {"aa":<TAB>123}, etc.

- We make *no* other changes to the spec, other than to add a recommendation along the following lines:

>   It is recommended that all plain scalar values (especially those
>   containing a colon) be checked against the YAML Type Registry via
>   regular expression checks.  If a match is not found, then a warning
>   should be issued.  In particular, well known URI schemes and ISO
>   date/time formats should not result in warnings; where values such
>   as "foo:true" or "foo:32" should.

Since the "string" regexp matches everything, the wording needs a bit of work. The user should have an option to turn off these warnings, of course. The wording needs a bit of work, which we'll do when we get it into the spec.

>   As a policy, the YAML type library will not contain types registered
>   with things that could be easily confused with key/value pairs.

URLs and ISO dates are bad enough :-)

This seems to address everyone's concerns without any downsides. Brian? If this is OK with you, we have a winner.

Have fun,

Oren Ben-Kiki
|
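The three JSON features behind these changes can be observed with Python's standard `json` module. This small sketch only shows what JSON emitters and parsers produce and accept; it says nothing about the behavior of any YAML parser, and the variable names are chosen just for the illustration.

```python
import json

# A compact emitter leaves no space after the colon of a quoted key.
compact = json.dumps({"aa": 123}, separators=(",", ":"))

# JSON treats a tab as insignificant whitespace between tokens.
tabbed = json.loads('{"aa":\t123}')

# JSON permits "\/" as an escaped form of "/".
escaped = json.loads(r'"http:\/\/yaml.org"')

if __name__ == "__main__":
    print(compact)
    print(tabbed)
    print(escaped)
```

Each of the three values corresponds to one bullet in the list of spec changes: the quoted key without a space, the tab as separation space, and the "\/" escape.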
From: <in...@tt...> - 2006-04-14 08:08:27
|
On 14/04/06 08:11 +0300, Oren Ben-Kiki wrote:
> On Friday 14 April 2006 07:23, Clark C. Evans wrote:
> > Ok. My primary problem with {key:value} and [key:value] is that the
> > results *may* be unexpected due to precedent with Javascript, C, and
> > other languages where a space isn't required. But, perhaps my issue
> > here can be solved by a *suggestion* in the specification:
>
> To clarify, the idea is:
>
> - We make the following changes to the spec to support JSON data:
>   - Allow tab characters in separation spaces.

In both block and flow collections?

>   - If a key is quoted, we don't require a space after the colon.

In both block and flow collections?

>   - Add "\/" as a valid escape sequence (JSON uses it for some reason).
>
> This makes {"aa":123} valid, as well as {"aa":<TAB>123}, etc.
>
> - We make *no* other changes to the spec, other than to add a
>   recommendation along the following lines:
>
> >   It is recommended that all plain scalar values (especially those
> >   containing a colon) be checked against the YAML Type Registry via
> >   regular expression checks.  If a match is not found, then a warning
> >   should be issued.  In particular, well known URI schemes and ISO
> >   date/time formats should not result in warnings; where values such
> >   as "foo:true" or "foo:32" should.
>
> Since the "string" regexp matches everything, the wording needs a bit of
> work. The user should have an option to turn off these warnings, of
> course. The wording needs a bit of work, which we'll do when we get it
> into the spec.
>
> >   As a policy, the YAML type library will not contain types registered
> >   with things that could be easily confused with key/value pairs.
>
> URLs and ISO dates are bad enough :-)
>
> This seems to address everyone's concerns without any downsides. Brian?
> If this is OK with you, we have a winner.
>
> Have fun,
>
> Oren Ben-Kiki
|
From: Ingy d. N. <in...@tt...> - 2006-04-14 09:35:29
|
Clark and I talked about the colon recommendation in irc:

01:17 < cce> what do you think about the warning idea?
01:17 < ingy> well, I'm not against it
01:18 < cce> yea; it isn't exactly ideal
01:18 < ingy> I don't like dynamic lookups in the parser
01:18 * cce nods.
01:18 < ingy> but a few well placed static checks is ok
01:18 < cce> so, just check for well-known URIs and date/time formats?
01:18 < cce> warn otherwise?
01:19 < ingy> yes
01:19 < cce> and then cause a warning if an implicitly typed node doesn't match
01:19 < cce> this makes the parser independent of the type library
01:19 < ingy> and we can add new types to the recommendation in a later spec
01:19 < ingy> right
01:20 < cce> ok, so if "wombles:furry" starts to make sense, we can fix it in later specs
01:20 < ingy> right
01:20 < cce> works for me

Also discussed the block vs flow uncertainty in Oren's final proposal.

01:16 < cce> well, separation spaces in at least flow, but it could be block, doesn't matter much
01:17 < cce> same with quoted keys; we need it in flow for JSON, but it could apply to block as well; good questions
01:17 < ingy> let's see what oren says

And most importantly:

01:17 < ingy> honestly I don't really care (whether oren wants this for block as well as flow)
01:17 < ingy> I think we have a winner
01:17 * cce nod nods.

Cheers, Ingy

PS You guys need to stop calling me Brian. That is no longer my name.

PS Funny:

01:11 < ingy> hola
01:12 < cce> ingy: one sec, responding to one of your emails
01:12 < ingy> cce: fine, but oren is the boss
01:12 < cce> lol
01:12 < cce> no doubt ;)
01:12 < ingy> you *did* know that, I presume ;)
|
From: Oren Ben-K. <or...@be...> - 2006-04-14 15:12:51
|
On Friday 14 April 2006 11:08, Ingy dot Net wrote:
> > - We make the following changes to the spec to support JSON data:
> >   - Allow tab characters in separation spaces.
>
> In both block and flow collections?
>
> >   - If a key is quoted, we don't require a space after the colon.
>
> In both block and flow collections?

Since both these are relaxations, I don't see why not. It keeps things consistent and easier to spec.

Have fun,

Oren Ben-Kiki
|
From: Kirill S. <xi...@ga...> - 2006-04-14 06:17:34
|
Hi Clark,

> If this is illegal, to invoke implicitly typed times, you would
> have to use: [ ! "10:30", ! "11:00", ! "12:30" ]

According to the spec, the "!" does not invoke implicitly typed scalars. On the contrary:

  It is also possible for the tag property to explicitly specify the node has the “!” non-specific tag. This is only useful for plain scalars, causing them to be resolved as if they were non-plain (hence, by the common tag resolution convention, as “tag:yaml.org,2002:str”).

--
xi
|
From: Clark C. E. <cc...@cl...> - 2006-04-14 07:48:40
|
(off list)

Thanks. I'm actually a bit surprised to see this; I actually don't remember this change or why. In YAML 1.0, the ! marker is used to mark a quoted node as "plain" so it can be implicitly typed. I think I'll let Oren explain this one before I comment (it's his writing I think, er, hope, or I'm starting to lose it in my old age).

If you're on-line, mind going over to IRC for a quick chat? I'll try to keep my window open for the next 2h or so.

Best,

Clark

On Fri, Apr 14, 2006 at 09:17:20AM +0300, Kirill Simonov wrote:
| Hi Clark,
|
| > If this is illegal, to invoke implicitly typed times, you would
| > have to use: [ ! "10:30", ! "11:00", ! "12:30" ]
|
| According to the spec, the "!" does not invoke implicitly typed scalars.
| On the contrary:
|
|   It is also possible for the tag property to explicitly specify the node
|   has the “!” non-specific tag. This is only useful for plain scalars,
|   causing them to be resolved as if they were non-plain (hence, by the
|   common tag resolution convention, as “tag:yaml.org,2002:str”).
|
| --
| xi
|
From: Kirill S. <xi...@ga...> - 2006-04-14 09:44:26
|
On Friday 14 April 2006 10:47, Clark C. Evans wrote:
> Thanks. I'm actually a bit surprised to see this; I actually don't
> remember this change or why. In YAML 1.0, the ! marker is used to mark
> a quoted node as "plain" so it can be implicitly typed. I think I'll
> let Oren explain this one before I comment (it's his writing I think,
> er, hope, or I'm starting to lose it in my old age).

The old meaning seems to be much more useful.
|
From: <in...@tt...> - 2006-04-14 09:53:19
|
On 14/04/06 12:44 +0300, Kirill Simonov wrote:
> On Friday 14 April 2006 10:47, Clark C. Evans wrote:
> > Thanks. I'm actually a bit surprised to see this; I actually don't
> > remember this change or why. In YAML 1.0, the ! marker is used to mark
> > a quoted node as "plain" so it can be implicitly typed. I think I'll
> > let Oren explain this one before I comment (it's his writing I think,
> > er, hope, or I'm starting to lose it in my old age).
>
> The old meaning seems to be much more useful.

From #yaml:

01:14 < ingy> cce: why is {! "10:20"} not actually correct?
01:14 < cce> no idea
01:14 < cce> it _did_ mean that "10:20" was to be implicitly typed
01:15 < cce> in Yaml 1.0
01:15 < cce> I don't exactly recall how or why it changed to get the language that xitology produced ;)
01:15 < cce> but xi is really the spec master these days, he and oren
01:15 < ingy> oh, you mean it got dropped from the spec
01:16 < ingy> I remember that it did, even though I don't recall why
01:16 < ingy> casualty of war I suppose

Oren, this seems to be some fallout from the heavy 1.1 refactoring we did. Do you remember why we dropped '!' to basically mean "has plain semantics"? Can we reinstitute it?

Cheers, Ingy
|