From: Oren Ben-K. <or...@ri...> - 2002-08-11 09:43:52
|
Hi guys,

I see you had a busy weekend. Sorry I missed it; I was on vacation with no Internet access.

I'm sorry for any confusion about accepting proposals; one reason I sent the list of changes is to ensure that everyone does in fact agree.

As for the latest proposal, if I understand correctly, it is:

> He is ok with the new type family (!) styles:
>   !!private
>   !yaml-specific
>   $domain,year/whatever
>   $language/whatever

I'm hoping the '$' is a typo and that you meant:

    !!private
    !yaml-specific
    !domain,year/whatever
    !language/whatever

Right?

> - >
>   I assume he wants to keep !/whatever reserved for now,
>   pending a discussion of the #DOMAIN proposal.

I gave up on !/str, then Brian said he kind of liked it and you said fine, let's go for it; it seems there was some confusion here. I'm OK with reserving it. I'm not certain yet about #DOMAIN as opposed to simply using !...^... in the type family of the top-level node. I'm certain that #DOMAIN does not replace ^ (it is too restricted).

> - >
>   He is ok with just adding '/' to the string regular
>   expression, keeping other characters reserved. This
>   is primarily justified by the ypath use case and not by
>   a unix path use case, although unix paths would be
>   useable unquoted.

I can see that './' is pretty rare. As for '\', it is a little-publicized fact that '/' works fine in Windows, even though it isn't the norm. This only leaves things like '../', which again should be pretty rare in a configuration file. I'll go with just '/'.

> - >
>   Since this breaks the //comment special key, he suggested
>   that perhaps the # could be used since it is not immediately
>   followed by a space. This works for me:
>   ---
>   #: one comment special key
>   #more: Another comment special key

I'm uncomfortable with this, because a single character change, '# :' vs. '#:', subtly changes the semantics of a file. In the other cases where a space indicator is used, omitting the space (or placing one where it shouldn't be) causes an error, as in:

    a: b: c # Error
    a: { b, c: d } # Error

Saying that both of the following are comments:

    #: A comment
    # : Also a comment

but that one is throwaway and one isn't is too confusing IMVHO. I'd stick with ';' if that is at all acceptable to you. The ';' is a comment marker in configuration files and other contexts; it isn't that alien to people.

> - >
>   He's ok with keeping things reserved given the two
>   reasons below (flexibility and simplicity). He's not
>   in favor of adding any more implicit types.

I'm well aware of that and I agree with most of his reasoning, which is why we changed the string regexp as we did. I'd still like to keep the door open, though.

> - >
>   Brian brought up the topic of how URIs are handled,
>   does a parser report the tag:uri or not.
>
>   I answered no, it returns exactly what is in the YAML file
>   as these strings themselves should be unique. One restriction
>   is needed, so that yaml.org,2002 is not used for domain,year
>   which is easy since we control yaml.org ;)

Right.

> This leaves Brian's big question:
>
> - Do we need special keys, and if so, how can we clarify
>   the specification so that they are used properly.

I think it is important to have a universal, standard way for a YAML system to store any YAML data, including data that uses unknown type families. This requires the '=' and '!' special keys. I also think it is important to be able to represent the tree model as YAML data, which means using the '*' and '&' special keys. This has potential for YTL, for example, and for using YPATH to address *all* the parts of a YAML document.

As for ';', I agree with Brian that it is tricky to implement something like:

    p1: !java/my.point.class
      x: 7
      y: 8
    p2: !java/my.point.class
      x: 8
      y: 7
      ;bloop: xyzzy

'p2' just can't be loaded into the specified native data type because it has the associated 'note' that doesn't fit into the class. This isn't a problem in Perl because there adding keys to a map is easy; but it is a problem in Java/C++/etc. (I don't know about Python).

In order to analyze this you must keep in mind the distinction between a schema-specific application that, presumably, loads the data into the appropriate native data structures; and a schema-independent YAML tool that, presumably, doesn't. The YAML tool would have to use something like the special keys to encode YAML data in simple hash tables and lists.

It is explicitly allowed in the spec for the application to apply a filter (before the loader) that strips away "notes". That is considered "processing the data" and does not qualify as a "round trip", but after all, applications are all about processing the data *anyway*.

A generic YAML tool that does "round-trip" has no problem supporting the note special key. For it, it is just another key in the map.

Put another way: notes are *expected* to be ignored by the application; but they should round-trip through all the stages up to being delivered to it. For example, a YAML-RPC system with dispatchers and forwarders and caching and whatever, all working at the YAML level, would be expected to treat the "notes" as just more data, but the implementation of the RPC service that is finally triggered to handle a particular call is free to ignore such "notes".

And, of course, if the application is capable of passing such notes through, without modification and without it affecting anything, so much the better. For example, anywhere the application uses a simple "!map", it is hardly any effort to simply ignore all note keys. And many applications do use simple "!map" objects.

What is this good for? It allows you to, for example, "trace" the data as it flows through the system by adding such tags at various points, without worrying about the application choking on them and without having to invent a derived with-notes schema for each schema that moves through the system. I expect this would be a very useful debugging tactic for any generic YAML tool.

I also expect it would be useful in other contexts. Perhaps it isn't the main focus of a data serialization problem, but I want to accommodate the user that asks "and what if I want my comment *not* to be thrown away?" - for whatever reason.

Without a standard "notes" key, everyone would have to invent his own convention for it, and implement his own filter for stripping the notes (so the application won't choke on them). Using ';', we allow this to become a standard part of any YAML system.

Note that you are not *required* to support the ';' keys, or any of the special keys for that matter.

> Overall:
>
>
> I hope this accurately reflects things. I think it should
> be OK to move ahead with the spec changes, perhaps marking the
> comment special key as "subject to change" and changing it from
> // to # in the short-run.

Make it ';' :-)

> That said, I promised to review over the next week or so the
> special key mechanism, questioning if they are necessary and
> articulating the value they provide (or removing them).

I hope the above is a good start on that.

Have fun,

    Oren Ben-Kiki
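The "filter before the loader" idea in the message above can be sketched in a few lines. This is only an illustration, not anything from the spec: it assumes the document has already been parsed into plain Python dicts and lists, it assumes the ';' prefix from Oren's proposal as the note marker, and the names `NOTE_PREFIX` and `strip_notes` are made up for the example.

```python
NOTE_PREFIX = ";"  # assumed marker, taken from the ';' proposal in this thread


def strip_notes(node):
    """Return a copy of `node` with every note key removed, recursively.

    Only string keys that start with NOTE_PREFIX count as notes here;
    that rule is an assumption made for this sketch.
    """
    if isinstance(node, dict):
        return {
            key: strip_notes(value)
            for key, value in node.items()
            if not (isinstance(key, str) and key.startswith(NOTE_PREFIX))
        }
    if isinstance(node, list):
        return [strip_notes(item) for item in node]
    return node


# Hypothetical parsed form of the p1/p2 example from the message.
document = {
    "p1": {"x": 7, "y": 8},
    "p2": {"x": 8, "y": 7, ";bloop": "xyzzy"},
}

# What a schema-specific application would load after filtering.
filtered = strip_notes(document)
```

A generic round-tripping tool would skip the filter and pass `document` along unchanged, which matches the split between schema-specific applications and schema-independent tools described in the message.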
From: Oren Ben-K. <or...@ri...> - 2002-08-11 13:51:02
|
Clark C . Evans [mailto:cc...@cl...] wrote:

... it seems we're OK about most things ... and I agree that = (or (=)) is useful beyond "encoding"; the spec explicitly refers to the schema migration scenario.

> Brian didn't seem to like ; at all (he gave it a -4). I really don't
> like it either. That being said, perhaps (#comment) is a better choice.

And:

> I'm thinking that if we want to keep comment keys, and in fact
> add more of them in the future (for example, the "inheritance"
> feature) then perhaps we should do this only within parentheses:
>
>   (=): Using the equal special key
>   (#comment): A comment special key
>
> As much as a bare = is nicer, the above may give things the
> level of consistency that we need. At the very least I think
> it would make steve happy...

I suppose you mean doing *all* the special keys this way: (=), (!), (&), (*), and (#...).

There is a certain sense to it - we'd be consistently reserving just things in (...). Given you two won't stand for ';' and I won't stand for '#: ...', I guess (#...) is the only choice...

Question: we *don't* require the '(' to be balanced, right? As in:

    foo) : Is OK.
    (#bar() : So this must be.

Assuming Brian also agrees, I'll go along. Though I recall he wasn't enamored of (=), either... Brian?

Have fun,

    Oren Ben-Kiki
From: Clark C . E. <cc...@cl...> - 2002-08-11 16:43:02
|
On Sun, Aug 11, 2002 at 04:52:31PM +0300, Oren Ben-Kiki wrote:
| >
| >   (=): Using the equal special key
| >   (#comment): A comment special key
|
| I suppose you mean doing *all* the special keys this way: (=), (!), (&),
| (*), and (#...).

Yes, (of course !&* keys are never serialized).

| There is a certain sense to it - we'd be consistently reserving just things
| in (...). Given you two won't stand for ';' and I won't stand for '#: ...',
| I guess (#...) is the only choice...

It seems like a good candidate. It also keeps the rules for implicits kinda clean (the special keys are implicits).

| Question: we *don't* require the '(' to be balanced, right? As in:
|
|   foo) : Is OK.
|   (#bar() : So this must be.

Err, I don't think we should be condoning unbalanced parentheses, curly or square brackets. Alternatively we could use the caret...

    ^yes    !bool 1
    ^#hmm   !special.comment hmm
    ^=      !special.equal =

That's kinda clean. I suppose % $ @ or ` would also work.

    %yes    $yes    @yes    `yes    (yes)
    %#hmm   $#hmm   @#hmm   `#hmm   (#hmm)
    %=      $=      @=      `=      (=)

I'd like to keep @ reserved for inheritance, and since $ % ` have meanings in perl, perhaps ^ is the way to go. If we don't want to have to "balance" the parentheses we shouldn't use parentheses.

There is another reason not to use parentheses... within a ypath we want to use parentheses for operator grouping, thus one of the others would probably be better. That said, I like to use | for union and ^ for intersection; so one of the other fellas, $@%`, would be better. Hmm.

    ~       !null ~
    ~yes    !bool 1
    ~#hmm   !special.comment hmm
    ~=      !special.equal =

Perhaps ~ is the character we want; it has the nice feature that it fits in nicely with null.

Best,

Clark
From: Oren Ben-K. <or...@ri...> - 2002-08-11 16:52:56
|
Clark C . Evans [mailto:cc...@cl...] wrote:
> | Question: we *don't* require the '(' to be balanced, right? As in:
> |
> |   foo) : Is OK.
> |   (#bar() : So this must be.
>
> Err, I don't think we should be condoning unbalanced
> parentheses, curly or square brackets.

Well, that rather ruins this notion...

> Alternatively we could use the caret...
>
>   ^yes !bool 1

Ugh! I *like* (true). ^true or ~true are horrid.

> Perhaps ~ is the character we want, it has the nice feature in
> that it fits in nicely with null.

What may fit with null (if one looks at it from the right angle) is saying that ~<whatever> is a way to make <whatever> into a "note". Sort of "nullify" it, if you get my drift. So:

    point:
      x: 5
      y: 12
      ~bloop:

And keep all the other keys as they are (=, etc.).

The only down-side is that '~' itself is *not* a "note".

    hash:
      ~: app sees this
      ~note: But not this
      ~~: Or this.

Hmmm. Thoughts?

Have fun,

    Oren Ben-Kiki
From: Clark C . E. <cc...@cl...> - 2002-08-11 17:01:12
|
On Sun, Aug 11, 2002 at 07:54:24PM +0300, Oren Ben-Kiki wrote:
| Clark C . Evans [mailto:cc...@cl...] wrote:
| > | Question: we *don't* require the '(' to be balanced, right? As in:
| > |
| > |   foo) : Is OK.
| > |   (#bar() : So this must be.
| >
| > Err, I don't think we should be condoning unbalanced
| > parentheses, curly or square brackets.
|
| Well, that rather ruins this notion...

Why? I think (#comment) is just fine. The only use case that you have for this non-balanced mechanism is to comment out keys by just changing the RHS. In this case, just use '# ' and really comment them out. I don't see how this use case is related to the other use case, of adding annotations. Adding annotations works well with balanced parentheses.

| > Alternatively we could use the caret...
| >
| >   ^yes !bool 1
|
| Ugh! I *like* (true). ^true or ~true are horrid.

I like (true) better; I was just presenting alternatives where balancing isn't required. As for using ~ for the comment prefix and keeping everything else as is... yuck.

Best,

Clark
From: Oren Ben-K. <or...@ri...> - 2002-08-11 17:11:50
|
Clark C . Evans [mailto:cc...@cl...] wrote:
> | > |   foo) : Is OK.
> | > |   (#bar() : So this must be.
> | >
> | > Err, I don't think we should be condoning unbalanced
> | > parentheses, curly or square brackets.
> |
> | Well, that rather ruins this notion...
>
> Why? I think (#comment) is just fine. The only use case that
> you have for this non-balanced mechanism is to comment out keys
> by just changing the RHS.

Nope. Suppose I have this:

    Unbalanced key :-( : Yikes!

And I want to comment it:

    (#Unbalanced key: :-() : This must be OK!

See?

> In this case, just use '# ' and really
> comment them out.

I don't think we can say you can use (#...) to comment out stuff only if the stuff has balanced parentheses. That's too restrictive for my taste, and has no real reason.

BTW, what happens if someone writes:

    (#what's : this?

Presumably an error because it doesn't match any implicit type. Correct?

> I don't see how this use case is related to
> the other use case, of adding annotations. Adding annotations
> works well with balanced parentheses.

Not if the original value wasn't balanced.

> | > Alternatively we could use the caret...
> | >
> | >   ^yes !bool 1
> |
> | Ugh! I *like* (true). ^true or ~true are horrid.
>
> I like (true) better, I was just presenting alternatives
> where balancing isn't required. As for using ~ for the
> comment prefix and keeping everything else as is... yuck.

Oh well. I still think ';' is the most elegant way around this issue. But it seems we'll just have to live with the regexp \(#([^ \t].*)?\) - allowing for unbalanced regexps, and much worse things, to be "noted".

Have fun,

    Oren Ben-Kiki
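To make the regexp at the end of the message above concrete, here is a small Python sketch that applies it to the keys discussed in this thread. The pattern is copied from the message; matching it against the whole key with `fullmatch`, and the helper name `is_note_key`, are assumptions of this example rather than anything the spec draft states.

```python
import re

# Pattern quoted in the message: "(#" then, optionally, text that does not
# start with a space or tab, then a closing ")".
NOTE_KEY = re.compile(r"\(#([^ \t].*)?\)")


def is_note_key(key):
    """True if the entire key matches the proposed note-key pattern."""
    return NOTE_KEY.fullmatch(key) is not None


candidates = [
    "(#)",                     # bare comment key
    "(#comment)",              # ordinary annotation
    "(#Unbalanced key: :-()",  # unbalanced content is still accepted
    "(# comment)",             # space right after '#': rejected
    "(#what's",                # no closing ')': rejected
]
results = {key: is_note_key(key) for key in candidates}
```

Because `.*` is greedy and only the final `)` is anchored, the pattern makes no attempt to balance parentheses, which is exactly the property the two sides of this exchange disagree about.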
From: Clark C . E. <cc...@cl...> - 2002-08-11 18:01:41
|
On Sun, Aug 11, 2002 at 08:13:21PM +0300, Oren Ben-Kiki wrote:
| Nope. Suppose I have this:
|
|   Unbalanced key :-( : Yikes!
|
| And I want to comment it:
|
|   (#Unbalanced key: :-() : This must be OK!

    (#Unbalanced key :-() : Is an edge case, but it works just fine.

| BTW, what happens if someone writes:
|
|   (#what's : this?
|
| Presumably an error because it doesn't match any implicit type. Correct?

Yes, in the (#...) proposal.

| > I like (true) better, I was just presenting alternatives
| > where balancing isn't required. As for using ~ for the
| > comment prefix and keeping everything else as is... yuck.
|
| Oh well. I still think ';' is the most elegant way around this issue. But it
| seems we'll just have to live with the regexp \(#([^ \t].*)?\) - allowing
| for unbalanced regexps, and much worse things, to be "noted".

I'd like to give some regularity back to our implicit typing; having one other marker could help this out. The parentheses seem to be a nice win. I really don't like the semi-colon being used this way. It's quite ugly.

| I don't think we can say you can use (#...) to comment out stuff only if the
| stuff has balanced parenthesis. That's too restrictive for my taste, and has
| no real reason.

I think you are way overestimating the cases where someone would want to use a comment special key to begin with (I can see the annotation use case) and not just a regular throw-away comment. Further, I expect it to be extremely rare that someone would try to annotate via something that has unbalanced parens. This just doesn't make sense.

Ok. Then I really think we shouldn't be using parentheses for our special implicits -- this gives us one of the following:

    - { bool: ~true , ~#comment: value, ~=: migration, ~@: inherits }
    - { bool: `true , `#comment: value, `=: migration, `@: inherits }
    - { bool: @true , @#comment: value, @=: migration, @@: inherits }
    - { bool: $true , $#comment: value, $=: migration, $@: inherits }
    - { bool: %true , %#comment: value, %=: migration, %@: inherits }
    - { bool: ^true , ^#comment: value, ^=: migration, ^@: inherits }
    - { bool: =true , =#comment: value, ==: migration, =@: inherits }
    - { bool: ;true , ;#comment: value, ;=: migration, ;@: inherits }
    - { bool: .true , .#comment: value, .=: migration, .@: inherits }

Hmm. I think I like the back-tick the best, ` for our various string implicits and our special key implicits.

Best,

Clark
From: Brian I. <in...@tt...> - 2002-08-11 18:27:57
|
On 11/08/02 14:07 -0400, Clark C . Evans wrote:
> On Sun, Aug 11, 2002 at 08:13:21PM +0300, Oren Ben-Kiki wrote:
>
> Hmm. I think I like the back-tick the best, ` for our various string
> implicits and our special key implicits.

Good grief!

Cheers, Brian
From: Neil W. <neilw@ActiveState.com> - 2002-08-11 20:17:12
|
Clark C . Evans [11/08/02 14:07 -0400]:
> (#Unbalanced key :-() : Is an edge case, but it works just fine.

    Unbalanced key :-) : Works.
    (#Unbalanced key :-)) : Works?

    "#Unbalanced" : Works.
    (#"#Unbalanced") : Works?

    "" : Works.
    (#) : Works -- same?

    Foo (and bar) and baz : Works.
    (#Foo (and bar) and baz) : Works?

I'm seeing some problems with parens, unless we extend their semantics. The problem is that keys can be *any* data. They can be binary, number or arbitrary string.

As soon as you get into arbitrary strings, you have to use some kind of encoding to delimit them: something that shows up in the serialization that is not allowed to appear in strings. That, or escaping.

(a) Encoding:

    Foo :) : Bar
    (#Foo :)<SOME CHARACTER> : Bar

(b) Escaping:

    Foo :) : Bar
    (#Foo :\)) : Bar

Encoding is hard. Escaping is easier. But that's not the point...

The point is, this is all pretty silly. We want these things:

1. People can easily comment out keys "non destructively".

   People tend not to push the boundaries that much. This will probably
   mean people putting (#<whatever>) around single-word <whatever>s. Not
   too hard.

2. YAML processes can insert comments at various stages to aid in
   debugging.

   Processes stress the boundaries a lot. They may want to comment out
   structured keys, for example. Or keys that don't have matched parens.

Proposal:

Use transfers, or formats, or something. Hear me out.

"To tag a key/value pair as non-destructively commented out, give the key a !special|comment transfer".

We can still introduce a (#shorthand) for simple keys.

That's really it. This runs into its own complications if the key already had a transfer. I don't know how to solve that.

Later,
Neil
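Option (b) in the message above, escaping, can be illustrated with a short sketch. Nothing here comes from the spec: the choice of backslash as the escape character follows Neil's `(#Foo :\))` example, while escaping the backslash itself, and the function names `wrap_note` and `unwrap_note`, are assumptions added so that the round trip is lossless.

```python
def wrap_note(key):
    """Wrap a plain string key as '(#...)', escaping '\\' and ')'."""
    escaped = key.replace("\\", "\\\\").replace(")", "\\)")
    return "(#" + escaped + ")"


def unwrap_note(text):
    """Recover the original key from a string produced by wrap_note."""
    if not (text.startswith("(#") and text.endswith(")")):
        raise ValueError("not a note key: %r" % text)
    body = text[2:-1]
    out = []
    i = 0
    while i < len(body):
        # A backslash makes the next character literal.
        if body[i] == "\\" and i + 1 < len(body):
            out.append(body[i + 1])
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


# The string keys from the message, wrapped and then unwrapped again.
keys = ["Foo :)", "Unbalanced key :-)", "#Unbalanced", "", "Foo (and bar) and baz"]
round_tripped = [unwrap_note(wrap_note(key)) for key in keys]
```

This only covers string keys. It does nothing for the structured or binary keys Neil mentions, which is part of why the message goes on to propose a transfer-based approach instead.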
From: Clark C . E. <cc...@cl...> - 2002-08-11 22:16:12
|
Neil, thank you for your time focusing on this... I'm sorry it has you distracted from libyaml. ;(

On Sun, Aug 11, 2002 at 01:19:12PM -0700, Neil Watkiss wrote:
| Clark C . Evans [11/08/02 14:07 -0400]:
| > (#Unbalanced key :-() : Is an edge case, but it works just fine.
|
|   Unbalanced key :-) : Works.

    (#): |
      Unbalanced key :-) : Works.
      "#Unbalanced" : Works.
      "" : Works.
      Foo (and bar) and baz : Works.

| I'm seeing some problems with parens, unless we extend their semantics. The
| problem is that keys can be *any* data. They can be binary, number or
| arbitrary string.

The point of the comment key... if we even want to keep it... was to provide a simple way for people to "annotate" a node with information that applications should, in general, ignore. That's it. It was never intended to comment out "arbitrary" keys; this is why we have the throwaway syntax.

I think our "special" keys are probably quite useful. I use the '=' key in my data as I have older processes now that are working on new data. It's a very neat trick, but it requires a way for the application to express whether a given node is expected to be a scalar. Thus far we don't have a general mechanism to do this... but I think schema will provide this expression and at this point the "=" key will be very useful. So... I don't want to dump it.

As for the comment key, I've yet to use it. And Oren's current use cases are quite far from anything I can possibly imagine someone really wanting to do. So, we can either dump them, or they can comply with a syntax mechanism that other implicits use... cuz they haven't proved enough value to merit their own indicator and semantics.

After some thought, I'd like to migrate the "special" keys over to use the parenthesized forms. I say this because I think that we may want to introduce a few more special keys in the future, and if we make it clear in the spec that more special keys (or other short-cuts) may be added via the (parentheses) then implementers can code for this eventuality and not choke on new special keys or implicit types that we dream up later.

So, I'd like to use (=) for the "equals" special key.

This leaves (#) for the comment special key. We started to allow other "differentiators" to follow the //, as in //a and //b, so that a given node can have more than one comment. As far as I'm concerned this ability doesn't need to extend to anywhere near a full set of characters. Limiting future implicits so that they do not contain new lines and do not contain a parenthesis is probably a good idea, and an escaping mechanism is not needed. This is for very limited usage... not a general mechanism.

So. I see two options on the table: (a) we adopt (#...) for comment keys or (b) we drop comment keys. In either case, it looks like we need to explicitly specify that current and future parenthesized implicits will never contain a ) or a new line. Hopefully this will make it easy.

| The point is, this is all pretty silly. We want these things:
|
| 1. People can easily comment out keys "non destructively".
|
|    People tend not to push the boundaries that much. This will probably
|    mean people putting (#<whatever>) around single-word <whatever>s. Not
|    too hard.
|
| 2. YAML processes can insert comments at various stages to aid in
|    debugging.
|
|    Processes stress the boundaries a lot. They may want to comment out
|    structured keys, for example. Or keys that don't have matched parens.

    (#err304): |
      !!type this: is the offending key-value pair

| Proposal:
|
| Use transfers, or formats, or something. Hear me out.
|
| "To tag a key/value pair as non-destructively commented out, give the key a
| !special|comment transfer".
|
| We can still introduce a (#shorthand) for simple keys.
|
| That's really it. This runs into its own complications if the key already had
| a transfer. I don't know how to solve that.

Yes. This is a very simple issue, and we are indeed making a mountain out of it. However, the core proposal is:

We re-frame the special keys so that they use the (parenthesized) forms so that implementers can handle them in a generic way even when they don't recognize or support a given implicit. As such, we probably need to limit what is in the parentheses in a reasonable way so that it is easy for implementers to handle.

Best,

Clark
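The restriction proposed above (a parenthesized implicit never contains a `)` or a new line) is what lets an implementation recognize special keys it does not know yet. The following is a rough sketch under that assumption; the pattern, the category names, and the function `classify_key` are invented for this illustration and are not taken from any YAML implementation.

```python
import re

# "(" + anything except ")" or a newline + ")", matched against the whole key.
PAREN_IMPLICIT = re.compile(r"\(([^)\n]*)\)")


def classify_key(key):
    """Sort a key into plain / equals / comment / unknown-special."""
    match = PAREN_IMPLICIT.fullmatch(key)
    if match is None:
        return ("plain", key)
    body = match.group(1)
    if body == "=":
        return ("equals", None)
    if body.startswith("#"):
        return ("comment", body[1:])
    # Not recognized today, but still clearly reserved: a tool can pass it
    # through or skip it instead of choking on it.
    return ("unknown-special", body)


examples = ["name", "(=)", "(#err304)", "(@)", "(#bar()", "(#Unbalanced key :-))"]
classified = [classify_key(key) for key in examples]
```

Under this rule `(#bar()` is still a comment key, since only `)` is excluded from the body, while `(#Unbalanced key :-))` falls back to a plain key, which is the kind of case Neil's message was probing.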
From: Oren Ben-K. <or...@ri...> - 2002-08-12 09:13:56
|
Neil Watkiss [mailto:neilw@ActiveState.com] wrote:
> Proposal:
>
> Use transfers, or formats, or something. Hear me out.

*Smack self on head*. That's it!

Just a tiny twist - you want to preserve the original type family/format somehow. Which is easy enough... we just declare the type family "!~" to mean "a note" and say its "format" is the original type family. So...

    rule:
      original:
        !<transfer> <key-value> : <value-node>
      commented:
        !~|<transfer> <key-value> | <value-node>
    examples:
      - original:
          foo : bar
        commented:
          !~ foo : bar
      - original:
          !binary|base64 ... : data
        commented:
          !~|binary|base64 ... : data

Perfect! Thanks, Neil!

Have fun,

    Oren Ben-Kiki
From: Brian I. <in...@tt...> - 2002-08-12 17:21:00
|
On 12/08/02 12:15 +0300, Oren Ben-Kiki wrote:
> Neil Watkiss [mailto:neilw@ActiveState.com] wrote:
> > Proposal:
> >
> > Use transfers, or formats, or something. Hear me out.
>
> *Smack self on head*. That's it!
>
> Just a tiny twist - you want to preserve the original type family/format
> somehow. Which is easy enough... we just declare the type family "!~" to
> mean "a note" and say its "format" is the original type family. So...
>
> rule:
>   original:
>     !<transfer> <key-value> : <value-node>
>   commented:
>     !~|<transfer> <key-value> | <value-node>
> examples:
>   - original:
>       foo : bar
>     commented:
>       !~ foo : bar

I'd like to drop the notion of commented keys for now. It's not the syntax that's bothering me. It's the semantics of what it means. And I don't think that will become clear without implementations.

I'm having some issues with how you are perceiving that the information model works. I'm not sure how to express it, but at least riddle me this. Describe how the following behave:

    ---
    foo: bar
    !~ foo: bar
    foo: !~ bar
    !~ foo: !~ bar
    ...

It doesn't make sense to me that changing the semantics of a mapping key would somehow affect its value.

Deeper yet, I don't consider a mapping key to be a YAML node in its own right. A mapping key is *part* of a mapping. YAML graphs consist of mappings, sequences, and leaves. A mapping key is not any of those. It may appear in those forms but it does not stand on its own. For instance, you can't have a mapping key as a top level production. Also, scripting languages don't consider keys as a type.

This distinction is important. It ties heavily into YPATH semantics as well. I don't think that having a path to a key is the right thing to do. A key is part of the path.

Well, I don't want to get off topic. I just think that the roundtripping comment is too deep of an issue to discuss in the void of meaningful implementations. And since we have no compelling use cases, let's please drop it until next year.

Cheers, Brian
From: <sh...@zi...> - 2002-08-12 17:53:55
|
Brian said:
> I'd like to drop the notion of commented keys for now. It's not the syntax
> that's bothering me. It's the semantics of what it means. And I don't think
> that will become clear without implementations.
>

+1

It seems like the problems that the commented keys would solve could and should be solved at a higher level than YAML.
From: Oren Ben-K. <or...@ri...> - 2002-08-12 17:49:05
|
Brian Ingerson [mailto:in...@tt...] wrote:
> I'm having some issues with how you are perceiving that the
> information model works. I'm not sure how to express it, but
> at least riddle me this. Describe how the following behave:
>
>   ---
>   foo: bar
>   !~ foo: bar
>   foo: !~ bar
>   !~ foo: !~ bar
>   ...

'~' is a type family saying: please hide this node from the application. It has *no* data type (as far as the application is concerned) because the application doesn't get to even *see* it in the first place. Hence the 'null' type family name.

For a key, this says "please hide the key/value pair from the application". All the special keys have semantics only when used as keys; this isn't an exception. What does:

    foo: =

mean? The thing is, it isn't valid/defined under any schema. I don't know if we have to define a meaning for such cases.

> It doesn't make sense to me that changing the semantics of a
> mapping key would somehow affect its value.

Makes perfect sense to me.

> Deeper yet, I don't consider a mapping key to be a YAML node
> in its own right. A mapping key is *part* of a mapping. YAML
> graphs consist of mappings, sequences, and leaves. A mapping
> key is not any of those.

This is *so* Perl 5. What do you feel is the info model interpretation of:

    scatter-plot:
      { x: 1, y: 2 }: bloop
      { x: 2, y: 1 }: knick

This is valid in Java, Python, C/C++, and Perl 6 (AFAIK). The key is *very much* a node in its own right. It may be an arbitrarily complex node.

> This distinction is important. It ties heavily into YPATH
> semantics as well.

Sure does; in fact the main complexity of YPATH is to select whether one is going down into the value (the most common operation) or into the key (more rare).

> I don't think that having a path to a key is the right thing
> to do. A key is part of the path.

How would I be able to select "the point whose X coordinate is 1" in the above example, then?

> Well I don't want to get off topic. I just think that the
> roundtripping comment is too deep of an issue to discuss in
> the void of meaningful implementations. And since we have no
> compelling use cases, let's please drop it until next year.

Well... I still think it is a universal mechanism that is worth standardizing. We *can* just add it later on, if both you and Clark feel it is premature to add it at this point.

Oh well. I still think it makes a lot of sense, but absent a strong use case, I'll reluctantly freeze it. For now. Until such a time as such a use case comes up :-) Sigh.

Have fun,

    Oren Ben-Kiki
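The scatter-plot question in the message above can be tried out directly in Python, with one caveat worth stating: Python mapping keys must be hashable, so a plain `dict` cannot itself be a key. The sketch below assumes each point is stored as a `frozenset` of its items; that representation and the helper names `point` and `select_by_key_field` are choices made for this example only.

```python
def point(**coords):
    """Build a hashable stand-in for a mapping such as { x: 1, y: 2 }."""
    return frozenset(coords.items())


# The scatter-plot example from the message, with structured keys.
scatter_plot = {
    point(x=1, y=2): "bloop",
    point(x=2, y=1): "knick",
}


def select_by_key_field(mapping, field, wanted):
    """Return (key, value) pairs whose key has `field` equal to `wanted`."""
    return [
        (dict(key), value)
        for key, value in mapping.items()
        if dict(key).get(field) == wanted
    ]


# "The point whose X coordinate is 1": the query has to look inside the key.
matches = select_by_key_field(scatter_plot, "x", 1)
```

The query walks into the key rather than the value, which is the distinction Oren says YPATH has to be able to express.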
From: Clark C . E. <cc...@cl...> - 2002-08-11 12:58:00
|
| I'm sorry for any confusion that was about accepting proposals; one reason I
| sent the list of changes is to ensure that everyone does in fact agree.
| As for the latest proposal; if I understand correctly, it is:
|
| > He is ok with the new type family (!) styles:
| >   !!private
| >   !yaml-specific
| >   $domain,year/whatever
| >   $language/whatever
|
| I'm hoping the '$' is a typo and that you meant:
|
|   !!private
|   !yaml-specific
|   !domain,year/whatever
|   !language/whatever
|
| Right?

Right. It was late. ;)

| > - >
| >   I assume he wants to keep !/whatever reserved for now,
| >   pending a discussion of the #DOMAIN proposal.
|
| I gave up on !/str, then Brian said he kind of liked it and you said fine,
| lets go for it; it seems there was some confusion here. I'm OK with
| reserving it. I'm not certain yet about #DOMAIN as oppose to simply using
| !...^... in the type family of the top-level node. I'm certain that #DOMAIN
| does not replace ^ (it is too restricted).

Good. We have yet to really dive into #DOMAIN; on the surface it seems interesting enough.

| > - >
| >   He is ok with just adding '/' to the string regular
| >   expression, keeping other characters reserved. This
| >   is primarily justified by the ypath use case and not by
| >   a unix path use case, although unix paths would be
| >   useable unquoted.
|
| I can see that './' is pretty rare,. As for '\', it is a little published
| fact that '/' works fine in Windows, even though it isn't the norm. This
| only leaves things like '../', again should be pretty rare in a
| configuration file. I'll go with just '/'.

Great.

| > - >
| >   Since this breaks the //comment special key, he suggested
| >   that perhaps the # could be used since it is not immediately
| >   followed by a space. This works for me:
| >   ---
| >   #: one comment special key
| >   #more: Another comment special key
|
| I'm uncomfortable with this, because a single character change '# :' vs.
| '#:' subtly changes the semantics of a file. In the other cases a space
| indicator is used, omitting the space (or placing one where it shouldn't be)
| causes an error, as in:
|
|   a: b: c # Error
|   a: { b, c: d } # Error
|
| Saying that both the following are comments:
|
|   #: A comment
|   # : Also a comment
|
| But one is throwaway and one isn't is too confusing IMVHO. I'd stick with
| ';' if that is at all acceptable to you. The ';' is a comment marker in
| configuration files and other contexts, it isn't that alien to people.

Ick. Brian didn't seem to like ; at all (he gave it a -4). I really don't like it either. That being said, perhaps (#comment) is a better choice.

I know you have a use case for commenting out key/value pairs in such a way that they round-trip. I'm not certain that this is so important that someone couldn't just use the () thingy. For example:

    ---
    before: commenting
    ---
    (#before): commenting

I kinda like it this way; it uses the parentheses so that it's quite clear to someone who is moderately YAML aware that it is "special".

| > - >
| >   He's ok with keeping things reserved given the two
| >   reasons below (flexibility and simplicity). He's not
| >   in favor of adding any more implicit types.
|
| I'm well aware of that and I agree with most of his reasoning, which is why
| we changed the string regexp as we did. I'd still like to keep the door
| open, though.

Nods.

| > - >
| >   Brian brought up the topic of how URIs are handled,
| >   does a parser report the tag:uri or not.
| >
| >   I answered no, it returns exactly what is in the YAML file
| >   as these strings themselves should be unique. One restriction
| >   is needed, so that yaml.org,2002 is not used for domain,year
| >   which is easy since we control yaml.org ;)
|
| Right.

Cool.

| > This leaves Brian's big question:
| >
| > - Do we need special keys, and if so, how can we clarify
| >   the specification so that they are used properly.
|
| I think it is important to have a universal, standard way that a YAML system
| is capable of storing every YAML data, including data that uses unknown type
| families. This requires the '=' and '!' special keys. I also think it is
| important to be able to represent the tree model as YAML data, which means
| using the '*' and '&' special keys. This has potential for YTL, for example,
| and using YPATH to address *all* the parts of a YAML document.

Nods. For me, the special keys represent particular idioms layered on
top of YAML which are valuable across many domains. For example:

  - key: '='
    value: >
      This is very useful to represent the "value" idiom. That is,
      when a mapping has one primary "value" and the other keys are
      just coloring then = works wonders. For example, in a generic
      XML mapping, the = key could represent the "content" of the
      element, while the other keys in the mapping could represent
      the various attributes.

      The = key has another feature which will only be apparent when
      we have schemas or a sequential-access "pull" parser (in both
      cases the user can provide expectations to the parser). In this
      case, an older application can be expecting that a particular
      object is a scalar and newer applications can convert a scalar
      to a mapping. With the ability to express "expectations", the
      older application can operate on newer data (it only uses the
      stuff in the = key) until it can be upgraded to be fully aware
      of the new structure. Such a "forward compatibility" feature is
      very valuable in a messaging system, esp. ones which are
      firmware and built into equipment.

  - key: '#'
    value: >
      It may be the case that some applications want to round-trip
      comments. If this is a big enough use case, then perhaps it's
      important to keep.
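
The forward-compatibility use of the '=' key described above could be sketched roughly as follows. This is only an illustration in Python over already-loaded data (plain dicts and strings). The helper name scalar_view and the sample keys are hypothetical and not part of any YAML library or of the spec.

```python
def scalar_view(node, value_key="="):
    """Return the scalar an older application expects from `node`.

    Hypothetical helper: if newer data has promoted the scalar to a
    mapping, the primary value is assumed to live under the '=' key.
    """
    if isinstance(node, dict):
        if value_key not in node:
            raise KeyError("mapping has no %r key to fall back on" % value_key)
        return node[value_key]
    # Already a scalar: hand it back unchanged.
    return node


# Older data: the width is a plain scalar.
old_doc = {"width": "12"}
# Newer data: the width became a mapping; '=' carries the primary value.
new_doc = {"width": {"=": "12", "unit": "cm"}}

old_width = scalar_view(old_doc["width"])
new_width = scalar_view(new_doc["width"])
```

An older application that only calls scalar_view sees the same value in both documents, and simply never looks at the extra keys.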
| As for ';', I agree with Brian it is a tricky to implement something like:
|
| p1: !java/my.point.class
|   x: 7
|   y: 8
| p2: !java/my.point.class
|   x: 8
|   y: 7
|   ;bloop: xyzzy
|
| 'p2' just can't be loaded into the specified native data type because it has
| the associated 'note' that doesn't fit into the class. This isn't a problem
| in Perl because there adding keys to a map is easy; but it is a problem in
| Java/C++/etc. (I don't know about Python).

The concept of a "shadow" will help out here. A shadow is a lookup
table keyed by object with additional attributes. It should work
with most languages, and is good for infrequently accessed stuff.
The primary problem with shadows is garbage collection (not locking
an object in memory and knowing when it is deleted).

| In order to analyze this you must keep in mind the distinction between a
| schema-specific application that, presumably, loads the data into the
| appropriate native data structures; and a schema-independent YAML tool that,
| presumably, doesn't. The YAML tool would have to use something like the
| special keys to encode YAML data in simple hash tables and lists.
|
| It is explicitly allowed in the spec for the application to apply a filter
| (before the loader) that strips away "notes". That is considered "processing
| the data" and does not qualify for a "round trip" but after all,
| applications are all about processing the data *anyway*.

Right. And in this case, a specific application could discard
all values in a map besides the '=' key.

| A generic YAML tool that does "round-trip" has no problem supporting the
| note special key. For it, it is just another key in the map.
|
| Put another way: notes are *expected* to be ignored by the application; but
| they should be round-trips through all the stages up to being delivered to
| it.
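
The note-stripping filter mentioned above might look something like the sketch below. It is written in Python over plain dicts and lists and is not taken from any YAML implementation. Since the note syntax is still undecided in this thread, the test for what counts as a note key is a parameter. The function names and the two predicates are assumptions made for illustration.

```python
def semicolon_note(key):
    """Assumed convention 1: note keys start with ';' (e.g. ';bloop')."""
    return isinstance(key, str) and key.startswith(";")


def paren_note(key):
    """Assumed convention 2: note keys look like '(#name)'."""
    return isinstance(key, str) and key.startswith("(#") and key.endswith(")")


def strip_notes(node, is_note=semicolon_note):
    """Return a copy of `node` with every note key removed, recursively."""
    if isinstance(node, dict):
        return {
            key: strip_notes(value, is_note)
            for key, value in node.items()
            if not is_note(key)
        }
    if isinstance(node, list):
        return [strip_notes(item, is_note) for item in node]
    # Scalars pass through untouched.
    return node


# The p1/p2 data from the example, as a generic tool might hold it.
points = {
    "p1": {"x": 7, "y": 8},
    "p2": {"x": 8, "y": 7, ";bloop": "xyzzy"},
}
for_application = strip_notes(points)
```

A generic round-tripping tool would keep `points` as it is. Only the schema-specific application would be handed `for_application`, in which p2 again has just the x and y fields its class expects.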
|
| For example, a YAML-RPC system with dispatchers and forwarders and caching
| and whatever, all working at the YAML level, would be expected to treat the
| "notes" as just more data, but the implementation of the RPC service that is
| finally triggered to handle a particular call is free to ignore such
| "notes".
|
| And, of course, if the application is capable of passing such notes through,
| without modification and without it effecting anything, so much the better.
| For example, anywhere the application uses a simple "!map", it is hardly any
| effort to simply ignore all note keys. And many applications do use simple
| "!map" objects.
|
| What is this good for? It allows you to, for example, "trace" the data as it
| flows through the system by adding such tags at various points, without
| worrying about the application choking on them and without having to invent
| a derived with-notes schema for each schema that moves through the system. I
| expect this would be a very useful debugging tactic for any generic YAML
| tool.

I was thinking of the debugging context as well.

    --- !clarkevans.com,2002/rpc-call
    callee: some-object
    invoke: some-method()
    args:
      - 3
      - hello
    (#rpc):
      received: 2003-08-21 12:02:00.02
      sender: 209.9.30.66

The comment key would allow stuff to be added to a mapping without
requiring it to be included in the mapping's schema. Or having it
be expected by a given application.

| I also expect it would be useful in other contexts. Perhaps it isn't the
| main focus of a data serialization problem, but I want to accommodate the
| user that asks "and what if I want my comment *not* to be thrown away?" -
| for whatever reason. Without a standard "notes" key, everyone would have to
| invent his own convention for it, and implement his own filter for stripping
| the notes (so the application won't choke on them). Using ';', we allow this
| to become a standard part of any YAML system.

Yes.
The value of special keys is that they provide a common way
to support a shared idiom rather than each problem domain inventing
their own way of doing it.

| Note that you are not *required* to support the ';' keys, or any of the
| special keys for that matter.

Exactly.

| > Overall: >
| >
| > I hope this accurately reflects things. I think it should
| > be OK to move ahead with the spec changes, perhaps marking the
| > comment special key as "subject to change" and changing it from
| > // to # in the short-run.
|
| Make it ';' :-)

I'm thinking that if we want to keep comment keys, and in fact
add more of them in the future (for example, the "inheritance"
feature) then perhaps we should do this only within parentheses:

    (=): Using the equal special key
    (#comment): A comment special key

As much as a bare = is nicer, the above may give things the level
of consistency that we need. At the very least I think it would
make Steve happy...

Best,

Clark

--
Clark C. Evans
Axista, Inc.
http://www.axista.com
800.926.5525
XCOLLA Collaborative Project Management Software