From: Clark C. E. <cc...@cl...> - 2004-08-31 16:10:50
|
abstract:

  We need a mechanism to 'mark' particular nodes in a YAML tree as
  belonging to a particular type, schema, etc. This handle can then be
  used by external schema languages and tools to attach or drive
  behavior. We do not want to imply any particular semantics for these
  tags, to allow for the greatest flexibility.

  Currently, we have a globally unique 'taguri' for each node in a YAML
  tree that conforms to the tagURI specification [1]. However, the YAML
  syntax for binding this tag is complicated and doesn't address content
  mixing concerns.

  We keep our 'taguri' in the YAML information model as a globally
  unique string that conforms to the tagURI specification. However, we
  provide a simpler syntax mechanism for presenting these tags.

proposal:

  We introduce a new directive, 'tag', which binds a prefix to an
  authority, or taggingEntity.

    declaration := "%tag:" taggingEntity ['=' prefix]
    typetag     := "!" [prefix "^"] specific

  Where taggingEntity and specific refer to the same productions in the
  taguri specification; taggingEntity is either a domain or an email
  address followed by a date for uniqueness. Further, the empty prefix
  is expressed by omitting ['=' prefix] and [prefix "^"].

  The "^" character was chosen because it is not included in RFC2396's
  uric production (aka taguri's specific); other possibilities include
  "|" "\" "`".

  The cooked 'tag' for any given node in the YAML representation model
  is obtained by:

    taguri = 'tag:' taggingEntity ":" specific

  If a document does not contain a declaration for the empty prefix,
  then the taggingEntity for the empty prefix is 'yaml.org,2002'.

examples:

  ---
  - !int       # tag:yaml.org,2002:int
  - !sub/type  # tag:yaml.org,2002:sub/type
  - !!bing     # tag:yaml.org,2002:!bing

  --- %tag:perl.yaml.org,2002
  - !Some::Package  # tag:perl.yaml.org,2002:Some::Package

  --- %tag:cla...@gm...,2004-08-20
      %tag:clarkevans.com,2002-03=bing
  - !int               # tag:cla...@gm...,2004-08-20:int
  - !bing/Some::Thing  # tag:cla...@gm...,2004-08-20:bing/Some::Thing
  - !bing^wibble       # tag:clarkevans.com,2002-03:wibble

Clark
|
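The cooking rule in this first draft can be sketched in a few lines of Python. This is only an illustration of the grammar quoted in the message, not code from any YAML implementation: the function names `parse_declaration` and `cook` are made up here, and `example.com,2004-08-20` is a hypothetical taggingEntity standing in for the elided email address.

```python
DEFAULT_ENTITY = "yaml.org,2002"  # default for the empty prefix, per the draft


def parse_declaration(line):
    """Split "%tag:" taggingEntity ['=' prefix] into (prefix, taggingEntity)."""
    if not line.startswith("%tag:"):
        raise ValueError(f"not a %tag declaration: {line!r}")
    entity, _, prefix = line[len("%tag:"):].partition("=")
    return prefix, entity  # prefix is '' when ['=' prefix] is omitted


def cook(typetag, bindings):
    """Expand "!" [prefix "^"] specific into 'tag:' taggingEntity ':' specific."""
    if not typetag.startswith("!"):
        raise ValueError(f"not a type tag: {typetag!r}")
    body = typetag[1:]
    prefix, caret, specific = body.partition("^")
    if not caret:  # no "^" means the empty prefix is in effect
        prefix, specific = "", body
    if prefix in bindings:
        entity = bindings[prefix]
    elif prefix == "":
        entity = DEFAULT_ENTITY
    else:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return f"tag:{entity}:{specific}"


# Hypothetical declarations mirroring the third example document.
declarations = ["%tag:example.com,2004-08-20", "%tag:clarkevans.com,2002-03=bing"]
bindings = dict(parse_declaration(d) for d in declarations)
print(cook("!bing^wibble", bindings))
print(cook("!bing/Some::Thing", bindings))
```

Note that `!bing/Some::Thing` contains no "^", so it is resolved against the empty prefix rather than against `bing`, which is what the example comments in the message show.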
From: Clark C. E. <cc...@cl...> - 2004-08-31 16:25:44
|
On Tue, Aug 31, 2004 at 12:10:49PM -0400, Clark C. Evans wrote:
|     declaration := "%tag:" taggingEntity ['=' prefix]
|     typetag     := "!" [prefix "^"] specific
...
|   The cooked 'tag' for any given node in the YAML representation
|   model, is obtained by:
|
|     taguri = 'tag:' taggingEntity ":" specific
|
|   If a document does not contain a declaration for the empty prefix,
|   then the taggingEntity for the empty prefix is 'yaml.org,2002'

Better yet, if the document does not contain a declaration for a
particular prefix, then the prefix is considered "unbound" and must be
provided by the Application. Most applications can then provide the
default taggingEntity that makes the most sense.

|
| examples:
|

Assuming that 'yaml.org,2002' would be the default taggingEntity if one
wasn't provided to the YAML parser,

|   ---
|   - !int       # tag:yaml.org,2002:int
|   - !sub/type  # tag:yaml.org,2002:sub/type
|   - !!bing     # tag:yaml.org,2002:!bing
|
|   --- %tag:perl.yaml.org,2002
|   - !Some::Package  # tag:perl.yaml.org,2002:Some::Package
|
|   --- %tag:cla...@gm...,2004-08-20
|       %tag:clarkevans.com,2002-03=bing
|   - !int               # tag:cla...@gm...,2004-08-20:int
|   - !bing/Some::Thing  # tag:cla...@gm...,2004-08-20:bing/Some::Thing
|   - !bing^wibble       # tag:clarkevans.com,2002-03:wibble
|
From: T. O. <tra...@ru...> - 2004-08-31 16:31:48
|
On Tuesday 31 August 2004 12:10 pm, Clark C. Evans wrote:
> We introduce a new directive, 'tag' which binds a prefix to
> an authority, or taggingEntity.
>
>   declaration := "%tag:" taggingEntity ['=' prefix]
>   typetag := "!" [prefix "^"] specific
>
> Where taggingEntity and specific refer to the same productions
> in the taguri specification; taggingEntity is either a domain
> or an email address followed by a date for uniqueness.

Clark,

This is very nice. Yet I don't understand why you seem to be set
against allowing specific tag definitions? Like:

  --- %tag:perl.yaml.org,2002:Some::Package=somepkg
  - !somepkg  # perl.yaml.org,2002:Some::Package

Seems to me that any future schema will need that kind of control.

Also, just curious about your opinion on the % "directive doc", instead
of just the single "directive command".

--
T.
|
From: why t. l. s. <yam...@wh...> - 2004-08-31 16:46:24
|
Clark C. Evans wrote:
>proposal:
>
>  We introduce a new directive, 'tag' which binds a prefix to
>  an authority, or taggingEntity.
>
>    declaration := "%tag:" taggingEntity ['=' prefix]
>    typetag := "!" [prefix "^"] specific
>

I'm completely warmed up to this. Sweet sweet sweet delicious corn,
Clark.

Is this still 1.0? This breaks a bunch of docs out there using the
caret. Or is the caret staying?

_why
|
From: Sean O'D. <se...@ce...> - 2004-08-31 17:36:48
|
On Tuesday 31 August 2004 09:10, Clark C. Evans wrote:
>
>   ---
>   - !int       # tag:yaml.org,2002:int
>   - !sub/type  # tag:yaml.org,2002:sub/type
>   - !!bing     # tag:yaml.org,2002:!bing
>
>   --- %tag:perl.yaml.org,2002
>   - !Some::Package  # tag:perl.yaml.org,2002:Some::Package
>
>   --- %tag:cla...@gm...,2004-08-20
>       %tag:clarkevans.com,2002-03=bing
>   - !int               # tag:cla...@gm...,2004-08-20:int
>   - !bing/Some::Thing  # tag:cla...@gm...,2004-08-20:bing/Some::Thing
>   - !bing^wibble       # tag:clarkevans.com,2002-03:wibble

I like this.

Variation ideas:

  --- %dom cla...@gm.../2004-08-20
      %domref bing=clarkevans.com/2002-03
  - !int               # cla...@gm.../2004-08-20/int
  - !bing/Some::Thing  # tag:cla...@gm.../2004-08-20/bing/Some::Thing
  - !bing^wibble       # tag:clarkevans.com/2002-03/wibble

These are notes from someone who doesn't quite understand why YAML
does a lot of the things it does, so my eyes are still pretty newbie.

* Since the header contains domain (I think that's what they're called)
  information, and not tag information (at least not directly), tag seems
  a little misleading. It's very confusing to look at "tag" and then
  learn that no tags are actually defined by it, it's just domain prefixes
  for tags in the document. I used "dom" for the default domain and
  "domref" to indicate domain shortcuts.

* In "cla...@gm...,2004-08-20" the comma doesn't look like part of
  that value, it looks like the comma is separating the date from the rest
  of it for some other purpose. Also very confusing.

* Since ":" separates key names from values in a YAML doc, it's kind of
  confusing to have them separating the date and type in
  "cla...@gm.../2004-08-20/int". It's also sort of confusing to have
  "tag:" when it seems since this is special, non-YAML-like syntax, just
  "%tag<space>" would do.

* Using "/" to separate all elements of the domain (taguri?) is pretty
  intuitive to me, and it makes looking at the domain much easier.

Sean O'Dell

P.S. The YAML jargon is still a mystery to me, so I'm not sure "domain"
is the right word. I think it may be "taguri".
|
From: Clark C. E. <cc...@cl...> - 2004-08-31 18:05:00
|
On Tue, Aug 31, 2004 at 10:36:44AM -0700, Sean O'Dell wrote:
| I like this.

Great; I'll post a new draft incorporating your comments.

| Variation ideas:
|
|   --- %dom cla...@gm.../2004-08-20
|       %domref bing=clarkevans.com/2002-03
|   - !int               # cla...@gm.../2004-08-20/int
|   - !bing/Some::Thing  # tag:cla...@gm.../2004-08-20/bing/Some::Thing
|   - !bing^wibble       # tag:clarkevans.com/2002-03/wibble
|
| These are notes from someone who doesn't quite understand why YAML
| does a lot of the things it does, so my eyes are still pretty newbie.

Ok.

| * Since the header contains domain (I think that's what they're called)
|   information, and not tag information (at least not directly), tag seems
|   a little misleading. It's very confusing to look at "tag" and then
|   learn that no tags are actually defined by it, it's just domain prefixes
|   for tags in the document.

Noted. Let's use 'ns' instead, where 'ns' == namespace.

| "domref" to indicate domain shortcuts.

I think you seem to be having a problem with ['=' prefix] after the
taggingEntity; I'll move it back in the next draft.

| * In "cla...@gm...,2004-08-20" the comma doesn't look like part of
|   that value, it looks like the comma is separating the date from the rest of
|   it for some other purpose. Also very confusing.

Well, the '/' implies path semantics and the taguri specification uses
the comma... I don't see a reason to change.

| * Since ":" separates key names from values in a YAML doc, it's kind of
|   confusing to have them separating the date and type in
|   "cla...@gm.../2004-08-20/int".

Once again, just following the taguri specification, and given that all
options seem equal, I don't see a reason to not follow suit.

| It's also sort of confusing to have
| "tag:" when it seems since this is special, non-YAML-like syntax, just
| "%tag<space>" would do.

Right. This would be a broader issue with the declaration syntax; let's
separate this from the proposal, it's kinda different.

| * Using "/" to separate all elements of the domain (taguri?) is pretty
|   intuitive to me, and it makes looking at the domain much easier.

As I said above, '/' implies path semantics, which isn't true in this
case. Using ':' is well established as a separator in URNs, and taguri
is a URN. This is kind of a separate issue: do we want to invent our own
URI standard? I don't think so.

Thanks Sean,

Clark

--
Clark C. Evans
Prometheus Research, LLC.
http://www.prometheusresearch.com/
office: +1.203.777.2550
mobile: +1.203.444.0557
Prometheus Research: Transforming Data Into Knowledge
  - Research Exchange Database
  - Survey & Assessment Technologies
  - Software Tools for Researchers
|
From: Sean O'D. <se...@ce...> - 2004-08-31 18:39:51
|
On Tuesday 31 August 2004 11:04, Clark C. Evans wrote:
>
> Well, the '/' implies path semantics and the taguri specification uses
> the comma... I don't see a reason to change.

Perhaps to make it easier for people new to YAML to understand what
they are. domain/resource is very similar to URLs which people already
understand.

> | * Since ":" separates key names from values in a YAML doc, it's kind of
> |   confusing to have them separating the date and type in
> |   "cla...@gm.../2004-08-20/int".
>
> Once again, just following taguri specification, and given that all
> options seem equal, I don't see a reason to not follow suit.

I guess I don't understand what the advantage of using taguri is in the
first place. People are already familiar with URLs, so why is taguri
better?

> | * Using "/" to separate all elements of the domain (taguri?) is pretty
> |   intuitive to me, and it makes looking at the domain much easier.
>
> As I said above, '/' implies path semantics, which isn't true in this
> case. Using ':' is well established as a separator in URNs, and taguri
> is a URN. This is kinda separate issue; do we want to invent our own
> uri standard. I don't think so.

Well, consider this. One day, schemes will probably be external to
documents, and they may reside in various locations, such as on web
servers. It might be very useful to code namespace documents as URLs,
so you can say: "http://domain.org/2002-01-01/mytype."

But also, taguri doesn't seem designed to locate external resources at
all. "domain.org,2002:type" is not a path to a local file, and there is
no protocol information at all that I can tell.

It just seems URL is a much more flexible, forward-looking style, and
people will understand it much better. I don't see the advantage in
taguri at all.

Sean O'Dell
|
From: Clark C. E. <cc...@cl...> - 2004-08-31 19:12:12
|
On Tue, Aug 31, 2004 at 11:39:47AM -0700, Sean O'Dell wrote:
| taguri doesn't seem designed to locate external resources at all.
| "domain.org,2002:type" is not a path to a local file, and there is no
| protocol information at all that I can tell.

That is _exactly_ the point. Tag names are unique identifiers, they are
not external resources. If you use it to find an external resource,
well, that's your option. ;)

| Well, consider this. One day, schemes will probably be external to
| documents, and they may reside in various locations, such as on web
| servers. It might be very useful to code namespace documents as URLs,
| so you can say: "http://domain.org/2002-01-01/mytype."

Far less useful than what you might imagine:

  a) What happens 5 years from now when you no longer own the domain?

  b) You may have this point to a 'YASL' schema today, but what
     happens tomorrow when you change to Yippe (cuz YASL sucks)?
     Something RDDLish kinda solves this, but in an odd way.

  c) What happens if your laptop isn't on the internet?

  d) Just because you put a YASL schema there doesn't mean that
     everyone will be so kind; thus, having it as a full URL isn't that
     useful after all -- it works 'sporadically'.

  e) What happens if you don't own a domain name? taguri allows you
     to use email addresses.

| It just seems URL is a much more flexible, forward-looking style, and
| people will understand it much better. I don't see the advantage in
| taguri at all.

Well, nothing to stop you from writing a 'YASL' schema finder app and
putting it on the web (or using DNS or some other lookup mechanism).
For instance, http://yasl.yaml.org/find?tag=domain.org,2002:type would
work just wonderfully.

In short, you don't want to add distribution semantics into the mix; a
unique identifier is all that is required.

Best,

Clark
|
From: Clark C. E. <cc...@cl...> - 2004-08-31 18:54:25
|
summary:

  This is a second-pass draft incorporating feedback from Sean, and
  the comments by Brian that he doesn't want to be bothered by the
  %TAG stuff (we make it optional).

syntax:

  We introduce a new directive, 'ns', which binds a prefix to a
  namespace authority, or taggingEntity.

    declaration := "%ns:" [prefix '='] taggingEntity
    typetag     := "!" [prefix "^"] specific

  The taggingEntity and specific refer to the same productions in the
  taguri specification; taggingEntity is either a domain or an email
  address followed by a date for uniqueness. Further, the empty prefix
  is expressed by omitting [prefix '='] and [prefix "^"].

semantics:

  The 'prefix' is considered a feature of the Presentation model, and
  does not appear in the Serial or Representation models. The cooked
  'tag' to be reported by the parser is obtained by:

    taguri = 'tag:' taggingEntity ":" specific

  If a prefix is used in a document, and not associated with a
  taggingEntity in a directive, then the application must provide to
  the parser a valid taggingEntity for each prefix. It is customary
  that the empty prefix have an implicit namespace of 'yaml.org,2002'.

  Prefixes used in a document that have not been defined by a
  directive, or explicitly provided to the parser, are errors.

design:

  - We are using the directive syntax because it's there and this was
    one of the intended items (a directive is non-content).
    Alternatively, we could make a special syntax.

  - The "^" character was chosen because it is not included in
    RFC2396's uric production (aka taguri's specific); other
    possibilities include "|" "\" "`".

  - We use the tagURI specification (http://taguri.org) to define the
    unique URIs. This follows previous versions of the YAML spec. The
    tagURI is used because it does not imply access semantics and
    defines an easily 'mintable' unique identifier.

examples:

  --- # assuming default convention mapping '' to 'yaml.org,2002'
  - !int       # tag:yaml.org,2002:int
  - !sub/type  # tag:yaml.org,2002:sub/type
  - !!bing     # tag:yaml.org,2002:!bing

  --- # with parser configured to map '' to 'perl.yaml.org,2002'
  - !Some::Package  # tag:perl.yaml.org,2002:Some::Package

  --- %ns:perl.yaml.org,2002
  - !Some::Package  # tag:perl.yaml.org,2002:Some::Package

  --- %ns:cla...@gm...,2004-08-20
      %ns:bing=clarkevans.com,2002-03
  - !int               # tag:cla...@gm...,2004-08-20:int
  - !bing/Some::Thing  # tag:cla...@gm...,2004-08-20:bing/Some::Thing
  - !bing^wibble       # tag:clarkevans.com,2002-03:wibble
|
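What changes in this second draft is where bindings may come from: a `%ns:` directive in the document, or the application. A rough Python sketch of that lookup order follows. The names `parse_ns`, `resolve`, and `UnboundPrefixError` are illustrative only, and letting directives override application bindings is one reading of the semantics section, not something the draft states outright.

```python
class UnboundPrefixError(Exception):
    """A prefix was used with neither a directive nor an application binding."""


def parse_ns(line):
    """Split "%ns:" [prefix '='] taggingEntity into (prefix, taggingEntity)."""
    if not line.startswith("%ns:"):
        raise ValueError(f"not a %ns declaration: {line!r}")
    head, eq, tail = line[len("%ns:"):].partition("=")
    return (head, tail) if eq else ("", head)


def resolve(typetag, directives, app_bindings=None):
    """Cook a type tag; directive bindings take precedence (an assumption)."""
    bindings = dict(app_bindings or {})
    bindings.update(parse_ns(d) for d in directives)
    body = typetag[1:]  # drop the leading "!"
    prefix, caret, specific = body.partition("^")
    if not caret:  # no "^" means the empty prefix
        prefix, specific = "", body
    if prefix not in bindings:
        raise UnboundPrefixError(prefix)
    return f"tag:{bindings[prefix]}:{specific}"


# The application supplies the empty prefix; the document declares 'bing'.
app = {"": "perl.yaml.org,2002"}
print(resolve("!Some::Package", [], app))
print(resolve("!bing^wibble", ["%ns:bing=clarkevans.com,2002-03"], app))
```

With no directive and no application binding, `resolve("!int", [])` raises `UnboundPrefixError`, matching the draft's statement that such prefixes are errors.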
From: T. O. <tra...@ru...> - 2004-08-31 19:09:18
|
On Tuesday 31 August 2004 02:54 pm, Clark C. Evans wrote:
> --- %ns:cla...@gm...,2004-08-20
>     %ns:bing=clarkevans.com,2002-03

Looks familiar ;)

  %SPACE:where=!www.somewhere.com,2004

But still no :(

  %TAG:wTlog=!www.somewhere.com,2004:troublelog

T.
|
From: Clark C. E. <cc...@cl...> - 2004-08-31 19:47:30
|
On Tue, Aug 31, 2004 at 03:09:05PM -0400, T. Onoma wrote:
| On Tuesday 31 August 2004 02:54 pm, Clark C. Evans wrote:
| > --- %ns:cla...@gm...,2004-08-20
| >     %ns:bing=clarkevans.com,2002-03
|
| Looks familiar ;)
|
|   %SPACE:where=!www.somewhere.com,2004
|
| But still no :(
|
|   %TAG:wTlog=!www.somewhere.com,2004:troublelog

Yes, this is by design. _why was having a bit of a problem trying to
figure out where to 'break' the tag when emitting; this makes the break
a no-brainer. While it may shorten your tags a bit, it introduces this
ambiguity at writing time. However, you could do this:

  %ns:tlog=troublelog.domain.tld,2003-03

Best,

Clark
|
From: Oren Ben-K. <or...@be...> - 2004-08-31 20:46:28
|
On Tuesday 31 August 2004 21:54, Clark C. Evans wrote:
> summary:
>
>   This is a second-pass draft incorporating feedback from Sean, and
>   the comments by Brian that he doesn't want to be bothered by the
>   %TAG stuff (we make it optional)

You turn your back for one second, and suddenly you see a _second_ pass
of a new proposal! :-) Alas, I'm not too thrilled with it:

- I'm very unhappy with making %tag/%ns/etc. optional, for two reasons.

  First, I have this strong gut feeling that all purely syntactical
  issues should be handled without any additional information. I can't
  put my finger on where exactly this will bite us... but I'm certain
  it will.

  More importantly, by making the declarations optional you are
  completely giving up on making prefixes be purely syntactical. If
  they were, then changing (all occurrences of) a prefix in a document
  would be akin to re-indenting the text - it would have 0 effect on
  the document's semantics. However, saying the prefix's tagging entity
  is "provided to the parser by the application" means that changing a
  prefix _will_ change the semantics (because the application doesn't
  expect a different set of prefixes). So you quickly get to a point
  where all (most) prefixes are untouchable and globally unique, so
  you'll need an IANA-like mechanism to manage conflicts. Of course, if
  you do that, %ns becomes completely pointless.

  I think we have only two realistic options here:

  1. Make %ns or %tag or whatever mandatory. Plus: everyone can define
     his own namespace and everyone can use a short prefix. Minus: you
     must explicitly declare the prefixes at the start of each
     document. As an example of a successful use of this approach, all
     the XHTML, DocBook and XSLT documents out there have namespace
     declarations. A similar approach is used by every "namespaced"
     programming language (using "import" or "with" or "use" or
     whatever to explicitly list the namespaces used). It's quite a
     common practice, really.

  2. Give up on "syntactical only" prefixes and set up an IANA-like
     repository for prefixes. Plus: short prefixes without any
     declarations. Minus: all the evils of a limited namespace - short
     names running out, squatting, etc. This approach is used for mime
     types; there, however, nobody really cares about having a nice,
     short name (beyond bragging rights :-). There are also URI schemes
     and the like - none are created by Joe programmer. Other than
     these examples, I don't know of any system that uses an IANA-like
     mechanism to manage namespaces this way. Hint, hint :-)

  Personally I'd stick with (1). If most people feel that %ns
  declarations are an evil and must be stamped out, fine, let's bite
  the bullet and go for (2).

- As for restricting the prefix to mean just a tagging entity. Note it
  still allows one to define "sub-namespaces" of sorts, e.g.:

    --- %ns:a=mine.com,2002
        %ns:b=also.mine.com,2002

  IMVHO forcing people to build hierarchical structures at the domain
  name rather than the path is rather pointless. What was the benefit
  again?

  On the other hand, insisting that %ns mean a tagging entity further
  increases the impression that a prefix is not simply a syntactical
  device. The %TAG was clearly purely such a device - you simply
  concatenated the prefix with the suffix, and each could be anything
  at all, so you had no possible illusion of semantics.

- Finally, using ^ instead of : or / to separate the prefix from the
  suffix. I see the point of using a non-URI character. It strengthens
  the "syntax-ness" of the mechanism. "^" is kind of ugly though. I
  suggest "|" instead (it is also not a valid URI character). While we
  are at it, rename '%tag' to '%pre' (for prefix), to remove any
  illusion of special semantics:

    --- %pre:a=tag:my.com,2002/n
    foo: !a|s/b  # tag:my.com,2002/ns/b
    ...

  BTW, in the %ns proposal it makes more sense to use ':' rather than
  '^' (or '|'), because that's the character that immediately follows
  the tagging entity in a taguri. It is also compatible with XML
  namespaces and is very readable. Since in the %ns proposal prefixes
  aren't just syntactical anyway ;-), there's no point in pretending
  they are by using an ugly character like "^". Similarly, if we go to
  an IANA-like solution, we should stick with ':'.

Finally, your examples aren't consistent:

> --- %ns:cla...@gm...,2004-08-20
>     %ns:bing=clarkevans.com,2002-03
> - !int               # tag:cla...@gm...,2004-08-20:int
> - !bing/Some::Thing  # tag:cla...@gm...,2004-08-20:bing/Some::Thing

? There's no '^' here...

> - !bing^wibble       # tag:clarkevans.com,2002-03:wibble

Have fun,

    Oren Ben-Kiki
|
From: Clark C. E. <cc...@cl...> - 2004-08-31 21:39:45
|
On Tue, Aug 31, 2004 at 11:46:21PM +0300, Oren Ben-Kiki wrote:
| - I'm very unhappy with making %tag/%ns/etc. optional

It was an attempt to hop Brian's objection.

| 1. Make %ns or %tag or whatever mandatory. Plus: everyone can define his
| own namespace and everyone can use a short prefix. Minus: you must
| explicitly declare the prefixes at the start of each document.

Right. Two versions of this, one that splits at the authority, the
other where the split is much more random. You've convinced me that
splitting at the authority isn't great.

| 2. Give up on "syntactical only" prefixes and set up a IANA-like
| repository for prefixes. Plus: short prefixes without any declerations.
| Minus: All the evils of a limited namespace - short names running out,
| squatting, etc.

Right, this is our "shorthand" approach in the current spec.

Also, T. Onoma would like a way to specify the whole unique identifier
once so he can reuse it again and again.

| IMVHO forcing people to build hierarchical structures at the domain name
| rather than the path is rather pointless. What was the benefit again?

_why mentioned an emitter problem knowing where to 'split' the tag;
this resolves the issue.

| On the other hand, insisting that %ns mean a tagging entity further
| increases the direction that a prefix is not simply a syntactical
| device. The %TAG was clearly a purely such a device - you simply
| concatenated the prefix with the suffix, and each could be anything at
| all so you had no possible illusion of semantics.

This mechanism was also compatible with T. Onoma's need, but didn't
help solve the splitting problem; I suppose what the emitter should
have is a way to provide the 'handle' / 'prefix'. Ok.

| "^" is kind of ugly though. I suggest "|" instead (it is
| also not a valid URI character). While we are at it, rename '%tag' to
| '%pre' (for prefix), to remove any illusion of special semantics:
|
|   --- %pre:a=tag:my.com,2002/n
|   foo: !a|s/b  # tag:my.com,2002/ns/b
|   ...

OK
|
From: why t. l. s. <yam...@wh...> - 2004-08-31 22:12:49
|
Clark C. Evans wrote:
>| On the other hand, insisting that %ns mean a tagging entity further
>| increases the direction that a prefix is not simply a syntactical
>| device. The %TAG was clearly a purely such a device - you simply
>| concatenated the prefix with the suffix, and each could be anything at
>| all so you had no possible illusion of semantics.
>
>This mechanism was also compatible with T.Onoma's need, but didn't
>help solve the splitting problem; I suppose, what the emitter
>should have is a way to provide the 'handle' / 'prefix'. Ok.
>

I'm okay with either solution. Even if prefixes end up with the ability
to represent a full path, I'll just have the emitter prefix domains
rather than scanning nodes to find out where I can cut corners. I can
do that now as well. And your idea is good. So we can drop that issue.

_why
|
From: Oren Ben-K. <or...@be...> - 2004-08-31 22:47:23
|
OK, third time's the charm (I hope).

Clark, Brian and myself just went through a heated debate on this in
IRC. Here's what we hope is a reasonable compromise (Brian had to leave
so he may have further comments).

- Some people (say, group A) want a GUID (globally unique id) in the
  document, so they can use it to look up the schema etc. Some people
  (say, group B) are happy to leave it unspecified and use context
  information instead.

- If we do specify a GUID, we want to specify it once rather than
  repeat it all over the place. Especially if we need more than one
  GUID ("mixing schemas").

So. Here's the proposal in a nutshell:

- There are two kinds of tags.

  - Tags using the format "!prefix:stuff" (where "stuff" doesn't start
    with ':') are globally unique tags. The prefix _must_ be declared
    in a directive (see below).

  - All other tags are private tags. They are not globally unique. They
    are handled in an application specific way - any way whatsoever.

- The syntax of the directive is "%tag:head|prefix", such that each
  "!prefix:tail" tag is converted to "tag:headtail" (that is, by simple
  concatenation). The result must be a valid tag: URI. No other
  restrictions apply. Note this is a purely syntactical operation.

- It is allowed to specify "!tag:full-tag-URI" without specifying the
  "%tag:|tag" directive. In fact, it is an error to define the 'tag'
  prefix (as in "%tag:|tag" or "%tag:stuff|tag").

So, group A is welcome to write:

  --- %tag:foo.com,2004:/bar/|mine
  !mine:baz stuff  # tag:foo.com,2004:/bar/baz
  ...

While group B is welcome to write:

  --- !My::Perl::Package stuff
  ...

Or, in other words, if you want a globally unique namespace, you pay
for it. If you don't, you don't. Obviously you can mix and match tags
of both types in the same document, if that makes sense for you.

Clark will expand on this later.

Have fun,

    Oren Ben-Kiki
|
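The split between global and private tags in this compromise can be sketched as follows. This is a hedged illustration in Python under the rules stated in the message, with the directive written in the "%tag:head|prefix" form the message defines; `parse_directive` and `classify` are invented names, and splitting the tag at its first ':' is an assumption, since the message does not spell out which characters a prefix may contain.

```python
def parse_directive(line):
    """Split "%tag:head|prefix" into (prefix, head)."""
    if not line.startswith("%tag:") or "|" not in line:
        raise ValueError(f"not a %tag directive: {line!r}")
    head, _, prefix = line[len("%tag:"):].partition("|")
    if prefix == "tag":
        raise ValueError("the 'tag' prefix is reserved and may not be defined")
    return prefix, head


def classify(tag, prefixes):
    """Return ('global', uri) or ('private', tag) for a "!..." tag."""
    body = tag[1:]  # drop the leading "!"
    prefix, colon, tail = body.partition(":")
    if not colon or tail.startswith(":"):
        return "private", tag  # e.g. !int or !My::Perl::Package
    if prefix == "tag":
        return "global", body  # already a full tag: URI
    if prefix not in prefixes:
        raise KeyError(f"prefix {prefix!r} must be declared in a directive")
    return "global", "tag:" + prefixes[prefix] + tail  # plain concatenation


prefixes = dict([parse_directive("%tag:foo.com,2004:/bar/|mine")])
print(classify("!mine:baz", prefixes))
print(classify("!My::Perl::Package", prefixes))
```

The "stuff doesn't start with ':'" rule is what keeps `!My::Perl::Package` private: after the first ':' the remainder begins with another ':', so it is never mistaken for a prefixed tag.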
From: Clark C. E. <cc...@cl...> - 2004-09-02 08:31:03
|
summary: This is a fourth-pass draft incorporating ideas from Onoma, and Oren. It is attempting to provide two different ways of tag globalization; both of which seem resonable for different sorts of applications. One method of tag globalization is to use 'private' tags in your YAML document, and use a transformation of sorts (either explicit, or implicit by the application) to convert one's tags to a globally unique variety. This method is perfect for small teams where interoperability isn't a huge problem, and who do not wish to pay the price of globalized tags. The other method, is an XML namespace like mechanism where a tagURI can be broken into chunks, the first (longer) half of the tag, containing the taggingEntity, is moved up into the declaration and given a handle. The second (shorter) half is then used within each tag as an together with the handle that links it to the longer half. The combining of the parts is done by the parser, so the application always sees full tagURIs. This proposal also has a gift for Sean, as it allows http URLs to be used as a YAML tag; especially useful for that 'root' tag that will drive processing. We (Oren and I) think this is dangerous, but heck, if someone wants to hurt themselves, it may be fun to watch. (*evil grin*). In any case, this relaxation does _not_ apply to the '%tag:' mechanism, which only works with tagURIs. syntax: - We open up the tag mechanism !tag to allow one or more characters from the uric production of RFC2396. Thus, one can use %XX where X is a hex character, plus any combination of the following characters: ; / ? : @ & = + $ , - _ . ! ~ * ' ( ) # A-Z a-z 0-9 In particular, characters which may _not_ appear in a !tag are marked as 'unwise' in RFC2396, including: { } | \ ^ [ ] ` These characters will provide an 'escape hatch' for current and future extensions to YAML. With this change, any URI can be directly used as a !tag. 
- We introduce a new directive 'tag' which provides a way to shorten the data entry of tagURIs. In particular, declaration := "%tag:" taggingEntity ":" spec_first [ "|" handle ] Where 'taggingEntity' refers to the same production in the tagURI specification. The taggingEntity refers to either a domain or email address followed by the minting date; see tagURI specification for details. The 'spec_first' refers to zero or more uric characters (it is optional). The 'handle' refers to a sequence of one or more word characters [a-zA-Z0-9_]. Optionally the '|' and handle can be missing, in this case the handle is considered to be the empty string ''. In a YAML document, each handle must be unique via string comparison. - We extend the !tag mechanism to allow a single '|' character, which is in the reserved characters above, the syntax for this special case is, taguri := '!' handle '|' spec_second In this circumstance, the 'handle' _must_ appear as a handle in one of the document's directives. The 'spec_second', is zero or more uric characters; with the restriction that either spec_first or spec_second (or both) must be at least one character. semantics: - For every special tag having a '|', the parser will do special cooking to join the information specified in the declaration together with the node's tag, such nodes will be treated as if they had been tagged, cooked := "!tag:' taggingEntity ":" spec_first spec_second Note that the 'handle' is not included in this information, it is considered a detail of the Presentation model, and should not occur in tools that comply with the Serialization nor Representation models. Thus, the 'handle' is _not_ part of the core YAML information model, it is mearly a syntax-level trick to ease the burden of typing and human reading. Also note while other URI schemes may appear in a tag, this cooking mechanism purposefully constructs tagURIs; that is, globally unique identifiers lacking protocol or access semantics. 
- If the document has a directive with an empty handle, then all other tags are cooked according to the rule above, using the taggingEntity and spec_first from the directive using the empty handle. - If the document does _not_ have a directive with an empty handle, then all other tags are reported AS-IS without any cooking. In these cases, a missing tag can be reported as an empty string. This pass-through behavior, which happens unless a %tag directive appears, is straight-forward and desireable. Private tags used in this mechanism can be used in other tag-globalization mechanisms without fear of conflicting with %tags. Note that tagURIs can also directly appear in these tags, as well as any other URI, relative or not. design: - We are using the directive syntax, beacuse it gives a clear indication that 'magic' is about to happen. Also, it localizes all of the declarions up-font. - The "|" character was chosen beacuse it is not included in RFC2396's uric production (aka taguri's specific), other possibilities include, "|" "\" "`" . - We use tagURI specification (http://taguri.org) to define the unique URIs. This follows previous versions of the YAML spec. The tagURI is used beacuse it does not imply access semantics and defines an easily 'mintable' unique identifier. compatibility: - The rollback of the magical !tag cooking rules may cause pain. Processes which used previous logic that !int was equivalent to 'tag:yaml.org,2002:int' have two options: (a) they can leave their documents as-is, and upgrade their processes to use private types; or (b) they could use an older 'parser' to load the documents and emit them with a new emitter. Emitters that (by default) used !!Private types, such as PyYAML do not have this problem. - The removal of cut^paste could cause problems with files that were created by hand. Not much one can do here but fix them by hand, or use an older parser to load, and then reemit. 
  Since emitting using cut^paste was not common, this is deemed to be a smaller problem. PyYAML didn't even implement cut^paste. ;)

For both of these cases, it is recommended that parsers have a legacy flag for a while to help users migrate. Sorry about this, but it's the price for such a big change.

example:

The following document,

    --- %tag:bar.com,2004:timesheet/meeting|meet
        %tag:foo.com,2004:shape/|shape
        %tag:yaml.org,2002:
        !tag:baz.com,2004:mixed/list
    - event: !meet|
        where: office
        time: 2004-09-09 10:00:00
        duration: !int 1:00
        text: boring
      shape: !shape|ellipse
        width: !float 10
        height: 5
    - event: !meet|
        where: office
        time: 2004-09-09 10:00:00
        duration: !int 1:00
        text: boring
      shape: !shape|rectangle
        width: !float 10
        height: 5
    ...

would differ in the Presentation Model, but would be identical in the Serialization and Representation models with,

    --- !tag:baz.com,2004:mixed/list
    - event: !tag:bar.com,2004:timesheet/meeting
        where: office
        time: 2004-09-09 10:00:00
        duration: !tag:yaml.org,2002:int 1:00
        text: boring
      shape: !tag:foo.com,2004:shape/ellipse
        width: !tag:yaml.org,2002:float 10
        height: 5
    - event: !tag:bar.com,2004:timesheet/meeting
        where: office
        time: 2004-09-09 10:00:00
        duration: !tag:yaml.org,2002:int 1:00
        text: boring
      shape: !tag:foo.com,2004:shape/rectangle
        width: !tag:yaml.org,2002:float 10
        height: 5
    ...

Clark

--
Clark C. Evans
Prometheus Research, LLC.
http://www.prometheusresearch.com/
office: +1.203.777.2550
mobile: +1.203.444.0557

Prometheus Research: Transforming Data Into Knowledge
  - Research Exchange Database
  - Survey & Assessment Technologies
  - Software Tools for Researchers
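
The cooking rules in this draft can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the names `parse_directives` and `cook` are hypothetical, the restriction that spec_first and spec_second cannot both be empty is not checked, and leaving a tag alone when it already starts with `tag:` is an assumption drawn from the example above (the draft text does not spell that case out).

```python
import re

# Grammar from the draft: "%tag:" taggingEntity ":" spec_first [ "|" handle ]
# Everything between "%tag:" and the optional "|handle" is kept as one prefix.
DIRECTIVE = re.compile(r"%tag:(?P<prefix>[^|\s]*)(?:\|(?P<handle>[A-Za-z0-9_]+))?")


def parse_directives(directives):
    """Map each handle ('' for the empty handle) to its tag prefix."""
    handles = {}
    for text in directives:
        match = DIRECTIVE.fullmatch(text)
        if match is None:
            raise ValueError(f"not a %tag directive: {text!r}")
        handle = match.group("handle") or ""
        if handle in handles:
            raise ValueError(f"handle declared twice: {handle!r}")
        handles[handle] = match.group("prefix")
    return handles


def cook(tag, handles):
    """Return the cooked form of a raw '!...' tag."""
    body = tag[1:]  # drop the leading '!'
    if "|" in body:
        handle, spec_second = body.split("|", 1)
        if handle not in handles:
            raise ValueError(f"undeclared handle: {handle!r}")
        return "!tag:" + handles[handle] + spec_second
    # Assumption: a tag that is already a full tagURI is left alone.
    if "" in handles and not body.startswith("tag:"):
        return "!tag:" + handles[""] + body
    return tag  # no empty handle declared: report the tag as-is


handles = parse_directives([
    "%tag:bar.com,2004:timesheet/meeting|meet",
    "%tag:foo.com,2004:shape/|shape",
    "%tag:yaml.org,2002:",
])
for raw in ["!meet|", "!shape|ellipse", "!int", "!tag:baz.com,2004:mixed/list"]:
    print(raw, "->", cook(raw, handles))
```

Because the handle is looked up and then discarded, two documents that declare different handles for the same prefix cook to identical tags, which is the sense in which the handle stays out of the Serialization and Representation models.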
From: David H. <dav...@bl...> - 2004-09-02 18:02:13
|
Clark C. Evans wrote:
> This is a fourth-pass draft incorporating ideas from Onoma, and Oren.

I support this proposal.

However, I think that some set of tag names that includes the current names in the YAML Tag Repository (such as [a-zA-Z/]+ or [a-zA-Z0-9_]+) should be "reserved" for future additions to the repository. That would avoid any concerns about additions clashing with private tags. For most languages, tags named after a fully-qualified class will automatically be unreserved (because they contain characters like '.' or ':'); otherwise, it is simple to prepend another '!', for example, to private tag names.

--
David Hopwood <dav...@bl...>
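
The reservation rule suggested here could be checked with a few lines of Python. This is only a sketch: the message offers the two character classes as alternatives, so treating a name as reserved when it matches either one is an assumption, and `is_reserved` and `make_private` are hypothetical names.

```python
import re

# The two example patterns from the message above. In this sketch a name
# that fully matches either pattern counts as reserved for the repository.
RESERVED_PATTERNS = [re.compile(r"[a-zA-Z/]+"), re.compile(r"[a-zA-Z0-9_]+")]


def is_reserved(name):
    """True if the tag name falls inside the reserved set."""
    return any(p.fullmatch(name) for p in RESERVED_PATTERNS)


def make_private(name):
    """Prepend '!' to a name that would otherwise be reserved."""
    return "!" + name if is_reserved(name) else name


for name in ["int", "sub/type", "Some::Package", "java.util.Date"]:
    print(name, is_reserved(name), make_private(name))
```

Names derived from fully-qualified classes fall outside both patterns because of the '.' or ':' characters, so they are returned unchanged.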
From: David H. <dav...@bl...> - 2004-09-02 18:56:27
|
David Hopwood wrote:
> Clark C. Evans wrote:
>
>> This is a fourth-pass draft incorporating ideas from Onoma, and Oren.
>
> I support this proposal.
>
> However, I think that some set of tag names that includes the current names
> in the YAML Tag Repository (such as [a-zA-Z/]+ or [a-zA-Z0-9_]+) should be
> "reserved" for future additions to the repository. That would avoid any
> concerns about additions clashing with private tags. For most languages,
> tags named after a fully-qualified class will automatically be unreserved
> (because they contain characters like '.' or ':'); otherwise, it is simple
> to prepend another '!', for example, to private tag names.

This comment applies also to the fifth-pass draft. (You lot don't slow down, do you? :-)

--
David Hopwood <dav...@bl...>
From: Sean O'D. <se...@ce...> - 2004-08-31 19:36:39
|
On Tuesday 31 August 2004 12:12, Clark C. Evans wrote:
> On Tue, Aug 31, 2004 at 11:39:47AM -0700, Sean O'Dell wrote:
> | taguri doesn't seem designed to locate external resources at all.
> | "domain.org,2002:type" is not a path to a local file, and there is no
> | protocol information at all that I can tell.
>
> That is _exactly_ the point. Tag names are unique identifiers,
> they are not external resources. If you use it to find an external
> resource, well, that's your option. ;)

Any YAML parser is, sooner or later, going to have to resolve the location of external schemas. I don't agree that the fact that taguris can't resolve to locations is a good thing. It seems pointless to have an identifier that can't locate what it identifies, especially when eventually there really will be external references to locate.

> | Well, consider this. One day, schemes will probably be external to
> | documents, and they may reside in various locations, such as on web
> | servers. It might be very useful to code namespace documents as URLs,
> | so you can say: "http://domain.org/2002-01-01/mytype."
>
> Far less useful than what you might imagine:
>
> a) What happens 5 years from now when you no longer own the domain?

Same thing that happens to URLs. If the external document can't be located, it's gone, not YAML's problem. If that's a real possibility for some people, they should keep their external references as local files, or as auxiliary documents in the same stream.

> b) You may have this point to a 'YASL' schema today, but what
> happens tomorrow when you change to Yippe (cuz YASL sucks)?
> Something RDDLish kinda solves this, but in a odd way.

I think YAML should provide a foundation for locating external references so the schema is just a document that describes, not locates. If you switch schemas, you just change the reference in your document to the location of the new schema. The mechanism for locating your new schema doesn't change.
External references are probably inevitable anyway, so it's not really a matter of "if" they will be supported, but "when." Why not unify and simplify the domain identifiers to provide the groundwork now?

> c) What happens if your laptop isn't on the internet?

Again, where the resource is located is up to the developer. It can be on a web site, a local file or embedded in the same stream as an auxiliary document. What happens in failure conditions isn't something I think YAML should worry about. YAML, I think, should just do what people want and let programmers worry about exceptional situations.

> d) Just because you put a YASL schema there doesn't mean that
> everyone will be so kind; thus, having it a full URL isn't that
> useful after all -- it works 'sporadically'.

Then the schema can be a local file, or embedded in the same stream with the data document as an auxiliary/header document. Where the resources are located will scale with the developer. Giant companies will have lots of external references on web sites. Little guys like you and me will probably have them as external files or as auxiliary documents in the same stream as the data document to avoid problems.

> e) What happens if you don't own a domain name; taguri allows
> you to use email addresses

I feel the identifier should not just be unique, but should locate the reference, so I don't think email addresses would be allowed at all. I know it was a neat solution to a posed problem, but the reality is that I think URIs should locate when they reference, and I don't see email as a viable transport.

> | It just seems URL is a much more flexible, forward-looking style, and
> | people will understand it much better. I don't see the advantage in
> | taguri at all.
>
> Well, nothing to stop you from writing a 'YASL' schema finder app and
> putting it on the web (or using DNS or some other lookup mechanism).
> For instance,
>
> http://yasl.yaml.org/find?tag=domain.org,2002:type
>
> would work just wonderfully. In short, you don't want to add
> distribution semantics into the mix, a unique identifier is
> all that is required.

I don't think YAML should depend on CGI programs to locate external references. There are other transports that don't support CGI, such as ftp, that could be used.

External references are in YAML's future sooner or later, and URLs work wonderfully...so why avoid them for a homebrewed system like taguri which doesn't offer any location at all and is actually, to me, sort of confusing to look at.

Sean O'Dell
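
For readers weighing the two positions in this exchange, the following Python sketch shows one way an identifier-only tag could still be resolved to a location by a separate step. Everything here is an assumption made for illustration: `SCHEMA_LOCATIONS` and `locate_schema` are hypothetical names, the file path is made up, and the finder address is simply the one from the quoted example, not a service known to exist.

```python
from urllib.parse import urlencode

# Hypothetical local registry: tag URI -> where its schema happens to live.
# The location may be a file, an http URL, an ftp URL, and so on.
SCHEMA_LOCATIONS = {
    "tag:domain.org,2002:type": "file:///etc/yaml/schemas/type.yasl",
}

# Finder address taken from the quoted message; assumed, not a live service.
FINDER = "http://yasl.yaml.org/find"


def locate_schema(tag_uri):
    """Return a location for the schema of tag_uri.

    A local registry entry wins; otherwise fall back to a finder query.
    The tag itself never changes when the schema moves.
    """
    if tag_uri in SCHEMA_LOCATIONS:
        return SCHEMA_LOCATIONS[tag_uri]
    specific = tag_uri[len("tag:"):] if tag_uri.startswith("tag:") else tag_uri
    # Keep ',' and ':' unescaped so the query matches the quoted example.
    return FINDER + "?" + urlencode({"tag": specific}, safe=",:")


print(locate_schema("tag:domain.org,2002:type"))
print(locate_schema("tag:domain.org,2002:other"))
```

In this arrangement the choice of transport lives in the registry or finder rather than in the tag, which is the separation Clark argues for; Sean's position is that the tag itself should carry that information.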
From: why t. l. s. <yam...@wh...> - 2004-08-31 19:43:46
|
T. Onoma wrote:

>Looks familiar ;)
>
>  %SPACE:where=!www.somewhere.com,2004
>
>But still no :(
>
>  %TAG:wTlog=!www.somewhere.com,2004:troublelog
>

--- %ns:yaml.org,2002
%ns:perl.yaml.org,2002:Some::Package=somepkg
- !somepkg {}

Yer idea has clash, T.

_why
From: T. O. <tra...@ru...> - 2004-08-31 20:44:29
|
On Tuesday 31 August 2004 03:43 pm, why the lucky stiff wrote:
> T. Onoma wrote:
> >Looks familiar ;)
> >
> >  %SPACE:where=!www.somewhere.com,2004
> >
> >But still no :(
> >
> >  %TAG:wTlog=!www.somewhere.com,2004:troublelog
>
> --- %ns:yaml.org,2002
> %ns:perl.yaml.org,2002:Some::Package=somepkg
> - !somepkg {}
>
> Yer idea has clash, T.

? Precedence.

T.
From: Clark C. E. <cc...@cl...> - 2004-08-31 20:53:09
|
Summary of Positions (to help

We seem to have an issue identifying what problem we are solving. Please correct me if I'm wrong.

T. Onoma
--------
- proposed a %TAG idea
- tags are invading pretty YAMLs
- a primary problem is that he mixes domains, so the cut^paste method is not helpful
- suggested to move tag prefixes into the header
- likes the idea of a schema providing type information rather than having it inline
- was open to "sub-docs" idea, but didn't like having an earlier document in the stream
- wants to have the %TAG mechanism go beyond just domains, and go all the way down to types

Why The Lucky Stiff
-------------------
- also doesn't like the cut^paste item, especially since it is difficult to determine where the 'break' should be
- overall likes the %TAG idea

Brian
-----
- does not think YAML needs any form of globally unique identifiers in the document
- would like tags to be passed on to the type resolver without any sort of trickiness

Sean O'Dell
-----------
- wants a globally unique identifier in the document, but insists it should be a URL to locate the schema
- likes having the mappings of tags to unique identifiers in a YAML document

Clark
-----
- agrees cut^paste sucks
- agrees all of the short-hands suck
- thinks that explicit tags should be optional
- wants globally unique identifiers for tags
- is willing to accept that some 'local' tags may not be globally unique (aka private tags)
- likes the %TAG mechanism for shortening tags
- any 'prefix' used to make writing tags shorter must be a syntax-only feature and not bleed into the serial or representation model
From: Sean O'D. <se...@ce...> - 2004-08-31 21:21:40
|
Minor change; I have no strong feelings about mapping identifiers to shortcut tags, but I do feel strongly about separating namespace declarations from the document and having them in separate documents.

Sean O'Dell
-----------
- wants a globally unique identifier in the document, but insists it should be a URL to locate the schema
- wants namespace data taken out of the document header and put into auxiliary documents, located in-stream or externally
From: why t. l. s. <yam...@wh...> - 2004-08-31 20:00:12
|
Sean O'Dell wrote:

>External references are in YAML's future sooner or later, and URLs work
>wonderfully...so why avoid them for a homebrewed system like taguri which
>doesn't offer any location at all and is actually, to me, sort of confusing
>to look at.
>

We used to use web URLs: (read from bottom to top)

http://sourceforge.net/mailarchive/forum.php?forum_id=1771&style=flat&viewday=22&viewmonth=200207

Tags don't need all the extra baggage of having a URL. They just need to be unique. Finding the schema for a given type isn't an emergency or anything.

_why
From: Sean O'D. <se...@ce...> - 2004-08-31 20:10:20
|
On Tuesday 31 August 2004 12:59, why the lucky stiff wrote:
> Sean O'Dell wrote:
> >External references are in YAML's future sooner or later, and URLs work
> >wonderfully...so why avoid them for a homebrewed system like taguri which
> >doesn't offer any location at all and is actually, to me, sort of
> >confusing to look at.
>
> We used to use web URLs: (read from bottom to top)
>
> http://sourceforge.net/mailarchive/forum.php?forum_id=1771&style=flat&viewday=22&viewmonth=200207
>
> Tags don't need all the extra baggage of having a URL. They just need
> to be unique. Finding the schema for a given type isn't an emergency or
> anything.

You have to be prepared for a fairly dramatic change, then, if you go with taguri now. When you need to resolve external references to load schemas, you'll be fighting taguri or inventing something homegrown that somehow makes use of taguris. I don't see why you would knowingly do that when URLs are so simple and handle both the uniqueness and location issue.

URLs do the job already. Why the fixation on taguri?

Sean O'Dell