From: Oren Ben-K. <or...@be...> - 2004-08-31 20:51:47
|
On Tuesday 31 August 2004 23:10, Sean O'Dell wrote:
> When you need to resolve external references to load schemas,
> you'll be fighting taguri or inventing something homegrown that somehow
> makes use of taguri's. I don't see why you would knowingly do that when
> URLs are so simple and handle both the uniqueness and location issue.

They do neither (well).

Problem #1: Both uniqueness and location fail when someone else buys the domain name.

Problem #2: Not everyone who uses schemas is connected to the Internet, for any of several good reasons.

Problem #3: Machine readable "schema" _will_ come in many formats. Trying to stuff them all into a single URL simply doesn't work.

> URLs do the job already. Why the fixation on taguri?

Because URLs don't do the job.

Have fun,

Oren Ben-Kiki
From: Clark C. E. <cc...@cl...> - 2004-08-31 23:36:07
|
On Wed, Sep 01, 2004 at 01:47:19AM +0300, Oren Ben-Kiki wrote:
| Clark, Brian and myself just went through a heated debate on this in
| IRC. Here's what we hope is a reasonable compromise (Brian had to
| leave so he may have further comments).

While we've had a tag system for quite some time, it has several known warts. While the cut^paste mechanism functions well for single namespaces, it fails when mixing namespaces in a document. The tag shortcuts are also quite subtle and difficult for newbies to pick up. These complications were added to support globally unique identifiers. The result is that the average user (who may not want globally unique identifiers) is burdened with complexity. A change has been coming for some time; I want to thank those who kept complaining. ;)

To complicate this problem, we seem to have a difference in vision at the top. Brian strongly feels that globally unique identifiers do not belong in a document, and is opposed to any directive. I feel that unique identifiers are essential for lots of nerdly uses. The proposed compromise "philosophy" is:

  By default, all types should be private; unless you want
  to pay for uniqueness with additional pain.

| So. Here's the proposal in a nutshell:
|
| - There are two kinds of tags.
|
|   - Tags using the format "!prefix:stuff" (where "stuff" doesn't
|     start with ':') are globally unique tags. The prefix _must_
|     be declared in a directive (see below).
|
|   - All other tags are private tags. They are not globally unique.
|     They are handled in an application specific way - any way whatsoever.

Lots of ways to write non-unique tags! Very easy!

  !int
  !foo/bar
  !Perl::Bing
  !!OldPrivate

| - The syntax of the directive is "%tag:head|prefix", such that each
|   "!prefix:tail" tag is converted to "tag:headtail" (that is, by simple
|   concatenation). The result must be a valid tag: URI. No other
|   restrictions apply. Note this is a purely syntactical operation.

Example,

  --- %tag:www.somewhere.com,2004:/trouble/|tb
  date: !date 2004-08-28   # !date
  spec: !tb:spec {}        # !tag:www.somewhere.com,2004:/trouble/spec

| - It is allowed to specify "!tag:full-tag-URI" directly

  --- !tag:foo.com,2004:/bar/baz stuff

More formally,

  declaration := '%tag:' taggingEntity ':' [ one ] '|' prefix

where 'taggingEntity' is from the tagURI specification, 'one' is an optional string of RFC2396 uric characters, and 'prefix' is your typical name production, less 'tag'.

  typetag := "!" prefix ":" two
          or "!" 'tag:' taggingEntity ':' specific
          or "!" private

where 'two' and 'specific' are one or more uric characters, and 'private' is a string of uric characters that does not match the former productions (the spec will be more formal).

The parser would then 'cook' all type tags that used the global format, using the recipe below; it will pass on all private types. Undeclared prefixes are an error.

  cooked := 'tag:' taggingEntity ':' one two
         or 'tag:' taggingEntity ':' specific
         or private

depending upon which form of type tag matched.

| Or, in other words, if you want a globally unique namespace, you pay
| for it. If you don't, you don't. Obviously you can mix and match tags
| of both types in the same document, if that makes sense for you.

Exactly. A few notes:

 a) This mechanism is far simpler than the one currently specified.

 b) The extra burden of using %tag: is placed only upon those who want globally unique identifiers.

 c) The mechanism still isn't exactly what T.Onoma wished for, that is, being able to have !bing refer to a globally-unique tag; however, he can do !foo:bar.

 d) Existing YAML document tags become private tags, except for those that use cut^paste, which become invalid.

 e) It is an intentional side-effect that all 'cooked' tags that match the URI production also match tagURI. In other words, no URLs allowed.

Thoughts?

Clark

P.S. I'm not all that happy about the | delimiter... this is still a work in progress.
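A minimal sketch of the cooking rule proposed above, in Python. It is only one reading of the productions in this post: the function names `parse_directive` and `cook` are hypothetical, tags are passed in without their leading '!', and nothing here validates the result against the tagURI specification.

```python
import re

# "!prefix:stuff" where "stuff" does not start with ':' (leading '!' stripped).
GLOBAL_TAG = re.compile(r'^(\w+):([^:].*)$')


def parse_directive(line):
    """Split a '%tag:head|prefix' directive into (prefix, head)."""
    if not line.startswith('%tag:') or '|' not in line:
        raise ValueError('not a %tag directive: ' + line)
    head, prefix = line[len('%tag:'):].rsplit('|', 1)
    return prefix, head


def cook(tag, prefixes):
    """Return the cooked form of a tag, given the declared prefixes."""
    match = GLOBAL_TAG.match(tag)
    if match is None:
        return tag                       # private tag: passed on untouched
    prefix, tail = match.groups()
    if prefix == 'tag':
        return tag                       # already a full tag: URI
    if prefix not in prefixes:
        raise KeyError('undeclared prefix: ' + prefix)
    return 'tag:' + prefixes[prefix] + tail   # simple concatenation


prefix, head = parse_directive('%tag:www.somewhere.com,2004:/trouble/|tb')
prefixes = {prefix: head}
print(cook('tb:spec', prefixes))   # tag:www.somewhere.com,2004:/trouble/spec
print(cook('date', prefixes))      # date
```

Tags such as `Perl::Bing` or `!OldPrivate` fall through the regular expression and stay private, which matches the list of non-unique tags given in the post.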
From: T. O. <tra...@ru...> - 2004-09-01 00:06:46
|
On Tuesday 31 August 2004 07:36 pm, Clark C. Evans wrote: > Thoughts? So what's !int ? -- T. |
From: Clark C. E. <cc...@cl...> - 2004-09-01 03:47:16
|
On Tue, Aug 31, 2004 at 08:06:37PM -0400, T. Onoma wrote:
| So what's !int ?

According to the current specification, it is "cooked" to become 'tag:yaml.org,2002:int' in the YAML Representation model, and according to the YAML type repository, implementations would be obligated to load such a node into an 'integer' of sorts. I feel this is a rather "reassuring" value: if you see !int in a YAML document, you know it is an integer. I think this is a good feature, I like it.

Unfortunately, as you point out, the current specification doesn't provide a nice way to mix different globally unique tags in the same document. So, we need some sort of prefix mechanism to make this use case work.

My preferred solution is to add a syntax-level %tag/prefix mechanism, making clear notes that it is a presentation issue only, and does not go into the serialization or representation models (and thus, never makes it into the parser API). I'd leave the current shortcuts alone, as I think they are a great compromise for inter-language data transfer and readability. I would deprecate the cut^paste hack.

Note that this preferred solution would just add another syntax-level trick (we got tons of them for data, why not one more for tags?). Since a !tag cannot currently look like !prefix:value, this is a no-brainer.

I'm not sure Oren would buy this; he seemed to want to get rid of the various syntax shortcuts for building a 'taguri' with the introduction of this %tag: directive.

As for Brian, I would have thought two weeks ago that he'd be more open to this "modest" improvement; after all, he's famous for wanting many ways to write the same thing. However, he's decided that tags don't need to be globally unique, and thinks all tags appearing in the document should be application (he calls this schema) specific. Needless to say, I disagree.

But you seem to be agreeing with Brian, so, given two people saying the same thing, I'm willing to give it much more thought. And this proposal is the result. It makes private tags the default (for Brian), simplifies the rules for globally unique tags (for Oren), and provides a short-cut for long tags (T.Onoma).

Now... to answer your question. According to this proposal, if it were adopted, !int would be left "uncooked"; it would be returned by the parser as 'int' and treated as a private type. According to this proposal, !tags that don't match the regular expression '^\w+:[^:]' are left uncooked, and could quite possibly conflict with someone else's tag named 'int'.

Brian would say: What the tag 'int' means depends completely on your application; you can load the value into an integer if you wish, or, alternatively, an interrupt or a chocolate bar.

With this proposal, if you want a YAML integer, you'd have to get explicit. Kinda ugly, but that's the price of having unique identifiers.

  --- %tag:yaml.org,2002:|yaml
  - !yaml:int 23
  - !yaml:str 23

or

  ---
  - !tag:yaml.org,2002:int 23
  - !tag:yaml.org,2002:str 23

It's a huge readability hit for the "built-in" types, and I think it will decrease portability of data across programming languages. It also may have some serious backward compatibility issues. Removing all of the 'cooking' from tags in existing documents, converting what an author thought was a globally unique tag into one that is private, is, well, ick.

But, alas, I'm trying to understand the problem, and forge consensus.

Cheers!

Clark
From: Clark C. E. <cc...@cl...> - 2004-09-01 15:28:40
|
It seems that there is a new requirement that a 'namespace' is provided on the top node, and then all other nodes are relative. We can add this to the ruleset:

 - Tags matching '^\w+:[^:].+' are globally unique tags. There are two versions, the long and the abbreviated form. The long form starts with 'tag' and exactly matches the tagURI specification. The abbreviated form is considered as two parts --

     prefix ':' specific

   The prefix must be declared in a directive, and specific is a string of uric characters from RFC2396. An example of an abbreviated tag is 'yaml:int'.

 - A prefix is declared using:

     '%tag:' taggingEntity ':' wibble* '|' prefix

   where taggingEntity comes from the taguri spec, and wibble is zero or more uric characters from RFC2396. The prefix is a name, matching the standard regular expression \w+ (alphanumeric plus underscore), less the long-form version, 'tag', which has special meaning.

 - Tags not matching '^\w+:[^:].+' may use characters from the uric production of RFC2396. There are two forms of these tags. If the %tag declaration appears without a prefix (the '|' prefix is optional), then all of these tags are globally unique using the taggingEntity in the declaration. Otherwise, lacking a prefix-less %tag declaration, every such tag is private.

 - If a tag is globally unique, the part of the %tag directive up to but not including the | is prepended to the specific component. So,

     %tag:yaml.org,2002:|yaml and yaml:int

   are cooked by the parser as

     tag:yaml.org,2002:int

This change (just making the '|' prefix optional) allows for the following:

  --- %tag:artml.rubyforge.org,2004:
  !data
  hello_text:
    value: !redcloth |
      h2. This is a test!
  your_name:
    value: "Put your name here."

This is simply a 'syntax shorthand' for the following:

  --- !tag:artml.rubyforge.org,2004:data
  hello_text:
    value: !tag:artml.rubyforge.org,2004:redcloth |
      h2. This is a test!
  your_name:
    value: "Put your name here."

We've got lots of ways to 'write' scalars to make them pretty; I see this as another way to 'write' tags to make them pretty.

Impacts of this proposal:

 - If you don't use the magical %tag, your !tags are reported by the parser exactly as you see them... no more magic.

 - The 'empty' prefix syntax allows for a default taguri to be provided across all tags.

 - For those that want to 'mix' two different tag prefixes, say from a transform language and a target language, one can use the prefix:tag notation.

 - Tags reported by the parser _stay_ plain-old strings, no XML-like (namespace, localpart) pairs.

 - All tags reported by the parser that look like a taguri are globally unique and can be used to find stuff.

Best,

Clark

--
Clark C. Evans Prometheus Research, LLC. http://www.prometheusresearch.com/ o office: +1.203.777.2550 ~/ , mobile: +1.203.444.0557 // (( Prometheus Research: Transforming Data Into Knowledge \\ , \/ - Research Exchange Database /\ - Survey & Assessment Technologies ` \ - Software Tools for Researchers ~ *
From: T. O. <tra...@ru...> - 2004-09-01 20:04:47
|
On Wednesday 01 September 2004 11:28 am, Clark C. Evans wrote:
> We've got lots of ways to 'write' scalars to make them pretty,
> I see this as another way to 'write' tags to make them pretty.

This? After the last post? Somehow things seem to have gotten even more complex.

Oh why did I bring it up! ;) Just kidding... I am glad I did. It's just that my first take on it was a simple attempt to improve the "mixed domain" scenario. But listening to Brian really got me thinking: If one has a mixed case scenario, doesn't that constitute the proper need for a new domain space, merging them into one proper domain space? That might seem like overkill for just two or three domains, but if it's not too much bother to actually do, what's the big deal? Take this simple example:

  --- !artml.rubyforge.org,2004/^data
     hello_text:
         value: !^redcloth |
            h2. This is a test!
     your_name:
         value: "Put your name here."

I'm already using three domains:

  - artml.rubyforge.org,2004
  - hobix.com,2004
  - yaml.org,2002

The thing about artml.rubyforge.org,2004 is that it inherits the other two and adds some tags of its own. I can foresee programming in Ruby and registering this "space" and including the others into it. Much like one includes a module into a class. I'll have to pay attention to name clashes. When there is a problem I'll have to alias or, if I wish, use a subdomain. Why do this? B/c that's the mechanism we currently have for doing type schemas. And except for the inheritance part, that's how it already works anyway.

So the other opinion here is that we should allow domain mixing within the doc itself. And then, of course, we must fix the broken ^ shortcuts because they don't work well with mixing. So we need prefixes, and this means we need a %tag directive, and so on ... (Despite my initial feelings) This is getting ugly, and unwieldy -- it's multiple inheritance vs. single inheritance and (IMHO) MI is simply improper for yaml.

That's my strong opinion. And I think it should be carefully considered.

But if we must compromise, then I want something that's super clean. You know, the "Yaml Way". So, let's try this. Why not go with the flow and insert the prefix on the fly just like the ^:

  --- !artml=artml.rubyforge.org,2004/^data
     hello_text:
         value: !artml^redcloth |
            h2. This is a test!
     your_name:
         value: "Put your name here."

Or better yet:

  --- ![artml].rubyforge.org,2004/^data
     hello_text:
         value: !artml^redcloth |
            h2. This is a test!
     your_name:
         value: "Put your name here."

Or something like that. So whenever a new domain comes up for the first time, you can just prefix it in place.

That'll keep the doc light. Allow for a MI prefix model -or- a SI unified GUID model (depending on your preference). Allow Clark and Oren to do their thing. Allow Brian and I to do our thing. No need for any of the other tag crapola.

Of course, my only question now is: What is !int ? ;)

T.
From: Clark C. E. <cc...@cl...> - 2004-09-01 20:28:38
|
On Wed, Sep 01, 2004 at 04:04:36PM -0400, T. Onoma wrote:
| On Wednesday 01 September 2004 11:28 am, Clark C. Evans wrote:
| > We've got lots of ways to 'write' scalars to make them pretty,
| > I see this as another way to 'write' tags to make them pretty.
...
| --- !artml=artml.rubyforge.org,2004/^data
|    hello_text:
|        value: !artml^redcloth |
|           h2. This is a test!
|    your_name:
|        value: "Put your name here."
|
| Or better yet:
|
| --- ![artml].rubyforge.org,2004/^data
|    hello_text:
|        value: !artml^redcloth |
|           h2. This is a test!
|    your_name:
|        value: "Put your name here."
|
| Or something like that. So when ever a new domain comes up for the
| first time, you can just prefix it inplace.

I like it, but neither syntax you propose is backward compatible. The characters we can use for finding a syntax are from RFC2396:

  unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"

So... options...

  --- !{handle}artml.rubyforge.org,2004/^data
  hello_text:
    value: !{handle}redcloth

  --- !handle|artml.rubyforge.org,2004/^data
  hello_text:
    value: !handle|redcloth

Hmm. This will take some playing to find something pleasing that is also functional. Is this a good road to explore?

| What is !int ?

This becomes a separate issue. We can do two things:

 - allow a taguri to appear as a tag directly (not all URIs are currently possible with the current productions)

 - let all other tags that don't look like URIs be 'private' types subject to local scoping, etc.

Or, we can leave the allowable tags and their resolution to globally unique tagURI as-is. I almost prefer that.

Clark
From: Oren Ben-K. <or...@be...> - 2004-09-01 20:54:02
|
On Wednesday 01 September 2004 23:28, Clark C. Evans wrote:
> On Wed, Sep 01, 2004 at 04:04:36PM -0400, T. Onoma wrote:
> | Or better yet:
> |
> | --- ![artml].rubyforge.org,2004/^data
> |    hello_text:
> |        value: !artml^redcloth |
> |           h2. This is a test!
> |    your_name:
> |        value: "Put your name here."
> |
> | Or something like that. So when ever a new domain comes up for the
> | first time, you can just prefix it inplace.
>
> I like it, but neither syntax you propose is backward compatible.

Another problem is that if your mixing case works like this:

  ---
  seq:
    - !<namespace1>/tag stuff
    - !<namespace2>/tag stuff
  map:
    field1: !<namespace1>/tag stuff
    field2: !<namespace2>/tag stuff
  ...

There's no place you can put your prefixes so they can be reused. You still have to specify the full globally unique tag at the start of each micro-island. The %tag proposal allows each potentially long taguri to be written "once and only once" at the start of the document.

> --- !handle|artml.rubyforge.org,2004/^data
> hello_text:
>   value: !handle|redcloth

That's the least eye-soring of the bunch.

> | What is !int ?
>
> Or, we can leave the allowable tags and their resolution to
> globally unique tagURI as-is. I almost prefer that.

Right. But still, it doesn't really solve the problem. See http://yaml.kwiki.org/index.cgi?GraphicalTimesheet for a more concrete example.

Have fun,

Oren Ben-Kiki
From: Clark C. E. <cc...@cl...> - 2004-09-01 21:06:49
|
On Wed, Sep 01, 2004 at 11:53:54PM +0300, Oren Ben-Kiki wrote:
| Right. But still, it doesn't really solve the problem. See
| http://yaml.kwiki.org/index.cgi?GraphicalTimesheet for a more concrete
| example.

I added an example of this mechanism there; here it is, copied to the list. You use ^.*|handle to 'memorise' a handle, and you just use handle| to invoke it.

  --- !baz.com,2004/mixed/list
  - event: !bar.com,2004/timesheet/meeting^|meet
      where: office
      time: 2004-09-09 10:00:00
      duration: !int 1:00
      text: boring
    shape: !foo.com,2004/shape/^ellipse|shape
      width: !float 10
      height: 5
  - event: !meet|
      where: office
      time: 2004-09-09 10:00:00
      duration: !int 1:00
      text: boring
    shape: !shape|square
      width: !float 10
      height: 5
  ...

Note, this specific proposal doesn't change the tag shortcuts in the existing spec; that is a different debate entirely.

Clark
From: Oren Ben-K. <or...@be...> - 2004-09-01 21:27:36
|
On Thursday 02 September 2004 00:06, Clark C. Evans wrote:
> --- !baz.com,2004/mixed/list
> - event: !bar.com,2004/timesheet/meeting^|meet
>     where: office
>     time: 2004-09-09 10:00:00
>     duration: !int 1:00
>     text: boring
>   shape: !foo.com,2004/shape/^ellipse|shape
>     width: !float 10
>     height: 5
> - event: !meet|

Ah. So prefixes survive all the way to the end of the document, instead of being restricted to descendants of the node (like today). Yes, that solves the problem of multiple islands.

> Note, this specific proposal doesn't change the tag shortcuts in
> the existing spec; that is a different debate entirely.

Right. I assume it replaces the current prefixing mechanism, however.

Have fun,

Oren Ben-Kiki
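The inline 'memorise' mechanism discussed above can be sketched roughly as follows. This is only a guess at the intended behaviour, pieced together from the one example in the thread: the class name `HandleTable` and its `resolve` method are hypothetical, tags are given without their leading '!', and the existing tag shortcuts are deliberately left alone, so the results are still shorthand tags rather than full tag: URIs.

```python
class HandleTable:
    """Handles memorised so far; they live until the end of the document."""

    def __init__(self):
        self.prefixes = {}

    def resolve(self, tag):
        if '^' in tag and '|' in tag:
            # 'prefix^suffix|handle' memorises prefix under handle.
            body, handle = tag.rsplit('|', 1)
            prefix, suffix = body.split('^', 1)
            self.prefixes[handle] = prefix
            return prefix + suffix
        if '|' in tag:
            # 'handle|suffix' invokes a previously memorised handle.
            handle, suffix = tag.split('|', 1)
            if handle not in self.prefixes:
                raise KeyError('handle used before it was memorised: ' + handle)
            return self.prefixes[handle] + suffix
        return tag  # ordinary tag, untouched by this mechanism


table = HandleTable()
for tag in ['bar.com,2004/timesheet/meeting^|meet',
            'foo.com,2004/shape/^ellipse|shape',
            'meet|',
            'shape|square']:
    print(tag, '->', table.resolve(tag))
```

Because the table is filled in document order, a handle can be reused in any later node, not only in descendants of the node that declared it, which is the property Oren points out above.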
From: Clark C. E. <cc...@cl...> - 2004-09-02 16:43:39
|
summary:

  This is a fifth-pass draft, incorporating an idea from Onoma (well, it was sort of his idea anyway): lacking a %tag directive, tag:yaml.org,2002: is assumed. This helps eliminate a class of backward compatibility problems.

  One method of tag globalization is to use 'private' tags in your YAML document, and use a transformation of sorts (either explicit, or implicit by the application) to convert one's tags to a globally unique variety. This method is perfect for small teams where interoperability isn't a huge problem, and who do not wish to pay the price of mixing and matching globalized tags.

  The other method is an XML-namespace-like mechanism where a tagURI can be broken into chunks. The first (longer) half of the tag, containing the taggingEntity, is moved up into the declaration and given a handle. The second (shorter) half is then used within each tag together with the handle that links it to the longer half. The combining of the parts is done by the parser, so the application always sees full tagURIs.

syntax:

  - We open up the tag mechanism !tag to allow one or more characters from the uric production of RFC2396. Thus, one can use %XX where X is a hex character, plus any combination of the following characters:

      ; / ? : @ & = + $ , - _ . ! ~ * ' ( ) # A-Z a-z 0-9

    In particular, characters which may _not_ appear in a !tag are marked as 'unwise' in RFC2396, including:

      { } | \ ^ [ ] `

    These characters will provide an 'escape hatch' for current and future extensions to YAML. With this change, any URI can be directly used as a !tag.

  - We introduce a new directive 'tag' which provides a way to shorten the data entry of tagURIs. In particular,

      declaration := "%tag:" taggingEntity ":" spec_first [ "|" handle ]

    Where 'taggingEntity' refers to the same production in the tagURI specification. The taggingEntity refers to either a domain or email address followed by the minting date; see the tagURI specification for details. The 'spec_first' refers to zero or more uric characters (it is optional). The 'handle' refers to a sequence of one or more word characters [a-zA-Z0-9_]. Optionally the '|' and handle can be missing; in this case the handle is considered to be the empty string ''. In a YAML document, each handle must be unique via string comparison.

  - We extend the !tag mechanism to allow a single '|' character, which is in the reserved characters above. The syntax for this special case is,

      taguri := '!' handle '|' spec_second

    In this circumstance, the 'handle' _must_ appear as a handle in one of the document's directives. The 'spec_second' is zero or more uric characters, with the restriction that either spec_first or spec_second (or both) must be at least one character.

semantics:

  - For every special tag having a '|', the parser will do special cooking to join the information specified in the declaration together with the node's tag; such nodes will be treated as if they had been tagged,

      cooked := "!tag:" taggingEntity ":" spec_first spec_second

    Note that the 'handle' is not included in this information; it is considered a detail of the Presentation model, and should not occur in tools that comply with the Serialization or Representation models. Thus, the 'handle' is _not_ part of the core YAML information model; it is merely a syntax-level trick to ease the burden of typing and human reading. Also note that while other URI schemes may appear in a tag, this cooking mechanism purposefully constructs tagURIs; that is, globally unique identifiers lacking protocol or access semantics.

  - If the document has a directive with an empty handle, then all other tags are cooked according to the rule above, using the taggingEntity and spec_first from the directive with the empty handle.

  - If the document does _not_ have a directive with an empty handle, then it is implicitly given one, 'tag:yaml.org,2002:'. This provides backwards compatibility with previous versions of the YAML specification, and furthermore, it allows for types which are common across various programming languages to be easily used.

  - If the !tag starts with 'tag:' then it is validated against the tagURI specification and passed through uncooked.

design:

  - We are using the directive syntax because it gives a clear indication that 'magic' is about to happen. Also, it localizes all of the declarations up-front. This version of cooking is much simpler than the previous specification.

  - The "|" character was chosen because it is not included in RFC2396's uric production (aka taguri's specific); other possibilities include "^", "\" and "`".

  - We use the tagURI specification (http://taguri.org) to define the unique URIs. This follows previous versions of the YAML spec. The tagURI is used because it does not imply access semantics and defines an easily 'mintable' unique identifier.

  - By making YAML's type repository the default namespace, common cross-language tags, or sets which have been registered, are easily accessible. In effect, it provides an IANA-like mechanism for good things, while allowing namespaces to override this built-in behavior.

compatibility:

  - The removal of cut^paste could cause problems with files that were created by hand. Not much one can do here but fix them by hand, or use an older parser to load, and then re-emit. Since emitting using cut^paste was not common, this is deemed to be a smaller problem. PyYAML didn't even implement cut^paste. It is recommended that parsers support this old behavior for a while to help users migrate.

  - The simplified !tag cooking rules give identical results for most tags, except those that have a domain, where

      !clarkevans.com,2003/bing => tag:clarkevans.com,2003:bing

    was the old behavior, and the following is the new behavior,

      !clarkevans.com,2003/bing => tag:yaml.org,2002:clarkevans.com,2003/bing

    In practice there is very little difference, although this usage will be deprecated.

example:

  The following document,

    --- %tag:bar.com,2004:timesheet/meeting|meet
        %tag:foo.com,2004:shape/|shape
        %tag:yaml.org,2002:
    !tag:baz.com,2004:mixed/list
    - event: !meet|
        where: office
        time: 2004-09-09 10:00:00
        duration: !int 1:00
        text: boring
      shape: !shape|ellipse
        width: !float 10
        height: 5
    - event: !meet|
        where: office
        time: 2004-09-09 10:00:00
        duration: !int 1:00
        text: boring
      shape: !shape|rectangle
        width: !float 10
        height: 5
    ...

  would differ in the Presentation Model, but would be identical in the Serialization and Representation models with,

    --- !tag:baz.com,2004:mixed/list
    - event: !tag:bar.com,2004:timesheet/meeting
        where: office
        time: 2004-09-09 10:00:00
        duration: !tag:yaml.org,2002:int 1:00
        text: boring
      shape: !tag:foo.com,2004:shape/ellipse
        width: !tag:yaml.org,2002:float 10
        height: 5
    - event: !tag:bar.com,2004:timesheet/meeting
        where: office
        time: 2004-09-09 10:00:00
        duration: !tag:yaml.org,2002:int 1:00
        text: boring
      shape: !tag:foo.com,2004:shape/rectangle
        width: !tag:yaml.org,2002:float 10
        height: 5
    ...

  Furthermore,

    ---
    - !int 23                    # tag:yaml.org,2002:int
    - !!old-private              # tag:yaml.org,2002:!old-private
    - !perl/Foo::Bar             # tag:yaml.org,2002:perl/Foo::Bar
    - !python/tuple              # tag:yaml.org,2002:python/tuple
    - !htsql.org,2004:request    # tag:yaml.org,2002:htsql.org,2004:request

    --- %tag:d9...@ho...,2004-09-02:
    - !int 23                    # tag:d9...@ho...,2004-09-02:int
    - !bingles wozer             # tag:d9...@ho...,2004-09-02:bingles
    - !zoom/bing                 # tag:d9...@ho...,2004-09-02:zoom/bing

    --- %tag:d9...@ho...,2004-09-02:
        %tag:yaml.org,2002:|yaml
    - !yaml|int 23               # tag:yaml.org,2002:int
    - !bingles wozer             # tag:d9...@ho...,2004-09-02:bingles
    - !zoom/bing                 # tag:d9...@ho...,2004-09-02:zoom/bing

    --- %tag:d9...@ho...,2004-09-02:|me
    - !int 23                    # tag:yaml.org,2002:int
    - !python/tuple              # tag:yaml.org,2002:python/tuple
    - !me|bingles wozer          # tag:d9...@ho...,2004-09-02:bingles
    - !me|zoom/bing              # tag:d9...@ho...,2004-09-02:zoom/bing

Best,

Clark
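A minimal sketch of the fifth-pass cooking semantics, for readers who want to trace the examples by hand. It assumes the handles have already been collected into a dictionary mapping each handle to its "taggingEntity:spec_first" string; the names `cook`, `with_default` and `YAML_DEFAULT` are hypothetical, tags are given without the leading '!', and the tagURI validation mentioned in the draft is not attempted.

```python
YAML_DEFAULT = 'yaml.org,2002:'   # implicit empty-handle prefix in this draft


def with_default(handles):
    """Add the implicit yaml.org default when no empty handle was declared."""
    result = dict(handles)
    result.setdefault('', YAML_DEFAULT)
    return result


def cook(tag, handles):
    """Cook one tag against the declared handles (fifth-pass rules)."""
    handles = with_default(handles)
    if tag.startswith('tag:'):
        return tag                                   # full tagURI, passed through
    handle, sep, spec_second = tag.partition('|')
    if sep:
        if handle not in handles:
            raise KeyError('undeclared handle: ' + handle)
        return 'tag:' + handles[handle] + spec_second
    return 'tag:' + handles[''] + tag                # default prefix applies


handles = {'meet': 'bar.com,2004:timesheet/meeting',
           'shape': 'foo.com,2004:shape/'}
for tag in ['meet|', 'shape|ellipse', 'int', 'perl/Foo::Bar',
            'tag:baz.com,2004:mixed/list']:
    print(tag, '->', cook(tag, handles))
```

The point of `with_default` is the backward-compatibility trick of this draft: a document with no empty-handle directive still has `!int` cooked to `tag:yaml.org,2002:int`.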
From: Oren Ben-K. <or...@be...> - 2004-09-02 17:27:38
|
On Thursday 02 September 2004 19:43, Clark C. Evans wrote: > summary: > > This is a fifth-pass draft... Nice! I love the way you worked backward compatibility in. I like this one! Have fun, Oren Ben-Kiki |
From: Clark C. E. <cc...@cl...> - 2004-09-02 17:45:44
|
On Thu, Sep 02, 2004 at 08:27:29PM +0300, Oren Ben-Kiki wrote:
| > summary:
| > This is a fifth-pass draft...
|
| Nice! I love the way you worked backward compatibility in.
| I like this one!

I do as well; but it violates Brian's idea that !tags need not be globally unique, and my earlier position that they need only be globally unique iff there is a corresponding %tag. I told him that it isn't fair to prevent people from using global tags; on the other side of the coin, it probably isn't fair to force them to use global tags either.

Probably the 'default' behavior should be to make all tags 'private' unless there is a %tag present without a handle. The behavior in the fifth pass could be a parser flag for backwards-compatibility?

Although, I like the idea of seeing

  ---
  - !int 23

and _knowing_ that this is a cross-language integer.

Cheers,

Clark
From: T. O. <tra...@ru...> - 2004-09-02 17:36:34
|
On Thursday 02 September 2004 12:43 pm, Clark C. Evans wrote:
> We extend the !tag mechanism to allow a single '|' character,
>     which is in the reserved characters above, the syntax for this
>     special case is,
>
>        taguri := '!' handle '|' spec_second
>
>     In this circumstance, the 'handle' _must_ appear as a handle in one
>     of the document's directives.  The 'spec_second', is zero or more
>     uric characters; with the restriction that either spec_first or
>     spec_second (or both) must be at least one character.

This worries me because

  !meet|

looks an awful lot like

  !meet |

not good.

T.
From: Clark C. E. <cc...@cl...> - 2004-09-02 21:11:36
|
summary:

  This is the sixth-pass draft, based on the fourth-pass. This pass primarily incorporates Brian's feedback, and feedback from T.Onoma. This draft changes two syntax items:

    - %tag is changed to %TAG
    - the | separator is changed to ^

  #
  # note: the switch from | to ^ suggested by T.Onoma has not been approved
  #       by Oren and Brian, but it helps avoid syntax errors caused by
  #       a spurious space, eg, !bar | vs !bar|
  #

  One method of tag globalization is to use 'private' tags in your YAML document, and use a transformation of sorts (either explicit, or implicit by the application) to convert one's tags to a globally unique variety. This method is perfect for small teams where interoperability isn't a huge problem, and who do not wish to pay the price of mixing and matching globalized tags.

  The other method is an XML-namespace-like mechanism where a tagURI can be broken into chunks. The first (longer) half of the tag, containing the taggingEntity, is moved up into the declaration and given a handle. The second (shorter) half is then used within each tag together with the handle that links it to the longer half. The combining of the parts is done by the parser, so the application always sees full tagURIs.

  This proposal makes both methods available to YAML users so that the first class of people, who do not require globally unique tags, need not be burdened by them.

syntax:

  - We open up the tag mechanism !tag to allow one or more characters from the uric production of RFC2396. Thus, one can use %XX where X is a hex character, plus any combination of the following characters:

      ; / ? : @ & = + $ , - _ . ! ~ * ' ( ) # A-Z a-z 0-9

    In particular, characters which may _not_ appear in a !tag are marked as 'unwise' in RFC2396, including:

      { } | \ ^ [ ] `

    These characters will provide an 'escape hatch' for current and future extensions to YAML. With this change, any URI can be directly used as a !tag.

    We really can't use {} or [] since they signify mappings and lists. The \ character is used for escaping, we use | to signify block scalars, and the backtick looks too much like the single quote to be useful. This leaves the ^ delimiter, which was already used for the older cut^paste mechanism.

  - We introduce a new directive 'TAG' which provides a way to shorten the data entry of tagURIs. In particular,

      declaration := "%TAG:" taggingEntity ":" spec_first [ "^" handle ]

    Where 'taggingEntity' refers to the same production in the tagURI specification. The taggingEntity refers to either a domain or email address followed by the minting date; see the tagURI specification for details. The 'spec_first' refers to zero or more uric characters (it is optional). The 'handle' refers to a sequence of one or more word characters [a-zA-Z0-9_]. Optionally the '^' and handle can be missing; this case is called the 'default prefix' and the handle is considered to be the empty string ''. In a YAML document, each handle must be unique via string comparison.

  - We extend the !tag mechanism to allow a single '^' character, which is in the reserved characters above. The syntax for this special case is,

      taguri := '!' handle '^' spec_second

    In this circumstance, the 'handle' _must_ appear as a handle in one of the document's directives. The 'spec_second' is zero or more uric characters, with the restriction that either spec_first or spec_second (or both) must be at least one character.

semantics:

  - For every special tag having a '^', the parser will do special cooking to join the information specified in the declaration together with the node's tag; such nodes will be treated as if they had been tagged,

      cooked := "!tag:" taggingEntity ":" spec_first spec_second

    Note that the 'handle' is not included in this information; it is considered a detail of the Presentation model, and should not occur in tools that comply with the Serialization or Representation models. Thus, the 'handle' is _not_ part of the core YAML information model; it is merely a syntax-level trick to ease the burden of typing and human reading. Also note that while other URI schemes may appear in a tag, this cooking mechanism purposefully constructs tagURIs; that is, globally unique identifiers lacking protocol or access semantics.

  - Tags not containing '^' and matching '^\w+:' are considered URIs and passed through as-is. Therefore 'tag:' and 'http:' URIs are unaffected by default prefixing.

  - If the document has a default prefix (a directive with an empty handle), then all other tags are cooked according to the rule above, using the taggingEntity and the spec_first from the directive with the empty handle.

  - Without a default prefix, all tags not containing '^' are passed through uncooked.

design:

  - We are using the directive syntax because it gives a clear indication that 'magic' is about to happen. Also, it localizes all of the declarations up-front. By using a directive, we set the precedent that other directive mechanisms may be added for other 'magical' needs if they show as much rationale as this one. This also allows us to easily identify which documents depend upon this magic.

  - This change makes private, uncooked tags the default, removing a ton of 'magic' from the average use cases; this should make YAML easier to grok and configure.

  - The "^" character was chosen because it is not included in RFC2396's uric production (aka taguri's specific), and it doesn't look like any of our other indicators. This character _was_ used for the previous cut^paste mechanism, but that mechanism is deprecated.

  - We use the tagURI specification (http://taguri.org) to define the unique URIs. This follows previous versions of the YAML spec. The tagURI is used because it does not imply access semantics and defines an easily 'mint-able' unique identifier.

  - We purposefully named the directive TAG since it corresponds to the tagURI. If at a later date and time we decide on another mechanism, say one based on HTTP schema access, we can add that directive independently and, if appropriate, phase out this one.

compatibility:

  - This proposal uses the same ^ character as the older cut^paste mechanism. This older syntax trick is not compatible with this proposal, and is deprecated. During the transition period, we recommend that parsers keep the old cut^paste logic, with an appropriate warning, unless there is a %TAG directive; in that case, the usage above is implied.

  - The magical cooking rules in the core specification are also deprecated with this specification. Since the current version of the specification does not allow tags matching '^\w+:' or the %TAG directive, either of these can be used to identify newer style YAML documents. When read with these semantics, all old-style tags become private types, as they were literally typed. It is recommended that an exception be thrown when a tag is not found, and that a command line option be given to provide a set of type handlers that meets the requirements of the old resolution mechanism for "YAML Tags". For PyYAML, it should probably continue to use !Class for its private tags in the short run, and thus continue to serialize as !!Class -- I'm not sure how to handle Syck, this is a big discussion.

example:

  The following document,

    --- %TAG:bar.com,2004:timesheet/meeting^meet
        %TAG:foo.com,2004:shape/^shape
        %TAG:yaml.org,2002:
    !tag:baz.com,2004:mixed/list
    - event: !meet^
        where: office
        time: 2004-09-09 10:00:00
        duration: !int 1:00
        text: boring
      shape: !shape^ellipse
        width: !float 10
        height: 5
    - event: !meet^
        where: office
        time: 2004-09-09 10:00:00
        duration: !int 1:00
        text: boring
      shape: !shape^rectangle
        width: !float 10
        height: 5
    ...
would differ in the Presentation Model, but would be identical in the Serialization and Representation model with, --- !tag:baz.com,2004:mixed/list - event: !tag:bar.com,2004:timesheet/meeting where: office time: 2004-09-09 10:00:00 duration: !tag:yaml.org,2002:int 1:00 text: boring shape: !tag:foo.com,2004:shape/ellipse width: !tag:yaml.org,2002:float 10 height: 5 - event: !tag:bar.com,2004:timesheet/meeting where: office time: 2004-09-09 10:00:00 duration: !tag:yaml.org,2002:int 1:00 text: boring shape: !tag:foo.com,2004:shape/rectangle width: !tag:yaml.org,2002:float 10 height: 5 ... Furthermore, --- - !int 23 # int - !!old-private # !old-private - !perl/Foo::Bar # perl/Foo::Bar - !python/tuple # python/tuple - !htsql.org,2004/request # htsql.org,2004/request --- %TAG:yaml.org,2002: - !int 23 # tag:yaml.org,2002:int - !!old-private # tag:yaml.org,2002:!old-private - !perl/Foo::Bar # tag:yaml.org,2002:perl/Foo::Bar - !python/tuple # tag:yaml.org,2002:python/tuple - !htsql.org,2004:request # tag:yaml.org,2002:htsql.org,2004:request --- %TAG:d9...@ho...,2004-09-02: - !int 23 # tag:d9...@ho...,2004-09-02:int - !bingles wozer # tag:d9...@ho...,2004-09-02:bingles - !zoom/bing # tag:d9...@ho...,2004-09-02:zoom/bing --- %TAG:d9...@ho...,2004-09-02: %TAG:yaml.org,2002:yaml - !yaml^int 23 # tag:yaml.org,2002:int - !bingles wozer # tag:d9...@ho...,2004-09-02:bingles - !zoom/bing # tag:d9...@ho...,2004-09-02:zoom/bing --- %TAG:d9...@ho...,2004-09-02:me - !int 23 # int - !python/tuple # python/tuple - !me^bingles wozer # tag:d9...@ho...,2004-09-02:bingles - !me^zoom/bing # tag:d9...@ho...,2004-09-02:zoom/bing Best, Clark |
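As a rough illustration of the cooking rules in the draft above, here is a minimal Python sketch. It is not taken from any YAML implementation. The function names (parse_directive, build_table, cook) are hypothetical, tags are passed in without their leading '!', and the "already a URI" test is assumed to be the pattern '^\w+:' used in the compatibility section. The restriction that spec_first and spec_second cannot both be empty is not enforced here.

```python
import re

HANDLE_RE = re.compile(r"^[A-Za-z0-9_]+$")  # 'handle': one or more word characters
URI_RE = re.compile(r"^\w+:")               # assumed test for "already a URI"


def parse_directive(directive):
    """Split '%TAG:taggingEntity:spec_first[^handle]' into (handle, prefix)."""
    if not directive.startswith("%TAG:"):
        raise ValueError("not a %TAG directive: " + directive)
    body, caret, handle = directive[len("%TAG:"):].partition("^")
    if caret and not HANDLE_RE.match(handle):
        raise ValueError("invalid handle: " + repr(handle))
    if ":" not in body:
        raise ValueError("expected ':' after taggingEntity: " + directive)
    return handle, "tag:" + body


def build_table(directives):
    """Map each handle ('' is the default prefix) to its tag: URI prefix."""
    table = {}
    for directive in directives:
        handle, prefix = parse_directive(directive)
        if handle in table:
            raise ValueError("duplicate handle: " + repr(handle))
        table[handle] = prefix
    return table


def cook(tag, table):
    """Resolve a tag, given without its leading '!', against the table."""
    if "^" in tag:
        handle, _, spec_second = tag.partition("^")
        if not handle or handle not in table:
            raise KeyError("undeclared handle: " + repr(handle))
        return table[handle] + spec_second
    if URI_RE.match(tag):
        return tag                # tag: or http: URIs pass through as-is
    if "" in table:
        return table[""] + tag    # default prefix applies to all other tags
    return tag                    # private tag, passed through uncooked


table = build_table([
    "%TAG:bar.com,2004:timesheet/meeting^meet",
    "%TAG:foo.com,2004:shape/^shape",
    "%TAG:yaml.org,2002:",
])
for t in ["meet^", "shape^ellipse", "int", "tag:baz.com,2004:mixed/list"]:
    print("!" + t, "->", cook(t, table))
```

The handle only ever lives in the lookup table, which mirrors the draft's point that it belongs to the Presentation model: the value returned by cook() never contains it.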
From: Clark C. E. <cc...@cl...> - 2004-09-02 21:31:08
|
We are going to let this sit for a week or so, for a few reasons:

  (a) so we can get some real work done to pay the bills
  (b) to get some sleep
  (c) to give you all time to put together more collected
      counter-proposals, state your approvals or objections; and,
  (d) to give Brian a chance to change his mind...

So, with that, I'm headed to sleep.

Clark

On Thu, Sep 02, 2004 at 05:11:28PM -0400, Clark C. Evans wrote:
| [snip]

-- 
Clark C. Evans
Prometheus Research, LLC.
http://www.prometheusresearch.com/
|
From: T. O. <tra...@ru...> - 2004-09-02 23:39:29
|
On Thursday 02 September 2004 05:11 pm, Clark C. Evans wrote:
> summary:
>
> This is the sixth-pass draft, based on the fourth-pass. This pass
> primarily incorporates Brian's feedback, and feedback from T.Onoma.
> This draft changes two syntax items:
>
> - %tag is changed to %TAG
>
> - the | separator is changed to ^
>
> #
> # note: switch from | to ^ suggested by T.Onoma has not been approved
> # by Oren and Brian, but it helps avoid syntax errors caused by
> # a spurious space, e.g., !bar | vs !bar|
> #
>
> One method of tag globalization is to use 'private' tags in your YAML
> document, and use a transformation of sorts (either explicit, or
> implicit by the application) to convert one's tags to a globally
> unique variety. This method is perfect for small teams where
> interoperability isn't a huge problem, and who do not wish to pay the
> price of mixing and matching globalized tags.

Not a big deal but I think this approach has broader application than that
ascribed to it.

> The other method is an XML-namespace-like mechanism where a tagURI
> can be broken into chunks: the first (longer) half of the tag,
> containing the taggingEntity, is moved up into the declaration and
> given a handle. The second (shorter) half is then used within each
> tag, together with the handle that links it to the longer half.
> The combining of the parts is done by the parser, so the application
> always sees full tagURIs.
>
> This proposal makes both methods available to YAML users so that
> the first class of people, who do not require globally unique
> tags, need not be burdened by them.
>
> syntax:
>
> - We open up the tag mechanism !tag to allow one or more characters
> from the uric production of RFC2396. Thus, one can use %XX where X
> is a hex character, plus any combination of the following characters:
>
> ; / ? : @ & = + $ , - _ . ! ~ * ' ( ) # A-Z a-z 0-9
>
> In particular, characters which may _not_ appear in a !tag are
> marked as 'unwise' in RFC2396, including:
>
> { } | \ ^ [ ] `
>
> These characters will provide an 'escape hatch' for current and
> future extensions to YAML. With this change, any URI can be
> directly used as a !tag. We really can't use {} or [] since they
> signify mappings and lists. The \ character is used for escaping,
> we use | to signify a block, and the backtick looks too much
> like the single quote to be useful. This leaves the ^ delimiter,
> which was already used for the older cut^paste mechanism.
>
> - We introduce a new directive 'tag' which provides a way
> to shorten the data entry of tagURIs. In particular,

- We introduce a new directive 'TAG' which provides a way

> declaration := "%TAG:" taggingEntity ":" spec_first [ "^" handle ]
>
> Where 'taggingEntity' refers to the same production in the tagURI
> specification. The taggingEntity refers to either a domain or email
> address followed by the minting date; see tagURI specification for
> details. The 'spec_first' refers to zero or more uric characters
> (it is optional).
>
> The 'handle' refers to a sequence of one or more word characters
> [a-zA-Z0-9_]. Optionally the '^' and handle can be missing; this
> case is called the 'default prefix' and the handle is considered
> to be the empty string ''. In a YAML document, each handle
> must be unique via string comparison.

Consider making the default prefix a separate directive, rather than a syntax
variation on %TAG (%NS for example). This would more clearly distinguish the
two approaches, i.e. #1 has just an %NS, #2 has %TAG with optional %NS.

Something to consider about this: %TAG is just "trickery", it "cooks" the tag
throughout the doc, but %NS is not _just_ trickery -- that info needs to be
passed up into the loader.

> - We extend the !tag mechanism to allow a single '`' character,
> which is in the reserved characters above; the syntax for this
> special case is,

A '`' character? Backtick? Confused about that.

> taguri := '!' handle '^' spec_second
>
> In this circumstance, the 'handle' _must_ appear as a handle in one
> of the document's directives. The 'spec_second' is zero or more
> uric characters, with the restriction that either spec_first or
> spec_second (or both) must be at least one character.
>
> semantics:
>
> - For every special tag having a '^', the parser will do special
> cooking to join the information specified in the declaration
> together with the node's tag; such nodes will be treated as if
> they had been tagged,
>
> cooked := "!tag:" taggingEntity ":" spec_first spec_second
>
> Note that the 'handle' is not included in this information; it is
> considered a detail of the Presentation model, and should not occur
> in tools that comply with the Serialization or Representation
> models. Thus, the 'handle' is _not_ part of the core YAML
> information model; it is merely a syntax-level trick to ease the
> burden of typing and human reading.
>
> Also note while other URI schemes may appear in a tag, this cooking
> mechanism purposefully constructs tagURIs; that is, globally unique
> identifiers lacking protocol or access semantics.
>
> - Tags not containing '`' and matching '^\w+', are considered URIs
> and passed through as-is. Therefore 'tag:' and 'http:' URIs
> are unaffected by default prefixing.

Were you playing with the idea of using backtick?

> - If the document has a default prefix (a directive with an empty
> handle), then all other tags are cooked according to the rule above,
> using the taggingEntity and the spec_first from the directive with
> the empty handle.
>
> - Without a default prefix, all nodes not containing '^' are
> passed-through uncooked.
>
> [snip]
>
> - The magical cooking rules in the core specification are also
> deprecated with this specification. Since the current version
> of the specification does not allow tags matching '^\w+:' and
> the %TAG parameter, either of these can be used to identify newer
> style YAML documents. When read with these semantics, all old-style
> tags become private types, as they were literally typed.
>
> It is recommended that an exception is thrown when a tag is
> not found, and that a command line option is given to provide a
> set of type handlers that meets the requirements of the old
> resolution mechanism for "YAML Tags".

That seems pretty odd. I think maybe we should (after the spec is updated)
just write a converter script.

> For PyYAML, it should probably continue to use !Class
> for its private tags in the short run; thus continue to
> serialize as !!Class -- I'm not sure how to handle Syck,
> this is a big discussion.

That'll be fun ;) In fact I have a question about it right now.

  ---
  i: !int 4

With this new system !int will be private; how does that relate to implicit
typing? Does that mean that I'll get { i => "4" } rather than { i => 4 } ?

---

A few other thoughts.

I don't see any good reason not to allow %TAG to apply to
a whole stream:

  %TAG:bar.com,2004:timesheet/meeting^meet
  %TAG:foo.com,2004:shape/^shape
  ---
  rest:
    -of doc
  ...

After all it's just parse trickery. No big deal. (Yet I wonder if one would
ever really want to do this for NS. NS seems much more particular to a single
document. Hmmm...?)

My only other suggestion would be that you consider a directive document,
rather than just individual directives. The two can be compatible (I think).
So combined with the above:

  %TAG:
  % - bar.com,2004:timesheet/meeting^meet
  % - foo.com,2004:shape/^shape
  % - yaml.org,2002:
  --- !tag:baz.com,2004:mixed/list
  - event: !meet^
      where: office
  ...

(A bit more like comments... directives used to be in comments didn't they?)

But if they need to be per document then:

  --- %TAG:
  %   - bar.com,2004:timesheet/meeting^meet
  %   - foo.com,2004:shape/^shape
  %   - yaml.org,2002:
      !tag:baz.com,2004:mixed/list
  - event: !meet^
      where: office
  ...

Okay I think that's all.

Great work, BTW! It really is looking much improved.

-- 
T.
|
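The stream-wide %TAG idea in the message above could be sketched as follows. This is only an illustration of the suggestion, not part of the draft or of any parser. The function name split_stream is hypothetical, and it assumes each directive sits on a single line, either before the first '---' (stream-wide) or on the '---' line itself (per document). Directives continued on following lines, as in the draft's example, are not handled.

```python
def split_stream(text):
    """Return one (directives, body_lines) pair per document in the stream."""
    stream_directives = []   # %TAG lines seen before the first '---'
    documents = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("---"):
            tokens = stripped[3:].split()
            own = [t for t in tokens if t.startswith("%TAG:")]
            rest = [t for t in tokens if not t.startswith("%TAG:")]
            body = [" ".join(rest)] if rest else []
            # Stream-wide directives come first, then the document's own.
            documents.append((stream_directives + own, body))
        elif stripped == "...":
            continue                      # document end marker
        elif not documents:
            if stripped.startswith("%TAG:"):
                stream_directives.append(stripped)
        else:
            documents[-1][1].append(line)
    return documents


stream = """\
%TAG:bar.com,2004:timesheet/meeting^meet
%TAG:foo.com,2004:shape/^shape
---
first: !meet^ standup
--- %TAG:yaml.org,2002:
second: !shape^ellipse
...
"""
for directives, body in split_stream(stream):
    print(directives, body)
```

Under these assumptions both documents see the two stream-wide handles, and only the second also gets the default prefix declared on its own '---' line.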
From: Brian I. <in...@tt...> - 2004-09-03 01:15:42
|
T.

You impress me with your thoughts and ideas. One question: Where have you
been all these years? ;)

On 02/09/04 19:39 -0400, T. Onoma wrote:
> On Thursday 02 September 2004 05:11 pm, Clark C. Evans wrote:
> > summary:
> >
> > This is the sixth-pass draft, based on the fourth-pass. This pass
> > primarily incorporates Brian's feedback, and feedback from T.Onoma.
> > This draft changes two syntax items:
> >
> > - %tag is changed to %TAG
> >
> > - the | separator is changed to ^
> >
> > #
> > # note: switch from | to ^ suggested by T.Onoma has not been approved
> > # by Oren and Brian, but it helps avoid syntax errors caused by
> > # a spurious space, e.g., !bar | vs !bar|
> > #
> >
> > One method of tag globalization is to use 'private' tags in your YAML
> > document, and use a transformation of sorts (either explicit, or
> > implicit by the application) to convert one's tags to a globally
> > unique variety. This method is perfect for small teams where
> > interoperability isn't a huge problem, and who do not wish to pay the
> > price of mixing and matching globalized tags.
>
> Not a big deal but I think this approach has broader application than that
> ascribed to it.

Me too. I think private tags are the cat's meow, the bee's knees.
I think they will account for 90% of usage. We'll see.

> > The other method is an XML-namespace-like mechanism where a tagURI
> > can be broken into chunks: the first (longer) half of the tag,
> > containing the taggingEntity, is moved up into the declaration and
> > given a handle. The second (shorter) half is then used within each
> > tag, together with the handle that links it to the longer half.
> > The combining of the parts is done by the parser, so the application
> > always sees full tagURIs.
> >
> > This proposal makes both methods available to YAML users so that
> > the first class of people, who do not require globally unique
> > tags, need not be burdened by them.
> >
> > syntax:
> >
> > - We open up the tag mechanism !tag to allow one or more characters
> > from the uric production of RFC2396. Thus, one can use %XX where X
> > is a hex character, plus any combination of the following characters:
> >
> > ; / ? : @ & = + $ , - _ . ! ~ * ' ( ) # A-Z a-z 0-9
> >
> > In particular, characters which may _not_ appear in a !tag are
> > marked as 'unwise' in RFC2396, including:
> >
> > { } | \ ^ [ ] `
> >
> > These characters will provide an 'escape hatch' for current and
> > future extensions to YAML. With this change, any URI can be
> > directly used as a !tag. We really can't use {} or [] since they
> > signify mappings and lists. The \ character is used for escaping,
> > we use | to signify a block, and the backtick looks too much
> > like the single quote to be useful. This leaves the ^ delimiter,
> > which was already used for the older cut^paste mechanism.
> >
> > - We introduce a new directive 'tag' which provides a way
> > to shorten the data entry of tagURIs. In particular,
>
> - We introduce a new directive 'TAG' which provides a way
>
> > declaration := "%TAG:" taggingEntity ":" spec_first [ "^" handle ]
> >
> > Where 'taggingEntity' refers to the same production in the tagURI
> > specification. The taggingEntity refers to either a domain or email
> > address followed by the minting date; see tagURI specification for
> > details. The 'spec_first' refers to zero or more uric characters
> > (it is optional).
> >
> > The 'handle' refers to a sequence of one or more word characters
> > [a-zA-Z0-9_]. Optionally the '^' and handle can be missing; this
> > case is called the 'default prefix' and the handle is considered
> > to be the empty string ''. In a YAML document, each handle
> > must be unique via string comparison.
>
> Consider making the default prefix a separate directive, rather than a syntax
> variation on %TAG (%NS for example). This would more clearly distinguish the
> two approaches, i.e. #1 has just an %NS, #2 has %TAG with optional %NS.
>
> Something to consider about this: %TAG is just "trickery", it "cooks" the tag
> throughout the doc, but %NS is not _just_ trickery -- that info needs to be
> passed up into the loader.

Interesting. Maybe the cure for implicit globalization.

> > - We extend the !tag mechanism to allow a single '`' character,
> > which is in the reserved characters above; the syntax for this
> > special case is,
>
> A '`' character? Backtick? Confused about that.
>
> > taguri := '!' handle '^' spec_second
> >
> > In this circumstance, the 'handle' _must_ appear as a handle in one
> > of the document's directives. The 'spec_second' is zero or more
> > uric characters, with the restriction that either spec_first or
> > spec_second (or both) must be at least one character.
> >
> > semantics:
> >
> > - For every special tag having a '^', the parser will do special
> > cooking to join the information specified in the declaration
> > together with the node's tag; such nodes will be treated as if
> > they had been tagged,
> >
> > cooked := "!tag:" taggingEntity ":" spec_first spec_second
> >
> > Note that the 'handle' is not included in this information; it is
> > considered a detail of the Presentation model, and should not occur
> > in tools that comply with the Serialization or Representation
> > models. Thus, the 'handle' is _not_ part of the core YAML
> > information model; it is merely a syntax-level trick to ease the
> > burden of typing and human reading.
> >
> > Also note while other URI schemes may appear in a tag, this cooking
> > mechanism purposefully constructs tagURIs; that is, globally unique
> > identifiers lacking protocol or access semantics.
> >
> > - Tags not containing '`' and matching '^\w+', are considered URIs
> > and passed through as-is. Therefore 'tag:' and 'http:' URIs
> > are unaffected by default prefixing.
>
> Were you playing with the idea of using backtick?
>
> > - If the document has a default prefix (a directive with an empty
> > handle), then all other tags are cooked according to the rule above,
> > using the taggingEntity and the spec_first from the directive with
> > the empty handle.
> >
> > - Without a default prefix, all nodes not containing '^' are
> > passed-through uncooked.
> >
> > [snip]
> >
> > - The magical cooking rules in the core specification are also
> > deprecated with this specification. Since the current version
> > of the specification does not allow tags matching '^\w+:' and
> > the %TAG parameter, either of these can be used to identify newer
> > style YAML documents. When read with these semantics, all old-style
> > tags become private types, as they were literally typed.
> >
> > It is recommended that an exception is thrown when a tag is
> > not found, and that a command line option is given to provide a
> > set of type handlers that meets the requirements of the old
> > resolution mechanism for "YAML Tags".
>
> That seems pretty odd. I think maybe we should (after the spec is updated)
> just write a converter script.
>
> > For PyYAML, it should probably continue to use !Class
> > for its private tags in the short run; thus continue to
> > serialize as !!Class -- I'm not sure how to handle Syck,
> > this is a big discussion.
>
> That'll be fun ;) In fact I have a question about it right now.
>
>   ---
>   i: !int 4
>
> With this new system !int will be private; how does that relate to implicit
> typing? Does that mean that I'll get { i => "4" } rather than { i => 4 } ?

Woah! We both stumbled back upon implicits at the same time.

Hey since Clark and Oren have been ganging up on me, I declare T. as my new
tag team wrestling partner!

> ---
>
> A few other thoughts.
>
> I don't see any good reason not to allow %TAG to apply to
> a whole stream:

+1

This is just too obvious to pass up guys!

>   %TAG:bar.com,2004:timesheet/meeting^meet
>   %TAG:foo.com,2004:shape/^shape
>   ---
>   rest:
>     -of doc
>   ...
>
> After all it's just parse trickery. No big deal. (Yet I wonder if one would
> ever really want to do this for NS. NS seems much more particular to a single
> document. Hmmm...?)

Yep. Also the %YAML directive.

> My only other suggestion would be that you consider a directive document,
> rather than just individual directives. The two can be compatible (I think).
> So combined with the above:
>
>   %TAG:
>   % - bar.com,2004:timesheet/meeting^meet
>   % - foo.com,2004:shape/^shape
>   % - yaml.org,2002:
>   --- !tag:baz.com,2004:mixed/list
>   - event: !meet^
>       where: office
>   ...

Worth considering...

> (A bit more like comments... directives used to be in comments didn't they?)
>
> But if they need to be per document then:
>
>   --- %TAG:
>   %   - bar.com,2004:timesheet/meeting^meet
>   %   - foo.com,2004:shape/^shape
>   %   - yaml.org,2002:
>       !tag:baz.com,2004:mixed/list
>   - event: !meet^
>       where: office
>   ...
>
> Okay I think that's all.
>
> Great work, BTW! It really is looking much improved.

Thanks for these neat ideas.

Cheers, Brian
|
From: Clark C. E. <cc...@cl...> - 2004-09-03 01:45:30
|
On Thu, Sep 02, 2004 at 05:22:13PM -0700, Brian Ingerson wrote: | Me too. I think private tags are the cat's meow, the bee's knees. | I think they will account for 90% of usage. We'll see. It's already 99% of the usage in PyYAML. | > Consider making the default prefix as separate directive, rather then a | > syntax variation on %TAG (%NS for example). This would more clearly | > distinguish the two approaches i.e. #1 has just an %NS, #2 has %TAG with | > optional %NS. | > | > Something to consider about this: %TAG is just "trickery", it "cooks" | > the tag throughout the doc, but %NS is not _just_ trickery -- that info | > needs to be passed up into the loader. | | Interesting. Maybe the cure for implicit globalization. See below. I thought a lot about this... ick. | > Were you playing with the idea of using backtick? Yeah, too much like single quote; those are typos. | > > It is recommended that an exception is thrown when a tag is | > > not found; and give a command line option to provide a set | > > of type handlers that meets the requirements of the old | > > resolution mechanism for "YAML Tags". | > | > That seems pretty odd. I think maybe we should (after the spec if | > updated) just write a converter script. That's the ideal, clearly. | > --- | > i: !int 4 | > | > With this new system !int with be private, how does that relate to implicit | > typing? Does that mean that I'll get { i => "4" } rather then { i => 4 } ? Well, you're using a private type 'int', so what 'i' gets mapped to depends upon what this private type 'int' does. If you want the answer to be clear, use --- %TAG:yaml.org,2002: and you'll get the backward-compatible results. | Hey since Clark and Oren have been ganging up on me, I declare T. as my new | tag team wrestling partner! He's proven to be a good sounding board... ;) | > I don't see why there is any good reason not to allow %TAG to apply to | > a whole stream: | | +1 This is just too obvious to pass up guys! 
| | > %TAG:bar.com,2004:timesheet/meeting^meet | > %TAG:foo.com,2004:shape/^shape | > --- | > rest: | > -of doc | > ... | > | Yep. Also the %YAML directive. I don't have a problem with this; if the %TAG isn't defined, it's an error, so it should be obvious you're missing part of the stream. Also, it makes it _damn_ clear that directives are _not_ content. +1 | > After all it's just parse trickery. No big deal. (Yet I wonder if one would | > ever really want to do this for NS. NS seems much more particular to a single | > document. Hmmm...?) I think the NS idea is quite a large monster with big hairy legs. Seriously, if you added it, you'd need multiple namespaces, then ways to compare them, then semantics for dealing with two tags with the same value but different namespaces, etc. It's a mess. This door has a big sign on it: "There be Dragons Here" and they aren't imaginary. Let's not go there, please? | > %TAG: | > % - bar.com,2004:timesheet/meeting^meet | > % - foo.com,2004:shape/^shape | > % - yaml.org,2002: | > --- !tag:baz.com,2004:mixed/list | > - event: !meet^ | > where: office | > ... | | Worth considering... The other one is better. | > Great work, BTW! It really is looking much improved. | | Thanks for these neat ideas. Well, it's been feedback from everyone and a civil framework that makes it all happen. Cheers! Clark |
From: T. O. <tra...@ru...> - 2004-09-03 03:10:07
|
On Thursday 02 September 2004 09:45 pm, Clark C. Evans wrote: > I think the NS idea is quite a large monster with big harry legs. > Seriously, if you added it, you'd need multiple namespaces, then > ways to compare them, then semantics for dealing with two tags > with the same value but different namespaces, etc. It's a mess. > This door has a big sign on it: "There be Dragons Here" and > they arn't imaginary. Let's not go there, please? We can put a leash on the dragon if we make this simple rule: One doc, one namespace. If there's a need for multiple namespaces, then there's a need for a new namespace (which brings them together). T. |
From: Oren Ben-K. <or...@be...> - 2004-09-03 07:14:30
|
On Friday 03 September 2004 03:22, Brian Ingerson wrote: > Hey since Clark and Oren have been ganging up on me, I declare T. as my new > tag team wrestling partner! I'll see your T. Onoma and raise you David Hopwood :-) To the points: - %NS: -1. There's nothing special about the "default" tag, it doesn't carry any special semantics that need to be reported to the loader. All %TAG prefixes, including the "default" one, are 100% syntactical trickery, nothing more, nothing else. As David said: > Does %NS identify the type of the document? Surely, the full tag name of > the root node does that. - One doc, one namespace: -10. It just doesn't work when you start to mix "schemas". See the GraphicTimesheet example. - Per stream headers. Clark said: >| > %TAG:bar.com,2004:timesheet/meeting^meet >| > %TAG:foo.com,2004:shape/^shape >| > --- >| > rest: >| > -of doc >| > ... >| > >| Yep. Also the %YAML directive. > >I don't have a problem with this; if the %TAG isn't defined, its an >error, so it should be obvious you're missing part of the stream. Not quite. This is only true for tags other than the "default" tag. Appending a document with "private" tags to the end of a stream where a "default" tag has been defined will silently change its semantics. However, since this may be seen as a feature - as long as all the per-stream directives must appear before the first document. After that, all directives are per-document as usual. So, +1 for that, with a note that if you set up a default %TAG at the start of the stream, you had better mean it. Like you pointed out, %YAML would be restricted to per-document. - Directives document rather than simple directives: -1. We are working hard to minimize the directives mechanism. Brian makes a huge fuss whenever we add something there, and quite rightly so! Allowing for a "document of directives" is an overkill. So: +1 to per-stream %TAG. -1 to the rest. Have fun, Oren Ben-Kiki |
From: T. O. <tra...@ru...> - 2004-09-03 11:54:35
|
On Friday 03 September 2004 03:14 am, Oren Ben-Kiki wrote: > On Friday 03 September 2004 03:22, Brian Ingerson wrote: > > Hey since Clark and Oren have been ganging up on me, I declare T. as my > > new tag team wrestling partner! > > I'll see your T. Onoma and raise you David Hopwood :-) Hey now! I'm not a poker chip! > To the points: > > - %NS: -1. There's nothing special about the "default" tag, it doesn't > carry any special semantics that need to be reported to the loader. All > %TAG prefixes, including the "default" one, are 100% syntactical trickery, > nothing There is one thing special about it: it makes all your private tags globally unique. So, as you know, if you start with an all-private doc and add a default prefix, the whole thing is "now for something completely different". > more, nothing else. As David said: > > Does %NS identify the type of the document? Surely, the full tag name of > > the root node does that. > > - One doc, one namespace: -10. It just doesn't work when you start to mix > "schemas". See the GraphicTimesheet example. Depends on where you want to mix "schemas". Your thinking is in the parser (where it has no semantic value). Mine is with the loader. If I know the default prefix (i.e. the document's namespace) then I would know how to interpret all the remaining tags. -- T. |
From: David H. <dav...@bl...> - 2004-09-03 03:57:42
|
T. Onoma wrote: > On Thursday 02 September 2004 05:11 pm, Clark C. Evans wrote: > >>summary: >> >> This is the sixth-pass draft, based on the fourth-pass. I'm going to keep banging on about the issue of additions to the YAML tag repository. If there is a policy of restricting new names in the repository to, say, [a-zA-Z0-9_]+, then people can choose to use private names that do not match that regexp without any risk that they will be confusable with future repository-defined tags. (This is useful despite the fact that there is nothing preventing collisions between two private names in general.) >> - We open up the tag mechanism !tag to allow one or more characters >> from the uric production of RFC2396. Thus, one can use %XX where X >> is a hex character, plus any combination of the following characters: >> >> ; / ? : @ & = + $ , - _ . ! ~ * ' ( ) # A-Z a-z 0-9 '#' is only allowed in URI references. I agree that it should be included, but in that case some references to "URI" in the spec need to be changed to "URI reference". >> In particular, characters which may _not_ appear in a !tag are >> marked as 'unwise' in RFC2396, including: >> >> { } | \ ^ [ ] ` More precisely, no other characters than the ones in the first list above may appear in a !tag. As well as 'unwise', this includes non-ASCII characters, control characters, ", <, >, space, and % when not used to introduce an escape. The [ and ] characters are added to 'uric' in <http://www.rfc-editor.org/internet-drafts/draft-fielding-uri-rfc2396bis-06.txt>. They are used only for literal IPv6 addresses in the 'host' field. If RFC2396bis is adopted as an RFC and Obsoletes RFC2396, then implementors might assume that its version of uric should be used because it is more up-to-date; that would be incorrect. So the YAML spec should probably give the grammar of tags explicitly rather than referencing productions from RFC2396. 
>> - Tags not containing '`' and matching '^\w+', are considered URIs >> and passed through as-is. Therefore 'tag:' and 'http:' URIs >> are unaffected by default prefixing My regexes are a little rusty, but I'm confused by the use of '^\w+' here. Are you using ^ to mean the start of the string, or the literal character ^, or negation of \w (i.e. '[^a-zA-Z0-9_]+', but that can't be right), or "not matching \w+"? I would have thought what was needed would be a production that matches all absolute URI references and excludes as many other strings as possible, but does not require knowledge of the syntax of particular URI schemes. Also, "are considered URIs" is wrong because URIs have additional requirements that we don't want to have to check. I suggest something like: - Tag names that, after YAML unescaping, would match the EBNF production ns-uri-tag, where ns-uri-tag ::= ns-uri-char* ":" ns-uri-char* ns-uri-char ::= ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | "," | "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")" | "#" | ns-ascii-letter | ns-decimal-digit | ("%" ns-hex-digit x 2) are passed through as-is. This is a superset of the syntax for URI references (with the unimportant exception of URI references using RFC 2732 literal IPv6 addresses). Therefore URI references (whether using 'tag:' or any other scheme) are unaffected by default prefixing. A tag name is affected by prefixing iff it either contains '^', or does not contain ':'. > Consider making the default prefix as separate directive, rather then a syntax > variation on %TAG (%NS for example). This would more clearly distinguish the > two approaches i.e. #1 has just an %NS, #2 has %TAG with optional %NS. > > Something to consider about this: %TAG is just "trickery", it "cooks" the tag > throughout the doc, but %NS is not _just_ trickery -- that info needs to be > passed up into the loader. Why does it need to be passed to the loader? I was under the impression that, e.g. 
--- %TAG:example.org,2004: - !foo bar was intended to be semantically equivalent to --- - !tag:example.org,2004:foo bar >> For PyYAML, it should probably continue to use !Class, >> for its private tags in the short run; thus continue to >> serialize as !!Class -- I'm not sure how to handle Syck, >> this is a big discussion. > > That'll be fun ;) In fact I have a question about it right now. > > --- > i: !int 4 > > With this new system !int with be private, how does that relate to implicit > typing? Does that mean that I'll get { i => "4" } rather then { i => 4 } ? If the namespace is "tag:yaml.org,2002:" then you'll clearly get { i => 4 }. What you get if the namespace is something else, I don't know. This needs to be clarified. > My only other suggestion would be that you consider a directive document. > Rather then just indiviudal directives. The two can be compatible (I think). > So combined with the above: > > %TAG: > % - bar.com,2004:timesheet/meeting^meet > % - foo.com,2004:shape/^shape > % - yaml.org,2002: > --- !tag:baz.com,2004:mixed/list > - event: !meet^ > where: office > ... Seems more complicated than needed, especially if both forms are supported. -- David Hopwood <dav...@bl...> |
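The equivalence described in this message (a default prefix turning !foo into !tag:example.org,2004:foo) can be illustrated with a minimal Python sketch. It is not taken from any YAML implementation: the function names `parse_tag_directive` and `cook` are hypothetical, the directive syntax is assumed to be the draft `%TAG:<prefix>^<handle>` form quoted in this thread (with no handle meaning the default prefix), and the "affected by prefixing iff it contains '^' or does not contain ':'" rule is the one suggested above, not a settled part of the spec.

```python
def parse_tag_directive(directive):
    """Split a draft-style directive into (handle, full_prefix).

    Assumed syntax (from the thread, not a final spec):
      %TAG:bar.com,2004:timesheet/meeting^meet  -> handle "meet"
      %TAG:yaml.org,2002:                       -> handle "" (default)
    """
    marker = "%TAG:"
    if not directive.startswith(marker):
        raise ValueError("not a %TAG directive: " + directive)
    body = directive[len(marker):]
    prefix, _, handle = body.partition("^")
    return handle, "tag:" + prefix


def cook(tag, prefixes):
    """Expand a tag (written without its leading '!') by concatenation."""
    if "^" in tag:
        handle, _, tail = tag.partition("^")
        if handle not in prefixes:
            # The thread treats an undeclared %TAG handle as an error.
            raise ValueError("undeclared tag handle: " + handle)
        return prefixes[handle] + tail
    if ":" in tag:
        # Looks like a full URI reference already; pass through as-is.
        return tag
    if "" in prefixes:
        # A default prefix makes otherwise private tags global.
        return prefixes[""] + tag
    # No default prefix: the tag stays private and uncooked.
    return tag


directives = [
    "%TAG:bar.com,2004:timesheet/meeting^meet",
    "%TAG:foo.com,2004:shape/^shape",
    "%TAG:example.org,2004:",
]
prefixes = dict(parse_tag_directive(d) for d in directives)
print(cook("foo", prefixes))
print(cook("meet^", prefixes))
print(cook("tag:baz.com,2004:mixed/list", prefixes))
```

Because the whole operation is string concatenation done before the loader sees the tag, nothing about the default prefix has to be reported to the loader separately in this sketch; the cooked tag carries all of it.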
From: Oren Ben-K. <or...@be...> - 2004-09-03 08:00:48
|
On Friday 03 September 2004 06:57, David Hopwood wrote: > '#' is only allowed in URI references. I agree that it should be included, > but in that case some references to "URI" in the spec need to be changed to > "URI reference". Strictly speaking, _all_ references to "URI" in the spec need to be replaced with "URI reference". Thanks for catching that. > The [ and ] characters are added to 'uric' in > <http://www.rfc-editor.org/internet-drafts/draft-fielding-uri-rfc2396bis-06 >.txt>. They are used only for literal IPv6 addresses in the 'host' field. If > RFC2396bis is adopted as an RFC and Obsoletes RFC2396, then implementors > might assume that its version of uric should be used because it is more > up-to-date; that would be incorrect. Hmmm. Would it? This would prevent people from using IPv6 URIs. Not that I'm thrilled about such (ab)use of the !tag system, but if we allow "every URI reference" we should allow _every_ URI reference. As you have correctly said: > "are considered URIs" is wrong because URIs have additional requirements > that we don't want to have to check. So, I think the spec should allow any non-space character in a tag, and simply state that the result should be a valid URI under the relevant scheme, period. It wouldn't even refer to RFC2396 - not much point since the full set of restrictions on 'xyzzy:' URIs isn't specified there anyway. Parser implementations will simply take the tag "as is" (no form of escaping or cooking whatsoever) and pass it to the loader. This way the parser won't need to know anything about the general URI requirements and the additional requirements of the specific URI scheme. > >> - Tags not containing '`' and matching '^\w+', are considered URIs > >> and passed through as-is. Therefore 'tag:' and 'http:' URIs > >> are unaffected by default prefixing > > My regexes are a little rusty, but I'm confused by the use of '^\w+' > here. I think Clark means "tags that aren't simple words". 
And you are right, that's the wrong way to write it as a regexp, not to mention it's the wrong pattern to use. > I would have thought what was needed would be a production that matches all > absolute URI references and excludes as many other strings as possible, but > does not require knowledge of the syntax of particular URI schemes. Exactly right. According to RFC2396, tags that "look like URIs" match the regexp "^[a-zA-Z_0-9+.\-]+:.*$" (where ^ means start-of-text, $ is end-of-text, etc.). Note that "-", "+" and "." are allowed in URI scheme names in addition to letters and numbers. I have no idea why this is so... but it is. This works fine most of the time: !foo/bar # Not a URI !baz # Not a URI !isbn:... # A URI There's just one place where it "creaks": !Net::FTP # Looks like a URI. Oops! As a courtesy to influential Perl users that shall remain unnamed :-), we have considered changing the regexp to "^[a-zA-Z_0-9+.\-]+:[^:].*$", that is say that repeating the ":" means the tag no longer "looks like a URI". Of course, if someone comes up with a URI scheme that uses a ":" as the very first character in the scheme-specific part, we'll be SOL, but that's a very remote danger compared to the very real pain for Perl users. Put in YAML BNF: uri-scheme-char ::= letter | digit | "-" | "+" | "." tag-looking-like-uri ::= uri-scheme-char+ ":" ( ns-char - ":" ) ns-char* tag-using-prefix ::= ( letter | digit )* "^" ns-char* private-tag ::= ( ns-char+ ) - tag-looking-like-uri - tag-using-prefix And then: [tag-looking-like-uri] > are passed through as-is. This is a superset of the syntax for URI > references. Therefore URI references (whether > using 'tag:' or any other scheme) are unaffected by default prefixing. > A tag name is affected by prefixing iff it matches tag-using-prefix. All other tags are "private" (and affected by the "default" %TAG directive, if any). And: - tag-looking-like-uri must be valid according to the requirements of the specific URI scheme used. 
- the result of "cooking" tag-using-prefix must be valid according to the requirements of the "tag:" URI scheme. - private-tag has no additional restrictions, unless a "default" %TAG directive is given. In this case, the result of "cooking" private-tag must be valid according to the requirements of the "tag:" URI scheme. - Other than "cooking", no other processing is done to tag characters. Specifically, no form of escaping is applied by the YAML processor. URIs in general define an escape mechanism (%xx). Since the semantics are specific to the particular URI scheme used, preserving the tag's semantics prevents the YAML processor from expanding or inserting such escape sequences and in general requires it to simply pass the (cooked) tag through "as is". Have fun, Oren Ben-Kiki |
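The three-way split proposed in this message (tag-looking-like-uri, tag-using-prefix, private-tag) can be sketched in a few lines of Python. This is only an illustration of the proposal as written here, not code from any YAML processor: the name `classify_tag` is hypothetical, "ns-char" is approximated as any non-whitespace character, tags are passed in without their leading "!", and the scheme pattern is the regexp quoted in the message (which, unlike the BNF, also admits "_").

```python
import re

# Assumption: ns-char is approximated by \S (any non-whitespace character).
# Scheme characters follow the regexp quoted in the message; the character
# right after the first ":" must not be another ":" (the Net::FTP courtesy).
URI_LIKE = re.compile(r"[a-zA-Z_0-9+.\-]+:[^:\s]\S*")

# ( letter | digit )* "^" ns-char*
USES_PREFIX = re.compile(r"[a-zA-Z0-9]*\^\S*")


def classify_tag(tag):
    """Classify a tag (without the leading '!') per the proposed BNF."""
    if URI_LIKE.fullmatch(tag):
        return "uri"        # passed through as-is
    if USES_PREFIX.fullmatch(tag):
        return "prefix"     # cooked using the matching %TAG directive
    return "private"        # affected only by a default %TAG, if any


for example in ["foo/bar", "baz", "isbn:123", "Net::FTP", "meet^"]:
    print(example, "->", classify_tag(example))
```

The order of the checks mirrors the BNF, where private-tag is defined as whatever remains after removing the other two productions.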
From: David H. <dav...@bl...> - 2004-09-03 05:14:25
|
T. Onoma wrote: > On Thursday 02 September 2004 11:57 pm, David Hopwood wrote: > >>>Something to consider about this: %TAG is just "trickery", it "cooks" the >>>tag throughout the doc, but %NS is not _just_ trickery -- that info needs >>>to be passed up into the loader. >> >>Why does it need to be passed to the loader? I was under the impression >>that, e.g. >> >> --- %TAG:example.org,2004: >> - !foo bar >> >>was intended to be semantically equivalent to >> >> --- >> - !tag:example.org,2004:foo bar > > In that the %NS identifies the type of document. %TAG only "cooks" the > identities of nodes. Does %NS identify the type of the document? Surely, the full tag name of the root node does that. -- David Hopwood <dav...@bl...> |