From: Clark C . E. <cc...@cl...> - 2001-12-18 15:41:13
|
Yamlers,

I was thinking, given the GUID example below, that separating the
"namespace" from the "type" may be a bad idea. Perhaps our | separator
is just syntax sugar. Consider the following two GUIDs...

    yaml://COM/GUID/{F3CA571F-C5DA-11CF-8F28-00AA006AAAAA}
    yaml://COM/GUID/{F3CA571F-C5DA-11CF-8F28-00AA006FFFFF}

I've seen the case where the left part of the GUID is constant, while
the right part varies from component to component. What if...

    --- !COM/GUID/{F3CA571F-C5DA-11CF-8F28-00AA006|AAAAA}
    child: !|FFFFF}

Were the same as...

    --- !COM/GUID/{F3CA571F-C5DA-11CF-8F28-00AA006AAAAA}
    child: !COM/GUID/{F3CA571F-C5DA-11CF-8F28-00AA006FFFFF}

In other words, the | is just a function of the syntax model, leaving
all objects in the system with just a family and format. This would
mean... that the period is needed after lang ...

    --- !java/java.lang.|Dictionary
    child: !|Integer 34

Which is parsed as...

    --- !java/java.lang.Dictionary
    child: !java/java.lang.Integer 34

In other words, the | acts as a very very simple splitter to save
typing. It is only a syntax device... I kinda like this notion.

    --- !perl/Net::|Ftp
    - !|Smtp

is then parsed as...

    --- !perl/Net::Ftp
    - !perl/Net::Smtp

This saves us the whole headache of defining what is a namespace vs
what is a class name. For the XML binding, we just take a fixed
character that is not part of QName and use it to split the namespace
from the element/attribute name. Thus, the last occurrence of this
character in our "family" name can be used to split the XML namespace
from the element/attribute name.

I like this because it's not as smart. Sometimes dumb can be very
useful.

Best,

Clark

--
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software
|
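The | expansion described in the message above can be sketched in a few lines of Python. This is only an illustration, not anything from the YAML spec: the function name `expand_pipe`, the choice to pass tags without the leading "!", and the decision to split on the last | in the tag are all assumptions made for the example.

```python
from typing import Optional, Tuple


def expand_pipe(tag: str, inherited: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Expand one abbreviated tag (hypothetical helper).

    Returns (full_tag, prefix_for_children). `inherited` is the prefix
    established by the nearest ancestor that used a | splitter, if any.
    """
    head, sep, tail = tag.rpartition("|")
    if not sep:
        # No splitter: the tag is already complete and defines no prefix.
        return tag, inherited
    if head == "":
        # "|Integer" style: borrow the prefix from an ancestor.
        if inherited is None:
            raise ValueError(f"no ancestor prefix available for {tag!r}")
        return inherited + tail, inherited
    # "java/java.lang.|Dictionary" style: drop the | and remember the prefix.
    return head + tail, head


# Walk the java example from the message: parent first, then child.
parent, prefix = expand_pipe("java/java.lang.|Dictionary")
child, _ = expand_pipe("|Integer", prefix)
print(parent)
print(child)
```

Because the splitter is purely textual, the same function handles the GUID and Perl examples without knowing anything about packages or namespaces.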
From: Clark C . E. <cc...@cl...> - 2001-12-18 15:11:26
|
FYI. Very good response to our type: URI scheme notion...

Best,

Clark

----- Forwarded message from "Fred L. Drake, Jr." <fd...@ac...> -----

Date: Tue, 18 Dec 2001 10:02:05 -0500 (EST)
To: "Clark C . Evans" <cc...@cl...>
Cc: xml-dev <xm...@li...>
Subject: Re: [xml-dev] URI for class names and native data types
From: "Fred L. Drake, Jr." <fd...@ac...>

Clark C . Evans writes:
> On Tue, Dec 18, 2001 at 01:15:43PM -0000, Leigh Dodds wrote:
> | For Java purposes, why not use the javadoc URI?
> | e.g. http://java.sun.com/j2se/1.3/docs/api/java/lang/Integer.html
> works across multiple languages. After reading the
> RFC a bit more, I was thinking something like this:
>
>   type://language/the/package/path#classname

Some comments:

1. The "//" is the hostname separator, and doesn't make sense for
   this kind of URN. So your template becomes

       type:language/the/package/path#classname

2. I can see how you make names for classes in modules, but there's
   more to it than that. Most languages (not all) have some types for
   which there is no corresponding named type in the package
   namespace. Numeric types often fall into this category; think of
   "int" in C/C++. How could these be spelled? Perhaps there needs to
   be another "chunk" to the URNs. Let's try this, and change the
   "chunk separator" to ":" instead of "/", to make it more easily
   distinguished:

       type:language:intrinsic/int

   A class in a package/module might then be:

       type:java:class/java/lang#String
       type:python:class/win32com/client#genpy
       type:perl:class/Net/Ftp

3. Have you considered types that are defined abstractly rather than
   for a specific programming language? Say, in an interface
   definition language such as CORBA IDL, or some other formal
   specification?

       type:abstract:ieee754/double
       type:abstract:corba/3.0/my/module#interface

There are still (huge!) potential problems with these, I think.
Versioning is as much a problem here as in other areas, and would
probably need to be dealt with. Naming conflicts exist in the space
being described, as well, so it's hard to name types that have the
same name by coincidence: suppose two companies build their own
MailSystem::Server interface using CORBA, and your job is to write a
report explaining which one your employer should adopt.

  -Fred

--
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation

----- End forwarded message -----
|
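To make the shape of Fred's revised names concrete, here is a small Python sketch that splits a `type:` name into its chunks. The layout `type:<language>:<kind>/<path>#<name>` is read off his examples; the `TypeName` record, its field names, and the `parse_type_urn` function are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple, Optional


class TypeName(NamedTuple):
    """Pieces of a type: name (field names are assumptions)."""
    language: str        # "java", "perl", "abstract", ...
    kind: str            # first path chunk: "class", "intrinsic", "ieee754", ...
    path: str            # remaining path, e.g. "java/lang"
    name: Optional[str]  # fragment after "#", if present


def parse_type_urn(urn: str) -> TypeName:
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0] != "type":
        raise ValueError(f"not a type: name: {urn!r}")
    _, language, rest = parts
    rest, _, fragment = rest.partition("#")   # optional "#classname"
    kind, _, path = rest.partition("/")       # first chunk vs. the rest
    return TypeName(language, kind, path, fragment or None)


print(parse_type_urn("type:java:class/java/lang#String"))
print(parse_type_urn("type:abstract:ieee754/double"))
```

Note that the sketch does nothing about the versioning and name-collision problems Fred raises; it only shows that the proposed syntax is mechanically easy to take apart.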
From: Clark C . E. <cc...@cl...> - 2001-12-18 17:01:59
|
Ok. Here is a formal proposal which is completely different, but I
think it strikes a nicer balance.

1. We don't distinguish between the name and the namespace in
   either the graph or the tree information model.

2. We use the "opaque" productions in the URI. It takes the
   general form: "yaml:<language>/<language-specific>" where the
   structure of <language-specific> depends upon the particular
   language. The "any" language takes the form of reverse DNS
   strings using the period.

3. We use ^ as a "dummy" splitter for syntax short-hands. This is
   a non-intelligent mechanism: !^ looks up the ancestor tree for
   anything having the form !<whatever>^ and substitutes
   <whatever> in place of the ^

       --- !yaml:any/org.yaml.^seq
       - !^int 23

   is short-hand for...

       --- !yaml:any/org.yaml.seq
       - !yaml:any/org.yaml.int 23

4. We use | for the format separator.

       --- !yaml:any/org.yaml.int|hex \
         0xFF

5. We have three "short hands" where a colon does not appear in
   the !transfer-string

   a. If there is no period and no slash, then the family name
      is prefixed with "yaml:any/org.yaml."

   b. If there is a period, but no slash, then the transfer
      string is prefixed with "yaml:any/"

   c. In all other cases without a colon and having a slash,
      "yaml:" is prefixed.

   Examples (after normalization via ^ )

       !seq                   -> !yaml:any/org.yaml.seq
       !com.clarkevans.ts     -> !yaml:any/com.clarkevans.ts
       !java/java.lang.String -> !yaml:java/java.lang.String
       !perl/Net::Ftp         -> !yaml:perl/Net::Ftp
       !yaml:any/org.yaml.seq -> !yaml:any/org.yaml.seq
       !http://google.com     -> !http://google.com

Impacts on Perl...

1. The % in URI syntax is an escape, thus is represented as %25

   This means that all objects should be assumed to be hashes (%),
   and you only write $ and @ explicitly (for scalars and arrays).

2. You can use the ^ as you wish to save typing.

       --- !perl/Net::^Ftp
       - !^Smtp
       - !perl/Net::Ftp

   becomes (after both types of substitution)

       --- !yaml:perl/Net::Ftp
       - !yaml:perl/Net::Smtp
       - !yaml:perl/Net::Ftp

Overall, I spent too much time working this out... but a "dumber"
mechanism like this may be exactly what the doctor ordered. Of
course, it prevents namespace comparisons in the information model.
However, I'm not so sure that this is all that useful anyway; and
besides, a namespace comparison can be done via sub-strings.

For reference, the opaque_part of the URI RFC
(http://www.ietf.org/rfc/rfc2396.txt) is summarized:

    opaque_part    = uric_no_slash *uric
    uric_no_slash  = unreserved | escaped | (reserved - "/")
    uric           = reserved | unreserved | escaped
    reserved       = ";" | "/" | "?" | ":" | "@" |
                     "&" | "=" | "+" | "$" | ","
    unreserved     = alphanum | mark
    mark           = "-" | "_" | "." | "!" | "~" |
                     "*" | "'" | "(" | ")"
    escaped        = "%" hex hex
    hex            = digit | "A" | "B" | "C" | "D" | "E" | "F" |
                     "a" | "b" | "c" | "d" | "e" | "f"
    digit/alphanum = defined in the usual manner.

    NOTE: We'd probably allow much greater unicode freedom in
          digit/alphanum, as the newer URI RFC drafts do.
    ...

This leaves the "unwise" characters out of the possibilities...

    unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"

That leaves us a few binding options for our "splitter" and the
"format" divider.

    splitter  format  example
    |         ^       yaml:any/org.yaml.|int^hex
    |         `       yaml:any/org.yaml.|int`hex
    |         \       yaml:any/org.yaml.|int\hex
    \         |       yaml:any/org.yaml.\int|hex
    ^         |       yaml:any/org.yaml.^int|hex
    `         |       yaml:any/org.yaml.`int|hex

That said... I kinda like yaml:any/org.yaml.^int|hex, so this is
what I proposed.

Hopefully submitted,

Clark
|
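The two substitution steps in the proposal above (the ^ look-up, then the default prefixes of rule 5) can be sketched in Python as follows. All function names are hypothetical and tags are passed without the leading "!". One interpretation is an assumption on my part: rule 5 says the short-hands apply when "a colon does not appear", yet the `!perl/Net::Ftp` example contains "::". To reproduce the listed examples, the sketch treats a tag as already absolute only when a colon appears before the first slash.

```python
from typing import Sequence


def resolve_caret(tag: str, ancestors: Sequence[str] = ()) -> str:
    """Apply the ^ short-hand. `ancestors` holds the raw (unexpanded)
    tags of enclosing nodes, outermost first."""
    if tag.startswith("^"):
        # Search upward for the nearest tag of the form <whatever>^...
        for ancestor in reversed(ancestors):
            prefix, caret, _ = ancestor.partition("^")
            if caret and prefix:
                return prefix + tag[1:]
        raise ValueError(f"no ancestor defines a ^ prefix for {tag!r}")
    # A defining tag such as "perl/Net::^Ftp" simply loses its ^.
    return tag.replace("^", "", 1)


def has_scheme(tag: str) -> bool:
    """Assumed reading of 'has a colon': a colon before any slash."""
    colon, slash = tag.find(":"), tag.find("/")
    return colon != -1 and (slash == -1 or colon < slash)


def add_default_prefix(tag: str) -> str:
    if has_scheme(tag):
        return tag                        # already absolute
    if "/" in tag:
        return "yaml:" + tag              # rule 5c
    if "." in tag:
        return "yaml:any/" + tag          # rule 5b
    return "yaml:any/org.yaml." + tag     # rule 5a


def normalize(tag: str, ancestors: Sequence[str] = ()) -> str:
    return add_default_prefix(resolve_caret(tag, ancestors))


for raw in ("seq", "com.clarkevans.ts", "java/java.lang.String", "perl/Net::Ftp"):
    print(raw, "->", normalize(raw))
print(normalize("^Smtp", ["perl/Net::^Ftp"]))
```

Both steps are plain string operations, which is the point of the "dumb" design: no step needs to know where a package ends and a class name begins.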
From: Clark C . E. <cc...@cl...> - 2001-12-18 17:14:17
|
This proposal has a number of advantages over the previous proposal:

1. The distinction between the namespace and the name is tenuous
   and has caused a lot of grief in the XML world, IMHO. Also, with
   the COM example posted on the xml-dev list, it's clear that the
   notion of a "package" is not universal.

2. By keeping our hierarchical short-hand dumb we can abbreviate at
   any point... it need not be along package boundaries. Also, I
   like using ^ for the short-hand, since it kinda points upwards.
   Further, we haven't used this character as an indicator, so it
   kinda works out nicely.

3. It is a completely unified "absolute URI reference" mechanism
   (yes, we should allow #fragments, since some XML URIs use them).
   However, we should not do "relative URI references" as this will
   cause no end of grief since it requires a YAML BASE spec and all
   kinds of indeterminism.... yuck!

4. I like the | for the format separator. Pretty.

5. This should make Brian happy... I hope. The only show-stopper is
   the % hash escaping. But hopefully this can be his default... and
   he can use $ and @ for scalars and arrays as required.

6. The URIs are actually readable! And simple substring operations
   should work very well on them!

Kind Regards,

Clark
|
From: Clark C . E. <cc...@cl...> - 2001-12-18 18:01:51
|
A bit more formally.... (the normal form, not the abbreviated versions)

    uri_yaml = "yaml:" uri_lang "/" uri_part
    uri_lang = 1*uri_gen
    uri_part = 1*uri_gen
    uri_gen  = ";" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | "," |
               "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")" |
               alphanum | uri_esc
    uri_esc  = "%" hex hex

    hex/digit/alphanum = refer to productions in our spec
    ...

In a non-URI compatible way, we should allow unicode values in our
serialization, by our \ style escaping. However, we should clearly
state that this is not RFC compliant.
|
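As a rough check of the normal-form grammar in this last message, the productions translate fairly directly into a regular expression. The sketch below makes two assumptions that are not stated in the message: `uri_lang` and `uri_part` are each read as one or more `uri_gen` characters, and `alphanum` is restricted to ASCII letters and digits. The names `URI_YAML` and `is_normal_form` are hypothetical.

```python
import re

# One uri_gen: a permitted punctuation mark, an ASCII alphanumeric
# (assumption), or a %XX escape (uri_esc).
_URI_GEN = r"(?:[A-Za-z0-9;?:@&=+$,\-_.!~*'()]|%[0-9A-Fa-f]{2})"

# uri_yaml = "yaml:" uri_lang "/" uri_part, with each side taken to be
# one or more uri_gen characters (assumption).
URI_YAML = re.compile(rf"yaml:{_URI_GEN}+/{_URI_GEN}+")


def is_normal_form(tag: str) -> bool:
    """True if `tag` matches the sketched uri_yaml production."""
    return URI_YAML.fullmatch(tag) is not None


for candidate in ("yaml:any/org.yaml.seq", "yaml:perl/Net::Ftp", "yaml:any/org.yaml.int|hex"):
    print(candidate, is_normal_form(candidate))
```

Since "/" and "|" are both absent from `uri_gen`, the first slash unambiguously separates the language from the language-specific part, and a | format suffix has to be stripped before the family name is checked.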