From: Matt C. <mat...@va...> - 2008-02-20 04:21:12
After a little more thought, absolute instances of xs:anyURI will not
always work as a fragment identifier. If a spectrum's id attribute was
xs:anyURI in file "foo.mzML":

  <spectrum id="file://foo.1.1.1.dta" /> <!-- this is a valid (absolute) xs:anyURI -->

And in something like pepXML or analysisXML:

  <spectrumQuery spectrumRef="file://foo.mzML#file://foo.1.1.1.dta" /> <!-- not a valid xs:anyURI! -->

Unless I'm missing something, using xs:anyURI for fragment identifiers
would actually make the schema less safe. Valid mzML ids would be
potentially unusable in an external URI unless URL-encoded, which would
defeat the point of using xs:anyURI in the first place.

-Matt

Randy Julian wrote:
> All we are trying to achieve with the anyURI is safety for use within a
> URI. Any xlink-safe way of doing this will work. So if xs:ID is
> supported better by the validating parsers, it would do what we want.
>
> -----Original Message-----
> From: psi...@li...
> [mailto:psi...@li...] On Behalf Of
> Matthew Chambers
> Sent: Tuesday, February 19, 2008 4:25 PM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] Relative URIs and RFC-2396
>
> You are right Randy, we were forgetting about relative URIs which can
> simply refer to a resource's name with no path at all ("1" is certainly
> a valid resource name). However, I think anyURI is still a bad idea for
> any attribute which is not intended to be able to refer to something in
> a remote location (e.g. not in the current file). The "id" attribute in
> the XML namespace has type "xs:ID" which has semantics more along the
> lines of what I think you want. If I understand the use case correctly,
> it is desirable to be able to link to certain mzML elements from
> external documents with a URI, like:
>   file://data_source.mzML#s555
> This is an example absolute URI reference to a spectrum in a file at
> "data_source.mzML" where the spectrum's id attribute is "s555". It
> wouldn't make sense for the id itself to be a URI, although the
> reference to it can (and should) be.
>
> So:
> 1) for id attributes which can be referred to externally or internally,
>    use the type "xs:ID"
> 2) for references to external or internal resources by their id
>    attribute, use the type "xs:anyURI"
>
> This would have the problem of the Xerces-C parser not validating
> relative URIs correctly, but that seems to be wrong on their part. :/
> Anyway, users of Xerces-C can turn off the validation feature to work
> around it.
>
> Also, Ref attributes in mzML could use anyURI for consistency reasons
> even though we don't currently know of a use case where such references
> would be made to an external file.
>
> -Matt
>
>
> Randy Julian wrote:
>
>> Per our conversation today, the relevant specification is RFC-2396:
>>
>> http://www.ietf.org/rfc/rfc2396.txt
>>
>> Section 5 talks about relative URIs. They do not need to include the
>> protocol and their syntax would include all integers:
>>
>>    The syntax for relative URI takes advantage of the <hier_part> syntax
>>    of <absoluteURI> (Section 3) in order to express a reference that is
>>    relative to the namespace of another hierarchical URI.
>>
>>       relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
>>
>>    A relative reference beginning with two slash characters is termed a
>>    network-path reference, as defined by <net_path> in Section 3. Such
>>    references are rarely used.
>>
>>    A relative reference beginning with a single slash character is
>>    termed an absolute-path reference, as defined by <abs_path> in
>>    Section 3.
>>
>>    A relative reference that does not begin with a scheme name or a
>>    slash character is termed a relative-path reference.
>>
>>       rel_path    = rel_segment [ abs_path ]
>>
>>       rel_segment = 1*( unreserved | escaped |
>>                         ";" | "@" | "&" | "=" | "+" | "$" | "," )
>>
>> That means that you don't need the net_path part or the abs_path part
>> but can use the rel_path part alone. The rel_path part can have only
>> the rel_segment part which is required to have one or more unreserved
>> characters (includes all the integers) and/or any of the above special
>> characters or escaped characters.
>>
>> The point of using it in mzML for IDs is that you can be assured of it
>> being a valid relative path when extended by all the other components
>> needed to navigate to a referenced document (protocol, absolute path,
>> etc.).
>>
>> We can achieve this by convention by saying in the mzML spec doc (and
>> possibly putting the required pattern in the schema), that the string
>> for ID must conform to RFC-2396.
>>
>> Randy
>>
>> Randall K Julian, Jr. Ph.D.
>> President
>> Indigo BioSystems, Inc.
>> (317) 536-2736 x101
>> (317) 306-5447 mobile
>>
>> www.indigobio.com <http://www.indigobio.com/>
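
As a rough illustration of the rel_segment rule quoted from RFC 2396 in this thread, the Python sketch below checks whether a candidate id string is a single relative-path segment. The function name is_rel_segment and the regular expression are assumptions made for this example, not part of mzML or any schema; the character classes are transcribed from the RFC 2396 definitions of "unreserved" and "escaped". It checks only rel_segment and does not validate full URI references, xs:anyURI, or xs:ID.

```python
import re

# Character classes transcribed from RFC 2396:
#   unreserved = alphanum | "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
#   escaped    = "%" hex hex
UNRESERVED = r"A-Za-z0-9\-_.!~*'()"
ESCAPED = r"%[0-9A-Fa-f]{2}"

# rel_segment = 1*( unreserved | escaped | ";" | "@" | "&" | "=" | "+" | "$" | "," )
REL_SEGMENT = re.compile(rf"(?:[{UNRESERVED};@&=+$,]|{ESCAPED})+")


def is_rel_segment(candidate: str) -> bool:
    """Return True if the whole string matches the RFC 2396 rel_segment rule.

    Hypothetical helper for illustration only.
    """
    return REL_SEGMENT.fullmatch(candidate) is not None


if __name__ == "__main__":
    # Sample ids taken from the discussion in this thread.
    for sample in ["1", "s555", "foo.1.1.1.dta", "file://foo.1.1.1.dta"]:
        print(sample, is_rel_segment(sample))
```

Under this check, plain ids such as "1" or "s555" qualify as a rel_segment, while "file://foo.1.1.1.dta" does not, because ":" and "/" are outside the rel_segment character set.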