From: Matt C. <mat...@va...> - 2008-02-20 04:21:12
After a little more thought, absolute instances of xs:anyURI will not
always work as a fragment identifier. Suppose a spectrum's id attribute
were of type xs:anyURI in file "foo.mzML":
<spectrum id="file://foo.1.1.1.dta" /> <!-- this is a valid (absolute)
xs:anyURI -->
And in something like pepXML or analysisXML:
<spectrumQuery spectrumRef="file://foo.mzML#file://foo.1.1.1.dta" />
<!-- not a valid xs:anyURI! -->
Unless I'm missing something, using xs:anyURI for fragment identifiers
would actually make the schema less safe. Valid mzML ids would be
potentially unusable in an external URI unless URL-encoded, which would
defeat the point of using xs:anyURI in the first place.
-Matt
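
A minimal sketch of the URL-encoding workaround mentioned above, assuming Python's standard urllib.parse module. The helper name make_spectrum_ref is hypothetical and is not part of mzML, pepXML, or analysisXML; it only illustrates what a referencing document would have to do if ids were allowed to be absolute URIs.

```python
from urllib.parse import quote, unquote, urldefrag


def make_spectrum_ref(file_uri: str, spectrum_id: str) -> str:
    """Build 'file_uri#id', percent-encoding the id (hypothetical helper)."""
    # safe="" makes quote() escape ':' and '/' too, so the id cannot be
    # mistaken for a second scheme or path inside the fragment.
    return f"{file_uri}#{quote(spectrum_id, safe='')}"


ref = make_spectrum_ref("file://foo.mzML", "file://foo.1.1.1.dta")
print(ref)  # file://foo.mzML#file%3A%2F%2Ffoo.1.1.1.dta

# A consumer has to split off the fragment and decode it to recover the id.
base, fragment = urldefrag(ref)
print(base, unquote(fragment))
```

Every consumer of the reference would need the matching decode step, which is the extra burden the message above argues against.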
Randy Julian wrote:
> All we are trying to achieve with the anyURI is safety for use within a
> URI. Any xlink-safe way of doing this will work. So if xs:ID is
> supported better by the validating parsers, it would do what we want.
>
> -----Original Message-----
> From: psi...@li...
> [mailto:psi...@li...] On Behalf Of
> Matthew Chambers
> Sent: Tuesday, February 19, 2008 4:25 PM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] Relative URIs and RFC-2396
>
> You are right Randy, we were forgetting about relative URIs which can
> simply refer to a resource's name with no path at all ("1" is certainly
> a valid resource name). However, I think anyURI is still a bad idea for
> any attribute which is not intended to be able to refer to something in
> a remote location (i.e. not in the current file). The "id" attribute in
> the XML namespace has type "xs:ID" which has semantics more along the
> lines of what I think you want. If I understand the use case correctly,
> it is desirable to be able to link to certain mzML elements from
> external documents with a URI, like:
> file://data_source.mzML#s555
> This is an example absolute URI reference to a spectrum in a file at
> "data_source.mzML" where the spectrum's id attribute is "s555". It
> wouldn't make sense for the id itself to be a URI, although the
> reference to it can (and should) be.
>
> So:
> 1) for id attributes which can be referred to externally or internally,
> use the type "xs:ID"
> 2) for references to external or internal resources by their id
> attribute, use the type "xs:anyURI"
>
> This would have the problem of the Xerces-C++ parser not validating
> relative URIs correctly, but that seems to be wrong on their part. :/
> Anyway, users of Xerces-C++ can turn off the validation feature to work
> around it.
>
> Also, Ref attributes in mzML could use anyURI for consistency reasons
> even though we don't currently know of a use case where such references
> would be made to an external file.
>
> -Matt
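
The two-part proposal above (xs:ID for ids, xs:anyURI for references to them) can be sketched roughly as follows. This is only an illustration: the pattern is an ASCII-only approximation of the NCName production that xs:ID requires (the real production also allows many non-ASCII letters), and the names is_ascii_xs_id and make_ref are hypothetical.

```python
import re

# ASCII-only approximation of NCName: a letter or underscore first,
# then letters, digits, '.', '_' or '-'. No ':' or '/' is allowed.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_ascii_xs_id(value: str) -> bool:
    """Return True if value looks like an (ASCII) xs:ID."""
    return NCNAME_ASCII.fullmatch(value) is not None


def make_ref(document_uri: str, element_id: str) -> str:
    """Build a URI reference whose fragment is an xs:ID-style id."""
    if not is_ascii_xs_id(element_id):
        raise ValueError(f"not usable as an xs:ID: {element_id!r}")
    # No escaping is needed: an NCName contains no URI-reserved characters.
    return f"{document_uri}#{element_id}"


print(make_ref("file://data_source.mzML", "s555"))
print(is_ascii_xs_id("1"))
print(is_ascii_xs_id("file://foo.1.1.1.dta"))
```

One consequence worth noting: a bare integer such as "1" is a valid relative URI but not a valid xs:ID, because an NCName cannot start with a digit, so purely numeric ids would need a prefix like the "s" in "s555".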
>
>
> Randy Julian wrote:
>
>> Per our conversation today, the relevant specification is RFC-2396:
>>
>> http://www.ietf.org/rfc/rfc2396.txt
>>
>> Section 5 talks about relative URIs. They do not need to include the
>> protocol and their syntax would include all integers:
>>
>> The syntax for relative URI takes advantage of the <hier_part> syntax
>> of <absoluteURI> (Section 3) in order to express a reference that is
>> relative to the namespace of another hierarchical URI.
>>
>> relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
>>
>> A relative reference beginning with two slash characters is termed a
>> network-path reference, as defined by <net_path> in Section 3. Such
>> references are rarely used.
>>
>> A relative reference beginning with a single slash character is
>> termed an absolute-path reference, as defined by <abs_path> in
>> Section 3.
>>
>> A relative reference that does not begin with a scheme name or a
>> slash character is termed a relative-path reference.
>>
>> rel_path = rel_segment [ abs_path ]
>>
>> rel_segment = 1*( unreserved | escaped | ";" | "@" | "&" | "="
>>                   | "+" | "$" | "," )
>>
>> That means that you don't need the net_path part or the abs_path part
>> but can use the rel_path part alone. The rel_path part can have only
>> the rel_segment part which is required to have one or more unreserved
>> characters (includes all the integers) and/or any of the above special
>> characters or escaped characters.
>>
>> The point of using it in mzML for IDs is that you can be assured of it
>> being a valid relative path when extended by all the other components
>> needed to navigate to a referenced document (protocol, absolute path,
>> etc.).
>>
>> We can achieve this by convention by saying in the mzML spec doc (and
>> possibly putting the required pattern in the schema), that the string
>> for ID must conform to RFC-2396.
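
As an illustration of the "required pattern" idea above, the rel_segment production quoted from RFC 2396 can be written as a regular expression. This is a sketch, not proposed schema text; the name is_rel_segment is hypothetical, and the character classes follow the RFC 2396 definitions of unreserved (alphanumerics plus the marks - _ . ! ~ * ' ( )) and escaped ("%" followed by two hex digits).

```python
import re

# rel_segment = 1*( unreserved | escaped | ";" | "@" | "&" | "=" | "+" | "$" | "," )
REL_SEGMENT = re.compile(
    r"(?:[A-Za-z0-9\-_.!~*'()]"   # unreserved: alphanum | mark
    r"|%[0-9A-Fa-f]{2}"           # escaped: "%" hex hex
    r"|[;@&=+$,])+"               # the extra characters rel_segment allows
)


def is_rel_segment(value: str) -> bool:
    """Return True if value is a single RFC 2396 rel_segment."""
    return REL_SEGMENT.fullmatch(value) is not None


for candidate in ["1", "s555", "a%20b", "a b", "file://foo.1.1.1.dta", ""]:
    print(repr(candidate), is_rel_segment(candidate))
```

Plain integers and simple names pass, while anything containing ":" or "/" does not, since those characters are not part of rel_segment.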
>>
>> Randy
>>
>> Randall K Julian, Jr. Ph.D.
>> President
>> Indigo BioSystems, Inc.
>> (317) 536-2736 x101
>> (317) 306-5447 mobile
>>
>> www.indigobio.com <http://www.indigobio.com/>
>>
>