From: Gary H. <how...@nt...> - 2005-05-21 08:39:30
Richard Smith wrote:

>>I suppose there's no harm in putting the backslash in to make it work.
>
>Agreed. I've now done this.

Thanks. I think it might be a function of the Java regexp processor.
Personally I think it's clearer with the backslash in, as it's then
obviously not part of a character range.

>I've put a suitable XLink schema here:
>
>  http://www.ex-parrot.com/~richard/schemas/xlink.xsd
>
>Can you try using that one and tell us whether you still
>have problems?

The problem goes away.

>>>Finding a standard schema for xlink was not easy either
>>>so I would question its usefulness overall but that's a
>>>debate for later as I have other more pressing comments.
>
>There's a good reason for this. XLink does not (currently)
>have a normative schema, but I don't see why this should be
>an issue -- it's easy enough to provide one.

Then why is the one that I found different? It ain't that easy to match
the XLink "standard". BTW mine came from:

  http://schemas.opengis.net/gml/2.1.2/xlinks.xsd

>Having said that, I'm not one of XLink's greatest fans, and
>if you have an alternative suggestion, I'd be interested to
>hear it.

I'll be thinking about it. My first thoughts were to have links to
Dove/Felstead etc. for the relevant info.

>>>Wow, so many namespaces! On further analysis, those on the <method> tag
>>>are redundant
>>
>>Yes, I know. It's an issue with DBIx::XMLServer. It's quite a long way
>>down the list of priorities to fix, though, because the extra
>>de
>
>That said, we should aim to sort this out eventually.

I'll leave you to wrestle with your own libraries: the Java ones aren't
always that obvious either.

>Martin has already responded to this, so I'm not going to,
>except to say that all our search script currently puts in
>the <meta> elt is a database timestamp. This is something
>for which I can easily imagine wanting to query the
>database.
>If I have a local (perhaps off-line) database, I
>might want to regularly sync this to the server database.
>Downloading just those methods changed since my previous
>snapshot was created is an obvious way of doing this.

See my comments to Martin, but we're still in the database-centric view.
A method definition schema should not be expected to support database
synchronisation. To perform this task you should create a
synchronisation document with timestamps etc. and include method
definitions where required. Even so, for the example cited above, just a
list of methods (no annotations) matching the criterion of being newer
than a supplied timestamp would suffice. E.g. the HTTP request "get if
newer" (or whatever it is) will have an HTML document returned if there
is a newer one, but the document itself does not contain a timestamp to
say it is newer: why should method definitions be any different?
Separation of concerns again.

>And if you don't like having this in the output, you can
>always set the 'fields' parameter to specify which fields
>you want.
>
>  http://methods.ringing.org/query.html#fields
>
>(Thinking about it, we might want to add a way of saying all
>fields except those in a given list.)

See comments to Martin. I think you're both missing the point with field
selection. That's a SQL thing where all data has been flattened into a
table: CSV would be just as good in this case. I want to see some
structure in the results from an object-oriented viewpoint (this is,
after all, a discussion list about a C++ library).

>Yes. Basically we had a choice. Either we could describe
>the <method> element as an <xsd:sequence>, in which case we
>would have been allowed to put elements from other
>namespaces there. Similarly we could have put the
>performances and classification data directly there. (They
>can't be in the current schema as they can occur multiple
>times.)
>The cost of this flexibility is that we would have
>had to specify an order for all of these elements, and the
>XML would only have been valid if these elements were in the
>right order.

Nothing wrong with that.

>The reason for this, as Martin says, is that an
><xsd:sequence> is effectively taken as a grammar for a
>regular language, and keeping track of the number of times
>elements have occurred rapidly becomes extremely difficult.
>(The schema required grows combinatorially with the number
>of elements.)

I don't understand the combinatorial explosion. A sequence gives an
"approved" order to elements, which may be optional. What's the problem
with that?

>The alternative, which is what we've decided to do, is to
>describe it with an <xsd:all> element which allows child
>elements to occur at most once. It also does not allow
>elements from other namespaces. We get around these
>restrictions by having container elements, such as the
><meta>, <refs>, <performances> and <classification>
>elements. I think we felt that this was preferable to having
>an arbitrary order in which the child elements had to occur.

On the other hand, you now have arbitrary flexibility, which can also
cause problems.

>Namespaces aren't referenced via entity references so I
>don't see how this is relevant. (Or are you suggesting a
>DTD that adds an implicit namespace declaration on the root
>entity? If so, I think this would be a very bad idea.)

No DTDs, we're using XML Schema after all! The DTD equivalent is the
public id, and public ids are used in exactly the manner I described
(see the different versions of HTML, all distinguished by the public id
in the DOCTYPE declaration).

>I assume you're referring to the XML Catalog-like techniques
>that can be used to select a schema for a namespace.

Yes.
>This helps, but so long as you keep the XML Schema documents
>backwards compatible (which should be easy as the
>conceptual schemas need to be backwards compatible), this is
>a non-issue. Always using the most recent schema for the
>namespace might result in lots of unnecessary schema for
>unknown elements, but should otherwise be fine.
>
>Finally, it should be remembered that in many real-world
>applications, you don't actually use the schema -- it's
>simply a piece of documentation on what is allowed in the
>XML. The parser presumably already knows this.

Depends on your parser and your document validation policy. The point is
that the standard should help, in a standard way, those who wish to
process multiple versions using standard techniques like XML Catalog.
Those who wish to ignore validation do so at their own risk: their
parser may or may not cope, but that's no fault of the schemas.

<snip>

>Thinking further, we *are* inconsistent in our use of
>xsi:nil, as method names are handled differently from
>anything else. For a method name, there are three things
>that I might want to convey in the XML:
>
> - the method is named, but has the null name (i.e. it is
>   Little Bob);
>
> - according to the database, the method is unnamed; or
>
> - it is unspecified whether the method is named.
>
>Currently, we use xsi:nil for the first case, ignore the
>second case (we have no unnamed methods in the database at
>the moment), and omit the element in the third case.
>
>Really we should be able to distinguish all three cases.

I'm still thinking about this one. Can you give me an example of the
third case and how it is distinct from case 2?

Gary.
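
For reference, a rough sketch of how a consumer might read the encoding Richard describes (xsi:nil for the null name, element omitted for "unspecified"). This is only an illustration under assumptions: the child element name `name`, the wrapper `methods`, the absence of a namespace on the elements, the helper `name_status`, its status labels and the sample data are all hypothetical and not taken from the real schema. Note that the "unnamed" case has no representation in this encoding, which is exactly the gap under discussion.

```python
import xml.etree.ElementTree as ET

# Attribute key ElementTree uses for xsi:nil once the prefix is resolved.
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"


def name_status(method):
    """Classify the (hypothetical) <name> child of a <method> element."""
    name = method.find("name")  # assumption: elements carry no namespace here
    if name is None:
        # Element omitted: unspecified whether the method is named.
        return ("unspecified", None)
    if name.get(XSI_NIL) in ("true", "1"):
        # xsi:nil set: the method is named, but has the null name.
        return ("null-name", None)
    # Ordinary named method; "unnamed" cannot be expressed in this encoding.
    return ("named", name.text or "")


# Hypothetical sample document covering the cases that can be expressed.
doc = ET.fromstring(
    '<methods xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    "<method><name>Plain Bob</name></method>"
    '<method><name xsi:nil="true"/></method>'
    "<method/>"
    "</methods>"
)

statuses = [name_status(m) for m in doc.findall("method")]
```

Only two of the three states, plus the ordinary named case, come out of this, so distinguishing "unnamed" from "unspecified" would need some additional marker in the XML.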