From: Martin B. <ma...@bo...> - 2015-01-13 16:21:44
|
You are receiving this email because you are on one of the two mailing lists rin...@li... or rin...@li... associated with the Ringing Class Library. Given that neither of these lists has had any traffic for a number of years, you had probably forgotten that you were on one of them. Anyway, the Ringing Class Library is migrating from SourceForge to the more modern surroundings of GitHub. The main page is here: <https://github.com/ringing-lib>. This mailing list will probably cease to exist. If you're interested in keeping up with any developments, let me know. Martin Bright |
From: Richard S. <ri...@ex...> - 2005-05-22 23:12:31
|
Gary Howard wrote: > Richard Smith wrote: > > >I've put a suitable XLink schema here: > > > > http://www.ex-parrot.com/~richard/schemas/xlink.xsd > > > >Can you try using that one and tell us whether you still > >have problems? > > The problem goes away. OK. I've updated the schema to explicitly refer to this XLink schema (and have put a copy of it on the website rather than relying on my website). > >>>Finding a standard schema for xlink was not easy either > >>>so I would question its usefulness overall but that's a > >>>debate for later as I have other more pressing comments. > >>> > >>> > >There's a good reason for this. XLink does not (currently) > >have a normative schema, but I don't see why this should be > >an issue -- it's easy enough to provide one. > > > > > Then why is the one that I found different? It ain't that easy to match > the XLink "standard". As I said, there is no normative schema, and there are dozens of ways of expressing different concepts depending on how reusable and how strict you want it to be. > >Having said that, I'm not one of XLink's greatest fans, and > >if you have an alternative suggestion, I'd be interested to > >hear it. > > I'll be thinking about it. My first thoughts were to have links to > Dove/Felstead etc. for the relevant info. That sort of thing is one of the reasons we want a linking mechanism. However, I think we should also have a way of including the data inline in the XML. > >Martin has already responded to this, so I'm not going to, > >except to say that all our search script currently puts in > >the <meta> elt is a database timestamp. This is something > >for which I can easily imagine wanting to query the > >database. If I have a local (perhaps off-line) database, I > >might want to regularly sync this to the server database. > >Downloading just those methods changed since my previous > >snapshot was created is an obvious way of doing this. 
> > See my comments to Martin but we're still in the database-centric view. > A method definition schema > should not be expected to support database synchronisation. And ours doesn't explicitly. The method schema (you have read it, haven't you?) doesn't make *any* mention of database time stamps. It simply provides a point for extension. One way that we have chosen to extend it in the results provided by our database is with a database timestamp. That doesn't mean that anyone else needs to do the same. And that is why the time stamp is in the database namespace, not the methods namespace. > To perform > this task you should create > a synchronisation document with timestamps etc. and include method > definitions where required. But that is *precisely* what we *have* done. > Even so, for the example cited above, just a list of methods (no > annotations) matching the criterion of > being newer that a supplied timestamp would suffice. And the schema allows you to do this if that's what you want to do. And so does our database application. > E.g. the HTTP > request "get if newer" (or what > ever it is) will have an HTML document returned if there is a newer one > but the document itself does > not contain a timestamp to say it is newer: why should method > definitions be any different? The HTTP 1.1 If-Modified-Since header should be used to conditionally return the *whole* document if the *whole* document has been modified since the a given date. It isn't designed as a way of extracting a subset of a document that has been modified since a given date, and if you tried to do this it would seriously fuck up the operation of web caches. Getting our script to respect this header (and related ones) may be advantageous, but I think we have far more important things to implement. > Separation of concerns again. > > >And if you don't like having this in the output, you can > >always set the 'fields' parameter to specify which fields > >you want. 
> > > > http://methods.ringing.org/query.html#fields > > > >(Thinking about it, we might want to add a way of saying all > >fields except those in a given list.) > > > > > See comments to Martin. I think you're both missing the point with field > selection. That's a SQL > thing where all data has been flattened into a table: Not at all. The fields selection is not *just* a simple list of fields to be (although it can be, and usually is). As I said in my reply to your earlier mail, it is an XPath filter that allows pretty much any part of the whole conceptual XML document to be included. Yes, in many cases this is as simple as a list of fields from the database, but it needn't be. > CSV would be just a good in this case. There are many reasons why this wouldn't be satisfactory, but as I doubt you're seriously suggesting using CSV, I won't go into them here. > I want to see some structure in the results from an object-oriented > viewpoint (this is after all > a discussion list about a C++ library). The fact that this is the discussion list for a C++ library is largely irrelevant. We're simply using this list as a convenient place to discuss the methods schema. Almost none of the methods database is currently implemented in C++, though we have provided a C++ client library to access it. As to structure in the results, I think there's plenty of structure, and insofar as its meaningful to bandy around terms like "object-orientation" without any context, I think we've gone quite some distance in that direction. Take a look at the abstract class and ref elements and the way in which third parties are encouraged to extend these. > >Yes. Basically we had a choice. Either we could describe > >the <method> element as an <xsd:sequence>, in which case we > >would have been allowed to put elements from other > >namespaces there. Similarly we could have put the > >performances and classification data directly there. 
(They > >can't be in the current schema as they can occur multiple > >times.) The cost of this flexibility is that we would have > >had to specify an order for all of these elements, and the > >XML would only have been valid if these elements were in the > >right order. > > Nothing wrong with that. Maybe not seriously wrong, no, but it's hardly desirable, is it? And personally, I think the container classes required by the alternative approach provide better separation of the separate, related concepts. (You can perhaps argue that refs and classifications are the same, but this is a relatively trivial gripe.) > >The reason for this, as Martin says, is that an > ><xsd:sequence> is effectively taken as a grammar for a > >regular language, and keeping track of the number of times > >elements have occurred rapidly becomes extremely difficult. > >(The schema required grows combinatorially with the number > >of elements.) > > I don't understand the combinatorial explosion. A sequence gives an > "approved" order to elements > which may be optional - what's the problem with that? Try writing an XML schema for an element that can have, in any order, up to 1 <foo> child, and an arbitrary number of <bar> children. Fairly easy? Now try up to 1 each of <foo>, <bar> and <baz> as children, plus an arbitrary number of <quux> child elements. Quite a lot more complicated. Now do you see? > >The alternative, which is what we've decided to do, is to > >describe it with an <xsd:all> element which allows child > >elements to occur at most once. It also does not allow > >elements from other namespaces. We get around these > >restrictions by having container elements, such as the > ><meta>, <refs>, <performances> and <classification> > >elements. I think we felt that this was preferable to having > >an arbitrary order in which the child elements had to occur. > > On the other hand you now have arbitrary flexibility which can also > cause problems. Of course it *can*. 
If you intend to design elements to extend the schema, then it's your job not to fuck it up. It's really not very difficult to do correctly. Look, say, at the <cc-class> or the <rwref> elements which "extend", respectively, the <class> and the <ref> abstract elements. > No DTDs, we're using XML Schema after all! The DTD equivalent is the > public id and they are > used in exactly the manner I described (see the different versions of > HTML all distinguished by > the public id in the DOCTYPE declaration). Yeah. Distinguishing versions of a document by different public ids in the DTD is fine: it causes no problems and can be very convenient. Doing the same with namespace names does cause problems -- see my previous email. > > This helps, but so long as you keep the XML Schema documents > >backwards compatible (which should be easy as the > >conceptual schemas need to be backwards compatible), this is > >a non-issue. Always using the most recent schema for the > >namespace might result in lots of unnecessary schema for > >unknown elements, but should otherwise be fine. > > > >Finally, it should be remembered that in many real-world > >applications, you don't actually use the schema -- it's > >simply a piece of documentation on what is allowed in the > >XML. The parser presumably already knows this. > > > > > Depends on your parser and your document validation policy. The point > being that the standard > should help in a standard way those who wish to process multiple > versions using standard > techniques like XML Catalog. Those who wish to ignore validation do so > at their own risk: their > parser may or may not cope but that's no fault of the schemas. This is all true, but entirely irrelevant. > >For a method name, there are three things > >that I might want to convey in the XML: > > > > - the method is named, but has the null name (i.e. 
it is > > Little Bob); > > > > - according to the database, the method is unnamed; or > > > > - it is unspecified whether the method is named. [...] > I'm still thinking about this one. Can you give me an example of the > third case and how it is > distinct from case 2? Yeah. A computer program might generate a list of methods that are true substitutes for Yorkshire as the first lead of Smith's 23. It might very well output this using this XML format. (And indeed, the program I usually use for such tasks does exactly that.) It might not have ready access to whether or not a method has been named, or it might not think it relevant, in which case you have case 3. A second example: you request from our database a list of methods with some properties, but, by using a fields option, you tell it not to include method names. From the point of view of a consumer of this document, this is case 3. By contrast, case 2 arises when the program (from the former example) or the database (in the latter example) does output information about the name of the method. If it knows (having recently synced with the MC website) that the method is unnamed, you have case 2. RAS |
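The three cases in the message above can be modelled as a small three-state value on the consumer side. The following is only a sketch under stated assumptions: it follows the encoding described elsewhere in this thread (xsi:nil for the null name, element omitted when the name is unspecified), the child element name `name` and the helper `read_name` are hypothetical, and the examples use no default namespace for brevity. The thread describes no encoding yet for case 2, so the reader below never returns `UNNAMED`.

```python
import enum
import xml.etree.ElementTree as ET

# Expanded name of the xsi:nil attribute as ElementTree reports it.
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"


class NameStatus(enum.Enum):
    NAMED = "named"              # includes the null name (Little Bob)
    UNNAMED = "unnamed"          # case 2: known to have no name
    UNSPECIFIED = "unspecified"  # case 3: the document does not say


def read_name(method):
    """Return (status, name) for a <method> element.

    Assumes a child element called <name> (hypothetical).
    """
    elt = method.find("name")
    if elt is None:
        # Element omitted: absence of knowledge, not knowledge of absence.
        return NameStatus.UNSPECIFIED, None
    if elt.get(XSI_NIL) in ("true", "1"):
        # xsi:nil marks a method that is named, but with the null name.
        return NameStatus.NAMED, ""
    return NameStatus.NAMED, elt.text or ""
```

A consumer that receives `UNSPECIFIED` should not conclude that the method is unnamed; it only knows that the producer chose not to say.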
From: Richard S. <ri...@ex...> - 2005-05-22 11:11:02
|
Gary Howard wrote: > Martin Bright wrote: > > >One thing to bear in mind with all of this is that the schema is meant > >to be of use to any applications exchanging method data, not just for > >our database. > > This is the crux of some of my gripes: there's a lot that's geared to > your database. I think you need to be careful to distinguish our core method schema (the http://methods.ringing.org/NS/method namespace) from the additional stuff that is specific to our database (in the http://methods.ringing.org/NS/database namespace). This separation into two namespaces is precisely to avoid the specifics of our database impinging on the design of the core methods schema. This means that you can use the core schema by itself, or in conjunction with our extra database-specific schema, or, indeed, with some other schema(s) suited to your own needs. Look, for example, in the schema itself http://methods.ringing.org/method.xsd or in the following example http://methods.ringing.org/method.xsd.txt and you'll see there is no mention of anything from the database namespace. > >I think maybe the fault here lies with whatever schema you're using for > >XLink -- that should be a valid attribute declaration. > > > >Anyway, the whole XLink thing is a bit messy and probably not the best > >way of doing it. I wanted to be able to have performances either > >inline, or referred to in another document. If you can think of a > >better way to do it then we'd be pleased to hear it. > > I got mine from here: http://schemas.opengis.net/gml/2.1.2/xlinks.xsd > I get the impression that these guys are quite hooked on xlink. They do seem to be, but that doesn't in any way make the schema "official". The schema I suggested came from another W3C specification -- the XForms specification -- but again, that doesn't make it official either. Clearly the fact that one validates our example methods XML, whilst the other doesn't, is, on the face of it, a little worrying. 
However, the problem isn't with our schema, or, for that matter, either of the XLink schemas: it's in the interaction between them. Your XLink schema does not define the xlink:type attribute *except* within various named attributeGroups. This means that another schema using xlink:type attributes must reference these attributeGroups by name. This would be fine if we'd written our schema to work with this specific xlink schema, however we haven't: we've written it to work with the one I quoted. It might be worth changing this as the schema you quote does seem better in some ways. In any event, we should definitely import the XLink schema from a specific location as it's clear that just any XLink schema will not do. > >I think what you're missing here is that IDs are very handy when you > >want to refer to a particular element from the outside. So maybe I want > >to put up a collection of my new unrung cyclic Royal principles on my > >web site; then you can refer to one of them as something like > > http;//martins.web.site/cyclicroyal.xml#r3xx . > > > >The facts that the IDs are unique within the database, and that they're > >consistent from one document to another, are not relevant to the XML > >schema; but they do guarantee that they're also unique within any XML > >document that the database produces, which is what IDs are about. > > > > > I don't believe the XML standard says anything about IDs being > consistent or unique across documents. Indeed not. It says they're unique within a particular document. But if they're unique across a set of documents, then they're unique in any one of those documents. However, this is getting away from the core point. Our general-purpose XML schema only says that a method *may* have an ID (it needn't) and that if it does, it must behave as an XML ID should -- i.e. it must be unique within that particular document. 
In our database application, we have decided that it is advantageous to extend XML's guarantee and say the ID will be unique for *any* method in our database. This does not mean you need to take advantage of this extra guarantee, nor does it mean an application you write needs to offer this additional guarantee. The main purpose of ID attributes is to make fragment locators work (the #m1234 you might see on the end of a URL). If you want a canonical reference number for a method, it is better to use one of the fields from the <refs> container element. (This will, for example, contain methods' numbers in various CC collections.) [...] > Arbitrary IDs are fine: so long as they can be referenced within the > document by an IDREF. Surely any ID can be referenced by an IDREF? > I don't object to the ID coming from the database; I object to the > notion that it is unique across documents. Why on earth should you object to that? No one is forcing *you* to make *your* IDs unique across documents. We've chosen to, as we feel it might be useful for our users; you're free not to if you'd rather not. > It is a generic attribute (many applications may need to reference > method definitions with IDs) but they > may not be able to guarantee uniqueness outside the document; why should > your database have special privileges? Because our database *is* able to make this guarantee and we feel it's useful for us to do so. > >[...] I think that being able to add extra information using > >other namespaces is one of the things that XML is very good at. It's > >done all the time in the various W3C applications: take the xsi:type > >attribute, for instance, or the xlink: attributes, or the xsl:version > >attribute in XSL. > > > > > True, but I think it can be done more elegantly. Would you like to give a couple of specific examples? 
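The only ID guarantee the general-purpose schema asks for, uniqueness within a single document, is easy to check on the consumer side. The sketch below is illustrative only: it assumes the attribute is literally called `id` and treats it as an ID by convention (a validating parser would take that from the schema instead), and the helper name `duplicate_ids` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_ids(xml_text):
    """Return the sorted list of id values used more than once in one document."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        elt.get("id") for elt in root.iter() if elt.get("id") is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


good = '<methods><method id="m1"/><method id="m2"/></methods>'
bad = '<methods><method id="m1"/><method id="m1"/></methods>'
print(duplicate_ids(good))
print(duplicate_ids(bad))
```

Nothing in this check looks beyond the one document, which is the point being made: uniqueness across a whole database is an extra promise a producer may add, not something a consumer has to rely on.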
> >>I would also propose that the version attribute of the <methods> tag > >>should be eliminated and version idetification be incorporated into > >>the namespace URI. > >> > >This is definitely a bad idea. It kills any chance of backwards > >compatibility: a version 1 parser has no chance at all of reading a > >version 2 document. > > > Rubbish! I imagine it would be very easy to construst a multi-version > parser. For example: schema version 1 > recognises methods in the form: > <methods xmlns="urn:cccbr-org-uk:schemas/methods/1.0"> > ... > </methods> > Sometime later version 2 is released and documents now look like this: > <methods xmlns="urn:cccbr-org-uk:schemas/methods/2.0"> > ... > </methods> > It is clear which schema is required by which document and many parsers > today will handle this simply > within their entity resolver. > Where do you see a problem? With your suggestion, the name of every single element changes. Yes: that is what I mean. The name of an element is a pair of namespace name (*not* prefix), which in your suggestion will change with every (major?) new methods schema, and the NCName of the element itself. Anyone using modern namespace-aware parsing tools will find this an almighty pain. Just for example, every XSL template will break when a new schema is used, even if the template was written in such a way as to be robust against new, unknown elements. Likewise any new tools acting on a version 2 file will fail to parse a version 1 file even if the schemas are backwards compatible (as they really ought to be). There's a very good reason why, in recent years, almost every major XML-based technology has gone the version-number route rather than the new-namespace route. [re xsi:nil values:] > I'm beginning to see the problem: you are seeing the XML from a > database-centric point of view and > are treating the XML representation as a database result set which can > have a variable number of > selectable fields. 
I don't really think this is true. If it's database-centric to distinguish between knowledge of absence and absence of knowledge, then yes, we're taking a database-centric attitude. But that's as far as it goes. I for one think it's critically important that we can distinguish between the following three statements: - "this method is unnamed"; - "this method is named Little Bob"; and - "this method may or may not be named". If that's database-centric, so be it. > This is a choice that is very specific to this > application and I would contest has the > danger of making the resultant method definitions useless to another > application. How might this happen? Can you give me an example of how having the ability (but not the compulsion) to distinguish the above cases could possibly make the "resultant definitions useless to another application"? > I believe your application would function just as well with a generic > SQL to XML format (such as generated > by Oracle tools): > <resultset> > <record id="m123"> > <field name="methodname">Yorkshire</field> > <field name="stage">8</field> > ... > </record> > </resultset> Well, we *could* have done this, but for human consumption, I think I'd rather use a dataset that looked more like the one we've produced. > In fact I think this fits the nature of queries where fields are > selectable by name much better. Why? Conceptually, the list of fields is really just a collection of XPath filters applied to a complete document to determine which bits to return. (In fact, we optimise the process so that we don't generate unwanted data, but the principle is the same.) RAS |
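The point in the message above about namespace-based versioning can be seen with any namespace-aware parser. The sketch below uses Python's ElementTree purely as an illustration: the two namespace URNs are the ones from the quoted example, while the `<method/>` children and the helper `count_methods` are hypothetical. A lookup written against the version 1 namespace finds nothing in a version 2 document, even though the documents are otherwise identical.

```python
import xml.etree.ElementTree as ET

V1 = "urn:cccbr-org-uk:schemas/methods/1.0"
V2 = "urn:cccbr-org-uk:schemas/methods/2.0"


def count_methods(xml_text, namespace):
    """Count <method> children of the root that live in the given namespace."""
    root = ET.fromstring(xml_text)
    # An element's name is the pair (namespace name, local name);
    # ElementTree spells that pair as "{namespace}local".
    return len(root.findall(f"{{{namespace}}}method"))


doc_v1 = f'<methods xmlns="{V1}"><method/><method/></methods>'
doc_v2 = f'<methods xmlns="{V2}"><method/><method/></methods>'

print(count_methods(doc_v1, V1))  # both methods are found
print(count_methods(doc_v2, V1))  # none match: every element name has changed
```

With a version attribute on the root instead, the element names stay the same across versions, so the same lookup keeps working and only version-specific handling needs to branch.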
From: Gary H. <how...@nt...> - 2005-05-21 08:39:30
|
Richard Smith wrote: >>I suppose there's no harm in putting the backslash in to make it work. >> >> > >Agreed. I've now done this. > > Thanks. I think it might be a function of the Java regexp processor. Personally I think it's clearer with it in as it's obviously not part of a character range then. >I've put a suitable XLink schema here: > > http://www.ex-parrot.com/~richard/schemas/xlink.xsd > >Can you try using that one and tell us whether you still >have problems? > > The problem goes away. >>>Finding a standard schema for xlink was not easy either >>>so I would question its usefulness overall but that's a >>>debate for later as I have other more pressing comments. >>> >>> >There's a good reason for this. XLink does not (currently) >have a normative schema, but I don't see why this should be >an issue -- it's easy enough to provide one. > > Then why is the one that I found different? It ain't that easy to match the XLink "standard". BTW mine came from: http://schemas.opengis.net/gml/2.1.2/xlinks.xsd >Having said that, I'm not one of XLink's greatest fans, and >if you have an alternative suggestion, I'd be interested to >hear it. > > I'll be thinking about it. My first thoughts were to have links to Dove/Felstead etc. for the relevant info. >>>Wow, so many namespaces! On further anaylsis, those on the <method> tag >>>are redundant >>> >>> >>Yes, I know. It's an issue with DBIx::XMLServer. It's quite a long way >>down the list of priorities to fix, though, because the extra >>de >> >That said, we should aim to sort this out eventually. > > I'll leave you to wrestle with your own libraries: the Java ones aren't always that obvious either. >Martin has already responded to this, so I'm not going to, >except to say that all our search script currently puts in >the <meta> elt is a database timestamp. This is something >for which I can easily imagine wanting to query the >database. 
If I have a local (perhaps off-line) database, I >might want to regularly sync this to the server database. >Downloading just those methods changed since my previous >snapshot was created is an obvious way of doing this. > > See my comments to Martin but we're still in the database-centric view. A method definition schema should not be expected to support database synchronisation. To perform this task you should create a synchronisation document with timestamps etc. and include method definitions where required. Even so, for the example cited above, just a list of methods (no annotations) matching the criterion of being newer than a supplied timestamp would suffice. E.g. the HTTP request "get if newer" (or whatever it is) will have an HTML document returned if there is a newer one but the document itself does not contain a timestamp to say it is newer: why should method definitions be any different? Separation of concerns again. >And if you don't like having this in the output, you can >always set the 'fields' parameter to specify which fields >you want. > > http://methods.ringing.org/query.html#fields > >(Thinking about it, we might want to add a way of saying all >fields except those in a given list.) > > See comments to Martin. I think you're both missing the point with field selection. That's a SQL thing where all data has been flattened into a table: CSV would be just as good in this case. I want to see some structure in the results from an object-oriented viewpoint (this is after all a discussion list about a C++ library). >Yes. Basically we had a choice. Either we could describe >the <method> element as an <xsd:sequence>, in which case we >would have been allowed to put elements from other >namespaces there. Similarly we could have put the >performances and classification data directly there. (They >can't be in the current schema as they can occur multiple >times.) 
The cost of this flexibility is that we would have >had to specify an order for all of these elements, and the >XML would only have been valid if these elements were in the >right order. > > Nothing wrong with that. >The reason for this, as Martin says, is that an ><xsd:sequence> is effectively taken as a grammar for a >regular language, and keeping track of the number of times >elements have occurred rapidly becomes extremely difficult. >(The schema required grows combinatorially with the number >of elements.) > > I don't understand the combinatorial explosion. A sequence gives an "approved" order to elements which may be optional - what's the problem with that? >The alternative, which is what we've decided to do, is to >describe it with an <xsd:all> element which allows child >elements to occur at most once. It also does not allow >elements from other namespaces. We get around these >restrictions by having container elements, such as the ><meta>, <refs>, <performances> and <classification> >elements. I think we felt that this was preferable to having >an arbitrary order in which the child elements had to occur. > > On the other hand you now have arbitrary flexibility which can also cause problems. >Namespaces aren't referenced via entity references so I >don't see how this is relevant. (Or are you suggesting a >DTD that adds an implicit namespace declaration on the root >entity? If so, I think this would be a very bad idea.) > > No DTDs, we're using XML Schema after all! The DTD equivalent is the public id and they are used in exactly the manner I described (see the different versions of HTML all distinguished by the public id in the DOCTYPE declaration). >I assume you're referring to the XML Catalog-like techniques >that can be used to select a schema for a namespace. > > Yes. 
> This helps, but so long as you keep the XML Schema documents >backwards compatibile (which should be easy as the >conceptual schemas need to be backwards compatible), this is >a non-issue. Always using the most recent schema for the >namespace might result in lots of unnecessary schema for >unknown elements, but should otherwise be fine. > >Finally, it should be remembered that in many real-world >applications, you don't actually use the schema -- it's a >simply a piece of documentation on what is allowed in the >XML. The parser presumably already knows this. > > Depends on your parser and your document validation policy. The point being that the standard should help in a standard way those who wish to process multiple versions using standard techniques like XML Catalog. Those who wish to ignore validation do so at their own risk: their parser may or may not cope but that's no fault of the schemas. <snip> >Thinking further, we *are* inconsistent in our use of >xsi:nil, as method names are handled differently from >anything else. For a method name, there are three things >that I might want to convey in the XML: > > - the method is named, but has the null name (i.e. it is > Little Bob); > > - according to the database, the method is unnamed; or > > - it is unspecified whether the method is named. > >Currently, we use xsi:nil for the former case, ignore the >second case (we have no unnamed methods in the database at >the moment), and omit the element in the latter case. > >Really we should be able to distinguish all three cases. > > I'm still thinking about this one. Can you give me an example of the third case and how it is distinct from case 2? Gary. |
From: Gary H. <how...@nt...> - 2005-05-21 08:39:20
|
Martin Bright wrote:

> Gary,
>
> Welcome to ringing-lib-discussion, and congratulations on the first post for a long time.

Thanks.

> One thing to bear in mind with all of this is that the schema is meant to be of use to any applications exchanging method data, not just for our database.

This is the crux of some of my gripes: there's a lot that's geared to your database.

<snip>

> I think maybe the fault here lies with whatever schema you're using for XLink -- that should be a valid attribute declaration.
>
> Anyway, the whole XLink thing is a bit messy and probably not the best way of doing it. I wanted to be able to have performances either inline, or referred to in another document. If you can think of a better way to do it then we'd be pleased to hear it.

I got mine from here: http://schemas.opengis.net/gml/2.1.2/xlinks.xsd

I get the impression that these guys are quite hooked on xlink.

> I think what you're missing here is that IDs are very handy when you want to refer to a particular element from the outside. So maybe I want to put up a collection of my new unrung cyclic Royal principles on my web site; then you can refer to one of them as something like http://martins.web.site/cyclicroyal.xml#r3xx .
>
> The facts that the IDs are unique within the database, and that they're consistent from one document to another, are not relevant to the XML schema; but they do guarantee that they're also unique within any XML document that the database produces, which is what IDs are about.

I don't believe the XML standard says anything about IDs being consistent or unique across documents. That is an additional meaning you are ascribing to them and what's more you are saying that this is a property guaranteed by your database. But at the top of your reply you state the intention that the schema can be used across applications: this conflicts with the enhanced ID semantics you propose.
>> The second point is one of separation of concerns: the id attribute you propose relates to a reference in your database; it is not a fundamental attribute of a method.
>
> No, but historically it's been common for almost everything in any XML document to have an ID attribute which contains some arbitrary ID. This is an exception to the general rule that irrelevant data should be put in a different namespace. Again, the fact that the ID comes from the database is irrelevant: just think of it as providing a unique identifier for the method within that XML document.

Arbitrary IDs are fine, so long as they can be referenced within the document by an IDREF. I don't object to the ID coming from the database; I object to the notion that it is unique across documents. It is a generic attribute (many applications may need to reference method definitions with IDs) but they may not be able to guarantee uniqueness outside the document; why should your database have special privileges?

>> Having suggested this, I don't particularly like the <meta> tag being part of a method definition either. It is noise; it doesn't contain any information I would want to query a method database for. In a similar way, the <methods> tag as it currently stands also has "noisy" attributes with the db: namespace prefix. I think a mechanism for allowing a list of methods is required but it is more general and should not be encumbered with attributes for one specific query mechanism. This would allow other sites to provide lists of methods in a standard way.
>
> On the contrary, I think that being able to add extra information using other namespaces is one of the things that XML is very good at. It's done all the time in the various W3C applications: take the xsi:type attribute, for instance, or the xlink: attributes, or the xsl:version attribute in XSL.

True, but I think it can be done more elegantly.
>> I would also propose that the version attribute of the <methods> tag should be eliminated and version identification be incorporated into the namespace URI. This would make it very much easier to build a system that can parse either version 1 or version 2 documents using standard entity resolver parsing techniques.
>
> This is definitely a bad idea. It kills any chance of backwards compatibility: a version 1 parser has no chance at all of reading a version 2 document.

Rubbish! I imagine it would be very easy to construct a multi-version parser. For example, schema version 1 recognises methods in the form:

  <methods xmlns="urn:cccbr-org-uk:schemas/methods/1.0">
    ...
  </methods>

Sometime later version 2 is released and documents now look like this:

  <methods xmlns="urn:cccbr-org-uk:schemas/methods/2.0">
    ...
  </methods>

It is clear which schema is required by which document and many parsers today will handle this simply within their entity resolver. Where do you see a problem?

>> I think the simple absence of a tag indicating a nil value is much more intuitive and simpler to process.
>
> The point here is that we already have a meaning for the absence of an element. It means nothing at all. If I want to give you a list of methods but you're not interested in when they were first pealed, I shouldn't have to tell you: I just leave those elements out. Remember that when querying the database you can choose which result elements you want for each method.

I'm beginning to see the problem: you are seeing the XML from a database-centric point of view and are treating the XML representation as a database result set which can have a variable number of selectable fields. This is a choice that is very specific to this application and, I would contend, risks making the resultant method definitions useless to another application.
I believe your application would function just as well with a generic SQL to XML format (such as generated by Oracle tools):

  <resultset>
    <record id="m123">
      <field name="methodname">Yorkshire</field>
      <field name="stage">8</field>
      ...
    </record>
  </resultset>

In fact I think this fits the nature of queries where fields are selectable by name much better. I'm looking for a more object-oriented approach which I intend to detail in a follow-up posting.

> It's good to have somebody new taking an interest in what we're doing. Keep up the good work.

Don't worry there's more to come ;-)

Gary. |
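As a rough illustration of the multi-version parsing described in the message above, here is a minimal Python sketch. It assumes the two namespace URIs from Gary's example and uses only the standard xml.etree.ElementTree module; the names VERSIONS, split_tag and schema_version are hypothetical and not part of any existing ringing tool. It inspects only the root element's namespace to decide which schema version a document claims to follow.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the example in the message; mapping them to
# version labels is an assumption made for this sketch.
VERSIONS = {
    "urn:cccbr-org-uk:schemas/methods/1.0": "1.0",
    "urn:cccbr-org-uk:schemas/methods/2.0": "2.0",
}


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def schema_version(xml_text):
    """Return the version label implied by the root element's namespace."""
    root = ET.fromstring(xml_text)
    namespace, local = split_tag(root.tag)
    if local != "methods" or namespace not in VERSIONS:
        raise ValueError("unrecognised root element: %r" % root.tag)
    return VERSIONS[namespace]
```

A caller could use the returned label to pick a version-specific handler or schema file. Whether a version 1 reader should also attempt a version 2 document is the point under dispute in the thread, and the sketch does not settle it.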
From: Richard S. <ri...@ex...> - 2005-05-20 15:58:57
|
Martin Bright wrote:

> > [Error] method.xsd:78:36: InvalidRegex: Pattern value '(([-xX]|[A-HJ-NP-WYZa-hj-np-wyz0-9]+)\.?)*' is not a valid regular expression. The reported error was: ''-' is an invalid character range. Write '\-'.'.
>
> Many regex parsers would allow this. The relevant standard (Appendix F.1 of XML Schema Part 2) is contradictory:
>
> >  * The [, ], - and \ characters are not valid character ranges;

I'm inclined to assume that the inclusion of '-' in this list is a mistake. Given the third bullet point, it simply doesn't make sense. I've emailed the relevant W3C mailing list: let's see if they can clarify things.

> >  * The ^ character is only valid at the beginning of a ·positive character group· if it is part of a ·negative character group·
> >  * The - character is a valid character range only at the beginning or end of a ·positive character group·.
>
> I suppose there's no harm in putting the backslash in to make it work.

Agreed. I've now done this.

> > [Error] method.xsd:150:46: src-resolve: Cannot resolve the name 'xlink:type' to a(n) 'attribute declaration' component.
> > [Error] method.xsd:150:46: s4s-elt-invalid-content.1: The content of 'linkedPerformanceType' is invalid. Element 'attribute' is invalid, misplaced, or occurs too often.
>
> I think maybe the fault here lies with whatever schema you're using for XLink -- that should be a valid attribute declaration.

I've put a suitable XLink schema here:

  http://www.ex-parrot.com/~richard/schemas/xlink.xsd

Can you try using that one and tell us whether you still have problems?

> > Finding a standard schema for xlink was not easy either so I would question its usefulness overall but that's a debate for later as I have other more pressing comments.

There's a good reason for this. XLink does not (currently) have a normative schema, but I don't see why this should be an issue -- it's easy enough to provide one.
Having said that, I'm not one of XLink's greatest fans, and if you have an alternative suggestion, I'd be interested to hear it.

> > Wow, so many namespaces! On further analysis, those on the <method> tag are redundant
>
> Yes, I know. It's an issue with DBIx::XMLServer. It's quite a long way down the list of priorities to fix, though, because the extra declarations make no semantic difference to the document.

Sorting this is complicated. Whilst libxml2 has a clean_namespaces() method, this only removes duplicate namespace declarations, not unused ones. As there are no duplicate namespaces, this doesn't help. (Finding out which namespace declarations are unused is a difficult problem. How do you tell, in general, whether a text node or an attribute value contains a QName or something else sensitive to namespace bindings, such as an XPath expression?)

That said, we should aim to sort this out eventually. [Martin: The sql namespace can probably be suppressed with an exclude-result-prefixes attribute in xmlout.xsl.]

> > The second point is one of separation of concerns: the id attribute you propose relates to a reference in your database; it is not a fundamental attribute of a method.
>
> No, but historically it's been common for almost everything in any XML document to have an ID attribute which contains some arbitrary ID. This is an exception to the general rule that irrelevant data should be put in a different namespace.

Martin's comment has just reminded me of the W3C's xml:id specification.

  http://www.w3.org/TR/xml-id/

The suggestion is that future schemas should put IDs in the xml namespace. At the moment I think doing this will break more than it fixes, but we may wish to revisit this decision in the future. (In particular, the ability of parsers to correctly access fragment identifiers -- i.e. #mXXXX on a URL -- might break.)
> > Having suggested this, I don't particularly like the <meta> tag being part of a method definition either. It is noise; it doesn't contain any information I would want to query a method database for.

Martin has already responded to this, so I'm not going to, except to say that all our search script currently puts in the <meta> element is a database timestamp. This is something for which I can easily imagine wanting to query the database. If I have a local (perhaps off-line) database, I might want to regularly sync this to the server database. Downloading just those methods changed since my previous snapshot was created is an obvious way of doing this.

And if you don't like having this in the output, you can always set the 'fields' parameter to specify which fields you want.

  http://methods.ringing.org/query.html#fields

(Thinking about it, we might want to add a way of saying all fields except those in a given list.)

> Instead of the <meta> element, we would have liked to allow arbitrary elements from other namespaces as direct children of the <method> element. But it seems that XML Schema won't allow you to specify this - something to do with being a regular language. RAS or DFM can no doubt expand.

Yes. Basically we had a choice. Either we could describe the <method> element as an <xsd:sequence>, in which case we would have been allowed to put elements from other namespaces there. Similarly we could have put the performances and classification data directly there. (They can't be in the current schema as they can occur multiple times.) The cost of this flexibility is that we would have had to specify an order for all of these elements, and the XML would only have been valid if these elements were in the right order.

The reason for this, as Martin says, is that an <xsd:sequence> is effectively taken as a grammar for a regular language, and keeping track of the number of times elements have occurred rapidly becomes extremely difficult.
(The schema required grows combinatorially with the number of elements.)

The alternative, which is what we've decided to do, is to describe it with an <xsd:all> element which allows child elements to occur at most once. It also does not allow elements from other namespaces. We get around these restrictions by having container elements, such as the <meta>, <refs>, <performances> and <classification> elements. I think we felt that this was preferable to having an arbitrary order in which the child elements had to occur.

> > I would also propose that the version attribute of the <methods> tag should be eliminated and version identification be incorporated into the namespace URI. This would make it very much easier to build a system that can parse either version 1 or version 2 documents using standard entity resolver parsing techniques.

Namespaces aren't referenced via entity references so I don't see how this is relevant. (Or are you suggesting a DTD that adds an implicit namespace declaration on the root entity? If so, I think this would be a very bad idea.)

I assume you're referring to the XML Catalog-like techniques that can be used to select a schema for a namespace. This helps, but so long as you keep the XML Schema documents backwards compatible (which should be easy as the conceptual schemas need to be backwards compatible), this is a non-issue. Always using the most recent schema for the namespace might result in lots of unnecessary schema for unknown elements, but should otherwise be fine.

Finally, it should be remembered that in many real-world applications, you don't actually use the schema -- it's simply a piece of documentation on what is allowed in the XML. The parser presumably already knows this.

> There's a big difference between not telling you the date of the first peal and telling you that it definitely hasn't been pealed.
Of course, we're not actually saying it definitely hasn't been pealed, we're saying that our database has no knowledge of it being pealed. This, of course, is not an absolute statement, but is dependent on when the database was last updated (and our ability to update and query the database without cocking up, but let's ignore that).

Given that, it *might* make more sense to use an attribute in the db namespace, though given xsi:nil exists for exactly this sort of thing, it's sensible to use it. (We might want to add a global timestamp to signify when our database was synced against its upstream data source (currently the MC website).)

Thinking further, we *are* inconsistent in our use of xsi:nil, as method names are handled differently from anything else. For a method name, there are three things that I might want to convey in the XML:

 - the method is named, but has the null name (i.e. it is Little Bob);
 - according to the database, the method is unnamed; or
 - it is unspecified whether the method is named.

Currently, we use xsi:nil for the first case, ignore the second case (we have no unnamed methods in the database at the moment), and omit the element in the third case.

Really we should be able to distinguish all three cases.

RAS |
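To make the distinction between "element omitted", "element present with xsi:nil" and "element with a value" concrete, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The namespace URIs are the ones shown elsewhere in this thread; the function name field_state and the state labels are hypothetical, chosen only for illustration, and how each state should be interpreted is exactly what the thread is still debating.

```python
import xml.etree.ElementTree as ET

# Prefix map for ElementTree path lookups; URIs as quoted in the thread.
NS = {
    "m": "http://methods.ringing.org/NS/method",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}
NIL_ATTR = "{%s}nil" % NS["xsi"]


def field_state(method, path):
    """Classify a field of a <method> element.

    Returns one of (labels are assumptions for this sketch):
      ("unspecified", None) - the element is absent from the document
      ("nil", None)         - the element is present with xsi:nil="true"
      ("value", text)       - the element is present with ordinary content
    """
    elem = method.find(path, NS)
    if elem is None:
        return ("unspecified", None)
    # xsi:nil is an xsd:boolean, so both "true" and "1" mean nil.
    if elem.get(NIL_ATTR) in ("true", "1"):
        return ("nil", None)
    return ("value", elem.text or "")
```

With this shape a consumer handles all three states in one place, rather than special-casing xsi:nil wherever a field is read.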
From: Martin B. <ma...@bo...> - 2005-05-20 12:10:51
|
Gary,

Welcome to ringing-lib-discussion, and congratulations on the first post for a long time. Some of the things you raise have already been gone over here, but I'll go through them again. I'm sure others will add their own comments.

One thing to bear in mind with all of this is that the schema is meant to be of use to any applications exchanging method data, not just for our database.

> [Error] method.xsd:78:36: InvalidRegex: Pattern value '(([-xX]|[A-HJ-NP-WYZa-hj-np-wyz0-9]+)\.?)*' is not a valid regular expression. The reported error was: ''-' is an invalid character range. Write '\-'.'.

Many regex parsers would allow this. The relevant standard (Appendix F.1 of XML Schema Part 2) is contradictory:

>  * The [, ], - and \ characters are not valid character ranges;
>  * The ^ character is only valid at the beginning of a ·positive character group· if it is part of a ·negative character group·
>  * The - character is a valid character range only at the beginning or end of a ·positive character group·.

I suppose there's no harm in putting the backslash in to make it work.

> [Error] method.xsd:150:46: src-resolve: Cannot resolve the name 'xlink:type' to a(n) 'attribute declaration' component.
> [Error] method.xsd:150:46: s4s-elt-invalid-content.1: The content of 'linkedPerformanceType' is invalid. Element 'attribute' is invalid, misplaced, or occurs too often.

I think maybe the fault here lies with whatever schema you're using for XLink -- that should be a valid attribute declaration.

Anyway, the whole XLink thing is a bit messy and probably not the best way of doing it. I wanted to be able to have performances either inline, or referred to in another document. If you can think of a better way to do it then we'd be pleased to hear it.
> Running a query returns a document with the following structure:
>
> <methods xmlns="http://methods.ringing.org/NS/method"
>          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>          xmlns:db="http://methods.ringing.org/NS/database"
>          version="0.1" db:page="0" db:pagesize="100" db:rows="7">
>   <method xmlns:a="http://methods.ringing.org/NS/method"
>           xmlns:default="http://methods.ringing.org/NS/method"
>           xmlns:sql="http://boojum.org.uk/NS/XMLServer" id="m15469">
>     <name>Yorkshire</name>
>     ...
>     <meta>
>       <db:timestamp>2005-05-12T13:24:54</db:timestamp>
>     </meta>
>   </method>
>   ....
>
> Wow, so many namespaces! On further analysis, those on the <method> tag are redundant

Yes, I know. It's an issue with DBIx::XMLServer. It's quite a long way down the list of priorities to fix, though, because the extra declarations make no semantic difference to the document.

> as the document will parse without them. However I believe the use of the id attribute is incorrect.

I think what you're missing here is that IDs are very handy when you want to refer to a particular element from the outside. So maybe I want to put up a collection of my new unrung cyclic Royal principles on my web site; then you can refer to one of them as something like http://martins.web.site/cyclicroyal.xml#r3xx .

The facts that the IDs are unique within the database, and that they're consistent from one document to another, are not relevant to the XML schema; but they do guarantee that they're also unique within any XML document that the database produces, which is what IDs are about.

> The second point is one of separation of concerns: the id attribute you propose relates to a reference in your database; it is not a fundamental attribute of a method.

No, but historically it's been common for almost everything in any XML document to have an ID attribute which contains some arbitrary ID.
This is an exception to the general rule that irrelevant data should be put in a different namespace. Again, the fact that the ID comes from the database is irrelevant: just think of it as providing a unique identifier for the method within that XML document.

> Having suggested this, I don't particularly like the <meta> tag being part of a method definition either. It is noise; it doesn't contain any information I would want to query a method database for. In a similar way, the <methods> tag as it currently stands also has "noisy" attributes with the db: namespace prefix. I think a mechanism for allowing a list of methods is required but it is more general and should not be encumbered with attributes for one specific query mechanism. This would allow other sites to provide lists of methods in a standard way.

On the contrary, I think that being able to add extra information using other namespaces is one of the things that XML is very good at. It's done all the time in the various W3C applications: take the xsi:type attribute, for instance, or the xlink: attributes, or the xsl:version attribute in XSL.

Instead of the <meta> element, we would have liked to allow arbitrary elements from other namespaces as direct children of the <method> element. But it seems that XML Schema won't allow you to specify this - something to do with being a regular language. RAS or DFM can no doubt expand.

If other applications are to use this schema, then I think it's important to make it as extensible as possible. Any application can come along and read the method data, plus data from any other namespaces that it understands, and ignore the rest.

> I would also propose that the version attribute of the <methods> tag should be eliminated and version identification be incorporated into the namespace URI.
> This would make it very much easier to build a system that can parse either version 1 or version 2 documents using standard entity resolver parsing techniques.

This is definitely a bad idea. It kills any chance of backwards compatibility: a version 1 parser has no chance at all of reading a version 2 document.

> I think the simple absence of a tag indicating a nil value is much more intuitive and simpler to process.

The point here is that we already have a meaning for the absence of an element. It means nothing at all. If I want to give you a list of methods but you're not interested in when they were first pealed, I shouldn't have to tell you: I just leave those elements out. Remember that when querying the database you can choose which result elements you want for each method. There's a big difference between not telling you the date of the first peal and telling you that it definitely hasn't been pealed.

We're using the xsi:nil attribute for what it's designed for. It's really not too much of a hassle referring to it in XSL: have a look at how the web page works if you want to see.

I hope I've answered your questions. If I haven't, maybe somebody else will clarify some of what I've said.

It's good to have somebody new taking an interest in what we're doing. Keep up the good work.

Martin |
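The "read what you understand and ignore the rest" approach to extensibility mentioned in the message above can be sketched roughly as follows, in Python with the standard xml.etree.ElementTree module. The helper names namespace_of and understood_elements are hypothetical; the namespace URIs in the usage note are the ones quoted in this thread.

```python
import xml.etree.ElementTree as ET


def namespace_of(elem):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = elem.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def understood_elements(root, understood):
    """Yield, in document order, every element at or below `root` whose
    namespace URI is in the set `understood`; other elements are skipped."""
    for elem in root.iter():
        if namespace_of(elem) in understood:
            yield elem
```

An application that only knows http://methods.ringing.org/NS/method would pass a set containing just that URI and never see <db:timestamp>. One that also understands http://methods.ringing.org/NS/database would add that URI and receive both.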
From: Gary H. <how...@nt...> - 2005-05-20 00:07:44
|
Hello there.

The recent posting to change-ringers about methods.ringing.org has prompted me to join the fray regarding the use of XML in computational campanology. My personal bias is toward the use of Java (and my Java ringing class library) and XSLT processing of methods data. A quick email to Martin Bright came up with the suggestion that I get things going here. So here goes...

Using the Xerces Java parser (version 2.6.2) and the current method.xsd schema with full validation turned on I get the following schema errors:

[Error] method.xsd:78:36: InvalidRegex: Pattern value '(([-xX]|[A-HJ-NP-WYZa-hj-np-wyz0-9]+)\.?)*' is not a valid regular expression. The reported error was: ''-' is an invalid character range. Write '\-'.'.
[Error] method.xsd:150:46: src-resolve: Cannot resolve the name 'xlink:type' to a(n) 'attribute declaration' component.
[Error] method.xsd:150:46: s4s-elt-invalid-content.1: The content of 'linkedPerformanceType' is invalid. Element 'attribute' is invalid, misplaced, or occurs too often.

These can be resolved by modifying the following elements:

<simpleType name="pnType">
  <restriction base="xsd:string">
    <pattern value="(([\-xX]|[A-HJ-NP-WYZa-hj-np-wyz0-9]+)\.?)*"/>
                       ^ add this escape character
  </restriction>
</simpleType>

<complexType name="linkedPerformanceType">
  <complexContent>
    <extension base="m:performanceType">
      <attribute ref="xlink:show" default="none"/>
                      ^ replace path with a known xlink reference
    </extension>
  </complexContent>
</complexType>

This second case is a bit of a fudge just to get rid of any schema errors. It's not clear what the xlink:type is trying to achieve either. Finding a standard schema for xlink was not easy either so I would question its usefulness overall but that's a debate for later as I have other more pressing comments.
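As a quick sanity check on the escaped pattern above, here is a minimal Python sketch that tries it against a few place-notation strings. It rests on two assumptions: Python's re module is not the XML Schema regex dialect (for this particular pattern the two should agree), and re.fullmatch is used to mimic the implicit whole-string anchoring of XSD patterns. The names PN_PATTERN and is_valid_place_notation are hypothetical.

```python
import re

# The pnType pattern from method.xsd with the hyphen escaped, as proposed.
# Letters I, O and X are excluded from the bell-symbol class; x/X and '-'
# stand for a cross change.
PN_PATTERN = re.compile(r"(([\-xX]|[A-HJ-NP-WYZa-hj-np-wyz0-9]+)\.?)*")


def is_valid_place_notation(text):
    """Return True if the whole string matches the pnType pattern."""
    return PN_PATTERN.fullmatch(text) is not None
```

Because the pattern is a starred group, the empty string is accepted as well. Whether that is desirable is a schema design question the sketch does not answer.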
Running a query returns a document with the following structure:

<methods xmlns="http://methods.ringing.org/NS/method"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:db="http://methods.ringing.org/NS/database"
         version="0.1" db:page="0" db:pagesize="100" db:rows="7">
  <method xmlns:a="http://methods.ringing.org/NS/method"
          xmlns:default="http://methods.ringing.org/NS/method"
          xmlns:sql="http://boojum.org.uk/NS/XMLServer" id="m15469">
    <name>Yorkshire</name>
    ...
    <meta>
      <db:timestamp>2005-05-12T13:24:54</db:timestamp>
    </meta>
  </method>
  ....

Wow, so many namespaces! On further analysis, those on the <method> tag are redundant as the document will parse without them. However I believe the use of the id attribute is incorrect. The supporting text for the schema states:

"This contains a unique ID for the method within the database."

My understanding of XML ID attributes (which pre-date XML Schema) is supported by the following quote from the XML Schema definition: "the scope of an ID is fixed to be the whole document".

My first point here is that if you wish to have a unique database reference it should not be of type ID, simply a number or whatever, as its scope goes beyond the document generated as the results of a query. I have used IDs (and corresponding IDREFs) in those cases where a relationship is required to be expressed between two elements in an XML document that doesn't fall into the standard hierarchical model provided by the document (e.g. networks of nodes and the links between them). In a ringing context I could see them being used in a document that contains method definitions and touches. Several touches could "point" to the same method definition (via an ID) using an IDREF. In such a case (as in a network definition) the actual value of the ID is irrelevant and can be generated on the fly for that document instance. The only constraint is that all IDs are unique within that document.
For me the rule of thumb is: don't use an ID if you don't have a corresponding IDREF to refer to it.

The second point is one of separation of concerns: the id attribute you propose relates to a reference in your database; it is not a fundamental attribute of a method. I therefore believe it should be replaced by a more general mechanism allowing a method definition to include an annotation for a database-specific id, for example within the <meta> tag.

Having suggested this, I don't particularly like the <meta> tag being part of a method definition either. It is noise; it doesn't contain any information I would want to query a method database for. In a similar way, the <methods> tag as it currently stands also has "noisy" attributes with the db: namespace prefix. I think a mechanism for allowing a list of methods is required but it is more general and should not be encumbered with attributes for one specific query mechanism. This would allow other sites to provide lists of methods in a standard way. If you feel that query information is necessary then it should be provided by a separate wrapper tag:

<db:results xmlns:db="http://methods.ringing.org/NS/database"
            db:page="0" db:pagesize="100" db:rows="7">
  <methods xmlns="urn:cccbr-org-uk:methods-1.0">
    <method>
      ...
    </method>
  </methods>
</db:results>

I would also propose that the version attribute of the <methods> tag should be eliminated and version identification be incorporated into the namespace URI. This would make it very much easier to build a system that can parse either version 1 or version 2 documents using standard entity resolver parsing techniques. Over the years my preference has evolved towards the use of URNs as URIs and not URLs, as with the latter parsers have been known to hit the named site when trying to resolve references. URNs make it clear that they are just a formal naming convention. The example above brings some of these suggestions together.

The schema currently allows "nillable" fields. E.g.

"<firsthand xsi:nil="true"/> indicates that the method has never been rung to a peal on handbells."

I have never needed to use this feature as the three common states for a field are easily covered in the following much simpler syntax:

  <tag>value</tag>       - a value is present for "tag"
  <tag></tag> or <tag/>  - an empty value exists for "tag" (e.g. zero length string)
  no tag element         - "tag" does not exist (i.e. is nil)

This keeps the generated document much simpler to read for humans and to parse in programs: there is no need to declare the xsi namespace (which clutters the document); absent fields are just that, absent, keeping the document readable; and when parsing something like <firsthand xsi:nil="true"/> I would have to write extra code in a (SAX) parser to spot the xsi:nil attribute as a special case and process the tag in a completely different manner to what I would do if it read <firsthand date="2005-05-19"/>. For me, this last case is the most compelling reason for not using nillable fields.

Also I have just done some experiments with XPath expressions in an XSLT stylesheet, where I selected all <firsthand> nodes for processing [e.g. method/performances/firsthand], and found the results included those with xsi:nil="true" as well as those without. This means extra coding would have to be added to stylesheets if nil fields were to be ignored and it would have to be done for each nillable field. I think the simple absence of a tag indicating a nil value is much more intuitive and simpler to process.

Here endeth the first lesson. I hope I haven't preached too much but I fear another sermon is likely to follow regarding some aspects of method definition but I wanted to get the discussion going on these aspects first.

Regards,
Gary Howard |
From: Gary H. <how...@nt...> - 2005-05-20 00:05:25
|
Hello there. The recent posting to change-ringers about methods.ringing.org has prompted me to join fray regarding the use of XML in computational campanology. My personal bias is toward the use of Java (and my Java ringing class library) and XSLT processing of methods data. A quick email to Martin Bright came up with the suggestion that I get things going here. So here goes... Using the Xerces Java parser (version 2.6.2) and the current method.xsd schema with full validation turned on I get the following schema errors: [Error] method.xsd:78:36: InvalidRegex: Pattern value '(([-xX]|[A-HJ-NP-WYZa-hj-np-wyz0-9]+)\.?)*' is not a valid regular expression. The reported error was: ''-' is an invalid character range. Write '\-'.'. [Error] method.xsd:150:46: src-resolve: Cannot resolve the name 'xlink:type' to a(n) 'attribute declaration' component. [Error] method.xsd:150:46: s4s-elt-invalid-content.1: The content of 'linkedPerformanceType' is invalid. Element 'attribute' is invalid, misplaced, or occurs too often. Which can be resolved by modifying the following elements: <simpleType name="pnType"> <restriction base="xsd:string"> <pattern value="(([\-xX]|[A-HJ-NP-WYZa-hj-np-wyz0-9]+)\.?)*"/> ^add this escape character </restriction> </simpleType> <complexType name="linkedPerformanceType"> <complexContent> <extension base="m:performanceType"> <attribute ref="xlink:show" default="none"/> ^replace path with a known xlink reference </extension> </complexContent> </complexType> This second case is a bit of a fudge just to get rid of any schema errors. It's not clear what the xlink:type is trying to achieve either. Finding a standard schema for xlink was not easy either so I would question its usefulness overall but that's a debate for later as I have other more pressing comments. 
Running a query returns a document with the following structure: <methods xmlns="http://methods.ringing.org/NS/method" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:db="http://methods.ringing.org/NS/database" version="0.1" db:page="0" db:pagesize="100" db:rows="7"> <method xmlns:a="http://methods.ringing.org/NS/method" xmlns:default="http://methods.ringing.org/NS/method" xmlns:sql="http://boojum.org.uk/NS/XMLServer" id="m15469"> <name>Yorkshire</name> ... <meta> <db:timestamp>2005-05-12T13:24:54</db:timestamp> </meta> </method> .... Wow, so many namespaces! On further anaylsis, those on the <method> tag are redundant as the document will parse without them. However I believe the use of the id attribute is incorrect. The supporting text for the schema states: "This contains a unique ID for the method within the database." My understanding of XML ID elements (which pre-date XML schema) is supported by the following quote from the XML Schema definition: "the scope of an ID is fixed to be the whole document". My first point here is that if you wish to have a unique database reference it should not be of type ID, simply a number or whatever as its scope goes beyond the document generated as the results of a query. I have used IDs (and corresponding IDREFs) in those cases where a relationship is required to be expressed between two elements in an XML document that doesn't fall into the standard hierarchical model provided by the document (e.g. networks of nodes and the links between them). In a ringing context I could see them being used in a document that contains method definitions and touches. Several touches could "point" to the same method definition (via an ID) using an IDREF. In such a case (as in a network definition) the actual value of the ID is irrelevant and can be generated on the fly for that document instance. The only constraint being that all IDs are unique within that document. 
For me the rule of thumb is: don't use an ID if you don't have a corresponding IDREF to refer to it. The second point is one of separation of concerns: the id attribute you propose relates to a reference in your database; it is not a fundamental attribute of a method. I therefore believe it should be replaced by a more general mechanism allowing a method definition to include an annotation for a database-specific id, for example within the <meta> tag. Having suggested this, I don't particularly like the <meta> tag being part of a method definition either. It is noise; it doesn't contain any information I would want to query a method database for. In a similar way, the <methods> tag as it currently stands also has "noisy" attributes with the db: namespace prefix. I think a mechanism for allowing a list of methods is required but it is more general and should not be encumbered with attributes for one specific query mechanism. This would allow other sites to provide lists of methods in a standard way. If you feel that query information is necessary then it should be provided by a separate wrapper tag: <db:results xmlns:db="http://methods.ringing.org/NS/database" db:page="0" db:pagesize="100" db:rows="7"> <methods xmlns="urn:cccbr-org-uk:methods-1.0"> <method> ... </method> </methods> </db:results> I would also propose that the version attribute of the <methods> tag should be eliminated and version identification be incorporated into the namespace URI. This would make it very much easier to build a system that can parse either version 1 or version 2 documents using standard entity resolver parsing techniques. Over the years my preference has evolved towards the use of URNs as URIs and not URLs, as with the latter parsers have been known to hit the named site when trying to resolve references. URNs make it clear that they are just a formal naming convention. The example above brings some of these suggestions together. The schema currently allows "nillable" fields. E.g. 
"<firsthand xsi:nil="true"/> indicates that the method has never been rung to a peal on handbells." I have never needed to use this feature as the three common states for a field are easily covered in the following much simpler syntax: <tag>value</tag> - a value is present for "tag" <tag></tag> or <tag/> - an empty value exists for "tag" (e.g. zero-length string) no tag element - "tag" does not exist (i.e. is nil) This keeps the generated document much simpler to read for humans and to parse in programs: there is no need to declare the xsi namespace (which clutters the document); absent fields are just that, absent, keeping the document readable; and when parsing something like <firsthand xsi:nil="true"/> I would have to write extra code in a (SAX) parser to spot the xsi:nil attribute as a special case and process the tag in a completely different manner to what I would do if it read <firsthand date="2005-05-19"/>. For me, this last case is the most compelling reason for not using nillable fields. Also I have just done some experiments with XPath expressions in an XSLT stylesheet: where I selected all <firsthand> nodes for processing [e.g. method/preformances/firsthand] and found the results included those with xsi:nil="true" as well as those without. This means extra coding would have to be added to stylesheets if nil fields were to be ignored and it would have to be done for each nillable field. I think the simple absence of a tag indicating a nil value is much more intuitive and simpler to process. Here endeth the first lesson. I hope I haven't preached too much but I fear another sermon is likely to follow regarding some aspects of method definition but I wanted to get the discussion going on these aspects first. Regards, Gary Howard |
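The extra work that nillable fields cause for a consumer can be seen in a small Python sketch. The sample document and the `performances` wrapper element are assumptions made for this example, not taken from the real query output. Selecting every `<firsthand>` node returns the nil ones too, so a second filtering step is needed; with the "absent means nil" convention the first selection would already be the answer.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical sample: one nil field, one real value, one absent field.
DOC = """<methods xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <method><performances><firsthand xsi:nil="true"/></performances></method>
  <method><performances><firsthand date="2005-05-19"/></performances></method>
  <method><performances/></method>
</methods>"""


def select_firsthand(xml_text):
    """Return (all firsthand nodes, only those not marked xsi:nil)."""
    root = ET.fromstring(xml_text)
    all_nodes = root.findall("./method/performances/firsthand")
    # Extra step needed only because of nillable fields.
    real = [n for n in all_nodes if n.get(f"{{{XSI}}}nil") != "true"]
    return all_nodes, real


if __name__ == "__main__":
    all_nodes, real = select_firsthand(DOC)
    print(len(all_nodes), len(real))
```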
From: Richard S. <ri...@ex...> - 2004-04-01 22:52:53
|
Martin Bright wrote: > > Do you want anything more anonymous such as webmaster@ ? > > Isn't every domain with a web server meant to have a webmaster? Sounds > like a good idea. I'm happy to have it pointing to me, or you can take it > if you want. You've got it. RAS |
From: Martin B. <mjb...@li...> - 2004-04-01 14:39:02
|
--On 01 April 2004 12:55 +0100 Richard Smith <ri...@ex...> wrote: > I've set up forwarders: > > ma...@me... -> ma...@bo... > ri...@me... -> ri...@ex... Thanks very much. > Do you want anything more anonymous such as webmaster@ ? Isn't every domain with a web server meant to have a webmaster? Sounds like a good idea. I'm happy to have it pointing to me, or you can take it if you want. Martin -- Martin Bright Department of Mathematical Sciences, University of Liverpool |
From: Richard S. <ri...@ex...> - 2004-04-01 11:55:52
|
Martin Bright wrote: > Richard, would it be possible to set up an email address at > methods.ringing.org so I don't have to put my own one on the web pages? I've set up forwarders: ma...@me... -> ma...@bo... ri...@me... -> ri...@ex... Do you want anything more anonymous such as webmaster@ ? And does anyone else (Don?) want a similar forwarder? Richard |
From: Martin B. <mjb...@li...> - 2004-03-29 13:43:06
|
The web page has been mended now. Martin -- Martin Bright Department of Mathematical Sciences, University of Liverpool |
From: Martin B. <mjb...@li...> - 2004-03-26 12:34:19
|
I've just put up a new version of the methods.ringing.org web site, which I will keep working on. It's very incomplete at the moment, but the general framework seems to work. As always, comments are welcome. Richard, would it be possible to set up an email address at methods.ringing.org so I don't have to put my own one on the web pages? As regards the database, there are a few things to be sorted out before it's ready for public use: - Lead heads don't work properly at the moment, but I know how to mend them. - Searching by place notation doesn't work. This is because, in order to search by place notation, you also need to specify a number of bells. That's not a problem, but it means a fairly major change to DBIx::XMLServer which I will do sometime. - The database is not being kept up to date. Richard and I are working on this. If anybody has any bright ideas, we'd like to hear them. - I want to make the web page front end much better. That shouldn't be too hard. At the moment we have client interfaces in Perl and C++, both of which need documenting. What other interfaces would be used by software developers? Maybe I should ask on the new ringing software list. For web-based applications I expect that both Java and PHP would be useful. The database is at a stage where software developers can start writing their software to use it. As soon as we have a way of keeping it up to date, it will be fully usable. So - which is going to be the first major ringing application to use our database? Martin -- Martin Bright Department of Mathematical Sciences, University of Liverpool |
From: Martin B. <mjb...@li...> - 2004-01-14 11:10:25
|
I've made some changes, not yet uploaded, to the script which populates the database. And to my copy of the database itself. I've changed the way lead heads are handled. There are two columns, methodLeadHeadCode and methodLeadHead. If the lead head code is non-NULL and the stage is at most 12, then the lead head can be NULL because we can get it from another table. Otherwise the lead head is stored. The only bit of this I haven't done is computing the lead head from the code when the stage is greater than 12; Perl code to do this would be gratefully received. I've changed the program to put place notation in `maximal' format, i.e. with dots between all changes and with external places present. This is part of a little Ringing::PlaceNotation module which I see going into a general ringing utility package. This module will also be needed by the Perl library client module which I've almost written. I've changed the database by adding a methodIsDifferential column. What should methodClass be set to for Differentials (as opposed to Differential Hunters)? I've put in a process for updating existing methods rather than just adding them as new ones. If a method in the input file has the same place notation as one already in the database, it's considered to be the same method. The new data overwrites the old. Otherwise it's considered to be a new method. This could be easily used to keep the database up to date in conjunction with diff. Now for some questions. At the moment we store place notation as two symmetrical portions if it's symmetrical about the lead end. That's what people expect. We store it as one long thing if it isn't, even if it is symmetrical about somewhere else. Storing a symmetrical method as half the place notation and a lead end accomplishes two things: it indicates that the method is symmetrical, and it halves the amount of place notation. Should we extend this to methods symmetrical about somewhere else? 
Such a method can always be stored as two symmetrical blocks of place notation. It would not be hard to write code to look for symmetry. Note that questions about how we store data should always be considered separately from how we pass it out through the XML interface. I think that, even if we turn symmetrical methods back into one long thing when giving them to the user, there could be some use in at least storing them as two symmetrical parts. On the other hand, suppose that a user wants to search for all methods, symmetrical or not, which are Cambridge above. If we store symmetrical methods differently from asymmetrical methods then we have to do two queries rather than one. Next, searching for a method up to rotation. This could be useful. One way to accomplish it would be to store, for each method, some unique rotation of its place notation such as the least in lexicographic order. Should I implement this? Incidentally I haven't come up with a good algorithm for finding the lexicographically least rotation of a sequence, so if anybody out there knows one then please let me know. I've also put some extra code into DBIx::XMLServer so that you can get out information from a query like what page it was, the total number of rows if the LIMIT weren't there, the actual SQL query string. Useful if I want my stylesheet which generates HTML to be able to say "page 3 of 5" and have links to other pages. Martin -- Martin Bright Department of Mathematical Sciences, University of Liverpool |
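On the question of finding the lexicographically least rotation: one standard linear-time approach keeps two candidate start positions and discards whichever loses the comparison. The Python sketch below is an illustration only, not code from the database scripts; it treats the place notation as a list of changes, and the function names are made up for the example.

```python
def least_rotation_index(seq):
    """Index at which the lexicographically least rotation of seq starts.

    Two candidate start positions i and j are compared k items at a
    time; the loser skips past the compared prefix, so the total work
    is linear in len(seq).
    """
    n = len(seq)
    i, j, k = 0, 1, 0
    while i < n and j < n and k < n:
        a = seq[(i + k) % n]
        b = seq[(j + k) % n]
        if a == b:
            k += 1
            continue
        if a > b:
            i += k + 1   # rotation starting at i cannot be the least
        else:
            j += k + 1   # rotation starting at j cannot be the least
        if i == j:
            j += 1
        k = 0
    return min(i, j)


def least_rotation(seq):
    """Return the least rotation itself, as the same sequence type."""
    start = least_rotation_index(seq)
    return seq[start:] + seq[:start]


if __name__ == "__main__":
    changes = ["x", "14", "x", "36"]   # one change per list item
    print(least_rotation(changes))
```

Comparing whole changes as strings gives an arbitrary but consistent order, which is all a canonical form for rotation searches needs.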
From: Richard S. <ri...@ex...> - 2003-12-10 17:21:38
|
Martin Bright wrote: > The problem, as Don said ages ago, is to define what should be `the same' > method. We have an ID for each method, and I think these should be > persistent. Should the ID pertain to the name or to the place notation? I > think it should probably belong to the place notation, since I also think > that we should potentially allow unnamed methods to appear (for example, > the whole TDMM collection). But then the whole problem of comparing place > notations arises. That problem is going to exist anyway -- people will need to search by place notation -- so we might as well get it right now. I agree with you: the ID should pertain to the place notation. Names can be changed for a number of reasons -- reclassification, and avoiding clashes with extensions being two. Place notations should only ever get changed to correct mistakes. > We probably just need to make a slightly arbitrary decision and stick to it. Agreed. This will inevitably get the occasional thing wrong, but either we can manually sort that out, or just ignore it. For instance, if we go with IDs pertaining to place notations, this would have been broken when the CC collection corrected the place notation for Reverse Tendring Doubles a few months ago. However, such corrections are likely to be few and far between. Richard |
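The rule discussed here, that a method's ID follows its place notation while the name may change, could be sketched as an update step like the one below. This is a minimal illustration using Python's sqlite3 module with a made-up table layout; the real database schema and update script are not shown in this thread, and the notation is assumed to be already normalised to one storage format before comparison.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical methods table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE methods ("
        "id INTEGER PRIMARY KEY, name TEXT, stage INTEGER, notation TEXT)"
    )
    return conn


def upsert_method(conn, name, stage, notation):
    """Insert a method, or overwrite the data of an existing one.

    Identity is (stage, notation): the same place notation on the same
    number of bells keeps its existing ID even if the name has changed.
    """
    row = conn.execute(
        "SELECT id FROM methods WHERE stage = ? AND notation = ?",
        (stage, notation),
    ).fetchone()
    if row is not None:
        conn.execute("UPDATE methods SET name = ? WHERE id = ?", (name, row[0]))
        return row[0]
    cur = conn.execute(
        "INSERT INTO methods (name, stage, notation) VALUES (?, ?, ?)",
        (name, stage, notation),
    )
    return cur.lastrowid


if __name__ == "__main__":
    conn = make_db()
    first = upsert_method(conn, "Old Name", 5, "5.1.5.1.5")
    second = upsert_method(conn, "New Name", 5, "5.1.5.1.5")
    print(first == second)
```

A corrected place notation would get a fresh ID under this rule, which is the occasional breakage accepted above.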
From: Martin B. <mjb...@li...> - 2003-12-10 12:49:13
|
Here are my thoughts on how we should handle place notation. 1. The XML spec should allow place notation in pretty much any format that's going. 2. Our particular generated XML should have a small, readable format - I suggest minimal dots, no external places. It has either one <block> element or two <symblock> elements. 3. The data in the database, on the other hand, should be in a good format for machine searching. Nobody ever sees this except us. It should have maximal dots and include external places. At the moment I believe we store symmetric place notation only for methods which are symmetric about the lead end - maybe we could change this. The symmetry thing is one area where it would be good to have the Method Committee's decision on how place notation will appear in the final XML spec. 4. The front end should allow flexible searching of place notation. We clearly want to be able to search for a given place notation, ideally in a fairly flexible format. It would be nice if we could search for particular place notations above/below the treble, or maybe just 'on the front' or 'on the back'. It would be very nice if we could search for all rotations of a particular place notation. 5. It would be easy to write Perl routines to convert place notation to either the `minimal' or `maximal' formats described above. 6. Maybe we should also store place notation in binary for methods on up to 32 bells. Then we could write some very cunning search algorithms. Martin -- Martin Bright Department of Mathematical Sciences, University of Liverpool |
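The `maximal' format from point 3 (a dot between every pair of changes, external places written out) can be sketched in Python rather than Perl as below. This is an illustration under stated assumptions, not the project's conversion routine: it only handles stages up to 12 with the bell symbols 1234567890ET, writes every cross as a lowercase x, and the function names are made up for the example.

```python
BELLS = "1234567890ET"  # assumption: stages up to 12 only


def split_changes(pn):
    """Split place notation into a list of changes ('x' for a cross)."""
    changes, current = [], ""
    for ch in pn:
        if ch in "xX-":
            if current:
                changes.append(current)
                current = ""
            changes.append("x")
        elif ch == ".":
            if current:
                changes.append(current)
                current = ""
        else:
            current += ch.upper()
    if current:
        changes.append(current)
    return changes


def add_external_places(change, stage):
    """Write out implied places at lead and at the back of one change."""
    if change == "x":
        return change
    places = sorted({BELLS.index(c) + 1 for c in change})
    if places[0] % 2 == 0:                 # odd number of bells below: 1 is implied
        places.insert(0, 1)
    if (stage - places[-1]) % 2 == 1:      # odd number of bells above: last place implied
        places.append(stage)
    return "".join(BELLS[p - 1] for p in places)


def to_maximal(pn, stage):
    """Convert place notation to dots-everywhere, external-places form."""
    return ".".join(add_external_places(c, stage) for c in split_changes(pn))


if __name__ == "__main__":
    print(to_maximal("x3x4x25x36x4x5x6x7", 8))
```

With every change delimited by dots, a substring search on the stored form cannot accidentally match across a change boundary.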
From: Martin B. <ma...@bo...> - 2003-12-10 12:36:47
|
I think that the most important step now is to get the database up to date and implement some way of keeping it so. We did talk about this a long time ago. Adding new rows to the database is straightforward. What needs some thought is what we do when some of the information has changed. Probably the most common case is when a method gets rung for the first time on handbells, or something trivial like that, in which case it's obvious what we do. But what if the name or place notation changes? The problem, as Don said ages ago, is to define what should be `the same' method. We have an ID for each method, and I think these should be persistent. Should the ID pertain to the name or to the place notation? I think it should probably belong to the place notation, since I also think that we should potentially allow unnamed methods to appear (for example, the whole TDMM collection). But then the whole problem of comparing place notations arises. We probably just need to make a slightly arbitrary decision and stick to it. Help, somebody! While the database update script is being re-written, we should also decide what to do about lead heads and put that in, as that's the only thing actually broken at the moment. Martin -- Martin Bright Department of Mathematical Sciences, University of Liverpool |
From: Martin B. <ma...@bo...> - 2003-11-26 11:11:26
|
I've put up a rudimentary HTML front end to the database, at <http://methods.ringing.org/cgi-bin/simpleh.pl> It's a fairly short script to display the form and turn the results into a query which is sent to the XML database bit. The resulting XML is turned into HTML by the stylesheet in medium.xsl . The lead head stuff doesn't work yet. Eventually I hope to have more sophisticated search criteria, and a choice of output formats. So maybe you could list some methods in the format I've got at the moment, select check boxes next to some of them and click on a button to get more information on the ones you've selected. About place notation. I think it would be a good idea to store it in the database with dots between all changes, even crosses. I think we should remove unnecessary dots when turning it into XML. This would let us do stuff like searching on above/below place notation but without exposing the mess to the user. Martin -- Martin Bright Department of Mathematical Sciences, University of Liverpool |
From: Martin B. <mjb...@li...> - 2003-11-07 15:38:48
|
--On 05 November 2003 19:02 +0000 Richard Smith <ri...@ex...> wrote: > A couple of points to make on method.xsd.txt: > > I think we decided that 'Differential' was for these purposes a class. > (It's not a class according to the Decisions.) I can't remember what we decided, to be honest. Since we now have the classification elsewhere, should the <classes> element just be the textual part which goes into the method name? Should it be blank for principles? >| Place notation contains no whitespace. Either x, X or - may be used for >| a cross change. Dots may be put next to x, X or - if desired; [...] > > Are external places optional? (I think so.) Yes. > Can E and T be in lower case? (I think not.) I don't see why not, but don't really mind. I think our philosophy is that the XML standard should be as permissive as possible, but in our own database we can be more stringent on how we store it. > Did we want to include the mechanism we discussed earlier in the year for > specifying bells on very high numbers -- e.g. \b{24}. (I think this > would be useful.) Yes. I don't see any need for the \b though - just {24} would do. >| <mc:ref collection="surprise-major">3738</mc:ref> > > How does this handle different editions of collections? Either they have different names, or else we have a "version" attribute. > For example, Plain Minor methods are numbered P1 to P29 and > P1A (Why?) in the 5th edition of the CC Minor Collection, > but in the 6th edition they are numbered 1 to 99 (with no > simple correlation between numbers). > Do we feel it is the right decision to include rwref in the > base namespace, but defer collection ids to some other > namespace? From the MCs point of view, this proposal is not > likely to be particularly useful without additional elements > for CC collection ids, classes and (probably) symmetry. > Given this, why not put them all in one namespace as this > will aid older, non-namespace aware XML implementations. 
Yes, I think we probably ought to put anything which the MC officially support in the one namespace. > <pedantic> >| Little methods would have little="true" and Differential >| methods would have differential="True". > > It should be "true", not "True". > </pendantic> <pedantic> Error - opening tag `<pedantic>' closed by `</pendantic>'. </pedantic> Yes. > >| | <s:symmetry> >| | <!-- some elements describing symmetry --> >| | </s:symmetry> > > I think some classification of symmetry should be in the > base naemspace. Yes. > Personally, I'm still of the opinion that XLink is far from > ideal for this purpose. (Actually, I think it's far from > ideal for *any* purpose.) Might XInclude be a better > technology? I agree that XLink is far from ideal, but at least philosophically it does what we want. XInclude would have the disadvantage that implementations would feel free to process them at parse time. In general, it's probably not worth tweaking the XML spec too much before the Methods Committee get back to me. They're bound to have various changes they want to make. >> - Get scripts in place to keep the database up to date. > > IIRC, I got Don Morrison's script doing a lot of this. > Persistence of reference numbers was one main thing left. > Do we still need that to work? Performances were another > thing. (Didn't we have some clever plan about populating > the Location table with stuff from Felstead and/or Dove?) > Oh, and then there's Doubles variations. I think having persistent IDs would be a good thing. The other stuff might also be good, but can wait. >> - Develop a few client-side interfaces so that programmers will use our >> database in their programs. > > The Ringing Class Library has basic support for both the XML > format and fetching from methods.ringing.org. I'll have > another look at this later in the week. Excellent. 
For those that don't know, we've decided to release a core part of the class library (including the library-access classes) under the LGPL. This means that you *can* now use it in your own commercial programs if you like. A Perl interface would be dead easy. What other languages do people use for on-line applications? Did somebody mention writing a Java interface? I would suggest that any API be something like: - Turn the user's search criteria into a query string for the database; - Get the XML result; - For each <method> element in the result, create a native-language object containing a pointer to that element in the DOM tree; - Provide methods on the native-language object to retrieve different bits of information about the method. >> - Think about the `complicated' interface which will be a safe way for >> clients to execute almost arbitrary SQL statements on the database. > > Do we still think this will be useful? Providing a script > to suck *all* the data out and into a local database might > be a better solution. I'm not sure exactly who would use this, I must admit. OTOH it would be quite simple to provide. I have recently realised that it's probably enough to check that the first word is `SELECT', and then pass it to $dbh->prepare(); that will throw an error if there's any more than one statement in there. Martin -- Martin Bright Department of Mathematical Sciences, University of Liverpool |
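The API outline above (criteria to query string, fetch XML, wrap each <method> element in an object) might look roughly like this in Python. It is a sketch under assumptions: the base URL and the query parameter names are hypothetical, the network fetch is left out so the example stays self-contained, and only the method namespace URI and the <name> and <stage> elements are taken from documents quoted elsewhere in this thread.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

NS = {"m": "http://methods.ringing.org/NS/method"}
BASE_URL = "http://example.org/methods/query"  # hypothetical endpoint


def build_query_url(criteria):
    """Step 1: turn search criteria into a query string."""
    return BASE_URL + "?" + urlencode(sorted(criteria.items()))


class Method:
    """Steps 3 and 4: a thin wrapper holding a reference to one <method>."""

    def __init__(self, element):
        self._element = element

    def _text(self, tag):
        return self._element.findtext(f"m:{tag}", default=None, namespaces=NS)

    @property
    def name(self):
        return self._text("name")

    @property
    def stage(self):
        text = self._text("stage")
        return int(text) if text is not None else None


def parse_methods(xml_text):
    """Wrap every <method> child of the result document."""
    root = ET.fromstring(xml_text)
    return [Method(e) for e in root.findall("m:method", NS)]


SAMPLE = """<methods xmlns="http://methods.ringing.org/NS/method">
  <method><name>Pudsey</name><stage>8</stage></method>
  <method><name>Yorkshire</name><stage>8</stage></method>
</methods>"""

if __name__ == "__main__":
    print(build_query_url({"name": "Pudsey", "stage": "8"}))
    for method in parse_methods(SAMPLE):
        print(method.name, method.stage)
```

Keeping a reference to the element, rather than copying fields out eagerly, means new elements in later versions of the format need only a new accessor.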
From: Richard S. <ri...@ex...> - 2003-11-05 21:26:40
|
Martin Bright wrote: > An > XML Schema is in "method.xsd" on the CVS repository, and some text > describing it in "method.xsd.txt". A couple of points to make on method.xsd.txt: | | <name>Pudsey</name> | | <stage>8</stage> | | <classes>Surprise</classes> | | <title>Pudsey Surprise Major</title> | | These are self-explanatory. [...] I think we decided that 'Differential' was for these purposes a class. (It's not a class according to the Decisions.) | Place notation contains no whitespace. Either x, X or - may be used for a | cross change. Dots may be put next to x, X or - if desired; [...] Are external places optional? (I think so.) Can E and T be in lower case? (I think not.) Did we want to include the mechanism we discussed earlier in the year for specifying bells on very high numbers -- e.g. \b{24}. (I think this would be useful.) | <mc:ref collection="surprise-major">3738</mc:ref> How does this handle different editions of collections? For example, Plain Minor methods are numbered P1 to P29 and P1A (Why?) in the 5th edition of the CC Minor Collection, but in the 6th edition they are numbered 1 to 99 (with no simple correlation between numbers). Do we feel it is the right decision to include rwref in the base namespace, but defer collection ids to some other namespace? From the MC's point of view, this proposal is not likely to be particularly useful without additional elements for CC collection ids, classes and (probably) symmetry. Given this, why not put them all in one namespace as this will aid older, non-namespace aware XML implementations. <pedantic> | Little methods would have little="true" and Differential | methods would have differential="True". It should be "true", not "True". </pendantic> | | <s:symmetry> | | <!-- some elements describing symmetry --> | | </s:symmetry> I think some classification of symmetry should be in the base namespace. 
| | <performance xlink:arcrole= | | "http://cccbr.org.uk/relations/record-length" | | xlink:type="simple" | | xlink:href="http://www.peals.net/database.xml#395"/> | | This is a pointer to some other place where the details of a performance | may be found. It is very important to support such indirection, so that the | method format would be able to interact with an XML peal database. And in a case where we use xlink to reference a remote performance, do we put any restrictions on what the href must point to? For example, must it link to an element of type performanceType? Personally, I'm still of the opinion that XLink is far from ideal for this purpose. (Actually, I think it's far from ideal for *any* purpose.) Might XInclude be a better technology? > I think that we should make > sure that the database supports whatever they finally come up with; in the > meantime we can support our draft. Agreed. > So, what now? > > - Get scripts in place to keep the database up to date. IIRC, I got Don Morrison's script doing a lot of this. Persistence of reference numbers was one main thing left. Do we still need that to work? Performances were another thing. (Didn't we have some clever plan about populating the Location table with stuff from Felstead and/or Dove?) Oh, and then there's Doubles variations. > - Develop a few client-side interfaces so that programmers will use our > database in their programs. The Ringing Class Library has basic support for both the XML format and fetching from methods.ringing.org. I'll have another look at this later in the week. > - Think about the `complicated' interface which will be a safe way for > clients to execute almost arbitrary SQL statements on the database. Do we still think this will be useful? Providing a script to suck *all* the data out and into a local database might be a better solution. Anyway, time for the fireworks. Richard |
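The brace mechanism for bells on very high numbers, raised earlier in this message, could be read by a parser along the following lines. This is a Python sketch of one possible interpretation, not an agreed syntax: it assumes the plain `{24}` form without the `\b` prefix, uses the conventional symbols 1234567890ET for places 1 to 12, accepts lower-case e and t (which this message suggests may not be wanted), and the function name is made up for the example.

```python
import re

BELLS = "1234567890ET"  # conventional symbols for places 1 to 12

# One token is either a braced decimal number or a single bell symbol.
TOKEN = re.compile(r"\{(\d+)\}|([0-9ETet])")


def parse_places(change):
    """Return the list of place numbers named in a single change."""
    places = []
    pos = 0
    while pos < len(change):
        match = TOKEN.match(change, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {change!r}")
        if match.group(1) is not None:
            number = int(match.group(1))
            if number < 1:
                raise ValueError(f"place numbers start at 1: {change!r}")
            places.append(number)
        else:
            places.append(BELLS.index(match.group(2).upper()) + 1)
        pos = match.end()
    return places


if __name__ == "__main__":
    print(parse_places("1{24}"))
    print(parse_places("0ET{13}"))
```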
From: Martin B. <mjb...@li...> - 2003-11-05 12:54:23
|
Hello everybody! Over the last few months I and others have been gradually working on the method database, and it's looking pretty good at the moment. Firstly, we have come up with a draft XML format for methods. This was put together mostly by Richard Smith and me, with input from some others. An XML Schema is in "method.xsd" on the CVS repository, and some text describing it in "method.xsd.txt". If you've forgotten where the repository is, you can browse it here: <http://cvs.sourceforge.net/viewcvs.py/ringing-lib/methods/> The good news is that the Methods Committee are taking an interest. They want to have a standard XML representation of methods, and may be going to use our draft as a base for their proposals. I think that we should make sure that the database supports whatever they finally come up with; in the meantime we can support our draft. I'd appreciate any comments anybody might have on the draft, and so would the MC. Secondly, I've just about finished the code for getting XML data out of the database. I've abstracted the hard work into a Perl module, called DBIx::XMLServer, which I'll release on CPAN as soon as I've done a bit more documentation and testing. You can find it at <http://sf.net/projects/dbix-xmlserver>. Using this module, our web site only has to have an XML file describing the mapping from the database to the XML document. You can see how it all works by looking at the code in CVS. The only slight annoyance is that the generated XML has quite a few unnecessary namespace declarations, but that's really a cosmetic thing. So, what now? - Tidy up a few things like lead heads which don't quite work at the moment. - Put an HTML front end on top of the XML, so that people can browse the database from the web site. - Get scripts in place to keep the database up to date. - Develop a few client-side interfaces so that programmers will use our database in their programs. 
- Think about the `complicated' interface which will be a safe way for clients to execute almost arbitrary SQL statements on the database. Probably loads more stuff as well. Martin -- Martin Bright Department of Mathematical Sciences, University of Liverpool |
From: Martin B. <ma...@ma...> - 2003-03-04 00:10:44
|
On Mon, 3 Mar 2003, Philip Saddleton wrote: > We had a MC meeting yesterday, at which Tony Smith pointed out that he > hadn't heard anything since I forwarded this. Has there been any more > progress? Sorry, I haven't done anything about this for ages - I have just started thinking about it again recently. IIRC I posted a rough sketch of an XML format. Richard and others then came up with lots of nice features we could put in there - in particular there were things like support for arbitrary classification schemes associated with a namespace URI. I think there was a good way to include the first peal information, or other notable performances, as well. I can't check any of this as SourceForge (or something between it and me) seems to be down. Martin |