A taxonomy naturally contains categories:
<taxonomy xml:id="docType">
  <desc>Document type</desc>
  <category xml:id="dtManuscript">
    <catDesc>Handwritten manuscript</catDesc>
  </category>
  <category xml:id="dtPrint">
    <catDesc>Printed document</catDesc>
  </category>
</taxonomy>
and it's also obvious that categories should be able to nest:
<category xml:id="dtManuscript">
  <catDesc>Handwritten manuscript</catDesc>
  <category xml:id="dtLetter">
    <catDesc>Handwritten letter</catDesc>
  </category>
  <category xml:id="dtMemo">
    <catDesc>Handwritten memo</catDesc>
  </category>
</category>
It's clear that you could assign a document to any of the subcategories, and this would imply membership of the parent category; so if a document has <catRef target="#dtLetter"/>, then it is by definition also "dtManuscript".
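For illustration, such an assignment would normally sit in the document's header. This is a minimal sketch, assuming the taxonomy is declared with xml:id="docType" in the same file; the scheme attribute is optional and is included here only to make the link explicit:

```xml
<profileDesc>
  <textClass>
    <!-- Points at the subcategory only; membership of dtManuscript
         follows from dtLetter being nested inside it. -->
    <catRef scheme="#docType" target="#dtLetter"/>
  </textClass>
</profileDesc>
```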
However, there are some areas in a nested tree like this which don't follow this pattern. Consider this:
<category xml:id="dtPaperSize">
  <catDesc>The size of paper on which a manuscript document is written.</catDesc>
  <category xml:id="dtA4">
    <catDesc>A4 paper</catDesc>
  </category>
  <category xml:id="dtA5">
    <catDesc>A5 paper</catDesc>
  </category>
</category>
Any document may belong to either of the child categories (even both, if it happens to include both paper sizes); so we would see e.g. <catRef target="#dtA4"/>, meaning that the document falls into the category of documents which consist wholly or partially of A4 paper. However, this does not make any claim as to the parent category; it makes no sense to say that a letter "is" or "has" a paper size, without specifying what that size is.
In other words, the "category" of paper size is not a category at all; it's a taxonomy. This issue comes up frequently in complex, nested taxonomies. It would be valuable to be able to create a structure like this:
<category xml:id="dtManuscript">
  <catDesc>Handwritten manuscript</catDesc>
  <category xml:id="dtLetter">
    <catDesc>Handwritten letter</catDesc>
  </category>
  <category xml:id="dtMemo">
    <catDesc>Handwritten memo</catDesc>
  </category>
  <taxonomy xml:id="dtPaperSize">
    <desc>The size of paper on which a manuscript document is written.</desc>
    <category xml:id="dtA4">
      <catDesc>A4 paper</catDesc>
    </category>
    <category xml:id="dtA5">
      <catDesc>A5 paper</catDesc>
    </category>
  </taxonomy>
</category>
I submit that <taxonomy> should be available as a child of <category>, to allow for such rich multi-layered taxonomies.
Assigning to Paul Schaffner to triage and report to Council with a proposal.
Martin to provide a better example — probably a trimmed-down version of the real problem.
MH to provide better examples of use case.
First, I argue that taxonomies should be able to nest:
This example is based on a real use-case from the Map of Early Modern London.
We define the nature of contributors' contributions to the project or to a specific document using taxonomies. We draw the majority of our responsibility definitions from the MARC Relators codes, as defined by the LOC:
http://www.loc.gov/marc/relators/
However, we use only a subset of that very long list, expressed as a TEI <taxonomy>:
The MARC Relators codes do not, however, provide for all of our needs; we also have our own supplementary responsibility codes, also defined as a <taxonomy>:
These taxonomies are used together, and it would make more sense to be able to express them as a single taxonomy composed of two:
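A minimal sketch of what such a combined structure might look like, assuming the content model proposed here (a <taxonomy> permitted inside a <taxonomy>). The xml:id values, the descriptions and the project-specific category are hypothetical placeholders, not the project's actual codes; only "aut" and "edt" are real MARC relator codes:

```xml
<!-- Illustrative only: ids and descriptions are hypothetical, and the
     nesting of <taxonomy> within <taxonomy> assumes the proposed model. -->
<taxonomy xml:id="resp">
  <desc>Contributor responsibilities</desc>
  <taxonomy xml:id="respMarc">
    <desc>Subset of the MARC relator codes</desc>
    <category xml:id="aut">
      <catDesc>Author</catDesc>
    </category>
    <category xml:id="edt">
      <catDesc>Editor</catDesc>
    </category>
  </taxonomy>
  <taxonomy xml:id="respProject">
    <desc>Supplementary project-specific codes</desc>
    <category xml:id="respExample">
      <catDesc>A responsibility not covered by the MARC list (placeholder)</catDesc>
    </category>
  </taxonomy>
</taxonomy>
```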
Since taxonomies are often composed of other taxonomies, I believe <taxonomy> should be nestable.
Next, consider a taxonomy of literary forms (this arises out of a different project). You might characterize literary work in many different ways:
Now, inside the "verse" category we need to characterize different features of the verse, including foot type, line-length and stanza type:
It's clear here that we have three distinct types of category; they don't belong as siblings. What we really have are three distinct taxonomies, which should be marked up as such:
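A minimal sketch of that arrangement, assuming the proposed content model (a <taxonomy> permitted inside a <category>). All xml:id values and the individual categories are hypothetical examples chosen for illustration, not taken from the project:

```xml
<!-- Illustrative only: ids and categories are hypothetical, and
     <taxonomy> as a child of <category> assumes the proposed model. -->
<category xml:id="lfVerse">
  <catDesc>Verse</catDesc>
  <taxonomy xml:id="lfFootType">
    <desc>Foot type</desc>
    <category xml:id="lfIambic">
      <catDesc>Iambic</catDesc>
    </category>
    <category xml:id="lfTrochaic">
      <catDesc>Trochaic</catDesc>
    </category>
  </taxonomy>
  <taxonomy xml:id="lfLineLength">
    <desc>Line length</desc>
    <category xml:id="lfTetrameter">
      <catDesc>Tetrameter</catDesc>
    </category>
    <category xml:id="lfPentameter">
      <catDesc>Pentameter</catDesc>
    </category>
  </taxonomy>
  <taxonomy xml:id="lfStanzaType">
    <desc>Stanza type</desc>
    <category xml:id="lfCouplet">
      <catDesc>Couplet</catDesc>
    </category>
    <category xml:id="lfQuatrain">
      <catDesc>Quatrain</catDesc>
    </category>
  </taxonomy>
</category>
```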
It makes no sense to use nested categories for this; these are sub-taxonomies within the overall taxonomy of literary forms. On this basis, I argue that <taxonomy> should be available inside <category>.
Council 2015-05-29: MH to bug everyone on the Council list about this, and if no objections in 2 weeks, go ahead.