As noted on the TEI Council list 2402/2013, attributes should define their datatype indirectly by referring to one of the existing data.xxx macros, which
are then mapped to a RELAX NG expression. The following appear to be inconsistent with this policy:
* timeline@interval and when@interval are both defined using ad hoc RNG constructs (but not identical ones); we should define a
data.interval which is consistent.
* language@usage uses an ad hoc RNG datatype directly; we should define a new "data.percentage" macro for it (and look for other cases
where this might be used).
* application@version uses an ad hoc RNG expression, which surely ought to be replaced by a "data.versionNum" macro.
* Several attributes [moduleRef@except and @include; att.identified@module; @key, on classRef, elementRef, macroRef, and
moduleRef; moduleRef@prefix] use the built-in RNG datatype "NCName". It might be more consistent to define our own macro "data.xmlName" vel sim.
* rng:text is still used on most of the attributes delivered by att.lexicographic (expand, norm, orig, split, and value); on two attributes which hold regexp values
(att.patternReplacement@replacementPattern and att.scoping@match); and also on refState@delim and valItem@ident. I think we should have
just one datatype for "string of words to be treated as a single entity" and use that for some of these; for the regexps surely we should have
data.regexp.
* Many attributes currently include <valList>s of various levels of closure. These should all have a datatype of data.enumerated, and maybe there should be a Schematron rule to enforce this.
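
The last point describes a consistency check that could be automated. The following is only a rough sketch in Python rather than Schematron. It assumes an ODD source in which each <attDef> declares its datatype as <datatype><rng:ref name="..."/></datatype>. The function name `attributes_missing_enumerated` and the sample fragment are hypothetical and are not taken from the Guidelines source.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
RNG = "http://relaxng.org/ns/structure/1.0"
NS = {"tei": TEI, "rng": RNG}


def attributes_missing_enumerated(odd_xml):
    """Return the idents of attDefs that carry a valList but whose
    datatype does not reference data.enumerated."""
    root = ET.fromstring(odd_xml)
    offenders = []
    for att in root.iter(f"{{{TEI}}}attDef"):
        # Only attributes with a direct valList child are of interest.
        if att.find("tei:valList", NS) is None:
            continue
        refs = att.findall("tei:datatype/rng:ref", NS)
        if not any(ref.get("name") == "data.enumerated" for ref in refs):
            offenders.append(att.get("ident"))
    return offenders


# Hypothetical ODD fragment: "good" follows the policy, "bad" does not.
SAMPLE = f"""
<attList xmlns="{TEI}" xmlns:rng="{RNG}">
  <attDef ident="good">
    <datatype><rng:ref name="data.enumerated"/></datatype>
    <valList type="closed"><valItem ident="a"/></valList>
  </attDef>
  <attDef ident="bad">
    <datatype><rng:text/></datatype>
    <valList type="open"><valItem ident="b"/></valList>
  </attDef>
</attList>
"""

print(attributes_missing_enumerated(SAMPLE))
```

A real rule would presumably live in the ODD itself as a Schematron constraint; the script only illustrates the condition to be tested.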
Council face-to-face 2013-04 agrees to this. LB to implement.
Just adding a note to the effect that when data.versionNum has been created, att.styleDef/@schemeVersion should be switched to it (see ticket https://sourceforge.net/p/tei/feature-requests/446/).
Have added data.interval, data.percentage, data.xmlName, data.regexp, and data.xpath as proposed here. However, data.regexp invites confusion with data.pattern. Not sure what to call the other current uses of <rng:text>.
Have now replaced data.regexp with data.replacement (it isn't a regexp, and we already have data.pattern for regexps). Have also added data.versionNumber, but am wondering why we have both this and data.version. The latter is more restricted than the former, and corresponds with both TEI and Unicode version numbering systems.
We've discussed this before. The application/@version attribute is attempting to cover all real-life application version numbers, including a, b, c, alpha, beta, and all sorts of alphanumeric combinations in a four-component format, such as 1.4.2b.34. data.version is trivially simple, and will miss most of those cases.
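
To make the difference concrete, here is a minimal sketch comparing a strict and a permissive version pattern. Both regular expressions are assumptions written to match the descriptions in this thread (up to three purely numeric components versus up to four alphanumeric ones); they are not quoted from the macro definitions, and the names `STRICT`, `PERMISSIVE`, and `accepts` are illustrative only.

```python
import re

# Assumed shape of a data.version-like pattern: one to three numeric components.
STRICT = re.compile(r"\d+(\.\d+){0,2}")

# Assumed shape of a data.versionNumber-like pattern: one to four components,
# each a number optionally followed by lowercase letters and more digits.
PERMISSIVE = re.compile(r"\d+[a-z]*\d*(\.\d+[a-z]*\d*){0,3}")


def accepts(pattern, value):
    """True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


for value in ("6.2", "2.3.0", "1.4.2b.34", "3.0beta2"):
    print(value, accepts(STRICT, value), accepts(PERMISSIVE, value))
```

Under these assumed patterns, plain numeric versions pass both checks, while values such as 1.4.2b.34 pass only the permissive one.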
Closing this ticket: we seem to have gone as far as we can with it for the moment.