#150 TEI/@version

Ian Rons

TEI/@version is an xsd:decimal, but TEI version numbers aren't decimals, e.g. "1.5.0". Can I suggest changing this to something more applicable, perhaps data.enumerated? This would allow a limited range of values, and could even be reduced to one (the current value), thus causing a validation error if one attempts to validate against the wrong TEI schema version.


  • Ian Rons

    Ian Rons - 2010-01-15
    • labels: --> TEI: Definition of Elements/Attributes/Classes
    • milestone: --> AMBER
  • Lou Burnard

    Lou Burnard - 2010-01-15

    When defined, this attribute was for the base version of the Guidelines as a whole (e.g. P5 as opposed to P4), and we had not yet set in place the programme of regular version releases. It would certainly make sense for this now to be used for point release numbers e.g. 5.1.2

    Your suggestion of making it data.enumerated is a good one, since it would permit projects to define explicitly the versions for which a given document is valid in their project ODD. Let's see what the Council thinks...

  • Ian Rons

    Ian Rons - 2010-01-15

    I've just noticed that this also needs to be altered for <unicodeName>, since the Unicode Consortium also use a non-numeric versioning system. Also it applies to <teiCorpus>.

  • Lou Burnard

    Lou Burnard - 2010-01-19

    "Version numbers for the Unicode Standard consist of three fields, denoting the major version, the minor version, and the update version, respectively. For example, “Unicode 3.1.1” indicates major version 3 of the Unicode Standard, minor version 1 of Unicode 3, and update version 1 of minor version Unicode 3.1." it says here (http://unicode.org/versions/#Version_Numbering). So the current definition for <unicodeName> (data.numeric) is indeed wrong, but wrong in a slightly different way from the definition for TEI/@version or teiCorpus/@version.

    I'm now wondering whether we shouldn't define a "data.version" datatype -- a pattern which matches the Unicode spec above -- rather than use data.enumerated.

  • Ian Rons

    Ian Rons - 2010-01-22

    That wouldn't work for TEI v5.1.4.1 though. `[0-9]+(\.[0-9]+){2,3}` would cover it, but then one could imagine circumstances in future where it might be desirable to use @version for more idiosyncratic version numbers -- which is presumably why @version in att.translatable is a data.word.

    I still like the idea of data.enumerated for TEI/@version being useful as a sanity check in official releases, so that (e.g.) if someone is using the TEI templates in oXygen XML and then updates oXygen without realising they're updating TEI with it, the first schema error they'll see will (probably) be TEI/@version.

    It would also be useful for projects to have TEI/@version as data.enumerated so that conformance (or lack of) could be specified in an ODD, or simply as a sanity check, e.g. @version="ManuscriptProjectVersion2".
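
The three options discussed so far (xsd:decimal, the dotted pattern quoted above, and an enumerated list) can be compared with a small sketch. It assumes Python's `re` module as a stand-in for schema validation. The `XSD_DECIMAL` expression is only a simplified approximation of the xsd:decimal lexical form, and `ALLOWED_VERSIONS` is a hypothetical project list, not anything defined by the TEI.

```python
import re

# Pattern quoted in the comment above: a number followed by two or three
# further dot-separated numbers (e.g. "1.5.0" or "5.1.4.1").
DOTTED_VERSION = re.compile(r"[0-9]+(\.[0-9]+){2,3}")

# Simplified approximation of the xsd:decimal lexical form
# (an assumption for illustration, not taken from any TEI schema).
XSD_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")

# Hypothetical closed list, as a project might enumerate it in its ODD.
ALLOWED_VERSIONS = {"5.1.4.1", "ManuscriptProjectVersion2"}


def is_dotted_version(value: str) -> bool:
    """True if the whole string matches the dotted pattern."""
    return DOTTED_VERSION.fullmatch(value) is not None


def is_decimal(value: str) -> bool:
    """True if the whole string looks like an xsd:decimal."""
    return XSD_DECIMAL.fullmatch(value) is not None


def is_enumerated(value: str) -> bool:
    """True if the value is one of the explicitly listed versions."""
    return value in ALLOWED_VERSIONS


if __name__ == "__main__":
    samples = ["5.1", "1.5.0", "5.1.4.1", "ManuscriptProjectVersion2"]
    for sample in samples:
        print(sample, is_decimal(sample), is_dotted_version(sample),
              is_enumerated(sample))
```

Under these assumptions, "1.5.0" fails the decimal check but passes the dotted pattern, while a project-specific label such as "ManuscriptProjectVersion2" passes only the enumerated check.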

  • Lou Burnard

    Lou Burnard - 2010-02-01

    @version in att.translatable should actually be a date, but I take the point. The trouble with making @version use an enumerated list is that we then have to update that list for every new version. This might be bearable if it were automated, I suppose. Of course it remains possible for people to customize the acceptable values for @version in their ODD.

  • Lou Burnard

    Lou Burnard - 2010-04-30

    Agreed to make datatype conform to our practice

  • Lou Burnard

    Lou Burnard - 2010-05-08

    A @version attribute exists on each of the elements <application>,
    <teiCorpus>, <TEI> and <unicodeName>. A similarly named attribute is
    also supplied by the class att.translatable, members of which are
    desc, exemplum, gloss, remarks, and valDesc. The datatypes specified
    for this attribute vary, as follows:

    1. On att.translatable it's data.word and there is a note saying "The
    version may be a number, a letter or a date"

    2. On <application> it's a token matching a specific RNG pattern

    3. On <TEI> and <teiCorpus> it's explicitly a W3C decimal number.

    4. On <unicodeName> it's the TEI-defined data.numeric (which matches all
    sorts of things)

    Obviously the datatypes for @version on TEI, teiCorpus, and
    unicodeName are just plain wrong (the first two don't match our actual
    current practice, and the last doesn't match the Unicode
    requirement). In an ideal world, we'd like attributes with the same
    name to have the same datatype, so one solution would be to give
    everything the same datatype as <application>. Unfortunately the
    pattern defined there is too permissive for either unicodeName or TEI
    version (neither permits letters or a 4th number), so I have for the
    moment defined a new datatype data.version which matches the Unicode spec, and propose we restrict ourselves to that.
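
As a rough illustration of a pattern that matches the Unicode spec, the sketch below assumes a strict three-field expression. `THREE_FIELD_VERSION` is a hypothetical stand-in and should not be read as the actual definition of data.version.

```python
import re

# Hypothetical three-field pattern (major.minor.update), following the
# Unicode version numbering quoted earlier in this thread. This is an
# assumption for illustration, not the real data.version definition.
THREE_FIELD_VERSION = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def matches_three_field(value: str) -> bool:
    """True if the whole string is exactly three dot-separated numbers."""
    return THREE_FIELD_VERSION.fullmatch(value) is not None


if __name__ == "__main__":
    # Values with letters, a fourth number, or hyphens are rejected.
    for sample in ["3.1.1", "5.1.2", "5.1.4.1", "1.0a", "2010-05-08"]:
        print(sample, matches_three_field(sample))
```

A pattern this strict rejects letters and a fourth number, which the more permissive <application> pattern is described as allowing, and it also rejects hyphen-separated dates.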

    There remains the problem of the @version provided by the
    att.translatable class, which is uniformly a date, with hyphens
    separating the parts. We could at a pinch make that conform to the
    same pattern by permitting hyphens as well as dots in the pattern, but
    I think that would be cheating. We could consider renaming the
    @version you get from att.translatable to "translateDate" or some
    such; so far as I know the attribute is only used for internal ODD
    processing by the TEI so the compatibility issue is less severe. Note
    that the semantics of this @version are quite different from the
    others -- it carries the date when an ODD component was last
    *translated* which is usually quite different from when the ODD
    component itself was last modified, and nothing to do with the version
    number of the P5 release into which that component was eventually

    There is an outstanding issue about how the @version on these
    translatable elements should be used in our workflow, specifically how
    we ensure consistency when the English language version has changed and
    the translations have not.

  • Lou Burnard

    Lou Burnard - 2010-05-08
    • status: open --> closed
  • Kevin Hawkins

    Kevin Hawkins - 2012-07-17
    • status: closed --> closed-fixed
  • Kevin Hawkins

    Kevin Hawkins - 2012-07-17

    Lou's latest comment raised some outstanding questions, and for the record, it's not clear how they were resolved. So for historians, I note the following:

    1. On att.translatable, @version has been renamed to @versionDate per bug 3393781.

    3. TEI/@version now refers to the TEI's major release number ("P") per bug 3393781. The original issue raised in the bug was thus resolved.

    4. On <unicodeName> it's now data.version.

    The problem that "remains" has been resolved by creation of @versionDate.

    The "outstanding issue" was resolved by adding instructions on use of @versionDate to tcw22 in June 2012.

