

#13 <index>

Lou Burnard
Andreas Nolda

The current <index> model of TEI suffers from several
limitations, including:

1. index entries are given as attribute values, so they
cannot contain additional markup
2. there is no support for ranges
3. there is no support for cross-references like "see" or
"see also"

The first limitation is easily fixed by substituting <label>
elements for the "level<n>" attributes.
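
To illustrate the first point, here is a minimal sketch contrasting the
attribute-based form with a child-element form. The level1= and level2=
attributes are those of the current <index>; the level= attribute on
<label> and the sample entry text are assumptions made for illustration
only, not an agreed design.

```xml
<!-- Current model: entries are attribute values, so no markup can appear inside them -->
<index level1="lemmatization" level2="Arabic"/>

<!-- Sketch of a child-element form (level= on <label> is an assumption) -->
<index>
  <label level="1">lemmatization</label>
  <label level="2"><hi rend="italic">Arabic</hi></label>
</index>
```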

As to the second limitation, <index> could be changed
from a 'milestone-like' element to an element containing
all of the material to be indexed in a subelement, e.g.:

<index id="index.lemmatization.arabic">
<indexLabel level="1">lemmatization</indexLabel>
<indexLabel level="2">arabic</indexLabel>
<indexContent>The students understand procedures
for Arabic lemmatisation and are beginning to build

(For the "id" attribute on <index>, see below.)

The third limitation could be removed by adding a
pointer child to <index>:

<indexLabel level="1">arabic lemmatization</indexLabel>
<indexRef>see <ptr

Alternatively, TEI could simply adopt DocBook's index


  • Lou Burnard

    • labels: --> TEI: New or Changed Element
  • Syd Bauman

    As Lou has previously alluded to elsewhere, in the Big
    Picture it would probably be the right thing to ditch the
    entire TEI index model and replace it with hooks designed to
    make using an XML Topic Map easier. (XTM was designed to
    generate indexes, after all.)

    In the interim, there are four specific suggestions here:

    1. Change levelN= attributes to child elements. This is a
    good idea; such a good idea, it is already on the list of
    things to do for P5. We have not, however, come up with a
    good name for this child element. Suggestions welcome. (I,
    for one, am not fond of Andreas's <indexLabel>.)

    2. Use <indexContent> (I think Andreas intends this to hold,
    rather than replicate, source content, but I'm not sure.)
    I'm iffy on this one; overall I don't think I like it, but
    can't elucidate why very well. I'd prefer the DocBook zone=
    method (if I understand it correctly), I think.[1]

    3. Permit a pointer child to <index> for redirecting the
    reader to a different index entry. While it's probably a
    good idea to provide this kind of functionality, I have a
    feeling that by the time one is getting this complicated XTM
    is really the better way to go.

    And, as an alternative to the above:
    4. Adopt the DocBook index model. I am not thoroughly
    knowledgeable about the DocBook model, but I'm worried that
    it tries to do too much (or at least, more than we need) at
    once and therefore makes usage & processing more difficult
    than need be.
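
    For readers unfamiliar with it, the following sketch shows how
    DocBook's <indexterm> covers nested entries, ranges, and "see"
    references. The element and attribute names are DocBook's; the id
    value and entry text are invented for illustration.

    ```xml
    <!-- Start of an indexed range, with a two-level entry -->
    <indexterm id="ix.lemma.start" class="startofrange">
      <primary>lemmatization</primary>
      <secondary>Arabic</secondary>
    </indexterm>

    <!-- ... the text covered by the range goes here ... -->

    <!-- End of the range, pointing back at its start -->
    <indexterm startref="ix.lemma.start" class="endofrange"/>

    <!-- A "see" cross-reference from one entry to another -->
    <indexterm>
      <primary>Arabic lemmatization</primary>
      <see>lemmatization</see>
    </indexterm>
    ```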

    [1] I think the DocBook zone= method works such
    that rather than look at the spot where the
    <index> element appears in the text, the
    index generation software takes as that
    which is to be indexed the target of the
    zone= attribute. In the TEI world we'd
    probably use a target= attribute, and the
    semantics would be that the default value of
    target= ("default" here meaning what
    software should do if it finds no value, not
    what the schema should say is provided as a
    default value) is the <index> element

  • Andreas Nolda

    Right, <indexContent> should hold the source content instead
    of replicating it.

  • Lou Burnard

    • assigned_to: nobody --> louburnard
    • status: open --> closed
  • Lou Burnard

    The <index> element has now been revised completely, taking
    on board these and other suggestions, so I am closing this
    item. Please review the new specification and submit any
    comments you may have as a new tracker item.