#373 make @scheme optional on <keywords>


@scheme on <keywords> allows for reference to an externally or internally defined keyword scheme used for the various <term> elements inside <keywords>. I'm interested in attaching keywords to a TEI document that are not from a controlled vocabulary, such as author-defined keywords. In such a case, I would be inclined to do:

<term>street life</term>

without using @scheme. Lou agreed with this approach on TEI-L:


In fact, P4 did not require @scheme, so we would be going back to the glory days.


  • Martin Holmes

    Martin Holmes - 2012-08-03

    I think the same argument applies to <classCode>, and I'd like to see it have an optional @scheme too. This is the full list of elements with @scheme:

    att catRef classCode constraintSpec gi keywords locus locusGrp occupation rendition socecStatus tag

    We should presumably look at each one and decide in which cases we think it should be optional.

  • Kevin Hawkins

    Kevin Hawkins - 2012-08-03

    @scheme is already optional on


    and required on:


    I think these are all correct, except that I would make it *required* on catRef. As with classCode, nobody is going to create their own classes according to a classification system in their head, whereas they might use free-form (uncontrolled) keywords.

  • Martin Holmes

    Martin Holmes - 2012-08-03

    I'm afraid I have used <classCode> without a formal scheme. Shame on me, perhaps, but I just needed a way to classify texts in a manner required by my processing system.

  • Kevin Hawkins

    Kevin Hawkins - 2012-08-03

    Martin, is that a sort of one- or two-character flag you use to group TEI documents into particular buckets, so to speak, when no other part of the encoding gives you what you need for these buckets? I'm sympathetic to this need, but to help win over others, could you give an example of this -- of something that you couldn't encode elsewhere in the document?

  • Martin Holmes

    Martin Holmes - 2012-08-03

    I've used it to characterize texts as prose, verse, prose-and-verse, etc.

  • Kevin Hawkins

    Kevin Hawkins - 2012-08-03

    This is a controlled vocabulary, just a controlled one, right? If so, I think you should in fact use @scheme and create a <taxonomy> showing your scheme. So I'm not yet convinced of a use case for classCode with an optional @scheme.
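
    A minimal sketch of what that suggestion might look like for the prose/verse case. The xml:id values and category descriptions are hypothetical and are not taken from Martin's project:

    ```xml
    <!-- In <encodingDesc>: declare the local scheme (identifiers are assumed) -->
    <classDecl>
      <taxonomy xml:id="textForm">
        <category xml:id="prose">
          <catDesc>Text consisting entirely of prose</catDesc>
        </category>
        <category xml:id="verse">
          <catDesc>Text consisting entirely of verse</catDesc>
        </category>
        <category xml:id="prose-and-verse">
          <catDesc>Text mixing prose and verse</catDesc>
        </category>
      </taxonomy>
    </classDecl>

    <!-- In <profileDesc>: point @scheme at that taxonomy -->
    <textClass>
      <classCode scheme="#textForm">prose-and-verse</classCode>
    </textClass>
    ```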

  • Laurent Romary

    Laurent Romary - 2012-08-04

    @scheme mandatory on keywords is a really annoying feature when integrating author keywords (material retrieved from publication archives for instance). So +1 to this ticket. Implement this quickly!

  • Kevin Hawkins

    Kevin Hawkins - 2012-08-04

    A few things have been suggested in further discussion on TEI-L (all of which were made before Laurent's comment on this thread) ...

    On 8/3/12 12:56 PM, John Walsh wrote:
    > I would recommend then that we continue to require @scheme, but loosen
    > the prose definition of @scheme to allow for local, project-specific, or
    > document-specific vocabularies.

    On 8/3/12 1:13 PM, Peter Gorman wrote:
    > Following on Paul's comments, the definition should also permit @scheme values of "unknown" (I believe there is a scheme but I don't know which) or "uncontrolled" (I know there is no scheme, not even a local one).

    On 8/3/12 1:40 PM, John Walsh wrote:
    > But if one has a list of keywords in the header, isn't that a local
    > scheme? You could even do something like this:
    > <keywords xml:id="uncontrolled" scheme="#uncontrolled">
    > <term>ceremonials</term>
    > <term>fairs</term>
    > <term>street life</term>
    > </keywords>
    > But I still think it would be better, even in "uncontrolled" or
    > "unknown" cases, to point to a <taxonomy> element with a bit more
    > explanation. To my (admittedly simple) mind, "uncontrolled" is not
    > relevant here. If one has created or documented a list of keywords, then
    > one is in possession of a "controlled" vocabulary. The list of possible
    > values has been restricted or controlled. The distinction, I think, is
    > not one of controlled or uncontrolled but the authority behind the
    > control: a standards body or similar organization, a project, an
    > individual, etc.

    On 8/3/12 2:19 PM, Peter Gorman wrote:
    > This may be splitting hairs, but then I'm a librarian. ;-) Two distinct scenarios, however, have been observed in the wild:
    > 1. A project has developed a fixed vocabulary from which keywords have been chosen, and applied as necessary to specific items. The vocabulary is a local one, but it has been documented.
    > 2. No project vocabulary was developed or even contemplated; keywords were applied ad hoc as the encoder saw fit in the moment.
    > To me, 'scheme' implies some degree of prior definition and control: something that could be used again. A simple list of words, just by the fact of their existence, is certainly a set, but I'd find it useful to know that it's a completely idiosyncratic one, not likely to be consistent with that in any other document. Put another way, "local" as a scheme type is meaningful to me if it's local to an institution or project, but not if it's local to the inside of someone's head at a particular point in time. I'm taking @scheme as analogous to MODS @authority. The DLF MODS guidelines recognize this distinction, but recommend omitting the attribute if the terms are uncontrolled.

    On 8/3/12 2:49 PM, Michael Piotrowski wrote:
    > I agree, but I think it may be useful to differentiate between two types
    > of control here:
    > a) In a controlled vocabulary the keywords are controlled.
    > b) In an uncontrolled vocabulary, the keywords may not be controlled,
    > but other aspects may still be controlled, in particular, the
    > language of the keywords, their number (e.g., "at least 3, but no
    > more than 5"), their part of speech (e.g., "use only singular nouns,
    > no verbs, please"), etc. Conventions such as these could also be
    > considered a scheme, and it may make sense to document them.

  • Kevin Hawkins

    Kevin Hawkins - 2012-08-05

    Referring to John Walsh's self-referential example, Lou wrote on TEI-L, "Very ingenious, like much of the discussion on this thread. But I can't help feeling that making @scheme optional would be much easier for the implementor and the encoder! If the @scheme is not specified (is null), does that not imply that there is no information available about the source of the keywords other than the keywords themselves?"

  • James Cummings

    James Cummings - 2012-08-09

    Assigning to lou to report on at next face2face.

    To play devil's advocate: given that @scheme is data.pointer, what is so difficult about doing:
    <keywords scheme="my:keywords"> ?
    i.e. to put in a local private URI? (potentially documented in the header elsewhere, but not necessarily so...)

    Otherwise, I've no problem in making it optional.

  • James Cummings

    James Cummings - 2012-08-09
    • assigned_to: nobody --> louburnard
  • Sebastian Rahtz

    Sebastian Rahtz - 2012-08-09

    making @scheme optional seems like a no-brainer to me. why torture people with hacks like my:scheme, when simple absence means the keywords are uncontrolled?
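
    For illustration, the two alternatives might look as follows. The keyword terms are made up, and my:keywords is the private URI from James's comment, not a registered scheme:

    ```xml
    <!-- Workaround while @scheme is required: a private, possibly undocumented URI -->
    <keywords scheme="my:keywords">
      <term>market stalls</term>
      <term>oral history</term>
    </keywords>

    <!-- With @scheme optional: its absence signals uncontrolled keywords -->
    <keywords>
      <term>market stalls</term>
      <term>oral history</term>
    </keywords>
    ```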

  • Lou Burnard

    Lou Burnard - 2012-08-09

    Made @scheme optional, added example and explanation at rev 10733

    Doing the same for classCode or catRef seems ill-advised -- these elements by definition contain *codes* taken from an authority file which must ipso facto be defined elsewhere.

  • Lou Burnard

    Lou Burnard - 2012-08-09
    • milestone: --> GREEN
    • status: open --> closed-accepted
  • Kevin Hawkins

    Kevin Hawkins - 2012-08-10

    Lou added my example from above to the Guidelines, but I confess that when posting to TEI-L in haste I simply took an example from the Guidelines from a *controlled* vocabulary and removed the @scheme. I have replaced it with a truly uncontrolled example at revision 10736.