ISO 12620:2009 is a standard describing the data model and procedures for a Data Category Registry (DCR). Data categories are defined as elementary descriptors in a linguistic structure. In the DCR data model each data category is assigned a unique Persistent IDentifier (PID), i.e., a URI. Linguistic resources, or preferably their schemas, that make use of data categories from a DCR should refer to them using this PID. For XML-based resources, like TEI documents, the normative Annex A of ISO 12620:2009 gives a small Data Category Reference XML vocabulary (also available online at http://www.isocat.org/12620/), which provides two attributes: dcr:datcat and dcr:valueDatcat. The following TEI example illustrates their usage in a TEI feature (structure):
<tei:TEI xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:dcr="http://www.isocat.org/ns/dcr">
name="part of speech"
In this example @dcr:datcat relates the feature name to a /partOfSpeech/ data category, and @dcr:valueDatcat relates the feature value to a /commonNoun/ data category. Both of these data categories reside in the ISOcat DCR at www.isocat.org, which is the DCR in use by ISO TC37 and is hosted by its registration authority, the MPI for Psycholinguistics. The given example currently results in an invalid TEI document, and the proposal is to remedy that by adding the dcr:datcat and dcr:valueDatcat attributes to the TEI global attribute list. This would allow the data categories in use to be referenced from any place in a TEI document.
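
To make the intended use of the two attributes more concrete, here is a minimal Python sketch of how a consumer might collect the data category references from a TEI document. It is an illustration only: the PIDs under http://example.org/ are hypothetical placeholders, not real ISOcat identifiers; the placement of @dcr:valueDatcat on a tei:symbol child is an assumed layout; and the helper name collect_datcat_refs is invented for this sketch. Only the two namespace URIs and the attribute names come from the example above.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Data Category Reference vocabulary (from the example above).
DCR_NS = "http://www.isocat.org/ns/dcr"

# Assumed sample document. The PIDs are hypothetical placeholders, and putting
# dcr:valueDatcat on a tei:symbol child is one possible layout, not a prescribed one.
SAMPLE = """\
<tei:TEI xmlns:tei="http://www.tei-c.org/ns/1.0"
         xmlns:dcr="http://www.isocat.org/ns/dcr">
  <tei:fs>
    <tei:f name="part of speech"
           dcr:datcat="http://example.org/datcat/partOfSpeech">
      <tei:symbol value="common noun"
                  dcr:valueDatcat="http://example.org/datcat/commonNoun"/>
    </tei:f>
  </tei:fs>
</tei:TEI>
"""


def collect_datcat_refs(xml_text):
    """Return (element local name, attribute name, PID) for every dcr reference.

    Every element is inspected, mirroring the proposal that the attributes
    be allowed globally rather than on specific elements only.
    """
    root = ET.fromstring(xml_text)
    refs = []
    for elem in root.iter():  # document order, root element included
        for attr in ("datcat", "valueDatcat"):
            # ElementTree stores namespaced attributes as "{namespace-uri}local".
            pid = elem.get("{%s}%s" % (DCR_NS, attr))
            if pid is not None:
                local_name = elem.tag.split("}", 1)[-1]
                refs.append((local_name, attr, pid))
    return refs


if __name__ == "__main__":
    for local_name, attr, pid in collect_datcat_refs(SAMPLE):
        print(local_name, attr, pid)
```

Because the lookup does not depend on the element name, the same code would keep working if the attributes appeared on other TEI elements, which is the situation the global attribute proposal would allow.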