TEI produces the TEI Guidelines and associated software

#545 Deprecate oVar and pVar, Revamp oRef and pRef

Milestone: AMBER

Status: open

Owner: Stefanie Gehrke

Labels: None

Priority: 5(default)

Updated: 2015-02-10

Created: 2015-01-31

Creator: Laurent Romary

Private: No

The TEI dictionary chapter comprises four elements for referring back to forms in dictionary entries (oRef, oVar, pRef, pVar), whose respective usage has never corresponded to clear-cut scenarios, mainly because a clear set of use cases and examples is lacking. This has led to low usage of these elements in most TEI-based dictionary projects, and also to the absence of best practices for all the concrete cases (examples, etymology) where marking forms and associating them with (real or virtual) entries would help formalise lexical content in a systematic way.

The main issue is that the difference between pRef and pVar (resp. oRef/oVar) does not match the logic of tagging form references in a dictionary entry:

pRef (resp. oRef) has empty content and is therefore limited to the case where the form is exactly the one given in the same entry, which is rarely the case (e.g. when orthographic variants exist)
pVar (resp. oVar) is only intended to be used when there is a variation (e.g. an inflected form), but because, contrary to pRef, its content is non-empty, it is often tempting to use it to mark all types of forms
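
As a rough illustration of the current situation, the two points above can be sketched as follows. The entry and its example sentences are chosen for illustration only and are not part of the original request.

```xml
<!-- Illustrative sketch only: entry content chosen for the example -->
<entry>
  <form>
    <orth>take</orth>
  </form>
  <sense>
    <cit type="example">
      <!-- oRef is empty: it can only stand for the headword exactly as given -->
      <quote>to <oRef/> a bus</quote>
    </cit>
    <cit type="example">
      <!-- oVar carries text, here an inflected form of the headword -->
      <quote>Mr Burton <oVar>took</oVar> us for French</quote>
    </cit>
  </sense>
</entry>
```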

It has also been pointed out that there are issues related to the unsatisfactory definition of @type and the absence of @notation.

Proposal: we suggest dropping oVar and pVar and extending both the scope and the content model of pRef and oRef, to offer a simple system for the annotation of forms (orthographic and phonetic) in dictionary entries, with a clear parallel to orth and pron in the description of forms.

The main changes would be:

allow text in oRef and pRef, while keeping the possibility to leave them empty when necessary
make them members of att.typed
make them members of att.lexicographic, to bring them in line with orth and pron and enable full correspondence with their linguistic/lexicographic usage
add @notation to pRef in order to bring it in line with pron (probably a good opportunity to make a class out of @notation); this is useful in cases where more than one notation is represented in pron
from a semantic point of view, allow these elements to point to any dictionary entry, not just the current entry’s head item (in the same dictionary, or even in other dictionaries when marking up etymology)
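
A minimal sketch of what the revamped elements might look like under this proposal is given below. It is not valid against the current schema; the @type value "inflected" and the @corresp target are hypothetical and only serve to illustrate the changes listed above.

```xml
<!-- Sketch of the PROPOSED usage; attribute values are hypothetical -->
<entry xml:id="take">
  <form type="lemma">
    <orth>take</orth>
    <pron notation="ipa">teɪk</pron>
  </form>
  <sense>
    <cit type="example">
      <!-- empty oRef remains possible when the form equals the headword -->
      <quote>to <oRef/> a bus</quote>
    </cit>
    <cit type="example">
      <!-- text content plus @type replaces the former oVar -->
      <quote>Mr Burton <oRef type="inflected" corresp="#take">took</oRef> us for French</quote>
      <!-- @notation on pRef mirrors @notation on pron -->
      <pRef type="inflected" notation="ipa">tʊk</pRef>
    </cit>
  </sense>
</entry>
```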

Piotr Banski - 2015-01-31

This looks like a good step forward -- deprecating the Var elements while giving the Ref ones more flexibility is a welcome suggestion.

I have a remark and a request, for now:

Syd has already handled @notation in pVar (see ticket #523), so extending this to the modified pRef definitely calls for a class.

Would you please elaborate on the last point, i.e., the long-distance references, possibly by adducing some use cases? The suggestion seems logical on the one hand, but it also implies that the "tilde rendering" gets pushed from the more-or-less central focus of these elements to a very contextual side-effect. I'm not saying it's bad, but its effect is worth highlighting already at this stage.

To answer Piotr on the last point, here's a possible example of what we have in mind for etymology. It describes a borrowing from English into Japanese. The idea is to mark up forms (and pronunciations) by means of the revamped oRef/pRef so that one can point to another lexical resource. It may be the case that this resource does not exist (yet) or cannot be referenced, so @corresp is of course optional. But the underlying semantics, namely that etymons are forms that would potentially deserve a lexical description, seems important to me.

         <entry xml:id="taxi" xml:lang="jpn">
            <form type="lemma">
               <orth type="transliterated" notation="romanji">takushī</orth>
               <orth notation="katakana">タクシー</orth>
               <pron notation="ipa">taku'shi:</pron>
               <gramGrp>
                  <pos>noun</pos>
               </gramGrp>
            </form>
            <sense>
               <cit type="translation">
                  <quote>taxi</quote>
               </cit>
            </sense>
            <etym type="borrowing">
               <lbl>source</lbl>
               <lang>English</lang>
               <cit type="etymon">
                  <oRef xml:lang="eng-US" corresp="http://en.wiktionary.org/wiki/taxi">taxi</oRef>
                  <pRef xml:lang="eng-US">'tæksi</pRef>
               </cit>
            </etym>
         </entry>

Piotr Banski - 2015-02-04

Hi Laurent, thanks for this, I'll try to have a closer look by the end of the day. This appears to take the *Ref elements into the new century, but then, that's what they needed.

One remark for now: your @xml:lang is placed too high: on the entry element, it also incorrectly applies to sense and etym.

Piotr Banski - 2015-02-04

... and <pRef xml:lang="eng-US">'tæksi</pRef> is linguistically strange, as well. I don't know if xml:lang (or rather the relevant RFC or BCP) accepts sublabels for phonetic script, but if it does, one should be applied here, I think.
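
For reference, BCP 47 does register a variant subtag for IPA transcription, fonipa. A hedged sketch of how the two remarks on @xml:lang could be addressed follows; it assumes the revamped pRef from the proposal, uses the two-letter tags that BCP 47 prefers (ja, en) instead of jpn and eng, and abbreviates the entry to the relevant parts.

```xml
<!-- Sketch only: @xml:lang moved from entry down to form, IPA marked via fonipa -->
<entry xml:id="taxi">
  <form type="lemma" xml:lang="ja">
    <orth notation="katakana">タクシー</orth>
  </form>
  <etym type="borrowing">
    <lang>English</lang>
    <cit type="etymon">
      <oRef xml:lang="en-US" corresp="http://en.wiktionary.org/wiki/taxi">taxi</oRef>
      <!-- fonipa: registered BCP 47 variant subtag for IPA phonetic notation -->
      <pRef xml:lang="en-US-fonipa" notation="ipa">'tæksi</pRef>
    </cit>
  </etym>
</entry>
```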

Hugh A. Cayless - 2015-02-10

assigned_to: Stefanie Gehrke

Hugh A. Cayless - 2015-02-10

Assigning to Stefanie. This looks like a fair amount of work, and I think will need some discussion here and/or on the Council list, and may need to be broken up into smaller chunks. [feature-requests:#544] would be invalidated if this is implemented, so I think they go together.

Related

Feature Requests: #544
