#377 retaining punctuation marks in the text of a TEI document

GREEN
open-accepted
1(low)
2014-05-28
2012-08-12
Kevin Hawkins
No

Section 3.2.1 (#COPU-1) of P5 discusses whether punctuation marks should be excluded from or left as part of the text in a TEI document, but it gives no guidance on whether and how to record those decisions. A few requests arising from this:

1. The <quotation> element (within <editorialDecl>) should be mentioned here in section 3.2.1. (It is already explained in section 2.3.3 (#HD53).)

2. Since the default value of quotation@marks is "all", which indicates "all quotation marks have been retained", it seems that the TEI's default guidance in this matter is to retain them. It would be good to say this in section 3.2.1 as well. The reason, I think, is quite simple: it simplifies rendering of the encoded text for readers if you don't have to reinsert the punctuation. Encouraging consistency aids interoperability.

3. It would be good to give guidance on whether punctuation marks should be inside or outside containing elements. Here's an example:

<p>She said, <said>“Nobody uses the term <soCalled>‘electronic text’</soCalled> anymore”</said>!</p>

The encoder could have put the quotation marks outside of the <said> and <soCalled> elements, and the exclamation point could have been inside or outside of the <p> element. Encouraging consistency aids interoperability.

4. There is no element within <editorialDecl> for indicating whether punctuation marks other than quotation marks were retained. For those creating linguistic corpora using <s>, for example, this is relevant. Perhaps create a <punctuation> element for this purpose? I'm not sure how that would relate to <quotation>, though. In any case, if a new element is created, it should also be referenced from section 3.2.1.
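
As a sketch of points 1 to 3 using only existing elements: the first fragment below records the quotation policy in the header with <quotation marks="all">, and the second shows the alternative placement for the example above, with the quotation marks outside <said> and <soCalled>. The prose inside <quotation> is illustrative wording, not text taken from the Guidelines.

```xml
<!-- Header fragment: declares that all quotation marks were retained.
     The <p> wording is an illustrative assumption. -->
<editorialDecl>
  <quotation marks="all">
    <p>All quotation marks have been retained as they appear in the source.</p>
  </quotation>
</editorialDecl>

<!-- Text fragment: same sentence as above, quotation marks placed outside the elements -->
<p>She said, “<said>Nobody uses the term ‘<soCalled>electronic text</soCalled>’ anymore</said>”!</p>
```

Either placement is valid TEI; the two differ only in whether the marks count as part of the element content, which is the choice point 3 asks the Guidelines to address.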

Discussion

  • Lou Burnard
    2012-09-16

    This is actually rather tricky. I am far from certain that I'd recommend putting the quotation marks inside the elements like that. We wouldn't recommend that for punctuation inside the children of <bibl>, for example, though I can't remember whether that's explicitly stated in the Guidelines anywhere. I suspect the Guidelines are correctly vague on this topic because there is considerable variation in practice. But you're right to say we should make it easier to document explicitly what policies have been adopted.

     
  • Lou Burnard
    2012-09-16

    • milestone: --> AMBER
     
  • James Cummings
    2012-09-21

    Council accepts points 1, 2, and 4, but does not want to recommend a specific practice for point 3.
    Action: create a new <punctuation> element which is a member of model.encodingDescPart, with prose content similar to other encodingDescPart elements, and some attribute(s) defining the handling of punctuation marks.
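
    A minimal sketch of what such an element might look like, assuming a marks attribute modelled on the existing quotation@marks. Both the attribute name and its value are assumptions for illustration; the summary above does not specify them.

    ```xml
    <!-- Hypothetical: <punctuation> is only proposed in this ticket.
         @marks is assumed by analogy with <quotation marks="...">. -->
    <punctuation marks="all">
      <p>All punctuation marks in the source have been retained.</p>
    </punctuation>
    ```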

     
  • James Cummings
    2012-09-21

    • assigned_to: nobody --> rwelzenb
    • status: open --> open-accepted
     
  • Kevin Hawkins
    2013-06-20

    In the original ticket I suggested creating <punctuation> within <editorialDecl>, not <encodingDesc>. I suspect that the minutes from the Sept. 2012 Council meeting and the summary above are incorrect and that we intended model.editorialDeclPart.
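
    To illustrate the difference, here is a sketch that assumes the proposed <punctuation> element (hypothetical at this point) takes prose content. As a member of model.editorialDeclPart it would sit beside <quotation> inside <editorialDecl>; as a member of model.encodingDescPart it would instead be a direct child of <encodingDesc>.

    ```xml
    <encodingDesc>
      <editorialDecl>
        <quotation marks="all">
          <p>Quotation marks retained.</p>
        </quotation>
        <!-- model.editorialDeclPart: the placement requested in the original ticket -->
        <punctuation>
          <p>All other punctuation marks retained.</p>
        </punctuation>
      </editorialDecl>
      <!-- model.encodingDescPart would place <punctuation> here instead,
           as a sibling of <editorialDecl> -->
    </encodingDesc>
    ```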

     
    • Group: AMBER --> GREEN
    • Priority: 5 --> 1(low)
     
  • James Cummings
    2014-05-28

    Reassigning to PFS to follow up.

     
  • James Cummings
    2014-05-28

    • assigned_to: Rebecca Welzenbach --> Paul Schaffner