#329 <choice> as child of <pc>


In some cases it is necessary to normalise punctuation in the course of transcribing a text. This may be because the punctuation is used incorrectly in a source text, or is omitted altogether when it should not be.

We would expect to be able to use <choice> as a child of <pc> in the same way as <choice> is used as a child of <w> for encoding diplomatic and normalised word forms (i.e., with <orig> and <reg>). However, the current content model of <pc> does not allow <choice> to be used at all.

The attached example demonstrates what we believe should be permitted.
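
A minimal sketch of the kind of encoding being asked for, set beside the existing <w> pattern. The word forms and punctuation marks are invented for illustration and are not taken from the attachment.

```xml
<!-- Already valid: <choice> inside <w> -->
<w><choice><orig>vniuersitie</orig><reg>university</reg></choice></w>

<!-- Requested: the same pattern inside <pc>
     (rejected by the content model at the time of this ticket) -->
<pc><choice><orig>,</orig><reg>.</reg></choice></pc>
```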


  • Martin Holmes

    Martin Holmes - 2011-12-12

    Would you not put <choice> around two <pc> tags, like this?
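
    (Illustration only, not from the original post: a sketch of <choice> wrapped around two <pc> elements, assuming a comma in the source is being regularized to a period.)

    ```xml
    <choice>
      <orig><pc>,</pc></orig>
      <reg><pc>.</pc></reg>
    </choice>
    ```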


  • Eric Andrew Haswell

    Yes, that does indeed work, but it seems redundant to have to use <pc> twice.

  • BODARD Gabriel

    BODARD Gabriel - 2011-12-12

    Well, what if something that's a <g> or a <c> or something else in one version is a <pc> in the other?

  • Eric Andrew Haswell

    Let me rephrase my response: for what we want to do, it seems redundant to have to use <pc> twice. I'm not suggesting that <g> or <c> should be removed from the content model, but rather that the content model be expanded to allow for other encoding possibilities, e.g. <choice>.

  • BODARD Gabriel

    BODARD Gabriel - 2011-12-13

    My previous comment was too terse: I was merely quibbling over the supposed redundancy of <choice><orig><pc/></orig><reg><pc/></reg></choice>, a redundancy that might be desirable if the character was not technically a "punctuation character" before it was regularized (or vice versa), e.g. <choice><orig><c>o</c></orig><reg><pc>.</pc></reg></choice>.

    This is not per se an argument against allowing choice inside pc, of course.

  • Martin Holmes

    Martin Holmes - 2011-12-13

    I think the issue is that <pc> is a character-level tag, and so it's at the bottom of the granularity tree, if you see what I mean. That is why it can contain only <c> and <g>. According to this model, the alternation represented by <choice> takes place at a higher level up the tree. This is not to say that having to include two <pc>s is not annoying, but I don't think it's erroneous or unprincipled.

  • Elena Pierazzo

    Elena Pierazzo - 2011-12-16

    I have to disagree with Martin on this: the argument that a punctuation mark consists of one character only doesn't work, as there are plenty of composite signs that function as a single punctuation mark. In my Renaissance text, for instance, the period is written like this: :~
    The creation of an element for punctuation was also aimed at clarifying the distinction between a character and a punctuation sign, the one being a graphical concept (conveyed by <c>) and the other a semantic, complex object. It is simply irrelevant how many characters are needed to represent a punctuation sign. <pc> should, really, behave like <w>.
    I don't really see why <choice> is fine within <w> to encode alternatives of any length, including one character's variation, while it is not allowed within <pc>, forcing you to have two of them! So I believe that this is a bug.

  • Martin Holmes

    Martin Holmes - 2011-12-16

    OK, I'm done playing devil's advocate. Can we come up with a proposal, then? Are we talking about manually adding <choice> to the content model of <pc>, or adding <pc> to model.pPart.editorial?

  • Lou Burnard

    Lou Burnard - 2012-02-02

    If we regard <pc> as a special kind of <w>, which I think is Elena's argument, then I think we are talking about adding a reference to model.pPart.edit to the content of <pc>, which would involve some revision, since it is currently just macro.xtext.

    The note currently reading "Contains a single character, a g element, or a sequence of graphemes to be treated as a single character." may also need some clarification. In modern punctuation, would we treat the sequence "period closing-quote" in the preceding sentence as two <pc>s or one? I would say two. I would also say that sequences like :- at the start of a list should be treated as a single <pc>.
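
    (Illustration only: a sketch of the two segmentations just described, using the characters mentioned in the comment.)

    ```xml
    <!-- "period closing-quote": two separate punctuation marks -->
    <pc>.</pc><pc>"</pc>

    <!-- ":-" introducing a list: one composite punctuation mark -->
    <pc>:-</pc>
    ```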

  • Lou Burnard

    Lou Burnard - 2012-03-13
    • milestone: --> AMBER

  • James Cummings

    James Cummings - 2012-04-15

    While I would probably have encoded this as two <pc> as people have suggested... as long as it is clear that a single act of punctuation is taking place, making <pc> claim membership in pPart.edit seems reasonable.

  • Lou Burnard

    Lou Burnard - 2012-06-09

    If the intention is to make <choice> available within <pc>, which is what the ticket says, the way to do that is not to make <pc> a member of pPart.edit (as both James and Martin suggest above). That would mean <pc> could appear in the same places as <choice>, which is not the same thing at all. Assuming some Homeric nodding on the part of my esteemed fellow council members, I have implemented what I believe was intended: the addition of a reference to model.pPart.edit in the content model of <pc>.
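
    (Illustration only: a hedged sketch of the distinction, written as an ODD customization with a RELAX NG content model. It is not the change actually committed; the exact set of alternatives inside <content> is an assumption based on the thread's description of <pc> as holding text, <g> and <c>.)

    ```xml
    <elementSpec ident="pc" mode="change"
                 xmlns="http://www.tei-c.org/ns/1.0"
                 xmlns:rng="http://relaxng.org/ns/structure/1.0">
      <!-- Not this: class membership would let <pc> appear wherever
           <choice> can appear, rather than letting <choice> appear in <pc>.
      <classes mode="change"><memberOf key="model.pPart.edit"/></classes>
      -->
      <content>
        <rng:zeroOrMore>
          <rng:choice>
            <rng:text/>
            <rng:ref name="model.gLike"/>
            <rng:ref name="c"/>
            <!-- The addition: members of model.pPart.edit, such as
                 <choice>, <orig> and <reg>, become valid children of <pc> -->
            <rng:ref name="model.pPart.edit"/>
          </rng:choice>
        </rng:zeroOrMore>
      </content>
    </elementSpec>
    ```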

  • Lou Burnard

    Lou Burnard - 2012-06-09
    • status: open --> closed-accepted