#402 A note about white space in <persName> et al.

Milestone: AMBER
Status: closed-accepted
Owner: Martin Holmes
Priority: 5
Updated: 2013-04-27
Created: 2012-11-03
Creator: John P. McCaskey
Private: No

I suggest this addition to the Guidelines 3.2.1:

White space is allowed and therefore significant between elements within <name>, <persName>, <orgName>, and <placeName>. Therefore

<persName>
<forename>Mary</forename>
<forename>Ann</forename>
<nameLink>De</nameLink><surname>Mint</surname>
</persName>

encodes "Mary Ann DeMint" and

<persName>
<forename>Mary</forename><forename>Ann</forename>
<nameLink>De</nameLink>
<surname>Mint</surname>
</persName>

encodes "MaryAnn De Mint".
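
The two readings above can be checked mechanically. What follows is a minimal Python sketch and not part of the proposal. It assumes a consumer that concatenates the text content, collapses each run of white space to a single space, and trims the ends. The helper name `flatten` is hypothetical, and the element names are assumed to be spelled `forename` and `surname`.

```python
import xml.etree.ElementTree as ET

# The two encodings from the proposal, with line breaks written out explicitly.
FIRST = (
    "<persName>\n"
    "<forename>Mary</forename>\n"
    "<forename>Ann</forename>\n"
    "<nameLink>De</nameLink><surname>Mint</surname>\n"
    "</persName>"
)
SECOND = (
    "<persName>\n"
    "<forename>Mary</forename><forename>Ann</forename>\n"
    "<nameLink>De</nameLink>\n"
    "<surname>Mint</surname>\n"
    "</persName>"
)


def flatten(xml_source: str) -> str:
    """Join all text nodes, collapse white-space runs, and trim the ends.

    This normalization is an assumption about the consumer, not a rule
    laid down by TEI or XML.
    """
    root = ET.fromstring(xml_source)
    return " ".join("".join(root.itertext()).split())


print(flatten(FIRST))
print(flatten(SECOND))
```

Under that assumption the first call prints "Mary Ann DeMint" and the second prints "MaryAnn De Mint". A consumer that normalizes differently would give different strings.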

Discussion

  • Lou Burnard
    2013-01-07

    3.2.1 is about punctuation, which is not quite the same thing as white space, but I agree some remarks along these lines might be useful here.

     
  • James Cummings
    2013-01-08

    Assigning to louburnard to clarify the prose; Setting to AMBER for now.

    I also note that one could wrap in <forename> like:

    <forename><forename>Mary</forename><forename>Ann</forename></forename> to help control this.

     
  • James Cummings
    2013-01-08

    • milestone: --> AMBER
    • assigned_to: nobody --> louburnard
     
  • Nesting <forename> might be useful in some situations, but white space is significant within it, too. So the same care is required with <forename> as with <name>, <persName>, <orgName>, and <placeName>.

    <forename>
    <forename>Mary</forename>
    <forename>Ann</forename>
    </forename>

    encodes " Mary Ann "

    <forename><forename>Mary</forename>
    <forename>Ann</forename></forename>

    encodes "Mary Ann"

    <forename><forename>Mary</forename><forename>Ann</forename></forename>

    encodes "MaryAnn"

    Or, strictly, the first and second both encode Mary on one line and Ann on another. But virtually all downstream processors will convert that line break to a space, even though neither the TEI Guidelines nor the XML specification says they must or even should.

    For full treatment of the trickiness here, see http://wiki.tei-c.org/index.php/XML_Whitespace.
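
The distinction drawn above, between what the markup strictly contains and what a typical processor makes of it, can be sketched in Python. This is an illustration only. The helper names `raw_text` and `collapse` are hypothetical, and `collapse` models just one common behaviour: turning each run of white space into a single space without trimming.

```python
import re
import xml.etree.ElementTree as ET

# The three nested <forename> variants, with line breaks written out explicitly.
VARIANTS = [
    "<forename>\n<forename>Mary</forename>\n<forename>Ann</forename>\n</forename>",
    "<forename><forename>Mary</forename>\n<forename>Ann</forename></forename>",
    "<forename><forename>Mary</forename><forename>Ann</forename></forename>",
]


def raw_text(xml_source: str) -> str:
    """Return the text content exactly as the parser reports it."""
    return "".join(ET.fromstring(xml_source).itertext())


def collapse(text: str) -> str:
    """Assumed downstream step: each white-space run becomes one space."""
    return re.sub(r"\s+", " ", text)


for variant in VARIANTS:
    raw = raw_text(variant)
    print(repr(raw), "->", repr(collapse(raw)))
```

The raw strings keep their line breaks ('\nMary\nAnn\n' for the first variant). Only after the assumed collapsing step do they become ' Mary Ann ', 'Mary Ann' and 'MaryAnn'.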

     
  • I think nesting forename is a red herring. It may help, but it doesn't absolve us from the problem of reminding folks that whitespace (or the lack of it) is significant more often than they realize.

     
  • . . . and of reminding them that examples in the Guidelines often assume particular downstream processing that is nowhere documented.

    That is: Use these examples to guide your own practice, but -- secret of secrets -- be sure the consumers of your TEI file process whitespace the way the examples presume they will. What exactly that is, is for you to figure out. Case by case. Good luck.

     
  • Ah, John, I can see you'll be a founder member of the Society to Define a Genuine TEI Processing Model. I think your description of the situation is spot on!

     
  • James Cummings
    2013-04-11

    Council face-to-face 2013-04 decided that if you really did mean 3.2.1 then we should close and reject this ticket. Otherwise, please give us the correct section.

     
  • James Cummings
    2013-04-11

    • assigned_to: Lou Burnard --> Martin Holmes
     
  • Correction: That should have said Guidelines 13.2.1, not 3.2.1

     
  • Martin Holmes
    2013-04-27

    Added John's explanation at rev 12023.

     
  • Martin Holmes
    2013-04-27

    Taking a second shot in rev 12024. My previous commit fell foul of the indenting of the content of egXML elements, so I've had to recast the examples as eg elements with CDATA. I've also added a link to our section on whitespace in ST.
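
The general hazard with indentation can be sketched in Python. This is an illustration under assumptions, not a reconstruction of what happened in the Guidelines build: it uses `xml.etree.ElementTree.indent` (Python 3.9 or later) as a stand-in for any tool that re-indents element content, and the helper name `collapsed_text` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def collapsed_text(element: ET.Element) -> str:
    """Join all text nodes and turn each white-space run into one space."""
    return re.sub(r"\s+", " ", "".join(element.itertext()))


# An example written with no white space between the two forenames.
name = ET.fromstring(
    "<persName><forename>Mary</forename><forename>Ann</forename></persName>"
)

before = collapsed_text(name)
ET.indent(name)  # re-indents in place, inserting line breaks and spaces
after = collapsed_text(name)

print(repr(before), "->", repr(after))
```

Here the text changes from 'MaryAnn' to ' Mary Ann ' purely because of re-indentation, which is the kind of change an example about significant white space cannot tolerate.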

     
    Last edit: Martin Holmes 2013-04-27
  • Martin Holmes
    2013-04-27

    Done. Closing the ticket.

     
  • Martin Holmes
    2013-04-27

    • status: open --> closed-accepted