#442 Allow <foreign> to contain <q>


Consider the following (from FDT's Henry V):

La main, de hand. Les doigts, le fingres.
Je pense que je suis le bon écolier. J’ai gagné deux
mots d’anglais vitement. Comment appelez-vous “les

Ideally we would enclose all foreign words in the <foreign> tag (in this example we consider "de hand" and "le fingres" to be English/non-foreign). We would like to enclose "des ongles" in <q> tags. If <foreign> can contain <soCalled> or <mentioned>, what is the case against being able to do so for <q>?

We find there to be a slippery slope between <soCalled> and <mentioned>, so we prefer to use a generic <q> for all quotes throughout the project. For example, in Hermia's line below, is "little" in quotes because it was mentioned by Helena, or because Hermia is so-called little?
“Little” again? Nothing but “low” and “little”?


  • Lou Burnard

    There are two reasons why <q> is not permitted within <foreign>. Firstly, <q> is an "inter"-level element, which can appear between paragraphs as well as within them. Allowing it within <foreign> would therefore be quite a major change to the content model of <foreign> -- it would have to allow paragraphs and lists as well. Secondly, <foreign>-ness is regarded as subsidiary to <q>-ness. This is not so arbitrary as it sounds: it's quite likely that a <q> will contain both foreign and non-foreign material, whereas the reverse seems less probable. Also, foreignness can most economically be indicated by an attribute.

    I would tag the speech you quote as follows:
    <p xml:lang="fr">La main, <mentioned xml:lang="en">de hand</mentioned>. Les doigts, <mentioned xml:lang="en">le fingres</mentioned>. Je pense que je suis le bon écolier. J’ai gagné deux mots d’anglais vitement. Comment appelez-vous <mentioned rend="quoted">les ongles</mentioned>?</p>

    [Though this doesn't, of course, indicate that what is presented as "French" here wouldn't go down so well in the hexagon ("vitement", forsooth).]

    Which brings us to the slippery slope of <mentioned>. You're quite right in thus characterising it, but you're not required to make this subjective and perhaps over-subtle decision. You can simply use <q> for anything wrapped in quote marks -- effectively as a synonym for <hi rend="quoted"> if you like: many people do.
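
    As an illustration of that last option, Hermia's line from the original request might be encoded with plain <q> as sketched below. The enclosing <l> is an assumption about how the verse line is tagged, and the printed quotation marks are assumed to be dropped in favour of the markup.

    ```xml
    <!-- Sketch only: every quoted word is a plain <q>, with no attempt to
         decide between <soCalled> and <mentioned>. -->
    <l><q>Little</q> again? Nothing but <q>low</q> and <q>little</q>?</l>
    ```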

    Last edit: Lou Burnard 2013-03-13
  • Lou Burnard

    • milestone: --> GREEN
  • Lou Burnard

    • status: open --> open-rejected
    • status: open-rejected --> open-accepted
    • status: open-accepted --> open
  • Lou Burnard

    • assigned_to: Sebastian Rahtz
  • Lou Burnard

    I am not sure why this ticket has been opened again. Assigning to Sebastian to explain himself.

    • status: open --> closed
  • Michael asks about this today, and I see no response from Folger, so re-opening this to allow comment.

    • status: closed --> open
    • Priority: 5 --> 1(low)
    • The release notes for 2.4.0 say:
      A number of changes to loosen content models, mostly driven by experience of TCP EEBO project:
      allow <foreign> to contain <q> (per FR 442)

      But I don't see the change in the guidelines.

  • That was a mistake in the release notes, I am afraid: a misreading of the outcome of this ticket.

  • James Cummings

    • Group: GREEN --> AMBER
  • James Cummings

    Apologies, that seems to have been in error. What is the status of this ticket now? Changing from GREEN to AMBER.

  • Discussed by Council 11/13. Concluded that <foreign> is not intended to be used as a structural wrapper element like this, and that LB's reformulation should be acceptable.

    Last edit: Sebastian Rahtz 2013-11-11
    • status: open --> closed
  • This case is exceptional, and there are workarounds, so it's not worth fighting over. Consider this response merely for the record.

    Often, the foreign phrase is a subset of the speech. For example, Lafew in All's Well That Ends Well:

    Lustig, as the Dutchman says. I’ll like a maid
    the better whilst I have a tooth in my head. Why,
    he’s able to lead her a coranto.

    And Parolles:

    I would have that drum or another, or
    hic jacet.

    Here we need a container, probably <foreign>.

    Applying tags consistently is one of our aims. In my mind, someone who wants to search for foreign words/phrases should not have to search in two places: foreign tags and xml:lang attributes. Now, if we can apply the xml:lang attribute to all <foreign> elements, we don't have this problem.
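
    A sketch of what this would look like for Parolles's line, assuming the speech is tagged as a <p> (an assumption, not taken from the edition) and that the phrase is identified as Latin with the language code "la":

    ```xml
    <!-- Sketch only: the <foreign> element carries the language itself,
         so a search has one place to look. -->
    <p>I would have that drum or another, or
      <foreign xml:lang="la">hic jacet</foreign>.</p>
    ```

    With markup like this, a single query such as //tei:foreign[@xml:lang='la'] (with the tei prefix bound to the TEI namespace) would find the phrase.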

    But, again from All's Well:

    First Soldier: Boskos thromuldo boskos.

    and a few lines later:

    FIRST SOLDIER Boskos vauvado, I understand thee and
    can speak thy tongue. Kerelybonto, sir, betake thee
    to thy faith, for seventeen poniards are at thy

    In the first example, you have a foreign phrase that comprises the whole speech. According to LB, we should use xml:lang in the <sp> element. In the second example, we need to use the <foreign> tag. Why not use the <foreign> tag for both? Besides, this is a made-up foreign language. When the language cannot be identified, or when it is a corruption of a known language, how do we apply the xml:lang attribute?

  • I think we felt that <foreign> is a shorthand for when you just want to quickly mark something in an unstructured way, a bit like <hi rend="foreign">. Using xml:lang is more generic and extensible. So we suggest using @xml:lang everywhere, on <seg> if it's not distinguished in any other way, and not using <foreign> at all in your context.

    There are provisions in xml:lang for unidentified languages, made-up languages, and private variants.
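
    For illustration, the sketch below applies these provisions to the invented language in All's Well. It relies on BCP 47 language tags, which xml:lang uses: "und" is the subtag for an undetermined language, "art" covers artificial languages, and anything after "x-" is private use. The subtag "x-boskos" is a hypothetical name made up here, and the <p> wrapper is an assumption about how the speech is tagged.

    ```xml
    <!-- Invented language marked on <seg>; "x-boskos" is a hypothetical
         private-use subtag, not an established code. -->
    <sp>
      <speaker>FIRST SOLDIER</speaker>
      <p><seg xml:lang="art-x-boskos">Boskos vauvado</seg>, I understand thee and
        can speak thy tongue.</p>
    </sp>

    <!-- Alternative when nothing at all is claimed about the language. -->
    <seg xml:lang="und">Kerelybonto</seg>
    ```

    A private-use value like this would normally be documented in the TEI header, for example in a <language> element inside <langUsage>.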

    Last edit: Sebastian Rahtz 2013-11-15
  • Lou Burnard

    You most certainly can use xml:lang on the <foreign> element. In fact, that's what the Guidelines recommend, if you have no other tag available to carry the attribute. You can also (though the Guidelines don't say this) use <foreign> with the xml:lang attribute redundantly within an element the entirety of whose content is in a foreign language. So I am afraid I am still failing to understand what the problem is here.
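
    A minimal sketch of the redundant usage described here, applied to the First Soldier's one-line speech quoted earlier. The "und" (undetermined) language tag and the <p> wrapper are assumptions chosen for illustration, not markup from the edition.

    ```xml
    <!-- xml:lang on <sp> covers the whole speech; the inner <foreign>
         repeats it redundantly so that all foreign text sits in <foreign>. -->
    <sp xml:lang="und">
      <!-- The speaker label is English, so it overrides the inherited value. -->
      <speaker xml:lang="en">First Soldier</speaker>
      <p><foreign xml:lang="und">Boskos thromuldo boskos.</foreign></p>
    </sp>
    ```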

  • The problem came when we had a foreign language we couldn't identify and didn't know how to apply the xml:lang attribute. If there are provisions for unidentified and made up languages, we'll look into that.

    The original problem was that we can't use the <foreign> tag redundantly when the content contains a <q> tag. But, like I said, it's a minor issue that rarely occurs for us.