#81 some egXMLs inherit 'en' when they aren't

GREEN
closed-fixed
David Sewell
4
2009-06-04
2009-04-03
James Cummings
No

It is a bug that some <egXML> elements inherit the overall language of that version of the Guidelines when they are not in fact in that language. The example raised was that some examples in Middle English (enm) or Anglo-Saxon (ang) (and possibly Latin (lat)) may not have the right @xml:lang attribute.

It is suggested in another feature request that @xml:lang (and @xml:id) be applied to all egXML and exemplum elements, whether or not this duplicates the value they would inherit.

We will attempt to create a more detailed list of egXML elements with missing/incorrect @xml:lang attributes. For the /en/ version of the Guidelines it is assumed that all examples are in Modern English unless otherwise stated. So this list must cover all examples where the language isn't specified, and each must be double-checked to confirm that it is indeed in Modern English. Other divisions such as 'Early Modern English' will be ignored and counted as Modern English since I do not believe it has an ISO language code of its own.

We'll add the results of our investigations to this ticket.

-James

Discussion

  • James Cummings
    2009-04-03

    It occurred to me that I should find an example of this happening. All I did was look randomly through a chapter of the English Guidelines for the first example I could find that was not in English.

    Very quickly in http://www.tei-c.org/release/doc/tei-p5-doc/en/html/PH.html I found an example containing this text (I have no other way to reference examples since they aren't addressable!):

    "<head facs="#B49rHead">DU SON ET ACCORD DES CLOCHES ET <lb/> des alleures des chevaulx, [...]"

    Looking at the underlying XML, the <egXML> has no @xml:lang attribute, thus is assumed to be 'en' which it plainly isn't. I should stress that this is just the first example I randomly selected. You may also notice that it doesn't have a link to the bibliography.

    Other randomly selected examples in the same chapter in Middle English don't have an @xml:lang.

    Methinks this means that more checking is probably needed. I will attempt to produce a list of all examples regardless of whether they have an @xml:lang or not. If they do, I will output that alongside the example to check whether it is the correct language. Alongside this, I should output some identifying string (XPath? ancestor::tei:div/@xml:id?) that can be submitted to indicate the errors.

    -James
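
The listing described in this comment could be sketched roughly as follows. This is only an illustration in Python, not the script actually used for this ticket: the function name `list_examples`, the sample fragment and its `xml:id` values are hypothetical, and it assumes that examples are `egXML` elements in the `http://www.tei-c.org/ns/Examples` namespace and that the nearest enclosing `tei:div` with an `xml:id` is a good enough locator.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
TEI_NS = "http://www.tei-c.org/ns/1.0"
EG_NS = "http://www.tei-c.org/ns/Examples"

XML_LANG = f"{{{XML_NS}}}lang"
XML_ID = f"{{{XML_NS}}}id"


def list_examples(root, default_lang="en"):
    """Return one report row per egXML element found under root."""
    rows = []

    def walk(elem, inherited_lang, div_id):
        # xml:lang on the element itself overrides whatever was inherited.
        declared = elem.get(XML_LANG)
        lang = declared if declared is not None else inherited_lang
        # Remember the nearest enclosing tei:div that carries an xml:id.
        if elem.tag == f"{{{TEI_NS}}}div" and elem.get(XML_ID):
            div_id = elem.get(XML_ID)
        if elem.tag == f"{{{EG_NS}}}egXML":
            # Collapse whitespace so the snippet fits on one report line.
            text = " ".join("".join(elem.itertext()).split())
            rows.append({
                "div": div_id,
                "declared": declared,   # None means "inherited only"
                "effective": lang,
                "snippet": text[:60],
            })
            return  # Do not descend into the example's own markup.
        for child in elem:
            walk(child, lang, div_id)

    walk(root, default_lang, None)
    return rows


# Hypothetical fragment; the xml:id values are made up for illustration.
SAMPLE = """
<div xmlns="http://www.tei-c.org/ns/1.0" xml:id="PH" xml:lang="en">
  <div xml:id="PH-sample">
    <egXML xmlns="http://www.tei-c.org/ns/Examples">
      <head>DU SON ET ACCORD DES CLOCHES</head>
    </egXML>
    <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:lang="ang">
      <l>placeholder Old English line</l>
    </egXML>
  </div>
</div>
"""

for row in list_examples(ET.fromstring(SAMPLE)):
    print(row)
```

Rows whose `declared` value is `None` are the ones that silently inherit the chapter language, so those are the ones a human reviewer would need to read.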

     
  • Lou Burnard
    2009-05-24

    • milestone: --> GREEN
    • assigned_to: nobody --> dsew
     
  • David Sewell
    2009-05-26

    This has been done now (2009-05-26) for all <exemplum> elements in the element/class reference files under P5/Source/Specs, to facilitate selection of appropriate exempla according to documentation language.

    It remains to add @xml:lang to all instances of <eg:egXML> in the Guidelines chapters.

    DS

     
  • David Sewell
    2009-05-26

    • status: open --> open-accepted
     
  • David Sewell
    2009-06-04

    I've done a first pass at adding @xml:lang to <egXML> elements in the Guidelines chapters that should not inherit @xml:lang="en". I did not in fact bother to distinguish Modern and Middle English, but I did tag Old English as "ang". Multilingual examples are "mul"; those with no natural language, or whose language is indeterminate, are "und". In many cases these judgments are somewhat subjective; I generally treated anything with primary English content in unconstrained element or attribute values as English.
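
One way to review the result of such a pass is to tally the @xml:lang values now declared on egXML elements. The following Python sketch is an assumption about how that might be done, not a record of what was run here; `tally_declared_langs` and the sample fragment are hypothetical names and data.

```python
from collections import Counter
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
EGXML = "{http://www.tei-c.org/ns/Examples}egXML"


def tally_declared_langs(root):
    """Count egXML elements by the xml:lang they declare themselves."""
    counts = Counter()
    for eg in root.iter(EGXML):
        code = eg.get(XML_LANG)
        # Examples with no attribute of their own still inherit "en".
        counts[code if code is not None else "(inherited)"] += 1
    return counts


# Hypothetical fragment using the codes mentioned in the comment above.
SAMPLE = """
<div xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:eg="http://www.tei-c.org/ns/Examples" xml:lang="en">
  <eg:egXML><eg:p>An English example.</eg:p></eg:egXML>
  <eg:egXML xml:lang="ang"><eg:l>placeholder Old English line</eg:l></eg:egXML>
  <eg:egXML xml:lang="und"><eg:ptr target="#a1"/></eg:egXML>
</div>
"""

for code, n in sorted(tally_declared_langs(ET.fromstring(SAMPLE)).items()):
    print(code, n)
```

The tally only shows what is declared; whether "mul" or "und" was the right call for a given example remains the subjective judgment described above.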

     
  • David Sewell
    2009-06-04

    • status: open-accepted --> closed-accepted
     
  • David Sewell
    2009-06-04

    • status: closed-accepted --> closed-fixed