#135 permit looser content for physDesc

Lou Burnard

The content model for physDesc is (model.pLike+ | model.physDescPart_sequence_optional), which enforces the rule that a physical description must be either entirely "unstructured" or entirely composed of specialized model.physDescPart elements. Though reasonable, this is a real problem for large conversion projects, where some parts of the input text can definitely be assigned to one of the specialized elements and other parts cannot. The only option at present is to treat the whole of the input as unstructured, which loses information unnecessarily. The proposal is to relax the content model slightly so as to permit an initial sequence of paragraphs:
(model.pLike*, model.physDescPart_sequence_optional)
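
To make the difference between the two content models concrete, here is a minimal sketch that treats each model as a regular expression over child elements. The shorthand is an assumption for illustration only: "p" stands for any model.pLike element and "s" for any specialized model.physDescPart element, and model.physDescPart_sequence_optional is simplified to "s*" (the real class allows each specialized element at most once, in a fixed order). The names CURRENT_MODEL, PROPOSED_MODEL and is_valid are hypothetical.

```python
import re

# Hypothetical shorthand: "p" = any model.pLike element,
# "s" = any specialized model.physDescPart element.
# Simplification (assumption): model.physDescPart_sequence_optional is
# reduced to "s*"; the real class is an ordered sequence of optional elements.
CURRENT_MODEL = re.compile(r"(?:p+|s*)")   # (model.pLike+ | physDescPart sequence)
PROPOSED_MODEL = re.compile(r"p*s*")       # (model.pLike*, physDescPart sequence)


def is_valid(model, children):
    """Return True if the whole child sequence matches the content model."""
    return model.fullmatch(children) is not None


# Compare both models on a few child sequences.
for children in ["pp", "sss", "pss", "sp"]:
    print(children,
          "current:", is_valid(CURRENT_MODEL, children),
          "proposed:", is_valid(PROPOSED_MODEL, children))
```

Under these assumptions, a mixed sequence such as "pss" (paragraphs followed by specialized elements) is rejected by the current model and accepted by the proposed one, while "sp" (paragraphs after the specialized elements) is rejected by both.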


  • Syd Bauman

    Syd Bauman - 2008-07-30

    Logged In: YES
    Originator: NO

    This is a *very* welcome development, but does not go nearly far enough in addressing the problem.

    The general concept that content must be entirely unstructured or entirely composed of specialized elements applies to many other elements in the TEI scheme as well. But I cannot think of any case in which it is a sensible restriction. It is almost *always* the case that, for a particular bit of information, someone may have useful things to say that fit neatly into the structured components the TEI has provided, *and* simultaneously have other, non-structured things they'd like to say.

    I strongly recommend that all such content models in the TEI (I will try to provide a list shortly) be reviewed and that, unless there is a compelling reason to prohibit the combination of structured and unstructured data, each content model be changed to resemble that of the P2 to P4 <encodingDesc> element, which permitted structured information (albeit differently than P5) followed by a series of zero or more paragraphs. This was a brilliant way of addressing both the desire to describe certain bits of information in a machine-processable way and the desire to describe in prose whatever requires further detail or cannot be described in the structured elements.

    Thus, for <physDesc>, I am suggesting
    ( model.physDescPart_sequence_optional, model.pLike* )
    But more important than whether the model.pLike goes before or after the rest, I am suggesting that all such elements should have this improvement.

  • Syd Bauman

    Syd Bauman - 2008-07-30

    Logged In: YES
    Originator: NO

    Elements which currently suffer from an unstructured OR structured (but not both) content model, and thus should be considered for this corrective action include:

  • Lou Burnard

    Lou Burnard - 2008-08-17
    • milestone: --> GREEN
  • James Cummings

    James Cummings - 2008-08-18
    • priority: 5 --> 7
  • Lou Burnard

    Lou Burnard - 2008-09-03
    • status: open --> closed
  • Lou Burnard

    Lou Burnard - 2008-09-03

    Logged In: YES
    Originator: YES

    Implemented changes to bindingDesc and binding, plus modifications to the MS chapter, at svn revision #4740

