#469 add extent to att.dimensions

GREEN
closed-accepted
Lou Burnard
None
5(default)
2013-11-20
2013-08-23
Laurent Romary
No

This should be a simple one. So far, extent is a bare element that offers nothing beyond a plain-text description of the extent. My use case is describing a corpus (TEICorpus or biblFull) in terms of a number of units (lemmas, tokens, etc.), so that it can serve as a reference when indicating a frequency for a lexical entry. Typically: [extent unit="token" quantity="2.3e6"/] for a 2.3 million token corpus.

Discussion

  • Martin Holmes
    2013-08-23

    Sounds very straightforward. The only slight worry is that the <extent> element will then carry the @extent attribute:

    @extent: indicates the size of the object concerned using a project-specific vocabulary combining quantity and units in a single string of words.

    So there will be two distinct ways of including a text description, and users will have to choose between them. I wonder if this ambiguity might be mitigated by some Schematron that says if you use @extent you can't include text inside the <extent> element, and vice versa?
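
    A minimal sketch of the kind of Schematron rule suggested here. It assumes the proposal were adopted, so that <extent> actually carried @extent; the tei prefix binding and the wording of the message are illustrative, not taken from the TEI source. A single rule covers both directions, since "attribute present implies no text" and "text present implies no attribute" are the same constraint.

    ```xml
    <sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
                queryBinding="xslt2">
      <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
      <sch:pattern>
        <!-- Fires only on <extent> elements that carry @extent
             (hypothetical: requires <extent> to be a member of att.dimensions) -->
        <sch:rule context="tei:extent[@extent]">
          <!-- Fails if the element also contains any non-whitespace text -->
          <sch:assert test="normalize-space(.) = ''">
            Use either the @extent attribute or text content inside
            extent, not both.
          </sch:assert>
        </sch:rule>
      </sch:pattern>
    </sch:schema>
    ```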

     
    • BODARD Gabriel
      2013-08-23

      I don't know if I'd even worry about that. I tend to use @extent with a project-specific but nonetheless constrained vocabulary, whereas the element content would be free text (and possibly transcription from the source). So you might say something like: <extent extent="101.3cm" precision="medium">forty-odd inches</extent>.

      (But in almost every case I use @quantity rather than @extent anyway; my impression is that EEBO is the main reason we haven't deprecated @extent altogether? So maybe we could take a step in the right direction by removing @extent from this element before we start...)

      Either way, I second this proposal, which seems uncontroversial to me.

       
  • Lou Burnard
    2013-08-23

    Adding <extent> to att.dimensions would have the drawback that you could then only supply one set of dimensions for it. So you couldn't say e.g. my corpus has 10 gazillion words, 14 million sentences, and measures exactly 12345 Kbytes.

    What's wrong with making local practice require <measure> as content of <extent> (which is already permitted there)? This would be more consistent with other Header elements at the same level as extent (which should really be something like <extentStmt> of course).

    The comments about @extent vs @quantity + @unit seem slightly irrelevant. We defined @extent as a convenience for those not wishing to specify both @quantity and @unit. Removal/deprecation of it is imho neither warranted nor advisable.
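
    A sketch of how <measure> inside <extent> might record several counts for one corpus, using the figures mentioned in this thread. The @unit values ("token", "sentence", "Kbyte") are assumptions made for illustration; a project would define its own vocabulary.

    ```xml
    <extent>
      <!-- One <measure> per kind of unit; @quantity holds the machine-readable number -->
      <measure unit="token" quantity="2300000">2.3 million tokens</measure>
      <measure unit="sentence" quantity="14000000">14 million sentences</measure>
      <measure unit="Kbyte" quantity="12345">12345 Kbytes</measure>
    </extent>
    ```

    Because each <measure> carries its own @unit and @quantity, any number of measurements can sit side by side, which a single set of attributes on <extent> itself could not express.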

     
  • James Cummings
    2013-08-23

    I'm with Lou here. I would constrain <extent> to require <measure>. Perhaps we should view <extent> more as a grouping element for a set of <measure>s, but the flexibility of having just a string in here as well as more structured data is better than overloading the container I think.

     
    • Martin Holmes
      2013-08-23

      "I would constrain <extent> to require <measure>. "

      Presumably you mean "in my project schema", rather than in the TEI schemas? Adding that constraint would invalidate thousands of documents in the wild.

      I hadn't considered the availability of <measure> when I responded above, and I wonder if Laurent had also not realized it was there. Using that makes more sense than adding <extent> to att.dimensions. We should, though, check whether there's an example of this usage in the Guidelines, and if not, add one.

       
  • James Cummings
    2013-08-23

    Err, of course! I mean as a customisation of the TEI - not as a general rule for all TEI documents. But we could create an example demonstrating this and even recommend it as better practice (if indeed we decide it is).

    <measure> has been allowed there since the start of P5 at least. ;-) But I definitely support adding an example. (If ever in doubt, I support more examples rather than fewer.)

     
  • Laurent Romary
    2013-08-24

    Since the only examples in the Guidelines are plain-text ones, I would recommend providing at least one structured example within the elementSpec of extent, and also somewhere in the prose about the header.

     
  • Lou Burnard
    2013-08-25

    I started adding some text and an example to HD. Looking at this however drew to my attention the current overlap between att.dimensions and att.measured, which seems a bit silly: I have raised another ticket (470) on that.

     
  • James Cummings
    2013-11-09

    What is the conclusion with this ticket then? Should this be closed in preference to FR 470?

     
    • assigned_to: Lou Burnard
     