#20 <extent>

Status: closed
Owner: Lou Burnard
Labels: None
Priority: 6
Updated: 2007-02-15
Created: 2004-08-20
Creator: Andreas Nolda
Private: No

According to the examples in P4, § 6.10, <extent> can
be used for specifying the number of pages of a
bibliographic item like a book. Another reasonable usage
of <extent> would be, for instance, the specification of
the total number of volumes of a multi-volume book.

As <extent> does not have a "type" attribute, measure
strings like "pp." or "vols" have to be included in its
content. This is unfortunate in cases where
bibliographic data should be stored in a language- and
style-neutral way.
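
For illustration (a sketch, not one of the P4 examples; the
counts are only placeholders), the unit currently has to be
written into the element content:

<extent>1247 pp.</extent>
<extent>2 vols</extent>

Any application that needs the bare number, or a label in
another language or citation style, has to parse these
strings first.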

In addition, <biblScope type="pages"> cannot simply be
substituted for <extent type="pages"> because of their
distinct semantics. <biblScope> defines the
'scope'--some part--of a bibliographic item (say, a
collection) with respect to some subitem (e.g., an
article). <extent>, on the other hand, measures the
whole (e.g., the collection).

So the xml-biblio group (cf. the archives of the
xml-biblio-discuss@lists.sourceforge.net mailing list)
proposes to add a "type" attribute to <extent>, with
suggested values similar to those of <biblScope>'s
"type" attribute.

Here is an example, using P4's <biblStruct> model:

<biblStruct>
  <analytic>
    <author>
      <persName>
        <forename>Edward</forename>
        <forename>L.</forename>
        <surname>Keenan</surname>
      </persName>
    </author>
    <author>
      <persName>
        <forename>Dag</forename>
        <surname>Westerst&#xE5;hl</surname>
      </persName>
    </author>
    <title lang="eng" level="a">Generalized Quantifiers in Linguistics and Logic</title>
  </analytic>
  <monogr>
    <title lang="eng" level="m">Handbook of Logic and Language</title>
    <editor>
      <persName>
        <forename>Johan</forename>
        <forename>F.</forename>
        <forename>A.</forename>
        <forename>K.</forename>
        <nameLink>van</nameLink>
        <surname>Benthem</surname>
      </persName>
    </editor>
    <editor>
      <persName>
        <forename>Alice</forename>
        <nameLink>ter</nameLink>
        <surname>Meulen</surname>
      </persName>
    </editor>
    <imprint>
      <pubPlace>Amsterdam</pubPlace>
      <publisher>Elsevier</publisher>
      <pubPlace>Cambridge, Mass.</pubPlace>
      <publisher>MIT Press</publisher>
      <date>1997</date>
    </imprint>
    <extent type="pages">1247</extent>
    <biblScope type="pages">837&#x2013;893</biblScope>
  </monogr>
</biblStruct>

Discussion

  • Lou Burnard
    2004-08-20

    Logged In: YES
    user_id=1021146

    I would have expected the TYPE attribute for <extent> to
    take values such as "exact" or "approx". If there is a need for
    an attribute with the meanings suggested here, then I
    think "UNITS" might be a better name for it. However, as
    currently defined, <extent> is not really a structured field as
    it has potential for much wider application than this. For
    example, it might contain more than one kind of unit
    (e.g. "2000 files of average size 4000 mega octets", "12 A4
    pages bound with 16 assorted sizes photographs").

    You're right to say it's not the same as biblScope tho.

     
  • Lou Burnard
    2004-08-20

    • status: open --> pending
     
  • Andreas Nolda
    2004-08-20

    Logged In: YES
    user_id=950793

    "Unit" would indeed be more to the point than "type". "Type",
    however, is more in line with <biblScope>'s "type" attribute,
    which serves the same purpose. But one could change that
    attribute name, too, of course ...

    That new attribute on <extent> should be an optional one
    (as is <biblScope>'s "type" attribute). As a consequence, it
    would still be perfectly legal to use <extent> without any
    attribute for complex contents like "12 A4 pages bound with
    16 assorted sizes photographs".

     
  • Andreas Nolda
    2004-08-20

    • status: pending --> open
     
  • James Cummings
    2004-09-24

    Logged In: YES
    user_id=612078

    I have experienced people using <extent> for multiple
    extent-like references, and think the further (optional)
    structuring of <extent> would be very useful. Currently
    people do things like:

    <extent>
    <seg type="designation">Text data</seg>
    <seg type="wordsize">60,000 words</seg>
    <seg type="filenumber">1 TEI XML File</seg>
    <seg type="filesize">123 KiB</seg>
    </extent>

    I'm not saying that is the *right* way of doing things, but
    obviously there is a demand for recording this kind of
    information that never seems to fit all together in one
    place easily.

    -James

     
  • Andreas Nolda
    2004-09-27

    Logged In: YES
    user_id=950793

    With a "type" attribute on <extent>, you could say instead:

    <extent type="words">60000</extent>
    <extent type="files">1</extent>
    <extent type="KB">123</extent>
    ...

     
  • James Cummings
    2004-09-27

    Logged In: YES
    user_id=612078

    I wasn't looking at using <extent> inside a <biblStruct>,
    but inside the teiHeader's <fileDesc>, where only one seems
    to be (currently) allowed. Since these are
    all types of extent, though, I'd be more comfortable with
    them being nested.
    <extent>
    <foo type="words">60000</foo>
    ...
    </extent>

    But I am certainly in favour of more type attributes, even
    if it makes them prone to abuse.

    -James

     
  • Syd Bauman
    2004-10-05

    Logged In: YES
    user_id=686243

    Currently (i.e., in P4) <extent> may occur within <bibl>,
    <biblFull>, <fileDesc>, or <monogr>, but not <biblStruct>.
    It can be repeated as a child of <bibl> or <monogr>, but
    not when it is a child of <fileDesc> or <biblFull>.

    But <measure> is a valid child of <extent>, and perfectly
    reasonable for this use, I should think.

    <extent>
    <seg type="designation">Text data</seg> <!-- what is
    this? -->
    <measure type="wordsize">60,000 words</measure>
    <measure type="filenumber">1 TEI XML
    File</measure>
    <measure type="filesize">123 KiB</measure>
    </extent>

    And, it seems to me, we could do a lot better in P5 by
    putting at least a unit= on <measure>. (See discussion of
    feature request #980854.)

    <extent>
    <label>Text data</label> <!-- ?? -->
    <measure num="60000"
    unit="words"
    stuff="textsize">60,000 words</measure>
    <measure num="1"
    unit="count"
    stuff="files">1 TEI XML File</measure>
    <measure num="123"
    unit="KiB"
    stuff="diskspace">123 KiB</measure>
    </extent>

     
  • Syd Bauman
    2006-09-25

    • priority: 5 --> 6
    • assigned_to: nobody --> louburnard
     
  • Lou Burnard
    2007-02-15

    Logged In: YES
    user_id=1021146
    Originator: NO

    I think I agree with Syd. If you want to structure the content of <extent>, then the way to do it in P5 is with multiple <measure> elements, each of which has all the needed attributes (and then some).
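
    As a sketch of what that could look like, reusing James's figures and assuming the quantity= and unit= attributes defined for <measure> in P5 (the unit values here are only illustrative, not prescribed):

    <extent>
    <measure quantity="60000" unit="words">60,000 words</measure>
    <measure quantity="1" unit="files">1 TEI XML File</measure>
    <measure quantity="123" unit="KiB">123 KiB</measure>
    </extent>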

    A structured <extent> statement is a possibility, but for a subsequent release.

     
  • Lou Burnard
    2007-02-15

    • status: open --> closed