#453 a place for metadata that you can't fit into existing header elements

Group: GREEN
Status: open
Owner: Syd Bauman
Labels: None
Priority: 5(default)
Updated: 2014-11-24
Created: 2013-05-13
Creator: Kevin Hawkins
Private: No

People sometimes want to attach metadata to a TEI document that doesn't fit into an existing header element. For example, you might want to attach Dublin Core metadata, or you might want to include Cataloguing-in-Publication data for the TEI document. It would be good to have a place for such metadata.

We might create a new optional child of <teiHeader> called something like <containerForOtherMetadata>. In it you could put elements from another namespace (like <dc:title>), or you might create your own elements (like <cip>). I'm not sure what the content model for <containerForOtherMetadata> should be in order to allow for these things. Maybe it actually requires that <containerForOtherMetadata> be in a different namespace.

(It's possible that before this ticket is resolved we will have added a new element for CIP data as a child of <publicationStmt>. That idea is also being considered on TEI-L as I write this.)
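
To make the proposal above concrete, here is a minimal sketch of what such a wrapper might look like. It is only an illustration: <containerForOtherMetadata> is the placeholder name used in this ticket and is not an existing TEI element, its position after <fileDesc> is an assumption, and the titles and names are invented.

```xml
<!-- Sketch only: <containerForOtherMetadata> is the placeholder name from this
     ticket, not an existing TEI element. All values are invented. -->
<teiHeader xmlns="http://www.tei-c.org/ns/1.0"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <fileDesc>
    <titleStmt>
      <title>An example document</title>
    </titleStmt>
    <publicationStmt>
      <p>Unpublished example.</p>
    </publicationStmt>
    <sourceDesc>
      <p>Born digital.</p>
    </sourceDesc>
  </fileDesc>
  <!-- Hypothetical wrapper holding elements from a non-TEI namespace -->
  <containerForOtherMetadata>
    <dc:title>An example document</dc:title>
    <dc:creator>A. N. Author</dc:creator>
  </containerForOtherMetadata>
</teiHeader>
```

Whether the wrapper's content model should allow any element from any other namespace, or only specific ones, is exactly the open question raised above.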

Discussion

(Page 2 of 2)
  • Paul Schaffner
    2013-11-20

    I'm not sure what 'derived' means in this case, but if it means automatically derived from the text itself, then I cannot think of anything in MARC that fits that description. For us, the only metadata format that is 'derived' in that sense is the TEI header itself: our bibliographic data is stored in MARC (externally sourced and then manually modified); and our admin data in a tracking database. The admin data (or a summary of it) is periodically merged into the MARC, and from the MARC a TEI header is derived. It would be nice to be able to carry all of the MARC info into the TEI; ideally, even to the point that we could create a lossless round trip. Aside from boilerplate (licensing restrictions and the like), there is nothing in the header that is not derived from the MARC, and we have begun moving even that back into MARC, so that header generation becomes a purely automatic process, and the manual work is all done in MARC.
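
    As a rough sketch of what carrying MARC into the TEI could look like, the fragment below embeds a MARCXML record in the header. The wrapper name <containerForOtherMetadata> is the placeholder from this ticket and does not exist in TEI; the record is invented and heavily abbreviated (no leader, only two fields), so it illustrates the shape and is not a real catalogue record.

    ```xml
    <!-- Sketch only: hypothetical wrapper element, invented and abbreviated record. -->
    <teiHeader xmlns="http://www.tei-c.org/ns/1.0"
               xmlns:marc="http://www.loc.gov/MARC21/slim">
      <fileDesc>
        <titleStmt>
          <title>An example title</title>
        </titleStmt>
        <publicationStmt>
          <p>Unpublished example.</p>
        </publicationStmt>
        <sourceDesc>
          <p>Header derived from the MARC record below.</p>
        </sourceDesc>
      </fileDesc>
      <!-- Hypothetical wrapper: the MARC source travels with the header derived from it -->
      <containerForOtherMetadata>
        <marc:record>
          <marc:controlfield tag="001">example0001</marc:controlfield>
          <marc:datafield tag="245" ind1="1" ind2="0">
            <marc:subfield code="a">An example title</marc:subfield>
          </marc:datafield>
        </marc:record>
      </containerForOtherMetadata>
    </teiHeader>
    ```

    A lossless round trip would need the complete record here, including the leader and every field, not just the subset that maps onto existing header elements.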

     
  • Martin Holmes
    2013-11-20

    The OAI record is generated not only from the original source document but from a set of related documents including personographies, placeographies, and other stuff. Having generated it, I wouldn't mind providing it as part of the TEI you get if you view the XML of the document on the site, in the interests of providing a more fully-expanded set of metadata than is in the actual source document. Obviously I don't want to store it permanently in the original document; this is a question of public vs private XML. If the original document contains <name key="fredbloggs">Fred</name>, then obviously that's not very helpful for someone using the XML document, but if the extra metadata can be provided showing that this is in fact "Fred Aloysius Bloggs M.D.", that's helpful, and makes the downloaded TEI file more useful to an outsider.
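
    A minimal sketch of that public-XML idea, assuming the hypothetical <containerForOtherMetadata> wrapper from this ticket (not an existing TEI element): the text keeps only the key, while the expanded name is supplied in the generated metadata. The form of the <dc:subject> value is an assumption for illustration.

    ```xml
    <!-- Sketch only: hypothetical wrapper element; required header content omitted. -->
    <TEI xmlns="http://www.tei-c.org/ns/1.0"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
      <teiHeader>
        <!-- fileDesc and the rest of the normal header omitted for brevity -->
        <containerForOtherMetadata>
          <!-- Generated at publication time from the personography, not stored in the source file -->
          <dc:subject>Bloggs, Fred Aloysius, M.D.</dc:subject>
        </containerForOtherMetadata>
      </teiHeader>
      <text>
        <body>
          <p>A letter from <name key="fredbloggs">Fred</name>.</p>
        </body>
      </text>
    </TEI>
    ```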

     
    Last edit: Martin Holmes 2013-11-20
  • Lou Burnard
    2014-01-05

    A consensus seems to be emerging that (a) the proposed <standoff> element is not an appropriate place for every kind of metadata, and (b) storing arbitrary metadata generated from existing parts of the Header is not a good idea, even if what's generated is enriched from other sources. Should this ticket therefore be rejected?

     
  • Martin Holmes
    2014-01-05

    I really don't see why we should reject this ticket, and I don't see any consensus, other than around the notion that there should be a place to store a variety of things which are in some way detached from the normal file structure and content. Calling it "arbitrary" metadata seems gratuitous; it's metadata, and even though it's partly generated based on the content of the file, virtually all of it comes from outside the file (for instance, the full names of individuals are often not used, or are used in an abbreviated form, in the file, while the full name and other information are drawn in from external sources). It's time-consuming to generate, too, so we don't want to do it on the fly.

     
  • Lou Burnard
    2014-01-05

    The absence of consensus is sometimes a good reason for rejecting a ticket, imho.

     
  • Paul Schaffner
    2014-11-18

    This was the action from the 2013 f2f: "Action: MH will offer DC examples and PS will offer MARC examples. LB will pull them together into some text to be inserted into the Guidelines. Given this, Council will reconsider the feature request and whether to create the wrapper element."

     
  • Martin Holmes
    2014-11-18

    Example from me (a real one):

    <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
             <dc:title>The colonial despatches of Vancouver Island and British Columbia 1846-1871: 11566, CO 60/2, p. 291; received 13 November. Trevelyan to Merivale (Permanent Under-Secretary)</dc:title>
             <dc:date>1858-11-12</dc:date>
             <dc:creator>Trevelyan</dc:creator>
             <dc:publisher>University of Victoria Humanities Computing and Media Centre, and UVic Libraries</dc:publisher>
             <dc:type>InteractiveResource</dc:type>
             <dc:format>application/xhtml+xml</dc:format>
             <dc:type>text</dc:type>
             <dc:identifier>http://bcgenesis.uvic.ca/getDoc.htm?id=B585TE13.scx</dc:identifier>
             <dc:rights>This document is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. See http://creativecommons.org/licenses/by-nc-sa/3.0/. Digitized images of documents and maps found on the bcgenesis website are licensed and may not be used without the permission of the licensing institution.  Enquiries to use digitized material should be addressed to the licensor in question and not UVic.</dc:rights>
             <dc:language>(SCHEME=ISO639) en</dc:language>
             <dc:source>Transcribed from microfilm and/or original documents, and marked up in TEI P5 XML. The interactive XHTML resource is generated from the XHTML using XQuery and XSLT.</dc:source>
             <dc:source>repository: CO</dc:source>
             <dc:source>coNumber: 60</dc:source>
             <dc:source>coVol: 2</dc:source>
             <dc:source>page: 291</dc:source>
             <dc:source>coRegistration: 11566</dc:source>
             <dc:source>received: received 13 November</dc:source>
             <dc:subject>Trevelyan, Sir Charles Edward</dc:subject>
             <dc:subject>Merivale, Herman</dc:subject>
             <dc:subject>Elliot, T. Frederick</dc:subject>
             <dc:subject>Moody, Colonel Richard Clement</dc:subject>
             <dc:subject>Lytton, Sir Edward George Earle Bulwer</dc:subject>
             <dc:subject>Jadis, Vane</dc:subject>
             <dc:subject>Carnarvon, Earl</dc:subject>
             <dc:subject>British Columbia</dc:subject>
             <dc:description>British Columbia correspondence: Public Offices document (normally correspondence between government departments)</dc:description>
          </oai_dc:dc>
    
     
    Last edit: Martin Holmes 2014-11-18
  • James Cummings
    2014-11-24

    • assigned_to: Paul Schaffner --> Syd Bauman
    • Group: AMBER --> GREEN
     
  • James Cummings
    2014-11-24

    At F2F Raleigh 2014-11-19: assigning to SB; GREEN in that Council agrees a container is needed. It needs a good name: externalMetadata, exoData, exoMetadata, xenoData, and xenoMetadata were all suggested, with exoData as the least bad. Decision on the name postponed, though. Examples and prose are left with the ticket owner to create, propose, etc., drawing on Council.

     