#453 a place for metadata that you can't fit into existing header elements

GREEN
open
None
5(default)
2015-05-30
2013-05-13
No

People sometimes want to attach metadata to a TEI document that doesn't fit into an existing header element. For example, you might want to attach Dublin Core metadata, or you might want to include Cataloguing-in-Publication data for the TEI document. It would be good to have a place for such metadata.

We might create a new optional child of <teiHeader> called something like <containerForOtherMetadata>. In it you could put elements from another namespace (like <dc:title>), or you might create your own elements (like <cip>). I'm not sure what the content model for <containerForOtherMetadata> should be in order to allow for these things. Maybe it actually requires that <containerForOtherMetadata> be in a different namespace.
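As a rough illustration of the proposal above, here is a minimal sketch of how such a wrapper might look in a document. The element name <containerForOtherMetadata> is the ticket's own placeholder and does not exist in TEI P5; the <cip> element and all of the content values are likewise hypothetical. Only the TEI and Dublin Core namespace URIs are real.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>An example document</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished example.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Born digital.</p>
      </sourceDesc>
    </fileDesc>
    <!-- Hypothetical wrapper: the name comes from this ticket, not from TEI P5 -->
    <containerForOtherMetadata>
      <!-- Elements from another namespace (Dublin Core) -->
      <dc:title>An example document</dc:title>
      <dc:creator>A. N. Author</dc:creator>
      <!-- A locally defined element, also hypothetical -->
      <cip>...</cip>
    </containerForOtherMetadata>
  </teiHeader>
  <text>
    <body>
      <p>...</p>
    </body>
  </text>
</TEI>
```

A document like this would not validate against an unmodified TEI schema, which is the open question about the content model raised above.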

(It's possible that before this ticket is resolved we will have added a new element for CIP data as a child of <publicationStmt>. That idea is also being considered on TEI-L as I write this.)

Discussion

<< < 1 2 3 4 > >> (Page 3 of 4)
  • Martin Holmes

    Martin Holmes - 2013-11-19
    • assigned_to: Martin Holmes --> Paul Schaffner
     
  • Sebastian Rahtz

    Sebastian Rahtz - 2013-11-20

    Surely all that OAI stuff is generated? You don't maintain it by hand, do you?

     
  • Peter Stadler

    Peter Stadler - 2013-11-20

    I'm really uncertain about this request. First, I don't think DC and MARC are good examples, since (echoing Sebastian) those are probably derived elements.
    On the other hand, I see the need for project-specific metadata (elements). Compare, for example, caseDesc from the St. Louis Freedom Suits Legal Encoding Project, or the correspDesc element currently under discussion with the SIG Correspondence. Additionally, it could facilitate the migration of legacy encodings to P5, where all the non-migratable stuff (from the header) could be put into the new wrapper element. But if there were an element with an anything-goes content model, that’s probably Pandora’s box …

     
  • Martin Holmes

    Martin Holmes - 2013-11-20

    Yes, the OAI stuff is generated, but I would rather like to include it in the XML view of the file if I could. We already include a lot of DC: metadata in the XHTML view.

     
  • Sebastian Rahtz

    Sebastian Rahtz - 2013-11-20

    You have a TEI XML original, from which you derive DC/OAI, and then you want to include the result back in the original? This seems odd to me. If the OAI/DC comes from some other source entirely, and is not related to what's in the TEI, and you want to use TEI as an archival/interchange combination format, then I have a bit more interest. But including the same info twice seems like a recipe for problems.

     
  • Paul Schaffner

    Paul Schaffner - 2013-11-20

    I'm not sure what 'derived' means in this case, but if it means automatically derived from the text itself, then I cannot think of anything in MARC that fits that description. For us, the only metadata format that is 'derived' in that sense is the TEI header itself: our bibliographic data is stored in MARC (externally sourced and then manually modified); and our admin data in a tracking database. The admin data (or a summary of it) is periodically merged into the MARC, and from the MARC a TEI header is derived. It would be nice to be able to carry all of the MARC info into the TEI; ideally, even to the point that we could create a lossless round trip. Aside from boilerplate (licensing restrictions and the like), there is nothing in the header that is not derived from the MARC, and we have begun moving even that back into MARC, so that header generation becomes a purely automatic process, and the manual work is all done in MARC.

     
  • Martin Holmes

    Martin Holmes - 2013-11-20

    The OAI record is generated not only from the original source document but from a set of related documents including personographies, placeographies, and other stuff. Having generated it, I wouldn't mind providing it as part of the TEI you get if you view the XML of the document on the site, in the interests of providing a more fully-expanded set of metadata than is in the actual source document. Obviously I don't want to store it permanently in the original document; this is a question of public vs private XML. If the original document contains <name key="fredbloggs">Fred</name>, then obviously that's not very helpful for someone using the XML document, but if the extra metadata can be provided showing that this is in fact "Fred Aloysius Bloggs M.D.", that's helpful, and makes the downloaded TEI file more useful to an outsider.
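
    A minimal sketch of what such a "public" XML view might look like, assuming the hypothetical <containerForOtherMetadata> wrapper from the ticket description and assuming, purely for illustration, that the expanded name is carried in a Dublin Core <dc:contributor> element. Neither choice is confirmed anywhere in this thread.

    ```xml
    <TEI xmlns="http://www.tei-c.org/ns/1.0"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
      <teiHeader>
        <!-- fileDesc omitted for brevity; a valid TEI header requires it -->
        <!-- Hypothetical wrapper holding metadata generated from external sources -->
        <containerForOtherMetadata>
          <dc:contributor>Fred Aloysius Bloggs M.D.</dc:contributor>
        </containerForOtherMetadata>
      </teiHeader>
      <text>
        <body>
          <!-- The source document itself carries only the key and the short form -->
          <p><name key="fredbloggs">Fred</name> ...</p>
        </body>
      </text>
    </TEI>
    ```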

     
    Last edit: Martin Holmes 2013-11-20
  • Lou Burnard

    Lou Burnard - 2014-01-05

    A consensus seems to be emerging that (a) the proposed <standoff> element is not an appropriate place for every kind of metadata, and (b) storing arbitrary metadata generated from existing parts of the Header is not a good idea, even if what's generated is enriched from other sources. Should this ticket therefore be rejected?

     
  • Martin Holmes

    Martin Holmes - 2014-01-05

    I really don't see why we should reject this ticket, and I don't see any consensus, other than around the notion that there should be a place to store a variety of things which are in some way detached from the normal file structure and content. Calling it "arbitrary" metadata seems gratuitous; it's metadata, and even though it's partly generated based on the content of the file, virtually all of it comes from outside the file (for instance, the names of individuals are often not used, or used in an abbreviated form, in the file, while the full name and other information is drawn in from external sources). It's time-consuming to generate, too, so we don't want to do it on the fly.

     
  • Lou Burnard

    Lou Burnard - 2014-01-05

    The absence of consensus is sometimes a good reason for rejecting a ticket, imho.

     