Have you considered using EAD as a metadata representation of your
collections? It may be a slightly unorthodox use of EAD but it is a
metadata structure that seems to fit your sub-collections very nicely.
It allows XLink linking in most tags. It may not solve your DSpace
issues, but it could help from a browsing point of view at least.
On that note, EAD can also have subject headings embedded in it at any
level within the hierarchy, top to bottom. Lucene, paired with an XML
parser, could do a good job of searching this, which could help you
reduce replication of data.
Interestingly enough, our project has tentatively decided to include
subject headings from each collection and 'sub-collection' (in our case,
'subseries') in every item belonging to that series to do our item-level
searching -- it sounds a lot like your research.
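A rough sketch of that kind of inheritance, in Python, might look like
the following. It is only an illustration under assumptions: the EAD
fragment is invented and heavily simplified, the function names are
hypothetical, and the Lucene indexing step is not shown. It walks the
hierarchy once and gives each item the subject headings of every level
above it, so the headings are stored once but searchable per item.

```python
import re
import xml.etree.ElementTree as ET

# Invented, heavily simplified EAD fragment (assumption, not a real finding aid).
EAD_SAMPLE = """
<ead>
  <archdesc level="collection">
    <did><unittitle>Example Papers</unittitle></did>
    <controlaccess><subject>Irish history</subject></controlaccess>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <controlaccess><subject>Letters</subject></controlaccess>
        <c02 level="item">
          <did><unittitle>Letter, 1901</unittitle></did>
        </c02>
        <c02 level="item">
          <did><unittitle>Letter, 1902</unittitle></did>
          <controlaccess><subject>Emigration</subject></controlaccess>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""

# Matches EAD component tags: <c> or numbered <c01> .. <c12>.
COMPONENT = re.compile(r"c(\d\d)?")


def own_subjects(node):
    """Return subject headings attached directly to this node."""
    return [s.text.strip() for s in node.findall("controlaccess/subject") if s.text]


def items_with_inherited_subjects(node, inherited=()):
    """Yield (title, subjects) for each item-level component under node.

    Each item receives the headings of all its ancestors, followed by its own.
    """
    subjects = list(inherited) + own_subjects(node)
    if node.get("level") == "item":
        title = node.findtext("did/unittitle", default="").strip()
        yield title, subjects
    for child in node:
        # Only descend into the <dsc> wrapper and into component elements.
        if child.tag == "dsc" or COMPONENT.fullmatch(child.tag):
            yield from items_with_inherited_subjects(child, subjects)


archdesc = ET.fromstring(EAD_SAMPLE).find("archdesc")
index = dict(items_with_inherited_subjects(archdesc))
for title, subjects in index.items():
    print(title, "->", "; ".join(subjects))
```

The resulting title-to-headings mapping is the sort of thing one could
then hand to an indexer as the searchable fields for each item.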
I look forward to further discussion on this topic on the list.
Irish Virtual Research Library and Archive
----- Original Message -----
From: John McDonough <email@example.com>
Date: Tuesday, February 27, 2007 2:41 pm
Subject: [Fwd: [Dspace-general] Preserving structured collections -
major DSpace change for Collection object]
To: joseph greene <firstname.lastname@example.org>, Adele <email@example.com>
> -------- Original Message --------
> Subject: [Dspace-general] Preserving structured collections -
> DSpace change for Collection object
> Date: Tue, 27 Feb 2007 09:11:47 -0500
> From: Tellier, Stephane <firstname.lastname@example.org>
> To: DSpaceemail@example.com
> CC: firstname.lastname@example.org
> Hi all,
> In our project, in which we have to implement a DSpace solution, we are
> actually facing a major problem that might also concern others
> working in libraries.
> We need to submit and preserve periodicals in DSpace in a
> form. Example :
> Times magazine
> In our library, the main database for metadatas is a catalog. An
> can contains a "note" in this catalog and this note possess some
> descriptive metadatas.
> In the example above, the Times magazine collection, while containing
> many pdf items, would possess only 1 note in our catalog. That means,
> after the transfer from the catalog to DSpace, that the DSpace
> Collection representing the magazine should be ideally the only one
> that should contain the metadatas, because we don't want to repeat
> those metadatas for each of the DSpace Items possessing the pdf
> files in
> the whole Collection. This is for performance reasons, because we have
> some collections possessing thousands of pdfs (like a newspaper of more
> than 100 years old and having a pdf for each day).
> For our team, that means we are actually considering the solution of
> making a big change to DSpace so that :
> 1) a collection can have sub-collections (same idea here as
> Communities);
> 2) a collection can be mapped to the metadatas schema and therefore be
> considered as an "Item", so that its metadatas would be indexed in the
> same way. The collection would then be searchable through the dc
> (for example). In that case, if we make a search and it gives one
> the item, possessing the pdf, as a result (full-text indexed pdf), we
> would get the dc metadatas from its "parent" collection, instead of
> those in the item's record.
> Does anyone here have the same needs, or has anyone begun some work on
> it? We consider that this can be a very useful add-on for DSpace,
> resolving almost any kind of digital collections. However, we know that
> this will not be a simple modification...