From: Eric L. M. <em...@nd...> - 2009-03-24 19:06:11
|
Can someone here provide me with an overview of how I might go about indexing, searching, and displaying EAD files using VuFind?

I have been charged with the task of using VuFind to implement something we call the "Catholic Portal". [1, 2, 3] The goal of the portal is to bring to light rare, unique, and infrequently held materials of interest to scholars of all things Catholic. Initially, I provided a way for people to fill in a form, have the content saved to a MyLibrary database, and have the whole thing indexed with swish-e. Then there was the desire to index EAD files. Then there was the desire to index MARC records. Now there is the desire to use VuFind. VuFind indexes MARC just fine, but I need some guidance regarding the indexing of EAD files.

I suppose I could:

1. Parse an EAD file.
2. Associate EAD elements/attributes with the Solr/Lucene fields.
3. Commit the results.
4. Go to Step #1 for all files.
5. Search the index.

This process, specifically Step #2, would require mapping EAD elements to MARC-like fields. No? Additionally, EAD files can be very hierarchical in design, and I wonder how I might support deeper levels of access against my EAD files.

Any of your thoughts on how to go about an implementation would be greatly appreciated.

[1] Version #1 - http://www.catholicresearch.net/
[2] Version #2 - http://www.catholicresearch.net/crra/
[3] Version #3 - http://zoia.library.nd.edu/vufind/

--
Eric Lease Morgan
Head, Digital Access and Information Architecture Department
Hesburgh Libraries, University of Notre Dame

(574) 631-8604 |
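Steps 1 through 3 in the plan above could be sketched roughly as follows. This is only an illustration, not VuFind's own indexer: the sample finding aid, the `FIELD_MAP` paths, and the Solr field names (`id`, `title`, `author`, `publishDate`, `description`) are all assumptions, and a real EAD file may declare an XML namespace that the paths would have to account for. The result is a document in Solr's XML update format, which would then be posted to the index and committed.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up finding aid used only for illustration.
SAMPLE_EAD = """<ead>
  <eadheader><eadid>example-001</eadid></eadheader>
  <archdesc level="collection">
    <did>
      <unittitle>Example Parish Records</unittitle>
      <unitdate>1850-1900</unitdate>
      <origination><corpname>Example Parish</corpname></origination>
    </did>
    <scopecontent><p>Sacramental registers and correspondence.</p></scopecontent>
  </archdesc>
</ead>"""

# Hypothetical mapping: Solr field name -> path inside the EAD file.
FIELD_MAP = {
    "id": "eadheader/eadid",
    "title": "archdesc/did/unittitle",
    "publishDate": "archdesc/did/unitdate",
    "author": "archdesc/did/origination/corpname",
    "description": "archdesc/scopecontent/p",
}


def text_of(element):
    """Return all text inside an element with whitespace normalized."""
    return " ".join("".join(element.itertext()).split())


def ead_to_fields(ead_xml):
    """Steps 1 and 2: parse an EAD string and map it to Solr fields."""
    root = ET.fromstring(ead_xml)
    fields = {}
    for solr_field, path in FIELD_MAP.items():
        node = root.find(path)
        if node is not None:
            fields[solr_field] = text_of(node)
    return fields


def fields_to_solr_add(fields):
    """Serialize the fields as a Solr <add><doc> update message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(fields_to_solr_add(ead_to_fields(SAMPLE_EAD)))
```

Keeping the mapping in one dictionary makes Step #2 the only place that has to change when the schema or the EAD encoding practice differs.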
From: Ross S. <ros...@gm...> - 2009-03-24 19:59:03
|
I guess this depends a bit on what you want to search. If you're really only interested in the collection descriptions (rather than the actual box/folder contents, etc.), you could probably modify Terry Reese's EAD to MARCXML XSLT:

http://oregonstate.edu/~reeset/marcedit/xslt/EADlitetoMARC21slimXML.xsl

for your needs. Then (and I'm just throwing stuff out here) if you want to show actual item information, you could mock up an EAD Driver or something to display them as holdings in the full record view. You could always shove the item list into some Solr field so it gets picked up in a keyword anywhere search, as well.

If this -isn't- something that looks easy to do in VuFind (or is something you need right away), alternately, you could try Blacklight, since UVA uses that for their EAD, AFAIK.

-Ross.

> _______________________________________________
> Vufind-tech mailing list
> Vuf...@li...
> https://lists.sourceforge.net/lists/listinfo/vufind-tech
|
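The idea of putting the item list into a single Solr field so that a keyword search picks it up might look something like this minimal sketch. The sample `<dsc>` fragment is made up, and the unnumbered-namespace element names (`c01`, `c02`, `did`, `container`, `unittitle`) are assumed to match the files being indexed; which Solr field receives the resulting text is left open.

```python
import xml.etree.ElementTree as ET

# A made-up container list, for illustration only.
SAMPLE_DSC = """<dsc>
  <c01 level="series">
    <did><unittitle>Correspondence</unittitle></did>
    <c02 level="file">
      <did>
        <container type="box">1</container>
        <container type="folder">3</container>
        <unittitle>Letters to the bishop</unittitle>
      </did>
    </c02>
  </c01>
</dsc>"""


def component_lines(dsc_xml):
    """Return one line of text per component, in document order."""
    root = ET.fromstring(dsc_xml)
    lines = []
    for did in root.iter("did"):
        title = did.find("unittitle")
        if title is None:
            continue
        # Box/folder numbers, if present, come before the title.
        containers = [
            f"{c.get('type', 'container')} {c.text.strip()}"
            for c in did.findall("container")
            if c.text
        ]
        label = " ".join("".join(title.itertext()).split())
        lines.append(", ".join(containers + [label]))
    return lines


def keyword_blob(dsc_xml):
    """Flatten the whole hierarchy into one searchable string."""
    return " ".join(component_lines(dsc_xml))


if __name__ == "__main__":
    for line in component_lines(SAMPLE_DSC):
        print(line)
```

Flattening loses the hierarchy, so this only helps searching; showing the box/folder structure would still need something like the driver idea above.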
From: Eric L. M. <em...@nd...> - 2009-03-24 20:03:32
Attachments:
index-ead.txt
|
On 3/24/09 3:35 PM, "Ross Singer" <ros...@gm...> wrote:

> I guess this depends a bit on what you want to search. If you're really only
> interested in the collection descriptions (rather than the actual box/folder
> contents, etc.), you could probably modify Terry Reese's EAD to MARCXML XSLT:
>
> http://oregonstate.edu/~reeset/marcedit/xslt/EADlitetoMARC21slimXML.xsl
>
> for your needs. Then (and I'm just throwing stuff out here) if you want to
> show actual item information, you could mock up an EAD Driver or something to
> display them as holdings in the full record view. You could always shove the
> item list into some Solr field so it gets picked up in a keyword anywhere
> search, as well.

Thank you for the prompt reply.

Assuming I were to convert my EAD files into MARCXML, how would I go about ingesting the MARCXML into VuFind? While not ideal, this option seems plausible.

EAD Driver? What would that be? Please elaborate.

BTW, I was able to add to the Solr/Lucene index with the attached Perl script and get output from VuFind. A step in the right direction. (Whew!)

--
Eric Lease Morgan |
From: Ross S. <ros...@gm...> - 2009-03-24 20:13:31
|
Well, this would be a two step process:

XSLT your EAD into a bunch of MARCXML files (or single MARCXML collection) and then run yaz-marcdump to convert it from xml to binary marc.

Then load the MARC records in SolrMARC like any other VuFind implementation.

The EAD Driver is admittedly more 'hand wavy', but I picture it something like this:

If the box/folder lists are analogous to, say, library holdings (and I don't think that's a terrible stretch), then you could basically hack together a driver like the Voyager, Unicorn or Aleph drivers that returns EAD-based items rather than journal issue runs or shelf marks.

-Ross. |
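The two-step conversion described above could be scripted along these lines. This is a sketch under assumptions: it expects `xsltproc` and `yaz-marcdump` to be installed and on the PATH, the file names and output directory are hypothetical, and the stylesheet is whatever local copy of the EAD-to-MARCXML XSLT is being used. Loading the resulting file with SolrMARC is not shown.

```python
import subprocess
from pathlib import Path


def build_pipeline(ead_path, stylesheet, workdir):
    """Build the two commands for EAD -> MARCXML -> binary MARC."""
    ead_path = Path(ead_path)
    marcxml = Path(workdir) / (ead_path.stem + ".marcxml")
    marc = Path(workdir) / (ead_path.stem + ".mrc")
    # Step 1: apply the stylesheet, writing MARCXML to a file.
    xslt_cmd = ["xsltproc", "-o", str(marcxml), str(stylesheet), str(ead_path)]
    # Step 2: read MARCXML and write binary MARC to standard output.
    dump_cmd = ["yaz-marcdump", "-i", "marcxml", "-o", "marc", str(marcxml)]
    return xslt_cmd, dump_cmd, marc


def run_pipeline(ead_path, stylesheet, workdir):
    """Run both steps and return the path of the binary MARC file."""
    xslt_cmd, dump_cmd, marc = build_pipeline(ead_path, stylesheet, workdir)
    Path(workdir).mkdir(parents=True, exist_ok=True)
    subprocess.run(xslt_cmd, check=True)
    with open(marc, "wb") as out:
        subprocess.run(dump_cmd, check=True, stdout=out)
    return marc


if __name__ == "__main__":
    # Hypothetical file names; only the commands are printed here.
    for cmd in build_pipeline("longer.xml", "ead2marc.xsl", "out")[:2]:
        print(" ".join(cmd))
```

Separating command construction from execution makes it easy to inspect the commands before running them over a whole directory of finding aids.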
From: Eric L. M. <em...@nd...> - 2009-03-24 20:49:05
|
On 3/24/09 4:13 PM, "Ross Singer" <ros...@gm...> wrote:

> XSLT your EAD into a bunch of MARCXML files (or single MARCXML
> collection) and then run yaz-marcdump to convert it from xml to binary
> marc.
>
> Then load the MARC records in SolrMARC like any other VuFind implementation.
>
> The EAD Driver is admittedly more 'hand wavy', but I picture it
> something like this:
>
> If the box/folder lists are analogous to, say, library holdings (and I
> don't think that's a terrible stretch), then you could basically hack
> together a driver like the Voyager, Unicorn or Aleph drivers that
> returns EAD-based items rather than journal issue runs or shelf marks.

Cool. Besides a few hiccups from the XSL sheet, I was able to create easily indexable data going the EAD -> MARCXML -> MARC route with xsltproc and yaz-marcdump.

My next step will be to take a gander at Voyager, Unicorn, and/or Aleph drivers as models for future development. This might not be as hard as I first thought. Something to explore tomorrow.

oss4libraries++ # thank you

--
Eric Morgan |
From: Greg P. <pen...@us...> - 2009-03-24 22:29:07
|
This is slightly left of field, but you can also consider a database connection (if relevant) during the index process.

I just recently finished doing this to create a location facet. Our Virtua install has no 'item' level information for serials that comes out with the marc export, so I establish an oracle connection to see which campus print serials are held at, then index the result as though it came from the marc.

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841

-----Original Message-----
From: Eric Lease Morgan [mailto:em...@nd...]
Sent: Wednesday, 25 March 2009 6:49 AM
To: vuf...@li...
Subject: Re: [VuFind-Tech] indexing, searching, and displaying ead files

On 3/24/09 4:13 PM, "Ross Singer" <ros...@gm...> wrote:

> XSLT your EAD into a bunch of MARCXML files (or single MARCXML
> collection) and then run yaz-marcdump to convert it from xml to binary
> marc.
>
> Then load the MARC records in SolrMARC like any other VuFind implementation.
>
> The EAD Driver is admittedly more 'hand wavy', but I picture it
> something like this:
>
> If the box/folder lists are analogous to, say, library holdings (and I
> don't think that's a terrible stretch), then you could basically hack
> together a driver like the Voyager, Unicorn or Aleph drivers that
> returns EAD-based items rather than journal issue runs or shelf marks.

Cool. Besides a few hiccups from the XSL sheet, I was able to create easily indexable data going the EAD -> MARCXML -> MARC route with xsltproc and yaz-marcdump.

My next step will be to take a gander at Voyager, Unicorn, and/or Aleph drivers as models for future development. This might not be as hard as I first thought. Something to explore tomorrow.

oss4libraries++ # thank you

--
Eric Morgan

This email (including any attached files) is confidential and is for the intended recipient(s) only. If you received this email by mistake, please, as a courtesy, tell the sender, then delete this email. The views and opinions are the originator's and do not necessarily reflect those of the University of Southern Queensland. Although all reasonable precautions were taken to ensure that this email contained no viruses at the time it was sent we accept no liability for any losses arising from its receipt. The University of Southern Queensland is a registered provider of education with the Australian Government (CRICOS Institution Code No's. QLD 00244B / NSW 02225M) |
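The approach of consulting a database while indexing, and writing the answer into the record as if it had come from the MARC export, can be illustrated with a small sketch. Everything specific here is an assumption: an in-memory SQLite database stands in for the Oracle connection, and the table name, column names, record ids, and campus labels are invented for the example.

```python
import sqlite3


def open_demo_holdings_db():
    """Create a throwaway database standing in for the ILS (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE serial_holdings (bib_id TEXT, campus TEXT)")
    conn.executemany(
        "INSERT INTO serial_holdings VALUES (?, ?)",
        [("b100", "Campus B"), ("b100", "Campus A"), ("b200", "Campus C")],
    )
    return conn


def add_location_facet(record, conn):
    """Return a copy of the record with a 'location' list looked up by id."""
    rows = conn.execute(
        "SELECT DISTINCT campus FROM serial_holdings "
        "WHERE bib_id = ? ORDER BY campus",
        (record["id"],),
    ).fetchall()
    enriched = dict(record)
    enriched["location"] = [campus for (campus,) in rows]
    return enriched


if __name__ == "__main__":
    conn = open_demo_holdings_db()
    print(add_location_facet({"id": "b100", "title": "Example Serial"}, conn))
```

The enriched record would then be indexed like any other, with `location` treated as a multi-valued facet field.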
From: Eric L. M. <em...@nd...> - 2009-03-25 18:24:59
|
On 3/24/09 4:13 PM, "Ross Singer" <ros...@gm...> wrote:

> If the box/folder lists are analogous to, say, library holdings (and I don't
> think that's a terrible stretch), then you could basically hack together a
> driver like the Voyager, Unicorn or Aleph drivers that returns EAD-based items
> rather than journal issue runs or shelf marks.

On 3/24/09 6:28 PM, "Greg Pendlebury" <pen...@us...> wrote:

> This is slightly left of field, but you can also consider a database
> connection (if relevant) during the index process.
>
> I just recently finished doing this to create a location facet. Our Virtua
> install has no 'item' level information for serials that comes out with the
> marc export, so I establish an oracle connection to see which campus print
> serials are held at, then index the result as though it came from the marc.

Interesting, times two. After I get a bucket of content in the index, I will explore both of these options more thoroughly. Thank you.

--
Eric Lease Morgan
University of Notre Dame |
From: Jon G. <jon...@gm...> - 2009-03-24 20:12:34
|
What Ross means by the driver is the fact that VuFind relies on MARC records not just for the indexing, but uses it as a source of what to display. The actual details are dependent on what version of VuFind you're using. Using an EAD -> MARCXML -> MARC pipeline would get your records into a form solrmarc can index, but without changing some of the display stuff you still can't display the records. Also, it's not clear where the holding information and items like that come from. (This info also comes from the driver if I remember correctly.)

Jon Gorman |
From: Eric L. M. <em...@nd...> - 2009-03-24 20:44:46
|
On 3/24/09 4:12 PM, "Jon Gorman" <jon...@gm...> wrote:

> What Ross means by the driver is the fact that VuFind relies on MARC
> records not just for the indexing, but uses it as a source of what to
> display. The actual details are dependent on what version of VuFind
> you're using. Using an EAD -> MARCXML -> MARC pipeline would get your
> records into a form solrmarc can index, but without changing some of
> the display stuff you still can't display the records. Also, it's not
> clear where the holding information and items like that come from. (This
> info also comes from the driver if I remember correctly.)

This is the sort of problem I anticipated. Thank you.

--
Eric |
From: Andrew N. <as...@gm...> - 2009-03-26 16:54:27
|
I'm gonna supply a bit more of an outside the box type of solution - but here's what I would do (admittedly biased since I know vufind inside and out).

1. Map the EAD metadata to the fields in the VuFind schema. (Remember that VuFind was developed for searching bibliographic metadata.)

2. Build a new "service" in vufind for the display of EAD files based on the Record service (meant for displaying MARC records).

Can you share an EAD file - I'm not sure if I have ever seen a live EAD file. I might be able to come up with some additional ideas on this once I look at the data.

Andrew |
From: Eric L. M. <em...@nd...> - 2009-03-26 17:20:12
Attachments:
longer.xml
shorter.xml
|
On 3/26/09 12:54 PM, "Andrew Nagy" <as...@gm...> wrote:

> I'm gonna supply a bit more of an outside the box type of solution - but here's
> what I would do (admittedly biased since I know vufind inside and out).
>
> 1. Map the EAD metadata to the fields in the VuFind schema. (Remember that
> VuFind was developed for searching bibliographic metadata.)
>
> 2. Build a new "service" in vufind for the display of EAD files based on the
> Record service (meant for displaying MARC records).
>
> Can you share an EAD file - I'm not sure if I have ever seen a live EAD file.
> I might be able to come up with some additional ideas on this once I look at
> the data.

This sounds like a viable option too. Thank you.

Attached are two EAD files. The first is shorter and more rudimentary. The second is much longer. The DTD is online. [1]

[1] All about EAD - http://www.loc.gov/ead/

--
Eric "Off To Look At The Record Service" Morgan |
From: Ross S. <ros...@gm...> - 2009-03-26 17:53:39
|
On Thu, Mar 26, 2009 at 12:54 PM, Andrew Nagy <as...@gm...> wrote:

> Can you share an EAD file - I'm not sure if I have ever seen a live EAD
> file. I might be able to come up with some additional ideas on this once I
> look at the data.

You're going to regret this, for you will forever wish that you could "unsee" the EAD.

http://www.library.gatech.edu/archives/finding-aids/oai?verb=ListRecords&metadataPrefix=ead

Apologies about the non-wellformedness, this is a system I, obviously, have no control over anymore...

-Ross. |
From: Andrew N. <as...@gm...> - 2009-03-26 19:52:31
|
Thanks Ross

So it seems that the metadata should map fairly well (with my quick glance at the data) to the vufind field schema.

Eric - I would just create another "service" in vufind for EAD records that copies or extends much of the functionality in the record service but is customized for displaying the EAD data.

This is something that I always wanted to do for displaying digital library records. My thought has always been to refactor the record service a bit to allow for display based on record type or record collection. Library catalog records could be displayed differently from EAD, from digital library, etc. So if you are feeling really adventurous you could take on refactoring the record service. But hopefully I will get to it one of these days.

Andrew |
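The refactoring idea of choosing the display by record type can be sketched as a simple dispatch table. This is not VuFind code (VuFind itself is written in PHP) and only illustrates the design; the `recordtype` field, the renderer names, and the record contents are all hypothetical.

```python
def render_marc(record):
    """Display for ordinary catalog records."""
    return f"{record['title']} / {record.get('author', 'unknown author')}"


def render_ead(record):
    """Display for finding aids, listing components under the title."""
    lines = [f"Finding aid: {record['title']}"]
    lines.extend(f"  - {item}" for item in record.get("components", []))
    return "\n".join(lines)


# Hypothetical registry: one renderer per record type stored in the index.
RENDERERS = {"marc": render_marc, "ead": render_ead}


def render_record(record):
    """Pick a renderer from the record's type, defaulting to MARC."""
    renderer = RENDERERS.get(record.get("recordtype"), render_marc)
    return renderer(record)


if __name__ == "__main__":
    print(render_record({
        "recordtype": "ead",
        "title": "Example Parish Records",
        "components": ["box 1, folder 3, Letters to the bishop"],
    }))
```

With this shape, supporting digital library records or another source later means registering one more renderer rather than changing the existing ones.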
From: Greg P. <pen...@us...> - 2009-03-26 22:53:09
|
It sounds promising as well for integration of ERMS data into the discovery layer. Our staff were impressed by Encore's handling of this during our assessment stage. Their print serials holdings come from MARC, then get merged with online serials from ERMS at display time.

This is currently slated as part of Phase 2 of our VuFind rollout, so any efforts to make the Record screen modular and multi-functional are a step in the right direction.

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland |