From: Servilio A. P. <afr...@mc...> - 2011-08-12 16:47:46
Attachments:
format-bsh.patch
Hi,

While troubleshooting an issue with journal title searches not including electronic versions, my co-worker Wade Wyckoff spotted several other issues besides the cause of that one.

Instead of directly creating tickets for these, I would like to have a discussion on the list, as the actions might be different in each case. This would also make it faster and easier to include Wade in the discussion.

The issues found are:

1. Once a format is identified, no further inspection is done, which in our case caused many journals to be classified as just electronic resources.
2. Archival material was classified as a "Kit".
3. Integrated resources were not taken into account.
4. Neither were websites.

Attached is the diff showing our changes to format.bsh that solve these issues.

Servilio
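To illustrate issue 1 for readers without the attachment, here is a minimal sketch of the difference between stopping at the first matching format and continuing to inspect the record. It is not the contents of format-bsh.patch and not the actual getFormat code: the two boolean flags stand in for whatever MARC checks the indexer performs, and the labels "Electronic", "Journal" and "Unknown" are hypothetical.

```python
from __future__ import annotations


def first_match_format(is_electronic: bool, is_serial: bool) -> list[str]:
    """Early-return style: stops at the first format that matches."""
    if is_electronic:
        return ["Electronic"]
    if is_serial:
        return ["Journal"]
    return ["Unknown"]


def all_matching_formats(is_electronic: bool, is_serial: bool) -> list[str]:
    """Keeps inspecting after a match, so one record can carry several formats."""
    formats: list[str] = []
    if is_electronic:
        formats.append("Electronic")
    if is_serial:
        formats.append("Journal")
    # Fall back to a placeholder label only when nothing matched.
    return formats or ["Unknown"]


# An electronic journal: the early-return version loses the "Journal" format.
print(first_match_format(True, True))
print(all_matching_formats(True, True))
```

With the second style, a journal title search that filters on "Journal" would still find the electronic versions, which is the symptom that started this investigation.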
From: Demian K. <dem...@vi...> - 2011-08-16 12:59:57
The built-in getFormat function definitely has many flaws... and people keep talking about replacing it with something better, but I've never gotten a polished submission to use for updating the default distribution. (Villanova uses a custom routine that gets better results but is not pretty code... I can send you a copy if you are interested).

This is probably something worth bringing up on a future developers call (once I'm back in the office) -- maybe we can finally make better progress on the issue!

- Demian
________________________________________
From: Servilio Afre Puentes [afr...@mc...]
Sent: Friday, August 12, 2011 12:48 PM
To: VuFind Tech
Cc: Wade Wyckoff
Subject: [VuFind-Tech] Bugs in VuFindIndexer.getFormat?
From: Servilio A. P. <afr...@mc...> - 2011-08-16 14:08:34
On Tue, 16 Aug 2011 08:58:05 -0400, Demian Katz <dem...@vi...> wrote:
> The built-in getFormat function definitely has many flaws... and people keep talking about replacing it with something better, but I've never gotten a polished submission to use for updating the default distribution. (Villanova uses a custom routine that gets better results but is not pretty code... I can send you a copy if you are interested).

Yes, I am :)

> This is probably something worth bringing up on a future developers
> call (once I'm back in the office) -- maybe we can finally make better
> progress on the issue!

What about placing a ticket for this enhancement?

Servilio
From: Walker, D. <dw...@ca...> - 2011-08-16 17:27:57
Since I'm feeling a bit wordy this morning, I wanted to circle back to this email from Servilio.

(Wade, I accidentally left you off my two earlier emails to the vufind-tech listserv proposing a new getFormat method, sorry about that. Perhaps Servilio has already forwarded those to you.)

The code I proposed should address issues 1, 3 and 4 in Servilio's list.

The second one is interesting, though. MARC doesn't really treat 'Archive' as a format, but rather as a method of 'control'.

It's been a while since I looked at this, but I believe almost any item -- book, photo, film, whatever -- can be under archival control. So, in my mind, it's better not to treat 'Archive' as a 'format' in place of the item's actual format, but rather as a kind of secondary description, or secondary format, of the item. My code includes it separately for that reason, although I don't think we've ever used it.

--Dave

==================
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu
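The idea of carrying archival control as a secondary description next to the item's actual format can be sketched roughly as follows. This is not Dave's proposed code; it assumes the 'control' he mentions is MARC leader position 08 (Type of control, where 'a' means archival), and the function name, the returned keys and the placeholder leader are hypothetical.

```python
from __future__ import annotations


def describe(leader: str, primary_format: str) -> dict[str, object]:
    """Report the primary format plus a separate archival-control flag."""
    # Leader/08 is "Type of control"; 'a' marks archival control.
    archival = len(leader) > 8 and leader[8] == "a"
    return {"format": primary_format, "archival_control": archival}


# Build a 24-character placeholder leader (test data only, not a real record).
chars = list(" " * 24)
chars[6] = "a"  # Leader/06 'a': language material
chars[8] = "a"  # Leader/08 'a': archival control
print(describe("".join(chars), "Book"))
```

Keeping the flag separate means the primary format is never overwritten, so a book under archival control would still facet as a book.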
From: Wade W. <wy...@mc...> - 2011-08-16 18:21:03
Hi David,

No, but that's a common misconception about that MARC code. "Archival control" refers to the method of description, not the storage or access of the item. Specifically, it refers to collective rather than item-level description. I suppose it could be applied to a collection of books, but I've never seen that done, and it wouldn't really be useful to present to the public.

My comment to Servilio regarding archives actually related to the "Type" code in the leader, which was incorrectly matched to a label in the VuFind code that we were reviewing. Type "o" is kit; type "p" is officially called "mixed materials" but is more understandably presented to non-cataloguers as archives.

Wade
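The leader type codes Wade describes could be mapped to display labels along these lines. This is a sketch only, not the VuFind code under review: it assumes the "Type" code is leader position 06 (Type of record), and the table name, function name and label strings are hypothetical.

```python
from __future__ import annotations

from typing import Optional

# Leader/06 codes discussed in this thread; other codes are left unmapped here.
RECORD_TYPE_LABELS = {
    "o": "Kit",
    "p": "Archive",  # officially "mixed materials"
}


def label_for_record_type(leader: str) -> Optional[str]:
    """Return a display label for the leader's type-of-record code, if mapped."""
    if len(leader) <= 6:
        return None
    return RECORD_TYPE_LABELS.get(leader[6])


# Placeholder 24-character leaders (test data only).
kit_leader = " " * 6 + "o" + " " * 17
mixed_leader = " " * 6 + "p" + " " * 17
print(label_for_record_type(kit_leader))
print(label_for_record_type(mixed_leader))
```

The bug reported as issue 2 amounts to the "Kit" label being attached to the wrong key in a table like this one.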
From: Walker, D. <dw...@ca...> - 2011-08-16 17:57:32
Thanks for that clarification, Wade.

My code does include a MixedMaterial type -- correctly based on type = 'p', although I had to double-check ;-) -- which could be easily mapped to an 'Archive' label.

--Dave
From: Demian K. <dem...@vi...> - 2011-08-29 12:10:33
Attachments:
villanova-format.bsh
As requested, here's the VU-specific format script -- as previously noted, the code isn't pretty!

Have you had a chance to look at David Walker's work? Do you still want me to take a closer look at your patch, or do you think we would be better off migrating toward the more universal approach that he proposed?

- Demian
From: Servilio A. P. <afr...@mc...> - 2011-08-29 12:39:53
On Mon, 29 Aug 2011 08:08:54 -0400, Demian Katz <dem...@vi...> wrote:
> As requested, here's the VU-specific format script -- as previously noted, the code isn't pretty!
>
> Have you had a chance to look at David Walker's work? Do you still want me to take a closer look at your patch, or do you think we would be better off migrating toward the more universal approach that he proposed?

I think what David proposed is a better approach.

Servilio
From: Demian K. <dem...@vi...> - 2011-08-29 12:41:03
Great! I'll look into moving this forward with the SolrMarc community.

- Demian
From: Tuan N. <tu...@yo...> - 2011-08-29 12:47:25
I think so too. I tried David's code out over the weekend; it worked great and the code is very clean. Can we get it incorporated into VuFindIndexer as a start? That way we can start testing with it right away.
From: Demian K. <dem...@vi...> - 2011-08-29 12:53:34
I've just forwarded David's original emails to solrmarc-tech and proposed that we make his format functionality part of the SolrMarc core. If I get a response to that in the next couple of days, I can move forward on that and we can bypass the extra work of updating the VuFindIndexer... but if I don't get any timely feedback, that remains a viable option.

I'll let you know what happens!

- Demian