From: Demian K. <dem...@vi...> - 2015-05-04 16:54:05
Thanks for the input on this. I definitely encourage you to follow the relevant pull request and add feedback there if you have any thoughts on specific changes as I make progress on implementing them:

https://github.com/vufind-org/vufind/pull/354

There’s still much to do there, but I expect to put more time into moving the work forward this week.

Regarding your suggestion about author_variant and author2_variant, this definitely makes sense to me, and it’s a good way of utilizing the existing initials work from Ronan.

The only thing that might be worth some deeper thought or discussion is whether some of the initials logic actually belongs as a Solr analyzer that could be used to create a custom field type for names. In general, it’s often beneficial to do as much work as possible on the Solr side. However, in this particular case, my gut feeling is that we’re actually better off doing this at index time, since it offers flexibility for non-text-manipulation approaches (like the one you describe) and because this is a case where we actually don’t want the user input to be manipulated in the same fashion as the indexed data.

- Demian

From: André Lahmann [mailto:la...@ub...]
Sent: Monday, May 04, 2015 5:25 AM
To: vuf...@li...
Subject: [VuFind-Tech] Improved Author Indexing - referring to the minutes of last weeks developers call

Hi All,

I saw that during last week's developers call session (according to https://vufind.org/wiki/developers_call:minutes20150428#other_topics) improved author indexing was discussed.
I took that as an opportunity to discuss this issue with our team, and we came up with some ideas I would like to share with you:

First, we very much like the proposed indexing of authors and secondary authors by their role, independent of MARC 100 or 700. This is a much more appropriate projection of MARC fields and their meaning onto Solr fields, and it will certainly result in better search results.

In addition, we would like to suggest a further improvement for author indexing, especially regarding the second issue mentioned in the referenced JIRA ticket (https://vufind.org/jira/browse/VUFIND-542): dealing with alternative name notations. Alongside the author initials indexing approach, one could add further indexing strategies that enrich each record with name variants for each author/person name, taken from authority records. This would require the additional Solr fields `author_variant` and `author2_variant` (multivalued, indexed, not stored).

Our current in-house solution at the Leipzig University Library populates those fields with the variant names by enriching the MARC records with linked authority records during pre-processing. An alternative and more general approach could be to index the variants by looking up the person name in a previously indexed Solr authority core, using the alternative notations already stored in the authority core and/or retrieving more alternative notations by identifier (e.g. through viaf.org) to populate the `*_variant` fields. This approach improves author searching significantly and can be adapted analogously to topics/keywords.

You can take a look at this enriched MARC record as an example:

https://katalog.ub.uni-leipzig.de/Record/0001723096/Details#tabnav

Fields 900 and 950 are enriched with authority name data and authority topic/keyword data. Of course, it's not necessary to store this data in the MARC record in order to get the functionality.
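To make the lookup-based idea a bit more concrete, here is a minimal Python sketch of the enrichment step. It is only an illustration under stated assumptions, not our actual pre-processing code: the in-memory dictionary `AUTHORITY_VARIANTS` stands in for a query against an authority core, the function names `lookup_variants` and `enrich_record` are hypothetical, and the source field names `author` and `author2` are assumed for the example.

```python
from typing import Dict, List

# Hypothetical stand-in for an authority core:
# preferred name -> alternative notations (sample data for illustration only).
AUTHORITY_VARIANTS: Dict[str, List[str]] = {
    "Twain, Mark": ["Clemens, Samuel Langhorne", "Clemens, Samuel L."],
}


def lookup_variants(name: str, authority: Dict[str, List[str]]) -> List[str]:
    """Return the known variants for a name (empty list if none are known)."""
    return list(authority.get(name, []))


def enrich_record(doc: dict, authority: Dict[str, List[str]]) -> dict:
    """Return a copy of doc with author_variant / author2_variant added.

    The source fields "author" and "author2" are assumptions for this sketch;
    each may hold a single string or a list of strings.
    """
    enriched = dict(doc)
    field_pairs = (("author", "author_variant"), ("author2", "author2_variant"))
    for source, target in field_pairs:
        names = doc.get(source, [])
        if isinstance(names, str):
            names = [names]
        variants: List[str] = []
        for name in names:
            for variant in lookup_variants(name, authority):
                if variant not in variants:  # keep order, drop duplicates
                    variants.append(variant)
        if variants:  # only add the multivalued field when there is data
            enriched[target] = variants
    return enriched


doc = {"id": "1", "author": "Twain, Mark", "author2": ["Unknown, Person"]}
print(enrich_record(doc, AUTHORITY_VARIANTS))
```

In a real setup the dictionary lookup would presumably be replaced by a query against the authority core (or an external service), while the record-side logic could stay roughly the same.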
Any thoughts and suggestions would be highly appreciated.

Best,
André Lahmann

--
André Lahmann
Universitätsbibliothek Leipzig
Beethovenstraße 6
04107 Leipzig
phone: +49 341 97 30 624
mail: la...@ub...