From: Andrew N. <and...@vi...> - 2008-10-17 12:50:14
Naomi,

I am moving this topic to the vufind-tech list. I agree that we should not stem author names. I will create a new field type called text_proper that does no stemming. Are there other fields that should not be stemmed? I'm thinking of any field that stores a proper noun.

Andrew

________________________________________
From: sol...@go... [sol...@go...] On Behalf Of Naomi Dushay [nd...@st...]
Sent: Thursday, October 16, 2008 8:24 PM
To: sol...@go...
Subject: author name stemming

I've been thinking more about whether or not to stem identifiable personal names (e.g. 100 fields, 700 fields). I think there are some compelling reasons NOT to stem author names.

Stemming is meant to increase recall (how many of the relevant records are returned). Stemming personal names is unlikely to increase recall, and is very likely to degrade precision (how many of the returned records are relevant).

I was floundering for examples to illustrate this, asked librarians for some, and Vitus Tang, one of our MARC experts, came up with Michael/Michaels (see below), which also suggests William/Williams.

Here's Vitus's note:

"I think the American writer Leonard Michaels could serve as an example of the effect of stemming personal names. If I search "Leonard Michaels" in the current version of SearchWorks, I get:

    Michaels, Leonard, 1933- (the correct person)
    Napolitano, Leonard Michael
    Koff, Leonard Michael
    Leonard, Michael James
    Stein, Michael Leonard
    Powell, Michael Leonard
    etc.

So, you could get a lot more irrelevant records than if stemming were not applied."

So I'll be tweaking our index so it doesn't stem the author fields.

Naomi Dushay
nd...@st...
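
The effect Vitus describes can be reproduced with a small sketch. It is only an illustration: naive_stem is a hypothetical toy rule that strips a trailing "s", standing in for whatever stemmer the index actually uses, and search is a made-up all-terms matcher, not SearchWorks or Solr code. The author headings are the ones from Vitus's note.

```python
import re


def naive_stem(token):
    """Toy stand-in for a real stemmer: lowercase, then drop one trailing 's'."""
    token = token.lower()
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def tokenize(text, stem):
    """Split on non-letters and return the set of (optionally stemmed) tokens."""
    words = re.findall(r"[A-Za-z]+", text)
    return {naive_stem(w) if stem else w.lower() for w in words}


def search(query, authors, stem):
    """Return the authors that contain every query token (hypothetical matcher)."""
    wanted = tokenize(query, stem)
    return [a for a in authors if wanted <= tokenize(a, stem)]


# Author headings taken from the example in the message above.
AUTHORS = [
    "Michaels, Leonard, 1933-",
    "Napolitano, Leonard Michael",
    "Koff, Leonard Michael",
    "Leonard, Michael James",
    "Stein, Michael Leonard",
    "Powell, Michael Leonard",
]

if __name__ == "__main__":
    # With stemming, "Michaels" and "Michael" collapse to the same token,
    # so all six headings match the query.
    print(search("Leonard Michaels", AUTHORS, stem=True))
    # Without stemming, only the heading containing "Michaels" matches.
    print(search("Leonard Michaels", AUTHORS, stem=False))
```

Under these assumptions the stemmed search returns every heading in the list, while the unstemmed search returns only "Michaels, Leonard, 1933-", which is the loss of precision the thread is concerned about.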