From: Naomi D. <nd...@st...> - 2008-10-17 16:33:39
Jeffrey,

It takes you two days to generate an index? We do our 5.5 million
marc21 records in about 6 hours. We're in SOLR 1.2, though I don't
think that should matter.

- Naomi

On Oct 17, 2008, at 8:43 AM, Barnett, Jeffrey wrote:

> To: 'Andrew Nagy'
> Subject: RE: [VuFind-Tech] author name stemming
>
> I hate to ask, but does this imply another index reload? I just
> finished one in order to get ready for the impending 1.0rc. Will
> there be more, or is it safe to start the two day process over again?
>
> -----Original Message-----
> From: Andrew Nagy [mailto:and...@vi...]
> Sent: Friday, October 17, 2008 8:58 AM
> To: Andrew Nagy; vuf...@li...
> Subject: Re: [VuFind-Tech] author name stemming
>
> I just checked in a modified schema.xml where I added a field called
> textProper and removed the stemming and synonyms. I added this to
> the author fields and to the publisher field.
>
> Andrew
> ________________________________________
> From: Andrew Nagy [and...@vi...]
> Sent: Friday, October 17, 2008 8:48 AM
> To: vuf...@li...
> Subject: Re: [VuFind-Tech] author name stemming
>
> Naomi - I am moving this topic to the vufind-tech list. I agree and
> think we should not stem author names. I will create a new field
> type called text_proper that will not do stemming.
>
> Are there other fields that should not be stemmed? I'm thinking any
> field that stores a proper noun.
>
> Andrew
> ________________________________________
> From: sol...@go... [solrmarc-te...@go...] On Behalf Of Naomi Dushay
> [nd...@st...]
> Sent: Thursday, October 16, 2008 8:24 PM
> To: sol...@go...
> Subject: author name stemming
>
> I've been thinking more about whether or not to stem identifiable
> personal names (e.g. 100 fields, 700 fields). I think there are some
> compelling reasons NOT to stem author names.
>
> Stemming is meant to increase recall (the amount of relevant records
> returned).
> Stemming personal names is unlikely to increase recall, and is very
> likely to degrade precision (how "right" the results are).
>
> I was floundering for examples to illustrate this, asked librarians
> for some, and Vitus Tang, one of our MARC experts, came up with
>
> Michael/Michaels (see below)
>
> which also suggests William/Williams
>
> Here's Vitus's note:
>
> "I think the American writer Leonard Michaels could serve as an
> example of the effect of stemming personal names. If I search "Leonard
> Michaels" in the current version of SearchWorks, I get:
>
> Michaels, Leonard, 1933- (the correct person)
> Napolitano, Leonard Michael
> Koff, Leonard Michael
> Leonard, Michael James
> Stein, Michael Leonard
> Powell, Michael Leonard
> etc.
>
> So, you could get a lot more irrelevant records than if stemming was
> not applied."
>
> So I'll be tweaking our index so it doesn't stem the author fields.
>
> Naomi Dushay
> nd...@st...
>
> --~--~---------~--~----~------------~-------~--~----~
> You received this message because you are subscribed to the Google
> Groups "solrmarc-tech" group.
> To post to this group, send email to sol...@go...
> To unsubscribe from this group, send email to sol...@go...
> For more options, visit this group at http://groups.google.com/group/solrmarc-tech?hl=en
> -~----------~----~----~----~------~----~------~--~---
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Vufind-tech mailing list
> Vuf...@li...
> https://lists.sourceforge.net/lists/listinfo/vufind-tech

Naomi Dushay
nd...@st...
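
To make Vitus's "Leonard Michaels" example concrete, here is a minimal Python sketch of why stemming author names hurts precision. Everything in it is an assumption for illustration: naive_stem is a hypothetical plural-stripping rule, not the stemmer Solr uses, and search is a toy "all query terms must match" lookup, not SearchWorks or VuFind code. The author headings are the ones quoted in the thread.

```python
import re

# Author headings quoted in the thread (Vitus Tang's example).
AUTHORS = [
    "Michaels, Leonard, 1933-",
    "Napolitano, Leonard Michael",
    "Koff, Leonard Michael",
    "Leonard, Michael James",
    "Stein, Michael Leonard",
    "Powell, Michael Leonard",
]


def tokenize(text):
    """Lowercase the text and keep only alphabetic runs."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token):
    """Hypothetical stand-in for a real stemmer: strip one trailing 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def analyze(text, stem):
    """Turn text into a set of index terms, optionally stemmed."""
    tokens = tokenize(text)
    if stem:
        tokens = [naive_stem(t) for t in tokens]
    return set(tokens)


def search(query, headings, stem):
    """Return headings that contain every analyzed query term."""
    wanted = analyze(query, stem)
    return [h for h in headings if wanted <= analyze(h, stem)]


if __name__ == "__main__":
    for stem in (False, True):
        hits = search("Leonard Michaels", AUTHORS, stem=stem)
        print(f"stem={stem}: {len(hits)} hit(s)")
        for h in hits:
            print("   ", h)
```

With stemming off, only the Michaels heading matches. With it on, the query collapses to "leonard" plus "michael" and all six headings match, which is the loss of precision described above.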