From: Greg P. <Gre...@us...> - 2009-07-08 23:07:48

Matt,

We've just finished going through all of this to get our catalogue behaving as required. I can't show you right now, but our test code moves to the beta site tomorrow.

We took a leaf out of NLA's book and browse indexed (and reverse browse indexed) every field we wanted this behaviour on. Then our query boosts records that are both left and right browse matched, followed by left browse matched only, then finally on to keyword.

It wasn't as simple as I'd have liked :( and I'm still making occasional tweaks, but the nitty gritty is done for us. I might have made it too hard by mistake, but it works and I'm happy to be corrected and improve our processes.

I've attached our schema, and some code snippets from the index and search below.

One thing I'd love to take the time and put in solrmarc is a built in 'reverse string' option (unless I'm ignorant and it's already there) because we are having to do custom index functions for really simple fields like title rather than use the properties file.

    public String reverseString(String origin) {
        if (origin == null) {return origin;}
        StringBuffer buffer = new StringBuffer(origin);
        buffer = buffer.reverse();
        return buffer.toString();
    }

    /*******
     * Title - It isn't strictly necessary to get titles this way
     *         but it makes it simpler to keep the logic here so
     *         it isn't replicated for the string reversal as
     *         well as out in the properties file.
     */
    public String getUSQTitle(final Record record) {
        DataField field245 = (DataField) record.getVariableField("245");
        String return_val = "";
        if (field245 != null) {
            Subfield field245a = field245.getSubfield('a');
            Subfield field245b = field245.getSubfield('b');
            if (field245a != null) {return_val += Utils.cleanData(field245a.getData()) + " ";}
            if (field245b != null) {return_val += Utils.cleanData(field245b.getData());}
        }
        return return_val;
    }

    public String getUSQTitleReversed(final Record record) {
        return reverseString(getUSQTitle(record));
    } // End function

Solr query:

    $query .= '((title_short_browse:"' . $phraseQuery . '" AND' .
              ' title_short_browse_right:"' . $reversedQuery . '")^75 OR ' .
              '(title_browse:"' . $phraseQuery . '" AND' .
              ' title_browse_right:"' . $reversedQuery . '")^50) OR ' .
              '(title_short_browse:"' . $phraseQuery . '"^75 OR' .
              ' title_browse:"' . $phraseQuery . '"^50) OR ' .
              '(title_short:"' . $phraseQuery . '"^75 OR' .
              ' title:"' . $phraseQuery . '"^50 OR' .
              ' title:(' . $andQuery . ')^10 OR' .
              ' title_alt:(' . $andQuery . ')^5 OR' .
              ' title_old:(' . $andQuery . ') OR' .
              ' title_new:(' . $andQuery . '))^100 OR ' .
              '(series:("' . $phraseQuery . '")^10 OR' .
              ' series:(' . $andQuery . ')^5 OR' .
              ' series2:("' . $phraseQuery . '")^5 OR' .
              ' series2:(' . $andQuery . ')^5)';

We've also been playing with the Phonetic filter but I can't get it working the way I want. Double Metaphone, Metaphone and Soundex all match 'Organizational Behaviour' and 'Organisational Behavior' (yay) but they also match 'Organise' and 'Organ' (boo), whilst Refined Soundex only matches 'Behaviour' and 'Behavior' and not the others.

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841

________________________________
From: Riehle, Matthew T [mailto:mtr...@pu...]
Sent: Thursday, 9 July 2009 3:29 AM
To: vuf...@li...
Subject: [VuFind-Tech] Relevance results

Hi All,

We are trying to get more relevant results returned when searching for specific titles.

For example: If the journal "Nature" is searched for by title, the first record that is returned is not "Nature", but rather "The natural history of weasels & stoats". The second record is Nature. In total, we should have 11 records returned for the title "Nature", but none of the other 10 records are being pulled within the first few pages of results.

This appears to be an issue with only using the stemmed query for relevancy ranking. We have tried modifying the weights in the query string and only searching the title fields, but this has not helped any.

Our results:
http://catalog.lib.purdue.edu/Find/Search/Home?lookfor=Nature&type=title&submit=Find

Villanova and CARLI both return relevant results when the same title search for "Nature" is done on their systems.
http://library.villanova.edu/Find/Search/Home?lookfor=Nature&type=title&search=catalog+title&submit=Find
http://vufind-beta.carli.illinois.edu/vf/Search/Home?lookfor=Nature&type=title&start_over=1&submit=Find

Is there something we need to modify within Solr? Any help would be greatly appreciated!

Thanks,
Matt

_______________________
Matt Riehle
Web Application Developer
Purdue University Libraries
mtr...@pu...
765.496.1080

This email (including any attached files) is confidential and is for the intended recipient(s) only. If you received this email by mistake, please, as a courtesy, tell the sender, then delete this email. The views and opinions are the originator's and do not necessarily reflect those of the University of Southern Queensland. Although all reasonable precautions were taken to ensure that this email contained no viruses at the time it was sent we accept no liability for any losses arising from its receipt. The University of Southern Queensland is a registered provider of education with the Australian Government (CRICOS Institution Code No's.
QLD 00244B / NSW 02225M)
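
For readers following the reply above, the three tiers it describes (left and right browse match, then left browse match only, then keyword) can be illustrated in plain Java outside Solr. This is only a sketch under assumptions: the class BrowseTierSketch, its methods and the lower-casing normalisation are hypothetical and are not part of solrmarc, VuFind or the attached schema, and real ranking depends on how the browse fields are analysed in Solr.

```java
import java.util.Locale;

/** Illustrative only: mimics the three ranking tiers in plain Java, not in Solr. */
final class BrowseTierSketch {

    /** Hypothetical normalisation: trim, lower-case and collapse whitespace. */
    static String normalize(String s) {
        if (s == null) {
            return "";
        }
        return s.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    /** Same idea as the reverseString helper in the reply. */
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    /** True when 'query' is the whole text or its leading words. */
    static boolean leftAnchored(String text, String query) {
        return text.equals(query) || text.startsWith(query + " ");
    }

    /**
     * Returns 3 when the query anchors at both ends of the title,
     * 2 when it anchors at the left end only,
     * 1 when it only appears as whole words somewhere in the title,
     * 0 when it does not match at all.
     */
    static int tier(String title, String query) {
        String t = normalize(title);
        String q = normalize(query);
        if (q.isEmpty()) {
            return 0;
        }
        boolean left = leftAnchored(t, q);
        // A right-anchored match is a left-anchored match on the reversed strings.
        boolean right = leftAnchored(reverse(t), reverse(q));
        if (left && right) {
            return 3;
        }
        if (left) {
            return 2;
        }
        return (" " + t + " ").contains(" " + q + " ") ? 1 : 0;
    }

    public static void main(String[] args) {
        String[] titles = {
            "Nature",
            "Nature and culture",
            "Human nature",
            "The natural history of weasels & stoats"
        };
        for (String title : titles) {
            System.out.println(tier(title, "Nature") + "  " + title);
        }
    }
}
```

With the query "Nature", the title "Nature" lands in the top tier, "Nature and culture" in the second, "Human nature" in the keyword tier, and "The natural history of weasels & stoats" is not matched at all, because the sketch does no stemming.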