From: Greg P. <Gre...@us...> - 2009-10-05 22:43:11
The nightly build (latest) fixed my second point. The first point is still a big hurdle, though. I can't see the librarians being happy with our catalogue suggesting that users switch to US spelling.

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841

From: Demian Katz [mailto:dem...@vi...]
Sent: Monday, 5 October 2009 11:39 PM
To: Greg Pendlebury; 'vuf...@li...'
Subject: RE: Idle thoughts on spellcheck

Thanks for the update -- I've added another comment to JIRA with this link. Let us know how things work out for you!

- Demian

From: Greg Pendlebury [mailto:Gre...@us...]
Sent: Sunday, October 04, 2009 7:53 PM
To: Greg Pendlebury; Demian Katz; 'vuf...@li...'
Subject: RE: Idle thoughts on spellcheck

Clearly I shouldn't listen to my gut: https://issues.apache.org/jira/browse/SOLR-1071

I'll try one of the latest nightly builds. From looking at the structure, I think it will require a couple of code tweaks.

Greg Pendlebury

From: Greg Pendlebury [mailto:Gre...@us...]
Sent: Monday, 5 October 2009 9:32 AM
To: 'Demian Katz'; 'vuf...@li...'
Subject: Re: [VuFind-Tech] Idle thoughts on spellcheck

My gut reaction is it's a bug with PHP's json_decode(). You'd think it would turn the 'suggestion' index into an array of responses with appropriate testing.

Greg Pendlebury

From: Demian Katz [mailto:dem...@vi...]
Sent: Friday, 2 October 2009 10:38 PM
To: Greg Pendlebury; 'vuf...@li...'
Subject: RE: Idle thoughts on spellcheck

Thanks for sharing these notes -- I've attached them to the spellchecker JIRA issue so they don't get forgotten: http://www.vufind.org/jira/browse/VUFIND-13

Regarding the JSON problem with extendedResults, if I understand correctly, this sounds to me like a bug in Solr. Have you checked to see if the Solr project is aware of this issue?

- Demian

From: Greg Pendlebury [mailto:Gre...@us...]
Sent: Friday, October 02, 2009 1:14 AM
To: 'vuf...@li...'
Subject: [VuFind-Tech] Idle thoughts on spellcheck

I've been playing with spellcheck today on the laptop after finally getting it working from Till's contribution (I made some dumb mistakes :)). For our internal doco I had made a few notes, which I've included below. I'm curious whether anyone more familiar with Solr spellcheck can see a way around them:

* Solr does not return suggestions if your query got hits until you turn on 'onlyMorePopular'. It seems to be confusingly named, and it also means you can't get spelling suggestions on queries that have smaller hit counts if your search terms were actually correct. Example from our catalogue: if you search for 'behavior' (US spelling) you get 6,841 hits and no suggestions, but if you search for 'behaviour' (the REAL spelling ;P) you get 1,838 hits and the suggestion to try 'behavior' and 'behavioral'.

* To get the number of hits each term would return you need to turn on 'extendedResults'. However, this causes parsing problems in json_decode() because the associative array index 'suggestion' is used repeatedly, so you only get the last suggestion from the list of returned values.

* Spelling suggestions will be context-free with regard to hit counts anyway unless you build context-sensitive dictionaries, i.e. a search for 'rowling' in the author fields is currently compared against 'rowling' in allfields, because allfields is the origin of the dictionary.
  This point is somewhat moot given the json_decode() issue with extended results anyway (for now).

Ta,
Greg Pendlebury

________________________________
This email (including any attached files) is confidential and is for the intended recipient(s) only. If you received this email by mistake, please, as a courtesy, tell the sender, then delete this email. The views and opinions are the originator's and do not necessarily reflect those of the University of Southern Queensland. Although all reasonable precautions were taken to ensure that this email contained no viruses at the time it was sent we accept no liability for any losses arising from its receipt. The University of Southern Queensland is a registered provider of education with the Australian Government (CRICOS Institution Code No's. QLD 00244B / NSW 02225M)
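The json_decode() problem in the notes above (a repeated 'suggestion' index where only the last value survives) is easy to reproduce outside PHP. Below is a minimal sketch in Python rather than PHP, assuming a made-up response fragment: the JSON shape, the "word"/"freq" field names, and the counts are illustrative only and are not copied from a real Solr response. Python's json.loads() also keeps only the last value for a duplicated key, and its object_pairs_hook argument is one possible way to keep every value instead. The helper name collect_pairs is hypothetical.

```python
import json

# Illustrative fragment only: one JSON object that repeats the key
# "suggestion", as described in the notes. Field names and numbers
# are assumptions made up for this sketch.
raw = (
    '{"suggestion": {"word": "behavior", "freq": 6841},'
    ' "suggestion": {"word": "behavioral", "freq": 120}}'
)


def collect_pairs(pairs):
    """Build a dict from (key, value) pairs, keeping every value of a repeated key.

    Keys that occur once map to their value as usual; keys that occur
    more than once map to a list of their values in document order.
    """
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return {k: (v[0] if len(v) == 1 else v) for k, v in grouped.items()}


# Default parsing: the second "suggestion" silently replaces the first.
naive = json.loads(raw)

# Parsing with the hook: both suggestions are kept, in order.
kept = json.loads(raw, object_pairs_hook=collect_pairs)

print(naive["suggestion"]["word"])
print([s["word"] for s in kept["suggestion"]])
```

The hook runs once per JSON object, innermost objects first, so the nested suggestion objects are parsed normally and only the outer object's duplicate key is turned into a list. Whether something similar is practical on the PHP side is a separate question this sketch does not answer.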