The general approach seems reasonable, but it won’t work if you simply copy the text field type: sorting assumes that a field yields only one token per record, and the text field type splits the data into multiple tokens using the WhitespaceTokenizerFactory. You probably want to use the KeywordTokenizerFactory (which leaves the whole string intact as a single token) as the tokenizer, followed by the PatternReplaceFilterFactory as a filter… and that’s all.
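
For illustration, a minimal sketch of what such a field type might look like in schema.xml. The names textSortNoPunc and title_sort_nopunc are hypothetical, the source field "title" is an assumption about your schema, and the \p{Punct} pattern is my own substitution (the pattern in your message only strips trailing periods and whitespace, so it would not drop a leading quotation mark). The lowercase and trim filters are optional extras that make the sort case-insensitive and tidy up stray spaces:

```xml
<!-- Hypothetical sort-only field type: the whole title stays one token -->
<fieldType name="textSortNoPunc" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizer emits the entire input as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Optional: make sorting case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumed pattern: remove every ASCII punctuation character -->
    <filter class="solr.PatternReplaceFilterFactory" pattern="\p{Punct}" replacement="" replace="all"/>
    <!-- Optional: strip leading/trailing whitespace left behind -->
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field names; adjust to match your own schema -->
<field name="title_sort_nopunc" type="textSortNoPunc" indexed="true" stored="false"/>
<copyField source="title" dest="title_sort_nopunc"/>
```

Note that \p{Punct} only covers ASCII punctuation, so curly quotes would need to be added to the pattern separately, and any schema change like this requires a full reindex before the new field can be used for sorting.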

- Demian

From: Osullivan L. [mailto:L.Osullivan@swansea.ac.uk]
Sent: Tuesday, March 29, 2011 8:29 AM
To: vufind-tech@lists.sourceforge.net
Subject: [VuFind-Tech] Solr Sorting

Hi Folks,

We want to ensure that results sorted by title ignore punctuation. Thus, for example, "Race," class, and gender in exclusion from school would appear with the “Rs” rather than at the top of the results list.

Would the best method be to copy the text fieldType as “textNoPunc”, add <filter class="solr.PatternReplaceFilterFactory" pattern="(?&lt;!\b[A-Z])[.\s]*$" replacement="" replace="first"/> to it, create a new index field of that type, and then use that field for sorting?

Thanks,

Luke O'Sullivan
Library Systems Officer - Virtual Academic Library
South West Wales Higher Education Partnership (SWWHEP)

Tel: 01792 602772
Website: www.swwhep.ac.uk