I’m moving this over to the tech list since it is getting a bit technical…

 

First of all, apologies for some misinformation in my previous reply: you can’t combine a custom SolrMarc function with the “first” setting.  When setting up custom functions as described previously, you just need to make sure that the publishDateSort field’s custom function returns only one value, while the publishDate custom function may return a set of multiple values if you so desire.
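To make the one-value-versus-many distinction concrete, here is a minimal sketch in Python. It only illustrates the logic; a real SolrMarc custom function would live in SolrMarc's own scripting or Java layer. The function names are hypothetical, and picking the first date as the sort value is an assumption for illustration, not something VuFind prescribes.

```python
def publish_date_values(dates):
    """Values for the multi-valued publishDate field: every distinct date, in order."""
    seen = []
    for date in dates:
        if date not in seen:
            seen.append(date)
    return seen


def publish_date_sort_value(dates):
    """Value for the single-valued publishDateSort field: at most one date.

    Assumption: the first date in the record is the one to sort on.
    """
    return dates[0] if dates else None


record_dates = ["2011", "2009", "2011"]
print(publish_date_values(record_dates))      # all distinct dates, for display
print(publish_date_sort_value(record_dates))  # exactly one value, for sorting
```

Whatever rule you use to choose the single value, the key point is that the sort function never hands back more than one.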

 

In any case, thanks to this conversation, I have just made some changes to the solr3 VuFind branch that I announced yesterday.  These changes will make their way into the trunk as soon as we make the upgrade to Solr 3.1 official:

 

- There is now a single-valued publishDateSort field for date sorting purposes.  This uses the output of SolrMarc’s custom getDate function, which returns a comma-separated list of all dates in a record.

- The regular multi-valued publishDate field now indexes MARC 260c instead of using a custom function.  Since this is a multi-valued field, we don’t actually want a comma-separated list here.  There is a bug in existing versions of VuFind that causes glitchy displays for records with multiple 260 fields as a result of this inconsistency.

 

I don’t think that too many people are going to notice a difference here…  but I believe that this is an improvement nonetheless.  Please let me know if you have any questions or concerns about this.

 

- Demian

 

From: Demian Katz
Sent: Wednesday, April 20, 2011 9:38 AM
To: 'Harmon, Kelly'; vufind-general@lists.sourceforge.net
Subject: RE: Sort by Date Question

 

The date sort is very simple – it’s just an alphabetical sort on the publishDate field.  This works great when all of the values in the field are four-digit years, but in other circumstances it creates problems like the one you describe.
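A quick way to see the effect is to sort the sample dates from the original question as plain strings. This is a small Python illustration of string ordering in general, not VuFind or Solr code.

```python
# The sample dates from the original question, in no particular order.
dates = ["Feb 2011", "May 2010", "May 2011", "Aug 2010", "Apr 2011"]

# A plain string sort compares character by character, so the month name
# decides the order and the year only breaks ties between equal months.
descending = sorted(dates, reverse=True)
print(descending)
# ['May 2011', 'May 2010', 'Feb 2011', 'Aug 2010', 'Apr 2011']

# Four-digit years, by contrast, sort correctly as strings.
years = sorted(["2011", "1999", "2005"], reverse=True)
print(years)
# ['2011', '2005', '1999']
```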

 

Fixing it probably isn’t too hard, but it may require a little bit of programming.  I think this is the general idea:

 

1.)    Add a new field to solr/biblio/conf/schema.xml to use for sorting by date.  Something like this:

a.       <field name="publishDateSort" type="string" indexed="true" stored="true" />

2.)    Update your web/conf/searches.ini file to replace references to “year” with “publishDateSort” in the [Sorting] section so that you can sort on the new sort field rather than on the existing publishDate field, which will now be used only for display purposes.

3.)    Here’s where the programming comes in: create a new copy of the function you are currently using to populate the publishDate field that returns a sortable version of the date instead of a human-readable string.  For example, if it would normally return “May 2011” have it return “20110500” instead.  If it would normally return plain old “2011”, you could do something like “20110000”.

 

The idea here is that you keep the current behavior for the publishDate field, since that is used for display… but the new publishDateSort field will contain a machine-sortable equivalent.
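The conversion in step 3 could look roughly like the following. This is a sketch in Python to show the logic only; the actual function would need to be written for SolrMarc, and the name sortable_date, the month table, and the handling of unrecognized input are all assumptions for illustration.

```python
import re

# Three-letter month abbreviations mapped to two-digit month numbers.
MONTHS = {
    "jan": "01", "feb": "02", "mar": "03", "apr": "04",
    "may": "05", "jun": "06", "jul": "07", "aug": "08",
    "sep": "09", "oct": "10", "nov": "11", "dec": "12",
}


def sortable_date(display_date):
    """Turn 'May 2011' or '2011' into an eight-character sort key (YYYYMM00)."""
    # Find a standalone run of exactly four digits to use as the year.
    year_match = re.search(r"(?<!\d)(\d{4})(?!\d)", display_date)
    if year_match is None:
        return None  # no usable year; the caller decides what to index

    # Use the first word that starts like a month name; "00" means unknown.
    # Note: any word beginning with a month abbreviation will match.
    month = "00"
    for word in re.findall(r"[A-Za-z]+", display_date):
        code = MONTHS.get(word[:3].lower())
        if code:
            month = code
            break

    return year_match.group(1) + month + "00"


dates = ["Feb 2011", "May 2010", "May 2011", "Aug 2010", "Apr 2011"]
print(sorted(dates, key=sortable_date, reverse=True))
# ['May 2011', 'Apr 2011', 'Feb 2011', 'Aug 2010', 'May 2010']
```

Because every key has the same length and puts the year first, an ordinary string sort on these values gives true chronological order.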

 

Also, one other important note: the publishDate field is currently multi-valued, since a single record may contain multiple publication dates.  However, you can’t sort on a multi-valued field, so I removed the multiValued="true" setting when I created the publishDateSort field in the example above.  This means that in your import settings, you may need to add “first” to the end of the line to ensure that only one value gets put into the sort field.

 

In fact, I will probably officially add a “publishDateSort” field to the VuFind trunk in the near future – in Solr 1.4.1, sorting on a multi-valued field is technically legal but can have unpredictable results.  Once we upgrade to Solr 3.1, I believe that sorting on a multi-valued field will become illegal and will result in errors, so we will actually be forced to adopt a variation of this approach anyway.  Once that happens, steps 1 and 2 will be taken care of for you automatically… but you’ll still need to have the custom indexing script from step 3, since VuFind will continue to expect four-digit years by default.

 

- Demian

 

From: Harmon, Kelly [mailto:Kelly.Harmon@ARS.USDA.GOV]
Sent: Wednesday, April 20, 2011 9:09 AM
To: vufind-general@lists.sourceforge.net
Subject: [VuFind-General] Sort by Date Question

 

Hi All

 

Can someone point me to a location that describes how the date sort function works? (Also:  is it relatively easy for a non-programmer type to alter?)

 

We have combined our two databases into one in the Solr index, and have added the MARC 008 field to capture the dates of Journals/Articles, so that we can have one big happy db for folks to search.

 

The problem arises with dates such as “May 2011” and other non-standard formats.   For example, true descending order would be:

 

May 2011 – Apr 2011 – Feb 2011 – Aug 2010 – May 2010

 

But dates are sorting Apr 2011, Feb 2010, Mar 2011, etc.

 

Is there a method to handle this in VuFind?   If not, can you provide suggestions on how to handle it?

 

Thanks in advance,

 

Kelly

 

 

 

Kelly A. Harmon

Webmaster, National Agricultural Library

10300 Baltimore Avenue

Beltsville, MD  20705

(301) 504-5788