Hi Matthew,

Interesting challenge. I'm not sure how it can be addressed without modifying the Java code or the dates in your metadata.

When looking at:
https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/discovery/SolrServiceImpl.java#L1439

It seems the date format is guessed purely from the String length. This guessing could probably be made more robust with proper regex matching, like the example here:

http://stackoverflow.com/a/3390252

Note that this example code would also need some additional patterns to support time zones. To make sure this doesn't get lost, I added it as a JIRA ticket: https://jira.duraspace.org/browse/DS-1775
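
To make the idea a bit more concrete, here is a rough sketch of regex-based format detection. This is not the actual DSpace code: the class name DateGuesser, the table of patterns, and the choice to treat values without a time zone as UTC are all my own assumptions for illustration. Also keep in mind that three-letter abbreviations such as "PST" are ambiguous, and how SimpleDateFormat resolves them can depend on the JVM's locale and time zone data.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of DSpace: picks a date format by regex
// instead of by string length.
final class DateGuesser {
    // Shared regex prefix for "yyyy-MM-ddTHH:mm:ss".
    private static final String DT = "\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}";

    // Each entry: { regex the whole value must match, SimpleDateFormat pattern }.
    private static final String[][] FORMATS = {
        {"\\d{4}", "yyyy"},
        {"\\d{4}-\\d{2}", "yyyy-MM"},
        {"\\d{4}-\\d{2}-\\d{2}", "yyyy-MM-dd"},
        {DT, "yyyy-MM-dd'T'HH:mm:ss"},
        {DT + "Z", "yyyy-MM-dd'T'HH:mm:ss'Z'"},
        {DT + "\\.\\d{3}Z", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"},
        // Named time zone such as "PST"; 'z' parses general time zone names.
        {DT + "[A-Za-z]{3,4}", "yyyy-MM-dd'T'HH:mm:ssz"},
    };

    static Date parse(String value) throws ParseException {
        for (String[] format : FORMATS) {
            if (value.matches(format[0])) {
                SimpleDateFormat sdf = new SimpleDateFormat(format[1], Locale.US);
                // Assumption: values without an explicit zone are taken as UTC.
                sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
                return sdf.parse(value);
            }
        }
        throw new ParseException("Unrecognized date format: " + value, 0);
    }

    public static void main(String[] args) throws ParseException {
        // The value from the stack trace in the original message.
        System.out.println(parse("1998-03-05T07:11:44PST"));
    }
}
```

With something along these lines, a value that matches none of the patterns is rejected explicitly instead of being forced through a format chosen by its length.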

best regards,

Bram

-- 
Bram Luyten @mire
2888 Loker Avenue East, Suite 315, Carlsbad, CA. 92010
Esperantolaan 4, Heverlee 3001, Belgium
www.atmire.com


On Thu, Nov 7, 2013 at 7:36 PM, Matthew McKinley <matthewjamesmckinley@gmail.com> wrote:
Whoops! Sent this to the wrong list.

Matthew McKinley
Digital Project Specialist, University of California, Irvine
about.me


---------- Forwarded message ----------
From: Matthew McKinley <matthewjamesmckinley@gmail.com>
Date: Thu, Nov 7, 2013 at 10:20 AM
Subject: SOLR/Discovery Date Parsing
To: dspace-devel@lists.sourceforge.net


Hi all,

We're running DSpace 1.8.2 on Tomcat 6 on a RedHat server.

We're trying to make the switch to Discovery and have most of the kinks worked out, except for indexing dates. Many of our dates are of the simple "MM-DD-YYYY" variety, but some include a timestamp as well, and these are not being indexed correctly by update-discovery-index. An example of the errors encountered is below:


2013-11-07 09:28:26,156 ERROR org.dspace.discovery.SolrServiceImpl @ Unable to parse date format
java.text.ParseException: Unparseable date: "1998-03-05T07:11:44PST"
    at java.text.DateFormat.parse(DateFormat.java:337)
    at org.dspace.discovery.SolrServiceImpl.toDate(SolrServiceImpl.java:1017)
    at org.dspace.discovery.SolrServiceImpl.buildDocument(SolrServiceImpl.java:737)
    at org.dspace.discovery.SolrServiceImpl.indexContent(SolrServiceImpl.java:153)
    at org.dspace.discovery.SolrServiceImpl.updateIndex(SolrServiceImpl.java:297)
    at org.dspace.discovery.SolrServiceImpl.createIndex(SolrServiceImpl.java:262)
    at org.dspace.discovery.IndexClient.main(IndexClient.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)


From manually editing the dates and re-updating the Discovery index, it seems the problem is either the time zone or the lack of one. Looking at the Java file (org.dspace.discovery.SolrServiceImpl), it looks like Discovery/SOLR will accept

yyyy-MM-dd'T'HH:mm:ss.SSS'Z'

or

yyyy-MM-dd'T'HH:mm:ss'Z'

but will NOT accept either a time zone such as "PST" at the end of the date string or no time zone at all (i.e. yyyy-MM-dd'T'HH:mm:ss).

Is there a way to get around this issue and have Discovery/SOLR index these date values without modifying the Java? We have a lot of DSpace objects in this (pretty standard UTC) date + time + time zone format, and I'd hate to have to remove information just to make them index nicely.

Thanks!
Matthew


Matthew McKinley
Digital Project Specialist, University of California, Irvine
about.me


_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette