#195 Error when web crawling Solaris directory

Milestone: v1.4
Status: open
Owner: nobody
Labels: None
Priority: 1
Updated: 2014-07-04
Created: 2014-07-04
Creator: Tony Dixon
Private: No

I have an issue when crawling Office 2013 files as part of a web crawl. The text analyzer returns the underlying raw ASCII text as the content field, rather than the human-readable content.

I've done a little investigation. The problem doesn't occur on file crawls, and indeed only seems to happen on the deployment platform, a Solaris machine.

I have OSS v1.5.3 deployed on Solaris 10 (x64) running Java 1.7.0-b147. The file repository is hosted by an Apache2 web server. The same OSS version runs fine on a Linux Mint VM unless the file repository is on Solaris. Likewise, OSS deployed on Solaris doesn't experience the problem with a file repository hosted on Linux or Windows.
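
Since the problem follows the Solaris-hosted repository rather than the OSS host, one thing that might be worth comparing is the Content-Type header each web server returns for the same Office file. This is only a guess: it assumes the crawler picks its parser from that header, which is not confirmed here, and that the Solaris Apache may be missing the Office Open XML entries in its MIME type configuration. The sketch below is a rough diagnostic, not part of OSS; the function names are hypothetical.

```python
import urllib.request

# Standard MIME types for the Office Open XML formats used by Office 2013.
OOXML_TYPES = {
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def expected_type(url):
    """Return the expected MIME type for an Office file URL, or None."""
    # Drop any query string and compare the extension case-insensitively.
    path = url.split("?", 1)[0].lower()
    for ext, mime in OOXML_TYPES.items():
        if path.endswith(ext):
            return mime
    return None


def header_matches(url, content_type):
    """True if the Content-Type header value matches the expected MIME type."""
    expected = expected_type(url)
    if expected is None or not content_type:
        return False
    # Ignore parameters such as "; charset=..." after the media type.
    return content_type.split(";", 1)[0].strip().lower() == expected


def fetch_content_type(url):
    """Send a HEAD request and return the Content-Type header (or None)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type")
```

Calling `fetch_content_type()` on the same document served from the Solaris host and from the Linux host, then passing each result to `header_matches()`, would show whether the two servers describe the file differently. If the Solaris server reports something like `text/plain`, that would be consistent with the raw text ending up in the content field.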

Best wishes

Discussion