I could imagine a lightweight Java (or preferably Groovy) webservice daemon doing the transformation. It could be Nailgun-based or not, and wouldn't even need to use Grails. I've done something similar for NLP parsing and analysis, where the JVM spin-up cost was otherwise prohibitive.
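As a rough sketch of why a long-running process helps (the stylesheet and class names here are made up for illustration, not DSpace's actual OAI stylesheets): compile the XSLT once into a Templates object and reuse it for every record, so the JVM startup and stylesheet compilation costs are paid once per batch instead of once per record.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class BatchTransform {
    public static void main(String[] args) throws Exception {
        // Hypothetical stylesheet: wraps each record's text in an <out> element.
        String xslt =
            "<xsl:stylesheet version='1.0' "
          + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:template match='/'>"
          + "<out><xsl:value-of select='record'/></out>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

        // Compile the stylesheet once; a Templates instance is
        // thread-safe and can be reused for the whole batch.
        Templates templates = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(xslt)));

        String[] records = { "<record>a</record>", "<record>b</record>" };
        for (String rec : records) {
            // Creating a Transformer from a compiled Templates is cheap,
            // and everything runs inside one JVM -- no per-record startup.
            Transformer t = templates.newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(rec)),
                        new StreamResult(out));
            System.out.println(out);
        }
    }
}
```

The same loop body would work whether the daemon receives one record or a whole resumption-token page at a time.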

I imagine the reason the input is sectioned is to aid fault-tolerance and debugging, and it probably makes sense to preserve that.

Helix84, do you want to open a JIRA ticket to track this?


On Tue, Oct 29, 2013 at 5:03 PM, helix84 <helix84@centrum.sk> wrote:
On Tue, Oct 29, 2013 at 9:39 PM, Mosior, Benjamin <BEMosior@ship.edu> wrote:
> If there’s something you would like to see benchmarked, please get in
> contact with us and contribute to the document before November 30th, 2013.


I wasn't present at the Developers Call so I'm sorry if this is off-topic.

There's something that is benchmarkable, but I already know that it's
painfully slow and why: the OAI harvesting and import. The reason
it's slow is that the harvested records are cut up into one XML
file per record, and when an XSLT transformation is run on each one,
there's the JVM startup penalty of the XSLT processor for each
record. This could be improved by not cutting up the harvested record
batch (the batch is "naturally" grouped into pages separated by
resumption tokens given by the OAI provider) and running the XSLT
transformation on the whole batch, or by applying some solution that
eliminates repeated JVM startup (one such solution is Nailgun, but
there might be a better one).


Vufind-tech mailing list