Dear Michael,

Thanks for your prompt reply. I've previously investigated the schema component model functionality of Saxon, which is indeed very fast.
Unfortunately, the composite schemas we deal with are a mixture of built-in schemas (which would benefit from SCM-serialized versions) and user-supplied schemas. As I understand from the documentation and from testing, a processor cannot be configured with multiple SCM files, or with a mixture of SCM files and schemas loaded via a stream, so we cannot take advantage of preloading the built-in schemas as SCMs.
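
For reference, the sort of thing we were hoping to do is sketched below in
s9api terms. The importComponents call is my assumption about how a compiled
.scm file would be loaded programmatically, and the file names are just
placeholders:

    import java.io.ByteArrayInputStream;
    import javax.xml.transform.stream.StreamSource;
    import net.sf.saxon.s9api.Processor;
    import net.sf.saxon.s9api.SchemaManager;

    public class PreloadSketch {
        public static void main(String[] args) throws Exception {
            Processor processor = new Processor(true);  // schema-aware EE processor

            // Step 1: preload the built-in schemas from a compiled .scm file.
            // (This is the step that, as far as we can tell, cannot be combined
            // with step 2.)
            processor.getUnderlyingConfiguration()
                     .importComponents(new StreamSource("built-in-schemas.scm"));

            // Step 2: add the user-supplied schemas from in-memory streams.
            byte[] userSchemaBytes =
                    "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'/>"
                            .getBytes("UTF-8");  // stands in for the bytes our API provides
            SchemaManager schemaManager = processor.getSchemaManager();
            schemaManager.load(new StreamSource(new ByteArrayInputStream(userSchemaBytes)));
        }
    }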

Many thanks,

Peter

On 05/06/13 19:32, Michael Kay wrote:


Begin forwarded message:

From: Michael Kay <mike@saxonica.com>
Subject: Re: [saxon] Schema loading advice
Date: 5 June 2013 19:17:55 GMT+01:00
To: Mailing list for the SAXON XSLT and XQuery processor <saxon-help@lists.sourceforge.net>
Cc: Peter Cowan <prc@corefiling.co.uk>

Thanks for the enquiry. We haven't done detailed performance tests on schema loading but we are aware that there is lots of room for improvement and I'm hoping we can schedule some work on this soon. With large schemas it's often the case that large parts of the schema are unused and a considerable saving might be possible by doing lazy validation/compilation of components.

In the meantime, the best I can suggest is to look at SCM, the schema component model. The Validate command allows you to output a .scm file containing the schema in compiled form, and this should be significantly faster to load than the original source schema documents. Of course this assumes that you are using the same schema repeatedly.
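
For example, generating and then reusing the compiled form looks roughly like
this (option names quoted from memory, so please check them against the
documentation for your release; file names are illustrative):

    # one-off: validate a sample instance and write the compiled schema to disk
    java -cp saxon9ee.jar com.saxonica.Validate -xsd:composite.xsd -scmout:composite.scm -s:sample.xml

    # subsequent runs: load the compiled .scm instead of the source schemas
    java -cp saxon9ee.jar com.saxonica.Validate -scmin:composite.scm -s:instance.xml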

Michael Kay
Saxonica


On 4 Jun 2013, at 18:27, Peter Cowan wrote:

Hi All,

I'm using Saxon 9.5 EE (under Java 1.6) to perform a large number of
schema-aware XPath evaluations on XML documents.

The schemas I'm using are composite, around 3.4 MB, and are already in
memory, provided by an API as DOM (backed by Xerces).

In some independent tests I've tried loading the schemas from in-memory
DOMSources and StreamSources (backed by ByteArrayInputStreams) using the
load method on the SchemaManager object, which takes about 3.8 seconds.
I then ran a similar test building a Schema with Xerces from the same
in-memory sources, and this took about 800 ms.
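
For concreteness, the two tests look roughly like the following (simplified:
the class name and the file argument are only for illustration; in the real
tests the bytes come from our API rather than from disk, and the timings
above are for the full composite schema set):

    import java.io.ByteArrayInputStream;
    import javax.xml.XMLConstants;
    import javax.xml.transform.stream.StreamSource;
    import javax.xml.validation.SchemaFactory;
    import net.sf.saxon.s9api.Processor;
    import net.sf.saxon.s9api.SchemaManager;

    public class SchemaLoadTiming {
        public static void main(String[] args) throws Exception {
            byte[] schemaBytes = readFully(new java.io.FileInputStream(args[0]));

            // Test 1: Saxon 9.5 EE, loading via the s9api SchemaManager
            // (about 3.8 seconds in our tests)
            Processor processor = new Processor(true);
            SchemaManager saxonManager = processor.getSchemaManager();
            long start = System.currentTimeMillis();
            saxonManager.load(new StreamSource(new ByteArrayInputStream(schemaBytes)));
            System.out.println("Saxon load: " + (System.currentTimeMillis() - start) + " ms");

            // Test 2: building a Schema from the same bytes via the JAXP
            // SchemaFactory (Xerces), about 800 ms in our tests
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            start = System.currentTimeMillis();
            factory.newSchema(new StreamSource(new ByteArrayInputStream(schemaBytes)));
            System.out.println("Xerces load: " + (System.currentTimeMillis() - start) + " ms");
        }

        // minimal helper to slurp a stream into a byte array (Java 1.6, no extra libs)
        private static byte[] readFully(java.io.InputStream in) throws java.io.IOException {
            java.io.ByteArrayOutputStream out = new java.io.ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            in.close();
            return out.toByteArray();
        }
    }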

Regrettably we have to load the schema twice, but ideally we would like it
to be as fast as possible.
Are these kinds of speeds normal?
Are there any tips for improving schema loading speed?
Do you require any more information to answer the above questions?

Many thanks,

Peter

--
Peter Cowan, CoreFiling Limited
http://www.corefiling.com
Phone: +44-1865-203192






-- 
Peter Cowan, CoreFiling Limited
http://www.corefiling.com
Phone: +44-1865-203192