Hello Mr. Kay
I'd like to process small to medium-sized XML documents with what is virtually
a single, open-ended XSLT stylesheet. The structure of the different input
documents should be able to evolve and grow independently of the process
itself. The process should be able to handle any new namespace or element type
through the addition of corresponding template rules. Since such an almighty
stylesheet would soon grow too big, I would like to break it down, for example by
1) The process first scans the input document for all the namespaces it
contains. Then it dynamically builds a stylesheet that imports the
corresponding stylesheet for each namespace. The result gets compiled and the
input transformed.
2) The process scans the input document for the namespaces it contains and
builds a pipeline of the pre-compiled stylesheets that handle these
namespaces. The input gets transformed step by step.
3) The process starts with the stylesheet for the root namespace. For every
unknown namespace, it transforms the node with the pre-compiled stylesheet for
that namespace via the saxon:transform() extension. Every "foreign" node gets
transformed separately.
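To make option 1 a bit more concrete, here is a rough sketch in Java. It is not taken from the project: the class name NamespaceDriver, the namespace-to-module map and the module file names are all hypothetical, and it only looks at element namespaces (attribute namespaces are ignored). It scans the input with StAX and generates a driver stylesheet containing one xsl:import per namespace that has a registered module.

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class NamespaceDriver {

    /** Collects the namespace URIs of all elements, in order of first appearance. */
    static Set<String> scanNamespaces(String xml) {
        Set<String> found = new LinkedHashSet<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        String uri = reader.getNamespaceURI();
                        // Elements in no namespace are skipped.
                        if (uri != null && !uri.isEmpty()) {
                            found.add(uri);
                        }
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        return found;
    }

    /**
     * Builds a driver stylesheet that imports only the modules registered for
     * the namespaces found. The hrefs are assumed to be plain file names that
     * need no XML escaping.
     */
    static String buildDriver(Set<String> namespaces, Map<String, String> moduleByNamespace) {
        StringBuilder sb = new StringBuilder();
        sb.append("<xsl:stylesheet version=\"2.0\"")
          .append(" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">\n");
        for (String ns : namespaces) {
            String href = moduleByNamespace.get(ns);
            if (href != null) {
                sb.append("  <xsl:import href=\"").append(href).append("\"/>\n");
            }
        }
        sb.append("</xsl:stylesheet>\n");
        return sb.toString();
    }
}
```

The generated text would then be compiled like any other stylesheet. Note that the order of the xsl:import elements determines import precedence, so the order in which namespaces are encountered matters if two modules match the same nodes.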
Which approach do you think performs best? Do you have ideas for another
The namespaces within one input document are only a small subset of all the
namespaces that can possibly occur. Some namespaces appear more often, others
very seldom. Proper cache management would save the process from pre-compiling
all the stylesheets in long-term, wide-coverage use.
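The kind of cache I have in mind could look roughly like the following sketch. It assumes standard JAXP Templates objects as the compiled form and a small least-recently-used policy; the class name StylesheetCache and the loader function that returns the stylesheet text for a namespace are hypothetical placeholders.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

final class StylesheetCache {
    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Function<String, String> stylesheetText; // hypothetical loader: namespace -> XSLT text
    private final Map<String, Templates> cache;
    private int compilations = 0;

    StylesheetCache(final int capacity, Function<String, String> stylesheetText) {
        this.stylesheetText = stylesheetText;
        // An access-ordered LinkedHashMap drops the least recently used entry
        // once the capacity is exceeded.
        this.cache = new LinkedHashMap<String, Templates>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Templates> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the compiled stylesheet for a namespace, compiling it only on a cache miss. */
    synchronized Templates forNamespace(String namespace) {
        Templates compiled = cache.get(namespace);
        if (compiled == null) {
            try {
                compiled = factory.newTemplates(
                    new StreamSource(new StringReader(stylesheetText.apply(namespace))));
            } catch (TransformerConfigurationException e) {
                throw new IllegalStateException("Cannot compile stylesheet for " + namespace, e);
            }
            compilations++;
            cache.put(namespace, compiled);
        }
        return compiled;
    }

    /** Number of compilations so far, useful for checking how well the cache works. */
    synchronized int compilations() {
        return compilations;
    }
}
```

Frequently used namespaces stay compiled, while rarely used ones are compiled on demand and evicted again when the cache is full.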
Sorry, I'm not sure I can offer any useful design advice here. I would need to
spend a lot more time studying the project requirements.
Maybe I've been too specific somehow, though I don't see where. Anyway, the
problem I'd like to grasp is of a more general nature; there are no
specific project requirements. I've played around with some arbitrary test
Maybe you can use the results; I would be happy to hear any comments.
Thanks for sharing the data. Performance measurement is always illuminating,
and it often raises a new question for every one that it answers. The
difference for the figures with 1 "content element" versus 2 content elements
is certainly striking, and I would think it is worth investigating further.
Without any idea what your actual XSLT code looks like, I can't give any more
Sorry about that;
http://code.google.com/p/livcos/source/browse/proto/OpenXslt/src/proto/ shows
all the files.
Just had a very quick glance at your code (I'm afraid I won't have time to do
more). You need to be aware that using a DOMSource with Saxon is very
inefficient - typically 4-10 times the cost of using Saxon's native tree
(which gets built if you supply a StreamSource or SAXSource).
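For illustration, a minimal sketch of supplying the input as a StreamSource through the standard JAXP API, so that the processor builds its own tree instead of being handed a DOM. The class name StreamInputDemo is hypothetical, and which processor TransformerFactory.newInstance() returns depends on how the classpath and system properties are set up.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class StreamInputDemo {

    /** Compiles the stylesheet and transforms the input, both supplied as text. */
    static String transform(String stylesheet, String inputXml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Templates compiled =
                factory.newTemplates(new StreamSource(new StringReader(stylesheet)));

            // A StreamSource gives the processor the raw XML, so it parses the
            // document into its own internal tree rather than wrapping a DOM.
            Source input = new StreamSource(new StringReader(inputXml));

            StringWriter out = new StringWriter();
            compiled.newTransformer().transform(input, new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

If the document only exists as a DOM in the application, this route means serializing and re-parsing it first, which has its own cost and is worth measuring.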
Thank you for your time and thanks for the hint! Things have changed
considerably and make more sense with the DOMSource out of the
equation. Here are the new results: http://livcos-dev.blogspot.com/2010/04/open-
Good, that makes sense. On occasions I've been tempted to change Saxon so that
when given a DOMSource it defaults to converting it to a TinyTree rather than
wrapping it - the trouble is that neither strategy is better all the time, and
changing it will break applications.