Saxon performance vs MSXML performance

  • Andriy Gerasika

    Andriy Gerasika - 2010-11-30

    Hello, Michael,
    I am running some experiments just for fun, and have run into a curious problem:

    why did Saxon's performance drop by a factor of two, whilst MSXML's
    performance stayed the same? Theoretically, because Saxon does not construct
    a DOM but uses TinyTree, Saxon should win on sufficiently large documents
    against any other XSLT processor that uses a DOM, just because of the cost of
    constructing the DOM. But that's not the case. Where could the problem be? Is
    it Java's default SAX implementation, or something else?

    Thank you,
    Kind Regards,
    Andriy Gerasika

  • Michael Kay

    Michael Kay - 2010-11-30

    I don't see any surprises here.

    As far as I can see your two benchmarks are measuring very different
    workloads, so it's not clear why you would expect to get the same result.
    Different processors are going to perform better or worse depending on what
    you throw at them. For example some processors might spend a bit longer
    building the tree in order to make it faster to navigate, and that decision
    will benefit some workloads and not others. It would be interesting to see
    more detailed analysis here, for example to compare the time taken to build
    the tree, the time taken to do the transformation, and the time taken to
    serialize the result.
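
    One way to get the per-phase breakdown suggested above is sketched below. It
    is only an illustration under stated assumptions: it uses the JAXP API that
    ships with the JDK (so the tree is a DOM, not Saxon's TinyTree), the class
    name PhaseTimings, the Result holder and the tiny stylesheet are all
    hypothetical, and a single un-warmed run like this says little about
    steady-state JIT-compiled performance.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: times tree building, transformation and serialization separately.
final class PhaseTimings {
    // Hypothetical stylesheet used only for this illustration.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/items'>"
      + "<count><xsl:value-of select='count(item)'/></count>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Nanoseconds spent in each phase, plus the serialized result.
    static final class Result {
        long buildNanos;
        long transformNanos;
        long serializeNanos;
        String output;
    }

    static Result run(String xml, String xslt) {
        try {
            // Set-up that is deliberately kept outside the timed phases.
            TransformerFactory tf = TransformerFactory.newInstance();
            Templates templates = tf.newTemplates(new StreamSource(new StringReader(xslt)));
            Transformer identity = tf.newTransformer();
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            DocumentBuilder builder = dbf.newDocumentBuilder();

            // Phase 1: parse the source text into an in-memory tree.
            long t0 = System.nanoTime();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            long t1 = System.nanoTime();

            // Phase 2: transform tree to tree, so no serialization cost is included.
            DOMResult tree = new DOMResult();
            templates.newTransformer().transform(new DOMSource(doc), tree);
            long t2 = System.nanoTime();

            // Phase 3: serialize the result tree with an identity transformer.
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(tree.getNode()), new StreamResult(out));
            long t3 = System.nanoTime();

            Result r = new Result();
            r.buildNanos = t1 - t0;
            r.transformNanos = t2 - t1;
            r.serializeNanos = t3 - t2;
            r.output = out.toString();
            return r;
        } catch (Exception e) {
            throw new IllegalStateException("phase timing failed", e);
        }
    }

    public static void main(String[] args) {
        // Build a synthetic input document; the size is arbitrary.
        StringBuilder xml = new StringBuilder("<items>");
        for (int i = 0; i < 100_000; i++) {
            xml.append("<item/>");
        }
        xml.append("</items>");

        Result r = run(xml.toString(), STYLESHEET);
        System.out.printf("build: %d ms, transform: %d ms, serialize: %d ms%n",
            r.buildNanos / 1_000_000, r.transformNanos / 1_000_000, r.serializeNanos / 1_000_000);
    }
}
```

    Stylesheet compilation is kept outside the timed region so that the three
    figures correspond to the three phases named in the post; whether that is the
    right choice depends on whether the stylesheet is reused across runs.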

    You're making some very wild conjectures about the factors that influence
    performance, for example that you would expect C++ to be faster than Java.
    Well, an XSLT processor has to do dynamic memory allocation. If you write it
    in Java, you get the benefit of Java's memory allocation, while in C++ you
    have to do it yourself. So the question is, if you're writing an XSLT
    processor in C++, can you do memory allocation more efficiently than Java does
    it? And in my experience, the answer is usually no.

    You might like to read

    for my analysis of the factors affecting Saxon performance. It's written from
    an XQuery perspective, but most of it applies also to XSLT.

  • Andriy Gerasika

    Andriy Gerasika - 2010-12-01

    Well, fixing the timings for MSXML shows that it takes 2s to construct the DOM
    for a 30 MB document, 2s to serialize, and 10s to process. Apparently
    constructing the DOM is not as heavy an operation as I thought.

    Saxon is still the winner for the larger files:
    for a 15 MB file -- Saxon 5s vs. MSXML 4s
    for a 30 MB file -- Saxon 10s vs. MSXML 14s
    for a 45 MB file -- Saxon 14s vs. MSXML 26s
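
    To make the trend in these figures easier to read, the timings can be turned
    into throughput (MB per second). The sketch below only does that arithmetic
    on the numbers reported above; the class and method names are hypothetical,
    and three single-run data points are too few to draw firm conclusions about
    how either processor scales.

```java
// Hypothetical helper: converts the reported timings into MB/s for comparison.
final class Throughput {
    static double mbPerSecond(double megabytes, double seconds) {
        return megabytes / seconds;
    }

    public static void main(String[] args) {
        // File sizes and timings exactly as reported in the post.
        int[] sizeMb = {15, 30, 45};
        int[] saxonSeconds = {5, 10, 14};
        int[] msxmlSeconds = {4, 14, 26};

        for (int i = 0; i < sizeMb.length; i++) {
            System.out.printf("%d MB: Saxon %.2f MB/s, MSXML %.2f MB/s%n",
                sizeMb[i],
                mbPerSecond(sizeMb[i], saxonSeconds[i]),
                mbPerSecond(sizeMb[i], msxmlSeconds[i]));
        }
    }
}
```

    On these numbers Saxon's throughput stays at about 3 MB/s across all three
    sizes, while MSXML's falls from 3.75 MB/s to under 2 MB/s as the file grows.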

    I do agree that benchmark results differ from file to file, but how can one
    benchmark and show that Saxon is generally faster than
    MSXML/XslCompiledTransform/XSLTC, despite their various C++/compiled-XSLT
    tricks?

    The paper says it is more optimal to write empty(...) instead of count(...)=0.
    Does the same apply to not(...)? I very often write <xsl:template match="a"/>;
    should I be using empty() instead?

  • Michael Kay

    Michael Kay - 2010-12-01

    The paper says it is more optimal to write empty(...) instead of count(...)=0.
    Does the same apply to not(...)? I very often write <xsl:template match="a"/>;
    should I be using empty() instead?

    Saxon usually sorts this kind of thing out for you. If the argument to not()
    is statically recognizable as a node-set, Saxon will rewrite not(X) as
    empty(X) automatically.
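
    The reason such a rewrite is safe is that, for a node-set, the different
    emptiness tests give the same answer. The sketch below checks this with the
    XPath engine bundled in the JDK. That engine implements XPath 1.0, which has
    no empty() function, so the comparison is between not(X) and count(X) = 0; it
    does not show Saxon's internal rewrite. The class name, the helper method and
    the sample documents are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates an XPath 1.0 expression to a boolean.
final class EmptinessTests {
    static boolean test(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            InputSource source = new InputSource(new StringReader(xml));
            return (Boolean) xpath.evaluate(expression, source, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String withChild = "<a><b/></a>";
        String withoutChild = "<a/>";

        // Both forms should agree on each document.
        for (String doc : new String[] {withChild, withoutChild}) {
            boolean viaNot = test(doc, "not(/a/b)");
            boolean viaCount = test(doc, "count(/a/b) = 0");
            System.out.println(doc + " -> not(): " + viaNot + ", count()=0: " + viaCount);
        }
    }
}
```

    The two forms differ only in how much work a naive evaluator might do:
    count() asks for every matching node, whereas an emptiness test needs only to
    know whether a first one exists.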

  • Andriy Gerasika

    Andriy Gerasika - 2010-12-17

    Closed. The company I work for will be upgrading from Xalan/XSLTC to Saxon.

