From: Stefan B. <ste...@be...> - 2010-04-07 07:45:31
Roussi Roussev, 06.04.2010 23:41:
> I am experimenting with elementtree and jython 2.5.1 but its performance
> is far from lxml. What would it take to get the jython xml performance
> on-par with the C library?

Writing a Java XML parser that runs as fast as the C-based parsers, I guess. Both cElementTree and lxml.etree quite easily beat the existing Java parsers in terms of performance, at least according to the last benchmarks I read. Cross-platform benchmarks are really rare, though. Most of the time, the platform is fixed anyway, so there's little benefit from looking around.

> Has anyone tried to optimize it? Any
> fundamental roadblocks? Could someone provide pointers to an in-depth
> discussion if it has happened before?

I don't remember any major discussion on this, not even when I brought it up on the list.

It is true that the ElementTree adaptor for Java isn't particularly tuned. I would expect that you could get it to run a lot faster by running it through a profiler and turning some screws here and there.

Another thing to consider is the approach that lxml has taken: write a Python wrapper around a native DOM tree. But that's certainly a lot more work (and likely also a lot less memory-friendly) than just tuning ElementTree for Jython.

Stefan
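
P.S. To make the profiling suggestion a bit more concrete, here is a minimal sketch of how one might find the hot spots in an ElementTree parse. It is not taken from the Jython code base. The helper names (`build_sample`, `parse_and_count`, `profile_parse`) and the synthetic test document are made up for illustration, and the code is written against a current CPython standard library; on Jython 2.5 I would expect to need the pure-Python `profile` fallback and `StringIO` instead of `io`.

```python
import io
import pstats
import xml.etree.ElementTree as ET

try:
    import cProfile as profile_mod   # C profiler, available on CPython
except ImportError:
    import profile as profile_mod    # pure-Python fallback (assumed for Jython)


def build_sample(n_items):
    """Build a synthetic XML document (hypothetical test data)."""
    root = ET.Element("root")
    for i in range(n_items):
        item = ET.SubElement(root, "item", {"id": str(i)})
        item.text = "value %d" % i
    return ET.tostring(root)


def parse_and_count(data):
    """The workload to profile: parse the document and count <item> children."""
    root = ET.fromstring(data)
    return len(root.findall("item"))


def profile_parse(data, top=10):
    """Run the workload under the profiler and return (count, report text)."""
    profiler = profile_mod.Profile()
    count = profiler.runcall(parse_and_count, data)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return count, out.getvalue()


if __name__ == "__main__":
    count, report = profile_parse(build_sample(5000))
    print("parsed %d items" % count)
    print(report)
```

Running the same script on both CPython and Jython and comparing the top entries of the two reports should show which parts of the Java adaptor are worth tuning first.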