From: Steve B. <Ste...@zv...> - 2003-01-09 21:08:29
ro...@po... wrote:
> On 8 Jan, Steve Ball wrote:
>>There's also the issue of support. Many more developers use
>>libxml2 & libxslt, so it is better tested.
>
> So you're suggesting, that especially libxslt is more compliant and
> bug free than the tDOM XSLT engine, don't you? For sure you have also
> some hard facts for this, beside the fussy marketing speech? (Visual
> Basic is used by much more developers than tcl and therefor much more
> mature than tcl, or what is the argument?)

No, I don't have a list of compliance test results for each different
processor sitting in front of me. I base my "fussy marketing speech"
on the observation that libxml2 and libxslt have bindings to (at
least) three scripting languages, Tcl, Perl and Python, and are in use
in several prominent application frameworks, eg AxKit, Gnome, so it
seems obvious that many more developers are hammering these libraries
than just those doing Tcl development.

More developers => more bug reports => more bug fixes

> Before the release of the current tDOM version 0.7.5 I run it against
> a suite of much more than 1500 xslt test files. The suite is compiled
> together from the (enormous) test suite of the xalan XSLT project
> (this alone are almost 1300 texts), the NIST XSLT test suite (more
> than 170 tests), the XSLTmark test suite, the libxslt tests, the
> various examples of Michael Kays "XSLT - Programmer Reference" book
> and a couple of other sources. I compared the tDOM results with the
> libxslt-1.0.22 results - the current version at that time, they are at
> libxslt-1.0.23 now - (and saxon, xalan-j and sablotron btw). I found,
> that libxslt had notable more failures than the tDOM xslt engine.
>
> Don't get that wrong. I confirm, that libxslt provides a good, almost
> compliant XSLT processor. (As, for example also the sablotron folks
> do.)
> Most of the libxslt failures, that I found, are related to not so
> common XSLT features or constructs and therefor libxslt will do most
> of the 'real world' XSLT right.
>
> But I state, that tDOM also provides a good, almost compliant
> XSLT processor - and probably a bit better, at least in some areas,
> than libxslt. Steve, please would you share the test results with us,
> that make you belive, it isn't this way?

At no point did I state that tDOM was *not* a compliant XSLT
processor. A lot of people seem to be getting good results from tDOM -
that's great.

>>I have no
>>comparative usage numbers for TclDOM/TclXSLT vs tDOM,
>>but you can check for yourself on the SourceForge project
>>page.
>
> The sourceforge tDOM project is only a placeholder, registered by
> Jochen more than 2 years ago, to protect the project name. Since the
> tDOM sourceforge project space was never used, it suggest, that there
> is no code and no users. Both is definitely not true ;-)

Again, I'm not saying that - it is clearly not the case. I'm saying
that it does not seem possible at this time to get *comparative* usage
data. The SF system collects this kind of data, so that makes it
easier for the TclXML project.

>>As far as performance goes, tDOM may be slightly better in
>>some circumstances but compared to the performance of Java tools
>>the difference is trivial.
>
> Hm. Steve, would you please name only a view cirumstances, for which
> tDOM isn't _at least_ slightly better than TclXML etc., as far as
> performance goes? ;-)

The point I am trying to make is that when one compares the
performance of processors written in C to those written in Java there
can be a 3-4 times improvement. A performance gain of 10-25% when
using tDOM vs libxslt is not significant *when compared to the gain
over Java*.
If some people are really concerned about squeezing an extra 10%
performance out of their XSLT stylesheets then they should probably
look at their algorithms and choice of technology rather than choice
of processing engine. IOW, XSLT is built for comfort, not for speed
(like Tcl really).

> What speed difference could be named 'trivial' depends of course on
> the viewpoint. But I, for once, would not confirm, that the speed
> difference between tDOM and TclXML compaired with the speed of Java
> tools is completely "trivial". If the java virtual machine has started
> and the classes are already loaded (the java virtual machine has
> 'warmed up'), the rough numbers are as following: Given a modern, fast
> Java dom implementation, libxml is around two times faster, tDOM with
> the expat parser 3 times and tDOM with the simple parser 4 times.

I'm not sure what you're measuring here - parsing time or processing
time? In most of my applications, the time taken to parse a document
is trivial compared to the processing time taken by the DOM app or
XSLT stylesheet.

> Another point is the memory footprint. A DOM tree needs a lot of
> memory. A libxml DOM tree needs typically more than double as much
> memory, as the tDOM DOM tree of the same document. For example, a just
> chosen random 17 MByte XML file needs clearly more than 100 MByte
> memory, as libxml DOM tree, and less then 50 MByte as tDOM DOM tree
> (numbers measured at linux). Sure, if you process only small XML files
> on a box with reasonable memory, that may not bother you much.

Well, I'd certainly dispute that claim. I just created a 17MB test
document - 17K elements each containing 1KB of text. After parsing it
into a libxml2 DOM tree the memory usage was approx. 20MB. Of course,
different kinds of documents will require different amounts of memory.
This was measured on a Mac OS X system. I'd say that's a modest
overhead for the DOM structures.
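
For anyone who wants to repeat this kind of measurement, here is a
minimal sketch. It is not the script behind the figures above. It is
Python, and it parses with the standard library's
xml.etree.ElementTree, not libxml2 or tDOM. The names build_document
and measure_parse and the element names "doc" and "item" are made up
for illustration. tracemalloc only sees allocations that go through
Python's own allocators, so the result is a rough indication and is
not comparable to process-level numbers from another parser.

```python
import tracemalloc
import xml.etree.ElementTree as ET


def build_document(n_elements=17_000, text_bytes=1024):
    """Return an XML document as bytes: n_elements children of <doc>,
    each holding text_bytes of ASCII text (hypothetical layout)."""
    payload = "x" * text_bytes
    parts = ["<doc>"]
    parts.extend(f"<item>{payload}</item>" for _ in range(n_elements))
    parts.append("</doc>")
    return "".join(parts).encode("ascii")


def measure_parse(data):
    """Parse data into a tree and return (child_count, peak_traced_bytes).

    The input buffer is allocated before tracing starts, so the peak
    covers the tree and the parser's temporaries, not the document itself.
    """
    tracemalloc.start()
    try:
        root = ET.fromstring(data)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return len(root), peak


if __name__ == "__main__":
    data = build_document()
    count, peak = measure_parse(data)
    print(f"document size: {len(data) / 1e6:.1f} MB, elements: {count}")
    print(f"peak traced memory while parsing: {peak / 1e6:.1f} MB")
    print(f"tree/document ratio: {peak / len(data):.2f}")
```

Changing n_elements and text_bytes while keeping the total size about
the same (many small elements instead of a few large ones) is a quick
way to see how much the shape of the document affects the overhead.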

> Both the raw xml parsing/DOM building speed and the memory footprint
> are results with only moderate variations. It don't really matter,
> what half-way realistic XML document you use, tDOM is faster, while
> using lesser memory.

I'm not disputing that tDOM is faster, but I'd need more convincing
that it uses (significantly) less memory.

> XPath and XSLT are much more complicated. There is clearly a greater
> variation of the results. But the overall picture - my results, please
> provide your own - is pretty clear: tDOM xslt is notable faster than
> libxslt. I would say XSLT benchmarking is still in its first
> days. XSLTmark (http://www.datapower.com/xml_community/xsltmark.html)
> is probably the most known XSLT benchmark suite, at the moment. This
> are 40 different Stylesheets, which run 10 to 100 times against
> different source documents from a few bytes up to 2 MBytes. With the
> default test configuration, the XSLT transformation time sums up for
> libxslt to around 140 seconds and for tDOM to 75 seconds.

I'd like to see more information about the methodology used, see
below.

> My to some degree extensive XSLT benchmarking in deed shows, that
> libxslt is for example mostly much faster than sablotron - well,
> sablotron isn't that bad in compliance, as already said, but that the
> sablotron folks claim at there homepage: "Sablotron is a fast [XSLT
> processor]" is ludicrous and only possible, because nobody measures by
> itself --- do you (and especially Steve: do you)? I found tDOMs xslt
> engine often cleary faster than libxslt and only very seldom the other
> way around. And there's a hole group of stylesheet/(bigger) document
> combinations, for which libxslt is ridiculous slow.

Perhaps you should report that to Daniel Veillard.

> If you like more 'real live' numbers, Simon Hefi reports
> (http://groups.yahoo.com/group/tdom/message/286), that tDOM transforms
> the almost 1,7 MByte DocBook document Securing.xml more than double as
> fast as libxslt.
> At my a bit ancient box, libxslt (xsltproc) needs
> more than 50 seconds for that transformation, and tDOM only 25. By the
> way, a "warmed up" saxon also needs only a bit more than 60
> seconds. In such cases. I think it would be possible to say: the speed
> difference of libxslt to Java tools is "trivial", compared with the
> tDOM speed.

I have seen Simon's report. The tests were not extensive, but more
importantly (as I recall) the methodology used was dubious. He ran the
tests using a "warmed-up" tDOM within a Websh environment, whereas the
libxslt processor started cold. Most of the documents tested were
fairly small, so the startup time of the libxslt process plus parse
time would have been significant. The Websh/tDOM process would not
have had those overheads. Of course, for the large, 1.7MB, document
these overheads are far less significant. There may be many reasons as
to why tDOM has a 2X performance gain over libxslt, but there would
have to be a more extensive test suite than just one document to draw
the conclusion that tDOM is always that much faster than libxslt.

>>Finally, at least for TclXML & TclDOM there is a pure-Tcl
>>version for completely compilation-free deployment.
>
> It should not be suppressed, that especially the scripted TclDOM has
> its limits with respect to speed and its immens memory demand.

The pure-Tcl implementations of TclXML and TclDOM are around 20 times
slower than any C version. That's no secret. It often amazes me that
anyone has any interest in these implementations at all (I originally
wrote them purely for research purposes), but it turns out that being
able to use an extension-free XML environment is very useful for some
purposes.

> Don't get me wrong. For private use and a small 'content management
> system' I guess both packages are probably very well suited; choose,
> whatever you like more. Steve, greased with the words, as it is, has
> in deed some tutorial material, which may help you on track.
> For sure
> an additional convenient plus for TclXML et al. is, that it is
> included in the ActiveTcl (for windows: probably not so far in the
> future) distributions. The long mail is mostly because Steve tends to
> get the numbers a bit wrong.

I thought the long mail was because the tDOM implementors seem to have
a big chip on their shoulders and feel the need to prove something...

[sorry folks - I feel myself starting to rant and I shouldn't succumb
to the temptation to make personal attacks]

For quite some time (three years?) I've been frustrated at the lack of
cooperation between these two projects. It seems pointless to force
developers to make a choice, but I've long since given up trying to
reconcile them.

Steve Ball
--
Steve Ball            | XSLT Standard Library     | Training & Seminars
Zveno Pty Ltd         | Web Tcl Complete          | XML XSL Schemas
http://www.zveno.com/ | TclXML TclDOM             | Tcl, Web Development
Ste...@zv...          +---------------------------+---------------------
Ph. +61 2 6242 4099   | Mobile (0413) 594 462     | Fax +61 2 6242 4099