Jaroslav Kotrc

Use-case study notes

XML Parsers

To demonstrate the usefulness of SPL, a project is needed whose value depends on its performance. In addition, there should be no known algorithm that solves its problem efficiently; otherwise there would be little room for performance improvements. That is why XML parsing was chosen. XML parsers have to be fast, so their developers keep working on performance, and because they are not based on a single exact algorithm, every line of code can significantly influence performance.

JDOM

JDOM is, quite simply, a Java representation of an XML document. JDOM provides a way to represent that document for easy and efficient reading, manipulation, and writing. It has a straightforward API, is lightweight and fast, and is optimized for the Java programmer. It's an alternative to DOM and SAX, although it integrates well with both DOM and SAX.

Its code is kept in a Git repository. The initial commit is dated 28.5.2000 and the latest commit (as of 30.10.2012) is dated 28.9.2012, so the project is more than 12 years old and still active.

The project is quite well commented, in both the source code and the commit messages, and has many commits intended to improve performance.

Interesting commits:

Commit hash Commit date Commit information
53dc235cd8d2d0294e682f4cc8747b91d2d44bf1 30.7.2000 lazy allocation can change performance
7cd2f2575506d99ddea73a4a8aeeacb069dcb28d 4.12.2000 improved performance by getting object by index instead of reference
a994efea91a8b2ba09c1986e8ca9ab7f6c957f07 14.12.2000 performance enhancement to ResultSetBuilder by caching column names
da4952181a1b15cda1359eda9abc1aa97826a05b 16.12.2000 speed-up by removing a list construction
2466f3a6e36da2842915692279ee445823f68148 19.12.2000 interesting heuristic for character checking
629c0e23b1a066ead5d95d72e817c6fd6ca51509 3.4.2001 Performance enhancement for files with namespaces - just added condition
ea6446dec24a71afea62e8f90f8f7ade3619749d 13.4.2001 Optimization in getQualifiedName() by better StringBuffer use
42165aa4a3f64ebccde001cffe2143482dad4637 9.5.2001 setting the declaration handler if expansion is off
eacc01dce56171e27cee05e42bc4be13e429d17f 9.5.2001 Element - remove attribute only when necessary - 8% time savings during a fresh build
9eb684a399561e00c43c9ac1a2f2c15df6357c0d 19.5.2001 Namespace - uses StringBuffer instead of String concatenation
8d9fcb03f181acc24e343ba458b1704fd046b11a 14.6.2001 Removed unnecessary namespace adding in SAXHandler
8fe2180347230c532f51cd6ddef00579ed32b702 22.6.2001 Removes all namespaces at once instead of one by one during iteration in SAXHandler
95dae43a6c9b6bf06b8de7bde600d386a7230402 16.8.2001 Element's getText now returns CDATA directly as text
a289181e2c2d6f227ac4cba643f5434e3dff3c32 17.8.2001 SAXHandler optimization reduces by 5% the parsing time when using Crimson
4e2753539ad1217775e3b79dcb4cc7d1326a7798 11.12.2001 adds FilterList functionality instead of using the slow and broken PartialList
d3e55ebe8d0a81e549323536f9b751b28ed9b0ed 25.1.2002 patch - no blazing speed improvements, code smaller and a lot easier to understand
e6788dc0a5b20ed4e11ae3d7b154b7fdf45050ec 26.1.2002 performance patch - SAXHandler uses StringBuffer for Text and CDATA to buffer string building more effectively
b154412b86bec08a666adec87dfd40718e6d3458 27.1.2002 Parsing of documents with lots of entities (like HTML when kept as element text content) is now MUCH faster - reordered method call, removed unnecessary call
fe65c7824f126f94f1db98bf9a8a22c76d76bcd3 29.1.2002 improves the performance of SAXHandler through better use of StringBuffer
4bb5ff061fce399b312aa60b8b6d0e1a58883614 7.4.2002 small improvement by reducing the number of calls to size()
519928b5aa43b33540e4bc26bab355387ba5c4a0 27.2.2003 SAXBuilder support reusing parser instance
b5f12999b0b4634bda8bafd74136b787c952a4c0 2.4.2003 faster isWhitespace() impl
5fcb88eafb33002c7650bef6bddf8e8bc142ae0b 14.4.2003 Verifier.checkXMLName() does not check first letter twice
82285a14b87bab6b01d067159542334bcfadd44d 31.5.2003 ContentList.add uses 1 call instead of 2 to determine whether an object was attached
8e59927ffd7586653a6f99e6c1bce904cd31f71f 5.2.2004 Simplification of the matches() in ElementFilter
c8550aa3c3828644f1a5b2802b9546312ad9a24f 27.2.2004 Changed append("c") to append('c') in SAXHandler and Element
c00e22be5cd592fc94824d27a93313c79fa6012f 28.2.2004 Added UncheckedJDOMFactory - without doing any content or structure checks - 100% confidence in your parser
abf7a9baab661dcc98d9e95e7772a31886c3c60a 11.12.2004 Namespace performance improved - 10-15% speedups in build time for namespace heavy docs
b48373d95cd91986f83ac5345566258d80e88027 15.8.2006 changed almost all local variables and parameters to final in Element - optimization?
da88da1d05c0f5f52b68a5a6f9cd60fcd34bbc5d 16.11.2006 Minor no-op changes - finalized parameters, local variables
a24fb4bd1b0e791f0e28852e0a98bc6a288c7e66 9.11.2007 rewrite of the ListIterator subclass - not sure if performance is affected
2e3adcb0b99ec4e8a84b2e43d30967d2d64309b8 18.12.2008 Adding synchronization to Namespace hashtable get/put - could lower performance
0a7649207bbaaa5f5acf1436eb10381cd0fe9026 2.8.2011 Create an initial build.xml after third-party Jars updated
053e5ed1fd85bf84d7dc4d43bb243b9eb8660b8c 2.10.2011 Lot of refactoring, can it affect performance?
932cfb2148cbb718a8bd19ebb89d9233e62b5f12 11.10.2011 lazy ContentList deals with iterators and Filters - performance upgrade
412170566ebdf8449b442e44f12ed8712d447a19 17.10.2011 Improve performance of iteration in the Array and Content lists
02e2a6585648d5dbf9f8ce1118ce93617e55691f 18.10.2011 Attribute and Content Lists implement RandomAccess - better performance for generic algorithms
f589aa24116b29f72ba4bc1776b14f478a572d17 23.10.2011 Improve the hashCode for NamespaceKey which substantially reduces the number of times that the equals() method is called
751676c8e035b9e8fef3d128ee21e154bfc9d0e8 23.11.2011 Parser reuse is much more efficient
5104fea19e42ab7d6085b9039acd7090c7521109 16.12.2011 Re-work the JAXPDOMAdapter to be much more efficient, and it no longer uses reflection
acac98ce6330def6ee3ed9591ac0d820fc9f46ee 21.3.2012 Improve the performance of the non-raw 'Walker' classes by skipping unnecessary test
3cea635cf86d71a2eb37e74223f73faffae4b5dd 22.3.2012 Improve performance of FormatStack significantly
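
Several of the commits above rely on the same few micro-optimizations: building strings in a buffer instead of concatenating, appending a char instead of a one-character String, and reducing repeated size() calls. The following is a minimal sketch of these ideas, not code taken from JDOM. The class and method names are hypothetical, and it uses StringBuilder, the unsynchronized successor of the StringBuffer that the 2001-2004 commits mention.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; none of these names come from JDOM. */
final class StringBuildingSketch {

    /** Slow variant: each += allocates a new String and copies the old content. */
    static String qualifiedNameConcat(String prefix, String localName) {
        String result = "";
        if (!prefix.isEmpty()) {
            result += prefix;
            result += ":";
        }
        result += localName;
        return result;
    }

    /** Faster variant: one pre-sized buffer and a char literal instead of ":". */
    static String qualifiedNameBuffered(String prefix, String localName) {
        if (prefix.isEmpty()) {
            return localName; // nothing to build, reuse the existing String
        }
        StringBuilder sb = new StringBuilder(prefix.length() + 1 + localName.length());
        sb.append(prefix).append(':').append(localName);
        return sb.toString();
    }

    /** Joins text chunks; size() is read once instead of on every iteration. */
    static String joinText(List<String> chunks) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0, n = chunks.size(); i < n; i++) {
            sb.append(chunks.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(qualifiedNameBuffered("xs", "element"));
        List<String> chunks = new ArrayList<>();
        chunks.add("a &");
        chunks.add(" b");
        System.out.println(joinText(chunks));
    }
}
```

Both qualified-name variants return the same result; whether the buffered one is measurably faster depends on the JVM and the workload, which is exactly the kind of claim a performance test would have to confirm.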

The newest performance enhancement is Issue #92 (Verifier performance), a group of commits intended to improve the performance of the Verifier, whose code is critical to the overall performance of JDOM. More information is available in the issue, where test data is also provided for download.
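
To illustrate why per-character checks of this kind are performance-critical, here is a minimal sketch of one common way to make them cheap: a lookup table that is filled once and then indexed by character. It is an assumption for illustration only and not necessarily what the Issue #92 commits do. The class NameCheckSketch and its methods are hypothetical, and it covers only a simplified ASCII subset of XML names (no colon, no non-ASCII letters).

```java
/** Hypothetical sketch; not JDOM's Verifier. Handles ASCII name characters only. */
final class NameCheckSketch {

    private static final int START = 1; // character may start a name
    private static final int PART = 2;  // character may follow the first one

    // One flag entry per ASCII code point, filled once at class load time.
    private static final int[] FLAGS = new int[128];

    static {
        for (char c = 'a'; c <= 'z'; c++) FLAGS[c] = START | PART;
        for (char c = 'A'; c <= 'Z'; c++) FLAGS[c] = START | PART;
        for (char c = '0'; c <= '9'; c++) FLAGS[c] = PART;
        FLAGS['_'] = START | PART;
        FLAGS['-'] = PART;
        FLAGS['.'] = PART;
    }

    static boolean isNameStart(char c) {
        return c < 128 && (FLAGS[c] & START) != 0;
    }

    static boolean isNamePart(char c) {
        return c < 128 && (FLAGS[c] & PART) != 0;
    }

    /** Checks the first character once, then the rest starting at index 1. */
    static boolean isSimpleName(String name) {
        if (name.isEmpty() || !isNameStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1, n = name.length(); i < n; i++) {
            if (!isNamePart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isSimpleName("element"));
        System.out.println(isSimpleName("1abc"));
    }
}
```

Because a check like this runs for every character of every element and attribute name, replacing a chain of range comparisons with a single array read can matter; a real verifier would also have to accept the non-ASCII characters that XML permits in names.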


Related

Wiki: ProjectNotes
