[Htmlparser-developer] Integration Release 1.3-20020112 is out
From: Somik R. <so...@ya...> - 2003-01-13 04:50:15
Hi Folks,

This week's integration release is out. This release has significant contributions from Derrick Oswald and Josh Kerievsky. Derrick is building a nice UI for the parser and making tons of improvements. Thanks to Josh's insight, we have done some major refactorings on the scanners, resulting in a massive drop in code duplication.

Here are some statistics: the scanners package in the last release had 1693 lines of code. In the current release, this has dropped to 1300 lines of code.

We have a new class, HTMLCompositeTagScanner, which does the hard work of picking up child tags. Most scanners use this code. HTMLTagScanner also does some useful work, and from this release, new scanners don't need to override evaluate() or scan(). Take a look at the refactored scanner code and you might be surprised by its size and simplicity.

Here's the change log:

Integration build 1.3 - 20030112
--------------------------------
[1] Assume charset is correct for JVMs without Charset class to check it
[2] Beanize the parser
[3] Switch to swingui junit runner by default
[4] Half baked beans
[5] Fix javadoc warnings in JDK 1.4
[6] Added StringFindingVisitor + test code + new visitors packages
[7] Fixed bug 659723, but HTMLStringNode is not thread-safe anymore.
[8] JDK 1.2 compilability
[9] Modified HTMLEnumeration interface (made less verbose)
[10] Added HTMLCompositeTagScanner
[11] Refactored following scanners to use HTMLCompositeTagScanner:
     (i) HTMLStyleScanner
     (ii) HTMLSelectScanner
     (iii) HTMLFrameSetScanner
     (iv) HTMLTitleScanner
     (v) HTMLTextAreaScanner
     (vi) HTMLScriptScanner
     (vii) HTMLFrameSetScanner
[12] Made StringNode the last parse attempt, so the Reader now tries in this order: remark, tag, endtag, string (this will return more HTMLStringNode objects than it did before).
[13] Improve speed by performing tag/string triage based on '<' as next character.
[14] Refactored HTMLTagScanner. The following scanners use refactored code:
     (i) HTMLBaseHREFScanner
     (ii) HTMLDoctypeScanner
     (iii) HTMLFrameScanner
     (iv) HTMLJspScanner
     (v) HTMLMetaTagScanner

Regards,
Somik