[Htmlparser-cvs] htmlparser/docs changes.txt,1.199,1.200 release.txt,1.58,1.59
From: Derrick O. <der...@us...> - 2004-05-22 12:09:10
Update of /cvsroot/htmlparser/htmlparser/docs
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv7046/docs

Modified Files:
    changes.txt release.txt
Log Message:
Update version to 1.5-20040522

Index: release.txt
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/docs/release.txt,v
retrieving revision 1.58
retrieving revision 1.59
diff -C2 -d -r1.58 -r1.59
*** release.txt 14 Mar 2004 16:31:40 -0000 1.58
--- release.txt 22 May 2004 12:08:59 -0000 1.59
***************
*** 1,3 ****
! HTMLParser Version 1.4 (Release Build Mar 14, 2004)
  *********************************************
--- 1,3 ----
! HTMLParser Version 1.5 (Integration Build May 22, 2004)
  *********************************************
***************
*** 19,108 ****
  (v) this file
! Changes since Version 1.3
  -------------------------
! Translation
! Character entity encoding and decoding has been revamped, leading to
! higher throughput and less memory churn.
! Beans
! The StringBean can now be used as a visitor for parsers external to the bean.
! Decorators
! The node decorator package has been added to provide support for the
! delegate model.
! Lexer
! A new lexer i/o subsystem has been added. This provides accurate line number
! and character position data, tag and attribute names maintain their original
! case, and attributes maintain their original order. Line numbers reported by
! tags are now zero based, not one based. The node count for parsing goes up
! in most cases because whitespace is strictly maintained, i.e. every
! whitespace (i.e. newline) now counts as a StringNode too. Storage of
! attributes is now in a Vector which means the element 0 Attribute is
! actually the name of the tag, rather than having the $TAGNAME entry in a
! HashTable. The htmllexer.jar is this new i/o subsystem broken out and made
! JDK 1.1 compliant, the htmlparser.jar, which includes everything in
! htmllexer.jar, is not necessarily intended to be used in JDK 1.1
! environments. Some support for JIS escape sequences has been added.
! Tags
! Zero arg tag constructors have been added. Attribute maintenance
! (add/remove/edit) improved. There is no EndTag class any more. Just a
! generic tag that responds true to isEndTag(). Improvements to form tag
! handling, getting <input> and <textarea> tags nested within other tags.
! Improvements to applet tag handling regarding parameters and codebases.
! Scanners
! The concept of scanners has been completely reworked. Applications register
! tags not scanners to express interest in parsing only some tags. The default
! is now to parse all tags, which is equivalent to the old registerDOMTags(),
! so some extra nesting of tags will need to be handled. CompositeTagScanner
! logic has been improved to try and match unclosed open tags when an
! unexpected end tag is encountered. This change also moved recursion off the
! JDK stack, eliminating most StackOverflow exceptions. Also, a CompositeTag's
! "startTag()" is "this", and the CompositeTagScanner just adds children.
! The ScriptScanner will now decrypt Microsoft Script Encoder encrypted script
! tags. The plaintext is available via ScriptTag.getScriptCode().
  Filters
! A new powerful filtering capability has been added, which makes extracting
! specific tags very easy.
! Applications
! New example applications Thumbelina and SiteCapturer.
! A mainline has been added to the Translate class to encode/decode stdin to
! stdout.
  Bug Fixes
  ---------
! 911565 isValued() and isEmpty() don't work
! 902121 StringBean throws NullPointerException.
! 900128 RemarkNode.setText() does not set Text
! 900125 Style Tag Children not grouped
! 899413 bug in javascript end detection.
! 891058 Bug in lexer
! 865279 Documentation
! 851882 zero length alt tag causes bug in ImageScanner
! 839264 toHtml() parse error in Javascripts with "form" keyword
! 833592 DOCTYPE element is not parsed correctly
! 832530 empty attribute causes parser to fail
! 826764 ParserException occurs only when using setInputHTML() instea
! 825820 Words conjoined
! 825645 <input> not getting parsed inside table
! 813838 links not parsed correctly
! 805598 attribute src in tag img sometimes not correctly parsed
! 801118 two " characters at the end of an attribute value problem
! 798554 Applet Tag does not update codebase data
! 798553 setInputHtml does not set text
! 798552 Sample for node iterator incorrect
! 789439 Japanese page causes OutOfMemory Exception
! 788746 parser crashes on comments like <!-- foobar --!>
! 786869 LinkExtractor Sample not working
! 784767 irc://server/channel urls are HTTPLike?
! 778781 SRC-attribute suppression in IMG-tags
! 772700 Jsp Tags are not parsed correctly when in quoted attributes
! 765413 typo
! 761798 Error reading next element.
! 757337 Standalone attributes should remain standalone
! 755929 Empty string attr. value causes attr parsing to be stopped
! 753012 IMG SRC not parsed v1.3 & v1.4
! 753003 <IMG> within <A> missed when followed by <MAP>
! 750117 StackOverFlow while Node-Iteration
! 749295 Problem Parsing Table
! 745566 StackOverflowError on select with too many unclosed options
! 744610 getLink() Erroneous for Relative Links from Files on Windows
  Acknowledgements
--- 19,41 ----
  (v) this file
! Changes since Version 1.4
  -------------------------
! Configuration Management
! Removed the need for the Translate class to be packaged with htmllexer.jar.
! This results in a lighter weight component.
! Refactoring
! Added Tag interface. Obviated LinkProcessor and moved it's functionality to
! the Page class.
  Filters
! Added CssSelectorNodeFilter.
!
! Enhancement Requests
! --------------------
! 943593 LinkProcessor.extract(link,base) weird behaviour?
  Bug Fixes
  ---------
! 919738 Text has not been extracted correctly using StringBean
! 936392 ScriptTag visitor fails for comments with '
  Acknowledgements
***************
*** 140,143 ****
--- 73,78 ----
  [30] Gernot Fricke
  [31] Anthony Labarre
+ [32] Alberto Nacher
+ [33] Rogers George
  If you find any bugs, please go to

Index: changes.txt
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/docs/changes.txt,v
retrieving revision 1.199
retrieving revision 1.200
diff -C2 -d -r1.199 -r1.200
*** changes.txt 14 Mar 2004 16:31:39 -0000 1.199
--- changes.txt 22 May 2004 12:08:57 -0000 1.200
***************
*** 11,3582 ****
  * http://www.red-bean.com/cvs2cl/changelogs.html *
  * *
  *******************************************************************************
! Release Build 1.4 - 20040314
! --------------------------------
!
! 2004-03-14 10:53 derrickoswald
!
! * src/org/htmlparser/beans/LinkBean.java:
[...3685 lines suppressed...]
src/org/htmlparser/tests/tagTests/BaseHrefTagTest.java,
src/org/htmlparser/tests/utilTests/AllTests.java,
src/org/htmlparser/tests/utilTests/HTMLLinkProcessorTest.java,
! src/org/htmlparser/util/LinkProcessor.java:
! Deprecate LinkProcessor.
! Functionality moved to Page.
! 2004-03-15 17:50 derrickoswald
! * src/doc-files/building.html:
! Update build instruction problem identified by sarsie.
! 2004-03-14 15:31 derrickoswald
! * build.xml, src/org/htmlparser/lexer/nodes/Attribute.java,
! src/org/htmlparser/lexer/nodes/TagNode.java:
! Remove requirement for Translate.class to be in htmllexer.jar.