Thread: [Htmlparser-cvs] htmlparser/docs changes.txt,1.196,1.197 contributors.html,1.5,1.6 release.txt,1.55,
From: <der...@us...> - 2004-02-16 22:54:28
Update of /cvsroot/htmlparser/htmlparser/docs
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv25002/docs

Modified Files:
    changes.txt contributors.html release.txt
Log Message:
Update version to 1.4-20040216.

Index: changes.txt
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/docs/changes.txt,v
retrieving revision 1.196
retrieving revision 1.197
diff -C2 -d -r1.196 -r1.197
*** changes.txt 26 Jan 2004 01:01:56 -0000 1.196
--- changes.txt 16 Feb 2004 22:46:07 -0000 1.197
***************
*** 13,16 ****
--- 13,77 ----
  *******************************************************************************
  
+ Integration Build 1.4 - 20040216
+ --------------------------------
+ 
+ 2004-02-11 07:37 derrickoswald
+ 
+     * docs/contributors.html, src/org/htmlparser/beans/StringBean.java:
+ 
+     Incorporate patch from Nick Burch to make StringBean a NodeVisistor for other parsers.
+     See task #93155 StringBean driven by visitor.
+ 
+ 2004-02-08 21:09 derrickoswald
+ 
+     * build.xml, src/org/htmlparser/lexer/nodes/Attribute.java,
+     src/org/htmlparser/lexer/nodes/TagNode.java,
+     src/org/htmlparser/tests/tagTests/TagTest.java,
+     src/org/htmlparser/tests/utilTests/CharacterTranslationTest.java,
+     bin/translate, bin/translate.bat,
+     src/org/htmlparser/util/CharacterReference.java,
+     src/org/htmlparser/util/Generate.java,
+     src/org/htmlparser/util/Translate.java,
+     src/org/htmlparser/util/package.html:
+ 
+     Rework character entity translation.
+     See task 58599 enhance character reference translation.
+     Decode now handles missing semi colons, encoding is more efficient,
+     hexadecimal numeric character entity references are handled and
+     both encoding and decoding make minimal use of substring().
+     Augmented the tests in CharacterTranslationTest significantly, and
+     merged the Generate class into the tests.
+     Added translate command scripts in bin, which read from stdin and write to stdout.
+ 
+ 2004-02-07 07:53 derrickoswald
+ 
+     * src/org/htmlparser/: lexer/Lexer.java,
+     tests/lexerTests/AttributeTests.java:
+ 
+     Fix bug #891058 Bug in lexer.
+     Patch submitted by Gernot Fricke.
+     This change causes attribute parsing to be more 'greedy' resulting in 'empty' attributes
+     consuming the next attribute. This brings the lexer parsing more in line with other
+     (browser) interpretations and simplifies it immensely.
+ 
+ 2004-01-31 15:51 derrickoswald
+ 
+     * src/org/htmlparser/lexer/Page.java:
+ 
+     Compare encoding names without case sensitivity.
+     From HTML spec (http://www.w3.org/TR/html4/charset.html section 5.2.1):
+     Names for character encodings are case-insensitive, so that for
+     example "SHIFT_JIS", "Shift_JIS", and "shift_jis" are equivalent.
+     and from to IANA(http://www.iana.org/assignments/character-sets):
+     The character set names may be up to 40 characters taken from the
+     printable characters of US-ASCII. However, no distinction is made
+     between use of upper and lower case letters.
+ 
+ 2004-01-31 11:31 derrickoswald
+ 
+     * src/doc-files/: overview.html, todo.html:
+ 
+     Move ToDo list to SourceForge trackers and tasks.
+ 
  Integration Build 1.4 - 20040125
  --------------------------------

Index: contributors.html
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/docs/contributors.html,v
retrieving revision 1.5
retrieving revision 1.6
diff -C2 -d -r1.5 -r1.6
*** contributors.html 11 Feb 2004 12:37:52 -0000 1.5
--- contributors.html 16 Feb 2004 22:46:08 -0000 1.6
***************
*** 353,360 ****
  </tr>
  </table>
! <p>Thanks to Stephen Harrington, Domenico Lordi, Kamen, John Zook, Nick Burch,
! Cheng Jun, Mazlan Mat, Rob Shields, Wolfgang Germund, Raj Sharma, Robert Kausch,
! Gordon Deudney, Serge Kruppa, Roger Kjensrud, Rodney S Foley and Manpreet Singh
! for suggestions, bug reports and feature ideas.
  <br>
  </body>
--- 353,360 ----
  </tr>
  </table>
! <p>Thanks to Gernot Fricke, Nick Burch, Stephen Harrington, Domenico Lordi, Kamen,
! John Zook, Cheng Jun, Mazlan Mat, Rob Shields, Wolfgang Germund, Raj Sharma,
! Robert Kausch, Gordon Deudney, Serge Kruppa, Roger Kjensrud, Rodney S Foley
! and Manpreet Singh for suggestions, bug reports and feature ideas.
  <br>
  </body>

Index: release.txt
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/docs/release.txt,v
retrieving revision 1.55
retrieving revision 1.56
diff -C2 -d -r1.55 -r1.56
*** release.txt 26 Jan 2004 01:02:09 -0000 1.55
--- release.txt 16 Feb 2004 22:46:08 -0000 1.56
***************
*** 1,3 ****
! HTMLParser Version 1.4 (Integration Build Jan 25, 2004)
  *********************************************
  
--- 1,3 ----
! HTMLParser Version 1.4 (Integration Build Feb 16, 2004)
  *********************************************
  
***************
*** 21,24 ****
--- 21,29 ----
  Changes since Version 1.3
  -------------------------
+ Translation
+     Character entity encoding and decoding has been revamped, leading to
+     higher throughput and less memory churn.
+ Beans
+     The StringBean can now be used as a visitor for parsers external to the bean.
  Decorators
      The node decorator package has been added to provide support for the
***************
*** 57,63 ****
--- 62,71 ----
  Applications
      New example applications Thumbelina and SiteCapturer.
+     A mainline has been added to the Translate class to encode/decode stdin to
+     stdout.
  
  Bug Fixes
  ---------
+ 891058 Bug in lexer
  865279 Documentation
  851882 zero length alt tag causes bug in ImageScanner
***************
*** 121,124 ****
--- 129,135 ----
  [26] Stephen Nightingale
  [27] Donnla Nic Gearailt
+ [28] Pim Schrama
+ [29] Nick Burch
+ [30] Gernot Fricke
  
  If you find any bugs, please go to
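
The 2004-02-08 entry in changes.txt describes decoding that tolerates missing semicolons, handles hexadecimal numeric character references, and makes minimal use of substring(). A minimal Java sketch of that idea follows. It is not the code in org.htmlparser.util.Translate: the class NumericReferenceSketch and the method decodeNumeric are hypothetical names chosen for illustration, and only numeric references are handled (named entities such as &amp;amp; pass through unchanged).

```java
// Hypothetical illustration only; not the htmlparser implementation.
class NumericReferenceSketch {

    /**
     * Decodes numeric character references in text: decimal (&#233;),
     * hexadecimal (&#xE9;), and either form without the trailing semicolon.
     * Works by index scanning, so no substring() calls are needed.
     */
    static String decodeNumeric(String text) {
        int n = text.length();
        StringBuilder out = new StringBuilder(n);
        int i = 0;
        while (i < n) {
            char c = text.charAt(i);
            if (c == '&' && i + 1 < n && text.charAt(i + 1) == '#') {
                int j = i + 2;
                int radix = 10;
                if (j < n && (text.charAt(j) == 'x' || text.charAt(j) == 'X')) {
                    radix = 16;
                    j++;
                }
                int start = j;
                int value = 0;
                // Accumulate digits; stop early once the value is out of range
                // so the int cannot overflow.
                while (j < n
                        && Character.digit(text.charAt(j), radix) >= 0
                        && value <= Character.MAX_CODE_POINT) {
                    value = value * radix + Character.digit(text.charAt(j), radix);
                    j++;
                }
                if (j > start && Character.isValidCodePoint(value)) {
                    out.appendCodePoint(value);
                    // The semicolon is optional: consume it only if present.
                    if (j < n && text.charAt(j) == ';') {
                        j++;
                    }
                    i = j;
                    continue;
                }
            }
            // Not a decodable reference: copy the character through unchanged.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeNumeric("caf&#233 au lait, caf&#xE9; noir"));
    }
}
```

References that cannot be decoded (no digits after "&#", or a value beyond the Unicode range) are copied through as literal text rather than dropped, which is one reasonable choice for a lenient decoder; the actual library may behave differently.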