htmlparser-cvs Mailing List for HTML Parser (Page 4)
Brought to you by:
derrickoswald
From: Derrick O. <der...@us...> - 2006-03-20 00:03:00
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/tagTests
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18176

Modified Files:
    BodyTagTest.java
Log Message:
Fix unit test for body tag.

Index: BodyTagTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/tagTests/BodyTagTest.java,v
retrieving revision 1.21
retrieving revision 1.22
diff -C2 -d -r1.21 -r1.22
*** BodyTagTest.java 2 Jul 2004 00:49:31 -0000 1.21
--- BodyTagTest.java 20 Mar 2006 00:02:50 -0000 1.22
***************
*** 74,81 ****
      }

-     public void testToString() throws ParserException {
-         assertEquals("Body","BODY: Yahoo!",bodyTag.toString());
-     }
-
      public void testAttributes ()
      {
--- 74,77 ----
From: Derrick O. <der...@us...> - 2006-03-19 22:13:59
Update of /cvsroot/htmlparser/htmlparser/docs
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv5238

Modified Files:
    panel.html
Log Message:
Fix name of current build.

Index: panel.html
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/docs/panel.html,v
retrieving revision 1.9
retrieving revision 1.10
diff -C2 -d -r1.9 -r1.10
*** panel.html 19 Mar 2006 22:03:56 -0000 1.9
--- panel.html 19 Mar 2006 22:13:52 -0000 1.10
***************
*** 21,25 ****
  <p><strong>Downloads</strong></p>
  <ul>
! <li> <a href="http://sourceforge.net/project/showfiles.php?group_id=24399&package_id=47712" target="_parent">Version 1.4</a></li>
  <li> <a href="http://sourceforge.net/project/showfiles.php?group_id=24399&package_id=17243" target="_parent">Old Releases</a></li>
  <li> <a href="http://cvs.sourceforge.net/viewcvs.py/htmlparser/htmlparser/" target="_parent">CVS Repository</a></li>
--- 21,25 ----
  <p><strong>Downloads</strong></p>
  <ul>
! <li> <a href="http://sourceforge.net/project/showfiles.php?group_id=24399&package_id=47712" target="_parent">Version 1.6 (current)</a></li>
  <li> <a href="http://sourceforge.net/project/showfiles.php?group_id=24399&package_id=17243" target="_parent">Old Releases</a></li>
  <li> <a href="http://cvs.sourceforge.net/viewcvs.py/htmlparser/htmlparser/" target="_parent">CVS Repository</a></li>
From: Derrick O. <der...@us...> - 2006-03-19 22:04:05
Update of /cvsroot/htmlparser/htmlparser In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv1074 Modified Files: build.xml Log Message: Fix bug #1363500 http://htmlparser.sourceforge.net/bug.html is wrong Take down the wiki. Index: build.xml =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/build.xml,v retrieving revision 1.81 retrieving revision 1.82 diff -C2 -d -r1.81 -r1.82 *** build.xml 26 Sep 2005 01:01:22 -0000 1.81 --- build.xml 19 Mar 2006 22:03:56 -0000 1.82 *************** *** 6,11 **** Release Procedure - cd htmlparser - - 'ant wiki' captures the PhpWiki from http://htmlparser.sourceforge.org/docs/wiki to - the docs/wiki directory (except for indirect image references). - set environment variables CVSROOT and CVS_RSH (see changelog task) - 'ant changelog' generates htmlparser/ChangeLog (this will be changed to use the previous version tag someday) --- 6,9 ---- *************** *** 20,24 **** - perform a CVS update on htmlparser/ to identify new and changed files - commit changed files (i.e. Parser.java, docs/release.txt, docs/changes.txt, ! docs/release.txt and docs/wiki) to the head revision using a reason of the form: Update version to 1.5-20040522. - use CVS to tag the current head revisions with a name like v1_5_20040522. --- 18,22 ---- - perform a CVS update on htmlparser/ to identify new and changed files - commit changed files (i.e. Parser.java, docs/release.txt, docs/changes.txt, ! and docs/release.txt) to the head revision using a reason of the form: Update version to 1.5-20040522. - use CVS to tag the current head revisions with a name like v1_5_20040522. *************** *** 69,73 **** Update the Web Site - - remove the local docs/wiki directory - create a tarball of the docs directory tar -tf docs.tar --- 67,70 ---- *************** *** 90,98 **** mv ../olddocs/performance . mv ../olddocs/test . - mv ../olddocs/wiki . 
- - edit the panel.html file to change the target of the Wiki link from - wiki/index.html - to - wiki/index.php - delete the old htmldocs directory rm -rf ../oldhtdocs --- 87,90 ---- *************** *** 115,119 **** <property name="classes" value="${src}"/> <property name="docs" value="docs"/> - <property name="wiki" value="${docs}/wiki"/> <property name="bin" value="bin"/> <property name="lib" value="lib"/> --- 107,110 ---- *************** *** 167,171 **** </target> ! <target name="JDK1.4"> <condition property="JDK1.4"> <or> --- 158,162 ---- </target> ! <target name="JDK_OK"> <condition property="JDK1.4"> <or> *************** *** 174,180 **** </or> </condition> </target> ! <target name="JDK_Warning" unless="JDK1.4"> <echo message="***************************************************"/> <echo message="* WARNING: The detected JDK version is not 1.4! *"/> --- 165,176 ---- </or> </condition> + <condition property="JDK1.5"> + <or> + <equals arg1="1.5" arg2="${ant.java.version}"/> + </or> + </condition> </target> ! <target name="JDK_Warning" unless="JDK_OK"> <echo message="***************************************************"/> <echo message="* WARNING: The detected JDK version is not 1.4! *"/> *************** *** 369,377 **** <!-- Create the Thumbelina jar --> ! <target name="thumbelina" depends="JDK1.4,jarlexer" description="create thumbelina.jar" if="JDK1.4"> <!-- Create the lib directory --> <mkdir dir="${lib}"/> <mkdir dir="${classes}"/> ! <javac srcdir="${src}" destdir="${classes}" debug="on" classpath="${classes}:${lib}/htmllexer.jar" source="1.3"> <include name="org/htmlparser/lexerapplications/thumbelina/**/*.java"/> </javac> --- 365,373 ---- <!-- Create the Thumbelina jar --> ! <target name="thumbelina" depends="JDK_OK,jarlexer" description="create thumbelina.jar" if="JDK1.5"> <!-- Create the lib directory --> <mkdir dir="${lib}"/> <mkdir dir="${classes}"/> ! 
<javac srcdir="${src}" destdir="${classes}" debug="on" classpath="${classes}:${lib}/htmllexer.jar" source="1.5"> <include name="org/htmlparser/lexerapplications/thumbelina/**/*.java"/> </javac> *************** *** 388,392 **** <!-- Create the FilterBuilder jar --> ! <target name="filterbuilder" depends="JDK1.4,jarparser" description="create filterbuilder.jar" if="JDK1.4"> <!-- Create the lib directory --> <mkdir dir="${lib}"/> --- 384,388 ---- <!-- Create the FilterBuilder jar --> ! <target name="filterbuilder" depends="JDK_OK,jarparser" description="create filterbuilder.jar" if="JDK1.4"> <!-- Create the lib directory --> <mkdir dir="${lib}"/> *************** *** 436,467 **** </target> - <!-- Delete the files gathered from the wiki. --> - <target name="cleanwiki" description="delete local wiki files"> - <!-- Delete the content, leave the CVS files. --> - <!-- This is done so deleted wiki pages are not left orphaned in CVS. --> - <delete failonerror="false"> - <fileset dir="${wiki}"> - <filename name="**/*"/> - <not> - <filename name="*CVS*"/> - </not> - </fileset> - </delete> - </target> - - <!-- Capture the wiki --> - <target name="wiki" depends="jar,cleanwiki" description="capture the wiki"> - <java classname="org.htmlparser.parserapplications.WikiCapturer" fork="yes" failonerror="yes"> - <classpath> - <pathelement location="${lib}/htmlparser.jar"/> - </classpath> - <arg value="http://htmlparser.sourceforge.net/wiki/"/> - <arg value="${wiki}"/> - <arg value="true"/> - </java> - </target> - <!-- Create the javadoc for the project --> ! <target name="javadoc" depends="JDK1.4,JDK_Warning,init" description="create JavaDoc (API) documentation"> <mkdir dir="${classes}"/> <javac srcdir="${resources}" includes="HtmlTaglet.java" classpath="${classes}"/> --- 432,437 ---- </target> <!-- Create the javadoc for the project --> ! 
<target name="javadoc" depends="JDK_OK,JDK_Warning,init" description="create JavaDoc (API) documentation"> <mkdir dir="${classes}"/> <javac srcdir="${resources}" includes="HtmlTaglet.java" classpath="${classes}"/> *************** *** 508,511 **** --- 478,509 ---- </target> + <!-- Create the javadoc for the project --> + <target name="checkjavadoc" depends="JDK_OK,JDK_Warning,init" description="create JavaDoc (API) documentation"> + <mkdir dir="${classes}"/> + <javac srcdir="${resources}" includes="HtmlTaglet.java" classpath="${classes}"/> + <mkdir dir="${docs}/checkjavadoc"/> + <property name="javadoc.doctitle" value="HTML Parser ${versionNumber}"/> + <property name="javadoc.header" value="<A HREF="http://htmlparser.sourceforge.net" target="_top">HTML Parser Home Page</A>"/> + <property name="javadoc.footer" value="&copy; 2005 Derrick Oswald<div align="right">${TODAY_STRING}</div>"/> + <property name="javadoc.bottom" value="<table width='100%'><tr><td>HTML Parser is an open source library released under + <a HREF="http://www.opensource.org/licenses/lgpl-license.html" target="_top">LGPL</a>.</td><td align='right'> + <a HREF="http://sourceforge.net/projects/htmlparser" target="_top"> + <img src="http://sourceforge.net/sflogo.php?group_id=24399&type=1" width="88" height="31" border="0" alt="SourceForge.net"></a></td></tr></table>"/> + <javadoc doclet="com.sun.tools.doclets.doccheck.DocCheck" + docletpath="/home/derrick/htmlparser_cvs/htmlparser/doccheck1.2b2/doccheck.jar" + packagenames="org.htmlparser.*" + sourcepath="${src}" + classpath="${classes}" + defaultexcludes="yes" + excludepackagenames="org.htmlparser.tests.*" + destdir="${docs}/checkjavadoc" + author="true" + version="true" + overview="${src}/doc-files/overview.html"> + </javadoc> + <copy file="${resources}/inherit.gif" tofile="${docs}/javadoc/resources/inherit.gif" overwrite="true"/> + <delete file="${resources}/HtmlTaglet.class"/> + </target> + <target name="release" 
depends="jar,thumbelina,filterbuilder,javadoc" description="prepare the release files"> </target> |
From: Derrick O. <der...@us...> - 2006-03-19 22:04:04
Update of /cvsroot/htmlparser/htmlparser/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv1074/docs Modified Files: panel.html bug.html Log Message: Fix bug #1363500 http://htmlparser.sourceforge.net/bug.html is wrong Take down the wiki. Index: bug.html =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/bug.html,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** bug.html 4 Jan 2004 03:23:08 -0000 1.2 --- bug.html 19 Mar 2006 22:03:56 -0000 1.3 *************** *** 19,24 **** <li>Have you <a href="http://sourceforge.net/tracker/?func=browse&group_id=24399&atid=381399">checked the list of older bug reports</a></li> ! <li>Have you written a testcase to simulate your bug? Why do we request this? ! - check <a href="wiki/TestDrivenDevelopment.html">Test Driven Development</a>. We do take reports without testcases, but please note that such reports may take longer for us to respond to.</li> --- 19,23 ---- <li>Have you <a href="http://sourceforge.net/tracker/?func=browse&group_id=24399&atid=381399">checked the list of older bug reports</a></li> ! <li>Have you written a testcase to simulate your bug? We do take reports without testcases, but please note that such reports may take longer for us to respond to.</li> Index: panel.html =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/panel.html,v retrieving revision 1.8 retrieving revision 1.9 diff -C2 -d -r1.8 -r1.9 *** panel.html 31 May 2004 22:27:09 -0000 1.8 --- panel.html 19 Mar 2006 22:03:56 -0000 1.9 *************** *** 29,33 **** <li> <a href="javadoc/index.html" target="_parent">JavaDocs</a></li> <li> <a href="samples.html" target="mainFrame">Sample Programs</a></li> - <li> <a href="wiki/index.html" target="_parent">Wiki</a></li> <li> <a href="articles/index.html" target="mainFrame">Articles</a></li> </ul> --- 29,32 ---- |
From: Derrick O. <der...@us...> - 2006-03-19 21:26:37
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv17914/tags

Modified Files:
    BodyTag.java
Log Message:
Fix bug #1375230 some javascript breaks stringbean
Retrace non-conforming end of remark.

Index: BodyTag.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags/BodyTag.java,v
retrieving revision 1.22
retrieving revision 1.23
diff -C2 -d -r1.22 -r1.23
*** BodyTag.java 10 Apr 2005 23:20:45 -0000 1.22
--- BodyTag.java 19 Mar 2006 21:26:32 -0000 1.23
***************
*** 86,97 ****
          return toPlainTextString();
      }
-
-     /**
-      * Return a string representation of this <code>BODY</code> tag suitable for debugging.
-      * @return A string representing this <code>BODY</code> tag.
-      */
-     public String toString()
-     {
-         return "BODY: "+getBody();
-     }
  }
--- 86,88 ----
From: Derrick O. <der...@us...> - 2006-03-19 21:26:37
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv17914/lexer

Modified Files:
    Lexer.java
Log Message:
Fix bug #1375230 some javascript breaks stringbean
Retrace non-conforming end of remark.

Index: Lexer.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer/Lexer.java,v
retrieving revision 1.43
retrieving revision 1.44
diff -C2 -d -r1.43 -r1.44
*** Lexer.java 19 Mar 2006 16:11:18 -0000 1.43
--- Lexer.java 19 Mar 2006 21:26:32 -0000 1.44
***************
*** 1497,1501 ****
--- 1497,1508 ----
                      else if ('>' == ch)
                          state = 0;
+                     else
+                     {
+                         mCursor.retreat ();
+                         mCursor.retreat ();
+                     }
                  }
+                 else
+                     mCursor.retreat ();
              }
              break;
From: Derrick O. <der...@us...> - 2006-03-19 20:15:05
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/http In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19543 Modified Files: ConnectionManager.java Cookie.java Log Message: Fix bug #1376851 Null-valued cookies cause exception Add handling for namewless cookies. Index: Cookie.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/http/Cookie.java,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** Cookie.java 15 May 2005 11:49:04 -0000 1.3 --- Cookie.java 19 Mar 2006 20:14:58 -0000 1.4 *************** *** 411,415 **** ret.append (": "); ret.append (getName ()); ! ret.append ("="); if (getValue ().length () > 40) { --- 411,415 ---- ret.append (": "); ret.append (getName ()); ! ret.append (getName ().equals ("") ? "" : "="); if (getValue ().length () > 40) { Index: ConnectionManager.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/http/ConnectionManager.java,v retrieving revision 1.8 retrieving revision 1.9 diff -C2 -d -r1.8 -r1.9 *** ConnectionManager.java 19 Mar 2006 18:40:48 -0000 1.8 --- ConnectionManager.java 19 Mar 2006 20:14:58 -0000 1.9 *************** *** 968,972 **** buffer.append ("; "); buffer.append (cookie.getName ()); ! buffer.append ("="); if (0 != version) buffer.append ("\""); --- 968,972 ---- buffer.append ("; "); buffer.append (cookie.getName ()); ! buffer.append (cookie.getName ().equals ("") ? "" : "="); if (0 != version) buffer.append ("\""); *************** *** 1045,1053 **** if (-1 == index) { - name = token; - value = null; if (null == cookie) ! throw new IllegalStateException ("no cookie value"); ! key = name.toLowerCase (); } else --- 1045,1060 ---- if (-1 == index) { if (null == cookie) ! { // an unnamed cookie ! name = ""; ! value = token; ! key = name; ! } ! else ! { ! name = token; ! value = null; ! key = name.toLowerCase (); ! 
} } else |
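Editor's note: the behaviour this commit describes (a first cookie token with no '=' is treated as a value with an empty name, and the '=' is omitted when such a cookie is written back out) can be sketched in isolation as follows. This is a minimal illustration, not the code in ConnectionManager or Cookie; the class NamelessCookieSketch and its methods splitFirstToken and render are hypothetical names introduced here.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

/** Hypothetical helper for illustration; not part of org.htmlparser.http. */
class NamelessCookieSketch
{
    /**
     * Split the first token of a cookie header into a (name, value) pair.
     * A token without '=' is treated as an unnamed cookie: empty name, token as value.
     */
    static Map.Entry<String, String> splitFirstToken (String token)
    {
        int index = token.indexOf ('=');
        if (-1 == index)
            return new SimpleEntry<String, String> ("", token.trim ());
        return new SimpleEntry<String, String> (
            token.substring (0, index).trim (),
            token.substring (index + 1).trim ());
    }

    /** Render name=value, leaving out the '=' when the name is empty. */
    static String render (String name, String value)
    {
        return name + (name.equals ("") ? "" : "=") + value;
    }
}
```

Under these assumptions an unnamed cookie round-trips unchanged: the token is stored with an empty name and rendered again without a leading '='.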
From: Derrick O. <der...@us...> - 2006-03-19 18:40:51
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/http
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv6998

Modified Files:
    ConnectionManager.java
Log Message:
Remove deflate option from default request properties.
See RFE #1394144 handle deflate encoding.

Index: ConnectionManager.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/http/ConnectionManager.java,v
retrieving revision 1.7
retrieving revision 1.8
diff -C2 -d -r1.7 -r1.8
*** ConnectionManager.java 12 Nov 2005 14:19:50 -0000 1.7
--- ConnectionManager.java 19 Mar 2006 18:40:48 -0000 1.8
***************
*** 60,64 ****
          mDefaultRequestProperties.put ("User-Agent", "HTMLParser/"
              + org.htmlparser.Parser.VERSION_NUMBER);
!         mDefaultRequestProperties.put ("Accept-Encoding", "gzip, deflate");
      }

--- 60,64 ----
          mDefaultRequestProperties.put ("User-Agent", "HTMLParser/"
              + org.htmlparser.Parser.VERSION_NUMBER);
!         mDefaultRequestProperties.put ("Accept-Encoding", "gzip");
      }

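Editor's note: a client should only advertise content codings it can actually decode, which is presumably why deflate was dropped from the default Accept-Encoding header until the RFE is handled. The sketch below shows one way a client might unwrap a gzip body using the standard java.util.zip classes. It is an illustration under that assumption, not the library's code; ContentEncodingSketch, decode and roundTrip are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration; not part of org.htmlparser.http. */
class ContentEncodingSketch
{
    /** Wrap the raw body stream according to the Content-Encoding header value. */
    static InputStream decode (InputStream raw, String contentEncoding) throws IOException
    {
        if (null != contentEncoding && contentEncoding.trim ().equalsIgnoreCase ("gzip"))
            return new GZIPInputStream (raw);
        return raw; // identity, or a coding this sketch does not handle
    }

    /** Gzip a string, then read it back through decode(); for demonstration only. */
    static String roundTrip (String text)
    {
        try
        {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream ();
            GZIPOutputStream gzip = new GZIPOutputStream (bytes);
            gzip.write (text.getBytes ("UTF-8"));
            gzip.close ();
            InputStream in = decode (new ByteArrayInputStream (bytes.toByteArray ()), "gzip");
            ByteArrayOutputStream out = new ByteArrayOutputStream ();
            int b;
            while (-1 != (b = in.read ()))
                out.write (b);
            return out.toString ("UTF-8");
        }
        catch (IOException e)
        {
            throw new RuntimeException (e);
        }
    }
}
```

Because the sketch passes any other coding through untouched, a deflate response would reach the parser still compressed, which is the kind of failure that advertising only gzip avoids.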
From: Derrick O. <der...@us...> - 2006-03-19 17:09:15
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20248

Modified Files:
    Page.java
Log Message:
Typo.

Index: Page.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer/Page.java,v
retrieving revision 1.52
retrieving revision 1.53
diff -C2 -d -r1.52 -r1.53
*** Page.java 25 Oct 2005 02:06:46 -0000 1.52
--- Page.java 19 Mar 2006 17:09:09 -0000 1.53
***************
*** 1099,1103 ****
       * @param cursor The position to calculate for.
       * @return The contents of the URL or file corresponding to the line number
!      * containg the cursor position.
       */
      public String getLine (Cursor cursor)
--- 1099,1103 ----
       * @param cursor The position to calculate for.
       * @return The contents of the URL or file corresponding to the line number
!      * containing the cursor position.
       */
      public String getLine (Cursor cursor)
From: Derrick O. <der...@us...> - 2006-03-19 16:11:28
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv25673 Modified Files: Lexer.java Log Message: Fix bug #1445795 return as TextNode when processing jsp Handle single and double line comments within jsp nodes. Suggested alteration to handle jsp tags within tag attributes wasn't implemented. Index: Lexer.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer/Lexer.java,v retrieving revision 1.42 retrieving revision 1.43 diff -C2 -d -r1.42 -r1.43 *** Lexer.java 19 Mar 2006 15:01:25 -0000 1.42 --- Lexer.java 19 Mar 2006 16:11:18 -0000 1.43 *************** *** 1075,1078 **** --- 1075,1114 ---- state = 3; break; + case '/': // // or /* + ch = mPage.getCharacter (mCursor); + if (ch == '/') + { // find the \n or \r + while(true) + { + ch = mPage.getCharacter (mCursor); + if (ch == Page.EOF) + { + done = true; + break; + } + else if (ch == '\n' || ch == '\r') + { + break; + } + } + } + else if (ch == '*') + { + do + { + do + ch = mPage.getCharacter (mCursor); + while ((Page.EOF != ch) && ('*' != ch)); + ch = mPage.getCharacter (mCursor); + if (ch == '*') + mCursor.retreat (); + } + while ((Page.EOF != ch) && ('/' != ch)); + } + else + { + mCursor.retreat (); + } + break; default: // <%???x break; |
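Editor's note: the change above makes the JSP scanner step over `//` and `/* ... */` comments, presumably so that characters inside a comment (for example a `%>`) are not mistaken for the end of the JSP node. The idea can be sketched on a plain String as below. This is a simplified stand-in, not the Lexer code: it works on string indices rather than a Page and Cursor, and JspCommentSkipSketch and skipComment are hypothetical names.

```java
/** Hypothetical helper for illustration; not part of org.htmlparser.lexer. */
class JspCommentSkipSketch
{
    /**
     * Given the index of a '/' in text, return the index just past the comment
     * that starts there, or slash + 1 if no comment starts at that position.
     */
    static int skipComment (String text, int slash)
    {
        int next = slash + 1;
        if (next >= text.length ())
            return next;
        char ch = text.charAt (next);
        if ('/' == ch)
        {   // line comment: runs to the next \n or \r, or to the end of input
            int i = next + 1;
            while (i < text.length () && text.charAt (i) != '\n' && text.charAt (i) != '\r')
                i++;
            return i < text.length () ? i + 1 : i;
        }
        if ('*' == ch)
        {   // block comment: runs to the closing */, or to the end of input
            int end = text.indexOf ("*/", next + 1);
            return -1 == end ? text.length () : end + 2;
        }
        return next; // a lone '/', not a comment
    }
}
```

A caller scanning for `%>` would invoke skipComment whenever it meets a '/' and resume scanning at the returned index.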
From: Derrick O. <der...@us...> - 2006-03-19 15:01:30
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv24770/src/org/htmlparser/tags Added Files: ProcessingInstructionTag.java Log Message: Incorporated patch #1450095 Fix for Bug 1445309 from Trejkaz Xaoza. Addition of code to parse XML processing instructions. --- NEW FILE: ProcessingInstructionTag.java --- // HTMLParser Library $Name: $ - A java-based parser for HTML // http://sourceforge.org/projects/htmlparser // Copyright (C) 2004 Somik Raha // // Revision Control Information // // $Source: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags/ProcessingInstructionTag.java,v $ // $Author: derrickoswald $ // $Date: 2006/03/19 15:01:25 $ // $Revision: 1.1 $ // // This library is free software; you can redistribute it and/or // modify it under the terms of the GNU Lesser General Public // License as published by the Free Software Foundation; either // version 2.1 of the License, or (at your option) any later version. // // This library is distributed in the hope that it will be useful, // but WITHOUT ANY WARRANTY; without even the implied warranty of // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU // Lesser General Public License for more details. // // You should have received a copy of the GNU Lesser General Public // License along with this library; if not, write to the Free Software // Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // package org.htmlparser.tags; import org.htmlparser.nodes.TagNode; /** * The XML processing instructions like <?xml ... ?> can be identified by this class. */ public class ProcessingInstructionTag extends TagNode { /** * The set of names handled by this tag. */ private static final String[] mIds = new String[] {"?"}; /** * Create a new processing instruction tag. */ public ProcessingInstructionTag () { } /** * Return the set of names handled by this tag. * @return The names to be matched that create tags of this type. 
*/ public String[] getIds () { return (mIds); } /** * Returns a string representation of this processing instruction suitable for debugging. * @return A string representing this tag. */ public String toString() { String guts = toHtml(); guts = guts.substring (1, guts.length () - 2); return "Processing Instruction : "+guts+"; begins at : "+getStartPosition ()+"; ends at : "+getEndPosition (); } } |
From: Derrick O. <der...@us...> - 2006-03-19 15:01:29
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/lexerTests In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv24770/src/org/htmlparser/tests/lexerTests Modified Files: LexerTests.java Log Message: Incorporated patch #1450095 Fix for Bug 1445309 from Trejkaz Xaoza. Addition of code to parse XML processing instructions. Index: LexerTests.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/lexerTests/LexerTests.java,v retrieving revision 1.27 retrieving revision 1.28 diff -C2 -d -r1.27 -r1.28 *** LexerTests.java 12 Nov 2005 16:44:54 -0000 1.27 --- LexerTests.java 19 Mar 2006 15:01:25 -0000 1.28 *************** *** 151,154 **** --- 151,155 ---- "</head>", "<%=head%>", + "<?php ?>", "<!--head-->", }; *************** *** 816,820 **** assertNull ("too many nodes", lexer.nextNode ()); } ! /** * See bug #899413 bug in javascript end detection. --- 817,840 ---- assertNull ("too many nodes", lexer.nextNode ()); } ! ! /** ! * Unit test for new PI parsing code. ! */ ! public void testPI() throws ParserException ! { ! String html; ! Lexer lexer; ! Node node; ! ! html = "<?php print(\"<p>Hello World!</p>\"); ?>"; ! lexer = new Lexer(html); ! node = lexer.nextNode(); ! if (node == null) ! fail ("too few nodes"); ! else ! assertStringEquals("bad html", html, node.toHtml()); ! assertNull("too many nodes", lexer.nextNode()); ! } ! /** * See bug #899413 bug in javascript end detection. |
From: Derrick O. <der...@us...> - 2006-03-19 15:01:28
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv24770/src/org/htmlparser/lexer Modified Files: Lexer.java Log Message: Incorporated patch #1450095 Fix for Bug 1445309 from Trejkaz Xaoza. Addition of code to parse XML processing instructions. Index: Lexer.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer/Lexer.java,v retrieving revision 1.41 retrieving revision 1.42 diff -C2 -d -r1.41 -r1.42 *** Lexer.java 19 Sep 2005 02:35:05 -0000 1.41 --- Lexer.java 19 Mar 2006 15:01:25 -0000 1.42 *************** *** 288,291 **** --- 288,296 ---- ret = parseJsp (start); } + else if ('?' == ch) + { + mCursor.retreat (); + ret = parsePI (start); + } else if ('/' == ch || '%' == ch || Character.isLetter (ch)) { *************** *** 470,474 **** // the order of these tests might be optimized for speed: else if ('/' == ch || Character.isLetter (ch) ! || '!' == ch || '%' == ch) { done = true; --- 475,479 ---- // the order of these tests might be optimized for speed: else if ('/' == ch || Character.isLetter (ch) ! || '!' == ch || '%' == ch || '?' == ch) { done = true; *************** *** 1138,1141 **** --- 1143,1271 ---- /** + * Parse an XML processing instruction. + * Scan characters until "?>" is encountered, or the input stream is + * exhausted, in which case <code>null</code> is returned. + * @param start The position at which to start scanning. + * @return The parsed node. + * @exception ParserException If a problem occurs reading from the source. 
+ */ + protected Node parsePI (int start) + throws + ParserException + { + boolean done; + char ch; + int state; + Vector attributes; + int code; + + done = false; + state = 0; + code = 0; + attributes = new Vector (); + // <?xyz?> + // 011112d + while (!done) + { + ch = mPage.getCharacter (mCursor); + switch (state) + { + case 0: // prior to the question mark + switch (ch) + { + case '?': // <? + code = mCursor.getPosition (); + attributes.addElement (new PageAttribute (mPage, start + 1, code, -1, -1, (char)0)); + state = 1; + break; + // case Page.EOF: // <\0 + // case '>': // <> + default: + done = true; + break; + } + break; + case 1: // prior to the closing question mark + switch (ch) + { + case Page.EOF: // <?x\0 + case '>': // <?x> + done = true; + break; + case '\'': + case '"':// <?..." + state = ch; + break; + case '?': // <?...? + state = 2; + break; + default: // <?...x + break; + } + break; + case 2: + switch (ch) + { + case Page.EOF: // <?x..?\0 + done = true; + break; + case '>': + state = 3; + done = true; + break; + default: // <?...?x + state = 1; + break; + } + break; + case '"': + switch (ch) + { + case Page.EOF: // <?x.."\0 + done = true; + break; + case '"': + state = 1; + break; + default: // <?...'.x + break; + } + break; + case '\'': + switch (ch) + { + case Page.EOF: // <?x..'\0 + done = true; + break; + case '\'': + state = 1; + break; + default: // <?..."..x + break; + } + break; + default: + throw new IllegalStateException ("how the fuck did we get in state " + state); + } + } + + if (3 == state) // normal exit + { + if (0 != code) + { + state = mCursor.getPosition () - 2; // reuse state + attributes.addElement (new PageAttribute (mPage, code, state, -1, -1, (char)0)); + attributes.addElement (new PageAttribute (mPage, state, state + 1, -1, -1, (char)0)); + } + else + throw new IllegalStateException ("processing instruction with no content"); + } + else + return (parseString (start, true)); // hmmm, true? 
+ + return (makeTag (start, mCursor.getPosition (), attributes)); + } + + /** * Return CDATA as a text node. * According to appendix <a href="http://www.w3.org/TR/html4/appendix/notes.html#notes-specifying-data"> |
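Editor's note: the parsePI state machine above scans from "<?" to the closing "?>" while ignoring anything inside single or double quotes. A much reduced sketch of that scan on a plain String follows. It is an illustration only and differs from the committed code: it does not build attributes or fall back to parseString, and it does not treat a bare '>' outside quotes as an abnormal end. ProcessingInstructionScanSketch and findEnd are hypothetical names.

```java
/** Hypothetical helper for illustration; not part of org.htmlparser.lexer. */
class ProcessingInstructionScanSketch
{
    /**
     * Return the index just past the "?>" that closes the processing
     * instruction starting at text.charAt(start), or -1 if it is not closed.
     */
    static int findEnd (String text, int start)
    {
        if (!text.startsWith ("<?", start))
            return -1;
        char quote = 0; // 0 when outside quotes, else the quote character that is open
        for (int i = start + 2; i < text.length (); i++)
        {
            char ch = text.charAt (i);
            if (0 != quote)
            {
                if (ch == quote)
                    quote = 0; // matching quote closes the quoted run
            }
            else if ('"' == ch || '\'' == ch)
                quote = ch;
            else if ('?' == ch && i + 1 < text.length () && '>' == text.charAt (i + 1))
                return i + 2;
        }
        return -1;
    }
}
```

With the string used in testPI, `<?php print("<p>Hello World!</p>"); ?>`, the quoted `<p>...</p>` is skipped and the scan ends at the final "?>", so the whole input is one processing instruction.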
From: Derrick O. <der...@us...> - 2006-03-19 15:01:28
Update of /cvsroot/htmlparser/htmlparser/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv24770/docs Modified Files: contributors.html Log Message: Incorporated patch #1450095 Fix for Bug 1445309 from Trejkaz Xaoza. Addition of code to parse XML processing instructions. Index: contributors.html =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/contributors.html,v retrieving revision 1.21 retrieving revision 1.22 diff -C2 -d -r1.21 -r1.22 *** contributors.html 12 Nov 2005 14:19:51 -0000 1.21 --- contributors.html 19 Mar 2006 15:01:22 -0000 1.22 *************** *** 396,405 **** </tr> </table> ! <p>Thanks to Marcus Mattern, Ian Macfarlane, Keiron McCammon, Martin Hudson, ! Matthew Buckett, Jamie McCrindle, John Derrick, David Andersen, Manuel Polo, ! Enrico Triolo, Gernot Fricke, Nick Burch, Stephen Harrington, Domenico Lordi, ! Kamen, John Zook, Cheng Jun, Mazlan Mat, Rob Shields, Wolfgang Germund, ! Raj Sharma, Robert Kausch, Gordon Deudney, Serge Kruppa, Roger Kjensrud, ! and Manpreet Singh for suggestions, bug reports and feature ideas. <br> <p>Thanks to Jon Gillette for the cool new logo.<br> </body> --- 396,406 ---- </tr> </table> ! <p>Thanks to Trejkaz Xaoza, Marcus Mattern, Ian Macfarlane, Keiron McCammon, ! Martin Hudson, Matthew Buckett, Jamie McCrindle, John Derrick, David Andersen, ! Manuel Polo, Enrico Triolo, Gernot Fricke, Nick Burch, Stephen Harrington, ! Domenico Lordi, Kamen, John Zook, Cheng Jun, Mazlan Mat, Rob Shields, ! Wolfgang Germund, Raj Sharma, Robert Kausch, Gordon Deudney, Serge Kruppa, ! Roger Kjensrud, and Manpreet Singh for suggestions, bug reports ! and feature ideas. <br> <p>Thanks to Jon Gillette for the cool new logo.<br> </body> |
From: Derrick O. <der...@us...> - 2006-03-19 15:01:28
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv24770/src/org/htmlparser

Modified Files:
    PrototypicalNodeFactory.java
Log Message:
Incorporated patch #1450095 Fix for Bug 1445309 from Trejkaz Xaoza.
Addition of code to parse XML processing instructions.

Index: PrototypicalNodeFactory.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/PrototypicalNodeFactory.java,v
retrieving revision 1.18
retrieving revision 1.19
diff -C2 -d -r1.18 -r1.19
*** PrototypicalNodeFactory.java 15 Nov 2005 02:09:10 -0000 1.18
--- PrototypicalNodeFactory.java 19 Mar 2006 15:01:24 -0000 1.19
***************
*** 62,65 ****
--- 62,66 ----
  import org.htmlparser.tags.OptionTag;
  import org.htmlparser.tags.ParagraphTag;
+ import org.htmlparser.tags.ProcessingInstructionTag;
  import org.htmlparser.tags.ScriptTag;
  import org.htmlparser.tags.SelectTag;
***************
*** 319,322 ****
--- 320,324 ----
          registerTag (new OptionTag ());
          registerTag (new ParagraphTag ());
+         registerTag (new ProcessingInstructionTag ());
          registerTag (new ScriptTag ());
          registerTag (new SelectTag ());
From: Ian M. <ian...@us...> - 2006-02-13 14:50:46
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20370/src/org/htmlparser/util Added Files: NodeTreeWalker.java Log Message: A utility class to traverse a tree of Node objects using either depth-first or breadth-first tree traversal. Kind of like a NodeIterator for DOM-type trees of Nodes instead of linear sequences of Nodes. Post to the dev mailing list about this on the way. --- NEW FILE: NodeTreeWalker.java --- // HTMLParser Library $Name: $ - A java-based parser for HTML // http://sourceforge.org/projects/htmlparser // Copyright (C) 2004 Somik Raha // // Revision Control Information // // $Source: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util/NodeTreeWalker.java,v $ // $Author: ian_macfarlane $ // $Date: 2006/02/13 14:50:35 $ // $Revision: 1.1 $ // // This library is free software; you can redistribute it and/or // modify it under the terms of the GNU Lesser General Public // License as published by the Free Software Foundation; either // version 2.1 of the License, or (at your option) any later version. // // This library is distributed in the hope that it will be useful, // but WITHOUT ANY WARRANTY; without even the implied warranty of // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU // Lesser General Public License for more details. // // You should have received a copy of the GNU Lesser General Public // License along with this library; if not, write to the Free Software // Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // package org.htmlparser.util; import org.htmlparser.Node; /** * A class for walking a tree of {@link Node} objects, in either a depth-first or breadth-first manner. * The following two diagrams show the represent tree traversal with the two different methods. 
* <table> * <tr> * <th>Depth-first traversal</th> * <th>Breadth-first traversal</th> * </tr> * <tr> * <img src="http://htmlparser.sourceforge.net/tree-traversal-depth-first.gif" alt="Diagram showing depth-first tree traversal" width="300" height="300" /> * </tr> * <tr> * <img src="http://htmlparser.sourceforge.net/tree-traversal-breadth-first.gif" alt="Diagram showing breadth-first tree traversal" width="300" height="300" /> * </tr> * </table> * @author ian_macfarlane */ public class NodeTreeWalker implements NodeIterator { /** * The root Node element which defines the scope of the current tree to walk. */ protected Node mRootNode; /** * The current Node element, which will be a child of the root Node, or null. */ protected Node mCurrentNode; /** * The next Node element after the current Node element. * Stored for internal use only. */ protected Node mNextNode; /** * The maximum depth (number of child-parent links) that this NodeTreeWalker may be removed from the root Node. * A value of -1 indicates that there is no depth restriction. */ protected int mMaxDepth; /** * Whether the tree traversal method used is depth-first (default) or breadth-first. */ protected boolean mDepthFirst; /** * Creates a new instance of NodeTreeWalker using depth-first tree traversal, without limits on how deep it may traverse. * @param rootNode The Node to set as the root of the tree. * @throws NullPointerException if rootNode is null. */ public NodeTreeWalker(Node rootNode) { this(rootNode, true, -1); } /** * Creates a new instance of NodeTreeWalker using the specified type of tree traversal, without limits on how deep it may traverse. * @param rootNode The Node to set as the root of the tree. * @param depthFirst Whether to use depth-first (true) or breadth-first (false) tree traversal. * @throws NullPointerException if rootNode is null.
*/ public NodeTreeWalker(Node rootNode, boolean depthFirst) { this(rootNode, depthFirst, -1); } /** * Creates a new instance of NodeTreeWalker using the specified type of tree traversal and maximum depth from the root Node to traverse. * @param rootNode The Node to set as the root of the tree. * @param depthFirst Whether to use depth-first (true) or breadth-first (false) tree traversal. * @param maxDepth The maximum depth from the root Node that this NodeTreeWalker may traverse. This must be > 0 or equal to -1. * @throws NullPointerException if rootNode is null. * @throws IllegalArgumentException if maxDepth is not > 0 or equal to -1. */ public NodeTreeWalker(Node rootNode, boolean depthFirst, int maxDepth) { //check maxDepth is valid if ( ! ((maxDepth >= 1) || (maxDepth == -1)))//if not one of these valid possibilities throw new IllegalArgumentException("Parameter maxDepth must be > 0 or equal to -1."); initRootNode(rootNode);//this method also checks if rootNode is valid this.mDepthFirst = depthFirst; this.mMaxDepth = maxDepth; } /** * Whether the NodeTreeWalker is currently set to use depth-first or breadth-first tree traversal. * @return True if depth-first tree-traversal is used, or false if breadth-first tree-traversal is being used. */ public boolean isDepthFirst() { return (this.mDepthFirst); } /** * Sets whether the NodeTreeWalker should use depth-first or breadth-first tree traversal. * @param depthFirst Whether to use depth-first (true) or breadth-first (false) tree traversal. */ public void setDepthFirst(boolean depthFirst) { if (this.mDepthFirst != depthFirst)//if we are changing search pattern this.mNextNode = null; this.mDepthFirst = depthFirst; } /** * The maximum depth (number of child-parent links) below the root Node that this NodeTreeWalker may traverse. * @return The maximum depth that this NodeTreeWalker can traverse to.
*/ public int getMaxDepth() { return (this.mMaxDepth); } /** * Removes any restrictions in place that prevent this NodeTreeWalker from traversing beyond a certain depth. */ public void removeMaxDepthRestriction() { this.mMaxDepth = -1; } /** * Get the root Node that defines the scope of the tree to traverse. * @return The root Node. */ public Node getRootNode() { return (this.mRootNode); } /** * Get the Node in the tree that the NodeTreeWalker is currently at. * @return The current Node. */ public Node getCurrentNode() { return (this.mCurrentNode); } /** * Sets the current Node as the root Node. * Resets the current position in the tree. * @throws NullPointerException if the current Node is null (i.e. if the tree traversal has not yet begun). */ public void setCurrentNodeAsRootNode() throws NullPointerException { if (this.mCurrentNode == null) throw new NullPointerException("Current Node is null, cannot set as root Node."); initRootNode(this.mCurrentNode); } /** * Sets the specified Node as the root Node. * Resets the current position in the tree. * @param rootNode The Node to set as the root of the tree. * @throws NullPointerException if rootNode is null. */ public void setRootNode(Node rootNode) throws NullPointerException { initRootNode(rootNode); } /** * Resets the current position in the tree, * such that calling <code>nextNode()</code> will return the first Node again. */ public void reset() { this.mCurrentNode = null; this.mNextNode = null; } /** * Traverses to the next Node from the current Node, using either depth-first or breadth-first tree traversal as appropriate. * @return The next Node from the current Node. */ public Node nextNode() { if (this.mNextNode != null)//check if we've already found the next Node by calling hasMoreNodes() { this.mCurrentNode = this.mNextNode; this.mNextNode = null;//reset mNextNode } else { //Check if we have started traversing yet. If not, start with first child (for either traversal method).
if (this.mCurrentNode == null) this.mCurrentNode = this.mRootNode.getFirstChild(); else { if (this.mDepthFirst) this.mCurrentNode = getNextNodeDepthFirst(); else this.mCurrentNode = getNextNodeBreadthFirst(); } } return (this.mCurrentNode); } /** * Get the number of places down that the current Node is from the root Node. * Returns 1 if current Node is a child of the root Node. * Returns 0 if this NodeTreeWalker has not yet traversed to any Nodes. * @return The depth the current Node is from the root Node. */ public int getCurrentNodeDepth() { int depth = 0; if (this.mCurrentNode != null)//if we are not at the root Node. { Node traverseNode = this.mCurrentNode; while (traverseNode != this.mRootNode) { ++depth; traverseNode = traverseNode.getParent(); } } return (depth); } /** * Returns whether or not there are more nodes available based on the current configuration of this NodeTreeWalker. * @return True if there are more Nodes available, based on the current configuration, or false otherwise. */ public boolean hasMoreNodes() { if (this.mNextNode == null)//if we haven't already generated mNextNode { if (this.mCurrentNode == null) this.mNextNode = this.mRootNode.getFirstChild(); else { if (this.mDepthFirst) this.mNextNode = getNextNodeDepthFirst(); else this.mNextNode = getNextNodeBreadthFirst(); } } return (this.mNextNode != null); } /** * Sets the root Node to be the given Node. * Resets the current position in the tree. * @param rootNode The Node to set as the root of the tree. * @throws NullPointerException if rootNode is null. */ protected void initRootNode(Node rootNode) throws NullPointerException { if (rootNode == null) throw new NullPointerException("Root Node cannot be null."); this.mRootNode = rootNode; this.mCurrentNode = null; this.mNextNode = null; } /** * Traverses to the next Node from the current Node using depth-first tree traversal. * @return The next Node from the current Node using depth-first tree traversal.
*/ protected Node getNextNodeDepthFirst() { //loosely based on http://www.myarch.com/treeiter/traditways.jhtml int currentDepth = getCurrentNodeDepth(); Node traverseNode = null; if ((this.mMaxDepth == -1) || (currentDepth < this.mMaxDepth))//if it is less than max depth, then getting first child won't be more than max depth { traverseNode = this.mCurrentNode.getFirstChild(); if (traverseNode != null) return (traverseNode); } traverseNode = this.mCurrentNode; Node tempNextSibling = null;//keeping a reference to this saves calling getNextSibling once later while ((traverseNode != this.mRootNode) && (tempNextSibling = traverseNode.getNextSibling()) == null)//CANNOT assign traverseNode as root Node traverseNode = traverseNode.getParent();// use child-parent link to get to the parent level return (tempNextSibling);//null if ran out of Nodes } /** * Traverses to the next Node from the current Node using breadth-first tree traversal. * @return The next Node from the current Node using breadth-first tree traversal.
*/ protected Node getNextNodeBreadthFirst() { Node traverseNode; //see if the mCurrentNode has a sibling after it traverseNode = this.mCurrentNode.getNextSibling(); if (traverseNode != null) return (traverseNode); int depth = getCurrentNodeDepth(); //try and find the next Node at the same depth that is not a sibling NodeList traverseNodeList; //step up to the parent Node to look through its children traverseNode = this.mCurrentNode.getParent(); int currentDepth = depth - 1; while(currentDepth > 0)//this is safe as we've tried getNextSibling already { Node tempNextSibling = null;//keeping a reference to this saves calling getNextSibling once later //go to first parent with nextSibling, then to that sibling while(((tempNextSibling = traverseNode.getNextSibling()) == null) && (traverseNode != this.mRootNode))//CAN assign traverseNode as root Node { traverseNode = traverseNode.getParent(); --currentDepth; } //if have traversed back to the root Node, skip to next part where it finds the first Node at the next depth down if (traverseNode == this.mRootNode) break; traverseNode = tempNextSibling; if (traverseNode != null) { //go through children of that sibling traverseNodeList = traverseNode.getChildren(); while((traverseNodeList != null) && (traverseNodeList.size() != 0)) { traverseNode = traverseNode.getFirstChild(); ++currentDepth; if (currentDepth == depth) return (traverseNode);//found the next Node at the current depth else traverseNodeList = traverseNode.getChildren(); } // while((traverseNodeList != null) && (traverseNodeList.size() != 0)) } // if (traverseNode != null) } // while(currentDepth > 0) //step to the next depth down //check first whether we are about to go past max depth if (this.mMaxDepth != -1)//if -1, then there is no max depth restriction { if (depth >= this.mMaxDepth) return (null);//can't go past max depth } traverseNode = this.mRootNode.getFirstChild(); ++depth;//look for next depth currentDepth = 1; while(currentDepth > 0) { //go through
children of that sibling traverseNodeList = traverseNode.getChildren(); while((traverseNodeList != null) && (traverseNodeList.size() != 0)) { traverseNode = traverseNode.getFirstChild(); ++currentDepth; if (currentDepth == depth) return (traverseNode);//found the next Node at the current depth else traverseNodeList = traverseNode.getChildren(); } // while((traverseNodeList != null) && (traverseNodeList.size() != 0)) //go to first parent with nextSibling, then to that sibling while((traverseNode.getNextSibling() == null) && (traverseNode != this.mRootNode)) { traverseNode = traverseNode.getParent(); --currentDepth; } traverseNode = traverseNode.getNextSibling(); if (traverseNode == null)//if null (i.e. reached end of tree), return null return (null); } // while(currentDepth > 0) //otherwise, finished searching, return null return (null); } // todo // previousNode() // getPreviousNodeDepthFirst() // getPreviousNodeBreadthFirst() // hasPreviousNodes() ? // these should be specified in an interface - suggest something like ReversableNodeIterator (extends NodeIterator) // possible optimisations: when doing mNextNode, we should save mCurrentNode as previousNode, and vice versa } |
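For readers comparing the two orders that NodeTreeWalker offers, here is a minimal, self-contained sketch of depth-first (pre-order) versus breadth-first traversal. It does not use the htmlparser API: SketchNode, depthFirst, breadthFirst and sampleTree are hypothetical names invented for illustration, and the sketch uses an explicit stack and queue rather than the parent and sibling links the committed class relies on. Like the class above, it starts at the root's first child and never returns the root itself. Depth limits are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TraversalSketch {
    /** Hypothetical stand-in for a tree node; NOT the htmlparser Node interface. */
    public static final class SketchNode {
        final String name;
        final List<SketchNode> children = new ArrayList<>();

        SketchNode(String name, SketchNode... kids) {
            this.name = name;
            for (SketchNode kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Pre-order depth-first walk of everything below root (root is skipped). */
    public static List<String> depthFirst(SketchNode root) {
        List<String> visited = new ArrayList<>();
        Deque<SketchNode> stack = new ArrayDeque<>();
        pushReversed(stack, root.children);
        while (!stack.isEmpty()) {
            SketchNode node = stack.pop();
            visited.add(node.name);
            // Children go on top of the stack, so they are visited before siblings.
            pushReversed(stack, node.children);
        }
        return visited;
    }

    /** Level-by-level walk of everything below root (root is skipped). */
    public static List<String> breadthFirst(SketchNode root) {
        List<String> visited = new ArrayList<>();
        Deque<SketchNode> queue = new ArrayDeque<>(root.children);
        while (!queue.isEmpty()) {
            SketchNode node = queue.removeFirst();
            visited.add(node.name);
            // Children go to the back of the queue, after all nodes of this level.
            queue.addAll(node.children);
        }
        return visited;
    }

    /** Pushes in reverse so that the first child ends up on top of the stack. */
    private static void pushReversed(Deque<SketchNode> stack, List<SketchNode> kids) {
        for (int i = kids.size() - 1; i >= 0; i--) {
            stack.push(kids.get(i));
        }
    }

    /** Sample tree: root has children a (with a1, a2) and b (with b1). */
    public static SketchNode sampleTree() {
        return new SketchNode("root",
            new SketchNode("a", new SketchNode("a1"), new SketchNode("a2")),
            new SketchNode("b", new SketchNode("b1")));
    }

    public static void main(String[] args) {
        SketchNode root = sampleTree();
        System.out.println("depth-first:   " + depthFirst(root));
        System.out.println("breadth-first: " + breadthFirst(root));
    }
}
```

On the sample tree the depth-first order visits a, a1, a2, b, b1, while the breadth-first order visits a, b, a1, a2, b1. The stack and queue here cost extra memory; walking parent and sibling links, as the committed class does, avoids that at the price of more intricate stepping logic.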
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv23209 Modified Files: Attribute.java Node.java Parser.java PrototypicalNodeFactory.java Remark.java StringNodeFactory.java Tag.java Text.java Log Message: Fix warnings flagged by doccheck. Index: Remark.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Remark.java,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** Remark.java 15 May 2005 11:49:03 -0000 1.3 --- Remark.java 15 Nov 2005 02:09:10 -0000 1.4 *************** *** 37,40 **** --- 37,41 ---- * Returns the text contents of the comment tag. * @return The contents of the text inside the comment delimiters. + * @see #setText */ String getText(); *************** *** 45,48 **** --- 46,50 ---- * these are stripped off. * @param text The new text for the node. + * @see #getText */ void setText (String text); Index: Node.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Node.java,v retrieving revision 1.54 retrieving revision 1.55 diff -C2 -d -r1.54 -r1.55 *** Node.java 26 Oct 2005 22:01:23 -0000 1.54 --- Node.java 15 Nov 2005 02:09:10 -0000 1.55 *************** *** 134,137 **** --- 134,138 ---- * This is the character (not byte) offset of this node in the page. * @return The start position. + * @see #setStartPosition */ int getStartPosition (); *************** *** 140,143 **** --- 141,145 ---- * Sets the starting position of the node. * @param position The new start position. + * @see #getStartPosition */ void setStartPosition (int position); *************** *** 148,151 **** --- 150,154 ---- * node in the page. * @return The end position. + * @see #setEndPosition */ int getEndPosition (); *************** *** 154,157 **** --- 157,161 ---- * Sets the ending position of the node. * @param position The new end position. 
+ * @see #getEndPosition */ void setEndPosition (int position); *************** *** 160,163 **** --- 164,168 ---- * Get the page this node came from. * @return The page that supplied this node. + * @see #setPage */ Page getPage (); *************** *** 166,169 **** --- 171,175 ---- * Set the page this node came from. * @param page The page that supplied this node. + * @see #getPage */ void setPage (Page page); *************** *** 184,187 **** --- 190,194 ---- * @return The parent of this node, if it's been set, <code>null</code> * otherwise. + * @see #setParent */ Node getParent (); *************** *** 190,193 **** --- 197,201 ---- * Sets the parent of this node. * @param node The node that contains this node. + * @see #getParent */ void setParent (Node node); *************** *** 197,200 **** --- 205,209 ---- * @return The list of children contained by this node, if it's been set, * <code>null</code> otherwise. + * @see #setChildren */ NodeList getChildren (); *************** *** 203,206 **** --- 212,216 ---- * Set the children of this node. * @param children The new list of children this node contains. + * @see #getChildren */ void setChildren (NodeList children); *************** *** 238,241 **** --- 248,252 ---- * @return The contents of the string or remark node, and in the case of * a tag, the contents of the tag less the enclosing angle brackets. + * @see #setText */ String getText (); *************** *** 244,247 **** --- 255,259 ---- * Sets the string contents of the node. * @param text The new text for the node. + * @see #getText */ void setText (String text); Index: Tag.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Tag.java,v retrieving revision 1.6 retrieving revision 1.7 diff -C2 -d -r1.6 -r1.7 *** Tag.java 15 May 2005 11:49:03 -0000 1.6 --- Tag.java 15 Nov 2005 02:09:10 -0000 1.7 *************** *** 42,46 **** * @param name Name of attribute, case insensitive. 
* @return The value associated with the attribute or null if it does ! * not exist, or is a stand-alone or */ String getAttribute (String name); --- 42,47 ---- * @param name Name of attribute, case insensitive. * @return The value associated with the attribute or null if it does ! * not exist, or is a stand-alone. ! * @see #setAttribute */ String getAttribute (String name); *************** *** 51,54 **** --- 52,57 ---- * @param key The name of the attribute. * @param value The value of the attribute. + * @see #getAttribute + * @see #setAttribute(String,String,char) */ void setAttribute (String key, String value); *************** *** 60,63 **** --- 63,67 ---- * @param quote The quote character to be used around value. * If zero, it is an unquoted value. + * @see #getAttribute */ void setAttribute (String key, String value, char quote); *************** *** 74,77 **** --- 78,82 ---- * @return The attribute or null if it does * not exist. + * @see #setAttributeEx */ Attribute getAttributeEx (String name); *************** *** 82,85 **** --- 87,91 ---- * To set the zeroth attribute (the tag name), use setTagName(). * @param attribute The attribute to set. + * @see #getAttributeEx */ void setAttributeEx (Attribute attribute); *************** *** 88,91 **** --- 94,98 ---- * Gets the attributes in the tag. * @return Returns the list of {@link Attribute Attributes} in the tag. + * @see #setAttributesEx */ Vector getAttributesEx (); *************** *** 97,100 **** --- 104,108 ---- * and the second element being the value. * @param attribs The attribute collection to set. + * @see #getAttributesEx */ void setAttributesEx (Vector attribs); *************** *** 112,118 **** * for an attribute (either no equals sign or nothing to the right of the * equals sign). A special entry with a key of ! * SpecialHashtable.TAGNAME ("$<TAGNAME>$") holds the tag name. * The conversion to uppercase is performed with an ENGLISH locale. * @deprecated Use getAttributesEx() instead. 
*/ Hashtable getAttributes (); --- 120,127 ---- * for an attribute (either no equals sign or nothing to the right of the * equals sign). A special entry with a key of ! * SpecialHashtable.TAGNAME ("$<TAGNAME>$") holds the tag name. * The conversion to uppercase is performed with an ENGLISH locale. * @deprecated Use getAttributesEx() instead. + * @see #setAttributes */ Hashtable getAttributes (); *************** *** 120,127 **** /** * Sets the attributes. ! * A special entry with a key of SpecialHashtable.TAGNAME ("$<TAGNAME>$") * sets the tag name. * @param attributes The attribute collection to set. * @deprecated Use setAttributesEx() instead. */ void setAttributes (Hashtable attributes); --- 129,137 ---- /** * Sets the attributes. ! * A special entry with a key of SpecialHashtable.TAGNAME ("$<TAGNAME>$") * sets the tag name. * @param attributes The attribute collection to set. * @deprecated Use setAttributesEx() instead. + * @see #getAttributes */ void setAttributes (Hashtable attributes); *************** *** 137,140 **** --- 147,151 ---- * </em> * @return The tag name. + * @see #setTagName */ String getTagName (); *************** *** 145,148 **** --- 156,160 ---- * zeroth element of the attribute vector). * @param name The tag name. + * @see #getTagName */ void setTagName (String name); *************** *** 213,216 **** --- 225,229 ---- * For a non-composite tag this always returns <code>null</code>. * @return The tag that terminates this composite tag, i.e. </HTML>. + * @see #setEndTag */ Tag getEndTag (); *************** *** 220,223 **** --- 233,237 ---- * For a non-composite tag this is a no-op. * @param tag The tag that closes this composite tag, i.e. </HTML>. + * @see #getEndTag */ void setEndTag (Tag tag); *************** *** 226,229 **** --- 240,244 ---- * Return the scanner associated with this tag. * @return The scanner associated with this tag. 
+ * @see #setThisScanner */ Scanner getThisScanner (); *************** *** 232,235 **** --- 247,251 ---- * Set the scanner associated with this tag. * @param scanner The scanner for this tag. + * @see #getThisScanner */ void setThisScanner (Scanner scanner); Index: Text.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Text.java,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** Text.java 15 May 2005 11:49:03 -0000 1.3 --- Text.java 15 Nov 2005 02:09:10 -0000 1.4 *************** *** 37,40 **** --- 37,41 ---- * Accesses the textual contents of the node. * @return The text of the node. + * @see #setText */ String getText (); *************** *** 43,46 **** --- 44,48 ---- * Sets the contents of the node. * @param text The new text for the node. + * @see #getText */ void setText (String text); Index: Parser.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Parser.java,v retrieving revision 1.109 retrieving revision 1.110 diff -C2 -d -r1.109 -r1.110 *** Parser.java 12 Nov 2005 15:11:45 -0000 1.109 --- Parser.java 15 Nov 2005 02:09:10 -0000 1.110 *************** *** 197,200 **** --- 197,201 ---- * Get the connection manager all Parsers use. * @return The connection manager. + * @see #setConnectionManager */ public static ConnectionManager getConnectionManager () *************** *** 206,209 **** --- 207,211 ---- * Set the connection manager all Parsers use. * @param manager The new connection manager. + * @see #getConnectionManager */ public static void setConnectionManager (ConnectionManager manager) *************** *** 357,360 **** --- 359,363 ---- * lexer. 
* @see #setLexer + * @see #getConnection */ public void setConnection (URLConnection connection) *************** *** 385,388 **** --- 388,392 ---- * @throws ParserException If the url is invalid or creation of the * underlying Lexer cannot be performed. + * @see #getURL */ public void setURL (String url) *************** *** 400,403 **** --- 404,408 ---- * for example, a file name may be modified to be a URL. * @see Page#getUrl + * @see #setURL */ public String getURL () *************** *** 413,416 **** --- 418,422 ---- * have been seen had the new encoding been in force. * @see org.htmlparser.util.EncodingChangeException + * @see #getEncoding */ public void setEncoding (String encoding) *************** *** 426,429 **** --- 432,436 ---- * tags in the head, so this may change after the head has been parsed. * @return The encoding currently in force. + * @see #setEncoding */ public String getEncoding () *************** *** 440,443 **** --- 447,451 ---- * @param lexer The lexer object to use. * @see #setNodeFactory + * @see #getLexer */ public void setLexer (Lexer lexer) *************** *** 465,470 **** /** ! * Returns the lexer associated with the parser * @return The current lexer. */ public Lexer getLexer () --- 473,479 ---- /** ! * Returns the lexer associated with the parser. * @return The current lexer. + * @see #setLexer */ public Lexer getLexer () *************** *** 476,479 **** --- 485,489 ---- * Get the current node factory. * @return The current lexer's node factory. + * @see #setNodeFactory */ public NodeFactory getNodeFactory () *************** *** 485,488 **** --- 495,499 ---- * Set the current node factory. * @param factory The new node factory for the current lexer. + * @see #getNodeFactory */ public void setNodeFactory (NodeFactory factory) *************** *** 497,500 **** --- 508,512 ---- * @param fb The new feedback object to use. If this is null a * {@link #DEVNULL silent feedback object} is used. 
+ * @see #getFeedback */ public void setFeedback (ParserFeedback fb) *************** *** 509,512 **** --- 521,525 ---- * Returns the current feedback object. * @return The feedback object currently being used. + * @see #setFeedback */ public ParserFeedback getFeedback() *************** *** 760,764 **** /** ! * The main program, which can be executed from the command line * @param args A URL or file name to parse, and an optional tag name to be * used as a filter. --- 773,777 ---- /** ! * The main program, which can be executed from the command line. * @param args A URL or file name to parse, and an optional tag name to be * used as a filter. Index: Attribute.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Attribute.java,v retrieving revision 1.7 retrieving revision 1.8 diff -C2 -d -r1.7 -r1.8 *** Attribute.java 15 May 2005 11:49:03 -0000 1.7 --- Attribute.java 15 Nov 2005 02:09:10 -0000 1.8 *************** *** 68,72 **** * <p> * <table width="100.0%" align="Center" border="1"> ! * <caption>Valid States for Attributes. * <tr> * <th align="Center">Description</th> --- 68,72 ---- * <p> * <table width="100.0%" align="Center" border="1"> ! * <caption>Valid States for Attributes.</caption> * <tr> * <th align="Center">Description</th> *************** *** 340,343 **** --- 340,344 ---- * @return The name, or <code>null</code> if it's just a whitepace * 'attribute'. + * @see #setName */ public String getName () *************** *** 350,353 **** --- 351,355 ---- * @param buffer The buffer to place the name in. * @see #getName() + * @see #setName */ public void getName (StringBuffer buffer) *************** *** 364,367 **** --- 366,371 ---- * malformed HTML if the assignment string is not <code>null</code>. * @param name The new name. 
+ * @see #getName + * @see #getName(StringBuffer) */ public void setName (String name) *************** *** 375,378 **** --- 379,383 ---- * can include whitespace on either or both sides of an equals sign. * @return The assignment string. + * @see #setAssignment */ public String getAssignment () *************** *** 385,388 **** --- 390,394 ---- * @param buffer The buffer to place the assignment string in. * @see #getAssignment() + * @see #setAssignment */ public void getAssignment (StringBuffer buffer) *************** *** 399,402 **** --- 405,410 ---- * <code>null</code>. * @param assignment The new assignment string. + * @see #getAssignment + * @see #getAssignment(StringBuffer) */ public void setAssignment (String assignment) *************** *** 414,417 **** --- 422,426 ---- * @return The value, or <code>null</code> if it's a stand-alone or * empty attribute, or the text if it's just a whitepace 'attribute'. + * @see #setValue */ public String getValue () *************** *** 424,427 **** --- 433,437 ---- * @param buffer The buffer to place the value in. * @see #getValue() + * @see #setValue */ public void getValue (StringBuffer buffer) *************** *** 439,442 **** --- 449,454 ---- * HTML. * @param value The new value. + * @see #getValue + * @see #getValue(StringBuffer) */ public void setValue (String value) *************** *** 449,452 **** --- 461,465 ---- * @return Either ' or " if the attribute value was quoted, or zero * if there are no quotes around it. + * @see #setQuote */ public char getQuote () *************** *** 459,462 **** --- 472,476 ---- * @param buffer The buffer to place the quote in. * @see #getQuote() + * @see #setQuote */ public void getQuote (StringBuffer buffer) *************** *** 472,475 **** --- 486,491 ---- * whitespace). * @param quote The new quote value. 
+ * @see #getQuote + * @see #getQuote(StringBuffer) */ public void setQuote (char quote) *************** *** 484,487 **** --- 500,504 ---- * @return The value, or <code>null</code> if it's a stand-alone attribute, * or the text if it's just a whitepace 'attribute'. + * @see #setRawValue */ public String getRawValue () *************** *** 517,520 **** --- 534,538 ---- * @param buffer The string buffer to append the attribute value to. * @see #getRawValue() + * @see #setRawValue */ public void getRawValue (StringBuffer buffer) *************** *** 535,538 **** --- 553,558 ---- * double quotes within the string to character references. * @param value The new value. + * @see #getRawValue + * @see #getRawValue(StringBuffer) */ public void setRawValue (String value) Index: StringNodeFactory.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/StringNodeFactory.java,v retrieving revision 1.13 retrieving revision 1.14 diff -C2 -d -r1.13 -r1.14 *** StringNodeFactory.java 10 Apr 2005 23:20:42 -0000 1.13 --- StringNodeFactory.java 15 Nov 2005 02:09:10 -0000 1.14 *************** *** 70,74 **** { /** ! * Flag to tell the parser to decode strings returned by StringNode's toPlainTextString. * Decoding occurs via the method, org.htmlparser.util.Translate.decode() */ --- 70,74 ---- { /** ! * Flag to toggle decoding of strings. * Decoding occurs via the method, org.htmlparser.util.Translate.decode() */ *************** *** 77,91 **** /** ! * Flag to tell the parser to remove escape characters, like \n and \t, returned by StringNode's toPlainTextString. ! * Escape character removal occurs via the method, org.htmlparser.util.ParserUtils.removeEscapeCharacters() */ protected boolean mRemoveEscapes; /** ! * Flag to tell the parser to convert non breaking space (from \u00a0 to a space " "). * If true, this will happen inside StringNode's toPlainTextString. */ protected boolean mConvertNonBreakingSpaces; ! 
public StringNodeFactory () { --- 77,95 ---- /** ! * Flag to toggle removal of escape characters, like \n and \t. ! * Escape character removal occurs via the method, ! * org.htmlparser.util.ParserUtils.removeEscapeCharacters() */ protected boolean mRemoveEscapes; /** ! * Flag to toggle converting non breaking spaces (from \u00a0 to space " "). * If true, this will happen inside StringNode's toPlainTextString. */ protected boolean mConvertNonBreakingSpaces; ! ! /** ! * Create the default string node factory. ! */ public StringNodeFactory () { *************** *** 104,107 **** --- 108,112 ---- * @param start The beginning position of the string. * @param end The ending positiong of the string. + * @return The text node for the page and range given. */ public Text createStringNode (Page page, int start, int end) *************** *** 122,126 **** /** * Set the decoding state. ! * @param decode If <code>true</code>, string nodes decode text using {@link org.htmlparser.util.Translate#decode}. */ public void setDecode (boolean decode) --- 127,133 ---- /** * Set the decoding state. ! * @param decode If <code>true</code>, string nodes decode text using ! * {@link org.htmlparser.util.Translate#decode}. ! * @see #getDecode */ public void setDecode (boolean decode) *************** *** 132,135 **** --- 139,143 ---- * Get the decoding state. * @return <code>true</code> if string nodes decode text. + * @see #setDecode */ public boolean getDecode () *************** *** 140,144 **** /** * Set the escape removing state. ! * @param remove If <code>true</code>, string nodes remove escape characters. */ public void setRemoveEscapes (boolean remove) --- 148,154 ---- /** * Set the escape removing state. ! * @param remove If <code>true</code>, string nodes remove escape ! * characters. ! * @see #getRemoveEscapes */ public void setRemoveEscapes (boolean remove) *************** *** 150,153 **** --- 160,164 ---- * Get the escape removing state. * @return The removing state. 
+ * @see #setRemoveEscapes */ public boolean getRemoveEscapes () *************** *** 158,162 **** /** * Set the non-breaking space replacing state. ! * @param convert If <code>true</code>, string nodes replace ;nbsp; characters with spaces. */ public void setConvertNonBreakingSpaces (boolean convert) --- 169,175 ---- /** * Set the non-breaking space replacing state. ! * @param convert If <code>true</code>, string nodes replace ;nbsp; ! * characters with spaces. ! * @see #getConvertNonBreakingSpaces */ public void setConvertNonBreakingSpaces (boolean convert) *************** *** 168,171 **** --- 181,185 ---- * Get the non-breaking space replacing state. * @return The replacing state. + * @see #setConvertNonBreakingSpaces */ public boolean getConvertNonBreakingSpaces () Index: PrototypicalNodeFactory.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/PrototypicalNodeFactory.java,v retrieving revision 1.17 retrieving revision 1.18 diff -C2 -d -r1.17 -r1.18 *** PrototypicalNodeFactory.java 12 Nov 2005 14:19:51 -0000 1.17 --- PrototypicalNodeFactory.java 15 Nov 2005 02:09:10 -0000 1.18 *************** *** 77,81 **** * This factory uses the prototype pattern to generate new nodes. * These are cloned as needed to form new {@link Text}, {@link Remark} and ! * {@link Tag} nodes.</p> * <p>Text and remark nodes are generated from prototypes accessed * via the {@link #setTextPrototype(Text) textPrototype} and --- 77,81 ---- * This factory uses the prototype pattern to generate new nodes. * These are cloned as needed to form new {@link Text}, {@link Remark} and ! * {@link Tag} nodes. * <p>Text and remark nodes are generated from prototypes accessed * via the {@link #setTextPrototype(Text) textPrototype} and *************** *** 341,344 **** --- 341,345 ---- * Get the object that is cloned to generate text nodes. * @return The prototype for {@link Text} nodes. 
+ * @see #setTextPrototype */ public Text getTextPrototype () *************** *** 352,355 **** --- 353,357 ---- * If <code>null</code> the prototype is set to the default * ({@link TextNode}). + * @see #getTextPrototype */ public void setTextPrototype (Text text) *************** *** 364,367 **** --- 366,370 ---- * Get the object that is cloned to generate remark nodes. * @return The prototype for {@link Remark} nodes. + * @see #setRemarkPrototype */ public Remark getRemarkPrototype () *************** *** 375,378 **** --- 378,382 ---- * If <code>null</code> the prototype is set to the default * ({@link RemarkNode}). + * @see #getRemarkPrototype */ public void setRemarkPrototype (Remark remark) *************** *** 389,392 **** --- 393,397 ---- * specific tag is found in the list of registered tags. * @return The prototype for {@link Tag} nodes. + * @see #setTagPrototype */ public Tag getTagPrototype () *************** *** 402,405 **** --- 407,411 ---- * If <code>null</code> the prototype is set to the default * ({@link TagNode}). + * @see #getTagPrototype */ public void setTagPrototype (Tag tag) |
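The Javadoc in this diff describes a factory that clones registered prototypes to produce new nodes. As a rough, self-contained illustration of that pattern (not the library's code), the sketch below uses hypothetical names, `SketchTag` and `SketchPrototypeFactory`, and assumes that lookups are case-insensitive and that an unregistered name falls back to a generic prototype.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a tag node; not the library's Tag interface.
class SketchTag implements Cloneable {
    final String name;
    String text = "";

    SketchTag(String name) { this.name = name; }

    @Override
    public SketchTag clone() {
        try {
            return (SketchTag) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }
}

// Hypothetical factory: prototypes are cloned, never handed out directly.
class SketchPrototypeFactory {
    private final Map<String, SketchTag> registered = new HashMap<>();
    private SketchTag tagPrototype = new SketchTag("");

    void registerTag(SketchTag prototype) {
        registered.put(prototype.name.toUpperCase(Locale.ROOT), prototype);
    }

    void unregisterTag(SketchTag prototype) {
        registered.remove(prototype.name.toUpperCase(Locale.ROOT));
    }

    // A null argument restores the default prototype, as the Javadoc describes.
    void setTagPrototype(SketchTag prototype) {
        tagPrototype = (prototype == null) ? new SketchTag("") : prototype;
    }

    // Clone the registered prototype, or the generic one if none is registered.
    SketchTag createTag(String name) {
        SketchTag prototype = registered.get(name.toUpperCase(Locale.ROOT));
        SketchTag source = (prototype != null) ? prototype : tagPrototype;
        return source.clone();
    }
}
```

Because every node is a clone, changing a created node leaves the registered prototype untouched, and unregistering a tag only changes which prototype later calls clone.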
From: Derrick O. <der...@us...> - 2005-11-12 16:45:02
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/tagTests In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv13581/tests/tagTests Modified Files: FormTagTest.java LinkTagTest.java Log Message: Update tests for addition of Paragraph tag. Index: FormTagTest.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/tagTests/FormTagTest.java,v retrieving revision 1.46 retrieving revision 1.47 diff -C2 -d -r1.46 -r1.47 *** FormTagTest.java 31 Jul 2004 16:42:31 -0000 1.46 --- FormTagTest.java 12 Nov 2005 16:44:54 -0000 1.47 *************** *** 41,44 **** --- 41,45 ---- import org.htmlparser.tags.LinkTag; import org.htmlparser.tags.OptionTag; + import org.htmlparser.tags.ParagraphTag; import org.htmlparser.tags.SelectTag; import org.htmlparser.tags.TableTag; *************** *** 272,281 **** "</FORM>" ); ! parseAndAssertNodeCount(8); ! assertTrue("Seventh Node is a link",node[6] instanceof LinkTag); ! LinkTag linkTag = (LinkTag)node[6]; assertEquals("Link Text","Yahoo!\n",linkTag.getLinkText()); assertEquals("Link URL","http://www.yahoo.com",linkTag.getLink()); ! assertType("Eigth Node",FormTag.class,node[7]); } --- 273,284 ---- "</FORM>" ); ! parseAndAssertNodeCount(5); ! assertTrue("Fourth Node is a paragraph",node[3] instanceof ParagraphTag); ! ParagraphTag paragraph = (ParagraphTag)node[3]; ! assertTrue("Second Node of paragraph is a link", paragraph.getChildren ().elementAt (1) instanceof LinkTag); ! LinkTag linkTag = (LinkTag)paragraph.getChildren ().elementAt (1); assertEquals("Link Text","Yahoo!\n",linkTag.getLinkText()); assertEquals("Link URL","http://www.yahoo.com",linkTag.getLink()); ! 
assertType("Fifth Node",FormTag.class,node[4]); } *************** *** 321,324 **** --- 324,328 ---- factory.unregisterTag (new HeadTag ()); factory.unregisterTag (new BodyTag ()); + factory.unregisterTag (new ParagraphTag ()); parser.setNodeFactory (factory); i = 0; Index: LinkTagTest.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/tagTests/LinkTagTest.java,v retrieving revision 1.52 retrieving revision 1.53 diff -C2 -d -r1.52 -r1.53 *** LinkTagTest.java 2 Sep 2004 02:28:14 -0000 1.52 --- LinkTagTest.java 12 Nov 2005 16:44:54 -0000 1.53 *************** *** 378,399 **** public void testErroneousLinkBug() throws ParserException { createParser( ! "<p>Site Comments?<br>" + "<a href=\"mailto:sa...@ne...?subject=Site Comments\">" + "Mail Us" + ! "<a>" + ! "</p>" ); ! parseAndAssertNodeCount(6); ! // The first node should be a Tag ! assertTrue("First node should be a Tag",node[0] instanceof Tag); ! // The second node should be a Text ! assertTrue("Second node should be a Text",node[1] instanceof Text); ! Text stringNode = (Text)node[1]; assertEquals("Text of the Text","Site Comments?",stringNode.getText()); ! assertTrue("Third node should be a tag",node[2] instanceof Tag); ! assertTrue("Fourth node should be a link",node[3] instanceof LinkTag); // LinkScanner.evaluate() says no HREF means it isn't a link: ! assertTrue("Fifth node should be a tag",node[4] instanceof Tag); ! assertTrue("Sixth node should be a tag",node[5] instanceof Tag); } --- 378,395 ---- public void testErroneousLinkBug() throws ParserException { createParser( ! "Site Comments?<br>" + "<a href=\"mailto:sa...@ne...?subject=Site Comments\">" + "Mail Us" + ! "<a>" ); ! parseAndAssertNodeCount(4); ! // The first node should be a Text ! assertTrue("First node should be a Text",node[0] instanceof Text); ! Text stringNode = (Text)node[0]; assertEquals("Text of the Text","Site Comments?",stringNode.getText()); ! 
assertTrue("Second node should be a tag",node[1] instanceof Tag); ! assertTrue("Third node should be a link",node[2] instanceof LinkTag); // LinkScanner.evaluate() says no HREF means it isn't a link: ! assertTrue("Fourth node should be a tag",node[3] instanceof Tag); } |
From: Derrick O. <der...@us...> - 2005-11-12 16:45:02
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/lexerTests
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv13581/tests/lexerTests
Modified Files:
    LexerTests.java
Log Message:
Update tests for addition of Paragraph tag.
Index: LexerTests.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/lexerTests/LexerTests.java,v
retrieving revision 1.26
retrieving revision 1.27
diff -C2 -d -r1.26 -r1.27
*** LexerTests.java 19 Sep 2005 02:35:05 -0000 1.26
--- LexerTests.java 12 Nov 2005 16:44:54 -0000 1.27
***************
*** 617,620 ****
--- 617,621 ----
          mAcceptable.add ("UL");
          mAcceptable.add ("LI");
+         mAcceptable.add ("IFRAME");
      }
|
From: Derrick O. <der...@us...> - 2005-11-12 15:11:53
|
Update of /cvsroot/htmlparser/htmlparser/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv28975/htmlparser/docs Modified Files: changes.txt release.txt Log Message: Update version to 1.5-20051112. Index: release.txt =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/release.txt,v retrieving revision 1.71 retrieving revision 1.72 diff -C2 -d -r1.71 -r1.72 *** release.txt 25 Sep 2005 13:23:00 -0000 1.71 --- release.txt 12 Nov 2005 15:11:45 -0000 1.72 *************** *** 1,3 **** ! HTMLParser Version 1.6 (Integration Build Sep 25, 2005) ********************************************* --- 1,3 ---- ! HTMLParser Version 1.6 (Integration Build Nov 12, 2005) ********************************************* *************** *** 28,31 **** --- 28,41 ---- ------------------------- + New Functionality + ----------------- + Support has been added for commonly requested composite tags, P and H1-H6. + Definition list tags (dl, dt, dd), are also now included in the standard + set of tags recognized by the parser. + The node interface has been augmented with get first/last child and + get previous/next sibling methods to ease traversing the HTML document. + The TextNode class has an added isWhiteSpace method that returns true + when it contains no printable characters. 
+ Refactoring ----------- *************** *** 37,42 **** --- 47,59 ---- Bug Fixes --------- + #1344687 A bug when set cookies + #1334408 Exception occurs based on string length + #1322686 when illegal charset specified #1227213 Particular SCRIPT tags close too late + Patches + ------- + #1338534 Support get first/last child, previous/next sibling + Changes since Version 1.4 ------------------------- Index: changes.txt =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/changes.txt,v retrieving revision 1.207 retrieving revision 1.208 diff -C2 -d -r1.207 -r1.208 *** changes.txt 25 Sep 2005 13:23:00 -0000 1.207 --- changes.txt 12 Nov 2005 15:11:45 -0000 1.208 *************** *** 16,19 **** --- 16,82 ---- ******************************************************************************* + Integration Build 1.6 - 20051112 + -------------------------------- + 2005-11-12 09:19 derrickoswald + + * src/org/htmlparser/http/ConnectionManager.java, + docs/contributors.html, + src/org/htmlparser/PrototypicalNodeFactory.java: + + Add cookie processing changes suggested by Marcus Mattern. + + 2005-11-04 10:49 ian_macfarlane + + * src/org/htmlparser/nodes/TextNode.java: + + Add method isWhiteSpace to TextNode that returns if the node is nothing + but white space (or null) or if it contains some characters. + + 2005-11-01 03:55 ian_macfarlane + + * src/org/htmlparser/nodeDecorators/AbstractNodeDecorator.java: + + Add methods first/last child previous/next sibling added to AbstractNode. + This is required to enable the project to compile. + + 2005-10-31 11:26 ian_macfarlane + + * src/org/htmlparser/: PrototypicalNodeFactory.java, + tags/DefinitionList.java, tags/DefinitionListBullet.java, + tags/HeadingTag.java, tags/ParagraphTag.java, + tags/TableColumn.java, tags/TableHeader.java, tags/TableRow.java: + + Added support for P and h1-h6 tags. + Added support for definition list tags (dl, dt, dd). 
+ Let table row/column tags know when to close if encounter TBODY/TFOOT/THEAD. + + 2005-10-26 18:01 derrickoswald + + * docs/contributors.html, src/org/htmlparser/Node.java, + src/org/htmlparser/nodes/AbstractNode.java: + + Incorporate patch #1338534 Support get first/last child, previous/next sibling + from Ian Macfarlane. No unit tests. + + 2005-10-24 22:06 derrickoswald + + * src/org/htmlparser/: lexer/Page.java, tags/MetaTag.java: + + Fix bug 1322686 when illegal charset specified + Use current source charset as the default if there is already a source. + + 2005-10-24 21:26 derrickoswald + + * src/org/htmlparser/lexer/InputStreamSource.java: + + Fixed bug #1334408 Exception occurs based on string length + Changed >= test to > to avoid off-by-one error. + + 2005-09-25 21:01 derrickoswald + + * build.xml: + + Fix htmlparser target. + Integration Build 1.6 - 20050925 -------------------------------- |
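The change log above mentions first/last child and previous/next sibling accessors from patch #1338534. The following is a minimal sketch of how such accessors can sit on top of a parent's child list. `SketchNode` is a hypothetical class written for this note, and returning `null` at the edges of the tree is an assumption rather than something the log states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree node; the real Node interface has many more members.
class SketchNode {
    final String label;
    SketchNode parent;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String label) { this.label = label; }

    SketchNode add(SketchNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    SketchNode getFirstChild() {
        return children.isEmpty() ? null : children.get(0);
    }

    SketchNode getLastChild() {
        return children.isEmpty() ? null : children.get(children.size() - 1);
    }

    SketchNode getNextSibling() {
        if (parent == null) return null;
        int i = indexInParent();
        return (i + 1 < parent.children.size()) ? parent.children.get(i + 1) : null;
    }

    SketchNode getPreviousSibling() {
        if (parent == null) return null;
        int i = indexInParent();
        return (i > 0) ? parent.children.get(i - 1) : null;
    }

    // Identity search, so two equal-looking nodes are never confused.
    private int indexInParent() {
        for (int i = 0; i < parent.children.size(); i++)
            if (parent.children.get(i) == this) return i;
        throw new IllegalStateException("node not found in its parent");
    }

    // Walk the children left to right using only the accessors above.
    static String childLabels(SketchNode node) {
        StringBuilder sb = new StringBuilder();
        for (SketchNode c = node.getFirstChild(); c != null; c = c.getNextSibling())
            sb.append(c.label);
        return sb.toString();
    }
}
```

With these accessors a caller can iterate over a node's children without touching the underlying list, which is the traversal convenience the release notes describe.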
From: Derrick O. <der...@us...> - 2005-11-12 15:11:53
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv28975/htmlparser/src/org/htmlparser
Modified Files:
    Parser.java
Log Message:
Update version to 1.5-20051112.
Index: Parser.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Parser.java,v
retrieving revision 1.108
retrieving revision 1.109
diff -C2 -d -r1.108 -r1.109
*** Parser.java 25 Sep 2005 13:23:00 -0000 1.108
--- Parser.java 12 Nov 2005 15:11:45 -0000 1.109
***************
*** 133,137 ****
       */
      public static final String
!     VERSION_DATE = "Sep 25, 2005"
      ;
--- 133,137 ----
       */
      public static final String
!     VERSION_DATE = "Nov 12, 2005"
      ;
|
From: Derrick O. <der...@us...> - 2005-11-12 14:19:59
|
Update of /cvsroot/htmlparser/htmlparser/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20699/docs Modified Files: contributors.html Log Message: Add cookie processing changes suggested by Marcus Mattern. Index: contributors.html =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/contributors.html,v retrieving revision 1.20 retrieving revision 1.21 diff -C2 -d -r1.20 -r1.21 *** contributors.html 26 Oct 2005 22:01:23 -0000 1.20 --- contributors.html 12 Nov 2005 14:19:51 -0000 1.21 *************** *** 396,404 **** </tr> </table> ! <p>Thanks to Ian Macfarlane, Keiron McCammon, Martin Hudson, Matthew Buckett, ! Jamie McCrindle, John Derrick, David Andersen, Manuel Polo, Enrico Triolo, ! Gernot Fricke, Nick Burch, Stephen Harrington, Domenico Lordi, Kamen, ! John Zook, Cheng Jun, Mazlan Mat, Rob Shields, Wolfgang Germund, Raj Sharma, ! Robert Kausch, Gordon Deudney, Serge Kruppa, Roger Kjensrud, and Manpreet Singh for suggestions, bug reports and feature ideas. <br> <p>Thanks to Jon Gillette for the cool new logo.<br> --- 396,404 ---- </tr> </table> ! <p>Thanks to Marcus Mattern, Ian Macfarlane, Keiron McCammon, Martin Hudson, ! Matthew Buckett, Jamie McCrindle, John Derrick, David Andersen, Manuel Polo, ! Enrico Triolo, Gernot Fricke, Nick Burch, Stephen Harrington, Domenico Lordi, ! Kamen, John Zook, Cheng Jun, Mazlan Mat, Rob Shields, Wolfgang Germund, ! Raj Sharma, Robert Kausch, Gordon Deudney, Serge Kruppa, Roger Kjensrud, and Manpreet Singh for suggestions, bug reports and feature ideas. <br> <p>Thanks to Jon Gillette for the cool new logo.<br> |
From: Derrick O. <der...@us...> - 2005-11-12 14:19:59
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20699/src/org/htmlparser
Modified Files:
    PrototypicalNodeFactory.java
Log Message:
Add cookie processing changes suggested by Marcus Mattern.
Index: PrototypicalNodeFactory.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/PrototypicalNodeFactory.java,v
retrieving revision 1.16
retrieving revision 1.17
diff -C2 -d -r1.16 -r1.17
*** PrototypicalNodeFactory.java 31 Oct 2005 16:26:11 -0000 1.16
--- PrototypicalNodeFactory.java 12 Nov 2005 14:19:51 -0000 1.17
***************
*** 92,96 ****
   * {@link #setTagPrototype(Tag) tagPrototype} property.</p>
   * <p>The hash table of registered tags can be automatically populated with
!  * all the know tags from the {@link org.htmlparser.tags} package when
   * the factory is constructed, or it can start out empty and be populated
   * explicitly.</p>
--- 92,96 ----
   * {@link #setTagPrototype(Tag) tagPrototype} property.</p>
   * <p>The hash table of registered tags can be automatically populated with
!  * all the known tags from the {@link org.htmlparser.tags} package when
   * the factory is constructed, or it can start out empty and be populated
   * explicitly.</p>
|
From: Derrick O. <der...@us...> - 2005-11-12 14:19:58
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/http In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20699/src/org/htmlparser/http Modified Files: ConnectionManager.java Log Message: Add cookie processing changes suggested by Marcus Mattern. Index: ConnectionManager.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/http/ConnectionManager.java,v retrieving revision 1.6 retrieving revision 1.7 diff -C2 -d -r1.6 -r1.7 *** ConnectionManager.java 20 Jun 2005 01:56:32 -0000 1.6 --- ConnectionManager.java 12 Nov 2005 14:19:50 -0000 1.7 *************** *** 425,428 **** --- 425,429 ---- Vector cookies; Cookie probe; + boolean found; // flag if a cookie with current name is already there if (null != cookie.getDomain ()) *************** *** 434,437 **** --- 435,439 ---- if (null != cookies) { + found = false; for (int j = 0; j < cookies.size (); j++) { *************** *** 443,446 **** --- 445,449 ---- { cookies.setElementAt (cookie, j); // replace + found = true; // cookie found, set flag break; } *************** *** 448,455 **** --- 451,463 ---- { cookies.insertElementAt (cookie, j); + found = true; // cookie found, set flag break; } } } + if (!found) + // there's no cookie with the current name, therefore it's added + // at the end of the list (faster then inserting at the front) + cookies.addElement (cookie); } else *************** *** 459,463 **** mCookieJar.put (domain, cookies); } - } --- 467,470 ---- |
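The diff above adds a `found` flag so that a cookie matching nothing in a domain's existing list is appended rather than, as it appears, dropped. The sketch below isolates that replace-or-append logic on a plain list. `CookieListSketch` and `SketchCookie` are hypothetical names, the match rule (same name, ignoring case, and same path) is an assumption, and the insert-before branch of the real method is left out because its condition is not visible in the diff.

```java
import java.util.List;

class CookieListSketch {
    // Hypothetical minimal cookie; the real Cookie class carries more fields.
    static final class SketchCookie {
        final String name;
        final String path;
        final String value;

        SketchCookie(String name, String path, String value) {
            this.name = name;
            this.path = path;
            this.value = value;
        }
    }

    // Replace a cookie with the same name and path, otherwise append it.
    static void store(List<SketchCookie> cookies, SketchCookie cookie) {
        boolean found = false; // set once an existing entry has been replaced
        for (int j = 0; j < cookies.size(); j++) {
            SketchCookie probe = cookies.get(j);
            if (probe.name.equalsIgnoreCase(cookie.name)
                    && probe.path.equals(cookie.path)) {
                cookies.set(j, cookie); // replace
                found = true;
                break;
            }
        }
        if (!found)
            cookies.add(cookie); // no match: append at the end of the list
    }
}
```

Without the final `if (!found)` step, a second cookie with a new name for an already-known domain would never reach the list, which matches the symptom in bug #1344687 ("A bug when set cookies") as far as the log describes it.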
From: Ian M. <ian...@us...> - 2005-11-04 15:50:00
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/nodes
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv27141/src/org/htmlparser/nodes
Modified Files:
    TextNode.java
Log Message:
Add method isWhiteSpace to TextNode that returns if the node is nothing but white space (or null) or if it contains some characters.
Index: TextNode.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/nodes/TextNode.java,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** TextNode.java 10 Apr 2005 23:20:44 -0000 1.4
--- TextNode.java 4 Nov 2005 15:49:45 -0000 1.5
***************
*** 211,214 ****
--- 211,225 ----

      /**
+      * Returns if the node consists of only white space.
+      * White space can be spaces, new lines, etc.
+      */
+     public boolean isWhiteSpace()
+     {
+         if (mText == null || mText.trim().equals(""))
+             return true;
+         return false;
+     }
+
+     /**
       * String visiting code.
       * @param visitor The <code>NodeVisitor</code> object to invoke
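The committed `isWhiteSpace` test relies on `String.trim()`, which strips only characters at or below U+0020. The sketch below applies the same check to a plain `String` to show the consequence: a non-breaking space (U+00A0) counts as content. `WhiteSpaceSketch` and `isWhiteSpaceOrNbsp` are hypothetical names; the second method is an illustrative variant and not part of the commit.

```java
class WhiteSpaceSketch {
    // Same test as the committed method, applied to a plain String.
    static boolean isWhiteSpace(String text) {
        return text == null || text.trim().equals("");
    }

    // Hypothetical variant (not in the commit) that also treats U+00A0 as blank.
    static boolean isWhiteSpaceOrNbsp(String text) {
        return text == null || text.replace('\u00a0', ' ').trim().equals("");
    }
}
```

Callers that want entity-decoded `&nbsp;` text to count as blank would need something like the second form, or would need to convert non-breaking spaces before testing.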