htmlparser-cvs Mailing List for HTML Parser (Page 15)
Brought to you by:
derrickoswald
From: Derrick O. <der...@us...> - 2004-07-02 00:49:39
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/scannersTests
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv32670/src/org/htmlparser/tests/scannersTests

Modified Files:
    CompositeTagScannerTest.java TagScannerTest.java
Log Message:
Part four of a multiphase refactoring.
Most internals now use the Tag interface. This interface has been broadened to add set/get scanner and set/get endtag.
Removed the org.htmlparser.tags.Tag class and moved the remaining (minor) functionality to the TagNode class. So now tags inherit directly from TagNode or CompositeTag.
** NOTE: If you have subclassed org.htmlparser.tags.Tag, use org.htmlparser.nodes.TagNode now.**
Removed deprecated methods getTagBegin/getTagEnd and deleted unused classes: PeekingIterator and its implementation.

Index: TagScannerTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/scannersTests/TagScannerTest.java,v
retrieving revision 1.40
retrieving revision 1.41
diff -C2 -d -r1.40 -r1.41
*** TagScannerTest.java 14 Jan 2004 02:53:47 -0000 1.40
--- TagScannerTest.java 2 Jul 2004 00:49:30 -0000 1.41
***************
*** 27,31 ****
  package org.htmlparser.tests.scannersTests;
! import org.htmlparser.tags.Tag;
  import org.htmlparser.tests.ParserTestCase;
  import org.htmlparser.util.ParserException;
--- 27,31 ----
  package org.htmlparser.tests.scannersTests;
! import org.htmlparser.Tag;
  import org.htmlparser.tests.ParserTestCase;
  import org.htmlparser.util.ParserException;

Index: CompositeTagScannerTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/scannersTests/CompositeTagScannerTest.java,v
retrieving revision 1.60
retrieving revision 1.61
diff -C2 -d -r1.60 -r1.61
*** CompositeTagScannerTest.java 24 May 2004 16:18:33 -0000 1.60
--- CompositeTagScannerTest.java 2 Jul 2004 00:49:30 -0000 1.61
***************
*** 29,32 ****
--- 29,33 ----
  import org.htmlparser.Node;
  import org.htmlparser.PrototypicalNodeFactory;
+ import org.htmlparser.Tag;
  import org.htmlparser.Text;
  import org.htmlparser.nodes.AbstractNode;
***************
*** 38,42 ****
  import org.htmlparser.tags.TableRow;
  import org.htmlparser.tags.TableTag;
- import org.htmlparser.tags.Tag;
  import org.htmlparser.tests.ParserTestCase;
  import org.htmlparser.util.ParserException;
--- 39,42 ----
|
From: Derrick O. <der...@us...> - 2004-07-02 00:49:39
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/parserHelperTests
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv32670/src/org/htmlparser/tests/parserHelperTests

Modified Files:
    CompositeTagScannerHelperTest.java RemarkNodeParserTest.java
Log Message:
Part four of a multiphase refactoring.
Most internals now use the Tag interface. This interface has been broadened to add set/get scanner and set/get endtag.
Removed the org.htmlparser.tags.Tag class and moved the remaining (minor) functionality to the TagNode class. So now tags inherit directly from TagNode or CompositeTag.
** NOTE: If you have subclassed org.htmlparser.tags.Tag, use org.htmlparser.nodes.TagNode now.**
Removed deprecated methods getTagBegin/getTagEnd and deleted unused classes: PeekingIterator and its implementation.

Index: RemarkNodeParserTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/parserHelperTests/RemarkNodeParserTest.java,v
retrieving revision 1.45
retrieving revision 1.46
diff -C2 -d -r1.45 -r1.46
*** RemarkNodeParserTest.java 24 May 2004 16:18:32 -0000 1.45
--- RemarkNodeParserTest.java 2 Jul 2004 00:49:30 -0000 1.46
***************
*** 30,35 ****
  import org.htmlparser.PrototypicalNodeFactory;
  import org.htmlparser.Remark;
  import org.htmlparser.Text;
- import org.htmlparser.tags.Tag;
  import org.htmlparser.tests.ParserTestCase;
  import org.htmlparser.util.ParserException;
--- 30,35 ----
  import org.htmlparser.PrototypicalNodeFactory;
  import org.htmlparser.Remark;
+ import org.htmlparser.Tag;
  import org.htmlparser.Text;
  import org.htmlparser.tests.ParserTestCase;
  import org.htmlparser.util.ParserException;

Index: CompositeTagScannerHelperTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/parserHelperTests/CompositeTagScannerHelperTest.java,v
retrieving revision 1.30
retrieving revision 1.31
diff -C2 -d -r1.30 -r1.31
*** CompositeTagScannerHelperTest.java 2 Jan 2004 16:24:56 -0000 1.30
--- CompositeTagScannerHelperTest.java 2 Jul 2004 00:49:30 -0000 1.31
***************
*** 27,31 ****
  package org.htmlparser.tests.parserHelperTests;
! import org.htmlparser.tags.Tag;
  import org.htmlparser.tests.ParserTestCase;
  import org.htmlparser.util.ParserException;
--- 27,31 ----
  package org.htmlparser.tests.parserHelperTests;
! import org.htmlparser.Tag;
  import org.htmlparser.tests.ParserTestCase;
  import org.htmlparser.util.ParserException;
|
From: Derrick O. <der...@us...> - 2004-07-02 00:49:38
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/filterTests
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv32670/src/org/htmlparser/tests/filterTests

Modified Files:
    FilterTest.java
Log Message:
Part four of a multiphase refactoring.
Most internals now use the Tag interface. This interface has been broadened to add set/get scanner and set/get endtag.
Removed the org.htmlparser.tags.Tag class and moved the remaining (minor) functionality to the TagNode class. So now tags inherit directly from TagNode or CompositeTag.
** NOTE: If you have subclassed org.htmlparser.tags.Tag, use org.htmlparser.nodes.TagNode now.**
Removed deprecated methods getTagBegin/getTagEnd and deleted unused classes: PeekingIterator and its implementation.

Index: FilterTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/filterTests/FilterTest.java,v
retrieving revision 1.6
retrieving revision 1.7
diff -C2 -d -r1.6 -r1.7
*** FilterTest.java 24 May 2004 19:36:23 -0000 1.6
--- FilterTest.java 2 Jul 2004 00:49:29 -0000 1.7
***************
*** 28,31 ****
--- 28,32 ----
  import org.htmlparser.Parser;
+ import org.htmlparser.Tag;
  import org.htmlparser.filters.AndFilter;
  import org.htmlparser.filters.CssSelectorNodeFilter;
***************
*** 42,46 ****
  import org.htmlparser.tags.BodyTag;
  import org.htmlparser.tags.LinkTag;
- import org.htmlparser.tags.Tag;
  import org.htmlparser.tests.ParserTestCase;
  import org.htmlparser.util.NodeIterator;
--- 43,46 ----
|
From: Derrick O. <der...@us...> - 2004-06-26 11:56:18
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/lexerTests
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv3381

Modified Files:
    AttributeTests.java
Log Message:
Add test case for bug #979893 Not Parsing all Attributes.
Not reproducible.

Index: AttributeTests.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/lexerTests/AttributeTests.java,v
retrieving revision 1.15
retrieving revision 1.16
diff -C2 -d -r1.15 -r1.16
*** AttributeTests.java 24 May 2004 16:31:22 -0000 1.15
--- AttributeTests.java 26 Jun 2004 11:56:08 -0000 1.16
***************
*** 35,38 ****
--- 35,39 ----
  import org.htmlparser.lexer.PageAttribute;
  import org.htmlparser.tags.ImageTag;
+ import org.htmlparser.tags.LinkTag;
  import org.htmlparser.tags.Tag;
  import org.htmlparser.tests.ParserTestCase;
***************
*** 748,750 ****
--- 749,777 ----
          assertTrue ("setQuote('\\'') failed", "src='images/third'".equals (src.toString ()));
      }
+
+     /**
+      * see bug #979893 Not Parsing all Attributes
+      */
+     public void testNoSpace () throws ParserException
+     {
+         String id = "A19012_00002";
+         String rawid = "\"" + id + "\"";
+         String cls = "BuyLink";
+         String rawcls = "\"" + cls + "\"";
+         String href = "http://www.someplace.com/buyme.html";
+         String rawhref = "\"" + href + "\"";
+         String html = "<a id=" + rawid + /* no space */ "class=" + rawcls + " href=" + rawhref + ">Pick me.</a>";
+         createParser (html);
+         parseAndAssertNodeCount (1);
+         assertTrue ("Node should be an LinkTag", node[0] instanceof LinkTag);
+         LinkTag link = (LinkTag)node[0];
+         Vector attributes = link.getAttributesEx ();
+         assertEquals ("Incorrect number of attributes", 6, attributes.size ());
+         assertStringEquals ("id wrong", rawid, link.getAttributeEx ("id").getRawValue ());
+         assertStringEquals ("class wrong", rawcls, link.getAttributeEx ("class").getRawValue ());
+         assertStringEquals ("href wrong", rawhref, link.getAttributeEx ("href").getRawValue ());
+         assertStringEquals ("id wrong", id, link.getAttributeEx ("id").getValue ());
+         assertStringEquals ("class wrong", cls, link.getAttributeEx ("class").getValue ());
+         assertStringEquals ("href wrong", href, link.getAttributeEx ("href").getValue ());
+     }
  }
|
From: Derrick O. <der...@us...> - 2004-06-26 11:25:10
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv31782/src/org/htmlparser Modified Files: PrototypicalNodeFactory.java Log Message: Incorporate ObjectTag submitted by Enrico Triolo. Index: PrototypicalNodeFactory.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/PrototypicalNodeFactory.java,v retrieving revision 1.9 retrieving revision 1.10 diff -C2 -d -r1.9 -r1.10 *** PrototypicalNodeFactory.java 14 Jun 2004 01:51:10 -0000 1.9 --- PrototypicalNodeFactory.java 26 Jun 2004 11:25:01 -0000 1.10 *************** *** 61,64 **** --- 61,65 ---- import org.htmlparser.tags.LinkTag; import org.htmlparser.tags.MetaTag; + import org.htmlparser.tags.ObjectTag; import org.htmlparser.tags.OptionTag; import org.htmlparser.tags.ScriptTag; *************** *** 274,277 **** --- 275,279 ---- registerTag (new LinkTag ()); registerTag (new MetaTag ()); + registerTag (new ObjectTag ()); registerTag (new OptionTag ()); registerTag (new ScriptTag ()); |
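The registerTag (new ObjectTag ()) line in the commit above adds one more prototype to the factory. As a rough illustration of that prototype-registration idea, here is a minimal sketch using only JDK classes. The names SketchTag, SketchObjectTag and SketchNodeFactory are hypothetical stand-ins invented for this example, not the library's classes, and the lookup and clone logic is an assumption about how such a registry can work rather than a copy of PrototypicalNodeFactory.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for a tag node; not the library's TagNode. */
class SketchTag implements Cloneable {
    private final String[] ids;

    SketchTag(String... ids) { this.ids = ids; }

    /** Names this prototype should be registered under. */
    String[] getIds() { return ids; }

    @Override
    public SketchTag clone() {
        try {
            // Object.clone() preserves the runtime class of the prototype.
            return (SketchTag) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

/** Hypothetical specialised tag, analogous in spirit to an OBJECT tag class. */
class SketchObjectTag extends SketchTag {
    SketchObjectTag() { super("OBJECT"); }
}

/** Hypothetical registry: maps upper-cased tag names to prototypes. */
class SketchNodeFactory {
    private final Map<String, SketchTag> prototypes = new HashMap<>();

    void registerTag(SketchTag prototype) {
        for (String id : prototype.getIds())
            prototypes.put(id.toUpperCase(Locale.ROOT), prototype);
    }

    /** Clones the registered prototype, or falls back to a generic tag. */
    SketchTag createTag(String name) {
        SketchTag prototype = prototypes.get(name.toUpperCase(Locale.ROOT));
        return prototype == null ? new SketchTag(name) : prototype.clone();
    }
}
```

In this sketch, registering a new tag class is a one-line change at the registration site, which matches the shape of the two-line diff above (one import, one registerTag call).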
From: Derrick O. <der...@us...> - 2004-06-26 11:25:10
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv31782/src/org/htmlparser/tags Added Files: ObjectTag.java Log Message: Incorporate ObjectTag submitted by Enrico Triolo. --- NEW FILE: ObjectTag.java --- // HTMLParser Library $Name: $ - A java-based parser for HTML // http://sourceforge.org/projects/htmlparser // Copyright (C) 2004 Enrico Triolo // // Revision Control Information // // $Source: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags/ObjectTag.java,v $ // $Author: derrickoswald $ // $Date: 2004/06/26 11:25:01 $ // $Revision: 1.1 $ // // This library is free software; you can redistribute it and/or // modify it under the terms of the GNU Lesser General Public // License as published by the Free Software Foundation; either // version 2.1 of the License, or (at your option) any later version. // // This library is distributed in the hope that it will be useful, // but WITHOUT ANY WARRANTY; without even the implied warranty of // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU // Lesser General Public License for more details. // // You should have received a copy of the GNU Lesser General Public // License along with this library; if not, write to the Free Software // Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA // package org.htmlparser.tags; import java.util.Enumeration; import java.util.Hashtable; import java.util.Vector; import org.htmlparser.Node; import org.htmlparser.nodes.TextNode; import org.htmlparser.Attribute; import org.htmlparser.util.NodeList; import org.htmlparser.util.SimpleNodeIterator; /** * ObjectTag represents an <Object> tag. * It extends a basic tag by providing accessors to the * type, codetype, codebase, classid, data, height, width, standby attributes and parameters. */ public class ObjectTag extends CompositeTag { /** * The set of names handled by this tag. 
*/ private static final String[] mIds = new String[] {"OBJECT"}; /** * The set of end tag names that indicate the end of this tag. */ private static final String[] mEndTagEnders = new String[] {"BODY", "HTML"}; /** * Create a new object tag. */ public ObjectTag () { } /** * Return the set of names handled by this tag. * @return The names to be matched that create tags of this type. */ public String[] getIds () { return (mIds); } /** * Return the set of end tag names that cause this tag to finish. * @return The names of following end tags that stop further scanning. */ public String[] getEndTagEnders () { return (mEndTagEnders); } /** * Extract the object <code>PARAM</code> tags from the child list. * @return The list of object parameters (keys and values are String objects). */ public Hashtable createObjectParamsTable () { NodeList kids; Node node; Tag tag; String paramName; String paramValue; Hashtable ret; ret = new Hashtable (); kids = getChildren (); if (null != kids) for (int i = 0; i < kids.size (); i++) { node = children.elementAt(i); if (node instanceof Tag) { tag = (Tag)node; if (tag.getTagName().equals ("PARAM")) { paramName = tag.getAttribute ("NAME"); if (null != paramName && 0 != paramName.length ()) { paramValue = tag.getAttribute ("VALUE"); ret.put (paramName.toUpperCase(),paramValue); } } } } return (ret); } /** * Get the classid of the object. * @return The value of the <code>CLASSID</code> attribute. */ public String getObjectClassId () { return getAttribute ("CLASSID"); } /** * Get the codebase of the object. * @return The value of the <code>CODEBASE</code> attribute. */ public String getObjectCodeBase () { return getAttribute ("CODEBASE"); } /** * Get the codetype of the object. * @return The value of the <code>CODETYPE</code> attribute. */ public String getObjectCodeType () { return getAttribute ("CODETYPE"); } /** * Get the data of the object. * @return The value of the <code>DATA</code> attribute. 
*/ public String getObjectData () { return getAttribute ("DATA"); } /** * Get the height of the object. * @return The value of the <code>HEIGHT</code> attribute. */ public String getObjectHeight () { return getAttribute ("HEIGHT"); } /** * Get the standby of the object. * @return The value of the <code>STANDBY</code> attribute. */ public String getObjectStandby () { return getAttribute ("STANDBY"); } /** * Get the type of the object. * @return The value of the <code>TYPE</code> attribute. */ public String getObjectType () { return getAttribute ("TYPE"); } /** * Get the width of the object. * @return The value of the <code>WIDTH</code> attribute. */ public String getObjectWidth () { return getAttribute ("WIDTH"); } /** * Get the object parameters. * @return The list of parameter values (keys and values are String objects). */ public Hashtable getObjectParams () { return createObjectParamsTable (); } /** * Get the <code>PARAM<code> tag with the given name. * @param key The object parameter name to get. * @return The value of the parameter or <code>null</code> if there is no parameter of that name. */ public String getParameter (String key) { return ((String)(getObjectParams ().get (key.toUpperCase ()))); } /** * Get an enumeration over the (String) parameter names. * @return An enumeration of the <code>PARAM<code> tag <code>NAME<code> attributes. */ public Enumeration getParameterNames () { return getObjectParams ().keys (); } /** * Set the <code>CLASSID<code> attribute. * @param newClassId The new classid. */ public void setObjectClassId (String newClassId) { setAttribute ("CLASSID", newClassId); } /** * Set the <code>CODEBASE<code> attribute. * @param newCodeBase The new codebase. */ public void setObjectCodeBase (String newCodeBase) { setAttribute ("CODEBASE", newCodeBase); } /** * Set the <code>CODETYPE<code> attribute. * @param newCodeType The new codetype. 
*/ public void setObjectCodeType (String newCodeType) { setAttribute ("CODETYPE", newCodeType); } /** * Set the <code>DATA<code> attribute. * @param newData The new data. */ public void setObjectData (String newData) { setAttribute ("DATA", newData); } /** * Set the <code>HEIGHT<code> attribute. * @param newHeight The new height. */ public void setObjectHeight (String newHeight) { setAttribute ("HEIGHT", newHeight); } /** * Set the <code>STANDBY<code> attribute. * @param newStandby The new standby. */ public void setObjectStandby (String newStandby) { setAttribute ("STANDBY", newStandby); } /** * Set the <code>TYPE<code> attribute. * @param newType The new type. */ public void setObjectType (String newType) { setAttribute ("TYPE", newType); } /** * Set the <code>WIDTH<code> attribute. * @param newWidth The new width. */ public void setObjectWidth (String newWidth) { setAttribute ("WIDTH", newWidth); } /** * Set the enclosed <code>PARAM<code> children. * @param newObjectParams The new parameters. */ public void setObjectParams (Hashtable newObjectParams) { NodeList kids; Node node; Tag tag; String paramName; String paramValue; Vector attributes; TextNode string; kids = getChildren (); if (null == kids) kids = new NodeList (); else // erase objectParams from kids for (int i = 0; i < kids.size (); ) { node = kids.elementAt (i); if (node instanceof Tag) if (((Tag)node).getTagName ().equals ("PARAM")) { kids.remove (i); // remove whitespace too if (i < kids.size ()) { node = kids.elementAt (i); if (node instanceof TextNode) { string = (TextNode)node; if (0 == string.getText ().trim ().length ()) kids.remove (i); } } } else i++; else i++; } // add newObjectParams to kids for (Enumeration e = newObjectParams.keys (); e.hasMoreElements (); ) { attributes = new Vector (); // should the tag copy the attributes? 
paramName = (String)e.nextElement (); paramValue = (String)newObjectParams.get (paramName); attributes.addElement (new Attribute ("PARAM", null)); attributes.addElement (new Attribute (" ")); attributes.addElement (new Attribute ("VALUE", paramValue, '"')); attributes.addElement (new Attribute (" ")); attributes.addElement (new Attribute ("NAME", paramName.toUpperCase (), '"')); tag = new Tag (null, 0, 0, attributes); kids.add (tag); } //set kids as new children setChildren (kids); } /** * Output a string representing this object tag. * @return A string showing the contents of the object tag. */ public String toString () { Hashtable parameters; Enumeration params; String paramName; String paramValue; boolean found; Node node; StringBuffer ret; ret = new StringBuffer (500); ret.append ("Object Tag\n"); ret.append ("**********\n"); ret.append ("ClassId = "); ret.append (getObjectClassId ()); ret.append ("\n"); ret.append ("CodeBase = "); ret.append (getObjectCodeBase ()); ret.append ("\n"); ret.append ("CodeType = "); ret.append (getObjectCodeType ()); ret.append ("\n"); ret.append ("Data = "); ret.append (getObjectData ()); ret.append ("\n"); ret.append ("Height = "); ret.append (getObjectHeight ()); ret.append ("\n"); ret.append ("Standby = "); ret.append (getObjectStandby ()); ret.append ("\n"); ret.append ("Type = "); ret.append (getObjectType ()); ret.append ("\n"); ret.append ("Width = "); ret.append (getObjectWidth ()); ret.append ("\n"); parameters = getObjectParams (); params = parameters.keys (); if (null == params) ret.append ("No Params found.\n"); else for (int cnt = 0; params.hasMoreElements (); cnt++) { paramName = (String)params.nextElement (); paramValue = (String)parameters.get (paramName); ret.append (cnt); ret.append (": Parameter name = "); ret.append (paramName); ret.append (", Parameter value = "); ret.append (paramValue); ret.append ("\n"); } found = false; for (SimpleNodeIterator e = children (); e.hasMoreNodes ();) { node = e.nextNode (); if 
(node instanceof Tag) if (((Tag)node).getTagName ().equals ("PARAM")) continue; if (!found) ret.append ("Miscellaneous items :\n"); else ret.append (" "); found = true; ret.append (node.toString ()); } if (found) ret.append ("\n"); ret.append ("End of Object Tag\n"); ret.append ("*****************\n"); return (ret.toString ()); } } |
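A note on the parameter table in the file above: createObjectParamsTable () stores paramValue without a null check, and java.util.Hashtable rejects null keys and values with a NullPointerException. If getAttribute ("VALUE") can return null for a PARAM without a VALUE attribute (an assumption here, not confirmed by the commit), the put would fail. The sketch below shows one possible guard along with the upper-cased key lookup the class uses. ParamTableSketch and its methods are hypothetical names for illustration, and substituting an empty string is only one of several reasonable choices.

```java
import java.util.Hashtable;
import java.util.Locale;

/** Hypothetical helper illustrating an upper-cased, null-safe parameter table. */
class ParamTableSketch {

    /**
     * Builds a table from {name, value} pairs.
     * Pairs with a null or empty name are skipped, mirroring the posted code.
     */
    static Hashtable<String, String> build(String[][] pairs) {
        Hashtable<String, String> table = new Hashtable<>();
        for (String[] pair : pairs) {
            String name = pair[0];
            String value = pair[1];
            if (name == null || name.isEmpty())
                continue;
            // Hashtable.put throws NullPointerException for a null value,
            // so this sketch substitutes an empty string (an assumption).
            table.put(name.toUpperCase(Locale.ROOT), value == null ? "" : value);
        }
        return table;
    }

    /** Case-insensitive lookup: keys are stored upper-cased. */
    static String get(Hashtable<String, String> table, String key) {
        return table.get(key.toUpperCase(Locale.ROOT));
    }
}
```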
From: Derrick O. <der...@us...> - 2004-06-26 11:25:09
|
Update of /cvsroot/htmlparser/htmlparser/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv31782/docs Modified Files: contributors.html release.txt Log Message: Incorporate ObjectTag submitted by Enrico Triolo. Index: release.txt =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/release.txt,v retrieving revision 1.62 retrieving revision 1.63 diff -C2 -d -r1.62 -r1.63 *** release.txt 14 Jun 2004 01:26:50 -0000 1.62 --- release.txt 26 Jun 2004 11:25:00 -0000 1.63 *************** *** 86,89 **** --- 86,90 ---- [33] Rogers George [34] Jon Gillette + [35] Enrico Triolo If you find any bugs, please go to Index: contributors.html =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/contributors.html,v retrieving revision 1.10 retrieving revision 1.11 diff -C2 -d -r1.10 -r1.11 *** contributors.html 3 Jun 2004 01:18:27 -0000 1.10 --- contributors.html 26 Jun 2004 11:25:00 -0000 1.11 *************** *** 396,400 **** </tr> </table> ! <p>Thanks to Gernot Fricke, Nick Burch, Stephen Harrington, Domenico Lordi, Kamen, John Zook, Cheng Jun, Mazlan Mat, Rob Shields, Wolfgang Germund, Raj Sharma, Robert Kausch, Gordon Deudney, Serge Kruppa, Roger Kjensrud, and Manpreet Singh --- 396,400 ---- </tr> </table> ! <p>Thanks to Enrico Triolo, Gernot Fricke, Nick Burch, Stephen Harrington, Domenico Lordi, Kamen, John Zook, Cheng Jun, Mazlan Mat, Rob Shields, Wolfgang Germund, Raj Sharma, Robert Kausch, Gordon Deudney, Serge Kruppa, Roger Kjensrud, and Manpreet Singh |
From: Derrick O. <der...@us...> - 2004-06-16 02:17:34
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/lexerTests In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv9637/tests/lexerTests Modified Files: TagTests.java Log Message: Fix bug #973137 Double-bytes characters are messed after parsing. Add an encoding parameter to the static createParser() method. Index: TagTests.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/lexerTests/TagTests.java,v retrieving revision 1.9 retrieving revision 1.10 diff -C2 -d -r1.9 -r1.10 *** TagTests.java 2 Jan 2004 16:24:56 -0000 1.9 --- TagTests.java 16 Jun 2004 02:17:26 -0000 1.10 *************** *** 359,363 **** this.id = id; this.max = max; ! this.parser = Parser.createParser(testHtml); } --- 359,363 ---- this.id = id; this.max = max; ! this.parser = Parser.createParser(testHtml, null); } |
From: Derrick O. <der...@us...> - 2004-06-16 02:17:34
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv9637/tests Modified Files: ParserTestCase.java InstanceofPerformanceTest.java Log Message: Fix bug #973137 Double-bytes characters are messed after parsing. Add an encoding parameter to the static createParser() method. Index: ParserTestCase.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/ParserTestCase.java,v retrieving revision 1.47 retrieving revision 1.48 diff -C2 -d -r1.47 -r1.48 *** ParserTestCase.java 2 Jun 2004 22:47:21 -0000 1.47 --- ParserTestCase.java 16 Jun 2004 02:17:26 -0000 1.48 *************** *** 233,238 **** actual = removeEscapeCharacters(actual); ! Parser expectedParser = Parser.createParser(expected); ! Parser resultParser = Parser.createParser(actual); NodeIterator expectedIterator = expectedParser.elements(); --- 233,238 ---- actual = removeEscapeCharacters(actual); ! Parser expectedParser = Parser.createParser(expected, null); ! Parser resultParser = Parser.createParser(actual, null); NodeIterator expectedIterator = expectedParser.elements(); Index: InstanceofPerformanceTest.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/InstanceofPerformanceTest.java,v retrieving revision 1.20 retrieving revision 1.21 diff -C2 -d -r1.20 -r1.21 *** InstanceofPerformanceTest.java 2 Jan 2004 16:24:55 -0000 1.20 --- InstanceofPerformanceTest.java 16 Jun 2004 02:17:26 -0000 1.21 *************** *** 59,63 **** Parser parser = Parser.createParser( ! FORM_HTML ); NodeIterator e = parser.elements(); --- 59,64 ---- Parser parser = Parser.createParser( ! FORM_HTML, ! null ); NodeIterator e = parser.elements(); |
From: Derrick O. <der...@us...> - 2004-06-16 02:17:34
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/visitorsTests In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv9637/tests/visitorsTests Modified Files: UrlModifyingVisitorTest.java Log Message: Fix bug #973137 Double-bytes characters are messed after parsing. Add an encoding parameter to the static createParser() method. Index: UrlModifyingVisitorTest.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/visitorsTests/UrlModifyingVisitorTest.java,v retrieving revision 1.16 retrieving revision 1.17 diff -C2 -d -r1.16 -r1.17 *** UrlModifyingVisitorTest.java 2 Jan 2004 16:24:57 -0000 1.16 --- UrlModifyingVisitorTest.java 16 Jun 2004 02:17:26 -0000 1.17 *************** *** 57,61 **** public void testUrlModificationWithVisitor() throws Exception { ! Parser parser = Parser.createParser(HTML_WITH_LINK); UrlModifyingVisitor visitor = new UrlModifyingVisitor(parser, "localhost://"); --- 57,61 ---- public void testUrlModificationWithVisitor() throws Exception { ! Parser parser = Parser.createParser(HTML_WITH_LINK, null); UrlModifyingVisitor visitor = new UrlModifyingVisitor(parser, "localhost://"); |
From: Derrick O. <der...@us...> - 2004-06-16 02:17:34
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv9637

Modified Files:
    Parser.java
Log Message:
Fix bug #973137 Double-bytes characters are messed after parsing.
Add an encoding parameter to the static createParser() method.

Index: Parser.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Parser.java,v
retrieving revision 1.93
retrieving revision 1.94
diff -C2 -d -r1.93 -r1.94
*** Parser.java 14 Jun 2004 01:26:51 -0000 1.93
--- Parser.java 16 Jun 2004 02:17:25 -0000 1.94
***************
*** 27,33 ****
--- 27,35 ----
  package org.htmlparser;
+ import java.io.ByteArrayInputStream;
  import java.io.File;
  import java.io.IOException;
  import java.io.Serializable;
+ import java.io.UnsupportedEncodingException;
  import java.net.MalformedURLException;
  import java.net.URL;
***************
*** 792,807 ****
      /**
       * Creates the parser on an input string.
!      * @param inputHTML
!      * @return Parser
       */
!     public static Parser createParser(String inputHTML)
      {
!         Lexer lexer;
          Parser ret;
!         if (null == inputHTML)
              throw new IllegalArgumentException ("html cannot be null");
!         lexer = new Lexer (new Page (inputHTML));
!         ret = new Parser (lexer);
          return (ret);
--- 794,828 ----
      /**
       * Creates the parser on an input string.
!      * Uses the character set encoding to create a stream of bytes that is
!      * fed into the parser as if it had come off the wire.
!      * @param html The string containing HTML.
!      * @param charset Character set encoding to use when converting the
!      * <code>html</code> to a stream of bytes. If charset is <code>null</code>
!      * the default character set is used.
!      * @return A parser with the <code>html</code> string as input.
       */
!     public static Parser createParser (String html, String charset)
      {
!         ByteArrayInputStream stream;
          Parser ret;
!         if (null == html)
              throw new IllegalArgumentException ("html cannot be null");
!         if (null == charset)
!             charset = Page.DEFAULT_CHARSET;
!         try
!         {
!             stream = new ByteArrayInputStream (html.getBytes (charset));
!             ret = new Parser (new Lexer (new Page (stream, charset)));
!         }
!         catch (UnsupportedEncodingException uee)
!         {
!             String msg;
!
!             msg = uee.getMessage ();
!             if (null == msg)
!                 msg = "unsupported encoding (" + charset + ") exception";
!             ret = new Parser (new Lexer (new Page (msg)));
!         }
          return (ret);
|
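The new createParser (String, String) above encodes the string to bytes with the given charset and hands the byte stream on with the same charset. The sketch below illustrates why the two charsets have to agree, using only JDK classes. It does not use the library's Page or Lexer; CharsetRoundTripSketch and roundTrip are hypothetical names for this example, and the decoding step stands in for whatever the parser does internally.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;

/** Hypothetical demo of encoding a string to bytes and decoding it again. */
class CharsetRoundTripSketch {

    /**
     * Encodes html with one charset, then decodes the bytes with another,
     * as if the bytes had arrived over the wire with a declared encoding.
     */
    static String roundTrip(String html, String encodeWith, String decodeWith) {
        try {
            // Same first step as the commit: String -> bytes -> stream.
            ByteArrayInputStream stream =
                new ByteArrayInputStream(html.getBytes(encodeWith));
            StringBuilder out = new StringBuilder();
            try (Reader reader = new InputStreamReader(stream, decodeWith)) {
                int c;
                while ((c = reader.read()) != -1)
                    out.append((char) c);
            }
            return out.toString();
        } catch (IOException e) {
            // Covers UnsupportedEncodingException for unknown charset names.
            throw new UncheckedIOException(e);
        }
    }
}
```

When encodeWith and decodeWith match (for example both "UTF-8"), multi-byte characters survive the trip. When they differ (for example "UTF-8" then "ISO-8859-1"), each byte is decoded as a separate character, which is the kind of garbling the bug title describes.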
From: Derrick O. <der...@us...> - 2004-06-14 01:51:24
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv32349

Modified Files:
    PrototypicalNodeFactory.java
Log Message:
Fix Remark creation booboo.

Index: PrototypicalNodeFactory.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/PrototypicalNodeFactory.java,v
retrieving revision 1.8
retrieving revision 1.9
diff -C2 -d -r1.8 -r1.9
*** PrototypicalNodeFactory.java 14 Jun 2004 00:06:51 -0000 1.8
--- PrototypicalNodeFactory.java 14 Jun 2004 01:51:10 -0000 1.9
***************
*** 382,388 ****
          {
              ret = (Remark)(getRemarkPrototype ().clone ());
! //            if (ret instanceof AbstractNode)
! //                ((AbstractNode)ret).setPage (page);
! //            else
              {
                  first = start + 4; // <!--
--- 382,388 ----
          {
              ret = (Remark)(getRemarkPrototype ().clone ());
!             if (ret instanceof AbstractNode)
!                 ((AbstractNode)ret).setPage (page);
!             else
              {
                  first = start + 4; // <!--
|
Update of /cvsroot/htmlparser/htmlparser/docs/wiki/index.php In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10080/docs/wiki/index.php Modified Files: Benchmarks BlockFeedback CollectingParameter CompositePattern CustomTagExtraction CustomTagLinks CustomVisitorLinks EmailExtraction EnableFeedback ExternalIterators FactoryMethod FeedbackMechanism FilterLinks FrequentlyAskedQuestions HomePage ImageExtraction InternalIterators IteratorPattern JavaBeans LexerLinks LinkBeanLinks LinkExtraction ParserDesign PatternStories PostOperation RSSFeeds ReverseHtml SamplePrograms SearchingForData SomikRaha StrategyPattern StringExtraction TemplateMethod TestDrivenDevelopment UsingCookiesWithParser VisitorLinks VisitorPattern WebCrawler WebRipper WritingYourOwnScanners Log Message: Update version to 1.5-20040613 Index: CustomTagLinks =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/CustomTagLinks,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** CustomTagLinks 30 May 2004 01:43:56 -0000 1.1 --- CustomTagLinks 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 90,94 **** </td><td> ! <span class="debug">Page Execution took 0.332 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 90,94 ---- </td><td> ! <span class="debug">Page Execution took 0.225 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: ReverseHtml =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/ReverseHtml,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** ReverseHtml 30 May 2004 01:43:56 -0000 1.1 --- ReverseHtml 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 102,106 **** </td><td> ! 
<span class="debug">Page Execution took 0.421 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 102,106 ---- </td><td> ! <span class="debug">Page Execution took 0.254 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: PatternStories =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/PatternStories,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** PatternStories 30 May 2004 01:43:56 -0000 1.1 --- PatternStories 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 72,76 **** </td><td> ! <span class="debug">Page Execution took 0.267 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 72,76 ---- </td><td> ! <span class="debug">Page Execution took 0.273 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: StringExtraction =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/StringExtraction,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** StringExtraction 30 May 2004 01:43:56 -0000 1.1 --- StringExtraction 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 92,96 **** </td><td> ! <span class="debug">Page Execution took 0.286 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 92,96 ---- </td><td> ! <span class="debug">Page Execution took 0.232 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! 
icons from "hanging off the bottom of the scree" --> Index: LinkExtraction =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/LinkExtraction,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** LinkExtraction 30 May 2004 01:43:56 -0000 1.1 --- LinkExtraction 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 72,76 **** </td><td> ! <span class="debug">Page Execution took 0.426 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 72,76 ---- </td><td> ! <span class="debug">Page Execution took 0.305 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: ExternalIterators =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/ExternalIterators,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** ExternalIterators 30 May 2004 01:43:56 -0000 1.1 --- ExternalIterators 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 73,77 **** </td><td> ! <span class="debug">Page Execution took 0.225 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 73,77 ---- </td><td> ! <span class="debug">Page Execution took 0.389 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: PostOperation =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/PostOperation,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** PostOperation 30 May 2004 01:43:56 -0000 1.1 --- PostOperation 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 168,172 **** </td><td> ! 
<span class="debug">Page Execution took 0.342 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 168,172 ---- </td><td> ! <span class="debug">Page Execution took 0.413 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: SearchingForData =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/SearchingForData,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** SearchingForData 30 May 2004 01:43:56 -0000 1.1 --- SearchingForData 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 140,144 **** </td><td> ! <span class="debug">Page Execution took 0.257 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 140,144 ---- </td><td> ! <span class="debug">Page Execution took 0.261 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: RSSFeeds =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/RSSFeeds,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** RSSFeeds 30 May 2004 01:43:56 -0000 1.1 --- RSSFeeds 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 203,207 **** </td><td> ! <span class="debug">Page Execution took 0.265 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 203,207 ---- </td><td> ! <span class="debug">Page Execution took 0.302 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! 
icons from "hanging off the bottom of the scree" --> Index: WebCrawler =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/WebCrawler,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** WebCrawler 30 May 2004 01:43:56 -0000 1.1 --- WebCrawler 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 190,194 **** </td><td> ! <span class="debug">Page Execution took 0.283 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 190,194 ---- </td><td> ! <span class="debug">Page Execution took 0.291 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: EnableFeedback =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/EnableFeedback,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** EnableFeedback 30 May 2004 01:43:56 -0000 1.1 --- EnableFeedback 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 82,86 **** </td><td> ! <span class="debug">Page Execution took 0.32 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 82,86 ---- </td><td> ! <span class="debug">Page Execution took 0.249 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: CompositePattern =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/CompositePattern,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** CompositePattern 30 May 2004 01:43:56 -0000 1.1 --- CompositePattern 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 64,68 **** </td><td> ! 
<span class="debug">Page Execution took 0.34 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 64,68 ---- </td><td> ! <span class="debug">Page Execution took 0.329 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: CollectingParameter =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/CollectingParameter,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** CollectingParameter 30 May 2004 01:43:56 -0000 1.1 --- CollectingParameter 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 68,72 **** </td><td> ! <span class="debug">Page Execution took 0.242 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 68,72 ---- </td><td> ! <span class="debug">Page Execution took 0.258 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: TestDrivenDevelopment =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/TestDrivenDevelopment,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** TestDrivenDevelopment 30 May 2004 01:43:56 -0000 1.1 --- TestDrivenDevelopment 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 109,113 **** </td><td> ! <span class="debug">Page Execution took 0.357 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 109,113 ---- </td><td> ! <span class="debug">Page Execution took 0.37 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! 
icons from "hanging off the bottom of the scree" --> Index: TemplateMethod =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/TemplateMethod,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** TemplateMethod 30 May 2004 01:43:56 -0000 1.1 --- TemplateMethod 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 76,80 **** </td><td> ! <span class="debug">Page Execution took 0.276 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 76,80 ---- </td><td> ! <span class="debug">Page Execution took 0.238 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: CustomVisitorLinks =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/CustomVisitorLinks,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** CustomVisitorLinks 30 May 2004 01:43:56 -0000 1.1 --- CustomVisitorLinks 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 116,120 **** </td><td> ! <span class="debug">Page Execution took 0.227 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 116,120 ---- </td><td> ! <span class="debug">Page Execution took 0.226 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: FactoryMethod =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/FactoryMethod,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** FactoryMethod 30 May 2004 01:43:56 -0000 1.1 --- FactoryMethod 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 70,74 **** </td><td> ! 
<span class="debug">Page Execution took 0.389 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 70,74 ---- </td><td> ! <span class="debug">Page Execution took 0.251 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: ImageExtraction =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/ImageExtraction,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** ImageExtraction 30 May 2004 01:43:56 -0000 1.1 --- ImageExtraction 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 81,85 **** </td><td> ! <span class="debug">Page Execution took 0.262 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 81,85 ---- </td><td> ! <span class="debug">Page Execution took 0.258 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: JavaBeans =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/JavaBeans,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** JavaBeans 30 May 2004 01:43:56 -0000 1.1 --- JavaBeans 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 111,115 **** </td><td> ! <span class="debug">Page Execution took 0.469 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 111,115 ---- </td><td> ! <span class="debug">Page Execution took 0.351 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! 
icons from "hanging off the bottom of the scree" --> Index: VisitorLinks =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/VisitorLinks,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** VisitorLinks 30 May 2004 01:43:56 -0000 1.1 --- VisitorLinks 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 85,89 **** </td><td> ! <span class="debug">Page Execution took 0.228 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 85,89 ---- </td><td> ! <span class="debug">Page Execution took 0.221 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: SamplePrograms =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/SamplePrograms,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** SamplePrograms 30 May 2004 01:43:56 -0000 1.1 --- SamplePrograms 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 73,77 **** </td><td> ! <span class="debug">Page Execution took 0.29 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 73,77 ---- </td><td> ! <span class="debug">Page Execution took 0.289 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: WritingYourOwnScanners =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/WritingYourOwnScanners,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** WritingYourOwnScanners 30 May 2004 01:43:56 -0000 1.1 --- WritingYourOwnScanners 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 129,133 **** </td><td> ! 
<span class="debug">Page Execution took 0.284 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 129,133 ---- </td><td> ! <span class="debug">Page Execution took 0.32 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: FilterLinks =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/FilterLinks,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** FilterLinks 30 May 2004 01:43:56 -0000 1.1 --- FilterLinks 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 109,113 **** </td><td> ! <span class="debug">Page Execution took 0.25 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 109,113 ---- </td><td> ! <span class="debug">Page Execution took 0.283 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: CustomTagExtraction =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/CustomTagExtraction,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** CustomTagExtraction 30 May 2004 01:43:56 -0000 1.1 --- CustomTagExtraction 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 84,88 **** </td><td> ! <span class="debug">Page Execution took 0.352 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 84,88 ---- </td><td> ! <span class="debug">Page Execution took 0.213 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! 
icons from "hanging off the bottom of the scree" --> Index: HomePage =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/HomePage,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** HomePage 30 May 2004 01:43:56 -0000 1.1 --- HomePage 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 81,85 **** </td><td> ! <span class="debug">Page Execution took 0.327 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 81,85 ---- </td><td> ! <span class="debug">Page Execution took 0.333 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: ParserDesign =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/ParserDesign,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** ParserDesign 30 May 2004 01:43:56 -0000 1.1 --- ParserDesign 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 66,70 **** </td><td> ! <span class="debug">Page Execution took 0.229 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 66,70 ---- </td><td> ! <span class="debug">Page Execution took 0.261 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: WebRipper =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/WebRipper,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** WebRipper 30 May 2004 01:43:56 -0000 1.1 --- WebRipper 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 72,76 **** </td><td> ! <span class="debug">Page Execution took 0.242 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! 
icons from "hanging off the bottom of the scree" --> --- 72,76 ---- </td><td> ! <span class="debug">Page Execution took 0.261 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: VisitorPattern =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/VisitorPattern,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** VisitorPattern 30 May 2004 01:43:56 -0000 1.1 --- VisitorPattern 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 65,69 **** </td><td> ! <span class="debug">Page Execution took 0.474 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 65,69 ---- </td><td> ! <span class="debug">Page Execution took 0.303 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: IteratorPattern =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/IteratorPattern,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** IteratorPattern 30 May 2004 01:43:56 -0000 1.1 --- IteratorPattern 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 66,70 **** </td><td> ! <span class="debug">Page Execution took 0.335 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 66,70 ---- </td><td> ! <span class="debug">Page Execution took 0.288 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! 
icons from "hanging off the bottom of the scree" --> Index: FrequentlyAskedQuestions =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/FrequentlyAskedQuestions,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** FrequentlyAskedQuestions 30 May 2004 01:43:56 -0000 1.1 --- FrequentlyAskedQuestions 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 71,75 **** </td><td> ! <span class="debug">Page Execution took 0.321 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 71,75 ---- </td><td> ! <span class="debug">Page Execution took 0.393 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: FeedbackMechanism =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/FeedbackMechanism,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** FeedbackMechanism 30 May 2004 01:43:56 -0000 1.1 --- FeedbackMechanism 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 21,25 **** <link rel="author" title="The PhpWiki Programming Team" href="http://phpwiki.sourceforge.net/phpwiki/ThePhpWikiProgrammingTeam" /> <link rel="search" title="FindPage" href="FindPage" /> ! <link rel="alternate" title="View Source: FeedbackMechanism" href="FeedbackMechanism?action=viewsource&version=5" /> <link rel="alternate" type="application/rss+xml" title="RSS" href="RecentChanges?format=rss" /> --- 21,25 ---- <link rel="author" title="The PhpWiki Programming Team" href="http://phpwiki.sourceforge.net/phpwiki/ThePhpWikiProgrammingTeam" /> <link rel="search" title="FindPage" href="FindPage" /> ! 
<link rel="alternate" title="View Source: FeedbackMechanism" href="FeedbackMechanism?action=viewsource&version=6" /> <link rel="alternate" type="application/rss+xml" title="RSS" href="RecentChanges?format=rss" /> *************** *** 48,52 **** ! <div class="wikitext"><p><b>Feedback Mechanism</b></p> <p>The parser has a feedback mechanism that allows you to obtain feedback about the parsing process. You can get to know if there were any errors, or any warnings, or any general information. Warnings occur when the parser has encountered dirty html, but was able to fix it and continue. Errors occur when the parser was not able to handle the html.</p> <p>An understanding of the feedback mechanism is useful if you wish to perform logging, or turn off the default feedback and incorporate your own.</p> --- 48,54 ---- ! <div class="wikitext"><ul> ! <li>Feedback Mechanism *</li> ! </ul> <p>The parser has a feedback mechanism that allows you to obtain feedback about the parsing process. You can get to know if there were any errors, or any warnings, or any general information. Warnings occur when the parser has encountered dirty html, but was able to fix it and continue. Errors occur when the parser was not able to handle the html.</p> <p>An understanding of the feedback mechanism is useful if you wish to perform logging, or turn off the default feedback and incorporate your own.</p> *************** *** 95,99 **** </td><td> ! <span class="debug">Page Execution took 0.235 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 97,101 ---- </td><td> ! <span class="debug">Page Execution took 0.239 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! 
icons from "hanging off the bottom of the scree" --> Index: SomikRaha =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/SomikRaha,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** SomikRaha 30 May 2004 01:43:56 -0000 1.1 --- SomikRaha 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 65,69 **** </td><td> ! <span class="debug">Page Execution took 0.284 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 65,69 ---- </td><td> ! <span class="debug">Page Execution took 0.48 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: EmailExtraction =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/EmailExtraction,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** EmailExtraction 30 May 2004 01:43:56 -0000 1.1 --- EmailExtraction 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 100,104 **** </td><td> ! <span class="debug">Page Execution took 0.26 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 100,104 ---- </td><td> ! <span class="debug">Page Execution took 0.256 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: BlockFeedback =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/BlockFeedback,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** BlockFeedback 30 May 2004 01:43:56 -0000 1.1 --- BlockFeedback 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 73,77 **** </td><td> ! 
<span class="debug">Page Execution took 0.345 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 73,77 ---- </td><td> ! <span class="debug">Page Execution took 0.241 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: InternalIterators =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/InternalIterators,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** InternalIterators 30 May 2004 01:43:56 -0000 1.1 --- InternalIterators 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 64,68 **** </td><td> ! <span class="debug">Page Execution took 0.237 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 64,68 ---- </td><td> ! <span class="debug">Page Execution took 0.366 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: Benchmarks =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/Benchmarks,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** Benchmarks 30 May 2004 01:43:56 -0000 1.1 --- Benchmarks 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 67,71 **** </td><td> ! <span class="debug">Page Execution took 0.226 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 67,71 ---- </td><td> ! <span class="debug">Page Execution took 0.256 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! 
icons from "hanging off the bottom of the scree" --> Index: LinkBeanLinks =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/LinkBeanLinks,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** LinkBeanLinks 30 May 2004 01:43:56 -0000 1.1 --- LinkBeanLinks 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 77,81 **** </td><td> ! <span class="debug">Page Execution took 0.272 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 77,81 ---- </td><td> ! <span class="debug">Page Execution took 0.269 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: UsingCookiesWithParser =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/UsingCookiesWithParser,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** UsingCookiesWithParser 30 May 2004 01:43:56 -0000 1.1 --- UsingCookiesWithParser 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 184,188 **** </td><td> ! <span class="debug">Page Execution took 0.247 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 184,188 ---- </td><td> ! <span class="debug">Page Execution took 0.301 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: LexerLinks =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/LexerLinks,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** LexerLinks 30 May 2004 01:43:56 -0000 1.1 --- LexerLinks 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 93,97 **** </td><td> ! 
<span class="debug">Page Execution took 0.251 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 93,97 ---- </td><td> ! <span class="debug">Page Execution took 0.285 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> Index: StrategyPattern =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.php/StrategyPattern,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** StrategyPattern 30 May 2004 01:43:56 -0000 1.1 --- StrategyPattern 14 Jun 2004 01:26:50 -0000 1.2 *************** *** 64,68 **** </td><td> ! <span class="debug">Page Execution took 0.243 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> --- 64,68 ---- </td><td> ! <span class="debug">Page Execution took 0.255 seconds</span> </td></tr></table> <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" --> |
From: Derrick O. <der...@us...> - 2004-06-14 01:27:00
|
Update of /cvsroot/htmlparser/htmlparser/docs/wiki
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10080/docs/wiki

Modified Files:
  index.html
Log Message:
Update version to 1.5-20040613

Index: index.html
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/docs/wiki/index.html,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** index.html 30 May 2004 01:43:55 -0000 1.1
--- index.html 14 Jun 2004 01:26:50 -0000 1.2
***************
*** 81,85 ****
  </td><td>
! <span class="debug">Page Execution took 0.332 seconds</span>
  </td></tr></table>
  <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" -->
--- 81,85 ----
  </td><td>
! <span class="debug">Page Execution took 0.372 seconds</span>
  </td></tr></table>
  <!-- This keeps the valid XHTML! icons from "hanging off the bottom of the scree" -->
|
From: Derrick O. <der...@us...> - 2004-06-14 01:27:00
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10080/src/org/htmlparser

Modified Files:
  Parser.java
Log Message:
Update version to 1.5-20040613

Index: Parser.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Parser.java,v
retrieving revision 1.92
retrieving revision 1.93
diff -C2 -d -r1.92 -r1.93
*** Parser.java 24 May 2004 16:18:12 -0000 1.92
--- Parser.java 14 Jun 2004 01:26:51 -0000 1.93
***************
*** 86,90 ****
  */
  public final static String
! VERSION_DATE = "May 22, 2004"
  ;
--- 86,90 ----
  */
  public final static String
! VERSION_DATE = "Jun 13, 2004"
  ;
|
From: Derrick O. <der...@us...> - 2004-06-14 01:27:00
|
Update of /cvsroot/htmlparser/htmlparser/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10080/docs Modified Files: changes.txt release.txt Log Message: Update version to 1.5-20040613 Index: release.txt =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/release.txt,v retrieving revision 1.61 retrieving revision 1.62 diff -C2 -d -r1.61 -r1.62 *** release.txt 31 May 2004 22:27:09 -0000 1.61 --- release.txt 14 Jun 2004 01:26:50 -0000 1.62 *************** *** 1,3 **** ! HTMLParser Version 1.5 (Integration Build May 22, 2004) ********************************************* --- 1,3 ---- ! HTMLParser Version 1.5 (Integration Build Jun 13, 2004) ********************************************* *************** *** 29,42 **** Configuration Management Removed the need for the Translate class to be packaged with htmllexer.jar. ! This results in a lighter weight component. Refactoring ! Added Tag interface. Obviated LinkProcessor and moved it's functionality to ! the Page class. Filters ! Added CssSelectorNodeFilter. Enhancement Requests -------------------- 943593 LinkProcessor.extract(link,base) weird behaviour? Bug Fixes --- 29,46 ---- Configuration Management Removed the need for the Translate class to be packaged with htmllexer.jar. ! This results in a lighter weight component. Updated the logo and included ! the LGPL license. Refactoring ! Obviated LinkProcessor and moved it's functionality to the Page class. ! Added Tag, Text and Remark interfaces and moved concrete node ! implementations to the nodes package, removing the lexer.nodes package. Filters ! Added CssSelectorNodeFilter and RegExFilter. Enhancement Requests -------------------- 943593 LinkProcessor.extract(link,base) weird behaviour? 
+ 943197 Accept gzip / deflate content encodings + 874000 Remove specialized tag signatures from NodeVisitor Bug Fixes Index: changes.txt =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/changes.txt,v retrieving revision 1.200 retrieving revision 1.201 diff -C2 -d -r1.200 -r1.201 *** changes.txt 22 May 2004 12:08:57 -0000 1.200 --- changes.txt 14 Jun 2004 01:26:50 -0000 1.201 *************** *** 16,19 **** --- 16,272 ---- ******************************************************************************* + Integration Build 1.5 - 20040613 + -------------------------------- + + 2004-06-13 20:06 derrickoswald + + * src/org/htmlparser/: Node.java, PrototypicalNodeFactory.java, + package.html, nodeDecorators/AbstractNodeDecorator.java, + nodes/AbstractNode.java, nodes/RemarkNode.java, + nodes/TextNode.java, scanners/ScriptScanner.java, + scanners/StyleScanner.java, tests/MemoryTest.java: + + Rework PrototypicalNodeFactory to use interfaces. + + 2004-06-08 06:20 derrickoswald + + * src/org/htmlparser/: lexer/Page.java, + filters/HasParentFilter.java: + + DocComment fix and another getText() signature. + + 2004-06-02 21:20 derrickoswald + + * docs/index.html: + + Allow scrolling left panel. + + 2004-06-02 21:18 derrickoswald + + * docs/: contributors.html, pics/rsf.gif: + + Add Rodney S. Foley's photo. + + 2004-06-02 21:12 derrickoswald + + * resources/logofiles/: htmlparser2in.gif, htmlparser_cmyk.eps, + htmlparser_greyscale.eps, htmlparser_pms.eps, + htmlparser_rgb_2inch.jpg, htmlparser_rgb_5inch.jpg: + + Full set of logo files from Jon Gillette. + + 2004-06-02 18:47 somik + + * src/org/htmlparser/tests/ParserTestCase.java: + + modified to allow usage of assertXmlEquals + + 2004-06-02 18:47 somik + + * .cvsignore: + + added .cvsignore + + 2004-05-31 21:44 derrickoswald + + * docs/contributors.html: + + Add htmlparser.org reference in Rodney S. Foley's writeup. 
+ + 2004-05-31 18:27 derrickoswald + + * docs/: contributors.html, htmlparser.jpg, htmlparserlogo.jpg, + panel.html, release.txt: + + New logo from Jon Gillette. + + 2004-05-29 21:43 derrickoswald + + * build.xml, + src/org/htmlparser/parserapplications/WikiCapturer.java, + docs/wiki/index.html, docs/wiki/index.php/Benchmarks, + docs/wiki/index.php/BlockFeedback, + docs/wiki/index.php/CollectingParameter, + docs/wiki/index.php/CompositePattern, + docs/wiki/index.php/CustomTagExtraction, + docs/wiki/index.php/CustomTagLinks, + docs/wiki/index.php/CustomVisitorLinks, + docs/wiki/index.php/EmailExtraction, + docs/wiki/index.php/EnableFeedback, + docs/wiki/index.php/ExternalIterators, + docs/wiki/index.php/FactoryMethod, + docs/wiki/index.php/FeedbackMechanism, + docs/wiki/index.php/FilterLinks, + docs/wiki/index.php/FrequentlyAskedQuestions, + docs/wiki/index.php/HomePage, docs/wiki/index.php/ImageExtraction, + docs/wiki/index.php/InternalIterators, + docs/wiki/index.php/IteratorPattern, docs/wiki/index.php/JavaBeans, + docs/wiki/index.php/LexerLinks, docs/wiki/index.php/LinkBeanLinks, + docs/wiki/index.php/LinkExtraction, + docs/wiki/index.php/ParserDesign, + docs/wiki/index.php/PatternStories, + docs/wiki/index.php/PostOperation, docs/wiki/index.php/RSSFeeds, + docs/wiki/index.php/ReverseHtml, + docs/wiki/index.php/SamplePrograms, + docs/wiki/index.php/SearchingForData, + docs/wiki/index.php/SomikRaha, docs/wiki/index.php/StrategyPattern, + docs/wiki/index.php/StringExtraction, + docs/wiki/index.php/TemplateMethod, + docs/wiki/index.php/TestDrivenDevelopment, + docs/wiki/index.php/UsingCookiesWithParser, + docs/wiki/index.php/VisitorLinks, + docs/wiki/index.php/VisitorPattern, docs/wiki/index.php/WebCrawler, + docs/wiki/index.php/WebRipper, + docs/wiki/index.php/WritingYourOwnScanners, + docs/wiki/themes/MacOSX/buttons/uww.png, + docs/wiki/themes/MacOSX/buttons/en/BackLinks.png, + docs/wiki/themes/MacOSX/buttons/en/DebugInfo.png, + 
docs/wiki/themes/MacOSX/buttons/en/Diff.png, + docs/wiki/themes/MacOSX/buttons/en/Edit.png, + docs/wiki/themes/MacOSX/buttons/en/FindPage.png, + docs/wiki/themes/MacOSX/buttons/en/LikePages.png, + docs/wiki/themes/MacOSX/buttons/en/PageHistory.png, + docs/wiki/themes/MacOSX/buttons/en/PageInfo.png, + docs/wiki/themes/MacOSX/buttons/en/RecentChanges.png, + docs/wiki/themes/MacOSX/images/http.png, + docs/wiki/themes/MacOSX/images/logo.png, + docs/wiki/themes/default/buttons/vcss.gif, + docs/wiki/themes/default/buttons/vxhtml10.gif: + + Use WikiCapturer to pull Wiki pages locally. + + 2004-05-29 16:40 derrickoswald + + * build.xml, docs/release.txt, resources/license.txt: + + Add LGPL license.txt to the distribution. + + 2004-05-29 15:51 derrickoswald + + * build.xml, resources/inherit.gif: + + Fix javadoc inheritance white background GIF. + + 2004-05-24 15:36 derrickoswald + + * src/org/htmlparser/: tests/filterTests/FilterTest.java, + filters/RegexFilter.java: + + Add regular expression filter. + + 2004-05-24 12:31 derrickoswald + + * src/org/htmlparser/: scanners/package.html, + tests/lexerTests/AttributeTests.java: + + Fix some files misplaced in last refactoring submission. 
+ + 2004-05-24 12:18 derrickoswald + + * build.xml, src/org/htmlparser/AbstractNode.java, + src/org/htmlparser/Attribute.java, + src/org/htmlparser/NodeFactory.java, + src/org/htmlparser/Parser.java, + src/org/htmlparser/PrototypicalNodeFactory.java, + src/org/htmlparser/Remark.java, src/org/htmlparser/RemarkNode.java, + src/org/htmlparser/StringNode.java, + src/org/htmlparser/StringNodeFactory.java, + src/org/htmlparser/Tag.java, src/org/htmlparser/Text.java, + src/org/htmlparser/beans/StringBean.java, + src/org/htmlparser/filters/HasAttributeFilter.java, + src/org/htmlparser/filters/StringFilter.java, + src/org/htmlparser/filters/TagNameFilter.java, + src/org/htmlparser/lexer/Lexer.java, + src/org/htmlparser/lexer/PageAttribute.java, + src/org/htmlparser/lexerapplications/thumbelina/Thumbelina.java, + src/org/htmlparser/nodeDecorators/AbstractNodeDecorator.java, + src/org/htmlparser/nodeDecorators/DecodingNode.java, + src/org/htmlparser/nodeDecorators/EscapeCharacterRemovingNode.java, + src/org/htmlparser/nodeDecorators/NonBreakingSpaceConvertingNode.java, + src/org/htmlparser/scanners/CompositeTagScanner.java, + src/org/htmlparser/scanners/ScriptScanner.java, + src/org/htmlparser/scanners/StyleScanner.java, + src/org/htmlparser/tags/AppletTag.java, + src/org/htmlparser/tags/CompositeTag.java, + src/org/htmlparser/tags/ImageTag.java, + src/org/htmlparser/tags/MetaTag.java, + src/org/htmlparser/tags/Tag.java, + src/org/htmlparser/tests/ParserTest.java, + src/org/htmlparser/tests/ParserTestCase.java, + src/org/htmlparser/tests/filterTests/FilterTest.java, + src/org/htmlparser/tests/lexerTests/AttributeTests.java, + src/org/htmlparser/tests/lexerTests/KitTest.java, + src/org/htmlparser/tests/lexerTests/LexerTests.java, + src/org/htmlparser/tests/parserHelperTests/RemarkNodeParserTest.java, + src/org/htmlparser/tests/parserHelperTests/StringParserTest.java, + src/org/htmlparser/tests/scannersTests/CompositeTagScannerTest.java, + 
src/org/htmlparser/tests/tagTests/BulletListTagTest.java, + src/org/htmlparser/tests/tagTests/CompositeTagTest.java, + src/org/htmlparser/tests/tagTests/FormTagTest.java, + src/org/htmlparser/tests/tagTests/ImageTagTest.java, + src/org/htmlparser/tests/tagTests/LinkTagTest.java, + src/org/htmlparser/tests/tagTests/OptionTagTest.java, + src/org/htmlparser/tests/tagTests/StyleTagTest.java, + src/org/htmlparser/tests/tagTests/TagTest.java, + src/org/htmlparser/tests/utilTests/CharacterTranslationTest.java, + src/org/htmlparser/tests/utilTests/HTMLParserUtilsTest.java, + src/org/htmlparser/tests/utilTests/NodeListTest.java, + src/org/htmlparser/tests/visitorsTests/HtmlPageTest.java, + src/org/htmlparser/tests/visitorsTests/NodeVisitorTest.java, + src/org/htmlparser/util/ParserUtils.java, + src/org/htmlparser/visitors/NodeVisitor.java, + src/org/htmlparser/visitors/StringFindingVisitor.java, + src/org/htmlparser/visitors/TextExtractingVisitor.java, + src/org/htmlparser/visitors/UrlModifyingVisitor.java, + src/org/htmlparser/nodes/AbstractNode.java, + src/org/htmlparser/nodes/RemarkNode.java, + src/org/htmlparser/nodes/TagNode.java, + src/org/htmlparser/nodes/TextNode.java, + src/org/htmlparser/nodes/package.html: + + Part three of a multiphase refactoring. + The three node types are now fronted by interfaces (program to the interface paradigm) + with concrete implementations in the new htmlparser.nodes package. Classes from the + lexer.nodes package are moved to this package, and obvious references to the concrete + classes that got broken by this have been changed to use the interfaces where possible. 
+ + 2004-05-23 20:38 derrickoswald + + * src/org/htmlparser/: AbstractNode.java, Node.java, + RemarkNode.java, StringNode.java, beans/StringBean.java, + filters/StringFilter.java, lexer/Lexer.java, + nodeDecorators/AbstractNodeDecorator.java, tags/ImageTag.java, + tags/LinkTag.java, tags/Tag.java, tags/TitleTag.java, + tests/filterTests/FilterTest.java, + tests/lexerTests/LexerTests.java, + tests/utilTests/NodeListTest.java, + tests/visitorsTests/NodeVisitorTest.java, + tests/visitorsTests/ScriptCommentTest.java, visitors/HtmlPage.java, + visitors/LinkFindingVisitor.java, visitors/NodeVisitor.java, + visitors/ObjectFindingVisitor.java, + visitors/TagFindingVisitor.java, + visitors/TextExtractingVisitor.java, + visitors/UrlModifyingVisitor.java: + + Part two of a multiphase refactoring. Part one added the Tag interface. + This submission eliminates some of the duplication between the lexer.nodes package + and the htmlparser package by removing the tag specific signatures, visitTitleTag, + visitLinkTag and visitImageTag, from the NodeVisitor class. This allows the lexer to + return htmlparser level classes for StringNode and RemarkNode. The TagNode is + still present in the lexer.nodes package, but will move next. + This means that classes derived from NodeVisitor *will not* work using the above + signatures; instead a check for tag class (or name) should be performed in visitTag. + A document will be added to the visitors package with comprehensive porting instructions. + + 2004-05-23 15:42 derrickoswald + + * src/org/htmlparser/lexer/Page.java: + + Incorporate feature request submitted by Bradford A. Folkens + #943197 Accept gzip / deflate content encodings + by setting request property "Accept-Encoding" to "gzip, deflate" in Page.setConnection(), + if possible, and handling those encodings. + No test case added because it needs a specially configured HTTP server. + Integration Build 1.5 - 20040522 -------------------------------- |
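Note on the 2004-05-23 15:42 entry above (#943197, gzip / deflate content encodings): the log says Page.setConnection() sets the "Accept-Encoding" request property to "gzip, deflate" and handles those encodings, but the Page.java code itself is not shown in this message. The following is only a minimal sketch of the general technique using the standard java.util.zip classes; the class name ContentEncodingSketch and its helper methods are hypothetical and are not taken from the HTML Parser sources. It works on in-memory data, so no specially configured HTTP server is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration class; not part of org.htmlparser.
final class ContentEncodingSketch
{
    // Wrap the raw response stream according to the Content-Encoding header.
    // With a real connection the request side would be, before connecting:
    //   connection.setRequestProperty ("Accept-Encoding", "gzip, deflate");
    // and the encoding argument would come from connection.getContentEncoding ().
    static InputStream decode (InputStream raw, String contentEncoding)
        throws IOException
    {
        if (null == contentEncoding)
            return (raw); // no encoding declared: pass the stream through
        String encoding = contentEncoding.trim ().toLowerCase (Locale.ROOT);
        if (encoding.equals ("gzip"))
            return (new GZIPInputStream (raw));
        if (encoding.equals ("deflate"))
            // Assumes zlib-wrapped deflate data; raw deflate streams sent by
            // some servers are not handled in this sketch.
            return (new InflaterInputStream (raw));
        return (raw); // unknown encoding: leave the bytes untouched
    }

    // Test helper: gzip a string so decode () has something to work on.
    static byte[] gzip (String text)
        throws IOException
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream ();
        try (GZIPOutputStream out = new GZIPOutputStream (bytes))
        {
            out.write (text.getBytes (StandardCharsets.UTF_8));
        }
        return (bytes.toByteArray ());
    }

    // Read a stream to the end and decode it as UTF-8 text.
    static String readAll (InputStream in)
        throws IOException
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream ();
        byte[] buffer = new byte[4096];
        int count;
        while (-1 != (count = in.read (buffer)))
            bytes.write (buffer, 0, count);
        return (new String (bytes.toByteArray (), StandardCharsets.UTF_8));
    }

    public static void main (String[] args)
        throws IOException
    {
        byte[] packed = gzip ("<html><body>hello</body></html>");
        InputStream decoded = decode (new ByteArrayInputStream (packed), "gzip");
        System.out.println (readAll (decoded));
    }
}
```

The decision is kept in one place (decode) so that a caller can hand the resulting stream to whatever reads the page, regardless of whether the server honoured the Accept-Encoding request.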
From: Derrick O. <der...@us...> - 2004-06-14 00:07:02
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv3132 Modified Files: Node.java PrototypicalNodeFactory.java package.html Log Message: Rework PrototypicalNodeFactory to use interfaces. Index: package.html =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/package.html,v retrieving revision 1.20 retrieving revision 1.21 diff -C2 -d -r1.20 -r1.21 *** package.html 2 Jan 2004 16:24:52 -0000 1.20 --- package.html 14 Jun 2004 00:06:51 -0000 1.21 *************** *** 29,44 **** --> </head> ! <body bgcolor="white"> ! The basic API classes which will be used by most users when working with the html parser (the Parser class is the most important one in this). ! ! <h2>Related Documentation</h2> ! ! For overviews, tutorials, examples, guides, and tool documentation, please see: ! <ul> ! <li><a href="http://htmlparser.sourceforge.net">HTML Parser Home Page</a> ! </ul> ! ! <!-- Put @see and @since tags down here. --> ! </body> </html> --- 29,59 ---- --> </head> ! <body> ! The basic API classes which will be used by most developers when working with ! the HTML Parser. ! <p>The {@link org.htmlparser.Parser} class is the main high level class that ! provides simplified access to the contents of an HTML page. The page can be ! specified as either a URLConnection or a String. In the case of a String, an ! attempt is made to open it as a URL, and if that fails it assumes it is a local ! disk file. ! A wide range of methods is available to customize the operation of the Parser, ! as well as access specific pieces of the page as ! {@link org.htmlparser.Node Nodes}.</p> ! <p>The {@link org.htmlparser.NodeFactory} interface specifies the requirements ! for a developer to have the Parser or Lexer generate nodes. Three types of ! nodes are required: {@link org.htmlparser.Text}, {@link org.htmlparser.Remark} ! and {@link org.htmlparser.Tag Tags}. 
Tags contain lists ! of child nodes and {@link org.htmlparser.Attribute attributes}.</p> ! <p>The only provided implementation of the NodeFactory interface ! is the {@link org.htmlparser.PrototypicalNodeFactory} which ! operates by holding example nodes and cloning them as needed to satisfy the ! requests for nodes by the Parser. The Lexer is it's own NodeFactory, returning ! new {@link org.htmlparser.nodes.TextNode}, ! {@link org.htmlparser.nodes.RemarkNode} and undifferentiated ! {@link org.htmlparser.nodes.TagNode Tagnodes} (see the ! {@link org.htmlparser.nodes nodes} package).</p> ! <p>The {@link org.htmlparser.NodeFilter} interface is used by the filtering ! code to determine if a node meets a certain criteria. Some generic examples of ! filters can be found in the {@link org.htmlparser.filters filters} package. </body> </html> Index: Node.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Node.java,v retrieving revision 1.49 retrieving revision 1.50 diff -C2 -d -r1.49 -r1.50 *** Node.java 24 May 2004 00:38:15 -0000 1.49 --- Node.java 14 Jun 2004 00:06:51 -0000 1.50 *************** *** 31,67 **** import org.htmlparser.visitors.NodeVisitor; public interface Node { /** ! * Returns a string representation of the node. This is an important method, it allows a simple string transformation ! * of a web page, regardless of a node.<br> ! * Typical application code (for extracting only the text from a web page) would then be simplified to :<br> * <pre> ! * Node node; ! * for (Enumeration e = parser.elements();e.hasMoreElements();) { ! * node = (Node)e.nextElement(); ! * System.out.println(node.toPlainTextString()); // Or do whatever processing you wish with the plain text string ! * } * </pre> */ ! public abstract String toPlainTextString(); /** ! * This method will make it easier when using html parser to reproduce html pages (with or without modifications) ! 
* Applications reproducing html can use this method on nodes which are to be used or transferred as they were ! * recieved, with the original html */ ! public abstract String toHtml(); /** * Return the string representation of the node. ! * Subclasses must define this method, and this is typically to be used in the manner<br> ! * <pre>System.out.println(node)</pre> ! * @return java.lang.String */ ! public abstract String toString(); /** ! * Collect this node and its child nodes (if-applicable) into the collectionList parameter, provided the node * satisfies the filtering criteria.<P> * --- 31,98 ---- import org.htmlparser.visitors.NodeVisitor; + /** + * Specifies the minimum requirements for nodes returned by the Lexer or Parser. + * There are three types of nodes in HTML: text, remarks and tags. You may wish + * to define your own nodes to be returned by the + * {@link org.htmlparser.lexer.Lexer} or {@link Parser}, but each of the types + * must support this interface. + * More specific interface requirements for each of the node types are specified + * by the {@link Text}, {@link Remark} and {@link Tag} interfaces. + */ public interface Node + extends + Cloneable { /** ! * A string representation of the node. ! * This is an important method, it allows a simple string transformation ! * of a web page, regardless of a node. For a Text node this is obviously ! * the textual contents itself. For a Remark node this is the remark ! * contents (sic). For tags this is the text contents of it's children ! * (if any). Because multiple nodes are combined when presenting ! * a page in a browser, this will not reflect what a user would see. ! * See HTML specification section 9.1 White space ! * <a href="http://www.w3.org/TR/html4/struct/text.html#h-9.1"> ! * http://www.w3.org/TR/html4/struct/text.html#h-9.1</a>.<br> ! * Typical application code (for extracting only the text from a web page) ! * would be:<br> * <pre> ! 
* for (Enumeration e = parser.elements (); e.hasMoreElements ();) ! * // or do whatever processing you wish with the plain text string ! * System.out.println ((Node)e.nextElement ()).toPlainTextString ()); * </pre> + * @return The text of this node including it's children. */ ! public abstract String toPlainTextString (); /** ! * Return the HTML for this node. ! * This should be the exact sequence of characters that were encountered by ! * the parser that caused this node to be created. Where this breaks down is ! * where broken nodes (tags and remarks) have been encountered and fixed. ! * Applications reproducing html can use this method on nodes which are to ! * be used or transferred as they were received or created. ! * @return The (exact) sequence of characters that would cause this node ! * to be returned by the parser or lexer. */ ! public abstract String toHtml (); /** * Return the string representation of the node. ! * The return value may not be the entire contents of the node, and non- ! * printable characters may be translated in order to make them visible. ! * This is typically to be used in ! * the manner<br> ! * <pre> ! * System.out.println (node); ! * </pre> ! * or within a debugging environment. ! * @return A string representation of this node suitable for printing, ! * that isn't too large. */ ! public abstract String toString (); /** ! * Collect this node and its child nodes (if applicable) into a list, provided the node * satisfies the filtering criteria.<P> * *************** *** 71,99 **** * get it at the top-level, as many tags (like form tags), can contain * links embedded in them. We could get the links out by checking if the ! * current node is a {@link org.htmlparser.tags.CompositeTag}, and going through its children. ! * So this method provides a convenient way to do this.<P> * * Using collectInto(), programs get a lot shorter. Now, the code to * extract all links from a page would look like: * <pre> ! 
* NodeList collectionList = new NodeList(); * NodeFilter filter = new TagNameFilter ("A"); ! * for (NodeIterator e = parser.elements(); e.hasMoreNodes();) ! * e.nextNode().collectInto(collectionList, filter); * </pre> ! * Thus, collectionList will hold all the link nodes, irrespective of how * deep the links are embedded.<P> * * Another way to accomplish the same objective is: * <pre> ! * NodeList collectionList = new NodeList(); * NodeFilter filter = new TagClassFilter (LinkTag.class); ! * for (NodeIterator e = parser.elements(); e.hasMoreNodes();) ! * e.nextNode().collectInto(collectionList, filter); * </pre> * This is slightly less specific because the LinkTag class may be * registered for more than one node name, e.g. <LINK> tags too. */ ! public abstract void collectInto(NodeList collectionList, NodeFilter filter); /** --- 102,133 ---- * get it at the top-level, as many tags (like form tags), can contain * links embedded in them. We could get the links out by checking if the ! * current node is a {@link org.htmlparser.tags.CompositeTag}, and going ! * through its children. So this method provides a convenient way to do this.<P> * * Using collectInto(), programs get a lot shorter. Now, the code to * extract all links from a page would look like: * <pre> ! * NodeList list = new NodeList (); * NodeFilter filter = new TagNameFilter ("A"); ! * for (NodeIterator e = parser.elements (); e.hasMoreNodes ();) ! * e.nextNode ().collectInto (list, filter); * </pre> ! * Thus, <code>list</code> will hold all the link nodes, irrespective of how * deep the links are embedded.<P> * * Another way to accomplish the same objective is: * <pre> ! * NodeList list = new NodeList (); * NodeFilter filter = new TagClassFilter (LinkTag.class); ! * for (NodeIterator e = parser.elements (); e.hasMoreNodes ();) ! * e.nextNode ().collectInto (list, filter); * </pre> * This is slightly less specific because the LinkTag class may be * registered for more than one node name, e.g. 
<LINK> tags too. + * @param list The list to collect nodes into. + * @param filter The criteria to use when deciding if a node should + * be added to the list. */ ! public abstract void collectInto (NodeList list, NodeFilter filter); /** *************** *** 101,105 **** * <br>deprecated Use {@link #getStartPosition} */ ! public abstract int elementBegin(); /** --- 135,139 ---- * <br>deprecated Use {@link #getStartPosition} */ ! public abstract int elementBegin (); /** *************** *** 107,114 **** * <br>deprecated Use {@link #getEndPosition} */ ! public abstract int elementEnd(); /** * Gets the starting position of the node. * @return The start position. */ --- 141,149 ---- * <br>deprecated Use {@link #getEndPosition} */ ! public abstract int elementEnd (); /** * Gets the starting position of the node. + * This is the character (not byte) offset of this node in the page. * @return The start position. */ *************** *** 123,126 **** --- 158,163 ---- /** * Gets the ending position of the node. + * This is the character (not byte) offset of the character following this + * node in the page. * @return The end position. */ *************** *** 134,138 **** /** ! * Apply the visitor object (of type NodeVisitor) to this node. */ public abstract void accept (NodeVisitor visitor); --- 171,176 ---- /** ! * Apply the visitor to this node. ! * @param visitor The visitor to this node. */ public abstract void accept (NodeVisitor visitor); *************** *** 140,147 **** /** * Get the parent of this node. ! * This will always return null when parsing without scanners, ! * i.e. if semantic parsing was not performed. ! * The object returned from this method can be safely cast to a <code>CompositeTag</code>. ! * @return The parent of this node, if it's been set, <code>null</code> otherwise. */ public abstract Node getParent (); --- 178,188 ---- /** * Get the parent of this node. ! * This will always return null when parsing with the ! * {@link org.htmlparser.lexer.Lexer}. ! 
* Currently, the object returned from this method can be safely cast to a ! * {@link org.htmlparser.tags.CompositeTag}, but this behaviour should not ! * be expected in the future. ! * @return The parent of this node, if it's been set, <code>null</code> ! * otherwise. */ public abstract Node getParent (); *************** *** 149,153 **** /** * Sets the parent of this node. ! * @param node The node that contains this node. Must be a <code>CompositeTag</code>. */ public abstract void setParent (Node node); --- 190,194 ---- /** * Sets the parent of this node. ! * @param node The node that contains this node. */ public abstract void setParent (Node node); *************** *** 155,159 **** /** * Get the children of this node. ! * @return The list of children contained by this node, if it's been set, <code>null</code> otherwise. */ public abstract NodeList getChildren (); --- 196,201 ---- /** * Get the children of this node. ! * @return The list of children contained by this node, if it's been set, ! * <code>null</code> otherwise. */ public abstract NodeList getChildren (); *************** *** 167,172 **** /** * Returns the text of the node. */ ! public String getText(); /** --- 209,216 ---- /** * Returns the text of the node. + * @return The contents of the string or remark node, and in the case of + * a tag, the contents of the tag less the enclosing angle brackets. */ ! public String getText (); /** *************** *** 174,178 **** * @param text The new text for the node. */ ! public void setText(String text); /** --- 218,222 ---- * @param text The new text for the node. */ ! public void setText (String text); /** *************** *** 181,188 **** * bold text on and off. * Only a few tags have semantic meaning to the parser. These have to do ! * with the character set to use (<META>), the base URL to use * (<BASE>). Other than that, the semantic meaning is up to the ! * application and it's custom nodes. */ ! 
public void doSemanticAction () throws ParserException; } --- 225,304 ---- * bold text on and off. * Only a few tags have semantic meaning to the parser. These have to do ! * with the character set to use (<META>) and the base URL to use * (<BASE>). Other than that, the semantic meaning is up to the ! * application and it's custom nodes.<br> ! * The semantic action is performed when the node has been parsed. For ! * composite nodes (those that contain other nodes), the children will have ! * already been parsed and will be available via {@link #getChildren}. */ ! public void doSemanticAction () ! throws ! ParserException; ! ! // ! // Cloneable interface ! // ! ! /** ! * Allow cloning of nodes. ! * Creates and returns a copy of this object. The precise meaning ! * of "copy" may depend on the class of the object. The general ! * intent is that, for any object <tt>x</tt>, the expression: ! * <blockquote> ! * <pre> ! * x.clone() != x</pre></blockquote> ! * will be true, and that the expression: ! * <blockquote> ! * <pre> ! * x.clone().getClass() == x.getClass()</pre></blockquote> ! * will be <tt>true</tt>, but these are not absolute requirements. ! * While it is typically the case that: ! * <blockquote> ! * <pre> ! * x.clone().equals(x)</pre></blockquote> ! * will be <tt>true</tt>, this is not an absolute requirement. ! * <p> ! * By convention, the returned object should be obtained by calling ! * <tt>super.clone</tt>. If a class and all of its superclasses (except ! * <tt>Object</tt>) obey this convention, it will be the case that ! * <tt>x.clone().getClass() == x.getClass()</tt>. ! * <p> ! * By convention, the object returned by this method should be independent ! * of this object (which is being cloned). To achieve this independence, ! * it may be necessary to modify one or more fields of the object returned ! * by <tt>super.clone</tt> before returning it. Typically, this means ! * copying any mutable objects that comprise the internal "deep structure" ! 
* of the object being cloned and replacing the references to these ! * objects with references to the copies. If a class contains only ! * primitive fields or references to immutable objects, then it is usually ! * the case that no fields in the object returned by <tt>super.clone</tt> ! * need to be modified. ! * <p> ! * The method <tt>clone</tt> for class <tt>Object</tt> performs a ! * specific cloning operation. First, if the class of this object does ! * not implement the interface <tt>Cloneable</tt>, then a ! * <tt>CloneNotSupportedException</tt> is thrown. Note that all arrays ! * are considered to implement the interface <tt>Cloneable</tt>. ! * Otherwise, this method creates a new instance of the class of this ! * object and initializes all its fields with exactly the contents of ! * the corresponding fields of this object, as if by assignment; the ! * contents of the fields are not themselves cloned. Thus, this method ! * performs a "shallow copy" of this object, not a "deep copy" operation. ! * <p> ! * The class <tt>Object</tt> does not itself implement the interface ! * <tt>Cloneable</tt>, so calling the <tt>clone</tt> method on an object ! * whose class is <tt>Object</tt> will result in throwing an ! * exception at run time. ! * ! * @return a clone of this instance. ! * @exception CloneNotSupportedException if the object's class does not ! * support the <code>Cloneable</code> interface. Subclasses ! * that override the <code>clone</code> method can also ! * throw this exception to indicate that an instance cannot ! * be cloned. ! * @see java.lang.Cloneable ! */ ! public Object clone () ! throws ! 
CloneNotSupportedException; } Index: PrototypicalNodeFactory.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/PrototypicalNodeFactory.java,v retrieving revision 1.7 retrieving revision 1.8 diff -C2 -d -r1.7 -r1.8 *** PrototypicalNodeFactory.java 24 May 2004 16:18:12 -0000 1.7 --- PrototypicalNodeFactory.java 14 Jun 2004 00:06:51 -0000 1.8 *************** *** 31,34 **** --- 31,35 ---- import java.util.Locale; import java.util.Map; + import java.util.Set; import java.util.Vector; *************** *** 39,42 **** --- 40,44 ---- import org.htmlparser.Text; import org.htmlparser.lexer.Page; + import org.htmlparser.nodes.AbstractNode; import org.htmlparser.nodes.TextNode; import org.htmlparser.nodes.RemarkNode; *************** *** 74,80 **** /** * A node factory based on the prototype pattern. ! * This factory uses the prototype pattern to generate new Tag nodes. * Prototype tags, in the form of undifferentiated tags are held in a hash ! * table. On a */ public class PrototypicalNodeFactory --- 76,91 ---- /** * A node factory based on the prototype pattern. ! * This factory uses the prototype pattern to generate new nodes. ! * It generates generic text and remark nodes from prototypes accessed ! * via the textPrototype and remarkPrototype properties respectively. ! * These are cloned as needed to form new {@link Text} and {@link Remark} nodes. * Prototype tags, in the form of undifferentiated tags are held in a hash ! * table. On a request for a tag, the attributes are examined for the name ! * of the tag and if a prototype of that name is registered, it is cloned ! * and the clone is given the characteristics ! * {@link Attribute Attributes}, start and end position) of the requested tag. ! * If no tag is registered under the needed name, a generic tag is created. ! * Note that in all casses, the {@link Page} property is only set if the node ! * is a subclass of {@link AbstractNode}. 
*/ public class PrototypicalNodeFactory *************** *** 84,88 **** { /** ! * The list of tags to return at the top level. * The list is keyed by tag name. */ --- 95,109 ---- { /** ! * The prototypical text node. ! */ ! protected Text mText; ! ! /** ! * The prototypical remark node. ! */ ! protected Remark mRemark; ! ! /** ! * The list of tags to return. * The list is keyed by tag name. */ *************** *** 90,94 **** /** ! * Create a new factory with all but DOM tags registered. */ public PrototypicalNodeFactory () --- 111,115 ---- /** ! * Create a new factory with all tags registered. */ public PrototypicalNodeFactory () *************** *** 99,106 **** --- 120,131 ---- /** * Create a new factory with no registered tags. + * @param empty If <code>true</code>, creates an empty factory, + * otherwise is equivalent to {@link #PrototypicalNodeFactory()}. */ public PrototypicalNodeFactory (boolean empty) { clear (); + mText = new TextNode (null, 0, 0); + mRemark = new RemarkNode (null, 0, 0); if (!empty) registerTags (); *************** *** 108,112 **** /** ! * Create a new factory with the given tag as the only one registered. */ public PrototypicalNodeFactory (org.htmlparser.tags.Tag tag) --- 133,138 ---- /** ! * Create a new factory with the given tag as the only registered tag. ! * @param tag The single tag to register in the otherwise empty factory. */ public PrototypicalNodeFactory (org.htmlparser.tags.Tag tag) *************** *** 118,121 **** --- 144,148 ---- /** * Create a new factory with the given tags registered. + * @param tags The tags to register in the otherwise empty factory. */ public PrototypicalNodeFactory (org.htmlparser.tags.Tag[] tags) *************** *** 129,137 **** * Adds a tag to the registry. * @param id The name under which to register the tag. ! * @param tag The tag to be returned from a createTag(id) call. ! * @return The tag previously registered with that id, * or <code>null</code> if none. */ ! 
public Tag put (String id, org.htmlparser.tags.Tag tag) { return ((Tag)mBlastocyst.put (id, tag)); --- 156,164 ---- * Adds a tag to the registry. * @param id The name under which to register the tag. ! * @param tag The tag to be returned from a {@link #createTagNode} call. ! * @return The tag previously registered with that id if any, * or <code>null</code> if none. */ ! public Tag put (String id, Tag tag) { return ((Tag)mBlastocyst.put (id, tag)); *************** *** 141,149 **** * Gets a tag from the registry. * @param id The name of the tag to return. ! * @return The tag registered under the id name or <code>null</code> if none. */ ! public org.htmlparser.tags.Tag get (String id) { ! return ((org.htmlparser.tags.Tag)mBlastocyst.get (id)); } --- 168,176 ---- * Gets a tag from the registry. * @param id The name of the tag to return. ! * @return The tag registered under the <code>id</code> name or <code>null</code> if none. */ ! public Tag get (String id) { ! return ((Tag)mBlastocyst.get (id)); } *************** *** 151,159 **** * Remove a tag from the registry. * @param id The name of the tag to remove. ! * @return The tag that was registered with that id. */ ! public org.htmlparser.tags.Tag remove (String id) { ! return ((org.htmlparser.tags.Tag)mBlastocyst.remove (id)); } --- 178,186 ---- * Remove a tag from the registry. * @param id The name of the tag to remove. ! * @return The tag that was registered with that <code>id</code>. */ ! public Tag remove (String id) { ! return ((Tag)mBlastocyst.remove (id)); } *************** *** 166,170 **** --- 193,211 ---- } + /** + * Get the list of tag names. + * @return The names of the tags currently registered. + */ + public Set getTagNames () + { + return (mBlastocyst.keySet ()); + } + /** + * Register a tag. + * Registers the given tag under every id the tag has. + * @param tag The tag to register (subclass of + * {@link org.htmlparser.tags.Tag}). 
+ */ public void registerTag (org.htmlparser.tags.Tag tag) { *************** *** 176,179 **** --- 217,226 ---- } + /** + * Unregister a tag. + * Unregisters the given tag from every id the tag has. + * @param tag The tag to unregister (subclass of + * {@link org.htmlparser.tags.Tag}). + */ public void unregisterTag (org.htmlparser.tags.Tag tag) { *************** *** 185,188 **** --- 232,261 ---- } + /** + * Register a tag. + * Registers the given tag under the tag {@link Tag#getTagName() name}. + * @param tag The tag to register (implements {@link org.htmlparser.Tag}). + */ + public void registerTag (Tag tag) + { + put (tag.getTagName (), tag); + } + + /** + * Unregister a tag. + * Unregisters the given tag from the tag {@link Tag#getTagName() name}. + * @param tag The tag to unregister (implements {@link org.htmlparser.Tag}). + */ + public void unregisterTag (Tag tag) + { + remove (tag.getTagName ()); + } + + /** + * Register all known tags in the tag package. + * Registers tags from the {@link org.htmlparser.tags tag package} by + * calling {@link #registerTag(org.htmlparser.tags.Tag) registerTag()}. + * @return 'this' nodefactory as a convenience. + */ public PrototypicalNodeFactory registerTags () { *************** *** 220,223 **** --- 293,338 ---- } + /** + * Get the object being used to generate text nodes. + * @return The prototype for {@link Text} nodes. + */ + public Text getTextPrototype () + { + return (mText); + } + + /** + * Set the object to be used to generate text nodes. + * @param text The prototype for {@link Text} nodes. + */ + public void setTextPrototype (Text text) + { + if (null == text) + throw new IllegalArgumentException ("text prototype node cannot be null"); + else + mText = text; + } + + /** + * Get the object being used to generate remark nodes. + * @return The prototype for {@link Remark} nodes. + */ + public Remark getRemarkPrototype () + { + return (mRemark); + } + + /** + * Set the object to be used to generate remark nodes. 
+ * @param remark The prototype for {@link Remark} nodes. + */ + public void setRemarkPrototype (Remark remark) + { + if (null == remark) + throw new IllegalArgumentException ("remark prototype node cannot be null"); + else + mRemark = remark; + } + // // NodeFactory interface *************** *** 228,236 **** * @param page The page the node is on. * @param start The beginning position of the string. ! * @param end The ending positiong of the string. */ public Text createStringNode (Page page, int start, int end) { ! return (new TextNode (page, start, end)); } --- 343,368 ---- * @param page The page the node is on. * @param start The beginning position of the string. ! * @param end The ending position of the string. */ public Text createStringNode (Page page, int start, int end) { ! Text ret; ! ! try ! { ! ret = (Text)(getTextPrototype ().clone ()); ! if (ret instanceof AbstractNode) ! ((AbstractNode)ret).setPage (page); ! else ! ret.setText (page.getText (start, end)); ! ret.setStartPosition (start); ! ret.setEndPosition (end); ! } ! catch (CloneNotSupportedException cnse) ! { ! ret = new TextNode (page, start, end); ! } ! ! return (ret); } *************** *** 243,247 **** public Remark createRemarkNode (Page page, int start, int end) { ! return (new RemarkNode (page, start, end)); } --- 375,405 ---- public Remark createRemarkNode (Page page, int start, int end) { ! int first; ! int last; ! Remark ret; ! ! try ! { ! ret = (Remark)(getRemarkPrototype ().clone ()); ! // if (ret instanceof AbstractNode) ! // ((AbstractNode)ret).setPage (page); ! // else ! { ! first = start + 4; // <!-- ! last = end - 3; // --> ! if (first >= last) ! ret.setText (""); ! else ! ret.setText (page.getText (first, last)); ! } ! ret.setStartPosition (start); ! ret.setEndPosition (end); ! } ! catch (CloneNotSupportedException cnse) ! { ! ret = new RemarkNode (page, start, end); ! } ! ! return (ret); } *************** *** 263,268 **** Attribute attribute; String id; ! 
org.htmlparser.tags.Tag prototype; ! org.htmlparser.tags.Tag ret; ret = null; --- 421,426 ---- Attribute attribute; String id; ! Tag prototype; ! Tag ret; ret = null; *************** *** 281,289 **** if (id.endsWith ("/")) id = id.substring (0, id.length () - 1); ! prototype = (org.htmlparser.tags.Tag)mBlastocyst.get (id); if (null != prototype) { ! ret = (org.htmlparser.tags.Tag)prototype.clone (); ! ret.setPage (page); ret.setStartPosition (start); ret.setEndPosition (end); --- 439,448 ---- if (id.endsWith ("/")) id = id.substring (0, id.length () - 1); ! prototype = (Tag)mBlastocyst.get (id); if (null != prototype) { ! ret = (Tag)prototype.clone (); ! if (ret instanceof AbstractNode) ! ((AbstractNode)ret).setPage (page); ret.setStartPosition (start); ret.setEndPosition (end); *************** *** 299,302 **** --- 458,462 ---- } if (null == ret) + // generate a generic node ret = new org.htmlparser.tags.Tag (page, start, end, attributes); |
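The reworked factory above clones a registered prototype when one exists under the requested name and otherwise falls back to a generic tag. A minimal sketch of that lookup-and-clone flow follows. All names here (SketchTag, SketchLinkTag, SketchFactory) are hypothetical stand-ins, not the htmlparser classes, and the page and attribute handling is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a tag node; not the htmlparser Tag interface. */
class SketchTag implements Cloneable {
    final String name;
    int start;
    int end;

    SketchTag(String name) {
        this.name = name;
    }

    @Override
    public SketchTag clone() {
        try {
            // Object.clone keeps the runtime class, so subclasses survive cloning.
            return (SketchTag) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: this class is Cloneable
        }
    }
}

/** Hypothetical specialised tag registered under the name "A". */
class SketchLinkTag extends SketchTag {
    SketchLinkTag() {
        super("A");
    }
}

/** Hypothetical registry keyed by tag name, in the spirit of mBlastocyst. */
class SketchFactory {
    private final Map<String, SketchTag> prototypes = new HashMap<>();

    /** Registers a prototype and returns the one previously registered, if any. */
    SketchTag put(String id, SketchTag tag) {
        return prototypes.put(id, tag);
    }

    /** Clones the registered prototype, or builds a generic tag if none exists. */
    SketchTag createTag(String id, int start, int end) {
        SketchTag prototype = prototypes.get(id);
        SketchTag ret = (prototype != null) ? prototype.clone() : new SketchTag(id);
        ret.start = start;
        ret.end = end;
        return ret;
    }
}
```

Registering a SketchLinkTag under "A" and then asking for an "A" tag yields a fresh SketchLinkTag with the requested positions, while an unregistered name yields a plain SketchTag.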
From: Derrick O. <der...@us...> - 2004-06-14 00:07:01
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv3132/tests

Modified Files:
    MemoryTest.java
Log Message:
Rework PrototypicalNodeFactory to use interfaces.

Index: MemoryTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/MemoryTest.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** MemoryTest.java 22 May 2004 03:57:30 -0000 1.1
--- MemoryTest.java 14 Jun 2004 00:06:52 -0000 1.2
***************
*** 72,76 ****
              fail ("out of memory");
          }
!         assertEquals ("wrong size fetched", size, 4697411);
      }
  
--- 72,76 ----
              fail ("out of memory");
          }
!         assertEquals ("wrong size fetched", 4697411, size);
      }
  
|
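The MemoryTest change above only swaps the last two arguments, which matters because JUnit-style assertEquals treats the first value as the expected one when it reports a failure. The sketch below uses a hypothetical helper, failureMessage, that imitates that style of report without depending on JUnit. The exact wording of a real JUnit message is an assumption here.

```java
class ArgumentOrderSketch {
    /**
     * Hypothetical helper: formats a failure roughly the way a JUnit-style
     * assertEquals (message, expected, actual) would describe a mismatch.
     */
    static String failureMessage(String message, int expected, int actual) {
        return message + " expected:<" + expected + "> but was:<" + actual + ">";
    }

    /** The report produced by the old argument order, with the roles swapped. */
    static String swapped(int size) {
        return failureMessage("wrong size fetched", size, 4697411);
    }

    /** The report produced by the corrected argument order. */
    static String corrected(int size) {
        return failureMessage("wrong size fetched", 4697411, size);
    }
}
```

With the old order, a short fetch would be reported as if the short size were the expected value; the corrected order names 4697411 as the expectation.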
From: Derrick O. <der...@us...> - 2004-06-14 00:07:01
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/scanners In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv3132/scanners Modified Files: ScriptScanner.java StyleScanner.java Log Message: Rework PrototypicalNodeFactory to use interfaces. Index: StyleScanner.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/scanners/StyleScanner.java,v retrieving revision 1.34 retrieving revision 1.35 diff -C2 -d -r1.34 -r1.35 *** StyleScanner.java 24 May 2004 16:18:30 -0000 1.34 --- StyleScanner.java 14 Jun 2004 00:06:52 -0000 1.35 *************** *** 70,80 **** boolean done; int position; ! Text last; Tag end; NodeFactory factory; CompositeTag ret; done = false; ! last = null; end = null; factory = lexer.getNodeFactory (); --- 70,83 ---- boolean done; int position; ! int startpos; ! int endpos; Tag end; NodeFactory factory; + Text content; CompositeTag ret; done = false; ! startpos = lexer.getPosition (); ! endpos = startpos; end = null; factory = lexer.getNodeFactory (); *************** *** 87,91 **** node = lexer.nextNode (true); if (null == node) ! break; else if (node instanceof Tag) --- 90,94 ---- node = lexer.nextNode (true); if (null == node) ! done = true; else if (node instanceof Tag) *************** *** 102,144 **** } else - { // must be a string, even though it looks like a tag ! if (null != last) ! // append it to the previous one ! last.setEndPosition (node.elementEnd ()); ! else ! last = factory.createStringNode (lexer.getPage (), node.elementBegin (), node.elementEnd ()); ! } else if (node instanceof Remark) ! { ! if (null != last) ! last.setEndPosition (node.getEndPosition ()); ! else ! { ! // last = factory.createStringNode (lexer, node.elementBegin (), node.elementEnd ()); ! last = factory.createStringNode (lexer.getPage (), node.elementBegin (), node.elementEnd ()); ! } ! } else // Text ! { ! if (null != last) ! last.setEndPosition (node.getEndPosition ()); ! else ! 
last = (Text)node; ! } } while (!done); ! // build new string tag if required ! if (null == last) ! last = factory.createStringNode (lexer.getPage (), position, position); // build new end tag if required if (null == end) ! end = new Tag (lexer.getPage (), tag.getEndPosition (), tag.getEndPosition (), new Vector ()); ret = (CompositeTag)tag; ret.setEndTag (end); ! ret.setChildren (new NodeList (last)); ! last.setParent (ret); end.setParent (ret); ret.doSemanticAction (); --- 105,126 ---- } else // must be a string, even though it looks like a tag ! endpos = node.getEndPosition (); else if (node instanceof Remark) ! endpos = node.getEndPosition (); else // Text ! endpos = node.getEndPosition (); } while (!done); ! content = factory.createStringNode (lexer.getPage (), startpos, endpos); // build new end tag if required if (null == end) ! end = new Tag (lexer.getPage (), endpos, endpos, new Vector ()); ret = (CompositeTag)tag; ret.setEndTag (end); ! ret.setChildren (new NodeList (content)); ! content.setParent (ret); end.setParent (ret); ret.doSemanticAction (); Index: ScriptScanner.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/scanners/ScriptScanner.java,v retrieving revision 1.57 retrieving revision 1.58 diff -C2 -d -r1.57 -r1.58 *** ScriptScanner.java 24 May 2004 16:18:30 -0000 1.57 --- ScriptScanner.java 14 Jun 2004 00:06:52 -0000 1.58 *************** *** 75,85 **** boolean done; int position; ! Text last; Tag end; NodeFactory factory; CompositeTag ret; done = false; ! last = null; end = null; factory = lexer.getNodeFactory (); --- 75,88 ---- boolean done; int position; ! int startpos; ! int endpos; Tag end; NodeFactory factory; + Text content; CompositeTag ret; done = false; ! startpos = lexer.getPosition (); ! 
endpos = startpos; end = null; factory = lexer.getNodeFactory (); *************** *** 94,98 **** String code = ScriptDecoder.Decode (lexer.getPage (), lexer.getCursor ()); ((ScriptTag)tag).setScriptCode (code); ! last = factory.createStringNode (lexer.getPage (), start, lexer.getPosition ()); } } --- 97,101 ---- String code = ScriptDecoder.Decode (lexer.getPage (), lexer.getCursor ()); ((ScriptTag)tag).setScriptCode (code); ! endpos = lexer.getPosition (); } } *************** *** 105,109 **** node = lexer.nextNode (true); if (null == node) ! break; else if (node instanceof Tag) --- 108,112 ---- node = lexer.nextNode (true); if (null == node) ! done = true; else if (node instanceof Tag) *************** *** 120,162 **** } else - { // must be a string, even though it looks like a tag ! if (null != last) ! // append it to the previous one ! last.setEndPosition (node.elementEnd ()); ! else ! last = factory.createStringNode (lexer.getPage (), node.elementBegin (), node.elementEnd ()); ! } else if (node instanceof Remark) ! { ! if (null != last) ! last.setEndPosition (node.getEndPosition ()); ! else ! { ! // last = factory.createStringNode (lexer, node.elementBegin (), node.elementEnd ()); ! last = factory.createStringNode (lexer.getPage (), node.elementBegin (), node.elementEnd ()); ! } ! } else // Text ! { ! if (null != last) ! last.setEndPosition (node.getEndPosition ()); ! else ! last = (Text)node; ! } ! } while (!done); ! // build new string tag if required ! if (null == last) ! last = factory.createStringNode (lexer.getPage (), position, position); // build new end tag if required if (null == end) ! end = new Tag (lexer.getPage (), tag.getEndPosition (), tag.getEndPosition (), new Vector ()); ret = (CompositeTag)tag; ret.setEndTag (end); ! ret.setChildren (new NodeList (last)); ! last.setParent (ret); end.setParent (ret); ret.doSemanticAction (); --- 123,143 ---- } else // must be a string, even though it looks like a tag ! 
endpos = node.getEndPosition (); else if (node instanceof Remark) ! endpos = node.getEndPosition (); else // Text ! endpos = node.getEndPosition (); } while (!done); ! content = factory.createStringNode (lexer.getPage (), startpos, endpos); // build new end tag if required if (null == end) ! end = new Tag (lexer.getPage (), endpos, endpos, new Vector ()); ret = (CompositeTag)tag; ret.setEndTag (end); ! ret.setChildren (new NodeList (content)); ! content.setParent (ret); end.setParent (ret); ret.doSemanticAction (); |
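Both scanners above stop stitching nodes together and instead remember where the content starts and where the last node ended, then ask the factory for one text node covering that range. A minimal sketch of the idea follows, under the assumption that each node can be reduced to a {begin, end} pair of page positions. SpanSketch and its method are hypothetical and do not use the lexer.

```java
class SpanSketch {
    /**
     * Returns the page text from startpos up to the end of the last node seen.
     * Each entry of spans is a hypothetical {begin, end} pair for one node
     * (text, remark, or a tag that is really just text) found before the end tag.
     */
    static String content(String page, int startpos, int[][] spans) {
        int endpos = startpos; // no nodes seen yet: empty content
        for (int[] span : spans) {
            endpos = span[1]; // every node kind simply advances the end position
        }
        return page.substring(startpos, endpos);
    }
}
```

For the page `<style>a <!-- b --> c</style>` with content starting at position 7 and node spans {7,9}, {9,19} and {19,21}, the method returns `a <!-- b --> c`; with no spans it returns the empty string, which corresponds to the zero-length text node the scanners create for an empty element.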
From: Derrick O. <der...@us...> - 2004-06-14 00:07:00
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/nodes
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv3132/nodes

Modified Files:
    AbstractNode.java RemarkNode.java TextNode.java
Log Message:
Rework PrototypicalNodeFactory to use interfaces.

Index: RemarkNode.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/nodes/RemarkNode.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** RemarkNode.java 24 May 2004 16:18:37 -0000 1.1
--- RemarkNode.java 14 Jun 2004 00:06:51 -0000 1.2
***************
*** 202,208 ****
              ret.append (endpos);
              ret.append ("): ");
!             while (startpos < endpos)
              {
!                 c = mText.charAt (startpos);
                  switch (c)
                  {
--- 202,208 ----
              ret.append (endpos);
              ret.append ("): ");
!             for (int i = 0; i < mText.length (); i++)
              {
!                 c = mText.charAt (i);
                  switch (c)
                  {
***************
*** 224,228 ****
                          break;
                  }
-                 startpos++;
              }
          }
--- 224,227 ----
Index: TextNode.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/nodes/TextNode.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** TextNode.java 24 May 2004 16:18:37 -0000 1.1
--- TextNode.java 14 Jun 2004 00:06:51 -0000 1.2
***************
*** 172,178 ****
              ret.append (endpos);
              ret.append ("): ");
!             while (startpos < endpos)
              {
!                 c = mText.charAt (startpos);
                  switch (c)
                  {
--- 172,178 ----
              ret.append (endpos);
              ret.append ("): ");
!             for (int i = 0; i < mText.length (); i++)
              {
!                 c = mText.charAt (i);
                  switch (c)
                  {
***************
*** 194,198 ****
                          break;
                  }
-                 startpos++;
              }
          }
--- 194,197 ----
Index: AbstractNode.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/nodes/AbstractNode.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** AbstractNode.java 24 May 2004 16:18:37 -0000 1.1
--- AbstractNode.java 14 Jun 2004 00:06:51 -0000 1.2
***************
*** 84,87 ****
--- 84,99 ----
  
      /**
+      * Clone this object.
+      * Exposes java.lang.Object clone as a public method.
+      * @return A clone of this object.
+      * @exception CloneNotSupportedException This shouldn't be thrown since
+      * the {@link Node} interface extends Cloneable.
+      */
+     public Object clone() throws CloneNotSupportedException
+     {
+         return (super.clone ());
+     }
+ 
+     /**
       * Returns a string representation of the node. This is an important method, it allows a simple string transformation
       * of a web page, regardless of a node.<br>
|
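The RemarkNode and TextNode changes above replace a loop that indexed the node's own text with page positions by a loop over the text itself. The sketch below shows why that matters: a text string is indexed from zero, so a page position such as 40 is usually out of range for it. EscapeSketch is hypothetical, and the set of escaped characters is an assumption because the switch cases are not visible in the diff.

```java
class EscapeSketch {
    /**
     * Escapes control characters by walking the text from index 0,
     * as the revised loops do. The cases handled here are assumed.
     */
    static String escape(String text) {
        StringBuilder ret = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\t':
                    ret.append("\\t");
                    break;
                case '\n':
                    ret.append("\\n");
                    break;
                case '\r':
                    ret.append("\\r");
                    break;
                default:
                    ret.append(c);
            }
        }
        return ret.toString();
    }

    /** Mimics the old loop: reads the text at a page position, not a text index. */
    static boolean pagePositionIsOutOfRange(String text, int startpos) {
        try {
            text.charAt(startpos);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

For a three-character text that happens to sit at page positions 40 to 43, the old style of indexing fails immediately, while the revised loop escapes all three characters.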
From: Derrick O. <der...@us...> - 2004-06-14 00:07:00
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/nodeDecorators
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv3132/nodeDecorators

Modified Files:
    AbstractNodeDecorator.java
Log Message:
Rework PrototypicalNodeFactory to use interfaces.

Index: AbstractNodeDecorator.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/nodeDecorators/AbstractNodeDecorator.java,v
retrieving revision 1.20
retrieving revision 1.21
diff -C2 -d -r1.20 -r1.21
*** AbstractNodeDecorator.java 24 May 2004 16:18:18 -0000 1.20
--- AbstractNodeDecorator.java 14 Jun 2004 00:06:51 -0000 1.21
***************
*** 43,46 ****
--- 43,58 ----
      }
  
+     /**
+      * Clone this object.
+      * Exposes java.lang.Object clone as a public method.
+      * @return A clone of this object.
+      * @exception CloneNotSupportedException This shouldn't be thrown since
+      * the {@link Node} interface extends Cloneable.
+      */
+     public Object clone() throws CloneNotSupportedException
+     {
+         return (super.clone ());
+     }
+ 
      public void accept (NodeVisitor visitor)
      {
|
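The clone() methods added above exist because Object.clone is protected, so code holding only an interface reference cannot call it unless the interface declares it and the implementing class makes it public. A minimal sketch with hypothetical types (SketchNode, SketchAbstractNode, SketchText) follows. Whether the real Node interface declares clone() in exactly this way is an assumption.

```java
/** Hypothetical node interface that makes clone() callable through the interface. */
interface SketchNode extends Cloneable {
    Object clone() throws CloneNotSupportedException;
}

/** Hypothetical base class that widens Object.clone from protected to public. */
abstract class SketchAbstractNode implements SketchNode {
    int start;

    @Override
    public Object clone() throws CloneNotSupportedException {
        return super.clone(); // shallow copy that keeps the runtime class
    }
}

/** Hypothetical concrete node. */
class SketchText extends SketchAbstractNode {
    String text = "";
}

class CloneSketch {
    /** Copies a node through the interface, as a factory holding prototypes would. */
    static SketchNode copy(SketchNode prototype) {
        try {
            return (SketchNode) prototype.clone();
        } catch (CloneNotSupportedException e) {
            // Not expected: every SketchNode is Cloneable.
            throw new IllegalStateException(e);
        }
    }
}
```

A caller that only knows about SketchNode can now obtain an independent copy, and the copy is still a SketchText with the same field values.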
From: Derrick O. <der...@us...> - 2004-06-08 10:20:32
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20594/lexer

Modified Files:
    Page.java
Log Message:
DocComment fix and another getText() signature.

Index: Page.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer/Page.java,v
retrieving revision 1.36
retrieving revision 1.37
diff -C2 -d -r1.36 -r1.37
*** Page.java 23 May 2004 19:42:14 -0000 1.36
--- Page.java 8 Jun 2004 10:20:18 -0000 1.37
***************
*** 949,952 ****
--- 949,979 ----
  
      /**
+      * Put the text identified by the given limits into the given array at the specified offset.
+      * @param array The array of characters.
+      * @param offset The starting position in the array where characters are to be placed.
+      * @param start The starting position, zero based.
+      * @param end The ending position
+      * (exclusive, i.e. the character at the ending position is not included),
+      * zero based.
+      * @exception IllegalArgumentException If an attempt is made to get
+      * characters ahead of the current source offset (character position).
+      */
+     public void getText (char[] array, int offset, int start, int end)
+     {
+         int length;
+ 
+         if ((mSource.mOffset < start) || (mSource.mOffset < end))
+             throw new IllegalArgumentException ("attempt to extract future characters from source");
+         if (end < start)
+         {
+             length = end;
+             end = start;
+             start = length;
+         }
+         length = end - start;
+         System.arraycopy (mSource.mBuffer, start, array, offset, length);
+     }
+ 
+     /**
       * Get the text line the position of the cursor lies on.
       * @param cursor The position to calculate for.
|
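The new getText() signature above copies a range of already-read characters into a caller-supplied array, swapping the limits if they arrive reversed and refusing ranges beyond what has been read. The sketch below reproduces that behaviour on a plain character buffer. BufferSketch is hypothetical and stands in for the Page and its source; treating the whole string as already read is an assumption made to keep the example small.

```java
class BufferSketch {
    private final char[] buffer; // characters read so far
    private final int offset;    // count of characters read so far

    /** Treats the whole string as already read from the source. */
    BufferSketch(String read) {
        buffer = read.toCharArray();
        offset = buffer.length;
    }

    /** Copies [start, end) into array at arrayOffset; reversed limits are swapped. */
    void getText(char[] array, int arrayOffset, int start, int end) {
        if (offset < start || offset < end) {
            throw new IllegalArgumentException(
                "attempt to extract future characters from source");
        }
        if (end < start) {
            int swap = end;
            end = start;
            start = swap;
        }
        System.arraycopy(buffer, start, array, arrayOffset, end - start);
    }
}
```

With "hello world" read, copying positions 6 to 11 into an eight-character array at offset 2 places "world" in slots 2 through 6, and passing the limits as 11 and 6 gives the same result.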
From: Derrick O. <der...@us...> - 2004-06-08 10:20:32
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/filters
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20594/filters

Modified Files:
    HasParentFilter.java
Log Message:
DocComment fix and another getText() signature.

Index: HasParentFilter.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/filters/HasParentFilter.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** HasParentFilter.java 24 Jan 2004 23:57:49 -0000 1.1
--- HasParentFilter.java 8 Jun 2004 10:20:19 -0000 1.2
***************
*** 33,42 ****
  
  /**
!  * This class accepts all tags that have a parent acceptable to the filter.
   */
  public class HasParentFilter implements NodeFilter
  {
      /**
!      * The filter to apply to children.
       */
      public NodeFilter mFilter;
--- 33,42 ----
  
  /**
!  * This class accepts all tags that have a parent acceptable to another filter.
   */
  public class HasParentFilter implements NodeFilter
  {
      /**
!      * The filter to apply to the parent.
       */
      public NodeFilter mFilter;
|
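The corrected comments above describe a filter that accepts a node when another filter accepts that node's parent. A minimal sketch of that composition follows, using java.util.function.Predicate and a hypothetical TreeNode type in place of the htmlparser Node and NodeFilter interfaces. The body of the real accept method is not shown in the diff, so the logic here is an assumption based only on the comments.

```java
import java.util.function.Predicate;

/** Hypothetical tree node with a name and an optional parent. */
class TreeNode {
    final String name;
    final TreeNode parent;

    TreeNode(String name, TreeNode parent) {
        this.name = name;
        this.parent = parent;
    }
}

/** Accepts nodes whose parent is acceptable to another filter. */
class HasParentSketch implements Predicate<TreeNode> {
    private final Predicate<TreeNode> parentFilter; // applied to the parent, not the node

    HasParentSketch(Predicate<TreeNode> parentFilter) {
        this.parentFilter = parentFilter;
    }

    @Override
    public boolean test(TreeNode node) {
        return node.parent != null && parentFilter.test(node.parent);
    }
}
```

Given a filter that matches nodes named "TABLE", the composed filter accepts a "TR" whose parent is a "TABLE" and rejects both the "TABLE" itself and any node without a parent.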
From: Derrick O. <der...@us...> - 2004-06-03 01:20:25
|
Update of /cvsroot/htmlparser/htmlparser/docs
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv6109

Modified Files:
    index.html
Log Message:
Allow scrolling left panel.

Index: index.html
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/docs/index.html,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** index.html 4 Jan 2004 03:23:08 -0000 1.3
--- index.html 3 Jun 2004 01:20:16 -0000 1.4
***************
*** 8,12 ****
  </head>
  <frameset cols="15%,85%" frameborder="NO" border="0" framespacing="0" rows="*">
!   <frame name="leftFrame" scrolling="NO" src="panel.html" frameborder="NO" noresize>
    <frame name="mainFrame" src="main.html" frameborder="NO">
  </frameset>
--- 8,12 ----
  </head>
  <frameset cols="15%,85%" frameborder="NO" border="0" framespacing="0" rows="*">
!   <frame name="leftFrame" src="panel.html" frameborder="NO" noresize>
    <frame name="mainFrame" src="main.html" frameborder="NO">
  </frameset>
|
From: Derrick O. <der...@us...> - 2004-06-03 01:18:35
|
Update of /cvsroot/htmlparser/htmlparser/docs/pics
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv5722/pics

Added Files:
    rsf.gif
Log Message:
Add Rodney S. Foley's photo.

--- NEW FILE: rsf.gif ---
(This appears to be a binary file; contents omitted.)
|