htmlparser-cvs Mailing List for HTML Parser (Page 12)
Brought to you by:
derrickoswald
From: Derrick O. <der...@us...> - 2004-07-31 16:43:12
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/utilTests
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18039/src/org/htmlparser/tests/utilTests

Modified Files:
    BeanTest.java SortTest.java CharacterTranslationTest.java
Log Message:
Remove unused variables and other fixes exposed by turning on compiler warnings.

Index: BeanTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/utilTests/BeanTest.java,v
retrieving revision 1.49
retrieving revision 1.50
diff -C2 -d -r1.49 -r1.50
*** BeanTest.java 2 Jan 2004 16:24:57 -0000 1.49
--- BeanTest.java 31 Jul 2004 16:42:32 -0000 1.50
***************
*** 197,201 ****
                  "Nodes before and after serialization differ",
                  ((Node)vector.remove (0)).toHtml (),
!                 ((Node)enumeration.nextNode ()).toHtml ());
      }

--- 197,201 ----
                  "Nodes before and after serialization differ",
                  ((Node)vector.remove (0)).toHtml (),
!                 enumeration.nextNode ().toHtml ());
      }

***************
*** 225,229 ****
                  "Nodes before and after serialization differ",
                  ((Node)vector.remove (0)).toHtml (),
!                 ((Node)enumeration.nextNode ()).toHtml ());
      }

--- 225,229 ----
                  "Nodes before and after serialization differ",
                  ((Node)vector.remove (0)).toHtml (),
!                 enumeration.nextNode ().toHtml ());
      }

Index: SortTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/utilTests/SortTest.java,v
retrieving revision 1.11
retrieving revision 1.12
diff -C2 -d -r1.11 -r1.12
*** SortTest.java 2 Jan 2004 16:24:57 -0000 1.11
--- SortTest.java 31 Jul 2004 16:42:32 -0000 1.12
***************
*** 141,147 ****

              ret = lastModified () - f.lastModified ();
!             if (ret < (long)Integer.MIN_VALUE)
                  ret = Integer.MIN_VALUE;
!             if (ret > (long)Integer.MAX_VALUE)
                  ret = Integer.MAX_VALUE;

--- 141,147 ----

              ret = lastModified () - f.lastModified ();
!             if (ret < Integer.MIN_VALUE)
                  ret = Integer.MIN_VALUE;
!             if (ret > Integer.MAX_VALUE)
                  ret = Integer.MAX_VALUE;

Index: CharacterTranslationTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/utilTests/CharacterTranslationTest.java,v
retrieving revision 1.45
retrieving revision 1.46
diff -C2 -d -r1.45 -r1.46
*** CharacterTranslationTest.java 17 Jul 2004 13:45:03 -0000 1.45
--- CharacterTranslationTest.java 31 Jul 2004 16:42:32 -0000 1.46
***************
*** 156,160 ****
       * The working parser.
       */
!     protected Parser parser;

      protected String nl = System.getProperty ("line.separator", "\n");
--- 156,160 ----
       * The working parser.
       */
!     protected Parser mParser;

      protected String nl = System.getProperty ("line.separator", "\n");
***************
*** 169,173 ****
          throws ParserException
      {
!         parser = new Parser ("http://www.w3.org/TR/REC-html40/sgml/entities.html");
      }

--- 169,173 ----
          throws ParserException
      {
!         mParser = new Parser ("http://www.w3.org/TR/REC-html40/sgml/entities.html");
      }

***************
*** 521,525 ****
          // Run through an enumeration of html elements, and pick up
          // only those that are plain string.
!         for (NodeIterator e = parser.elements (); e.hasMoreNodes ();)
          {
              node = e.nextNode ();
--- 521,525 ----
          // Run through an enumeration of html elements, and pick up
          // only those that are plain string.
!         for (NodeIterator e = mParser.elements (); e.hasMoreNodes ();)
          {
              node = e.nextNode ();
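The SortTest change above drops the (long) casts because Java already widens an int operand to long when it is compared with a long. A minimal standalone sketch of the same clamping idea follows; the class and method names (ClampDemo, clampToInt) are made up for illustration and are not part of the htmlparser sources.

```java
// Hypothetical illustration, not code from the htmlparser repository.
final class ClampDemo {
    /** Clamp a long difference into the int range, e.g. for a compare-style result. */
    static int clampToInt(long value) {
        // Integer.MIN_VALUE and Integer.MAX_VALUE are widened to long
        // automatically in these comparisons, so no explicit cast is needed.
        if (value < Integer.MIN_VALUE)
            value = Integer.MIN_VALUE;
        if (value > Integer.MAX_VALUE)
            value = Integer.MAX_VALUE;
        return (int) value; // safe: value is now inside the int range
    }

    public static void main(String[] args) {
        System.out.println(clampToInt(5_000_000_000L));  // clamped to Integer.MAX_VALUE
        System.out.println(clampToInt(-5_000_000_000L)); // clamped to Integer.MIN_VALUE
        System.out.println(clampToInt(42L));             // unchanged
    }
}
```

The only cast that remains is the narrowing one at the end, which the compiler does require.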
From: Derrick O. <der...@us...> - 2004-07-31 16:43:12
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/visitorsTests
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18039/src/org/htmlparser/tests/visitorsTests

Modified Files:
    UrlModifyingVisitorTest.java ScriptCommentTest.java
Log Message:
Remove unused variables and other fixes exposed by turning on compiler warnings.

Index: ScriptCommentTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/visitorsTests/ScriptCommentTest.java,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** ScriptCommentTest.java 18 Jul 2004 21:31:20 -0000 1.3
--- ScriptCommentTest.java 31 Jul 2004 16:42:33 -0000 1.4
***************
*** 70,78 ****
          + "</script>";

-     private String failingHtml3 =
-         this.anotherFailingScriptTag
-         + "<HTML>"
-         + "</HTML>";
-
      public ScriptCommentTest(String name) {
          super(name);
--- 70,73 ----
Index: UrlModifyingVisitorTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/visitorsTests/UrlModifyingVisitorTest.java,v
retrieving revision 1.17
retrieving revision 1.18
diff -C2 -d -r1.17 -r1.18
*** UrlModifyingVisitorTest.java 16 Jun 2004 02:17:26 -0000 1.17
--- UrlModifyingVisitorTest.java 31 Jul 2004 16:42:33 -0000 1.18
***************
*** 59,63 ****
          Parser parser = Parser.createParser(HTML_WITH_LINK, null);
          UrlModifyingVisitor visitor =
!             new UrlModifyingVisitor(parser, "localhost://");
          parser.visitAllNodesWith(visitor);
          String result = visitor.getModifiedResult();
--- 59,63 ----
          Parser parser = Parser.createParser(HTML_WITH_LINK, null);
          UrlModifyingVisitor visitor =
!             new UrlModifyingVisitor("localhost://");
          parser.visitAllNodesWith(visitor);
          String result = visitor.getModifiedResult();
From: Derrick O. <der...@us...> - 2004-07-31 16:43:12
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexerapplications/thumbelina In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18039/src/org/htmlparser/lexerapplications/thumbelina Modified Files: ThumbelinaFrame.java TileSet.java Thumbelina.java Sequencer.java Log Message: Remove unused variables and other fixes exposed by turning on compiler warnings. Index: ThumbelinaFrame.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexerapplications/thumbelina/ThumbelinaFrame.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** ThumbelinaFrame.java 16 Dec 2003 02:29:56 -0000 1.2 --- ThumbelinaFrame.java 31 Jul 2004 16:42:30 -0000 1.3 *************** *** 381,386 **** public void restoreSize () { - Toolkit tk; - Dimension dim; Preferences prefs; String size; --- 381,384 ---- *************** *** 849,853 **** { String result; - URL url; result = JOptionPane.showInputDialog ( --- 847,850 ---- *************** *** 993,999 **** for (int i = 0; i < results[1].length; i++) { ! String found = ((URL)results[1][i]).toExternalForm (); if (-1 == found.indexOf ("google")) ! getThumbelina ().append ((URL)results[1][i]); } prefs.put (GOOGLEQUERY, query); --- 990,996 ---- for (int i = 0; i < results[1].length; i++) { ! String found = results[1][i].toExternalForm (); if (-1 == found.indexOf ("google")) ! getThumbelina ().append (results[1][i]); } prefs.put (GOOGLEQUERY, query); *************** *** 1061,1066 **** { String url; ! ThumbelinaFrame frame; ! Thumbelina thumbelina; System.setProperty ("sun.net.client.defaultReadTimeout", "7000"); --- 1058,1062 ---- { String url; ! ThumbelinaFrame thumbelina; System.setProperty ("sun.net.client.defaultReadTimeout", "7000"); *************** *** 1080,1085 **** try { ! frame = new ThumbelinaFrame (url); ! frame.setVisible (true); } catch (MalformedURLException murle) --- 1076,1081 ---- try { ! 
thumbelina = new ThumbelinaFrame (url); ! thumbelina.setVisible (true); } catch (MalformedURLException murle) Index: Thumbelina.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexerapplications/thumbelina/Thumbelina.java,v retrieving revision 1.5 retrieving revision 1.6 diff -C2 -d -r1.5 -r1.6 *** Thumbelina.java 24 May 2004 16:18:17 -0000 1.5 --- Thumbelina.java 31 Jul 2004 16:42:30 -0000 1.6 *************** *** 57,60 **** --- 57,61 ---- import javax.swing.JTextField; import javax.swing.ListSelectionModel; + import javax.swing.ScrollPaneConstants; import javax.swing.border.BevelBorder; import javax.swing.event.ChangeEvent; *************** *** 388,393 **** URL url; String ref; - boolean found; - URL u; list = new ArrayList (); --- 389,392 ---- *************** *** 487,493 **** mPicturePanelScroller.setDoubleBuffered (false); mPicturePanelScroller.setHorizontalScrollBarPolicy ( ! JScrollPane.HORIZONTAL_SCROLLBAR_ALWAYS); mPicturePanelScroller.setVerticalScrollBarPolicy ( ! JScrollPane.VERTICAL_SCROLLBAR_ALWAYS); add (mMainArea, java.awt.BorderLayout.CENTER); --- 486,492 ---- mPicturePanelScroller.setDoubleBuffered (false); mPicturePanelScroller.setHorizontalScrollBarPolicy ( ! ScrollPaneConstants.HORIZONTAL_SCROLLBAR_ALWAYS); mPicturePanelScroller.setVerticalScrollBarPolicy ( ! ScrollPaneConstants.VERTICAL_SCROLLBAR_ALWAYS); add (mMainArea, java.awt.BorderLayout.CENTER); *************** *** 1156,1160 **** source = (JSlider)event.getSource (); if (!source.getValueIsAdjusting ()) ! setSpeed ((int)source.getValue ()); } --- 1155,1159 ---- source = (JSlider)event.getSource (); if (!source.getValueIsAdjusting ()) ! setSpeed (source.getValue ()); } *************** *** 1179,1183 **** if (source == mHistory && !event.getValueIsAdjusting ()) { ! 
hrefs = (Object[])source.getSelectedValues (); for (int i = 0; i < hrefs.length; i++) { --- 1178,1182 ---- if (source == mHistory && !event.getValueIsAdjusting ()) { ! hrefs = source.getSelectedValues (); for (int i = 0; i < hrefs.length; i++) { *************** *** 1454,1457 **** --- 1453,1459 ---- * * $Log$ + * Revision 1.6 2004/07/31 16:42:30 derrickoswald + * Remove unused variables and other fixes exposed by turning on compiler warnings. + * * Revision 1.5 2004/05/24 16:18:17 derrickoswald * Part three of a multiphase refactoring. Index: Sequencer.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexerapplications/thumbelina/Sequencer.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** Sequencer.java 21 Sep 2003 18:20:56 -0000 1.1 --- Sequencer.java 31 Jul 2004 16:42:30 -0000 1.2 *************** *** 224,235 **** public void add (final Image image, final URL url, final boolean background) { - int x; - int y; - Point p; Picture picture; int size; - x = image.getWidth (null); - y = image.getHeight (null); picture = new Picture (); picture.setImage (image); --- 224,230 ---- *************** *** 285,289 **** Picture picture; int size; - Point p; while (true) --- 280,283 ---- *************** *** 355,358 **** --- 349,355 ---- * * $Log$ + * Revision 1.2 2004/07/31 16:42:30 derrickoswald + * Remove unused variables and other fixes exposed by turning on compiler warnings. 
+ * * Revision 1.1 2003/09/21 18:20:56 derrickoswald * Thumbelina Index: TileSet.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexerapplications/thumbelina/TileSet.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** TileSet.java 21 Sep 2003 18:20:56 -0000 1.1 --- TileSet.java 31 Jul 2004 16:42:30 -0000 1.2 *************** *** 103,107 **** splits = split (r, rover, false); for (frags = splits.elements (); frags.hasMoreElements (); ) ! regions.addElement ((Picture)frags.nextElement ()); } else --- 103,107 ---- splits = split (r, rover, false); for (frags = splits.elements (); frags.hasMoreElements (); ) ! regions.addElement (frags.nextElement ()); } else *************** *** 540,543 **** --- 540,546 ---- * * $Log$ + * Revision 1.2 2004/07/31 16:42:30 derrickoswald + * Remove unused variables and other fixes exposed by turning on compiler warnings. + * * Revision 1.1 2003/09/21 18:20:56 derrickoswald * Thumbelina |
From: Derrick O. <der...@us...> - 2004-07-31 16:42:44
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/nodes
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18039/src/org/htmlparser/nodes

Modified Files:
    TagNode.java
Log Message:
Remove unused variables and other fixes exposed by turning on compiler warnings.

Index: TagNode.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/nodes/TagNode.java,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** TagNode.java 18 Jul 2004 21:31:21 -0000 1.4
--- TagNode.java 31 Jul 2004 16:42:35 -0000 1.5
***************
*** 413,417 ****
          Attribute attribute;
          String value;
-         StringBuffer _value;
          Hashtable ret;

--- 413,416 ----
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18039/src/org/htmlparser/util Modified Files: IteratorImpl.java Translate.java NodeList.java CharacterReference.java LinkProcessor.java ParserUtils.java Removed Files: CommandLine.java Log Message: Remove unused variables and other fixes exposed by turning on compiler warnings. Index: CharacterReference.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util/CharacterReference.java,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** CharacterReference.java 9 Feb 2004 02:09:45 -0000 1.1 --- CharacterReference.java 31 Jul 2004 16:42:34 -0000 1.2 *************** *** 97,101 **** ret = new StringBuffer (6 + 8 + 2); // max 8 in string ! hex = Integer.toHexString ((int)getCharacter ()); ret.append ("\\u"); for (int i = hex.length (); i < 4; i++) --- 97,101 ---- ret = new StringBuffer (6 + 8 + 2); // max 8 in string ! hex = Integer.toHexString (getCharacter ()); ret.append ("\\u"); for (int i = hex.length (); i < 4; i++) Index: Translate.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util/Translate.java,v retrieving revision 1.45 retrieving revision 1.46 diff -C2 -d -r1.45 -r1.46 *** Translate.java 18 Jul 2004 21:31:20 -0000 1.45 --- Translate.java 31 Jul 2004 16:42:33 -0000 1.46 *************** *** 824,828 **** if (0 == radix) radix = 10; ! number = number * radix + ((int)character - (int)'0'); break; case 'A': --- 824,828 ---- if (0 == radix) radix = 10; ! number = number * radix + (character - '0'); break; case 'A': *************** *** 833,837 **** case 'F': if (16 == radix) ! number = number * radix + ((int)character - (int)'A' + 10); else done = true; --- 833,837 ---- case 'F': if (16 == radix) ! 
number = number * radix + (character - 'A' + 10); else done = true; *************** *** 844,848 **** case 'f': if (16 == radix) ! number = number * radix + ((int)character - (int)'a' + 10); else done = true; --- 844,848 ---- case 'f': if (16 == radix) ! number = number * radix + (character - 'a' + 10); else done = true; *************** *** 1077,1081 **** int length; char c; - int index; CharacterReference candidate; StringBuffer ret; --- 1077,1080 ---- Index: IteratorImpl.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util/IteratorImpl.java,v retrieving revision 1.41 retrieving revision 1.42 diff -C2 -d -r1.41 -r1.42 *** IteratorImpl.java 2 Jul 2004 00:49:32 -0000 1.41 --- IteratorImpl.java 31 Jul 2004 16:42:33 -0000 1.42 *************** *** 69,73 **** { Tag tag; - String name; Scanner scanner; NodeList stack; --- 69,72 ---- Index: LinkProcessor.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util/LinkProcessor.java,v retrieving revision 1.34 retrieving revision 1.35 diff -C2 -d -r1.34 -r1.35 *** LinkProcessor.java 18 Mar 2004 04:04:08 -0000 1.34 --- LinkProcessor.java 31 Jul 2004 16:42:34 -0000 1.35 *************** *** 180,189 **** */ public static boolean isURL (String resourceLocn) { - URL url; boolean ret; try { ! url = new URL (resourceLocn); ret = true; } --- 180,188 ---- */ public static boolean isURL (String resourceLocn) { boolean ret; try { ! 
new URL (resourceLocn); ret = true; } --- CommandLine.java DELETED --- Index: ParserUtils.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util/ParserUtils.java,v retrieving revision 1.44 retrieving revision 1.45 diff -C2 -d -r1.44 -r1.45 *** ParserUtils.java 17 Jul 2004 13:45:06 -0000 1.44 --- ParserUtils.java 31 Jul 2004 16:42:34 -0000 1.45 *************** *** 717,721 **** { CompositeTag beginTag = (CompositeTag)links.elementAt(j); ! Tag endTag = (Tag)beginTag.getEndTag(); // positions of begin and end tags --- 717,721 ---- { CompositeTag beginTag = (CompositeTag)links.elementAt(j); ! Tag endTag = beginTag.getEndTag(); // positions of begin and end tags *************** *** 842,846 **** { CompositeTag beginTag = (CompositeTag)links.elementAt(j); ! Tag endTag = (Tag)beginTag.getEndTag(); // positions of begin and end tags --- 842,846 ---- { CompositeTag beginTag = (CompositeTag)links.elementAt(j); ! Tag endTag = beginTag.getEndTag(); // positions of begin and end tags *************** *** 946,950 **** { CompositeTag beginTag = (CompositeTag)links.elementAt(j); ! Tag endTag = (Tag)beginTag.getEndTag(); // positions of begin and end tags --- 946,950 ---- { CompositeTag beginTag = (CompositeTag)links.elementAt(j); ! Tag endTag = beginTag.getEndTag(); // positions of begin and end tags *************** *** 1046,1050 **** { CompositeTag beginTag = (CompositeTag)links.elementAt(j); ! Tag endTag = (Tag)beginTag.getEndTag(); // positions of begin and end tags --- 1046,1050 ---- { CompositeTag beginTag = (CompositeTag)links.elementAt(j); ! Tag endTag = beginTag.getEndTag(); // positions of begin and end tags *************** *** 1122,1126 **** { CompositeTag jStartTag = (CompositeTag)links.elementAt(j); ! 
Tag jEndTag = (Tag)jStartTag.getEndTag(); int jStartTagBegin = jStartTag.getStartPosition (); int jEndTagEnd = jEndTag.getEndPosition (); --- 1122,1126 ---- { CompositeTag jStartTag = (CompositeTag)links.elementAt(j); ! Tag jEndTag = jStartTag.getEndTag(); int jStartTagBegin = jStartTag.getStartPosition (); int jEndTagEnd = jEndTag.getEndPosition (); *************** *** 1128,1132 **** { CompositeTag kStartTag = (CompositeTag)links.elementAt(k); ! Tag kEndTag = (Tag)kStartTag.getEndTag(); int kStartTagBegin = kStartTag.getStartPosition (); int kEndTagEnd = kEndTag.getEndPosition (); --- 1128,1132 ---- { CompositeTag kStartTag = (CompositeTag)links.elementAt(k); ! Tag kEndTag = kStartTag.getEndTag(); int kStartTagBegin = kStartTag.getStartPosition (); int kEndTagEnd = kEndTag.getEndPosition (); *************** *** 1162,1166 **** { CompositeTag jStartTag = (CompositeTag)links.elementAt(j); ! Tag jEndTag = (Tag)jStartTag.getEndTag(); int jStartTagBegin = jStartTag.getStartPosition (); int jEndTagEnd = jEndTag.getEndPosition (); --- 1162,1166 ---- { CompositeTag jStartTag = (CompositeTag)links.elementAt(j); ! Tag jEndTag = jStartTag.getEndTag(); int jStartTagBegin = jStartTag.getStartPosition (); int jEndTagEnd = jEndTag.getEndPosition (); *************** *** 1168,1172 **** { CompositeTag kStartTag = (CompositeTag)links.elementAt(k); ! Tag kEndTag = (Tag)kStartTag.getEndTag(); int kStartTagBegin = kStartTag.getStartPosition (); int kEndTagEnd = kEndTag.getEndPosition (); --- 1168,1172 ---- { CompositeTag kStartTag = (CompositeTag)links.elementAt(k); ! 
Tag kEndTag = kStartTag.getEndTag(); int kStartTagBegin = kStartTag.getStartPosition (); int kEndTagEnd = kEndTag.getEndPosition (); Index: NodeList.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util/NodeList.java,v retrieving revision 1.55 retrieving revision 1.56 diff -C2 -d -r1.55 -r1.56 *** NodeList.java 22 Jul 2004 02:22:32 -0000 1.55 --- NodeList.java 31 Jul 2004 16:42:33 -0000 1.56 *************** *** 204,208 **** public NodeList extractAllNodesThatMatch (NodeFilter filter, boolean recursive) { - String name; Node node; NodeList children; --- 204,207 ---- *************** *** 244,248 **** Node node; NodeList children; - NodeList ret; for (int i = 0; i < size; ) --- 243,246 ---- |
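The Translate and CharacterReference hunks above remove (int) casts around char values. Java promotes char operands to int in arithmetic and when passing them to an int parameter, so the casts were redundant. The sketch below shows the same digit arithmetic in isolation; DigitDemo and its methods are hypothetical names for illustration and do not reproduce the actual Translate decoding loop.

```java
// Hypothetical illustration, not code from the htmlparser repository.
final class DigitDemo {
    /** Numeric value of a decimal or hex digit, or -1 if the character is neither. */
    static int digitValue(char character) {
        // char - char is evaluated as int arithmetic; no (int) cast is required.
        if (character >= '0' && character <= '9')
            return character - '0';
        if (character >= 'A' && character <= 'F')
            return character - 'A' + 10;
        if (character >= 'a' && character <= 'f')
            return character - 'a' + 10;
        return -1;
    }

    /** Accumulate leading digits of the given radix (10 or 16), stopping at the first non-digit. */
    static int parse(String digits, int radix) {
        int number = 0;
        for (int i = 0; i < digits.length(); i++) {
            int value = digitValue(digits.charAt(i));
            if (value < 0 || value >= radix)
                break;
            number = number * radix + value;
        }
        return number;
    }

    /** A char argument is widened to int when passed to Integer.toHexString. */
    static String hex(char character) {
        return Integer.toHexString(character);
    }

    public static void main(String[] args) {
        System.out.println(parse("A9", 16));
        System.out.println(parse("169", 10));
        System.out.println(hex('\u00e9'));
    }
}
```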
From: Derrick O. <der...@us...> - 2004-07-31 16:42:44
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/visitors
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18039/src/org/htmlparser/visitors

Modified Files:
    UrlModifyingVisitor.java
Log Message:
Remove unused variables and other fixes exposed by turning on compiler warnings.

Index: UrlModifyingVisitor.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/visitors/UrlModifyingVisitor.java,v
retrieving revision 1.45
retrieving revision 1.46
diff -C2 -d -r1.45 -r1.46
*** UrlModifyingVisitor.java 24 May 2004 16:18:36 -0000 1.45
--- UrlModifyingVisitor.java 31 Jul 2004 16:42:35 -0000 1.46
***************
*** 28,32 ****

  import org.htmlparser.Node;
- import org.htmlparser.Parser;
  import org.htmlparser.Remark;
  import org.htmlparser.Text;
--- 28,31 ----
***************
*** 39,47 ****
      private String linkPrefix;
      private StringBuffer modifiedResult;
-     private Parser parser;

!     public UrlModifyingVisitor(Parser parser, String linkPrefix) {
          super(true,true);
-         this.parser = parser;
          this.linkPrefix =linkPrefix;
          modifiedResult = new StringBuffer();
--- 38,44 ----
      private String linkPrefix;
      private StringBuffer modifiedResult;

!     public UrlModifyingVisitor(String linkPrefix) {
          super(true,true);
          this.linkPrefix =linkPrefix;
          modifiedResult = new StringBuffer();
From: Derrick O. <der...@us...> - 2004-07-31 16:42:43
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18039/src/org/htmlparser/tags

Modified Files:
    TableHeader.java LinkTag.java TableColumn.java CompositeTag.java
Log Message:
Remove unused variables and other fixes exposed by turning on compiler warnings.

Index: TableColumn.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags/TableColumn.java,v
retrieving revision 1.36
retrieving revision 1.37
diff -C2 -d -r1.36 -r1.37
*** TableColumn.java 2 Jan 2004 16:24:55 -0000 1.36
--- TableColumn.java 31 Jul 2004 16:42:34 -0000 1.37
***************
*** 69,73 ****
      public String[] getEnders ()
      {
!         return (mIds);
      }

--- 69,73 ----
      public String[] getEnders ()
      {
!         return (mEnders);
      }

Index: LinkTag.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags/LinkTag.java,v
retrieving revision 1.51
retrieving revision 1.52
diff -C2 -d -r1.51 -r1.52
*** LinkTag.java 17 Jul 2004 13:45:04 -0000 1.51
--- LinkTag.java 31 Jul 2004 16:42:34 -0000 1.52
***************
*** 278,282 ****
              for (SimpleNodeIterator e=children();e.hasMoreNodes();)
              {
!                 node = (Node)e.nextNode();
                  sb.append(" "+(i++)+ " ");
                  sb.append(node.toString()+"\n");
--- 278,282 ----
              for (SimpleNodeIterator e=children();e.hasMoreNodes();)
              {
!                 node = e.nextNode();
                  sb.append(" "+(i++)+ " ");
                  sb.append(node.toString()+"\n");
Index: CompositeTag.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags/CompositeTag.java,v
retrieving revision 1.78
retrieving revision 1.79
diff -C2 -d -r1.78 -r1.79
*** CompositeTag.java 2 Jul 2004 00:49:28 -0000 1.78
--- CompositeTag.java 31 Jul 2004 16:42:34 -0000 1.79
***************
*** 57,65 ****
       * The default scanner for non-composite tags.
       */
!     protected final static CompositeTagScanner mDefaultScanner = new CompositeTagScanner ();

      public CompositeTag ()
      {
!         setThisScanner (mDefaultScanner);
      }

--- 57,65 ----
       * The default scanner for non-composite tags.
       */
!     protected final static CompositeTagScanner mDefaultCompositeScanner = new CompositeTagScanner ();

      public CompositeTag ()
      {
!         setThisScanner (mDefaultCompositeScanner);
      }

***************
*** 174,178 ****
          boolean found = false;
          for (SimpleNodeIterator e = children();e.hasMoreNodes() && !found;) {
!             node = (Node)e.nextNode();
              if (node instanceof Tag)
              {
--- 174,178 ----
          boolean found = false;
          for (SimpleNodeIterator e = children();e.hasMoreNodes() && !found;) {
!             node = e.nextNode();
              if (node instanceof Tag)
              {
***************
*** 432,436 ****
              while (children.hasMoreNodes ())
              {
!                 child = (Node)children.nextNode ();
                  child.accept (visitor);
              }
--- 432,436 ----
              while (children.hasMoreNodes ())
              {
!                 child = children.nextNode ();
                  child.accept (visitor);
              }
Index: TableHeader.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tags/TableHeader.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** TableHeader.java 24 Jan 2004 18:12:57 -0000 1.1
--- TableHeader.java 31 Jul 2004 16:42:34 -0000 1.2
***************
*** 69,73 ****
      public String[] getEnders ()
      {
!         return (mIds);
      }

--- 69,73 ----
      public String[] getEnders ()
      {
!         return (mEnders);
      }

From: Derrick O. <der...@us...> - 2004-07-31 16:42:43
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18039/src/org/htmlparser

Modified Files:
    PrototypicalNodeFactory.java Attribute.java
Log Message:
Remove unused variables and other fixes exposed by turning on compiler warnings.

Index: Attribute.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Attribute.java,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** Attribute.java 17 Jul 2004 13:45:05 -0000 1.2
--- Attribute.java 31 Jul 2004 16:42:35 -0000 1.3
***************
*** 685,692 ****
      public String toString ()
      {
-         String name;
-         String assignment;
-         String value;
-         char quote;
          int length;
          StringBuffer ret;
--- 685,688 ----
Index: PrototypicalNodeFactory.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/PrototypicalNodeFactory.java,v
retrieving revision 1.12
retrieving revision 1.13
diff -C2 -d -r1.12 -r1.13
*** PrototypicalNodeFactory.java 18 Jul 2004 21:31:22 -0000 1.12
--- PrototypicalNodeFactory.java 31 Jul 2004 16:42:35 -0000 1.13
***************
*** 386,391 ****
      public Remark createRemarkNode (Page page, int start, int end)
      {
-         int first;
-         int last;
          Remark ret;

--- 386,389 ----
From: Derrick O. <der...@us...> - 2004-07-31 16:42:43
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexerapplications/tabby
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18039/src/org/htmlparser/lexerapplications/tabby

Modified Files:
    Tabby.java
Log Message:
Remove unused variables and other fixes exposed by turning on compiler warnings.

Index: Tabby.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexerapplications/tabby/Tabby.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** Tabby.java 10 Sep 2003 03:38:26 -0000 1.1
--- Tabby.java 31 Jul 2004 16:42:34 -0000 1.2
***************
*** 103,107 ****
      {
          File[] files;
-         File f;

          if (file.isDirectory ())
--- 103,106 ----
***************
*** 297,300 ****
--- 296,302 ----
   *
   * $Log$
+  * Revision 1.2 2004/07/31 16:42:34 derrickoswald
+  * Remove unused variables and other fixes exposed by turning on compiler warnings.
+  *
   * Revision 1.1 2003/09/10 03:38:26 derrickoswald
   * Add style checking target to ant build script:
From: Derrick O. <der...@us...> - 2004-07-31 16:42:42
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/filters
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18039/src/org/htmlparser/filters

Modified Files:
    HasParentFilter.java
Log Message:
Remove unused variables and other fixes exposed by turning on compiler warnings.

Index: HasParentFilter.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/filters/HasParentFilter.java,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** HasParentFilter.java 17 Jul 2004 13:45:04 -0000 1.4
--- HasParentFilter.java 31 Jul 2004 16:42:34 -0000 1.5
***************
*** 29,33 ****
  import org.htmlparser.Node;
  import org.htmlparser.NodeFilter;
- import org.htmlparser.util.NodeList;

  /**
--- 29,32 ----
***************
*** 57,61 ****
      {
          Node parent;
-         NodeList children;
          boolean ret;

--- 56,59 ----
From: Derrick O. <der...@us...> - 2004-07-31 01:22:56
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv1666/src/org/htmlparser/tests

Modified Files:
    MemoryTest.java
Log Message:
Changed test case MemoryTest.testBigFile () to check for characters received, not bytes.

Index: MemoryTest.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/tests/MemoryTest.java,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** MemoryTest.java 14 Jun 2004 00:06:52 -0000 1.2
--- MemoryTest.java 31 Jul 2004 01:22:45 -0000 1.3
***************
*** 72,76 ****
              fail ("out of memory");
          }
!         assertEquals ("wrong size fetched", 4697411, size);
      }

--- 72,76 ----
              fail ("out of memory");
          }
!         assertEquals ("wrong size fetched", 4697386, size);
      }

From: Derrick O. <der...@us...> - 2004-07-29 03:02:28
|
Update of /cvsroot/htmlparser/htmlparser/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18102/docs Modified Files: release.txt Log Message: Fix distribution build and update release notes. Index: release.txt =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/release.txt,v retrieving revision 1.64 retrieving revision 1.65 diff -C2 -d -r1.64 -r1.65 *** release.txt 29 Jul 2004 02:01:02 -0000 1.64 --- release.txt 29 Jul 2004 03:02:19 -0000 1.65 *************** *** 27,38 **** Changes since Version 1.4 ------------------------- Configuration Management Removed the need for the Translate class to be packaged with htmllexer.jar. ! This results in a lighter weight component. Updated the logo and included ! the LGPL license. Refactoring Obviated LinkProcessor and moved it's functionality to the Page class. Added Tag, Text and Remark interfaces and moved concrete node implementations to the nodes package, removing the lexer.nodes package. Filters Added CssSelectorNodeFilter and RegExFilter. --- 27,51 ---- Changes since Version 1.4 ------------------------- + New APIs + Implement rudimentary sax parser. Currently exposes DOM parser via sax project Configuration Management Removed the need for the Translate class to be packaged with htmllexer.jar. ! This results in a lighter weight component. ! Updated the logo and included the LGPL license. ! Fixed the Windows batch files. Refactoring Obviated LinkProcessor and moved it's functionality to the Page class. Added Tag, Text and Remark interfaces and moved concrete node implementations to the nodes package, removing the lexer.nodes package. + Most internals now use the Tag interface. + Removed the org.htmlparser.tags.Tag class and moved the remaining (minor) + functionality to the TagNode class. + So now tags inherit directly from TagNode or CompositeTag. 
+ ** NOTE: If you have subclassed org.htmlparser.tags.Tag, use org.htmlparser.nodes.TagNode now.** + Removed deprecated methods getTagBegin/getTagEnd and deleted unused classes: + PeekingIterator and it's Implementation. + Added ObjectTag (like an applet tag). + Added a real StringSource that reads directly from a String rather than + creating a byte array. This avoids character encoding losses. Filters Added CssSelectorNodeFilter and RegExFilter. *************** *** 46,51 **** Bug Fixes --------- ! 919738 Text has not been extracted correctly using StringBean 936392 ScriptTag visitor fails for comments with ' Acknowledgements --- 59,68 ---- Bug Fixes --------- ! 998195 SiteCatpurer just crashed ! 995703 Parser Crash ! 988846 Linkbean getLinks() segmentation fault (duplicate of above) ! 973137 Double-bytes characters are messed after parsing 936392 ScriptTag visitor fails for comments with ' + 919738 Text has not been extracted correctly using StringBean Acknowledgements |
From: Derrick O. <der...@us...> - 2004-07-29 03:02:28
|
Update of /cvsroot/htmlparser/htmlparser In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18102 Modified Files: build.xml Log Message: Fix distribution build and update release notes. Index: build.xml =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/build.xml,v retrieving revision 1.70 retrieving revision 1.71 diff -C2 -d -r1.70 -r1.71 *** build.xml 14 Jul 2004 01:58:02 -0000 1.70 --- build.xml 29 Jul 2004 03:02:19 -0000 1.71 *************** *** 450,454 **** <zipfileset dir="${bin}" prefix="htmlparser${versionQualifier}/${bin}" includes="*" excludes="*.bat" filemode="755"/> <zipfileset dir="${docs}" prefix="htmlparser${versionQualifier}/${docs}" excludes="docs/**,samples/**"/> ! <zipfileset dir="${docs}/docs" prefix="htmlparser${versionQualifier}/${docs}/wiki"/> <zipfileset dir="${lib}" prefix="htmlparser${versionQualifier}/${lib}"/> <zipfileset dir="." prefix="htmlparser${versionQualifier}/" includes="src.zip"/> --- 450,454 ---- <zipfileset dir="${bin}" prefix="htmlparser${versionQualifier}/${bin}" includes="*" excludes="*.bat" filemode="755"/> <zipfileset dir="${docs}" prefix="htmlparser${versionQualifier}/${docs}" excludes="docs/**,samples/**"/> ! <zipfileset dir="${wiki}" prefix="htmlparser${versionQualifier}/${docs}/wiki"/> <zipfileset dir="${lib}" prefix="htmlparser${versionQualifier}/${lib}"/> <zipfileset dir="." prefix="htmlparser${versionQualifier}/" includes="src.zip"/> |
From: Derrick O. <der...@us...> - 2004-07-29 02:01:15
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10868/src/org/htmlparser Modified Files: Parser.java Log Message: Update version to 1.5-20040728 Index: Parser.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Parser.java,v retrieving revision 1.97 retrieving revision 1.98 diff -C2 -d -r1.97 -r1.98 *** Parser.java 29 Jul 2004 01:19:21 -0000 1.97 --- Parser.java 29 Jul 2004 02:01:02 -0000 1.98 *************** *** 86,90 **** */ public final static String ! VERSION_DATE = "Jun 13, 2004" ; --- 86,90 ---- */ public final static String ! VERSION_DATE = "Jul 28, 2004" ; |
From: Derrick O. <der...@us...> - 2004-07-29 02:01:11
|
Update of /cvsroot/htmlparser/htmlparser/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10868/docs Modified Files: changes.txt release.txt Log Message: Update version to 1.5-20040728 Index: release.txt =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/release.txt,v retrieving revision 1.63 retrieving revision 1.64 diff -C2 -d -r1.63 -r1.64 *** release.txt 26 Jun 2004 11:25:00 -0000 1.63 --- release.txt 29 Jul 2004 02:01:02 -0000 1.64 *************** *** 1,3 **** ! HTMLParser Version 1.5 (Integration Build Jun 13, 2004) ********************************************* --- 1,3 ---- ! HTMLParser Version 1.5 (Integration Build Jul 28, 2004) ********************************************* Index: changes.txt =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/changes.txt,v retrieving revision 1.201 retrieving revision 1.202 diff -C2 -d -r1.201 -r1.202 *** changes.txt 14 Jun 2004 01:26:50 -0000 1.201 --- changes.txt 29 Jul 2004 02:01:02 -0000 1.202 *************** *** 16,19 **** --- 16,213 ---- ******************************************************************************* + Integration Build 1.5 - 20040728 + -------------------------------- + + 2004-07-28 21:50 derrickoswald + + * src/org/htmlparser/parserapplications/SiteCapturer.java: + + Fix bug #998195 SiteCatpurer just crashed + After EncodingChangeException try again with the encoding now set correctly. + + 2004-07-28 21:19 derrickoswald + + * src/org/htmlparser/: Parser.java, lexer/Page.java: + + Fix bug #995703 Parser Crash and bug #988846 Linkbean getLinks() segmentation fault + by not testing for content type "text/XXX" in Page, but rather issuing a warning when this is + discovered by the Parser level. + + 2004-07-28 07:38 derrickoswald + + * bin/: beanybaby.bat, lexer.bat, linkextractor.bat, parser.bat, + stringextractor.bat, thumbelina, thumbelina.bat: + + Fix batch files. 
+ + 2004-07-21 22:22 derrickoswald + + * src/org/htmlparser/: tests/tagTests/LinkTagTest.java, + util/NodeList.java: + + Add test case for bug #982175 False Positives on ® entity. + Not reproducible (version 1.5). + + 2004-07-18 17:31 derrickoswald + + * src/: org/htmlparser/tests/scannersTests/ScriptScannerTest.java, + org/htmlparser/tags/FormTag.java, + org/htmlparser/tests/visitorsTests/ScriptCommentTest.java, + org/htmlparser/util/Translate.java, doc-files/overview.html, + org/htmlparser/nodes/TagNode.java, + org/htmlparser/tests/parserHelperTests/RemarkNodeParserTest.java, + org/htmlparser/tests/parserHelperTests/StringParserTest.java, + org/htmlparser/tests/tagTests/ImageTagTest.java, + org/htmlparser/tests/tagTests/LinkTagTest.java, + org/htmlparser/tests/tagTests/ScriptTagTest.java, + org/htmlparser/tests/utilTests/AllTests.java, + org/htmlparser/PrototypicalNodeFactory.java, + org/htmlparser/sax/Feedback.java, + org/htmlparser/tests/lexerTests/AttributeTests.java: + + Fix some javadoc warnings. + + 2004-07-17 09:45 derrickoswald + + * src/org/htmlparser/: scanners/CompositeTagScanner.java, + scanners/ScriptDecoder.java, scanners/ScriptScanner.java, + scanners/StyleScanner.java, + tests/utilTests/CharacterTranslationTest.java, + tests/utilTests/HTMLParserUtilsTest.java, + filters/CssSelectorNodeFilter.java, filters/HasParentFilter.java, + nodes/AbstractNode.java, nodes/RemarkNode.java, nodes/TagNode.java, + nodes/TextNode.java, tags/ImageTag.java, tags/LinkTag.java, + tags/TitleTag.java, tests/ParserTestCase.java, Attribute.java, + Parser.java, lexer/InputStreamSource.java, + tests/lexerTests/AttributeTests.java, + tests/scannersTests/ScriptScannerTest.java, util/Translate.java, + tests/tagTests/StyleTagTest.java, tests/tagTests/TagTest.java, + util/ParserUtils.java: + + Remove unused imports. 
+ + 2004-07-13 21:58 derrickoswald + + * build.xml, lib/sax2.jar, src/org/htmlparser/sax/Attributes.java, + src/org/htmlparser/sax/Feedback.java, + src/org/htmlparser/sax/Locator.java, + src/org/htmlparser/sax/XMLReader.java, + src/org/htmlparser/sax/package.html, + src/org/htmlparser/tests/SAXTest.java: + + Implement rudimentary sax parser. + Currently exposes DOM parser via sax project (http://sourceforge.net/projects/sax) interfaces. + + 2004-07-12 21:02 derrickoswald + + * src/org/htmlparser/lexer/Page.java, docs/contributors.html: + + Add fix to Page.getContentType() suggested by Manuel Polo. + + 2004-07-03 09:56 derrickoswald + + * src/org/htmlparser/: Parser.java, lexer/InputStreamSource.java, + lexer/Page.java, lexer/Source.java, lexer/StringSource.java, + tests/lexerTests/SourceTests.java, util/ParserUtils.java: + + Further fix to bug #973137 Double-bytes characters are messed after parsing. + Created a proper String based source with the encoding only optionally specified. + A string is no longer converted to a byte array and then back to characters. + + 2004-07-01 21:33 derrickoswald + + * src/org/htmlparser/tests/ParserTestCase.java: + + Fix broken test framework. 
+ + 2004-07-01 20:49 derrickoswald + + * build.xml, src/org/htmlparser/Node.java, + src/org/htmlparser/PrototypicalNodeFactory.java, + src/org/htmlparser/Tag.java, + src/org/htmlparser/filters/CssSelectorNodeFilter.java, + src/org/htmlparser/filters/HasParentFilter.java, + src/org/htmlparser/nodeDecorators/AbstractNodeDecorator.java, + src/org/htmlparser/nodes/TagNode.java, + src/org/htmlparser/scanners/CompositeTagScanner.java, + src/org/htmlparser/scanners/Scanner.java, + src/org/htmlparser/scanners/ScriptScanner.java, + src/org/htmlparser/scanners/StyleScanner.java, + src/org/htmlparser/scanners/TagScanner.java, + src/org/htmlparser/tags/AppletTag.java, + src/org/htmlparser/tags/BaseHrefTag.java, + src/org/htmlparser/tags/CompositeTag.java, + src/org/htmlparser/tags/DoctypeTag.java, + src/org/htmlparser/tags/FrameTag.java, + src/org/htmlparser/tags/ImageTag.java, + src/org/htmlparser/tags/InputTag.java, + src/org/htmlparser/tags/JspTag.java, + src/org/htmlparser/tags/MetaTag.java, + src/org/htmlparser/tags/ObjectTag.java, + src/org/htmlparser/tags/Tag.java, + src/org/htmlparser/tests/ParserTest.java, + src/org/htmlparser/tests/ParserTestCase.java, + src/org/htmlparser/tests/filterTests/FilterTest.java, + src/org/htmlparser/tests/lexerTests/AttributeTests.java, + src/org/htmlparser/tests/lexerTests/TagTests.java, + src/org/htmlparser/tests/parserHelperTests/CompositeTagScannerHelperTest.java, + src/org/htmlparser/tests/parserHelperTests/RemarkNodeParserTest.java, + src/org/htmlparser/tests/scannersTests/CompositeTagScannerTest.java, + src/org/htmlparser/tests/scannersTests/TagScannerTest.java, + src/org/htmlparser/tests/tagTests/BaseHrefTagTest.java, + src/org/htmlparser/tests/tagTests/BodyTagTest.java, + src/org/htmlparser/tests/tagTests/DivTagTest.java, + src/org/htmlparser/tests/tagTests/EndTagTest.java, + src/org/htmlparser/tests/tagTests/FormTagTest.java, + src/org/htmlparser/tests/tagTests/FrameSetTagTest.java, + 
src/org/htmlparser/tests/tagTests/HtmlTagTest.java, + src/org/htmlparser/tests/tagTests/JspTagTest.java, + src/org/htmlparser/tests/tagTests/LinkTagTest.java, + src/org/htmlparser/tests/tagTests/MetaTagTest.java, + src/org/htmlparser/tests/tagTests/ObjectCollectionTest.java, + src/org/htmlparser/tests/tagTests/SpanTagTest.java, + src/org/htmlparser/tests/tagTests/TagTest.java, + src/org/htmlparser/tests/tagTests/TitleTagTest.java, + src/org/htmlparser/tests/utilTests/CharacterTranslationTest.java, + src/org/htmlparser/tests/visitorsTests/TagFindingVisitorTest.java, + src/org/htmlparser/util/IteratorImpl.java, + src/org/htmlparser/util/ParserUtils.java, + src/org/htmlparser/util/PeekingIterator.java, + src/org/htmlparser/util/PeekingIteratorImpl.java: + + Part four of a multiphase refactoring. + Most internals now use the Tag interface. + This interface has been broadened to add set/get scanner and set/get endtag. + Removed the org.htmlparser.tags.Tag class and moved the remaining (minor) functionality + to the TagNode class. So now tags inherit directly from TagNode or CompositeTag. + ** NOTE: If you have subclassed org.htmlparser.tags.Tag, use org.htmlparser.nodes.TagNode now.** + Removed deprecated methods getTagBegin/getTagEnd and deleted unused classes: + PeekingIterator and it's Implementation. + + 2004-06-26 07:56 derrickoswald + + * src/org/htmlparser/tests/lexerTests/AttributeTests.java: + + Add test case for bug #979893 Not Parsing all Attributes. + Not reproducible. + + 2004-06-26 07:25 derrickoswald + + * docs/contributors.html, docs/release.txt, + src/org/htmlparser/PrototypicalNodeFactory.java, + src/org/htmlparser/tags/ObjectTag.java: + + Incorporate ObjectTag submitted by Enrico Triolo. 
+ + 2004-06-15 22:17 derrickoswald + + * src/org/htmlparser/: Parser.java, + tests/InstanceofPerformanceTest.java, tests/ParserTestCase.java, + tests/lexerTests/TagTests.java, + tests/visitorsTests/UrlModifyingVisitorTest.java: + + Fix bug #973137 Double-bytes characters are messed after parsing. + Add an encoding parameter to the static createParser() method. + Integration Build 1.5 - 20040613 -------------------------------- |
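The 2004-07-03 entry above notes that a string is no longer converted to a byte array and then back to characters. The sketch below illustrates why such a round trip can lose data; it assumes ISO-8859-1 as the intermediate encoding purely for illustration (the changelog does not name one), and the class and method names are hypothetical, not part of the library:

```java
import java.nio.charset.StandardCharsets;

class RoundTripLoss
{
    /**
     * Encode the text to bytes and decode it again using ISO-8859-1
     * (an assumed encoding, chosen only to make the loss visible).
     */
    static String viaBytes (String text)
    {
        byte[] bytes = text.getBytes (StandardCharsets.ISO_8859_1);
        return (new String (bytes, StandardCharsets.ISO_8859_1));
    }

    public static void main (String[] args)
    {
        String ascii = "abc";
        String cjk = "\u4e2d\u6587"; // two characters outside ISO-8859-1
        System.out.println (ascii.equals (viaBytes (ascii))); // survives the round trip
        System.out.println (cjk.equals (viaBytes (cjk)));     // does not survive
    }
}
```

Reading directly from the String, as the new StringSource is described as doing, sidesteps the intermediate encoding step altogether.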
From: Derrick O. <der...@us...> - 2004-07-29 01:50:28
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/parserapplications In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv9575/src/org/htmlparser/parserapplications Modified Files: SiteCapturer.java Log Message: Fix bug #998195 SiteCatpurer just crashed After EncodingChangeException try again with the encoding now set correctly. Index: SiteCapturer.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/parserapplications/SiteCapturer.java,v retrieving revision 1.5 retrieving revision 1.6 diff -C2 -d -r1.5 -r1.6 *** SiteCapturer.java 19 Jan 2004 23:14:18 -0000 1.5 --- SiteCapturer.java 29 Jul 2004 01:50:19 -0000 1.6 *************** *** 52,55 **** --- 52,56 ---- import org.htmlparser.tags.LinkTag; import org.htmlparser.tags.MetaTag; + import org.htmlparser.util.EncodingChangeException; import org.htmlparser.util.NodeIterator; import org.htmlparser.util.NodeList; *************** *** 439,445 **** // fetch the page and gather the list of nodes mParser.setURL (url); ! list = new NodeList (); ! for (NodeIterator e = mParser.elements (); e.hasMoreNodes (); ) ! list.add (e.nextNode ()); // URL conversion occurs in the tags // handle robots meta tag according to http://www.robotstxt.org/wc/meta-user.html --- 440,459 ---- // fetch the page and gather the list of nodes mParser.setURL (url); ! try ! { ! list = new NodeList (); ! for (NodeIterator e = mParser.elements (); e.hasMoreNodes (); ) ! list.add (e.nextNode ()); // URL conversion occurs in the tags ! } ! catch (EncodingChangeException ece) ! { ! // fix bug #998195 SiteCatpurer just crashed ! // try again with the encoding now set correctly ! // hopefully mPages, mImages, mCopied and mFinished won't be corrupted ! mParser.reset (); ! list = new NodeList (); ! for (NodeIterator e = mParser.elements (); e.hasMoreNodes (); ) ! list.add (e.nextNode ()); ! } // handle robots meta tag according to http://www.robotstxt.org/wc/meta-user.html |
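The SiteCapturer fix above follows a retry-once pattern: if the first pass is interrupted because the encoding changed, reset and run the pass again. A self-contained sketch of that control flow, using hypothetical stand-in types (EncodingChanged, NodeSource, FlakySource) rather than the real htmlparser classes:

```java
import java.util.ArrayList;
import java.util.List;

class RetryOnEncodingChange
{
    /**
     * Hypothetical stand-in for the library's EncodingChangeException.
     * Unchecked here only to keep the sketch short.
     */
    static class EncodingChanged extends RuntimeException
    {
    }

    /** Hypothetical stand-in for the parser. */
    interface NodeSource
    {
        List<String> readAll ();
        void reset ();
    }

    /** Read all nodes, retrying exactly once after an encoding change. */
    static List<String> collect (NodeSource source)
    {
        try
        {
            return (source.readAll ());
        }
        catch (EncodingChanged ece)
        {
            // the encoding is now known, so start over from the beginning
            source.reset ();
            return (source.readAll ());
        }
    }

    /** Toy source that fails on the first pass only. */
    static class FlakySource implements NodeSource
    {
        int resets = 0;
        boolean encodingKnown = false;

        public List<String> readAll ()
        {
            if (!encodingKnown)
            {
                encodingKnown = true; // e.g. a META tag revealed the charset
                throw new EncodingChanged ();
            }
            List<String> nodes = new ArrayList<String> ();
            nodes.add ("<html>");
            nodes.add ("</html>");
            return (nodes);
        }

        public void reset ()
        {
            resets++;
        }
    }

    public static void main (String[] args)
    {
        FlakySource source = new FlakySource ();
        System.out.println (collect (source).size () + " nodes after " + source.resets + " reset(s)");
    }
}
```

As the comment in the diff hints, anything accumulated during the aborted first pass would need to be discarded or tolerated; the sketch avoids the issue by returning a fresh list from each pass.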
From: Derrick O. <der...@us...> - 2004-07-29 01:19:30
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv6056/src/org/htmlparser/lexer Modified Files: Page.java Log Message: Fix bug #995703 Parser Crash and bug #988846 Linkbean getLinks() segmentation fault by not testing for content type "text/XXX" in Page, but rather issuing a warning when this is discovered by the Parser level. Index: Page.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer/Page.java,v retrieving revision 1.39 retrieving revision 1.40 diff -C2 -d -r1.39 -r1.40 *** Page.java 13 Jul 2004 01:02:38 -0000 1.39 --- Page.java 29 Jul 2004 01:19:22 -0000 1.40 *************** *** 355,363 **** } type = getContentType (); - if (type != null && !type.startsWith ("text")) - throw new ParserException ( - "URL " - + connection.getURL ().toExternalForm () - + " does not contain text"); charset = getCharset (type); try --- 355,358 ---- |
From: Derrick O. <der...@us...> - 2004-07-29 01:19:30
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv6056/src/org/htmlparser Modified Files: Parser.java Log Message: Fix bug #995703 Parser Crash and bug #988846 Linkbean getLinks() segmentation fault by not testing for content type "text/XXX" in Page, but rather issuing a warning when this is discovered by the Parser level. Index: Parser.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Parser.java,v retrieving revision 1.96 retrieving revision 1.97 diff -C2 -d -r1.96 -r1.97 *** Parser.java 17 Jul 2004 13:45:05 -0000 1.96 --- Parser.java 29 Jul 2004 01:19:21 -0000 1.97 *************** *** 440,443 **** --- 440,444 ---- { NodeFactory factory; + String type; if (null != lexer) *************** *** 449,452 **** --- 450,460 ---- lexer.setNodeFactory (factory); mLexer = lexer; + // warn about content that's not likely text + type = mLexer.getPage ().getContentType (); + if (type != null && !type.startsWith ("text")) + getFeedback ().warning ( + "URL " + + mLexer.getPage ().getUrl () + + " does not contain text"); } } |
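The two commits above replace a hard failure in Page with a warning issued at the Parser level. The condition itself is unchanged from the diff (a non-null content type that does not start with "text"); only the reaction differs. A minimal sketch of the warn-instead-of-throw variant, where the list of warnings is a hypothetical stand-in for the parser's feedback object:

```java
import java.util.ArrayList;
import java.util.List;

class ContentTypeCheck
{
    /**
     * Record a warning, rather than throwing, when the content type
     * is known and is not a text type. Returns true if a warning was issued.
     */
    static boolean warnIfNotText (String type, String url, List<String> warnings)
    {
        if (type != null && !type.startsWith ("text"))
        {
            warnings.add ("URL " + url + " does not contain text");
            return (true);
        }
        return (false);
    }

    public static void main (String[] args)
    {
        List<String> warnings = new ArrayList<String> ();
        warnIfNotText ("text/html", "http://example.com/a", warnings);
        warnIfNotText ("image/png", "http://example.com/b", warnings);
        warnIfNotText (null, "http://example.com/c", warnings);
        System.out.println (warnings);
    }
}
```

With this shape the caller can still proceed with parsing, which is what lets applications keep running when a server reports an unexpected content type.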
From: Derrick O. <der...@us...> - 2004-07-28 11:38:20
|
Update of /cvsroot/htmlparser/htmlparser/bin In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv26937 Modified Files: beanybaby.bat lexer.bat linkextractor.bat parser.bat stringextractor.bat thumbelina thumbelina.bat Log Message: Fix batch files. Index: parser.bat =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/bin/parser.bat,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** parser.bat 23 Sep 2003 03:41:34 -0000 1.1 --- parser.bat 28 Jul 2004 11:38:08 -0000 1.2 *************** *** 1 **** ! java -jar ..\lib\htmlparser.jar %1 %2 \ No newline at end of file --- 1 ---- ! java -classpath ..\lib\htmlparser.jar org.htmlparser.Parser %1 %2 \ No newline at end of file Index: thumbelina =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/bin/thumbelina,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** thumbelina 23 Sep 2003 03:41:34 -0000 1.1 --- thumbelina 28 Jul 2004 11:38:08 -0000 1.2 *************** *** 54,57 **** HTMLPARSER_LIB="${HTMLPARSER_HOME}/lib" ! "$JAVACMD" -Xmx256M -jar "${HTMLPARSER_LIB}/thumbelina.jar" "$@" --- 54,57 ---- HTMLPARSER_LIB="${HTMLPARSER_HOME}/lib" ! "$JAVACMD" -Xmx256M -classpath "${HTMLPARSER_LIB}/thumbelina.jar" org.htmlparser.lexerapplications.thumbelina.Thumbelina "$@" Index: linkextractor.bat =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/bin/linkextractor.bat,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** linkextractor.bat 31 Dec 2003 02:50:50 -0000 1.1 --- linkextractor.bat 28 Jul 2004 11:38:08 -0000 1.2 *************** *** 1 **** ! java -jar ..\lib\htmlparser.jar org.htmlparser.parserapplications.LinkExtractor %1 %2 --- 1 ---- ! 
java -classpath ..\lib\htmlparser.jar org.htmlparser.parserapplications.LinkExtractor %1 %2 Index: stringextractor.bat =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/bin/stringextractor.bat,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** stringextractor.bat 4 Jan 2004 03:23:09 -0000 1.1 --- stringextractor.bat 28 Jul 2004 11:38:08 -0000 1.2 *************** *** 1 **** ! java -jar ..\lib\htmlparser.jar org.htmlparser.parserapplications.StringExtractor %1 %2 --- 1 ---- ! java -classpath ..\lib\htmlparser.jar org.htmlparser.parserapplications.StringExtractor %1 %2 Index: beanybaby.bat =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/bin/beanybaby.bat,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** beanybaby.bat 4 Jan 2004 03:23:09 -0000 1.1 --- beanybaby.bat 28 Jul 2004 11:38:07 -0000 1.2 *************** *** 1 **** ! java -jar ..\lib\htmlparser.jar org.htmlparser.beans.BeanyBaby %1 %2 --- 1 ---- ! java -classpath ..\lib\htmlparser.jar org.htmlparser.beans.BeanyBaby %1 %2 Index: lexer.bat =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/bin/lexer.bat,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** lexer.bat 23 Sep 2003 03:41:34 -0000 1.1 --- lexer.bat 28 Jul 2004 11:38:08 -0000 1.2 *************** *** 1 **** ! java -jar ..\lib\htmlparser.jar org.htmlparser.lexer.Lexer %1 %2 --- 1 ---- ! java -classpath ..\lib\htmllexer.jar org.htmlparser.lexer.Lexer %1 %2 Index: thumbelina.bat =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/bin/thumbelina.bat,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** thumbelina.bat 23 Sep 2003 03:41:34 -0000 1.1 --- thumbelina.bat 28 Jul 2004 11:38:08 -0000 1.2 *************** *** 1 **** ! 
java -Xmx256M -jar ..\lib\thumbelina.jar %1 %2 --- 1 ---- ! java -Xmx256M -classpath ..\lib\thumbelina.jar org.htmlparser.lexerapplications.thumbelina.Thumbelina %1 %2 |
From: Derrick O. <der...@us...> - 2004-07-28 00:26:31
|
Update of /cvsroot/htmlparser/htmlparser/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv27270/docs Modified Files: Tag: v1_41 changes.txt release.txt Log Message: Update version to 1.42 patch release. Index: release.txt =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/release.txt,v retrieving revision 1.58.2.1 retrieving revision 1.58.2.2 diff -C2 -d -r1.58.2.1 -r1.58.2.2 *** release.txt 22 May 2004 20:10:31 -0000 1.58.2.1 --- release.txt 28 Jul 2004 00:26:21 -0000 1.58.2.2 *************** *** 1,3 **** ! HTMLParser Version 1.41 (Release Build May 22, 2004) ********************************************* --- 1,3 ---- ! HTMLParser Version 1.42 (Release Build Jul 27, 2004) ********************************************* *************** *** 19,22 **** --- 19,32 ---- (v) this file + Changes since Version 1.41 + ------------------------- + + Bug Fixes + --------- + #998195 SiteCatpurer just crashed + #995744 Translate.decode(String) + #995703 Parser Crash + #988846 Linkbean getLinks() segmentation fault (duplicate of above) + Changes since Version 1.4 ------------------------- Index: changes.txt =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/docs/changes.txt,v retrieving revision 1.199.2.1 retrieving revision 1.199.2.2 diff -C2 -d -r1.199.2.1 -r1.199.2.2 *** changes.txt 22 May 2004 20:10:30 -0000 1.199.2.1 --- changes.txt 28 Jul 2004 00:26:21 -0000 1.199.2.2 *************** *** 13,16 **** --- 13,40 ---- ******************************************************************************* + Release Build 1.42 - 20040727 + -------------------------------- + + 2004-07-27 07:56 derrickoswald + + * src/org/htmlparser/parserapplications/SiteCapturer.java (v1_41): + + Fix bug #998195 SiteCatpurer just crashed + After EncodingChangeException try again with the encoding now set correctly. 
+ + 2004-07-27 07:32 derrickoswald + + * src/org/htmlparser/util/LinkProcessor.java (v1_41): + + Avoid bug #995744 Translate.decode(String) + don't apply translation to URLs + + 2004-07-27 07:15 derrickoswald + + * src/org/htmlparser/lexer/Page.java (v1_41): + + Avoid bug #995703 Parser Crash and also #988846 Linkbean getLinks() segmentation fault + by not testing for content type "text/XXXX" + Release Build 1.41 - 20040522 -------------------------------- |
From: Derrick O. <der...@us...> - 2004-07-28 00:26:30
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv27270/src/org/htmlparser Modified Files: Tag: v1_41 Parser.java Log Message: Update version to 1.42 patch release. Index: Parser.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Parser.java,v retrieving revision 1.89.2.1 retrieving revision 1.89.2.2 diff -C2 -d -r1.89.2.1 -r1.89.2.2 *** Parser.java 22 May 2004 20:10:31 -0000 1.89.2.1 --- Parser.java 28 Jul 2004 00:26:21 -0000 1.89.2.2 *************** *** 74,78 **** */ public final static double ! VERSION_NUMBER = 1.41 ; --- 74,78 ---- */ public final static double ! VERSION_NUMBER = 1.42 ; *************** *** 88,92 **** */ public final static String ! VERSION_DATE = "May 22, 2004" ; --- 88,92 ---- */ public final static String ! VERSION_DATE = "Jul 27, 2004" ; |
From: Derrick O. <der...@us...> - 2004-07-27 11:56:35
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/parserapplications In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv9312/src/org/htmlparser/parserapplications Modified Files: Tag: v1_41 SiteCapturer.java Log Message: Fix bug #998195 SiteCatpurer just crashed After EncodingChangeException try again with the encoding now set correctly. Index: SiteCapturer.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/parserapplications/SiteCapturer.java,v retrieving revision 1.5 retrieving revision 1.5.2.1 diff -C2 -d -r1.5 -r1.5.2.1 *** SiteCapturer.java 19 Jan 2004 23:14:18 -0000 1.5 --- SiteCapturer.java 27 Jul 2004 11:56:26 -0000 1.5.2.1 *************** *** 52,55 **** --- 52,56 ---- import org.htmlparser.tags.LinkTag; import org.htmlparser.tags.MetaTag; + import org.htmlparser.util.EncodingChangeException; import org.htmlparser.util.NodeIterator; import org.htmlparser.util.NodeList; *************** *** 439,445 **** // fetch the page and gather the list of nodes mParser.setURL (url); ! list = new NodeList (); ! for (NodeIterator e = mParser.elements (); e.hasMoreNodes (); ) ! list.add (e.nextNode ()); // URL conversion occurs in the tags // handle robots meta tag according to http://www.robotstxt.org/wc/meta-user.html --- 440,459 ---- // fetch the page and gather the list of nodes mParser.setURL (url); ! try ! { ! list = new NodeList (); ! for (NodeIterator e = mParser.elements (); e.hasMoreNodes (); ) ! list.add (e.nextNode ()); // URL conversion occurs in the tags ! } ! catch (EncodingChangeException ece) ! { ! // fix bug #998195 SiteCatpurer just crashed ! // try again with the encoding now set correctly ! // hopefully mPages, mImages, mCopied and mFinished won't be corrupted ! mParser.reset (); ! list = new NodeList (); ! for (NodeIterator e = mParser.elements (); e.hasMoreNodes (); ) ! list.add (e.nextNode ()); ! 
} // handle robots meta tag according to http://www.robotstxt.org/wc/meta-user.html |
From: Derrick O. <der...@us...> - 2004-07-27 11:32:32
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv5661/src/org/htmlparser/util Modified Files: Tag: v1_41 LinkProcessor.java Log Message: Avoid bug #995744 Translate.decode(String) don't apply translation to URLs Index: LinkProcessor.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util/LinkProcessor.java,v retrieving revision 1.33 retrieving revision 1.33.2.1 diff -C2 -d -r1.33 -r1.33.2.1 *** LinkProcessor.java 2 Jan 2004 16:24:58 -0000 1.33 --- LinkProcessor.java 27 Jul 2004 11:32:23 -0000 1.33.2.1 *************** *** 83,87 **** } ! return (Translate.decode (ret)); } --- 83,89 ---- } ! // avoid bug #995744 Translate.decode(String) ! // don't apply translation to URLs ! return (ret); } |
From: Derrick O. <der...@us...> - 2004-07-27 11:15:32
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv2443/src/org/htmlparser/lexer Modified Files: Tag: v1_41 Page.java Log Message: Avoid bug #995703 Parser Crash and also #988846 Linkbean getLinks() segmentation fault by not testing for content type "text/XXXX" Index: Page.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/lexer/Page.java,v retrieving revision 1.33 retrieving revision 1.33.2.1 diff -C2 -d -r1.33 -r1.33.2.1 *** Page.java 31 Jan 2004 20:51:01 -0000 1.33 --- Page.java 27 Jul 2004 11:15:23 -0000 1.33.2.1 *************** *** 336,344 **** } type = getContentType (); ! if (!type.startsWith ("text")) ! throw new ParserException ( ! "URL " ! + connection.getURL ().toExternalForm () ! + " does not contain text"); charset = getCharset (type); try --- 336,345 ---- } type = getContentType (); ! // removed to avoid bug #995703 Parser Crash and also #988846 Linkbean getLinks() segmentation fault ! // if (!type.startsWith ("text")) ! // throw new ParserException ( ! // "URL " ! // + connection.getURL ().toExternalForm () ! // + " does not contain text"); charset = getCharset (type); try |
From: Derrick O. <der...@us...> - 2004-07-22 02:22:40
|
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19476/src/org/htmlparser/util Modified Files: NodeList.java Log Message: Add test case for bug #982175 False Positives on ® entity. Not reproducible (version 1.5). Index: NodeList.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/util/NodeList.java,v retrieving revision 1.54 retrieving revision 1.55 diff -C2 -d -r1.54 -r1.55 *** NodeList.java 10 Jan 2004 00:06:03 -0000 1.54 --- NodeList.java 22 Jul 2004 02:22:32 -0000 1.55 *************** *** 242,246 **** public void keepAllNodesThatMatch (NodeFilter filter, boolean recursive) { - String name; Node node; NodeList children; --- 242,245 ---- |