[Htmlparser-cvs] htmlparser/src/org/htmlparser NodeFilter.java,1.2,1.3 Parser.java,1.103,1.104 packa
From: Derrick O. <der...@us...> - 2005-04-05 01:03:01
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv28518/htmlparser/src/org/htmlparser Modified Files: NodeFilter.java Parser.java package.html Log Message: Update javadocs. Enable SiteCapturer to handle resource names containing spaces. Index: package.html =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/package.html,v retrieving revision 1.21 retrieving revision 1.22 diff -C2 -d -r1.21 -r1.22 *** package.html 14 Jun 2004 00:06:51 -0000 1.21 --- package.html 5 Apr 2005 00:48:12 -0000 1.22 *************** *** 33,40 **** the HTML Parser. <p>The {@link org.htmlparser.Parser} class is the main high level class that ! provides simplified access to the contents of an HTML page. The page can be ! specified as either a URLConnection or a String. In the case of a String, an ! attempt is made to open it as a URL, and if that fails it assumes it is a local ! disk file. A wide range of methods is available to customize the operation of the Parser, as well as access specific pieces of the page as --- 33,37 ---- the HTML Parser. <p>The {@link org.htmlparser.Parser} class is the main high level class that ! provides simplified access to the contents of an HTML page. A wide range of methods is available to customize the operation of the Parser, as well as access specific pieces of the page as *************** *** 48,56 **** is the {@link org.htmlparser.PrototypicalNodeFactory} which operates by holding example nodes and cloning them as needed to satisfy the ! requests for nodes by the Parser. The Lexer is it's own NodeFactory, returning ! new {@link org.htmlparser.nodes.TextNode}, {@link org.htmlparser.nodes.RemarkNode} and undifferentiated {@link org.htmlparser.nodes.TagNode Tagnodes} (see the ! 
{@link org.htmlparser.nodes nodes} package).</p> <p>The {@link org.htmlparser.NodeFilter} interface is used by the filtering code to determine if a node meets a certain criteria. Some generic examples of --- 45,55 ---- is the {@link org.htmlparser.PrototypicalNodeFactory} which operates by holding example nodes and cloning them as needed to satisfy the ! requests for nodes by the Parser. By default, a Lexer is it's own NodeFactory, ! returning new {@link org.htmlparser.nodes.TextNode}, {@link org.htmlparser.nodes.RemarkNode} and undifferentiated {@link org.htmlparser.nodes.TagNode Tagnodes} (see the ! {@link org.htmlparser.nodes nodes} package), but when the parser uses a lexer ! it replaces this behaviour with a PrototypicalNodeFactory to return a rich ! set of specific tags (see the {@link org.htmlparser.tags tags} package).</p> <p>The {@link org.htmlparser.NodeFilter} interface is used by the filtering code to determine if a node meets a certain criteria. Some generic examples of Index: Parser.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Parser.java,v retrieving revision 1.103 retrieving revision 1.104 diff -C2 -d -r1.103 -r1.104 *** Parser.java 13 Mar 2005 15:36:11 -0000 1.103 --- Parser.java 5 Apr 2005 00:48:10 -0000 1.104 *************** *** 46,59 **** /** ! * This is the class that the user will use, either to get an iterator into ! * the html page or to directly parse the page and print the results ! * <BR> ! * Typical usage of the parser is as follows : <BR> ! * [1] Create a parser object - passing the URL and a feedback object to the parser<BR> ! * [2] Enumerate through the elements from the parser object <BR> ! * It is important to note that the parsing occurs when you enumerate, ON DEMAND. ! * This is a thread-safe way, and you only get the control back after a ! * particular element is parsed and returned, which could be the entire body. ! 
* @see Parser#elements() */ public class Parser --- 46,108 ---- /** ! * The main parser class. ! * This is the primary class of the HTML Parser library. It provides ! * constructors that take a {@link #Parser(String) String}, ! * a {@link #Parser(URLConnection) URLConnection}, or a ! * {@link #Parser(Lexer) Lexer}. In the case of a String, an ! * attempt is made to open it as a URL, and if that fails it assumes it is a ! * local disk file. If you want to actually parse a String, use ! * {@link #setInputHTML setInputHTML()} after using the ! * {@link #Parser() no-args} constructor, or use {@link #createParser}. ! * <p>The Parser provides access to the contents of the ! * page, via a {@link #elements() NodeIterator}, a ! * {@link #parse(NodeFilter) NodeList} or a ! * {@link #visitAllNodesWith NodeVisitor}. ! * <p>Typical usage of the parser is: ! * <code> ! * <pre> ! * Parser parser = new Parser ("http://whatever"); ! * NodeList list = parser.parse (); ! * // do something with your list of nodes. ! * </pre> ! * </code></p> ! * <p>What types of nodes and what can be done with them is dependant on the ! * setup, but in general a node can be converted back to HTML and it's ! * children (enclosed nodes) and parent can be obtained, because nodes are ! * nested. See the {@link Node} interface.</p> ! * <p>For example, if the URL contains:<br> ! * <code> ! * {@.html ! * <html> ! * <head> ! * <title>Mondays -- What a bad idea.</title> ! * </head> ! * <body BGCOLOR="#FFFFFF"> ! * Most people have a pathological hatred of Mondays... ! * </body> ! * </html>} ! * </code><br> ! * and the example code above is used, the list contain only one element, the ! * {@.html <html>} node. This node is a {@link org.htmlparser.tags tag}, ! * which is an object of class ! * {@link org.htmlparser.tags.Html Html} if the default {@link NodeFactory} ! * (a {@link PrototypicalNodeFactory}) is used.</p> ! * <p>To get at further content, the children of the top ! * level nodes must be examined. 
When digging through a node list one must be ! * conscious of the possibility of whitespace between nodes, e.g. in the example ! * above: ! * <code> ! * <pre> ! * Node node = list.elementAt (0); ! * NodeList sublist = node.getChildren (); ! * System.out.println (sublist.size ()); ! * </pre> ! * </code> ! * would print out 5, not 2, because there are newlines after {@.html <html>}, ! * {@.html </head>} and {@.html </body>} that are children of the HTML node ! * besides the {@.html <head>} and {@.html <body>} nodes.</p> ! * <p>Because processing nodes is so common, two interfaces are provided to ! * ease this task, {@link org.htmlparser.filters filters} ! * and {@link org.htmlparser.visitors visitors}. */ public class Parser *************** *** 66,70 **** /** ! * The floating point version number. */ public final static double --- 115,119 ---- /** ! * The floating point version number ({@value}). */ public final static double *************** *** 73,77 **** /** ! * The type of version. */ public final static String --- 122,126 ---- /** ! * The type of version ({@value}). */ public final static String *************** *** 80,84 **** /** ! * The date of the version. */ public final static String --- 129,133 ---- /** ! * The date of the version ({@value}). */ public final static String *************** *** 87,91 **** /** ! * The display version. */ public final static String --- 136,140 ---- /** ! * The display version ({@value}). */ public final static String *************** *** 186,191 **** /** * Zero argument constructor. ! * The parser is in a safe but useless state. ! * Set the lexer or connection using setLexer() or setConnection(). * @see #setLexer(Lexer) * @see #setConnection(URLConnection) --- 235,241 ---- /** * Zero argument constructor. ! * The parser is in a safe but useless state parsing an empty string. ! * Set the lexer or connection using {@link #setLexer} ! * or {@link #setConnection}. 
* @see #setLexer(Lexer) * @see #setConnection(URLConnection) *************** *** 197,213 **** /** ! * This constructor enables the construction of test cases, with readers ! * associated with test string buffers. It can also be used with readers of the user's choice ! * streaming data into the parser.<p/> ! * <B>Important:</B> If you are using this constructor, and you would like to use the parser ! * to parse multiple times (multiple calls to parser.elements()), you must ensure the following:<br> ! * <ul> ! * <li>Before the first parse, you must mark the reader for a length that you anticipate (the size of the stream).</li> ! * <li>After the first parse, calls to elements() must be preceded by calls to : ! * <pre> ! * parser.getReader().reset(); ! * </pre> ! * </li> ! * </ul> * @param lexer The lexer to draw characters from. * @param fb The object to use when information, --- 247,253 ---- /** ! * Construct a parser using the provided lexer and feedback object. ! * This would be used to create a parser for special cases where the ! * normal creation of a lexer on a URLConnection needs to be customized. * @param lexer The lexer to draw characters from. * @param fb The object to use when information, *************** *** 226,232 **** --- 266,276 ---- /** * Constructor for custom HTTP access. + * This would be used to create a parser for a URLConnection that needs + * a special setup or negotiation conditioning beyond what is available + * from the {@link #getConnectionManager ConnectionManager}. * @param connection A fully conditioned connection. The connect() * method will be called so it need not be connected yet. * @param fb The object to use for message communication. + * @throws ParserException If the creation of the underlying Lexer cannot be performed. 
*/ public Parser (URLConnection connection, ParserFeedback fb) *************** *** 240,243 **** --- 284,288 ---- * Creates a Parser object with the location of the resource (URL or file) * You would typically create a DefaultHTMLParserFeedback object and pass it in. + * @see #Parser(URLConnection,ParserFeedback) * @param resourceLocn Either the URL or the filename (autodetects). * A standard HTTP GET is performed to read the content of the URL. *************** *** 245,249 **** * warning and error messages are produced. If <em>null</em> no feedback * is provided. ! * @see #Parser(URLConnection,ParserFeedback) */ public Parser (String resourceLocn, ParserFeedback feedback) throws ParserException --- 290,294 ---- * warning and error messages are produced. If <em>null</em> no feedback * is provided. ! * @throws ParserException If the URL is invalid. */ public Parser (String resourceLocn, ParserFeedback feedback) throws ParserException *************** *** 256,259 **** --- 301,305 ---- * A DefaultHTMLParserFeedback object is used for feedback. * @param resourceLocn Either the URL or the filename (autodetects). + * @throws ParserException If the resourceLocn argument does not resolve to a valid page or file. */ public Parser (String resourceLocn) throws ParserException *************** *** 263,279 **** /** ! * This constructor is present to enable users to plugin their own lexers. ! * A DefaultHTMLParserFeedback object is used for feedback. It can also be used with readers of the user's choice ! * streaming data into the parser.<p/> ! * <B>Important:</B> If you are using this constructor, and you would like to use the parser ! * to parse multiple times (multiple calls to parser.elements()), you must ensure the following:<br> ! * <ul> ! * <li>Before the first parse, you must mark the reader for a length that you anticipate (the size of the stream).</li> ! * <li>After the first parse, calls to elements() must be preceded by calls to : ! * <pre> ! 
* parser.getReader().reset(); ! * </pre> ! * </li> ! * @param lexer The source for HTML to be parsed. */ public Parser (Lexer lexer) --- 309,317 ---- /** ! * Construct a parser using the provided lexer. ! * A feedback object printing to {@link #stdout System.out} is used. ! * This would be used to create a parser for special cases where the ! * normal creation of a lexer on a URLConnection needs to be customized. ! * @param lexer The lexer to draw characters from. */ public Parser (Lexer lexer) *************** *** 283,291 **** /** ! * Constructor for non-standard access. ! * A DefaultHTMLParserFeedback object is used for feedback. * @param connection A fully conditioned connection. The connect() * method will be called so it need not be connected yet. ! * @see #Parser(URLConnection,ParserFeedback) */ public Parser (URLConnection connection) throws ParserException --- 321,333 ---- /** ! * Construct a parser using the provided URLConnection. ! * This would be used to create a parser for a URLConnection that needs ! * a special setup or negotiation conditioning beyond what is available ! * from the {@link #getConnectionManager ConnectionManager}. ! * A feedback object printing to {@link #stdout System.out} is used. ! * @see #Parser(URLConnection,ParserFeedback) * @param connection A fully conditioned connection. The connect() * method will be called so it need not be connected yet. ! * @throws ParserException If the creation of the underlying Lexer cannot be performed. */ public Parser (URLConnection connection) throws ParserException *************** *** 301,305 **** * Set the connection for this parser. * This method creates a new <code>Lexer</code> reading from the connection. - * Trying to set the connection to null is a noop. * @param connection A fully conditioned connection. The connect() * method will be called so it need not be connected yet. --- 343,346 ---- *************** *** 313,318 **** ParserException { ! if (null != connection) ! 
setLexer (new Lexer (connection)); } --- 354,360 ---- ParserException { ! if (null == connection) ! throw new IllegalArgumentException ("connection cannot be null"); ! setLexer (new Lexer (connection)); } *************** *** 320,324 **** * Return the current connection. * @return The connection either created by the parser or passed into this ! * parser via <code>setConnection</code>. * @see #setConnection(URLConnection) */ --- 362,366 ---- * Return the current connection. * @return The connection either created by the parser or passed into this ! * parser via {@link #setConnection}. * @see #setConnection(URLConnection) */ *************** *** 331,336 **** * Set the URL for this parser. * This method creates a new Lexer reading from the given URL. ! * Trying to set the url to null or an empty string is a noop. ! * @see #setConnection(URLConnection) */ public void setURL (String url) --- 373,380 ---- * Set the URL for this parser. * This method creates a new Lexer reading from the given URL. ! * Trying to set the url to null or an empty string is a no-op. ! * @param url The new URL for the parser. ! * @throws ParserException If the url is invalid or creation of the ! * underlying Lexer cannot be performed. */ public void setURL (String url) *************** *** 339,349 **** { if ((null != url) && !"".equals (url)) ! setConnection (Page.getConnectionManager ().openConnection (url)); } /** * Return the current URL being parsed. ! * @return The url passed into the constructor or the file name ! * passed to the constructor modified to be a URL. */ public String getURL () --- 383,395 ---- { if ((null != url) && !"".equals (url)) ! setConnection (getConnectionManager ().openConnection (url)); } /** * Return the current URL being parsed. ! * @return The current url. This is the URL for the current page. ! * A string passed into the constructor or set via setURL may be altered, ! * for example, a file name may be modified to be a URL. ! 
* @see Page#getUrl */ public String getURL () *************** *** 355,358 **** --- 401,408 ---- * Set the encoding for the page this parser is reading from. * @param encoding The new character set to use. + * @throws ParserException If the encoding change causes characters that + * have already been consumed to differ from the characters that would + * have been seen had the new encoding been in force. + * @see org.htmlparser.util.EncodingChangeException */ public void setEncoding (String encoding) *************** *** 367,370 **** --- 417,421 ---- * This item is set from the HTTP header but may be overridden by meta * tags in the head, so this may change after the head has been parsed. + * @return The encoding currently in force. */ public String getEncoding () *************** *** 375,383 **** /** * Set the lexer for this parser. ! * The current NodeFactory is set on the given lexer, since the lexer ! * contains the node factory object. * It does not adjust the <code>feedback</code> object. ! * Trying to set the lexer to <code>null</code> is a noop. * @param lexer The lexer object to use. */ public void setLexer (Lexer lexer) --- 426,435 ---- /** * Set the lexer for this parser. ! * The current NodeFactory is transferred to (set on) the given lexer, ! * since the lexer owns the node factory object. * It does not adjust the <code>feedback</code> object. ! * Trying to set the lexer to <code>null</code> is a no-op. * @param lexer The lexer object to use. + * @see #setNodeFactory */ public void setLexer (Lexer lexer) *************** *** 405,409 **** /** ! * Returns the reader associated with the parser * @return The current lexer. */ --- 457,461 ---- /** ! * Returns the lexer associated with the parser * @return The current lexer. */ *************** *** 415,419 **** /** * Get the current node factory. ! * @return The parser's node factory. */ public NodeFactory getNodeFactory () --- 467,471 ---- /** * Get the current node factory. ! 
* @return The current lexer's node factory. */ public NodeFactory getNodeFactory () *************** *** 424,428 **** /** * Set the current node factory. ! * @param factory The new node factory for the parser. */ public void setNodeFactory (NodeFactory factory) --- 476,480 ---- /** * Set the current node factory. ! * @param factory The new node factory for the current lexer. */ public void setNodeFactory (NodeFactory factory) *************** *** 435,439 **** /** * Sets the feedback object used in scanning. ! * @param fb The new feedback object to use. */ public void setFeedback (ParserFeedback fb) --- 487,492 ---- /** * Sets the feedback object used in scanning. ! * @param fb The new feedback object to use. If this is null a ! * {@link #noFeedback silent feedback object} is used. */ public void setFeedback (ParserFeedback fb) *************** *** 443,448 **** /** ! * Returns the feedback. ! * @return HTMLParserFeedback */ public ParserFeedback getFeedback() --- 496,501 ---- /** ! * Returns the current feedback object. ! * @return The feedback object currently being used. */ public ParserFeedback getFeedback() *************** *** 457,460 **** --- 510,515 ---- /** * Reset the parser to start from the beginning again. + * This assumes support for a reset from the underlying + * {@link org.htmlparser.lexer.Source} object. */ public void reset () *************** *** 464,488 **** /** ! * Returns an iterator (enumeration) to the html nodes. Each node can be a tag/endtag/ ! * string/link/image<br> ! * This is perhaps the most important method of this class. In typical situations, you will need to use ! * the parser like this : * <pre> ! * Parser parser = new Parser("http://www.yahoo.com"); ! * for (NodeIterator i = parser.elements();i.hasMoreElements();) { ! * Node node = i.nextHTMLNode(); ! * if (node instanceof StringNode) { ! * // Downcasting to StringNode ! * StringNode stringNode = (StringNode)node; ! * // Do whatever processing you want with the string node ! 
* System.out.println(stringNode.getText()); ! * } ! * // Check for the node or tag that you want ! * if (node instanceof ...) { ! * // Downcast, and process ! * // recursively (nodes within nodes) ! * } * } * </pre> */ public NodeIterator elements () throws ParserException --- 519,569 ---- /** ! * Returns an iterator (enumeration) over the html nodes. ! * {@link org.htmlparser.nodes Nodes} can be of three main types: ! * <ul> ! * <li>{@link org.htmlparser.nodes.TagNode TagNode}</li> ! * <li>{@link org.htmlparser.nodes.TextNode TextNode}</li> ! * <li>{@link org.htmlparser.nodes.RemarkNode RemarkNode}</li> ! * </ul> ! * In general, when parsing with an iterator or processing a NodeList, ! * you will need to use recursion. For example: ! * <code> * <pre> ! * void processMyNodes (Node node) ! * { ! * if (node instanceof TextNode) ! * { ! * // downcast to TextNode ! * TextNode text = (TextNode)node; ! * // do whatever processing you want with the text ! * System.out.println (text.getText ()); ! * } ! * if (node instanceof RemarkNode) ! * { ! * // downcast to RemarkNode ! * RemarkNode remark = (RemarkNode)node; ! * // do whatever processing you want with the comment ! * } ! * else if (node instanceof TagNode) ! * { ! * // downcast to TagNode ! * TagNode tag = (TagNode)node; ! * // do whatever processing you want with the tag itself ! * // ... ! * // process recursively (nodes within nodes) via getChildren() ! * NodeList list = tag.getChildren (); ! * if (null != list) ! * for (NodeIterator i = list.elements (); i.hasMoreElements (); ) ! * processMyNodes (i.nextNode ()); ! * } * } + * + * Parser parser = new Parser ("http://www.yahoo.com"); + * for (NodeIterator i = parser.elements (); i.hasMoreElements (); ) + * processMyNodes (i.nextNode ()); * </pre> + * </code> + * @throws ParserException If a parsing error occurs. + * @return An iterator over the top level nodes (usually {@.html <html>}). 
*/ public NodeIterator elements () throws ParserException *************** *** 493,499 **** /** * Parse the given resource, using the filter provided. - * @param filter The filter to apply to the parsed nodes. * @return The list of matching nodes (for a <code>null</code> * filter this is all the top level nodes). */ public NodeList parse (NodeFilter filter) throws ParserException --- 574,582 ---- /** * Parse the given resource, using the filter provided. * @return The list of matching nodes (for a <code>null</code> * filter this is all the top level nodes). + * @param filter The filter to apply to the parsed nodes, + * or <code>null</code> to retrieve all the top level nodes. + * @throws ParserException If a parsing error occurs. */ public NodeList parse (NodeFilter filter) throws ParserException *************** *** 516,520 **** } ! public void visitAllNodesWith(NodeVisitor visitor) throws ParserException { Node node; visitor.beginParsing(); --- 599,612 ---- } ! /** ! * Apply the given visitor to the current page. ! * The visitor is passed to the <code>accept()</code> method of each node ! * in the page in a depth first traversal. The visitor ! * <code>beginParsing()</code> method is called prior to processing the ! * page and <code>finishedParsing()</code> is called after the processing. ! * @param visitor The visitor to visit all nodes with. ! * @throws ParserException If a parse error occurs while traversing the page with the visitor. ! */ ! public void visitAllNodesWith (NodeVisitor visitor) throws ParserException { Node node; visitor.beginParsing(); *************** *** 529,532 **** --- 621,625 ---- * Initializes the parser with the given input HTML String. * @param inputHTML the input HTML that is to be parsed. + * @throws ParserException If a error occurs in setting up the underlying Lexer. */ public void setInputHTML (String inputHTML) *************** *** 543,546 **** --- 636,644 ---- * Extract all nodes matching the given filter. 
* @see Node#collectInto(NodeList, NodeFilter) + * @param filter The filter to be applied to the nodes. + * @throws ParserException If a parse error occurs. + * @return A list of nodes matching the filter criteria, + * i.e. for which the filter's accept method + * returned <code>true</code>. */ public NodeList extractAllNodesThatMatch (NodeFilter filter) throws ParserException *************** *** 558,564 **** /** * Convenience method to extract all nodes of a given class type. ! * @see Node#collectInto(NodeList, NodeFilter) */ ! public Node [] extractAllNodesThatAre (Class nodeType) throws ParserException { NodeList ret; --- 656,669 ---- /** * Convenience method to extract all nodes of a given class type. ! * Equivalent to <code>extractAllNodesThatMatch (new NodeClassFilter (nodeType))</code>. ! * @param nodeType The class of the nodes to collect. ! * @throws ParserException If a parse error occurs. ! * @return A list of nodes which have the class specified. ! * @deprecated Use extractAllNodesThatMatch (new NodeClassFilter (nodeType)). ! * @see #extractAllNodesThatAre */ ! public Node [] extractAllNodesThatAre (Class nodeType) ! throws ! ParserException { NodeList ret; *************** *** 575,602 **** /** * Called just prior to calling connect. ! * The connection has been conditioned with proxy, URL user/password, ! * and cookie information. It is still possible to adjust the ! * connection to alter the request method for example. * @param connection The connection which is about to be connected. ! * @exception This exception is thrown if the connection monitor ! * wants the ConnectionManager to bail out. */ public void preConnect (HttpURLConnection connection) ! throws ! ParserException ! { if (null != getFeedback ()) getFeedback ().info (ConnectionManager.getRequestHeader (connection)); ! } ! /** Called just after calling connect. ! * The response code and header fields can be examined. * @param connection The connection that was just connected. ! 
* @exception This exception is thrown if the connection monitor ! * wants the ConnectionManager to bail out. */ public void postConnect (HttpURLConnection connection) ! throws ! ParserException { if (null != getFeedback ()) --- 680,708 ---- /** * Called just prior to calling connect. ! * Part of the ConnectionMonitor interface, this implementation just ! * sends the request header to the feedback object if any. * @param connection The connection which is about to be connected. ! * @throws ParserException <em>Not used</em> ! * @see ConnectionMonitor#preConnect */ public void preConnect (HttpURLConnection connection) ! throws ! ParserException ! { if (null != getFeedback ()) getFeedback ().info (ConnectionManager.getRequestHeader (connection)); ! } ! /** ! * Called just after calling connect. ! * Part of the ConnectionMonitor interface, this implementation just ! * sends the response header to the feedback object if any. * @param connection The connection that was just connected. ! * @throws ParserException <em>Not used.</em> ! * @see ConnectionMonitor#postConnect */ public void postConnect (HttpURLConnection connection) ! throws ! ParserException { if (null != getFeedback ()) *************** *** 606,609 **** --- 712,717 ---- /** * The main program, which can be executed from the command line + * @param args A URL or file name to parse, and an optional tag name to be + * used as a filter. */ public static void main (String [] args) *************** *** 630,651 **** } else ! try ! { ! parser = new Parser (); ! if (1 < args.length) ! filter = new TagNameFilter (args[1]); ! else ! { // for a simple dump, use more verbose settings ! filter = null; ! parser.setFeedback (Parser.stdout); ! getConnectionManager ().setMonitor (parser); ! } ! parser.setURL (args[0]); ! System.out.println (parser.parse (filter)); ! } ! catch (ParserException e) ! { ! e.printStackTrace (); ! } } } --- 738,759 ---- } else ! try ! { ! parser = new Parser (); ! if (1 < args.length) ! 
filter = new TagNameFilter (args[1]); ! else ! { // for a simple dump, use more verbose settings ! filter = null; ! parser.setFeedback (Parser.stdout); ! getConnectionManager ().setMonitor (parser); ! } ! parser.setURL (args[0]); ! System.out.println (parser.parse (filter)); ! } ! catch (ParserException e) ! { ! e.printStackTrace (); ! } } } Index: NodeFilter.java =================================================================== RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/NodeFilter.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** NodeFilter.java 13 Feb 2005 20:36:01 -0000 1.2 --- NodeFilter.java 5 Apr 2005 00:48:10 -0000 1.3 *************** *** 44,47 **** --- 44,48 ---- * @return <code>true</code> if the node is to be kept, <code>false</code> * if it is to be discarded. + * @param node The node to test.  */ boolean accept (Node node);
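Illustrative note: the new Parser javadoc in this commit makes two points that are easy to miss in diff form: whitespace between tags shows up as child nodes (the "5, not 2" remark), and node trees are normally walked recursively. The sketch below models both with small stand-in classes. Node, Text, Tag, sample(), childTags() and tagNames() are hypothetical names invented for this illustration; they are not the org.htmlparser classes and nothing here calls the library. It is a minimal model of the behaviour the javadoc describes, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class NodeWalkSketch {
    // Hypothetical stand-ins for illustration only; NOT org.htmlparser types.
    static abstract class Node {}

    static final class Text extends Node {
        final String text;
        Text(String text) { this.text = text; }
    }

    static final class Tag extends Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Tag(String name, Node... kids) {
            this.name = name;
            for (Node kid : kids) children.add(kid);
        }
    }

    // Models the example page from the javadoc: newlines after <html>,
    // </head> and </body> become text children of the html node.
    static Tag sample() {
        return new Tag("html",
            new Text("\n"),
            new Tag("head",
                new Text("\n"),
                new Tag("title", new Text("Mondays -- What a bad idea.")),
                new Text("\n")),
            new Text("\n"),
            new Tag("body", new Text("\nMost people have a pathological hatred of Mondays...\n")),
            new Text("\n"));
    }

    // Keep only the direct children that are tags, skipping text nodes.
    static List<Tag> childTags(Tag parent) {
        List<Tag> tags = new ArrayList<>();
        for (Node child : parent.children)
            if (child instanceof Tag) tags.add((Tag) child);
        return tags;
    }

    // Depth-first recursive walk, in the spirit of processMyNodes in the javadoc.
    static List<String> tagNames(Node node) {
        List<String> names = new ArrayList<>();
        collect(node, names);
        return names;
    }

    private static void collect(Node node, List<String> out) {
        if (node instanceof Tag) {
            Tag tag = (Tag) node;
            out.add(tag.name);
            for (Node child : tag.children) collect(child, out);
        }
    }

    public static void main(String[] args) {
        Tag html = sample();
        System.out.println(html.children.size());   // prints 5
        System.out.println(childTags(html).size()); // prints 2
        System.out.println(tagNames(html));         // prints [html, head, title, body]
    }
}
```

With the real library the equivalent filtering would be done through the filters or visitors packages mentioned in the javadoc; the stand-in childTags() above only shows why a raw child count is larger than the number of tags.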