[Htmlparser-cvs] htmlparser/src/org/htmlparser NodeFilter.java,1.2,1.3 Parser.java,1.103,1.104 package.html,1.21,1.22

Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv28518/htmlparser/src/org/htmlparser

Modified Files:
	NodeFilter.java Parser.java package.html 
Log Message:
Update javadocs.
Enable SiteCapturer to handle resource names containing spaces.

Index: package.html
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/package.html,v
retrieving revision 1.21
retrieving revision 1.22
diff -C2 -d -r1.21 -r1.22
*** package.html	14 Jun 2004 00:06:51 -0000	1.21
--- package.html	5 Apr 2005 00:48:12 -0000	1.22
***************
*** 33,40 ****
  the HTML Parser.
  <p>The {@link org.htmlparser.Parser} class is the main high level class that
! provides simplified access to the contents of an HTML page. The page can be
! specified as either a URLConnection or a String. In the case of a String, an
! attempt is made to open it as a URL, and if that fails it assumes it is a local
! disk file.
  A wide range of methods is available to customize the operation of the Parser,
  as well as access specific pieces of the page as
--- 33,37 ----
  the HTML Parser.
  <p>The {@link org.htmlparser.Parser} class is the main high level class that
! provides simplified access to the contents of an HTML page.
  A wide range of methods is available to customize the operation of the Parser,
  as well as access specific pieces of the page as
***************
*** 48,56 ****
  is the {@link org.htmlparser.PrototypicalNodeFactory} which
  operates by holding example nodes and cloning them as needed to satisfy the
! requests for nodes by the Parser. The Lexer is it's own NodeFactory, returning
! new {@link org.htmlparser.nodes.TextNode},
  {@link org.htmlparser.nodes.RemarkNode} and undifferentiated
  {@link org.htmlparser.nodes.TagNode Tagnodes} (see the
! {@link org.htmlparser.nodes nodes} package).</p>
  <p>The {@link org.htmlparser.NodeFilter} interface is used by the filtering
  code to determine if a node meets a certain criteria. Some generic examples of
--- 45,55 ----
  is the {@link org.htmlparser.PrototypicalNodeFactory} which
  operates by holding example nodes and cloning them as needed to satisfy the
! requests for nodes by the Parser. By default, a Lexer is its own NodeFactory,
! returning new {@link org.htmlparser.nodes.TextNode},
  {@link org.htmlparser.nodes.RemarkNode} and undifferentiated
  {@link org.htmlparser.nodes.TagNode Tagnodes} (see the
! {@link org.htmlparser.nodes nodes} package), but when the parser uses a lexer
! it replaces this behaviour with a PrototypicalNodeFactory to return a rich
! set of specific tags (see the {@link org.htmlparser.tags tags} package).</p>
  <p>The {@link org.htmlparser.NodeFilter} interface is used by the filtering
  code to determine if a node meets a certain criteria. Some generic examples of
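
The package documentation above describes a node factory that "operates by holding example nodes and cloning them as needed". A minimal sketch of that prototype idea follows. It assumes nothing about the real implementation: `Tag`, `TitleTag`, `Factory`, `register` and `createTag` are hypothetical stand-ins invented for illustration, not the org.htmlparser classes or their signatures.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the org.htmlparser classes.
class PrototypeFactorySketch {

    /** A minimal cloneable tag, standing in for an undifferentiated tag node. */
    static class Tag implements Cloneable {
        final String name;

        Tag(String name) {
            this.name = name;
        }

        @Override
        public Tag clone() {
            try {
                // Object.clone() preserves the runtime class of the prototype.
                return (Tag) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: Tag is Cloneable
            }
        }
    }

    /** A more specific tag type, analogous in spirit to the classes in the tags package. */
    static class TitleTag extends Tag {
        TitleTag() {
            super("TITLE");
        }
    }

    /** Holds one example node per tag name and hands out clones of it. */
    static class Factory {
        private final Map<String, Tag> prototypes = new HashMap<>();

        void register(Tag prototype) {
            prototypes.put(prototype.name.toUpperCase(Locale.ROOT), prototype);
        }

        Tag createTag(String name) {
            Tag prototype = prototypes.get(name.toUpperCase(Locale.ROOT));
            if (prototype == null)
                return new Tag(name); // nothing registered: fall back to a generic tag
            return prototype.clone();
        }
    }

    public static void main(String[] args) {
        Factory factory = new Factory();
        factory.register(new TitleTag());
        System.out.println(factory.createTag("title").getClass().getSimpleName());
        System.out.println(factory.createTag("blink").getClass().getSimpleName());
    }
}
```

In this sketch a registered name yields a fresh copy of the specific prototype, while an unregistered name falls back to the generic type, which mirrors the contrast the javadoc draws between specific tags and undifferentiated tag nodes.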

Index: Parser.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/Parser.java,v
retrieving revision 1.103
retrieving revision 1.104
diff -C2 -d -r1.103 -r1.104
*** Parser.java	13 Mar 2005 15:36:11 -0000	1.103
--- Parser.java	5 Apr 2005 00:48:10 -0000	1.104
***************
*** 46,59 ****

  /**
!  * This is the class that the user will use, either to get an iterator into
!  * the html page or to directly parse the page and print the results
!  * <BR>
!  * Typical usage of the parser is as follows : <BR>
!  * [1] Create a parser object - passing the URL and a feedback object to the parser<BR>
!  * [2] Enumerate through the elements from the parser object <BR>
!  * It is important to note that the parsing occurs when you enumerate, ON DEMAND.
!  * This is a thread-safe way, and you only get the control back after a
!  * particular element is parsed and returned, which could be the entire body.
!  * @see Parser#elements()
   */
  public class Parser
--- 46,108 ----

/**
! * The main parser class.
! * This is the primary class of the HTML Parser library. It provides
! * constructors that take a {@link #Parser(String) String},
! * a {@link #Parser(URLConnection) URLConnection}, or a
! * {@link #Parser(Lexer) Lexer}. In the case of a String, an
! * attempt is made to open it as a URL, and if that fails it assumes it is a
! * local disk file. If you want to actually parse a String, use
! * {@link #setInputHTML setInputHTML()} after using the
! * {@link #Parser() no-args} constructor, or use {@link #createParser}.
! * The Parser provides access to the contents of the
! * page, via a {@link #elements() NodeIterator}, a
! * {@link #parse(NodeFilter) NodeList} or a
! * {@link #visitAllNodesWith NodeVisitor}.
! * Typical usage of the parser is:
! * <code>
! * <pre>
! * Parser parser = new Parser ("http://whatever");
! * NodeList list = parser.parse ();
! * // do something with your list of nodes.
! * </pre>
! * </code>
! * What types of nodes and what can be done with them is dependent on the
! * setup, but in general a node can be converted back to HTML and its
! * children (enclosed nodes) and parent can be obtained, because nodes are
! * nested. See the {@link Node} interface.
! * For example, if the URL contains: 
! * <code>
! * {@.html
! * <html>
! * <head>
! * <title>Mondays -- What a bad idea.</title>
! * </head>
! * <body BGCOLOR="#FFFFFF">
! * Most people have a pathological hatred of Mondays...
! * </body>
! * </html>}
! * </code> 
! * and the example code above is used, the list contains only one element, the
! * {@.html <html>} node. This node is a {@link org.htmlparser.tags tag},
! * which is an object of class
! * {@link org.htmlparser.tags.Html Html} if the default {@link NodeFactory}
! * (a {@link PrototypicalNodeFactory}) is used.
! * To get at further content, the children of the top
! * level nodes must be examined. When digging through a node list one must be
! * conscious of the possibility of whitespace between nodes, e.g. in the example
! * above:
! * <code>
! * <pre>
! * Node node = list.elementAt (0);
! * NodeList sublist = node.getChildren ();
! * System.out.println (sublist.size ());
! * </pre>
! * </code>
! * would print out 5, not 2, because there are newlines after {@.html <html>},
! * {@.html </head>} and {@.html </body>} that are children of the HTML node
! * besides the {@.html <head>} and {@.html <body>} nodes.
! * Because processing nodes is so common, two interfaces are provided to
! * ease this task, {@link org.htmlparser.filters filters}
! * and {@link org.htmlparser.visitors visitors}.
 */
 public class Parser
***************
*** 66,70 ****

      /**
!      * The floating point version number.
       */
      public final static double
--- 115,119 ----

      /**
!      * The floating point version number ({@value}).
       */
      public final static double
***************
*** 73,77 ****

      /**
!      * The type of version.
       */
      public final static String
--- 122,126 ----

      /**
!      * The type of version ({@value}).
       */
      public final static String
***************
*** 80,84 ****

      /**
!      * The date of the version.
       */
      public final static String
--- 129,133 ----

      /**
!      * The date of the version ({@value}).
       */
      public final static String
***************
*** 87,91 ****

      /**
!      * The display version.
       */
      public final static String
--- 136,140 ----

      /**
!      * The display version ({@value}).
       */
      public final static String
***************
*** 186,191 ****
      /**
       * Zero argument constructor.
!      * The parser is in a safe but useless state.
!      * Set the lexer or connection using setLexer() or setConnection().
       * @see #setLexer(Lexer)
       * @see #setConnection(URLConnection)
--- 235,241 ----
      /**
       * Zero argument constructor.
!      * The parser is in a safe but useless state parsing an empty string.
!      * Set the lexer or connection using {@link #setLexer}
!      * or {@link #setConnection}.
       * @see #setLexer(Lexer)
       * @see #setConnection(URLConnection)
***************
*** 197,213 ****

      /**
!      * This constructor enables the construction of test cases, with readers
!      * associated with test string buffers. It can also be used with readers of the user's choice
!      * streaming data into the parser.<p/>
!      * <B>Important:</B> If you are using this constructor, and you would like to use the parser
!      * to parse multiple times (multiple calls to parser.elements()), you must ensure the following:<br>
!      * <ul>
!      * <li>Before the first parse, you must mark the reader for a length that you anticipate (the size of the stream).</li>
!      * <li>After the first parse, calls to elements() must be preceded by calls to :
!      * <pre>
!      * parser.getReader().reset();
!      * </pre>
!      * </li>
!      * </ul>
       * @param lexer The lexer to draw characters from.
       * @param fb The object to use when information,
--- 247,253 ----

      /**
!      * Construct a parser using the provided lexer and feedback object.
!      * This would be used to create a parser for special cases where the
!      * normal creation of a lexer on a URLConnection needs to be customized.
       * @param lexer The lexer to draw characters from.
       * @param fb The object to use when information,
***************
*** 226,232 ****
--- 266,276 ----
      /**
       * Constructor for custom HTTP access.
+      * This would be used to create a parser for a URLConnection that needs
+      * a special setup or negotiation conditioning beyond what is available
+      * from the {@link #getConnectionManager ConnectionManager}.
       * @param connection A fully conditioned connection. The connect()
       * method will be called so it need not be connected yet.
       * @param fb The object to use for message communication.
+      * @throws ParserException If the creation of the underlying Lexer cannot be performed.
       */
      public Parser (URLConnection connection, ParserFeedback fb)
***************
*** 240,243 ****
--- 284,288 ----
       * Creates a Parser object with the location of the resource (URL or file)
       * You would typically create a DefaultHTMLParserFeedback object and pass it in.
+      * @see #Parser(URLConnection,ParserFeedback)
       * @param resourceLocn Either the URL or the filename (autodetects).
       * A standard HTTP GET is performed to read the content of the URL.
***************
*** 245,249 ****
       * warning and error messages are produced. If <em>null</em> no feedback
       * is provided.
!      * @see #Parser(URLConnection,ParserFeedback)
       */
      public Parser (String resourceLocn, ParserFeedback feedback) throws ParserException
--- 290,294 ----
       * warning and error messages are produced. If <em>null</em> no feedback
       * is provided.
!      * @throws ParserException If the URL is invalid.
       */
      public Parser (String resourceLocn, ParserFeedback feedback) throws ParserException
***************
*** 256,259 ****
--- 301,305 ----
       * A DefaultHTMLParserFeedback object is used for feedback.
       * @param resourceLocn Either the URL or the filename (autodetects).
+      * @throws ParserException If the resourceLocn argument does not resolve to a valid page or file.
       */
      public Parser (String resourceLocn) throws ParserException
***************
*** 263,279 ****

      /**
!      * This constructor is present to enable users to plugin their own lexers.
!      * A DefaultHTMLParserFeedback object is used for feedback. It can also be used with readers of the user's choice
!      * streaming data into the parser.<p/>
!      * <B>Important:</B> If you are using this constructor, and you would like to use the parser
!      * to parse multiple times (multiple calls to parser.elements()), you must ensure the following:<br>
!      * <ul>
!      * <li>Before the first parse, you must mark the reader for a length that you anticipate (the size of the stream).</li>
!      * <li>After the first parse, calls to elements() must be preceded by calls to :
!      * <pre>
!      * parser.getReader().reset();
!      * </pre>
!      * </li>
!      * @param lexer The source for HTML to be parsed.
       */
      public Parser (Lexer lexer)
--- 309,317 ----

      /**
!      * Construct a parser using the provided lexer.
!      * A feedback object printing to {@link #stdout System.out} is used.
!      * This would be used to create a parser for special cases where the
!      * normal creation of a lexer on a URLConnection needs to be customized.
!      * @param lexer The lexer to draw characters from.
       */
      public Parser (Lexer lexer)
***************
*** 283,291 ****

      /**
!      * Constructor for non-standard access.
!      * A DefaultHTMLParserFeedback object is used for feedback.
       * @param connection A fully conditioned connection. The connect()
       * method will be called so it need not be connected yet.
!      * @see #Parser(URLConnection,ParserFeedback)
       */
      public Parser (URLConnection connection) throws ParserException
--- 321,333 ----

      /**
!      * Construct a parser using the provided URLConnection.
!      * This would be used to create a parser for a URLConnection that needs
!      * a special setup or negotiation conditioning beyond what is available
!      * from the {@link #getConnectionManager ConnectionManager}.
!      * A feedback object printing to {@link #stdout System.out} is used.
!      * @see #Parser(URLConnection,ParserFeedback)
       * @param connection A fully conditioned connection. The connect()
       * method will be called so it need not be connected yet.
!      * @throws ParserException If the creation of the underlying Lexer cannot be performed.
       */
      public Parser (URLConnection connection) throws ParserException
***************
*** 301,305 ****
       * Set the connection for this parser.
       * This method creates a new <code>Lexer</code> reading from the connection.
-      * Trying to set the connection to null is a noop.
       * @param connection A fully conditioned connection. The connect()
       * method will be called so it need not be connected yet.
--- 343,346 ----
***************
*** 313,318 ****
              ParserException
      {
!         if (null != connection)
!             setLexer (new Lexer (connection));
      }

--- 354,360 ----
              ParserException
      {
!         if (null == connection)
!             throw new IllegalArgumentException ("connection cannot be null");
!         setLexer (new Lexer (connection));
      }

***************
*** 320,324 ****
       * Return the current connection.
       * @return The connection either created by the parser or passed into this
!      * parser via <code>setConnection</code>.
       * @see #setConnection(URLConnection)
       */
--- 362,366 ----
       * Return the current connection.
       * @return The connection either created by the parser or passed into this
!      * parser via {@link #setConnection}.
       * @see #setConnection(URLConnection)
       */
***************
*** 331,336 ****
       * Set the URL for this parser.
       * This method creates a new Lexer reading from the given URL.
!      * Trying to set the url to null or an empty string is a noop.
!      * @see #setConnection(URLConnection)
       */
      public void setURL (String url)
--- 373,380 ----
       * Set the URL for this parser.
       * This method creates a new Lexer reading from the given URL.
!      * Trying to set the url to null or an empty string is a no-op.
!      * @param url The new URL for the parser.
!      * @throws ParserException If the url is invalid or creation of the
!      * underlying Lexer cannot be performed.
       */
      public void setURL (String url)
***************
*** 339,349 ****
      {
          if ((null != url) && !"".equals (url))
!             setConnection (Page.getConnectionManager ().openConnection (url));
      }

      /**
       * Return the current URL being parsed.
!      * @return The url passed into the constructor or the file name
!      * passed to the constructor modified to be a URL.
       */
      public String getURL ()
--- 383,395 ----
      {
          if ((null != url) && !"".equals (url))
!             setConnection (getConnectionManager ().openConnection (url));
      }

      /**
       * Return the current URL being parsed.
!      * @return The current url. This is the URL for the current page.
!      * A string passed into the constructor or set via setURL may be altered,
!      * for example, a file name may be modified to be a URL.
!      * @see Page#getUrl
       */
      public String getURL ()
***************
*** 355,358 ****
--- 401,408 ----
       * Set the encoding for the page this parser is reading from.
       * @param encoding The new character set to use.
+      * @throws ParserException If the encoding change causes characters that
+      * have already been consumed to differ from the characters that would
+      * have been seen had the new encoding been in force.
+      * @see org.htmlparser.util.EncodingChangeException
       */
      public void setEncoding (String encoding)
***************
*** 367,370 ****
--- 417,421 ----
       * This item is set from the HTTP header but may be overridden by meta
       * tags in the head, so this may change after the head has been parsed.
+      * @return The encoding currently in force.
       */
      public String getEncoding ()
***************
*** 375,383 ****
      /**
       * Set the lexer for this parser.
!      * The current NodeFactory is set on the given lexer, since the lexer
!      * contains the node factory object.
       * It does not adjust the <code>feedback</code> object.
!      * Trying to set the lexer to <code>null</code> is a noop.
       * @param lexer The lexer object to use.
       */
      public void setLexer (Lexer lexer)
--- 426,435 ----
      /**
       * Set the lexer for this parser.
!      * The current NodeFactory is transferred to (set on) the given lexer,
!      * since the lexer owns the node factory object.
       * It does not adjust the <code>feedback</code> object.
!      * Trying to set the lexer to <code>null</code> is a no-op.
       * @param lexer The lexer object to use.
+      * @see #setNodeFactory
       */
      public void setLexer (Lexer lexer)
***************
*** 405,409 ****

      /**
!      * Returns the reader associated with the parser
       * @return The current lexer.
       */
--- 457,461 ----

      /**
!      * Returns the lexer associated with the parser
       * @return The current lexer.
       */
***************
*** 415,419 ****
      /**
       * Get the current node factory.
!      * @return The parser's node factory.
       */
      public NodeFactory getNodeFactory ()
--- 467,471 ----
      /**
       * Get the current node factory.
!      * @return The current lexer's node factory.
       */
      public NodeFactory getNodeFactory ()
***************
*** 424,428 ****
      /**
       * Set the current node factory.
!      * @param factory The new node factory for the parser.
       */
      public void setNodeFactory (NodeFactory factory)
--- 476,480 ----
      /**
       * Set the current node factory.
!      * @param factory The new node factory for the current lexer.
       */
      public void setNodeFactory (NodeFactory factory)
***************
*** 435,439 ****
      /**
       * Sets the feedback object used in scanning.
!      * @param fb The new feedback object to use.
       */
      public void setFeedback (ParserFeedback fb)
--- 487,492 ----
      /**
       * Sets the feedback object used in scanning.
!      * @param fb The new feedback object to use. If this is null a
!      * {@link #noFeedback silent feedback object} is used.
       */
      public void setFeedback (ParserFeedback fb)
***************
*** 443,448 ****

      /**
!      * Returns the feedback.
!      * @return HTMLParserFeedback
       */
      public ParserFeedback getFeedback()
--- 496,501 ----

      /**
!      * Returns the current feedback object.
!      * @return The feedback object currently being used.
       */
      public ParserFeedback getFeedback()
***************
*** 457,460 ****
--- 510,515 ----
      /**
       * Reset the parser to start from the beginning again.
+      * This assumes support for a reset from the underlying
+      * {@link org.htmlparser.lexer.Source} object.
       */
      public void reset ()
***************
*** 464,488 ****

      /**
!      * Returns an iterator (enumeration) to the html nodes. Each node can be a tag/endtag/
!      * string/link/image<br>
!      * This is perhaps the most important method of this class. In typical situations, you will need to use
!      * the parser like this :
       * <pre>
!      * Parser parser = new Parser("http://www.yahoo.com");
!      * for (NodeIterator i = parser.elements();i.hasMoreElements();) {
!      *    Node node = i.nextHTMLNode();
!      *    if (node instanceof StringNode) {
!      *      // Downcasting to StringNode
!      *      StringNode stringNode = (StringNode)node;
!      *      // Do whatever processing you want with the string node
!      *      System.out.println(stringNode.getText());
!      *    }
!      *    // Check for the node or tag that you want
!      *    if (node instanceof ...) {
!      *      // Downcast, and process
!      *      // recursively (nodes within nodes)
!      *    }
       * }
       * </pre>
       */
      public NodeIterator elements () throws ParserException
--- 519,569 ----

      /**
!      * Returns an iterator (enumeration) over the html nodes.
!      * {@link org.htmlparser.nodes Nodes} can be of three main types:
!      * <ul>
!      * <li>{@link org.htmlparser.nodes.TagNode TagNode}</li>
!      * <li>{@link org.htmlparser.nodes.TextNode TextNode}</li>
!      * <li>{@link org.htmlparser.nodes.RemarkNode RemarkNode}</li>
!      * </ul>
!      * In general, when parsing with an iterator or processing a NodeList,
!      * you will need to use recursion. For example:
!      * <code>
       * <pre>
!      * void processMyNodes (Node node)
!      * {
!      *     if (node instanceof TextNode)
!      *     {
!      *         // downcast to TextNode
!      *         TextNode text = (TextNode)node;
!      *         // do whatever processing you want with the text
!      *         System.out.println (text.getText ());
!      *     }
!      *     else if (node instanceof RemarkNode)
!      *     {
!      *         // downcast to RemarkNode
!      *         RemarkNode remark = (RemarkNode)node;
!      *         // do whatever processing you want with the comment
!      *     }
!      *     else if (node instanceof TagNode)
!      *     {
!      *         // downcast to TagNode
!      *         TagNode tag = (TagNode)node;
!      *         // do whatever processing you want with the tag itself
!      *         // ...
!      *         // process recursively (nodes within nodes) via getChildren()
!      *         NodeList list = tag.getChildren ();
!      *         if (null != list)
!      *             for (NodeIterator i = list.elements (); i.hasMoreElements (); )
!      *                 processMyNodes (i.nextNode ());
!      *     }
       * }
+      * 
+      * Parser parser = new Parser ("http://www.yahoo.com");
+      * for (NodeIterator i = parser.elements (); i.hasMoreElements (); )
+      *     processMyNodes (i.nextNode ());
       * </pre>
+      * </code>
+      * @throws ParserException If a parsing error occurs.
+      * @return An iterator over the top level nodes (usually {@.html <html>}).
       */
      public NodeIterator elements () throws ParserException
***************
*** 493,499 ****
      /**
       * Parse the given resource, using the filter provided.
-      * @param filter The filter to apply to the parsed nodes.
       * @return The list of matching nodes (for a <code>null</code>
       * filter this is all the top level nodes).
       */
      public NodeList parse (NodeFilter filter) throws ParserException
--- 574,582 ----
      /**
       * Parse the given resource, using the filter provided.
       * @return The list of matching nodes (for a <code>null</code>
       * filter this is all the top level nodes).
+      * @param filter The filter to apply to the parsed nodes,
+      * or <code>null</code> to retrieve all the top level nodes.
+      * @throws ParserException If a parsing error occurs.
       */
      public NodeList parse (NodeFilter filter) throws ParserException
***************
*** 516,520 ****
      }

!     public void visitAllNodesWith(NodeVisitor visitor) throws ParserException {
          Node node;
          visitor.beginParsing();
--- 599,612 ----
      }

!     /**
!      * Apply the given visitor to the current page.
!      * The visitor is passed to the <code>accept()</code> method of each node
!      * in the page in a depth first traversal. The visitor
!      * <code>beginParsing()</code> method is called prior to processing the
!      * page and <code>finishedParsing()</code> is called after the processing.
!      * @param visitor The visitor to visit all nodes with.
!      * @throws ParserException If a parse error occurs while traversing the page with the visitor.
!      */
!     public void visitAllNodesWith (NodeVisitor visitor) throws ParserException {
          Node node;
          visitor.beginParsing();
***************
*** 529,532 ****
--- 621,625 ----
       * Initializes the parser with the given input HTML String.
       * @param inputHTML the input HTML that is to be parsed.
+      * @throws ParserException If an error occurs in setting up the underlying Lexer.
       */
      public void setInputHTML (String inputHTML)
***************
*** 543,546 ****
--- 636,644 ----
       * Extract all nodes matching the given filter.
       * @see Node#collectInto(NodeList, NodeFilter)
+      * @param filter The filter to be applied to the nodes.
+      * @throws ParserException If a parse error occurs.
+      * @return A list of nodes matching the filter criteria,
+      * i.e. for which the filter's accept method
+      * returned <code>true</code>.
       */
      public NodeList extractAllNodesThatMatch (NodeFilter filter) throws ParserException
***************
*** 558,564 ****
      /**
       * Convenience method to extract all nodes of a given class type.
!      * @see Node#collectInto(NodeList, NodeFilter)
       */
!     public Node [] extractAllNodesThatAre (Class nodeType) throws ParserException
      {
          NodeList ret;
--- 656,669 ----
      /**
       * Convenience method to extract all nodes of a given class type.
!      * Equivalent to <code>extractAllNodesThatMatch (new NodeClassFilter (nodeType))</code>.
!      * @param nodeType The class of the nodes to collect.
!      * @throws ParserException If a parse error occurs.
!      * @return An array of nodes which have the class specified.
!      * @deprecated Use extractAllNodesThatMatch (new NodeClassFilter (nodeType)).
!      * @see #extractAllNodesThatMatch
       */
!     public Node [] extractAllNodesThatAre (Class nodeType)
!         throws
!             ParserException
      {
          NodeList ret;
***************
*** 575,602 ****
      /**
       * Called just prior to calling connect.
!      * The connection has been conditioned with proxy, URL user/password,
!      * and cookie information. It is still possible to adjust the
!      * connection to alter the request method for example. 
       * @param connection The connection which is about to be connected.
!      * @exception This exception is thrown if the connection monitor
!      * wants the ConnectionManager to bail out.
       */
      public void preConnect (HttpURLConnection connection)
!     	throws
!     		ParserException
! 	{
          if (null != getFeedback ())
              getFeedback ().info (ConnectionManager.getRequestHeader (connection));
! 	}

!     /** Called just after calling connect.
!      * The response code and header fields can be examined.
       * @param connection The connection that was just connected.
!      * @exception This exception is thrown if the connection monitor
!      * wants the ConnectionManager to bail out.
       */
      public void postConnect (HttpURLConnection connection)
! 		throws
! 			ParserException
      {
          if (null != getFeedback ())
--- 680,708 ----
      /**
       * Called just prior to calling connect.
!      * Part of the ConnectionMonitor interface, this implementation just
!      * sends the request header to the feedback object if any.
       * @param connection The connection which is about to be connected.
!      * @throws ParserException <em>Not used</em>
!      * @see ConnectionMonitor#preConnect
       */
      public void preConnect (HttpURLConnection connection)
!         throws
!             ParserException
!     {
          if (null != getFeedback ())
              getFeedback ().info (ConnectionManager.getRequestHeader (connection));
!     }

!     /**
!      * Called just after calling connect.
!      * Part of the ConnectionMonitor interface, this implementation just
!      * sends the response header to the feedback object if any.
       * @param connection The connection that was just connected.
!      * @throws ParserException <em>Not used.</em>
!      * @see ConnectionMonitor#postConnect
       */
      public void postConnect (HttpURLConnection connection)
!         throws
!             ParserException
      {
          if (null != getFeedback ())
***************
*** 606,609 ****
--- 712,717 ----
      /**
       * The main program, which can be executed from the command line
+      * @param args A URL or file name to parse, and an optional tag name to be
+      * used as a filter.
       */
      public static void main (String [] args)
***************
*** 630,651 ****
          }
          else
! 	        try
! 	        {
! 	            parser = new Parser ();
! 	            if (1 < args.length)
! 	                filter = new TagNameFilter (args[1]);
! 	            else
! 	            {   // for a simple dump, use more verbose settings
! 	                filter = null;
! 	                parser.setFeedback (Parser.stdout);
! 	                getConnectionManager ().setMonitor (parser);
! 	            }
! 	            parser.setURL (args[0]);
! 	            System.out.println (parser.parse (filter));
! 	        }
! 	        catch (ParserException e)
! 	        {
! 	            e.printStackTrace ();
! 	        }
      }
  }
--- 738,759 ----
          }
          else
!             try
!             {
!                 parser = new Parser ();
!                 if (1 < args.length)
!                     filter = new TagNameFilter (args[1]);
!                 else
!                 {   // for a simple dump, use more verbose settings
!                     filter = null;
!                     parser.setFeedback (Parser.stdout);
!                     getConnectionManager ().setMonitor (parser);
!                 }
!                 parser.setURL (args[0]);
!                 System.out.println (parser.parse (filter));
!             }
!             catch (ParserException e)
!             {
!                 e.printStackTrace ();
!             }
      }
  }
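
The new class javadoc for Parser warns that the children of the top level node number 5 rather than 2, because the newlines between tags are themselves child nodes. The sketch below rebuilds that situation by hand to show one way of counting only the significant children. It is an assumption-laden illustration: `Node`, `tagNode`, `textNode`, `countSignificant` and `samplePage` are hypothetical names, and the tree is constructed manually instead of being produced by the real parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in node model; NOT the org.htmlparser Node API.
class WhitespaceChildrenSketch {

    static final class Node {
        final String tagName; // set for tag nodes, null for text nodes
        final String text;    // set for text nodes, null for tag nodes
        final List<Node> children = new ArrayList<>();

        private Node(String tagName, String text) {
            this.tagName = tagName;
            this.text = text;
        }

        static Node tagNode(String name, Node... kids) {
            Node node = new Node(name, null);
            node.children.addAll(Arrays.asList(kids));
            return node;
        }

        static Node textNode(String content) {
            return new Node(null, content);
        }

        /** True for text nodes that contain nothing but whitespace. */
        boolean isWhitespaceText() {
            return text != null && text.trim().isEmpty();
        }
    }

    /** Counts the direct children that are not whitespace-only text. */
    static int countSignificant(Node parent) {
        int count = 0;
        for (Node child : parent.children)
            if (!child.isWhitespaceText())
                count++;
        return count;
    }

    /** Hand-built tree shaped like the javadoc example page. */
    static Node samplePage() {
        return Node.tagNode("html",
            Node.textNode("\n"),
            Node.tagNode("head",
                Node.textNode("\n"),
                Node.tagNode("title", Node.textNode("Mondays -- What a bad idea.")),
                Node.textNode("\n")),
            Node.textNode("\n"),
            Node.tagNode("body",
                Node.textNode("\nMost people have a pathological hatred of Mondays...\n")),
            Node.textNode("\n"));
    }

    public static void main(String[] args) {
        Node html = samplePage();
        System.out.println(html.children.size());   // prints 5
        System.out.println(countSignificant(html)); // prints 2
    }
}
```

With the real library the same effect would presumably be obtained through a filter or visitor, as the javadoc suggests; the loop here only makes the counting explicit.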

Index: NodeFilter.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/NodeFilter.java,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** NodeFilter.java	13 Feb 2005 20:36:01 -0000	1.2
--- NodeFilter.java	5 Apr 2005 00:48:10 -0000	1.3
***************
*** 44,47 ****
--- 44,48 ----
       * @return <code>true</code> if the node is to be kept, <code>false</code>
       * if it is to be discarded.
+      * @param node The node to test.
       */
      boolean accept (Node node);
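
The `accept` contract documented above (return `true` to keep a node, `false` to discard it) lends itself to a recursive collection pass. The following is a minimal sketch of that idea under stated assumptions: `Node`, `Filter`, `collectInto`, `extract` and `samplePage` are hypothetical stand-ins written for this note, loosely modelled on the names that appear in the diff, and do not reproduce the library's actual signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; NOT the org.htmlparser NodeFilter, Node or NodeList types.
class NodeFilterSketch {

    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            children.addAll(Arrays.asList(kids));
        }
    }

    /** Single-method filter: true keeps the node, false discards it. */
    interface Filter {
        boolean accept(Node node);
    }

    /** Depth-first walk that appends every accepted node to the output list. */
    static void collectInto(Node node, List<Node> out, Filter filter) {
        if (filter.accept(node))
            out.add(node);
        for (Node child : node.children)
            collectInto(child, out, filter);
    }

    /** Convenience wrapper that returns the accepted nodes as a new list. */
    static List<Node> extract(Node root, Filter filter) {
        List<Node> out = new ArrayList<>();
        collectInto(root, out, filter);
        return out;
    }

    /** Small hand-built tree: html containing head (with title) and body. */
    static Node samplePage() {
        return new Node("html",
            new Node("head", new Node("title")),
            new Node("body"));
    }

    public static void main(String[] args) {
        List<Node> titles = extract(samplePage(), n -> n.name.equals("title"));
        System.out.println(titles.size()); // prints 1
    }
}
```

Because the filter is a single-method interface in this sketch, a lambda can stand in for a tag-name test; a filter that always returns `true` collects every node in the tree.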
