Hi Mats,
HTMLParser.elements() returns an Enumeration, so you can step through
the list of nodes one at a time. This is the Iterator design pattern.
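As a small illustration of that loop shape, here is a self-contained sketch that uses java.util.Vector from the standard library instead of the parser itself. The names EnumerationDemo and drain are made up for this example and are not part of HTMLParser.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Vector;

class EnumerationDemo {
    // Hypothetical helper: walk any Enumeration and copy its items into a List.
    static List<Object> drain(Enumeration<?> e) {
        List<Object> items = new ArrayList<>();
        while (e.hasMoreElements()) {
            items.add(e.nextElement());
        }
        return items;
    }

    public static void main(String[] args) {
        Vector<String> words = new Vector<>();
        words.add("alpha");
        words.add("beta");
        // Vector.elements() returns an Enumeration over the vector's contents.
        System.out.println(drain(words.elements()));
    }
}
```

The same hasMoreElements()/nextElement() pair works for any Enumeration, which is why the calling code does not need to know how the nodes are stored.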
HTMLNode is the interface that represents just about any kind of HTML
element. The element might be a string node, a remark node, a tag, or an
end tag. If it is a tag, there are several types of tags, and those form
another hierarchy.
All this is explained with class diagrams at:
http://htmlparser.sourceforge.net/design/index.html
http://htmlparser.sourceforge.net/design/tags.html (this one shows HTMLNode
in particular).
Using the parser is quite simple. From the user's perspective, you only
need a loop:
// parser is an already-constructed HTMLParser; Enumeration is java.util.Enumeration.
HTMLNode node;
for (Enumeration e = parser.elements(); e.hasMoreElements();) {
    node = (HTMLNode) e.nextElement();
    // node is an instance of some class that implements HTMLNode, so you
    // can use instanceof if you are interested in particular types, or
    // reflection to find out information about the object itself. Most
    // folks use the former.
    // Suppose you only want to print strings: act only when the node is
    // an HTMLStringNode.
    if (node instanceof HTMLStringNode) {
        // Safe to downcast to HTMLStringNode now
        HTMLStringNode stringNode = (HTMLStringNode) node;
        // Print the contents of the string node
        System.out.println(stringNode.getText());
    }
}
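If you want to try the instanceof and reflection approaches without the parser on your classpath, the sketch below may help. Node, StringNode and TagNode are hypothetical stand-ins invented for this example, not the real HTMLParser classes, and the node list is built by hand rather than parsed from HTML.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Vector;

class NodeWalkDemo {
    // Hypothetical stand-ins for the library's node types (assumption, not the real API).
    interface Node {}

    static final class StringNode implements Node {
        private final String text;
        StringNode(String text) { this.text = text; }
        String getText() { return text; }
    }

    static final class TagNode implements Node {
        private final String name;
        TagNode(String name) { this.name = name; }
        String getName() { return name; }
    }

    // instanceof approach: keep only the text of StringNode items.
    static List<String> collectText(Enumeration<? extends Node> e) {
        List<String> out = new ArrayList<>();
        while (e.hasMoreElements()) {
            Node node = e.nextElement();
            if (node instanceof StringNode) {
                out.add(((StringNode) node).getText());
            }
        }
        return out;
    }

    // Reflection approach: ask each node for its runtime class name.
    static List<String> describe(Enumeration<? extends Node> e) {
        List<String> out = new ArrayList<>();
        while (e.hasMoreElements()) {
            out.add(e.nextElement().getClass().getSimpleName());
        }
        return out;
    }

    // Hand-built node list standing in for what a parser would produce.
    static Vector<Node> sample() {
        Vector<Node> nodes = new Vector<>();
        nodes.add(new TagNode("p"));
        nodes.add(new StringNode("Hello"));
        nodes.add(new TagNode("b"));
        nodes.add(new StringNode("world"));
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(collectText(sample().elements()));
        System.out.println(describe(sample().elements()));
    }
}
```

The instanceof version reads more directly when you care about a few known types. The getClass() version is mostly useful for exploring what the enumeration actually contains.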
HTH. Please feel free to ask any questions you have.
Regards,
Somik
----- Original Message -----
From: "Sodergren, M.G." <mg...@le...>
To: <htm...@li...>
Sent: Tuesday, April 16, 2002 6:26 PM
Subject: [Htmlparser-user] the parser
I have a problem with the following:
node = (HTMLNode)e.nextElement();
Please tell me what is the content and return type of
HTMLParser.elements(), and how the HTMLNode is defined.
_______________________________________________
Htmlparser-user mailing list
Htm...@li...
https://lists.sourceforge.net/lists/listinfo/htmlparser-user