You might consider having constructor variants that use the
DefaultHTMLParserFeedback (by default ;-) so that users don't get
confused.
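
A rough sketch of what I mean. The names here (ParserFeedback, DefaultParserFeedback, Parser, getFeedback) are hypothetical stand-ins made up for illustration, not the real htmlparser classes or signatures; the only point is that the one-argument constructor delegates to the two-argument one with a default feedback object.

```java
public class FeedbackDefaultSketch {

    /** Hypothetical stand-in for the parser's feedback interface. */
    interface ParserFeedback {
        void info(String message);
    }

    /** Hypothetical default feedback that silently discards messages. */
    static class DefaultParserFeedback implements ParserFeedback {
        public void info(String message) {
            // Intentionally quiet: callers who want output pass their own feedback.
        }
    }

    /** Hypothetical parser showing the constructor-overload idea only. */
    static class Parser {
        private final String url;
        private final ParserFeedback feedback;

        // Convenience variant: falls back to the default feedback.
        Parser(String url) {
            this(url, new DefaultParserFeedback());
        }

        // Full variant: the caller supplies the feedback object.
        Parser(String url, ParserFeedback feedback) {
            this.url = url;
            this.feedback = feedback;
        }

        String getUrl() {
            return url;
        }

        ParserFeedback getFeedback() {
            return feedback;
        }
    }

    public static void main(String[] args) {
        // No feedback argument needed for the common case.
        Parser parser = new Parser("http://www.yahoo.com");
        System.out.println(parser.getFeedback() instanceof DefaultParserFeedback);
    }
}
```

With something like this, existing callers that pass their own feedback keep working, and new users can start with just the URL.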
-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Monday, September 30, 2002 3:17 AM
To: htm...@li...
Subject: Re: [Htmlparser-user] Hi All
Hi Drew,
> I have a question. I am trying to extract only the text from the
> HTML pages, but I just could not get it to work. I have seen
> HTMLStringFilter.java, but I could not add it to the existing
> ones and run it, because there the HTML parser is passed only one
> argument, whereas the others also take the feedback. And if it
> should work, what feedback do we give? I mean a (T or i or s or l).
> And I guess the jar file does not have code for extracting the
> text.
Sorry about that - the web page hasn't been updated for a while. You will
need to create a feedback object. If you don't need feedback from the
parser, use the default one that we've provided in the
com.kizna.html.util package.
Try this:
* Below is some sample code to parse Yahoo.com and print only the text
* information. The scan runs faster, as no scanners are registered here.
HTMLParser parser = new HTMLParser("http://www.yahoo.com",
        new DefaultHTMLParserFeedback());
// In this example, none of the scanners need to be registered
// as a string node is not a tag to be scanned for.
for (Enumeration e = parser.elements(); e.hasMoreElements();) {
    HTMLNode node = (HTMLNode) e.nextElement();
    if (node instanceof HTMLStringNode) {
        HTMLStringNode stringNode = (HTMLStringNode) node;
        System.out.println(stringNode.getText());
    }
}
Let us know if you still face problems.
Regards,
Somik