RE: [Htmlparser-user] Hi All
From: Claude D. <CD...@ar...> - 2002-09-30 15:43:59
You might consider having constructor variants that use the DefaultHTMLParserFeedback (by default ;-) so that users don't get confused.

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Monday, September 30, 2002 3:17 AM
To: htm...@li...
Subject: Re: [Htmlparser-user] Hi All

Hi Drew,

> I have a doubt... I am trying to extract only the text from the
> HTML pages, but I just could not get it. I have seen
> HTMLStringFilter.java, but I could not add it to the existing
> ones and run it, because there the HTML parser has only one argument
> passed, whereas the others also take the feedback. And if it
> should work, what feedback do we give? I mean a (T or i or s or l).
> And I guess the jar file does not have code for extracting the
> text.

Sorry about that - the web page hasn't been updated for a while. You will need to create a feedback object. If you don't need feedback from the parser, use the default one that we've provided in the com.kizna.html.util package. Try this:

    // Sample code to parse Yahoo.com and print only the text information.
    // This scan will run faster, as there are no scanners registered here.
    HTMLParser parser = new HTMLParser("http://www.yahoo.com", new DefaultHTMLParserFeedback());
    // In this example, none of the scanners need to be registered,
    // as a string node is not a tag to be scanned for.
    for (Enumeration e = parser.elements(); e.hasMoreElements();) {
        HTMLNode node = (HTMLNode) e.nextElement();
        if (node instanceof HTMLStringNode) {
            HTMLStringNode stringNode = (HTMLStringNode) node;
            System.out.println(stringNode.getText());
        }
    }

Let us know if you still face problems.

Regards,
Somik
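
For what it's worth, the constructor-variant idea suggested at the top of this message could look roughly like the sketch below. It is not the library's code: SketchParser, ParserFeedback, DefaultParserFeedback and RecordingFeedback are hypothetical stand-ins made up for illustration, and the single info() callback is an assumption about what a feedback object does. The only point is that the one-argument constructor delegates to the two-argument one with a quiet default, so callers who don't care about feedback never have to build one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parser feedback interface (not the real API).
interface ParserFeedback {
    void info(String message);
}

// Hypothetical default feedback: accepts messages and stays quiet.
class DefaultParserFeedback implements ParserFeedback {
    @Override
    public void info(String message) {
        // Intentionally does nothing.
    }
}

// Hypothetical feedback that records messages, used to show the explicit variant.
class RecordingFeedback implements ParserFeedback {
    final List<String> messages = new ArrayList<>();

    @Override
    public void info(String message) {
        messages.add(message);
    }
}

// Hypothetical parser with two constructor variants.
class SketchParser {
    private final String resource;
    private final ParserFeedback feedback;

    // Convenience variant: falls back to the quiet default feedback.
    SketchParser(String resource) {
        this(resource, new DefaultParserFeedback());
    }

    // Full variant: the caller supplies the feedback object.
    SketchParser(String resource, ParserFeedback feedback) {
        if (feedback == null) {
            throw new IllegalArgumentException("feedback must not be null");
        }
        this.resource = resource;
        this.feedback = feedback;
    }

    ParserFeedback getFeedback() {
        return feedback;
    }

    // Reports what it would open; a real parser would connect here.
    void open() {
        feedback.info("Opening " + resource);
    }
}

class ConstructorVariantDemo {
    public static void main(String[] args) {
        // One-argument form: no feedback object needed, nothing is reported.
        SketchParser quiet = new SketchParser("http://www.yahoo.com");
        quiet.open();

        // Two-argument form: the caller decides where messages go.
        RecordingFeedback recorder = new RecordingFeedback();
        new SketchParser("http://www.yahoo.com", recorder).open();
        System.out.println(recorder.messages);
    }
}
```

Delegating with this(...) keeps the field setup in one place, so the two variants cannot drift apart, and existing callers that pass their own feedback are unaffected.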