[Htmlparser-user] Doubts about HTML Parsers.
From: Gaurav P. <gau...@gm...> - 2007-04-19 04:53:58
Hello Sir,
Thanks for your previous replies; they were of immense help to me. I have a
few more doubts regarding the use of HTML parsers, and I need your help with
them.
1) I have a doubt regarding org.htmlparser.util.EncodingChangeException.
The program throws this exception whenever it parses certain sites that use a
different character set, probably charset=UTF-8.
Is there some tool I can use to get rid of these exceptions in the program?
Also, where can I find details about the exceptions and where they can occur,
depending on how the parser is used?
2) I want to use the HTML parser to remove advertisements, including the
plain-text ones at the bottom of the page, such as:
(c) 2007 Rediff.com India Limited. All Rights Reserved.
*Disclaimer* <http://www.rediff.com/disclaim.htm> |
*Feedback* <http://support.rediff.com/>
Can I implement the parser in such a way that it gets rid of these tags, or
should I use some sort of HTML cleaner alongside the HTML parser in this
case?
Awaiting your reply.
Thanks in advance.
Gaurav Pranay.