Re: [Htmlparser-user] EncodingChangeException: character mismatch

Marcin <bigger <at> op.pl> writes:

> 
> Dear Derrick,
> 
> > >I get the following error:
> > >
> > >org.htmlparser.util.EncodingChangeException: character mismatch
> > >(new: ? != old: ¬) for encoding change from ISO-8859-2 to ISO-8859-1
> > >at character offset 4162
> > >Output from LinkExtractor example.
> > >
> > >If I try-catch it I won't get any result. What can I do about it?
> 
> > The exception is thrown because some of the nodes already given out are
> > in error.  You can try a second time after discarding the information
> > you've gained so far, like StringBean does:
> 
> Thank you for the answer, but it's not a good solution :( Please try the
> LinkBean example with this code:
> 
> import java.net.URL;
> import org.htmlparser.beans.LinkBean;
> 
> public class LinkDemo
> {
>     public static void main (String[] args)
>     {
>         LinkBean lb = new LinkBean ();
>         lb.setURL ("http://www.puszta.pl");
>         URL[] urls = lb.getLinks ();
>         for (int i = 0; i < urls.length; i++)
>             System.out.println (urls[i]);
>     }
> }
> 
> Exception in thread "main" java.lang.NullPointerException
>         at LinkDemo.main(LinkDemo.java:11)
> 
> I can deal with that page with the low-level lexer, but there must be a way to
> extract links from pages with mixed-up encodings with NodeVisitor. Is there?
> 
> Greets,
> B
> 

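For readers of the archive, here is a minimal sketch of why the parser
complains. It uses only the JDK, not HTML Parser itself, and it assumes the byte
involved is 0xAC, which is the byte that "¬" corresponds to in ISO-8859-1. The
class and method names are made up for illustration. The same byte decodes to
different characters under the two charsets named in the error, so text already
decoded under the first charset has to be discarded and read again.

```java
import java.nio.charset.Charset;

public class EncodingMismatchDemo {
    /** Decodes the same raw bytes under the named charset. */
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Assumption: 0xAC stands in for the byte at the reported offset.
        byte[] raw = { (byte) 0xAC };

        String asLatin1 = decode(raw, "ISO-8859-1"); // U+00AC, NOT SIGN
        String asLatin2 = decode(raw, "ISO-8859-2"); // U+0179, capital Z with acute

        System.out.printf("ISO-8859-1: U+%04X%n", (int) asLatin1.charAt(0));
        System.out.printf("ISO-8859-2: U+%04X%n", (int) asLatin2.charAt(0));

        // The two results differ, so anything built from the first decoding
        // cannot be reused once the second charset is discovered.
        System.out.println("same character? " + asLatin1.equals(asLatin2));
    }
}
```
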
Hi,

This is regarding the java.lang.NullPointerException.

I am extracting URLs using LinkBean:

LinkBean lb = new LinkBean ();
lb.setURL ("http://www.puszta.pl");
URL[] urls = lb.getLinks ();

Instead of "http://www.puszta.pl" I take the input from a database, and I run
the code above repeatedly to extract the URLs of each website named there. It
runs well for around 1500 inputs, but beyond that it throws a
java.lang.NullPointerException.

I have been trying to fix this problem for the last week without success. I
would be grateful if you could provide a solution.

Thanks indeed,
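
The quoted trace points at line 11 of LinkDemo, the loop that reads urls.length,
which suggests getLinks () returned null for that page. Assuming the same thing
happens here for some of the database inputs, a defensive loop along these lines
keeps one bad site from stopping the whole run. This is only a sketch:
SafeLinkBatch, run and the fake extractor are hypothetical names, and the
extractor is a stand-in for the LinkBean calls, so the example needs nothing
beyond the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class SafeLinkBatch {
    /**
     * Applies the extractor to every site. A null result or a runtime
     * exception is recorded as a failure instead of aborting the loop.
     * Extracted links are appended to out; failed sites are returned.
     */
    static <T> List<String> run(List<String> sites,
                                Function<String, T[]> extractor,
                                List<T> out) {
        List<String> failed = new ArrayList<>();
        for (String site : sites) {
            try {
                T[] links = extractor.apply(site);
                if (links == null) {      // guard before touching links.length
                    failed.add(site);
                    continue;
                }
                for (T link : links) {
                    out.add(link);
                }
            } catch (RuntimeException e) {
                failed.add(site);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Hypothetical extractor: returns null for sites it cannot handle,
        // standing in for setURL () followed by getLinks ().
        Function<String, String[]> fake = site ->
            site.startsWith("bad") ? null : new String[] { site + "/a", site + "/b" };

        List<String> links = new ArrayList<>();
        List<String> failed = run(List.of("good.example", "bad.example"), fake, links);
        System.out.println("links:  " + links);
        System.out.println("failed: " + failed);
    }
}
```

Logging the failed sites also shows whether the problem is tied to particular
pages or to the number of iterations.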