Re: Fw: [Htmlparser-user] Bad formed web page

Hi Cedric,
Thanks for this fix. But when I download the CVS version of HTMLParser and
try to parse the page again I get this error:
"java.lang.OutOfMemoryError
         <<no stack trace available>>
Exception in thread "main" "

Is it normal? Should I catch this error and write my own code around it?

It's highly abnormal. It should not happen. Are you trying it with the
same piece of HTML? Send me the data you are trying it on. If it's the same
page, it works perfectly on my end. I am running HTMLParser (main) with
no params except the file name.

Another question: I can't run the software with two options. Is that normal?
Why don't you set the options before the name of the file to parse?

Yes, this is normal (a feature, not a bug). This is because the options are
intended only as a demo, and I didn't think they'd really be of use to
people. Are you actually using it this way? Also, I am not working on this
full time, so I'd be grateful if you could join as a developer and make
this fix.
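
As a starting point for that fix, here is a minimal sketch of argument
handling that accepts any number of options in any position relative to the
file name. It assumes options are prefixed with "-"; the ArgsSketch class and
the flag names in the usage line are hypothetical and are not HTMLParser's
actual command line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HTMLParser.
class ArgsSketch {
    final List<String> options = new ArrayList<>();
    String fileName;

    // Collects every "-x" style argument as an option and treats the single
    // remaining argument as the file to parse, regardless of order.
    static ArgsSketch parse(String[] args) {
        ArgsSketch parsed = new ArgsSketch();
        for (String arg : args) {
            if (arg.startsWith("-") && arg.length() > 1) {
                parsed.options.add(arg.substring(1)); // strip the leading "-"
            } else if (parsed.fileName == null) {
                parsed.fileName = arg;
            } else {
                throw new IllegalArgumentException("Unexpected extra argument: " + arg);
            }
        }
        if (parsed.fileName == null) {
            throw new IllegalArgumentException("Missing file name");
        }
        return parsed;
    }
}
```

With this, ArgsSketch.parse(new String[] {"-a", "-b", "page.html"}) yields
the options "a" and "b" and the file name "page.html".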

All code received from developers is acknowledged both in the code and on
the Contributors page that goes out with each release. You can send me
your SourceForge ID and I can add you as a developer.

It can be used like this:
public HTMLStringNode(String text,int textBegin,int textEnd)
{
   NormalizeHtmlCode normalizer = new NormalizeHtmlCode();
   this.text = normalizer.html2text(text);
   this.textBegin = textBegin;
   this.textEnd = textEnd;
}
You can implement it with the meta-tags, ...
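
The source of NormalizeHtmlCode is not shown in this thread, so the following
is only a rough sketch of what an html2text-style step might do, assuming it
decodes character references into plain text. The HtmlTextSketch class and
its small entity table are hypothetical stand-ins, not the class used above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the NormalizeHtmlCode class.
class HtmlTextSketch {
    private static final Map<String, String> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("amp", "&");
        ENTITIES.put("lt", "<");
        ENTITIES.put("gt", ">");
        ENTITIES.put("quot", "\"");
        ENTITIES.put("nbsp", " "); // simplification: plain space, not U+00A0
    }

    // Replaces recognized character references; anything else is copied as-is.
    static String html2text(String html) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < html.length()) {
            char c = html.charAt(i);
            int semi = (c == '&') ? html.indexOf(';', i) : -1;
            String decoded = (semi > i) ? decode(html.substring(i + 1, semi)) : null;
            if (decoded != null) {
                out.append(decoded);
                i = semi + 1; // skip past the whole reference
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // Returns the decoded text, or null if the reference is not recognized.
    private static String decode(String name) {
        if (name.startsWith("#")) {
            try {
                int code = (name.startsWith("#x") || name.startsWith("#X"))
                        ? Integer.parseInt(name.substring(2), 16)
                        : Integer.parseInt(name.substring(1));
                return Character.isValidCodePoint(code)
                        ? new String(Character.toChars(code))
                        : null;
            } catch (NumberFormatException e) {
                return null;
            }
        }
        return ENTITIES.get(name);
    }
}
```

Unrecognized references are left untouched, so text such as "AT&T; ok" passes
through unchanged rather than being silently dropped.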

This is cool. I think it will be useful in the toPlainString() method,
where we can get the actual meaningful text out. I'd be glad to include
this as soon as I find some time. Or Tariq can also join as a developer
and I can give him CVS access to do it.

Thanks a lot for your participation.

Cheers,
Somik