Hi Cedric,
This was a very good bug report. It turned out to be a deep bug, but an
easy one to fix. HTMLParser auto-corrects tags when the inverted commas
(quotes) around attribute values are missing. However, this correction
can conflict with certain tags where the quotes are provided. So, to
give the parser some intelligence, there is a feature called
"strictness".
This allows you to tell the parser when to be strict and when not to
be. It makes sense in situations where you know that the HTML coder
would not make a mistake, because if he did, browsers like IE would
crash. Examples of such tags would be INPUT; for applets, if you are
providing complex params, they must be within inverted commas or the
browser gets confused. I have also added the META tag to this
strictness list.
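To illustrate the idea, here is a minimal sketch of how a per-tag
strictness list could gate the auto-correction. It is not the actual
HTMLParser code: the class name StrictnessSketch, the methods
findTagEnd and tagName, and the exact lenient rule (a '>' ends the tag
even inside an unclosed quote) are all assumptions made for this
example. Only the INPUT and META entries come from the list described
above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; not part of HTMLParser.
class StrictnessSketch {
    // Tags whose quotes are trusted (INPUT and META are named in this email).
    static final Set<String> STRICT_TAGS =
            new HashSet<>(Arrays.asList("INPUT", "META"));

    // Returns the index of the '>' that closes the tag opened at 'start',
    // or -1 if no closing '>' is found.
    static int findTagEnd(String html, int start) {
        boolean strict = STRICT_TAGS.contains(tagName(html, start));
        boolean inQuotes = false;
        for (int i = start + 1; i < html.length(); i++) {
            char c = html.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (c == '>') {
                // Strict: a '>' inside a quoted value is treated as data.
                // Lenient (assumed rule): a closing quote was probably
                // forgotten, so the tag ends at this '>'.
                if (!(strict && inQuotes)) {
                    return i;
                }
            }
        }
        return -1;
    }

    // Extracts the upper-cased tag name that follows '<' at 'start'.
    static String tagName(String html, int start) {
        int i = start + 1;
        while (i < html.length() && Character.isLetterOrDigit(html.charAt(i))) {
            i++;
        }
        return html.substring(start + 1, i).toUpperCase();
    }
}
```

With this sketch, a META tag whose quoted value contains a '>' is read
through to its real closing bracket, while a non-strict tag with a
missing closing quote still ends at the first '>'.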
Also, there was an issue in HTMLTag.java itself related to this report.
Thank you very much for this bug report. You can try the
StringExtractor on the URL you gave; the entire text comes out cleanly.
(Check out from CVS and build, or wait for the next release.)
Cheers,
Somik
----- Original Message -----
From: Cédric Rosa
To: htm...@li...
Sent: Tuesday, July 16, 2002 7:38 PM
Subject: [Htmlparser-user] Another bug
Hi,
When I parse this url: www.cybergeo.presse.fr\culture\weili\weili.htm
no text is found.
With my daily bug reports, you might think that I want to break your
software lol ... excuse me for testing with "space" url :)
Cedric.