Hi Developers,
Some big changes.
[1] Performance fix in HTMLStringNode. The next release of the
parser will be twice as fast as ver 1.1. Up to the previous release,
1.2 was actually 20% slower than 1.1 - thanks to Claude Duguay for
pointing this out. I was able to fix this after some profiling with
JProbe - it turns out toString() is very expensive and was causing a
big performance hit.
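
To illustrate the kind of cost involved, here is a minimal sketch. It is not the actual HTMLStringNode code, which is not shown in this message. The class ToStringCostSketch and both methods are hypothetical, and I am assuming the hit came from calling toString() on a growing buffer inside a loop. Each such call copies the whole buffer, so the slow variant does quadratic work while the fast one stays linear.

```java
class ToStringCostSketch {

    // Slow variant (assumed pattern): converts the whole buffer to a String
    // on every character just to look at the last one.
    static int countOpenersSlow(String html) {
        StringBuilder text = new StringBuilder();
        int count = 0;
        for (int i = 0; i < html.length(); i++) {
            text.append(html.charAt(i));
            // toString() copies everything accumulated so far.
            if (text.toString().endsWith("<")) {
                count++;
            }
        }
        return count;
    }

    // Fast variant: inspects the character directly, no per-character copy.
    static int countOpenersFast(String html) {
        StringBuilder text = new StringBuilder();
        int count = 0;
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            text.append(c);
            if (c == '<') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Build a reasonably large input so the difference is measurable.
        StringBuilder input = new StringBuilder();
        for (int i = 0; i < 5000; i++) {
            input.append("<p>hello</p>");
        }
        String html = input.toString();

        long t0 = System.nanoTime();
        int slow = countOpenersSlow(html);
        long t1 = System.nanoTime();
        int fast = countOpenersFast(html);
        long t2 = System.nanoTime();

        System.out.println("slow: " + slow + " openers, " + (t1 - t0) / 1000000 + " ms");
        System.out.println("fast: " + fast + " openers, " + (t2 - t1) / 1000000 + " ms");
    }
}
```

Both methods return the same count; only the amount of copying differs. Actual timings depend on the JVM and the input size.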
[2] Bug in HTMLScriptScanner - if the HTML inside the script code was
bad, the parser would crash. The script scanner is not supposed to
care about the JavaScript code inside it. This has been fixed by
removing all other scanners during the script scan, and putting them
back in after the parsing is done.
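
The remove-and-restore idea in [2] can be sketched roughly as below. This is not the parser's real code or API: MiniParser, TagScanner, detachScanners() and restoreScanners() are hypothetical names made up for illustration, and the script body is simply returned as opaque text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ScannerSwapSketch {

    /** Hypothetical stand-in for a tag scanner; not the library's interface. */
    interface TagScanner { }

    /** Hypothetical stand-in for the parser's scanner registry. */
    static class MiniParser {
        private Map<String, TagScanner> scanners = new LinkedHashMap<>();

        void register(String tagName, TagScanner scanner) {
            scanners.put(tagName, scanner);
        }

        int scannerCount() {
            return scanners.size();
        }

        // Hands back the current scanners and leaves the parser with none.
        Map<String, TagScanner> detachScanners() {
            Map<String, TagScanner> saved = scanners;
            scanners = new LinkedHashMap<>();
            return saved;
        }

        void restoreScanners(Map<String, TagScanner> saved) {
            scanners = saved;
        }
    }

    // Treats the script body as plain text: while it is being read, no
    // scanner is registered, so markup inside the script is not interpreted.
    static String scanScriptBody(MiniParser parser, String scriptBody) {
        Map<String, TagScanner> saved = parser.detachScanners();
        try {
            return scriptBody;
        } finally {
            // Put the scanners back even if reading the script fails.
            parser.restoreScanners(saved);
        }
    }

    public static void main(String[] args) {
        MiniParser parser = new MiniParser();
        parser.register("FORM", new TagScanner() { });
        String body = scanScriptBody(parser, "document.write('<b>broken');");
        System.out.println(body);
        System.out.println("scanners after scan: " + parser.scannerCount());
    }
}
```

The try/finally is my own addition in this sketch, so that the scanners come back even when the script scan throws.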
[3] Bug in HTMLFormScanner - if the form code is broken (no
</form>), there is no way to tell where the form ends. I thought I
could look for </table>, but that won't work if you have nested
tables. So, for the moment, HTMLFormScanner is no longer registered in
the standard set of scanners - till I can find an elegant fix for
this.
I'd be grateful if anyone has suggestions for [3]. Watch out for the
release this week.
Regards,
Somik