Somik Raha wrote:
> Hi Derrick, Everyone,
>
> I checked out the latest code, and I found the following problems:
> [1] testHTTPCharset and testHTMLCharset still fail. This time the
> failures are: Could not open http://www.ibm.co.jp (I tried changing
> it to www.ibm.com/jp <http://www.ibm.com/jp> but that didn't help).
I'm baffled; it runs through for me (all 294 tests, thank you). Is
anyone else having trouble running these tests?
>
> [2] HTMLLinkBeanInfo and HTMLTextBeanInfo don't compile. Are you
> relying on something in JDK 1.4? By the way, htmlparser is JDK 1.1
> compliant. I am not sure if that should change, but then again, it
> really depends on the users of the parser.
1.1 compatibility is news to me. This has probably been broken for a
while (see ArrayList in ChainedException and Iterator in Translate).
Version 1.1 is usually mandated by old browser JVM support, or legacy
(unsupported) operating systems. I don't think it's an issue here since
it's not running as an applet. The use of the Vector class (required
under JDK 1.1) is a bit of a performance hit since the class is
synchronized.
For now just delete those 'BeanInfo' files and it should compile OK, but
I'll see if I can fix it for you, and then a decision can be made about
continuing 1.1 support.
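To make the Vector point concrete, here is a minimal sketch, assuming the JDK 1.2+ collections API is available (it uses generics only so that it compiles on a current JDK). The class name ListChoice and the helper collectTags are made up for illustration and are not part of htmlparser. It only shows that code written against the List interface accepts either class; it does not measure anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

public class ListChoice {
    // Hypothetical helper: written against List, so it works with either class.
    static List<String> collectTags(List<String> target, String[] tags) {
        for (String tag : tags) {
            target.add(tag);
        }
        return target;
    }

    public static void main(String[] args) {
        String[] tags = {"A", "IMG", "FORM"};
        // Vector: add() is a synchronized method, so each call takes the lock
        // even when only one thread ever touches the list.
        List<String> legacy = collectTags(new Vector<String>(), tags);
        // ArrayList: same List contract, no locking.
        List<String> modern = collectTags(new ArrayList<String>(), tags);
        // List equality is element-wise, regardless of the implementation.
        System.out.println(legacy.equals(modern)); // prints true
    }
}
```

If 1.1 support is kept, only the Vector line is an option, since List and ArrayList arrived with JDK 1.2.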
>
> [3] I am wondering if org.htmlparser.beans should exist outside the
> main htmlparser module - in a module of its own within the htmlparser
> project. That way the parser workspace could be focussed on the
> parsing functionality. What do you think?
Hmm. There's a lot of overhead in adding another configuration item.
Any other developers have opinions on this?
>
> [4] I've fixed the bug that was being caught by testExtractLinkBug2 -
> based on Sam's suggestion, but I think more work needs to be done.
> Currently, HTMLStringNode has turned thread-unsafe due to its
> customizability. Since it is static, thread-safety goes out the
> window. The obvious refactoring would be to have non-static class(es)
> for the basic automata - which would have a 1:1 mapping with the
> parser instance.
>
> Regards,
> Somik
>
>
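On [4], the non-static idea could look roughly like the sketch below. This is only an illustration under assumptions: TextAutomaton and DemoParser are hypothetical stand-ins, not the real HTMLStringNode or parser classes, and the "customization" is reduced to a single delimiter character. The point is that the setting lives in an instance field owned by one parser, so two parsers with different settings cannot interfere with each other.

```java
import java.util.ArrayList;
import java.util.List;

public class PerParserState {

    /** Hypothetical automaton: its setting is an instance field, not a static. */
    static final class TextAutomaton {
        private final char delimiter;

        TextAutomaton(char delimiter) {
            this.delimiter = delimiter;
        }

        /** Splits the input on this automaton's own delimiter. */
        List<String> split(String input) {
            List<String> parts = new ArrayList<String>();
            StringBuilder current = new StringBuilder();
            for (int i = 0; i < input.length(); i++) {
                char c = input.charAt(i);
                if (c == delimiter) {
                    parts.add(current.toString());
                    current.setLength(0);
                } else {
                    current.append(c);
                }
            }
            parts.add(current.toString());
            return parts;
        }
    }

    /** Hypothetical parser: owns exactly one automaton (the 1:1 mapping). */
    static final class DemoParser {
        private final TextAutomaton automaton;

        DemoParser(char delimiter) {
            this.automaton = new TextAutomaton(delimiter);
        }

        List<String> parse(String input) {
            return automaton.split(input);
        }
    }

    public static void main(String[] args) {
        DemoParser commaParser = new DemoParser(',');
        DemoParser semicolonParser = new DemoParser(';');
        // Each parser sees only its own setting.
        System.out.println(commaParser.parse("x,y;z"));     // prints [x, y;z]
        System.out.println(semicolonParser.parse("x,y;z")); // prints [x,y, z]
    }
}
```

With a static setting, the second constructor call would have silently changed the behaviour of the first parser as well, which is the problem described in [4].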