RE: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example
From: Claude D. <CD...@ar...> - 2002-08-07 15:48:15
You are not only talented but very kind! Thanks.
BTW: I was giving some thought to the calls that take place in
HTMLEnumeration. As far as I could tell, many internal calls were made
twice, by virtue of the hasMoreNodes/nextHTMLNode pattern. An alternate
pattern is repeated calls to nextHTMLNode that stop when a null
response is returned. A single-call pattern like this is used by
BufferedReader.readLine (null at end of input) and, with a boolean
result, by the JDBC ResultSet.next method.
Based on the simple observation that calls to hasMoreNodes AND
nextHTMLNode run some of the same underlying code, it seems that the
speed of the parser could be positively influenced by reducing the
interface to a single call. Any thoughts?
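
To make the comparison concrete, here is a minimal Java sketch of the two
iteration styles. It does not use the real HTMLEnumeration code: NodeSource,
peek, nextNode and the lookups counter are hypothetical stand-ins, and the
sketch assumes that hasMoreNodes and the fetch call both repeat the same
underlying work with no caching between them.

```java
import java.util.Arrays;
import java.util.List;

public class SingleCallIteration {

    /** Hypothetical node source; counts how often the shared lookup runs. */
    static class NodeSource {
        private final List<String> nodes;
        private int position = 0;
        int lookups = 0;

        NodeSource(List<String> nodes) {
            this.nodes = nodes;
        }

        // Stand-in for the parsing work that both public calls depend on.
        private String peek() {
            lookups++;
            return position < nodes.size() ? nodes.get(position) : null;
        }

        // Two-call style: the check runs the lookup without consuming a node.
        boolean hasMoreNodes() {
            return peek() != null;
        }

        // Runs the lookup again and advances; returns null when exhausted.
        String nextNode() {
            String node = peek();
            if (node != null) {
                position++;
            }
            return node;
        }
    }

    /** Iterates with hasMoreNodes/nextNode and reports the lookup count. */
    static int countWithTwoCalls(List<String> nodes) {
        NodeSource source = new NodeSource(nodes);
        while (source.hasMoreNodes()) {
            source.nextNode();
        }
        return source.lookups;
    }

    /** Iterates with nextNode alone, stopping on null. */
    static int countWithSingleCall(List<String> nodes) {
        NodeSource source = new NodeSource(nodes);
        while (source.nextNode() != null) {
            // A real caller would process the node here.
        }
        return source.lookups;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("<TITLE>", "text", "</TITLE>");
        System.out.println("two calls:   " + countWithTwoCalls(nodes));   // 7
        System.out.println("single call: " + countWithSingleCall(nodes)); // 4
    }
}
```

For n nodes the two-call loop performs 2n + 1 lookups and the single-call
loop performs n + 1. Whether the real parser would see a similar saving
depends on how much work its two methods actually share.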
-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Tuesday, August 06, 2002 9:56 PM
To: htm...@li...
Cc: htm...@li...
Subject: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example
Hi Claude,
This has been handled as part of the earlier fix. All potential
infinite loops have been removed, and there will be no more hangs,
only HTMLParserExceptions from now on.
There will be a release having all these fixes this weekend.
Regards,
Somik
----- Original Message -----
From: Claude Duguay <mailto:CD...@ar...>
To: htm...@li...
Sent: Wednesday, August 07, 2002 3:35 AM
Subject: [Htmlparser-user] Another Ill-Formed Example
Here's some markup we found in another document that causes the
HTMLParser to hang.
"<TITLE>KRP VALIDATION<PROCESS/TITLE>"
So far, we've had 4 documents cause our process to come to a grinding
halt. I would much prefer a policy of exception throwing to hangs asap,
followed by consideration of whether unusual markup can be handled more
elegantly in a subsequent phase. Thanks to everyone, as always.