RE: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example
From: Claude D. <CD...@ar...> - 2002-08-07 15:48:15
You are not only talented but very kind! Thanks.

BTW: I was giving some thought to the calls that take place in HTMLEnumeration. As far as I could tell, many internal calls are made twice, by virtue of the hasMoreNodes/nextHTMLNode pattern. An alternate pattern is to call nextHTMLNode repeatedly and stop when it returns null. Single-call patterns like this are used by the BufferedReader.readLine method (which returns null at end of input), by the JDBC ResultSet.next method (which returns false when no rows remain), etc.

Based on the simple observation that calls to hasMoreNodes AND nextHTMLNode run some of the same underlying code, it seems that the speed of the parser could be improved by reducing the interface to a single call. Any thoughts?

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Tuesday, August 06, 2002 9:56 PM
To: htm...@li...
Cc: htm...@li...
Subject: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example

Hi Claude,

This has been handled, related to the earlier fix. All potential infinite loops have been removed, and there will be no more hangs - only HTMLParserExceptions from now on. There will be a release with all these fixes this weekend.

Regards,
Somik

----- Original Message -----
From: Claude Duguay <mailto:CD...@ar...>
To: htm...@li...
Sent: Wednesday, August 07, 2002 3:35 AM
Subject: [Htmlparser-user] Another Ill-Formed Example

Here's some markup we found in another document that causes the HTMLParser to hang.

"<TITLE>KRP VALIDATION<PROCESS/TITLE>"

So far, we've had 4 documents cause our process to come to a grinding halt. I would much prefer a policy of throwing exceptions rather than hanging, asap, followed by consideration of whether unusual markup can be handled more elegantly in a subsequent phase.

Thanks to everyone, as always.
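
To make the hasMoreNodes/nextHTMLNode point at the top of this message concrete, here is a minimal sketch in Java. It is a toy model, not the library's code: CountingEnumeration, scanAt, and the scans counter are hypothetical names, and the assumption that both calls repeat the same underlying scan is taken from the observation above rather than verified against the parser source.

```java
import java.util.Arrays;
import java.util.List;

public class NodeIterationSketch {

    /** Toy enumeration (hypothetical, not the library's class) that counts scans. */
    static class CountingEnumeration {
        private final List<String> nodes;
        private int pos = 0;
        int scans = 0;

        CountingEnumeration(List<String> nodes) {
            this.nodes = nodes;
        }

        // Stand-in for the shared underlying work both calls are assumed to do.
        private String scanAt(int index) {
            scans++;
            return index < nodes.size() ? nodes.get(index) : null;
        }

        // Looks ahead without consuming anything.
        boolean hasMoreNodes() {
            return scanAt(pos) != null;
        }

        // Scans again, then consumes the node; returns null when exhausted.
        String nextHTMLNode() {
            String node = scanAt(pos);
            if (node != null) {
                pos++;
            }
            return node;
        }
    }

    /** Two-call pattern: hasMoreNodes followed by nextHTMLNode. */
    static int scansWithTwoCalls(List<String> nodes) {
        CountingEnumeration e = new CountingEnumeration(nodes);
        while (e.hasMoreNodes()) {
            e.nextHTMLNode();
        }
        return e.scans;
    }

    /** Single-call pattern: loop until nextHTMLNode returns null. */
    static int scansWithSingleCall(List<String> nodes) {
        CountingEnumeration e = new CountingEnumeration(nodes);
        String node;
        while ((node = e.nextHTMLNode()) != null) {
            // A real caller would process the node here.
        }
        return e.scans;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("<TITLE>", "text", "</TITLE>");
        System.out.println("two-call scans:    " + scansWithTwoCalls(nodes));
        System.out.println("single-call scans: " + scansWithSingleCall(nodes));
    }
}
```

With three nodes, this toy model performs 7 scans in the two-call loop and 4 in the single-call loop. An implementation that caches the node found by hasMoreNodes and hands it back from nextHTMLNode would avoid the duplicate work without changing the interface, so the saving depends on how the enumeration is actually written.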