Re: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example
From: Somik R. <so...@ya...> - 2002-08-10 08:17:21
Hi Claude,
You've again raised a good point. I will look into this for next week's release.
Regards
Somik
----- Original Message -----
From: Claude Duguay
To: htm...@li...
Sent: Friday, August 09, 2002 12:58 AM
Subject: RE: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example
Based on your description, there is a risk that calling hasMoreNodes several times in a row, without calling nextHTMLNode in between, will not have the desired API semantics. If the parsing takes place in the call to hasMoreNodes, then the parser moves forward regardless of whether nextHTMLNode was called. This suggests that either the method should be renamed to something more indicative of this behavior, or the behavior should be changed.
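One way the behavior could be changed is to buffer a single node, so that repeated calls to hasMoreNodes do not advance the parser. The following is only a minimal sketch under stated assumptions: LookaheadEnumeration is a hypothetical stand-in, not the parser's actual enumeration class, and a plain list of strings stands in for the underlying parse step. Nodes are assumed to be non-null.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for the parser's enumeration; not the library's actual class. */
public class LookaheadEnumeration {
    private final Iterator<String> source; // stands in for the underlying parse step
    private String pending;                // node parsed ahead but not yet handed out

    public LookaheadEnumeration(List<String> nodes) {
        this.source = nodes.iterator();
    }

    /** Parses at most one node ahead; repeated calls do not advance the parser. */
    public boolean hasMoreNodes() {
        if (pending == null && source.hasNext()) {
            pending = source.next();
        }
        return pending != null;
    }

    /** Hands out the buffered node and clears the buffer so the next check parses again. */
    public String nextHTMLNode() {
        if (!hasMoreNodes()) {
            throw new NoSuchElementException();
        }
        String node = pending;
        pending = null;
        return node;
    }

    public static void main(String[] args) {
        LookaheadEnumeration e =
                new LookaheadEnumeration(Arrays.asList("<html>", "text", "</html>"));
        e.hasMoreNodes();
        e.hasMoreNodes(); // second call finds the buffered node and parses nothing
        System.out.println(e.nextHTMLNode()); // prints <html>
    }
}
```

With this shape, hasMoreNodes keeps its name and its iterator-style meaning, at the cost of one buffered field.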
-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Thu 8/8/2002 12:07 AM
To: htm...@li...
Cc:
Subject: Re: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example
Hi Claude,
Thanks for the kind words.
BTW: I was giving some thought to the calls that take place in HTMLEnumeration. As far as I could tell, many internal calls were made twice, by virtue of the hasMoreNodes/nextHTMLNode pattern. An alternative pattern is repeated calls to nextHTMLNode, which should stop when a null response is returned. This pattern is used by the BufferedReader.readLine method, by the JDBC ResultSet.next method, etc. Based on the simple observation that calls to hasMoreNodes AND nextHTMLNode run some of the same underlying code, it seems that the speed of the parser could be positively influenced by reducing the interface to a single call. Any thoughts?
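For reference, the single-call pattern mentioned above looks like this with BufferedReader.readLine, which returns null at end of input. This is just an illustrative sketch: the class name NullTerminatedLoop and the helper readAll are made up for the example, and a StringReader stands in for a real input source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: shows the "call until null" loop, not any parser API. */
public class NullTerminatedLoop {

    /** Collects every line by calling readLine until it returns null. */
    public static List<String> readAll(BufferedReader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        // One call per element; null doubles as the end-of-input signal.
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("first\nsecond"));
        System.out.println(readAll(reader)); // prints [first, second]
    }
}
```

The caller has to remember the null check on every iteration, which is the API trade-off at issue here.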
I am not so sure this would be a good idea, because then we'd have to compromise on the API: users would have to check for null values. The iterator interface is also a popular one, so we have a familiarity factor here.
As far as optimization goes, nextHTMLNode doesn't do any parsing; it simply returns the node that was parsed internally when hasMoreNodes() was called. So the only speedup would be the elimination of one method call, and I am not so sure that this would be the best place for such a speedup.
By the way, talking about speedups, the last release and the next one should see some tweaks, and the performance ought to have gotten better. Are you still doing the performance testing? Any results to share?
Cheers,
Somik