Re: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example


Hi Claude,
    You've again raised a good point. I will look into this for next week's release.

Regards
Somik
  ----- Original Message -----
  From: Claude Duguay
  To: htm...@li...
  Sent: Friday, August 09, 2002 12:58 AM
  Subject: RE: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example

  Based on your description, there is a risk that calling hasMoreNodes
  several times in a row without calling nextHTMLNode will not have the
  expected API semantics. If the parsing takes place in the call to
  hasMoreNodes, then the parser moves forward regardless of whether
  nextHTMLNode was called. This suggests that either the method should be
  renamed to something more indicative of this behavior, or the behavior
  should be changed.
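
  To make the second option concrete, here is a minimal sketch of a
  hasMoreNodes that parses at most one node ahead, so that repeated calls
  do not advance the parser. The class name GuardedEnumeration and the use
  of plain strings as nodes are assumptions made for illustration; this is
  not the actual HTMLEnumeration code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the parser's enumeration; nodes are plain strings here.
class GuardedEnumeration {
    private final Iterator<String> source; // stands in for the underlying parse step
    private String pending;                // node parsed ahead but not yet handed out

    GuardedEnumeration(List<String> nodes) {
        this.source = nodes.iterator();
    }

    // Parses at most one node ahead; calling this repeatedly does not advance further.
    boolean hasMoreNodes() {
        if (pending == null && source.hasNext()) {
            pending = source.next();
        }
        return pending != null;
    }

    // Hands out the buffered node, parsing one first if the caller skipped hasMoreNodes.
    // Returns null once the input is exhausted.
    String nextHTMLNode() {
        if (!hasMoreNodes()) {
            return null;
        }
        String node = pending;
        pending = null;
        return node;
    }

    public static void main(String[] args) {
        GuardedEnumeration e = new GuardedEnumeration(Arrays.asList("<p>", "hello", "</p>"));
        e.hasMoreNodes();
        e.hasMoreNodes(); // second call is a no-op because a node is already buffered
        while (e.hasMoreNodes()) {
            System.out.println(e.nextHTMLNode());
        }
    }
}
```

  With this guard, the cost of the extra hasMoreNodes call is a single
  null check, and no node is skipped however the two methods are
  interleaved.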

  -----Original Message-----
  From: Somik Raha [mailto:so...@ya...]
  Sent: Thu 8/8/2002 12:07 AM
  To: htm...@li...
  Cc:
  Subject: Re: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example

  Hi Claude,
      Thanks for the kind words.

  BTW: I was giving some thought to the calls that take place in
  HTMLEnumeration. As far as I could tell, many internal calls are made
  twice by virtue of the hasMoreNodes/nextHTMLNode pattern. An alternative
  pattern is to call nextHTMLNode repeatedly and stop when it returns
  null. This pattern is used by the BufferedReader.readLine method, by the
  JDBC ResultSet.next method, etc. Based on the simple observation that
  calls to hasMoreNodes AND nextHTMLNode run some of the same underlying
  code, it seems that the speed of the parser could be improved by
  reducing the interface to a single call. Any thoughts?
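
  For reference, the single-call pattern mentioned above looks like this
  with BufferedReader.readLine, which returns null at the end of input.
  The class and method names (NullTerminatedLoop, readAll) are
  hypothetical; only the java.io calls are real API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class NullTerminatedLoop {
    // Collects every line using one call per item: readLine() returns null when done.
    static List<String> readAll(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException ex) {
            // Not expected for an in-memory reader; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(ex);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readAll("first\nsecond\nthird"));
    }
}
```

  The loop condition both fetches the item and tests for the end, which
  is why no separate "has more" call is needed.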

  I am not so sure this would be a good idea, because then we'd have to
  compromise on the API. Users would have to check for null values. The
  iterator interface is also a popular one, so we have a familiarity
  factor here.

  As far as optimization goes, nextHTMLNode doesn't do any parsing; it
  simply returns the node that was parsed internally when hasMoreNodes()
  was called. So the only speedup would be the elimination of one method
  call, and I am not so sure that this would be the best place for such a
  speedup.
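
  A rough way to check this claim is to count parse steps under both
  styles. The sketch below assumes the design described above (parsing
  happens in hasMoreNodes, nextHTMLNode only returns the cached node) and
  uses strings as nodes; CountingEnumeration, parseNext and nextOrNull are
  hypothetical names, not part of the parser.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class CountingEnumeration {
    private final Iterator<String> source; // stands in for the real input
    private String current;                // node cached by the last hasMoreNodes call
    int parseCalls = 0;                    // number of parse attempts made so far

    CountingEnumeration(List<String> nodes) {
        this.source = nodes.iterator();
    }

    // The single place where "parsing" happens; returns null at end of input.
    private String parseNext() {
        parseCalls++;
        return source.hasNext() ? source.next() : null;
    }

    // Two-call style: hasMoreNodes parses, nextHTMLNode only returns the cached node.
    boolean hasMoreNodes() {
        current = parseNext();
        return current != null;
    }

    String nextHTMLNode() {
        return current;
    }

    // Single-call style: parse and return in one step, null at the end.
    String nextOrNull() {
        return parseNext();
    }

    static int parsesWithTwoCalls(List<String> nodes) {
        CountingEnumeration e = new CountingEnumeration(nodes);
        while (e.hasMoreNodes()) {
            e.nextHTMLNode();
        }
        return e.parseCalls;
    }

    static int parsesWithSingleCall(List<String> nodes) {
        CountingEnumeration e = new CountingEnumeration(nodes);
        while (e.nextOrNull() != null) {
            // nothing to do; only the parse count matters here
        }
        return e.parseCalls;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("<p>", "hello", "</p>");
        System.out.println(parsesWithTwoCalls(nodes));
        System.out.println(parsesWithSingleCall(nodes));
    }
}
```

  Under these assumptions both loops make the same number of parse
  attempts (one per node plus one to detect the end), so the single-call
  style saves only the extra method invocation.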

  By the way, talking about speedups, the last release and the next one
  should see some tweaks, and the performance ought to have gotten better.
  Are you still doing the performance testing? Any results to share?

  Cheers,
  Somik