Re: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example
From: Somik R. <so...@ya...> - 2002-08-10 08:17:21
Hi Claude,

You've again raised a good point. I will look into this for next week's release.

Regards,
Somik

----- Original Message -----
From: Claude Duguay
To: htm...@li...
Sent: Friday, August 09, 2002 12:58 AM
Subject: RE: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example

Based on your description, there is a risk that calling hasMoreNodes a few times in a row without calling nextHTMLNode will not have the desired API semantics. If the parsing takes place in the call to hasMoreNodes, then the parser moves forward regardless of whether the nextHTMLNode method was called. This suggests that the method should be called something else, more indicative of this behavior, or the behavior should be changed.

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Thu 8/8/2002 12:07 AM
To: htm...@li...
Cc:
Subject: Re: [Htmlparser-developer] Re: [Htmlparser-user] Another Ill-Formed Example

Hi Claude,

Thanks for the kind words.

BTW: I was giving some thought to the calls that take place in HTMLEnumeration. As far as I could tell, many internal calls were made twice, by virtue of the hasMoreNodes/nextHTMLNode pattern. An alternate pattern is repeated calls to nextHTMLNode, which should stop when a null response is returned. This pattern is used by the BufferedReader.readLine method, by the JDBC ResultSet.next method, etc. Based on the simple observation that calls to hasMoreNodes AND nextHTMLNode run some of the same underlying code, it seems that the speed of the parser could be positively influenced by reducing the interface to a single call. Any thoughts?

I am not so sure this would be a good idea, because then we'd have to compromise on the API. Users would have to check for null values. The iterator interface is also a popular one, and we have a familiarity factor here.

As far as optimization goes, nextHTMLNode doesn't do any parsing; it simply returns the node that was parsed internally when hasMoreNodes() was called. So the only speedup would be the reduction of a call. I am not so sure that this would be the best place for such a speedup.

By the way, talking about speedups, the last release and the next one should see some tweaks, and the performance ought to have gotten better. Are you still doing the performance testing? Any results to share?

Cheers,
Somik
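
For readers following the hasMoreNodes/nextHTMLNode discussion in this thread, here is a minimal sketch of one way the concern could be addressed: cache a single parsed node so that repeated calls to hasMoreNodes do not move the parser forward. This is not the htmlparser implementation. The class name LookaheadEnumeration, the parseCount counter, and the use of plain strings in place of real nodes are all assumptions made for illustration, and the sketch assumes the source never yields null.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical sketch, not the htmlparser code: keeps one node of lookahead
 * so that hasMoreNodes() can be called any number of times safely.
 */
class LookaheadEnumeration {
    private final Iterator<String> source; // stand-in for the real parsing step
    private String cached;                 // node parsed ahead but not yet handed out
    private int parseCount;                // number of times the "parser" advanced

    LookaheadEnumeration(List<String> nodes) {
        this.source = nodes.iterator();
    }

    /** Parses at most one node ahead; repeated calls do not advance the parser. */
    boolean hasMoreNodes() {
        if (cached == null && source.hasNext()) {
            cached = source.next();
            parseCount++;
        }
        return cached != null;
    }

    /** Hands out the cached node, parsing one first if hasMoreNodes() was skipped. */
    String nextHTMLNode() {
        if (!hasMoreNodes()) {
            throw new NoSuchElementException("no more nodes");
        }
        String node = cached;
        cached = null; // the next hasMoreNodes() call will parse again
        return node;
    }

    int parseCount() {
        return parseCount;
    }

    public static void main(String[] args) {
        LookaheadEnumeration e =
                new LookaheadEnumeration(Arrays.asList("<a>", "text", "</a>"));
        e.hasMoreNodes();
        e.hasMoreNodes(); // second call reuses the cached node
        while (e.hasMoreNodes()) {
            System.out.println(e.nextHTMLNode());
        }
        System.out.println("parse steps: " + e.parseCount());
    }
}
```

With this shape the familiar two-method interface is kept, each node is still parsed only once, and callers who skip hasMoreNodes get an exception at the end of input rather than a null to check for.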