Re: [Htmlparser-developer] Writing OPTION tag
From: Somik R. <so...@ya...> - 2002-08-14 08:49:53
Hi Dhaval,

> That's exactly what happens. Everything inside <OPTION ..> will be a tag
> and outside it will be HTMLStringNode. However, when I have to read
> another <OPTION ....> tag where the previous OPTION tag did not have a
> closing </OPTION>, the later <OPTION ....> tag gets read, and since it has
> been read once it is unavailable for scanning again as a new Option tag.
> Anyway I seem to have made my testcases work by storing the previous
> node value, and in case </OPTION> is not present I take care of it
> accordingly. I have just added some more test cases to validate its
> robustness. For the time being I think it's done.

Good question. I faced the same thing with several other tags. To counter
this issue, you will find a variable in the evaluate() method:
previousOpenScanner. Suppose you are searching for </OPTION> and encounter
an <OPTION> instead; evaluate() then allows you to do something about it.
At that point, you must fool the open scanner into believing that the
previous tag got closed.

This is exactly what's done in HTMLLinkScanner. On seeing that there was a
previousOpenScanner, we accept it as true. And in scan(), the end tag (which
wasn't there) is returned, putting in a correction so that the next tag
still gets parsed (in elementEnd() positioning).

Let me know if you need more help. (You simply can't do this without
testcases..)

Cheers
Somik

----- Original Message -----
From: <dha...@or...>
To: <htm...@li...>
Sent: Wednesday, August 14, 2002 5:15 PM
Subject: RE: [Htmlparser-developer] Writing OPTION tag

> Hi Somik,
>
> That's exactly what happens. Everything inside <OPTION ..> will be a tag
> and outside it will be HTMLStringNode. However, when I have to read
> another <OPTION ....> tag where the previous OPTION tag did not have a
> closing </OPTION>, the later <OPTION ....> tag gets read, and since it has
> been read once it is unavailable for scanning again as a new Option tag.
> Anyway I seem to have made my testcases work by storing the previous
> node value, and in case </OPTION> is not present I take care of it
> accordingly. I have just added some more test cases to validate its
> robustness. For the time being I think it's done.
>
> Thanx for the response nevertheless.
>
> Regards,
>
> Dhaval Udani
> Senior Analyst
> M-Line, QPEG
> OrbiTech Solutions Ltd.
> +91-22-8290019 Extn. 1457
>
> -----Original Message-----
> From: somik [mailto:so...@ya...]
> Sent: Wednesday, August 14, 2002 1:14 PM
> To: htmlparser-developer
> Cc: somik
> Subject: Re: [Htmlparser-developer] Writing OPTION tag
>
> Hi Dhaval,
> Sorry, I've been really swamped..
> > The problem with my input is that <OPTION value="AltaVista Search">
> > would be read as an OptionTag, AltaVista would be read as the StringNode
> > and then <OPTION value="Lycos Search"> would be read and since it is
> > neither a StringNode nor an EndTag an OptionTag would be created for the
> > above 2 values. ..
>
> This idea is incorrect. <OPTION ....> is a tag. Nothing inside the Option
> tag is a string node.
> <OPTION ... > (this is HTMLTag)
> some text here sdjklsdjk (this is HTMLStringNode)
> </OPTION> (this is HTMLEndTag)
>
> HTH.
>
> Cheers,
> Somik
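
The recovery discussed in this thread (a new <OPTION> arriving while the previous one is still open is treated as closing that previous one) can be illustrated with a small standalone sketch. It deliberately does not use the htmlparser classes: the class OptionRecovery and the method optionTexts are hypothetical names invented for this illustration, and the regex-based scanning is a stand-in, not how HTMLLinkScanner or evaluate()/scan() are actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the htmlparser library.
public class OptionRecovery {
    // Matches an opening <OPTION ...> or a closing </OPTION>, case-insensitively.
    private static final Pattern TAG =
            Pattern.compile("<(/?)option\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    /** Returns the text of each OPTION, whether or not it was explicitly closed. */
    public static List<String> optionTexts(String html) {
        List<String> texts = new ArrayList<>();
        Matcher m = TAG.matcher(html);
        boolean open = false;   // stands in for "there is a previous open scanner"
        int textStart = 0;      // where the text of the currently open OPTION begins
        while (m.find()) {
            boolean isEndTag = !m.group(1).isEmpty();
            if (open) {
                // Either a real </OPTION>, or a new <OPTION> that we treat as
                // implicitly closing the previous one (the missing end tag).
                texts.add(html.substring(textStart, m.start()).trim());
                open = false;
            }
            if (!isEndTag) {
                // The new <OPTION> is not swallowed: it starts its own element.
                open = true;
                textStart = m.end();
            }
        }
        if (open) {
            // Input ended while an OPTION was still open; take the rest as its text.
            texts.add(html.substring(textStart).trim());
        }
        return texts;
    }

    public static void main(String[] args) {
        String html = "<OPTION value=\"AltaVista Search\">AltaVista"
                + "<OPTION value=\"Lycos Search\">Lycos</OPTION>";
        System.out.println(optionTexts(html)); // prints [AltaVista, Lycos]
    }
}
```

The point of the sketch is the same as in the discussion: when the second <OPTION> is seen, the first element is closed as if its end tag had been present, and the second tag is still processed as a new element rather than being consumed. A stray </OPTION> with nothing open is simply ignored here, which is a simplification.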