Thread: [Htmlparser-developer] Re: [Htmlparser-user] HTML parser 1.1
From: Somik R. <so...@ya...> - 2002-04-08 04:06:58
Hi Raghav,

> when would be this HTMLparser 1.1 out?

As soon as I can wrap it up. Technically, the code is ready and already checked into CVS. I still need to go through the process of creating a release: write some documentation, check that everything is OK, and so on. If I had some help I could wrap it up sooner.

> I am not sure, but to me the way htmlparser parses is it gives me the tag
> parameter of the first line in the above snippet of html code, when I do
> Hashtable table = tag.parseParameters();
> it is looking for parameters inside <FORM ..... >, but not <FORM
> .....</FORM>

Yes, parseParameters() will give you the stuff inside the FORM tag itself. That is what I call "microscopic" parsing. To get the remaining tags, up to the point where you encounter </FORM>, you need to do "macroscopic" parsing. This is not hard; check HTMLAppletScanner for an example.

In a nutshell, the concept is very simple. The scan method provides you with a reader, and you use that reader to read ahead and get the next tags. This is simple because the reader automatically identifies the correct tags, and the mechanism is very similar to using the parser to get the tags you want. HTMLLinkScanner, among others, works on the same principle.

By the way, I think we should take this discussion to the Developer list.

Regards,
Somik

----- Original Message -----
From: "Raghavender Srimantula" <kin...@ho...>
To: <htm...@li...>
Sent: Monday, April 08, 2002 6:39 AM
Subject: [Htmlparser-user] HTML parser 1.1

> Hi Somik,
> when would be this HTMLparser 1.1 out?
> one more question. to parse the FORM tags, I have a small question.
> let us say this is a form tag
>
> <FORM NAME="LoginForm" METHOD=POST ACTION="urltoInvoke">
> <P>User name:
> <INPUT TYPE="text" NAME="userName" SIZE="10">
> <P>Password:
> <INPUT TYPE="password" NAME="password" SIZE="12">
> <P><INPUT TYPE="submit" VALUE="Log in">
> <INPUT TYPE="button" VALUE="Cancel" onClick="window.close()">
> </FORM>
>
> I am not sure, but to me the way htmlparser parses is it gives me the tag
> parameter of the first line in the above snippet of html code, when I do
> Hashtable table = tag.parseParameters();
> it is looking for parameters inside <FORM ..... >, but not <FORM
> .....</FORM>
>
> could you suggest me how to go ahead with this.
> Raghav
>
> to extract the INPUT tag parameters
>
> _______________________________________________
> Htmlparser-user mailing list
> Htm...@li...
> https://lists.sourceforge.net/lists/listinfo/htmlparser-user
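The read-ahead ("macroscopic") idea described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: `Tag`, `EndTag`, and `collectInputs` are hypothetical stand-ins written for this sketch, not the htmlparser classes or the code in HTMLAppletScanner, and a plain `Iterator` plays the role of the reader that the scan method receives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in types for illustration only; they are NOT the htmlparser classes.
class FormScanSketch {
    /** A begin tag such as <INPUT TYPE="text">, stored without the angle brackets. */
    static class Tag {
        final String text;
        Tag(String text) { this.text = text; }
        /** The tag name is the first whitespace-delimited token of the tag text. */
        String name() { return text.trim().split("\\s+")[0].toUpperCase(); }
    }

    /** An end tag such as </FORM>, stored as just the tag name. */
    static class EndTag {
        final String name;
        EndTag(String name) { this.name = name; }
    }

    /**
     * Assumed to be called once the opening FORM tag has been seen.
     * Reads ahead on the same node stream until </FORM> and collects
     * every INPUT tag found on the way.
     */
    static List<Tag> collectInputs(Iterator<Object> reader) {
        List<Tag> inputs = new ArrayList<>();
        while (reader.hasNext()) {
            Object node = reader.next();
            if (node instanceof EndTag && ((EndTag) node).name.equalsIgnoreCase("FORM")) {
                break; // end of the form: stop reading ahead
            }
            if (node instanceof Tag && ((Tag) node).name().equals("INPUT")) {
                inputs.add((Tag) node);
            }
        }
        return inputs;
    }

    public static void main(String[] args) {
        // Nodes that would follow the opening FORM tag; plain Strings stand in for text nodes.
        List<Object> nodes = Arrays.<Object>asList(
            new Tag("P"), "User name:",
            new Tag("INPUT TYPE=\"text\" NAME=\"userName\" SIZE=\"10\""),
            new Tag("P"), "Password:",
            new Tag("INPUT TYPE=\"password\" NAME=\"password\" SIZE=\"12\""),
            new EndTag("FORM"),
            new Tag("INPUT TYPE=\"text\" NAME=\"outside\""));
        for (Tag input : collectInputs(nodes.iterator())) {
            System.out.println(input.text); // only the two INPUT tags inside the form
        }
    }
}
```

In the real library, each collected INPUT tag could then be handed to parseParameters() for the "microscopic" step.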
From: Raghavender S. <kin...@ho...> - 2002-04-09 10:01:43
Hi Somik,

A question regarding the form parsing. Let us say I have this tag:

    <SELECT name="pulldown" class="smaller-text">

Now when I do

    node = reader.readElement();

and then call node.print(), I get

    Begin Tag : SELECT name="pulldown" class="smaller-text"; begins at : 0; ends at : 44

The node I get is none of HTMLRemarkNode, HTMLStringNode, or HTMLEndTag, so I am not sure how to classify it. I need to classify the node if I want to take some action here. Could you help me out?

Raghav
From: Somik R. <so...@ya...> - 2002-04-09 14:42:52
Hi Raghav,

> Begin Tag : SELECT name="pulldown" class="smaller-text"; begins at : 0; ends
> at : 44
>
> this node which I get is of neither HTMLRemarkNode, HTMLStringNode,
> HTMLEndTag.

That's right, this is expected behaviour. The type of this node is HTMLTag. If you downcast it to HTMLTag, you can get all the info.

Regards,
Somik
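As a rough illustration of classifying a node and downcasting it, here is a minimal sketch. The class names are hypothetical stand-ins that only mirror the node kinds named in this thread; in particular, modelling the end tag as a subclass of the begin tag is an assumption of this sketch, which is why the end-tag test comes first.

```java
// Stand-in node hierarchy for illustration only; NOT the htmlparser classes.
class ClassifySketch {
    static class Node {}
    static class StringNode extends Node {}
    static class RemarkNode extends Node {}

    static class Tag extends Node {
        final String text;
        Tag(String text) { this.text = text; }
    }

    // Assumption: an end tag is modelled as a kind of Tag, so it must be tested before Tag.
    static class EndTag extends Tag {
        EndTag(String text) { super(text); }
    }

    /** Decides what kind of node was read, most specific type first. */
    static String classify(Node node) {
        if (node instanceof StringNode) return "string";
        if (node instanceof RemarkNode) return "remark";
        if (node instanceof EndTag) return "end tag";
        if (node instanceof Tag) {
            Tag tag = (Tag) node; // the downcast gives access to the tag-specific data
            return "begin tag: " + tag.text;
        }
        return "unknown";
    }

    public static void main(String[] args) {
        Node node = new Tag("SELECT name=\"pulldown\" class=\"smaller-text\"");
        System.out.println(classify(node));
        System.out.println(classify(new EndTag("SELECT")));
    }
}
```

Testing the most specific type first keeps the sketch correct whether or not the real end-tag class inherits from the tag class.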
From: Raghavender S. <kin...@ho...> - 2002-04-11 00:23:02
Hi Somik,

Any ideas about my previous mail? Let us say we have

    <OPTION value="#">Select a destination</OPTION>

When I do

    node = reader.readElement();

where "reader" is an HTMLReader, the node I get is none of HTMLStringNode, HTMLEndTag, or HTMLRemarkNode. How do I classify such nodes if I want to do something with them?

Raghav
From: Somik R. <so...@ya...> - 2002-04-11 02:17:41
Hi Raghav,

I replied to your earlier query. Did you receive the mail? (I have forwarded it again.)

Regarding your current query, there are two ways to handle option tags.

[1] As in the previous question, you will have to recognize an HTMLTag (begin tag), followed by an HTMLStringNode, and finally an HTMLEndTag.
[2] To make life easier, since this tag is basic XML, you can use the special XML parsing methods provided in the superclass HTMLTagScanner.

The methods are:
(i) isXMLTagFound
(ii) extractXMLData

Both of them are static methods. You would use them like this:

    HTMLNode node = reader.readElement();
    if (isXMLTagFound(node,"OPTION")) {
        String option = extractXMLData(node,"OPTION",reader);
        // The string now contains the data within the option xml tag
        // So given an input : <OPTION value="#">Select a destination</OPTION>
        // option will hold "Select a destination"
    }

But getting the value attribute from the option tag itself would need to be handled separately.

Regards,
Somik
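Approach [1], together with the separate handling of the value attribute, might look roughly like the sketch below. Everything in it is an assumption made for illustration: `Tag`, `EndTag`, and `readOption` are hypothetical stand-ins rather than htmlparser classes, a plain `Iterator` plays the reader, plain Strings play the string nodes, and the attribute is pulled out with a simple regular expression that only handles double-quoted values.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in types for illustration only; they are NOT the htmlparser classes.
class OptionSketch {
    static class Tag {
        final String text; // tag text without angle brackets, e.g. OPTION value="#"
        Tag(String text) { this.text = text; }
    }

    static class EndTag {
        final String name;
        EndTag(String name) { this.name = name; }
    }

    // Matches value="..." case-insensitively; double-quoted values only.
    private static final Pattern VALUE =
        Pattern.compile("\\bvalue\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /**
     * Expects begin tag, text, end tag in that order.
     * Returns {value attribute (or null), label text}, or null if the sequence does not match.
     */
    static String[] readOption(Iterator<Object> reader) {
        if (!reader.hasNext()) return null;
        Object begin = reader.next();
        if (!(begin instanceof Tag)) return null;
        String text = ((Tag) begin).text.trim();
        if (!text.split("\\s+")[0].equalsIgnoreCase("OPTION")) return null;

        Matcher m = VALUE.matcher(text);
        String value = m.find() ? m.group(1) : null;

        if (!reader.hasNext()) return null;
        Object label = reader.next();
        if (!(label instanceof String)) return null; // a String stands in for a string node

        if (!reader.hasNext()) return null;
        Object end = reader.next();
        if (!(end instanceof EndTag) || !((EndTag) end).name.equalsIgnoreCase("OPTION")) {
            return null;
        }
        return new String[] { value, (String) label };
    }

    public static void main(String[] args) {
        Iterator<Object> reader = Arrays.<Object>asList(
            new Tag("OPTION value=\"#\""), "Select a destination", new EndTag("OPTION")).iterator();
        String[] option = readOption(reader);
        System.out.println(option[0] + " -> " + option[1]); // prints "# -> Select a destination"
    }
}
```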
From: Raghavender S. <kin...@ho...> - 2002-04-11 21:12:58
Hi Somik,

The code snippet you mailed me seems to have a problem. Let me explain. The method call

    isXMLTagFound(node,"OPTION")

would always return false. The reason: in the definition of that method we have

    if (node instanceof HTMLTag) {
        System.out.println("node instanceof HTMLTag in tagscanner ");
        HTMLTag tag = (HTMLTag)node;
        if (tag.getText().equals(tagName)) {
            xmlTagFound=true;
        }
    }

tag.getText() would always give me

    OPTION value="#">Select a destination

which is not equal to tagName; in this case tagName is "OPTION".

Raghav
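The mismatch reported here can be reproduced in isolation. The sketch below is not the library code and not necessarily the fix that went into CVS; `exactMatch` and `nameMatch` are hypothetical helpers, and the tag text `OPTION value="#"` is an assumed example of what getText() might return for a tag with attributes. It contrasts comparing the whole tag text with comparing only the leading tag name.

```java
// Hypothetical helpers for illustration; NOT the HTMLTagScanner implementation.
class TagNameSketch {
    /** Mirrors the reported check: the whole tag text must equal the tag name. */
    static boolean exactMatch(String tagText, String tagName) {
        return tagText.equals(tagName);
    }

    /** Compares only the first whitespace-delimited token of the tag text. */
    static boolean nameMatch(String tagText, String tagName) {
        String trimmed = tagText.trim();
        int end = 0;
        while (end < trimmed.length() && !Character.isWhitespace(trimmed.charAt(end))) {
            end++;
        }
        return trimmed.substring(0, end).equalsIgnoreCase(tagName);
    }

    public static void main(String[] args) {
        String tagText = "OPTION value=\"#\""; // assumed tag text, attributes included
        System.out.println(exactMatch(tagText, "OPTION")); // false: attributes break equality
        System.out.println(nameMatch(tagText, "OPTION"));  // true: only the name is compared
    }
}
```

Comparing the first token rather than using a prefix test also avoids treating a longer name that merely starts with "OPTION" as a match.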
From: Somik R. <so...@ya...> - 2002-04-12 03:00:50
Hi Raghav,

You are right. That is indeed a bug. I have written a test case for it, captured it, and fixed it. The code is checked into CVS; it should work for you now.

Regards,
Somik