Thread: [Htmlparser-developer] Re: [Htmlparser-user] HTML parser 1.1

[Htmlparser-developer] Re: [Htmlparser-user] HTML parser 1.1

From: Somik R. <so...@ya...> - 2002-04-08 04:06:58

Hi Raghav
> When will HTMLparser 1.1 be out?
As soon as I can wrap it up. Technically, the code is ready and already
checked into CVS. I still need to go through the release process: write some
documentation, check that everything is OK, and so on.
If I had some help, I could wrap it up sooner.

> I am not sure, but it seems to me that htmlparser only gives me the tag
> parameters of the first line in the above snippet of HTML code. When I do
> Hashtable table = tag.parseParameters();
> it looks for parameters inside <FORM ..... >, but not inside <FORM
> .....</FORM>

Yes - parseParameters() will give you the stuff inside the FORM tag itself.
That is what I call "microscopic" parsing. But to get the remaining tags, up
to the closing </FORM>, you need to do "macroscopic" parsing. This is not
hard; check HTMLAppletScanner for an example.
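
To illustrate what "microscopic" parsing means here, the following is a minimal Java sketch that pulls the attributes out of a single tag's text into a Hashtable. It is not the library's parseParameters() implementation; the class TagAttributeSketch, the method parseAttributes, and the regular expression are assumptions made up for this example.

```java
import java.util.Hashtable;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in for illustration only; this is not the htmlparser implementation.
final class TagAttributeSketch {
    // NAME=value, where value is double-quoted, single-quoted, or bare.
    private static final Pattern ATTRIBUTE = Pattern.compile(
        "([A-Za-z][A-Za-z0-9_-]*)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))");

    /** Parses the attributes found in the text of one tag (no angle brackets). */
    static Hashtable<String, String> parseAttributes(String tagText) {
        Hashtable<String, String> table = new Hashtable<>();
        Matcher m = ATTRIBUTE.matcher(tagText);
        while (m.find()) {
            // Exactly one of the three value groups is non-null for each match.
            String value = m.group(2) != null ? m.group(2)
                         : m.group(3) != null ? m.group(3)
                         : m.group(4);
            // Attribute names are upper-cased so lookups are case-insensitive.
            table.put(m.group(1).toUpperCase(Locale.ROOT), value);
        }
        return table;
    }

    public static void main(String[] args) {
        Hashtable<String, String> table = parseAttributes(
            "FORM NAME=\"LoginForm\" METHOD=POST ACTION=\"urltoInvoke\"");
        System.out.println(table.get("NAME"));   // LoginForm
        System.out.println(table.get("METHOD")); // POST
        System.out.println(table.get("ACTION")); // urltoInvoke
    }
}
```

Note that only the begin tag's own attributes are seen; nothing between <FORM ...> and </FORM> is touched, which is why a second, read-ahead step is needed for the INPUT tags.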

In a nutshell, the concept is very simple. The scan method provides you with a
reader, and you use that reader to read ahead and get the next tags.
This is simple because the reader automatically identifies the correct tags,
and the mechanism is very similar to using the parser to get the tags you
want. HTMLLinkScanner, among others, works on the same principle.
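
The read-ahead idea can be sketched roughly as follows. This is not the HTMLReader or scanner code from the library; FormScanSketch and its readTag and collectInputs methods are hypothetical stand-ins that only show the shape of the loop: once a FORM begin tag is seen, keep reading tags until </FORM> and pick out the INPUT tags on the way. The sketch assumes that no '>' occurs inside an attribute value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a node reader; not the library's HTMLReader.
final class FormScanSketch {
    private final String html;
    private int pos = 0;

    FormScanSketch(String html) {
        this.html = html;
    }

    /** Returns the next tag's text without angle brackets, skipping plain text; null at end of input. */
    String readTag() {
        int open = html.indexOf('<', pos);
        if (open < 0) return null;
        int close = html.indexOf('>', open);
        if (close < 0) return null;
        pos = close + 1;
        return html.substring(open + 1, close).trim();
    }

    /** Finds the first FORM begin tag, then reads ahead and collects INPUT tags until </FORM>. */
    static List<String> collectInputs(String html) {
        FormScanSketch reader = new FormScanSketch(html);
        List<String> inputs = new ArrayList<>();
        boolean insideForm = false;
        String tag;
        while ((tag = reader.readTag()) != null) {
            String upper = tag.toUpperCase(Locale.ROOT);
            if (!insideForm) {
                insideForm = upper.equals("FORM") || upper.startsWith("FORM ");
            } else if (upper.equals("/FORM")) {
                break; // end of the form: stop reading ahead
            } else if (upper.equals("INPUT") || upper.startsWith("INPUT ")) {
                inputs.add(tag);
            }
        }
        return inputs;
    }

    public static void main(String[] args) {
        String html = "<FORM NAME=\"LoginForm\" METHOD=POST ACTION=\"urltoInvoke\">"
            + "<P>User name: <INPUT TYPE=\"text\" NAME=\"userName\" SIZE=\"10\">"
            + "<P>Password: <INPUT TYPE=\"password\" NAME=\"password\" SIZE=\"12\">"
            + "</FORM>";
        for (String input : collectInputs(html)) {
            System.out.println(input);
        }
    }
}
```

Each collected tag text could then be handed to an attribute parser to obtain its TYPE, NAME, and so on.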

By the way, I think we should take this discussion to the Developer list.

Regards,
Somik
----- Original Message -----
From: "Raghavender Srimantula" <kin...@ho...>
To: <htm...@li...>
Sent: Monday, April 08, 2002 6:39 AM
Subject: [Htmlparser-user] HTML parser 1.1


> Hi Somik,
> When will HTMLparser 1.1 be out?
> One more question, about parsing FORM tags.
> Let us say this is a form tag:
>
> <FORM NAME="LoginForm" METHOD=POST ACTION="urltoInvoke">
> <P>User name:
> <INPUT TYPE="text" NAME="userName" SIZE="10">
> <P>Password:
> <INPUT TYPE="password" NAME="password" SIZE="12">
> <P><INPUT TYPE="submit" VALUE="Log in">
> <INPUT TYPE="button" VALUE="Cancel" onClick="window.close()">
> </FORM>
>
> I am not sure, but it seems to me that htmlparser only gives me the tag
> parameters of the first line in the above snippet of HTML code. When I do
> Hashtable table = tag.parseParameters();
> it looks for parameters inside <FORM ..... >, but not inside <FORM
> .....</FORM>
>
> Could you suggest how I should go ahead with this?
> Raghav
>
>
> to extract the INPUT tag parameters

[Htmlparser-developer] Re: [Htmlparser-user] HTML parser 1.1

From: Raghavender S. <kin...@ho...> - 2002-04-09 10:01:43

Hi Somik,
A question regarding form parsing. Let us say I have this tag:
<SELECT name="pulldown" class="smaller-text">

Now when I do
node = reader.readElement();

and then call node.print(), I get

Begin Tag : SELECT name="pulldown" class="smaller-text"; begins at : 0; ends at : 44

The node I get is not an HTMLRemarkNode, an HTMLStringNode, or an HTMLEndTag,
so I am not sure how to classify it. I need to classify this node if I want
to take some action here.
Could you help me out?
Raghav



Re: [Htmlparser-developer] Re: [Htmlparser-user] HTML parser 1.1

From: Somik R. <so...@ya...> - 2002-04-09 14:42:52

Hi Raghav
> Begin Tag : SELECT name="pulldown" class="smaller-text"; begins at : 0; ends at : 44
>
> this node which I get is of neither HTMLRemarkNode, HTMLStringNode,
> HTMLEndTag.

That's right - this is expected behaviour. The type of this node is HTMLTag.
If you downcast to HTMLTag, you can get all the info.
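
A minimal sketch of this kind of classification, using instanceof checks followed by a downcast, could look like the code below. The node classes here are stand-ins written for this example; they only mirror the names mentioned in the thread and are not the library's classes. Treating an end tag as a subclass of the tag type is an assumption, which is why the more specific type is tested first.

```java
// Stand-in node types for illustration; they are not the htmlparser classes.
final class NodeClassifySketch {
    static class Node {
        final String text;
        Node(String text) { this.text = text; }
    }
    static class StringNode extends Node {
        StringNode(String text) { super(text); }
    }
    static class RemarkNode extends Node {
        RemarkNode(String text) { super(text); }
    }
    static class Tag extends Node {
        Tag(String text) { super(text); }
    }
    // Assumption: an end tag is modeled as a special kind of tag.
    static class EndTag extends Tag {
        EndTag(String text) { super(text); }
    }

    /** Classifies a node, testing the most specific type first. */
    static String classify(Node node) {
        if (node instanceof EndTag) {
            return "end tag: " + node.text;
        }
        if (node instanceof Tag) {
            Tag tag = (Tag) node; // downcast to reach tag-specific data
            return "begin tag: " + tag.text;
        }
        if (node instanceof StringNode) {
            return "text: " + node.text;
        }
        if (node instanceof RemarkNode) {
            return "remark: " + node.text;
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(classify(new Tag("SELECT name=\"pulldown\" class=\"smaller-text\"")));
        System.out.println(classify(new StringNode("Select a destination")));
        System.out.println(classify(new EndTag("SELECT")));
    }
}
```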

Regards,
Somik

[Htmlparser-developer] Re: [Htmlparser-user] HTML parser 1.1

From: Raghavender S. <kin...@ho...> - 2002-04-11 00:23:02

Hi Somik,
Any ideas about my previous mail? Let us say we have
<OPTION value="#">Select a destination</OPTION>
When I do
node = reader.readElement();
where "reader" is an HTMLReader,
the node I get is not an HTMLStringNode, an HTMLEndTag, or an
HTMLRemarkNode.
How do I classify these nodes if I want to do something with them?
Raghav


[Htmlparser-developer] Re: [Htmlparser-user] HTML parser 1.1

From: Somik R. <so...@ya...> - 2002-04-11 02:17:41

Hi Raghav
    I replied to your earlier query. Did you receive the mail (I forwarded
it again)?
    Regarding your current query, there are two ways to handle option tags.

[1] As in the previous question, you recognize an HTMLTag
(begin tag), followed by an HTMLStringNode, and finally an HTMLEndTag.
[2] To make life easier, since this tag is basic XML, you can use the special
XML parsing methods provided in the superclass HTMLTagScanner.

The methods are:
(i) isXMLTagFound
(ii) extractXMLData

Both of them are static methods.
You would use them like this:

HTMLNode node = reader.readElement();
if (isXMLTagFound(node,"OPTION")) {
    String option = extractXMLData(node,"OPTION",reader);
    // The string now contains the data within the option xml tag
    // So given an input : <OPTION value="#">Select a destination</OPTION>
    // option will hold "Select a destination"
}

But getting the value attribute from the option tag itself would need to be
handled separately.
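
As a rough illustration of handling both pieces, the sketch below extracts the value attribute and the enclosed text from a simple OPTION element with one regular expression. It does not use isXMLTagFound or extractXMLData; OptionSketch and valueAndLabel are hypothetical names invented for this example, and the pattern assumes a double-quoted value attribute and no nested tags.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only; not part of the htmlparser library.
final class OptionSketch {
    // Group 1: the value attribute. Group 2: the text between the begin and end tags.
    private static final Pattern OPTION = Pattern.compile(
        "<OPTION\\b[^>]*?\\bvalue\\s*=\\s*\"([^\"]*)\"[^>]*>(.*?)</OPTION>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns {value, label} for the first OPTION element found, or null if there is none. */
    static String[] valueAndLabel(String html) {
        Matcher m = OPTION.matcher(html);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2).trim() };
    }

    public static void main(String[] args) {
        String[] result = valueAndLabel("<OPTION value=\"#\">Select a destination</OPTION>");
        System.out.println(result[0]); // #
        System.out.println(result[1]); // Select a destination
    }
}
```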

Regards,
Somik

[Htmlparser-developer] Re: [Htmlparser-user] HTML parser 1.1

From: Raghavender S. <kin...@ho...> - 2002-04-11 21:12:58

Hi Somik,
The code snippet you mailed me seems to have a problem.
Let me explain. The call
isXMLTagFound(node,"OPTION")
always returns false. The reason: in the definition of that method
we have

if (node instanceof HTMLTag) {
    System.out.println("node instanceof HTMLTag in tagscanner  ");
    HTMLTag tag = (HTMLTag)node;
    if (tag.getText().equals(tagName)) {
        xmlTagFound=true;
    }
}

tag.getText() always gives me
OPTION value="#">Select a destination

which is not equal to tagName; in this case tagName is "OPTION".

Raghav



Re: [Htmlparser-developer] Re: [Htmlparser-user] HTML parser 1.1

From: Somik R. <so...@ya...> - 2002-04-12 03:00:50

Hi Raghav
    You are right. That is indeed a bug. I have written a test case for it,
captured it, and fixed it.
    Code is checked into CVS - it should work for you now.
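
The fix that went into CVS is not shown in this thread. Purely as an illustration of the kind of comparison that avoids the reported problem, the sketch below compares only the leading name of the tag text with the wanted name, instead of the whole text. TagNameMatchSketch, tagName, and isTagFound are hypothetical names for this example, not the library's code.

```java
import java.util.Locale;

// Illustration only; the actual fix in the library may differ.
final class TagNameMatchSketch {
    /** Returns the leading name of a tag's text, e.g. "OPTION" for "OPTION value=\"#\"". */
    static String tagName(String tagText) {
        String trimmed = tagText.trim();
        int end = 0;
        // The name ends at the first whitespace character or '>'.
        while (end < trimmed.length()
                && !Character.isWhitespace(trimmed.charAt(end))
                && trimmed.charAt(end) != '>') {
            end++;
        }
        return trimmed.substring(0, end).toUpperCase(Locale.ROOT);
    }

    /** True if the tag text starts with the wanted tag name, ignoring case. */
    static boolean isTagFound(String tagText, String wanted) {
        return tagName(tagText).equals(wanted.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String text = "OPTION value=\"#\">Select a destination";
        System.out.println(text.equals("OPTION"));      // false: the comparison from the bug report
        System.out.println(isTagFound(text, "OPTION")); // true
    }
}
```

Comparing the extracted name rather than using a prefix test also keeps a tag such as OPTIONAL from being mistaken for OPTION.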

Regards,
Somik