htmlparser-user Mailing List for HTML Parser (Page 99)
Brought to you by:
derrickoswald
|
From: Somik R. <so...@ya...> - 2002-04-01 15:22:00
|
Hi Craig
Wow! That's a great question.
Actually, I doubt I could replace Sun Microsystems' code with mine. I
don't think Java is that open (or is it?)
However, we could think of writing our own adapter for the HTML parser that
might plug in some way...
I have never used Sun's HTML parser (if I had, I might not have started
this project).
I will need to study Sun's parser before I can answer your question.
But there do seem to be some interesting possibilities.
Regards
Somik
----- Original Message -----
From: "Craig Raw" <cr...@qu...>
To: <htm...@li...>
Sent: Monday, April 01, 2002 10:20 PM
Subject: [Htmlparser-user] Swing integration
> Has the HTML Parser been integrated into Swing's HTMLEditorKit to
> provide a better implementation of JEditorPane's HTML viewing
> capabilities? HTML Parser would need to replace
> javax.swing.text.html.parser.Parser, which is currently somewhat buggy.
> Anyone tried this?
>
> -craig
>
>
>
>
> _______________________________________________
> Htmlparser-user mailing list
> Htm...@li...
> https://lists.sourceforge.net/lists/listinfo/htmlparser-user
|
|
From: Craig R. <cr...@qu...> - 2002-04-01 13:20:43
|
Has the HTML Parser been integrated into Swing's HTMLEditorKit to provide a better implementation of JEditorPane's HTML viewing capabilities? HTML Parser would need to replace javax.swing.text.html.parser.Parser, which is currently somewhat buggy. Anyone tried this? -craig |
|
From: Somik R. <so...@ya...> - 2002-03-24 05:51:01
|
Dear Users,
Thanks for using HTMLParser. HTMLParser is getting some new features, namely:
[1] HTMLMetaTag scanner
[2] Support for non-".html" pages - I am planning to bring dynamic pages
under the purview of the parser as well, though I might need a bit of help
for this.
I wanted to have some feedback from the user community - what are the
features that you would really like to see added to the parser (or are you
quite happy with the parser as is)?
Regards,
Somik
|
|
From: Somik R. <so...@ya...> - 2002-03-22 16:41:00
|
Hi Folks,
Release 1.04 is out. It has the following bug fixes:
[1] Parsing JSP tags that had tags within inverted commas was causing
problems.
[2] A link with no link URL would cause the parser to crash with a null
pointer exception.
The above bugs were reported by Gordon Deudney and Robert Kausch.
More test cases added.
Regards,
Somik
|
|
From: Somik R. <so...@ya...> - 2002-03-16 10:52:44
|
Hi Gordon,
This is in reply to your request for help on the SourceForge site. I
couldn't find out how to format code and put it up there.
Here's the sample code for you:
HTMLNode node, node2;
HTMLLinkTag linkTag;
HTMLImageTag imageTag;
for (Enumeration e = parser.elements(); e.hasMoreElements();) {
    node = (HTMLNode) e.nextElement();
    // Only if it is a link tag do we look for image tags within it
    if (node instanceof HTMLLinkTag) {
        linkTag = (HTMLLinkTag) node;
        // Go through the list of elements in this link tag
        for (Enumeration e2 = linkTag.linkData(); e2.hasMoreElements();) {
            node2 = (HTMLNode) e2.nextElement();
            // Only if the element is an image tag do we downcast
            if (node2 instanceof HTMLImageTag) {
                imageTag = (HTMLImageTag) node2;
                System.out.println("Image loc =" + imageTag.getImageLocation());
            }
        }
    }
}
Regards,
Somik
|
|
From: Gordon D. <gde...@on...> - 2002-03-16 10:40:06
|
I was wondering how to extract an image tag and img properties from within
an anchor tag? I know HTMLLinkTag has a linkData() which returns an
enumeration. I have tried to convert that to a tag but I have had no luck.
Example of what I am trying to parse:
(a href="asdf") (img src="asdf")(/a)
<a href="test"><img src="test"> </a>
I want to get the image tag info. I appreciate any help.
--
Gordon Deudney
gde...@on... - email
(212) 894-3750 x7884 - voicemail/fax
|
From: Somik R. <so...@ya...> - 2002-03-14 12:43:41
|
Here's your program attached. Run it on the attached file. I think this is
what you wanted to do.
Regards,
Somik
----- Original Message -----
From: "Somik Raha" <so...@ya...>
To: "HTMLParser User List" <htm...@li...>
Sent: Wednesday, March 13, 2002 11:38 PM
Subject: Re: [Htmlparser-user] HTML Parsing
> Yes - this should be possible. You have to process a HTMLTag, and check if
> it is a DIV tag. If yes, you can use the parseParameters() to get the CLASS
> value.
>
> Regards,
> Somik
|
From: Somik R. <so...@ya...> - 2002-03-13 14:33:37
|
Yes - this should be possible. You have to process a HTMLTag, and check if
it is a DIV tag. If yes, you can use the parseParameters() to get the CLASS
value.
Regards,
Somik
----- Original Message -----
From: "Kalyan Kumar Mudumbai" <mk...@wi...>
To: "'Somik Raha'" <so...@ya...>
Sent: Wednesday, March 13, 2002 2:56 PM
Subject: RE: [Htmlparser-user] HTML Parsing
> Hi Somik,
> thanks a lot for the quick reply. I had had a feel of the parser. I
> wanted to obtain the attribute value of CLASS in DIV to do the further
> processing in my application. But what I found from the initial running was,
> DIV tag hasn't been handled (if I'm not wrong. Please excuse me for my
> ignorance). I have also tried using the default java parser
> HTMLEditorKit.ParserCallback. Even this guy is also not handling this one.
> Can I handle this and if so, how can I?
> Thanks for your input.
>
> Regards,
> Kalyan
|
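The reply above boils down to two steps: check that the tag is a DIV, then read its CLASS attribute from the parsed parameters. The sketch below illustrates that idea on the raw text of a tag. It is a self-contained approximation and does not use HTMLParser's parseParameters(); the class TagAttributes and its methods tagName() and parse() are hypothetical names made up for this illustration, and the regular expression only covers simple name=value attributes.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of HTMLParser: splits the text found between
// '<' and '>' into a tag name and a map of attribute values.
final class TagAttributes {
    // Matches name="value", name='value' or name=value.
    private static final Pattern ATTR = Pattern.compile(
        "([A-Za-z][A-Za-z0-9_:-]*)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s\"'>]+))");

    // Returns the first word of the tag text in upper case, e.g. "DIV".
    static String tagName(String tagText) {
        String trimmed = tagText.trim();
        int end = 0;
        while (end < trimmed.length() && !Character.isWhitespace(trimmed.charAt(end))) {
            end++;
        }
        return trimmed.substring(0, end).toUpperCase(Locale.ROOT);
    }

    // Returns attribute values keyed by upper-cased attribute name.
    static Map<String, String> parse(String tagText) {
        Map<String, String> attrs = new LinkedHashMap<>();
        Matcher m = ATTR.matcher(tagText);
        while (m.find()) {
            String value = m.group(2) != null ? m.group(2)
                         : m.group(3) != null ? m.group(3)
                         : m.group(4);
            attrs.put(m.group(1).toUpperCase(Locale.ROOT), value);
        }
        return attrs;
    }

    public static void main(String[] args) {
        String tag = "div class=\"header\" id=main";
        // Only look at the CLASS value when the tag really is a DIV.
        if (tagName(tag).equals("DIV")) {
            System.out.println("CLASS = " + parse(tag).get("CLASS"));
        }
    }
}
```

Keys are upper-cased so that class, Class and CLASS all resolve to the same entry, since HTML attribute names are case-insensitive.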
From: Somik R. <so...@ya...> - 2002-03-12 07:47:43
|
Hi Kalyan
It seems like you are using something else other than HtmlParser.
Please download htmlparser from http://htmlparser.sourceforge.net
You will find the documentation as well with all info. To try it
immediately after downloading it, you can try :
run.bat http://www.yahoo.com
Type run.bat to get options for your switches. In your question - you want
to only extract links. So you can do :
run.bat http://www.yahoo.com -l
This will only show you the links.
Regards,
Somik
----- Original Message -----
From: "Kalyan Kumar Mudumbai" <mk...@wi...>
To: <htm...@li...>
Sent: Monday, March 11, 2002 7:49 PM
Subject: [Htmlparser-user] HTML Parsing
> Hi All,
> how do I parse an HTML document and obtain the value of a tag in that
> document? Suppose I have an HTML document named Table.html which contains
> a table with cells having an HREF to another document which also contains a
> table; I should first be able to obtain the HREF and then the table content.
> I am not able to find a way to obtain a parser object from the
> HTMLEditorKit. Can someone please post a code snippet for parsing the HTML
> and obtaining the attribute value of any HREF, which can be specified from
> the command line? Something like
>
> java TestParser Table.html HREF
>
> should read the Table.html file, and the output on the console has to be
> the value of HREF.
>
> Thanks,
> Kalyan
|
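The quoted question mentions not being able to obtain a parser object from HTMLEditorKit. For readers who want to stay with the JDK's own Swing parser instead of HTMLParser, one common route is javax.swing.text.html.parser.ParserDelegator together with an HTMLEditorKit.ParserCallback. The sketch below collects the HREF values of anchor tags from an HTML string. The class name HrefLister and the sample markup are made up for this illustration, and how well Swing's parser copes with a given real-world page is not something this sketch can promise.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical example class: lists HREF values using the JDK's Swing parser.
final class HrefLister {
    static List<String> hrefs(String html) {
        final List<String> found = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                // Only anchor tags are of interest here.
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        found.add(href.toString());
                    }
                }
            }
        };
        try {
            // The third argument tells the parser to ignore charset declarations.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        String page = "<html><body><table><tr><td>"
                    + "<a href=\"Other.html\">next table</a>"
                    + "</td></tr></table></body></html>";
        for (String href : hrefs(page)) {
            System.out.println(href);
        }
    }
}
```

To read Table.html from disk as in the question, the StringReader could be swapped for a FileReader on the file name passed on the command line.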
|
From: Kalyan K. M. <mk...@wi...> - 2002-03-11 10:51:55
|
Hi All,
how do I parse an HTML document and obtain the value of a tag in that
document? Suppose I have an HTML document named Table.html which contains
a table with cells having an HREF to another document which also contains a
table; I should first be able to obtain the HREF and then the table content.
I am not able to find a way to obtain a parser object from the
HTMLEditorKit. Can someone please post a code snippet for parsing the HTML
and obtaining the attribute value of any HREF, which can be specified from
the command line? Something like
java TestParser Table.html HREF
should read the Table.html file, and the output on the console has to be
the value of HREF.
Thanks,
Kalyan
This message is confidential and may also be legally privileged. If you are
not the intended recipient, please notify us immediately. You should not
copy it or use it for any purpose, nor disclose its contents to any other
person. The views and opinions expressed in this e-mail message are the
author's own and may not reflect the views and opinions of Wilco
International.
|
From: Somik R. <so...@ya...> - 2002-03-04 14:28:40
|
HTMLParser 1.03 has been released. It contains a bug fix in HTMLRemarkNode,
which was causing the parser to crash on pages with remarks going over one
line. A test case for the bug has been added in HTMLRemarkNodeTest.
The release also contains the design documentation in the zip. Thanks to
Serge Kruppa for pointing out the bug.
Check http://htmlparser.sourceforge.net
Regards
Somik
|
From: Somik R. <so...@ya...> - 2002-01-21 01:17:25
|
Hi Rohit,
For including your own scanner type, you would need to do something like
this :
[1] HTMLTableTag - the tag that stores the data of the table tags
[2] HTMLTableScanner - the class which does the scanning -
implement the two template methods :
(i) evaluate() - returns true if the tag name is "TABLE". false otherwise
(ii) scan() - returns the HTMLTableTag object from the available text data.
Here, you will be having the tag contents, and you will need to extract the
relevant data out, construct the table object appropriately and return it.
Finally, you need to register this scanner. That's it - after this, table
objects will be identified. All the scanners in the library were written with
this architecture in mind. Check out the entire scanners package, in
particular, HTMLLinkScanner. Check out the corresponding test cases (in
scannersTests package), and you should get a clear idea of the usage.
Also - could you subscribe to the HTMLParser User's list, and mail your
queries to that single mail id.
Cheers
Somik
----- Original Message -----
From: "Rohit Kelapure" <rke...@vt...>
To: <fal...@mt...>; <kaa...@ik...>;
<na...@us...>; <so...@ki...>
Sent: Monday, January 21, 2002 10:07 AM
Subject: HTML TABLE PARSER
> My name is Rohit Kelapure.
>
> I am a graduate student in Computer Science at Virginia Tech.
>
> I have been going through the source code of the HTML parser.
>
> I need to customize this so as to extract the items of a table on a HTML
> page and insert in a database.
>
> From the code and documentation it is clear that I need to create my own
> scanner-tag pair.
>
> Could you give some more pointers to this. Which are the java source files
> which I should be working with? Have any of you worked on this
> modification before?
>
> Your help and suggestions are greatly welcome.
>
> Thanks,
> Rohit Kelapure.
> Graduate Student Computer Science Virginia Tech USA.
>
>
|
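The scanner-tag pairing described in the reply above can be sketched roughly as follows. This is a simplified, self-contained illustration and not HTMLParser's real code: the names ScannerSketch, TagScanner, TableTag, TableScanner, register() and dispatch() are hypothetical stand-ins, and the real HTMLTableTag and HTMLTableScanner would work against the library's own base classes and method signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-ins for the scanner/tag architecture; not HTMLParser classes.
final class ScannerSketch {
    // Plays the role of HTMLTableTag: stores the data extracted from the tag.
    static final class TableTag {
        final String attributes;
        TableTag(String attributes) { this.attributes = attributes; }
    }

    // The two template methods every scanner implements.
    interface TagScanner {
        boolean evaluate(String tagContents); // does this scanner handle the tag?
        Object scan(String tagContents);      // build the tag object from the text
    }

    // Plays the role of HTMLTableScanner.
    static final class TableScanner implements TagScanner {
        @Override
        public boolean evaluate(String tagContents) {
            String s = tagContents.trim().toUpperCase(Locale.ROOT);
            // True only when the tag name itself is TABLE.
            return s.equals("TABLE")
                || (s.startsWith("TABLE") && Character.isWhitespace(s.charAt(5)));
        }

        @Override
        public Object scan(String tagContents) {
            // Everything after the tag name is kept as the attribute text.
            return new TableTag(tagContents.trim().substring("TABLE".length()).trim());
        }
    }

    private static final List<TagScanner> SCANNERS = new ArrayList<>();

    // Registration step: makes the scanner known to the dispatcher.
    static void register(TagScanner scanner) {
        SCANNERS.add(scanner);
    }

    // Asks each registered scanner in turn; returns null if none accepts the tag.
    static Object dispatch(String tagContents) {
        for (TagScanner scanner : SCANNERS) {
            if (scanner.evaluate(tagContents)) {
                return scanner.scan(tagContents);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        register(new TableScanner());
        Object result = dispatch("table border=1");
        if (result instanceof TableTag) {
            System.out.println("Table tag with attributes: " + ((TableTag) result).attributes);
        }
    }
}
```

Keeping evaluate() cheap and separate from scan() lets the dispatcher try many scanners per tag without building objects for the ones that do not match.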
|
From: Somik R. <so...@ya...> - 2002-01-16 14:09:44
|
Hi Folks,
Check http://htmlparser.sourceforge.net for a totally new look.
Design documentation with sample programs has been added.
Feedback is welcome.
Regards,
Somik
|
|
From: Somik R. <so...@ya...> - 2002-01-09 16:36:59
|
Hi Folks,
Another bug was detected in HTMLStyleScanner, and has been immediately
fixed. v1.02 has been released with this fix, and another one - which allows
scanning of Finnish pages to proceed properly.
Regards,
Somik
|
|
From: Somik R. <so...@ya...> - 2002-01-08 17:35:06
|
Hi Folks,
An important bug fix has been done. The parser was crashing on style
tags - this has been fixed.
Regards,
Somik
|
|
From: Somik R. <so...@ya...> - 2002-01-05 17:11:41
|
Hi Folks,
Sorry about that, the zip file that was uploaded seemed to be corrupted.
It's fixed, and you should be able to download it now.
Regards,
Somik
|
|
From: Somik R. <so...@ya...> - 2002-01-03 20:05:24
|
Hi Folks,
A new year present - HTMLParser 1.0 is released. We've finally made the
transition from alpha to a beta stage. Modifications henceforth would only
be of a maintenance nature and the API should remain constant.
There are huge changes in the architecture, and lots of bug fixes. Thanks a
lot to Kaarle Kaaila for some great support and ideas. Thanks also to Rodney
Foley, for some nice ideas for improvement. And thanks to everyone else
who's been supporting this project.
Looking forward to your continuing support, and wishing you a very happy
new year.
Cheers,
Somik
|
|
From: Somik R. <so...@ya...> - 2001-11-13 16:56:18
|
Hi folks,
I have modified the architecture to include the change I spoke of last.
Now, the parser throws an exception if no scanners have been registered.
This feature can be turned off by setting a boolean flag, but by default it
is set to true.
Also, a static method called registerScanners is now available in
HTMLParser, which will register some of the common scanners.
Hopefully, this will alleviate much of the confusion being caused by the
scanner registration process.
Regards,
Somik
|