htmlparser-user Mailing List for HTML Parser (Page 17)

Brought to you by: derrickoswald

htmlparser-user — The user mailing list for users of the htmlparser library

[Htmlparser-user] Only SourceForge Mailing List with Spam?

From: Athar S. S. <ath...@gm...> - 2009-04-19 02:45:06

This is perhaps the only SourceForge mailing list with spam. You guys
must be doing something right to get the clandestine world of spammers
interested in you (the clandestine world of spammers and the search
engine companies like Google).

-- 
Shiraz

500 Riverside Drive
#425
New York, NY 10027
(703) 879-8342 (skype prefer)
(571) 276 2404 (cell)
(212) 316 8630 (landline) (extn 8630)

Re: [Htmlparser-user] SiteCapturer Not Downloading /Saving Images

From: Athar S. S. <ath...@gm...> - 2009-04-19 02:22:33

I just found a problem with SiteCapturer. Line 686 says:

            if (isToBeCaptured (image))

Inside isToBeCaptured, the images are not recognized as being part of
the website. This code is the culprit:

        return (
            link.toLowerCase ().contains (getSource ().toLowerCase ())
            && (-1 == link.indexOf ("?"))
            && (-1 == link.indexOf ("#")));

Here the source is http://www1.cs.columbia.edu/~shiraz/psetv001.htm
and the link may be:
http://www1.cs.columbia.edu/~shiraz/psetv001_files/image001.jpg
The link does not contain the full source string, so the check keeps
returning false.
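
The failure described above can be reproduced with plain strings. The following is a minimal sketch that copies the quoted condition into a standalone, hypothetical class (CaptureCheckDemo is not part of the library and does not call SiteCapturer). Using the enclosing directory as the source is only an assumed workaround to show when the check passes; whether SiteCapturer then captures the intended page is not confirmed here.

```java
// Standalone illustration of the quoted check; hypothetical class, not library code.
class CaptureCheckDemo {
    /** Same condition as the quoted return statement, with the source passed in explicitly. */
    static boolean isToBeCaptured(String source, String link) {
        return link.toLowerCase().contains(source.toLowerCase())
            && link.indexOf('?') == -1
            && link.indexOf('#') == -1;
    }

    public static void main(String[] args) {
        String image = "http://www1.cs.columbia.edu/~shiraz/psetv001_files/image001.jpg";

        // Source is the page itself: the image URL does not contain "psetv001.htm".
        System.out.println(isToBeCaptured(
            "http://www1.cs.columbia.edu/~shiraz/psetv001.htm", image)); // false

        // Source is the enclosing directory (assumed workaround): the image URL contains it.
        System.out.println(isToBeCaptured(
            "http://www1.cs.columbia.edu/~shiraz/", image)); // true
    }
}
```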


Re: [Htmlparser-user] SiteCapturer Not Downloading /Saving Images

From: Athar S. S. <ath...@gm...> - 2009-04-19 01:23:10

Good evening everyone,

I am trying to simply screen scrape or download the HTML and images of a
web page. However, I cannot do it with SiteCapturer. Could someone
indicate why I cannot find the images when I run the following code:

        SiteCapturer worker = new SiteCapturer ();
        worker.setSource ("http://www1.cs.columbia.edu/~shiraz/psetv001.htm");
        worker.setTarget ("C:\\Temp\\set\\download1");
        worker.setCaptureResources (true);
        worker.capture ();
        System.out.println("Done!");

I stepped through the code and there don't seem to be any images in the
mImages array.

I would merely like to get the text of a website and a handle on the
accompanying images. Thanks!


[Htmlparser-user] SiteCapturer Not Downloading /Saving Images

From: Athar S. S. <ath...@gm...> - 2009-04-19 00:59:27

I am using the following snippet with SiteCapturer to save images, but
it won't save any images. What is going on?

        SiteCapturer worker = new SiteCapturer ();
        worker.setSource ("http://www1.cs.columbia.edu/~shiraz/psetv001.htm");
        worker.setTarget ("C:\\Temp\\set\\download1");
        worker.setCaptureResources (true);
        worker.capture ();
        System.out.println("Done!");
        System.exit (0);


[Htmlparser-user] Using HTMLParser to extract layout info (through DOM)

From: Snir K. <sk...@gm...> - 2009-03-31 07:48:00

Hi all,

I'm trying to leverage HTMLParser to extract proximity/layout properties (as
one could through the DOM by reading offsetWidth/offsetHeight
recursively on the parents of a given element).

Is this something I can accomplish with the API, and if so, how?

Thanks for all the help.

Cheers,

Nick

Re: [Htmlparser-user] Regarding the Head Tag (yangqike)

From: Pony N. <nth...@gm...> - 2009-03-31 07:21:44

-
Pony Onthusitse Nthatsi
+267 71467530

Re: [Htmlparser-user] extract table

From: Aravind R P. <Ara...@in...> - 2009-03-25 03:10:33

Didn't understand :)

Re: [Htmlparser-user] Help me

From: y. <qik...@16...> - 2009-03-25 02:26:28

why not use org.htmlparser.filters.NodeClassFilter? 


2009-03-25 



yangqike 



From: alaeddine
Sent: 2009-03-24 19:51:20
To: htm...@li...
Cc:
Subject: [Htmlparser-user] Help me

Hi

I would like to extract a table from an HTML page at a URL, but I can't make a filter.

Please help me to do this.

Thank you for your help
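
The NodeClassFilter suggestion in this reply could look roughly like the sketch below. It assumes htmlparser 1.6 on the classpath; the class name TableExtractSketch and the method firstTableCells are hypothetical. It parses from a string to stay self-contained, whereas the thread's new Parser(url) form would be used for a live page. Only td cells are read here; th cells are a separate tag class.

```java
import java.util.ArrayList;
import java.util.List;

import org.htmlparser.Parser;
import org.htmlparser.filters.NodeClassFilter;
import org.htmlparser.tags.TableColumn;
import org.htmlparser.tags.TableRow;
import org.htmlparser.tags.TableTag;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

// Hypothetical helper class, written for illustration only.
class TableExtractSketch {
    /** Returns the cell text of the first table found, one inner list per row. */
    static List<List<String>> firstTableCells(String html) {
        List<List<String>> result = new ArrayList<>();
        try {
            Parser parser = Parser.createParser(html, "UTF-8");
            // Ask the parser for every node that is a TableTag, at any depth.
            NodeList tables = parser.extractAllNodesThatMatch(new NodeClassFilter(TableTag.class));
            if (tables.size() == 0) {
                return result; // no table on the page
            }
            TableTag table = (TableTag) tables.elementAt(0);
            for (TableRow row : table.getRows()) {
                List<String> cells = new ArrayList<>();
                for (TableColumn cell : row.getColumns()) {
                    cells.add(cell.toPlainTextString().trim());
                }
                result.add(cells);
            }
        } catch (ParserException e) {
            // Wrapped so that callers in this sketch need no checked exception handling.
            throw new IllegalStateException(e);
        }
        return result;
    }
}
```

With a filter, the library does the tree walk, so no hand-written loop over the top-level nodes is needed.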

[Htmlparser-user] extract table

From: alaeddine <ala...@sa...> - 2009-03-24 14:51:56

Hi
When I test the following code
/////////////////////////
Parser parser = new Parser(url);
NodeList nl = parser.parse(null);

for (NodeIterator iterator = nl.elements(); iterator.hasMoreNodes();) {
    Node node = iterator.nextNode();
    if (node instanceof Tag) {
        Tag tag = (Tag) node;
    }
}
//////////////////

I usually get a result that falls outside the test 'if (node instanceof Tag) {'.

So how can I advance to the next node and test whether the tag name is
body or not?

Thank you for your help

Re: [Htmlparser-user] Help me

From: Aravind R P. <Ara...@in...> - 2009-03-24 11:55:16

Hi

Parser parser = new Parser(url);
NodeList nl = parser.parse(null);

This will give you the first set of nodes, that is, every node inside the <html> tag.

for (NodeIterator iterator = nl.elements(); iterator.hasMoreNodes();) {
    Node node = iterator.nextNode();
    if (node instanceof Tag) {
        Tag tag = (Tag) node;
        // compare tag.getTagName () to "BODY" here
    }
}

This way you get every node and cast it to Tag; from that you can get the tag name and compare it to "BODY".
Once the body tag is obtained, take its children and repeat the same process with a for loop until you get the tag name "TABLE".

You have to iterate through every tag; there is no other way. Try using recursion.
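
The recursive walk suggested here might be sketched as follows, assuming htmlparser 1.6 on the classpath. The names TagWalkSketch, findTags, and collect are hypothetical, and the sketch parses from a string rather than a URL so that it is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

import org.htmlparser.Node;
import org.htmlparser.Parser;
import org.htmlparser.Tag;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

// Hypothetical helper class, written for illustration only.
class TagWalkSketch {
    /** Parses the HTML and returns every start tag whose name matches tagName. */
    static List<Tag> findTags(String html, String tagName) {
        List<Tag> found = new ArrayList<>();
        try {
            Parser parser = Parser.createParser(html, "UTF-8");
            collect(parser.parse(null), tagName, found);
        } catch (ParserException e) {
            throw new IllegalStateException(e);
        }
        return found;
    }

    /** Depth-first walk: check each node, then descend into its children. */
    private static void collect(NodeList nodes, String tagName, List<Tag> found) {
        if (nodes == null) {
            return; // leaf node: getChildren() returned null
        }
        for (int i = 0; i < nodes.size(); i++) {
            Node node = nodes.elementAt(i);
            if (node instanceof Tag) {
                Tag tag = (Tag) node;
                if (!tag.isEndTag() && tagName.equalsIgnoreCase(tag.getTagName())) {
                    found.add(tag);
                }
            }
            collect(node.getChildren(), tagName, found);
        }
    }
}
```

Calling findTags(html, "TABLE") returns the table tags at any nesting depth, so there is no need to locate the body tag first.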

From: alaeddine [mailto:ala...@sa...]
Sent: Tuesday, March 24, 2009 5:02 PM
To: htm...@li...
Subject: [Htmlparser-user] Help me

Hi

I would like to extract a table from an HTML page at a URL, but I can't make a filter.

Please help me to do this.

Thank you for your help

**************** CAUTION - Disclaimer *****************
This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely 
for the use of the addressee(s). If you are not the intended recipient, please 
notify the sender by e-mail and delete the original message. Further, you are not 
to copy, disclose, or distribute this e-mail or its contents to any other person and 
any such actions are unlawful. This e-mail may contain viruses. Infosys has taken 
every reasonable precaution to minimize this risk, but is not liable for any damage 
you may sustain as a result of any virus in this e-mail. You should carry out your 
own virus checks before opening the e-mail or attachment. Infosys reserves the 
right to monitor and review the content of all messages sent to or from this e-mail 
address. Messages sent to or from this e-mail address may be stored on the 
Infosys e-mail system.
***INFOSYS******** End of Disclaimer ********INFOSYS***

[Htmlparser-user] Help me

From: alaeddine <ala...@sa...> - 2009-03-24 11:49:41

Hi

I would like to extract a table from an HTML page at a URL, but I can't make a filter.

Please help me to do this.

Thank you for your help

Re: [Htmlparser-user] Regarding the Head Tag

From: yangqike <qik...@16...> - 2009-03-17 08:14:16

org.htmlparser.tags.HeadTag 


2009-03-17 



yangqike 



From: Aravind R Pillai
Sent: 2009-03-17 14:35:24
To: htm...@li...
Cc:
Subject: [Htmlparser-user] Regarding the Head Tag

Hi

I am pretty new to HTML Parser and need help extracting and editing a particular set of tags in the HTML. I was going through the tutorial and I found this bit of code.

Head head = heads.elementAt (0);

I can’t find the “Head” class. Can anyone please help me?

The example is listed at http://htmlparser.sourceforge.net/javadoc/index.html under the parse method.
 
Any help is greatly appreciated.
 
Regards,
Aravind R Pillai

[Htmlparser-user] Regarding the Head Tag

From: Aravind R P. <Ara...@in...> - 2009-03-17 06:31:04

Hi

I am pretty new to HTML Parser and need help extracting and editing a particular set of tags in the HTML. I was going through the tutorial and I found this bit of code.

Head head = heads.elementAt (0);

I can't find the "Head" class. Can anyone please help me?

The example is listed at http://htmlparser.sourceforge.net/javadoc/index.html under the parse method.

Any help is greatly appreciated.

Regards,
Aravind R Pillai


Re: [Htmlparser-user] Avoid tag balancing

From: <qik...@16...> - 2009-02-22 12:06:58

In CompositeTagScanner.java, line 125, please comment out
else if (isTagToBeEndedFor (ret, next)) // check DTD
=>
//else if (isTagToBeEndedFor (ret, next)) // check DTD

For the other tags, you can comment out the balancing code in the same way.

For the style tag, in StyleScanner.java, comment out the following code:
/*
// build new end tag if required
if (null == node)
        {
            attribute = new Attribute ("/style", null);
            vector = new Vector ();
            vector.addElement (attribute);
            node = lexer.getNodeFactory ().createTagNode (
                lexer.getPage (), position, position, vector);
        }

*/



======= 2009-02-22 17:09 11:09:25 Roy Michael wrote in the message: [Htmlparser-user] Avoid tag balancing =======


I am using version 1.6, and I am wondering if there is a way to avoid or disable the tag balancing operation when using the parser.
I still need the parser functionality over the lexer, but I want to avoid the tag balancing operations, even if the document is malformed.
Roy.


= = = = = = = = = = = = = = = = = = = =

[Htmlparser-user] Avoid tag balancing

From: Roy M. <roy...@gm...> - 2009-02-22 09:09:47

I am using version 1.6, and I am wondering if there is a way to avoid or
disable the tag balancing operation when using the parser.
I still need the parser functionality over the lexer, but I want to avoid
the tag balancing operations, even if the document is malformed.

 

Roy.

Re: [Htmlparser-user] HTTP header user-agent property (Ian Macfarlane)

From: Pony N. <nth...@gm...> - 2009-02-16 06:48:09

-- 
Pony Onthusitse Nthatsi
+267 71467530

Re: [Htmlparser-user] Contents of Htmlparser-user digest...

From: Pony N. <nth...@gm...> - 2009-02-16 06:46:45

-- 
Pony Onthusitse Nthatsi
+267 71467530

Re: [Htmlparser-user] replace links

From: yangqike <qik...@16...> - 2009-02-12 03:17:45

Case 1: Do you mean you want to replace every LinkTag with some other tag (MyTag)?
Case 2: Or do you just want to find every LinkTag and then modify its href property?

Solution for case 1: use the NodeClassFilter, then copy all the properties to your MyTag (remember to remove the LinkTag and add your tag).
Solution for case 2: use the NodeClassFilter, then process each node in the returned NodeList.

Is my understanding right?
If I am wrong, please ignore the above.


2009-02-12 



yangqike 



From: Randy Paries
Sent: 2009-02-11 23:36:24
To: htmlparser-user
Cc:
Subject: [Htmlparser-user] replace links

Not sure if anyone even responds to this list anymore.
I am trying to figure out a way of replacing all PDF links in an HTML doc.
I was thinking of using htmlparser.
I can use the parser to find all the LinkTag objects, but I'm not sure how
to replace nodes as well as find them, or whether that is even possible.
Thanks
Randy
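
Case 2 in the reply above (find every LinkTag and modify its href) might be sketched like this, assuming htmlparser 1.6 on the classpath. The names PdfLinkSketch, rewritePdfLinks, and newPrefix are hypothetical, and prefixing the old href is just one illustrative rewrite rule. The sketch edits the tags in place and then serializes the whole tree again.

```java
import org.htmlparser.Parser;
import org.htmlparser.filters.NodeClassFilter;
import org.htmlparser.tags.LinkTag;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

// Hypothetical helper class, written for illustration only.
class PdfLinkSketch {
    /** Prefixes the href of every link ending in ".pdf" and returns the rewritten HTML. */
    static String rewritePdfLinks(String html, String newPrefix) {
        try {
            Parser parser = Parser.createParser(html, "UTF-8");
            NodeList document = parser.parse(null);
            // Recursive search; the returned list holds the same tag objects as the tree.
            NodeList links = document.extractAllNodesThatMatch(
                new NodeClassFilter(LinkTag.class), true);
            for (int i = 0; i < links.size(); i++) {
                LinkTag link = (LinkTag) links.elementAt(i);
                String href = link.getAttribute("href");
                if (href != null && href.toLowerCase().endsWith(".pdf")) {
                    link.setAttribute("href", newPrefix + href);
                }
            }
            // Serializing the tree picks up the modified attributes.
            return document.toHtml();
        } catch (ParserException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Replacing a LinkTag with a different node type (case 1) would involve editing the parent's child list instead, which this sketch does not attempt.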

[Htmlparser-user] replace links

From: Randy P. <rtp...@gm...> - 2009-02-11 15:35:25

Not sure if anyone even responds to this list anymore.

I am trying to figure out a way of replacing all PDF links in an HTML doc.

I was thinking of using htmlparser.

I can use the parser to find all the LinkTag objects, but I'm not sure how
to replace nodes as well as find them, or whether that is even possible.

Thanks

Randy

[Htmlparser-user] text inside the font tag

From: Kadir V. <abd...@ba...> - 2009-02-05 15:13:22

How can I get the text inside the font tag?
For example: <font id="bla">requested text</font>

thanks

[Htmlparser-user] text inside the font tag

From: Kadir V. <abd...@ba...> - 2009-02-05 14:14:07

How can I get the text inside the font tag?
For example: <font id="bla">requested text</font>

thanks

[Htmlparser-user] How to extract only the "viewable" text? (Not scripts and comments, etc.)

From: Peter A. D. <pet...@gm...> - 2009-01-22 02:41:15

I'm using htmlparser very successfully for specific tag extraction,
but am having trouble implementing a plain text export for a
"word count" function.

I have spent half of today in the JavaDoc and experimenting, trying to get
only the "printable" words on a page.  I cannot get the JavaScript to
be excluded, although I'm able to exclude the script tags
themselves (the script body still prints) using the NotFilter class
combined with a ScriptTag filter.

Am I not going about this correctly?  Maybe a better question is how I
should be going about trying to do this?  I can think of complicated
ways I could use brute force to make this work, but it seems as if
there is a simple and elegant solution I am missing.

Thank you for any help,

-Pete
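
One route that may fit this question, though nobody in the thread confirms it, is the library's StringBean visitor, which as far as I know collects text nodes while skipping script and style bodies and comments. The sketch below assumes htmlparser 1.6 on the classpath; VisibleTextSketch, visibleText, and wordCount are hypothetical names, and the whitespace-based word count is a deliberate simplification.

```java
import org.htmlparser.Parser;
import org.htmlparser.beans.StringBean;
import org.htmlparser.util.ParserException;

// Hypothetical helper class, written for illustration only.
class VisibleTextSketch {
    /** Returns the text a reader would see, without script or style bodies. */
    static String visibleText(String html) {
        try {
            Parser parser = Parser.createParser(html, "UTF-8");
            StringBean bean = new StringBean();
            bean.setLinks(false);   // do not append link URLs to the text
            bean.setCollapse(true); // collapse runs of whitespace
            parser.visitAllNodesWith(bean);
            return bean.getStrings();
        } catch (ParserException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts whitespace-separated words; an empty or blank string counts as zero. */
    static int wordCount(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }
}
```

A call such as wordCount(visibleText(html)) then gives a rough word count for the page.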

[Htmlparser-user] About the Texts Generated by Javascript

From: Guo Y. <bli...@gm...> - 2009-01-18 08:20:01

Dear all,
     I hope someone can help me. When using HTML Parser to parse
web pages and grab certain texts, I noticed that some text shown in IE
cannot be found in the HTML source. I think it is generated
dynamically by JavaScript. So, is there a way to get the whole page with all
the text that has been generated by JavaScript?

      Thank you for your patience.

-- 
Yang Guo

[Htmlparser-user] HTMLParser and meta http-equiv redirects

From: Thushara W. <th...@gm...> - 2009-01-15 01:23:41

Can HTMLParser follow redirects set by this type of meta tag:

<meta http-equiv="refresh" content="0;url=http://www.myblog.net/thatpage/" />

It seems that HTMLParser and HttpURLConnection follow the standard HTTP
redirect, but not this meta refresh form of redirect.

thanks,
thushara
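
If the redirect has to be followed by hand, the target can be pulled out of the tag's content value and fetched in a second request. The sketch below covers only that extraction step, using the standard library. MetaRefreshSketch and refreshTarget are hypothetical names, and the pattern handles only the common "seconds; url=target" form. Reading the content value from the parsed page (for example through the library's MetaTag class) is assumed and not shown.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, written for illustration only.
class MetaRefreshSketch {
    // "<seconds> ; url = <target>", with optional whitespace and optional quotes around the target.
    private static final Pattern REFRESH = Pattern.compile(
        "^\\s*\\d+\\s*;\\s*url\\s*=\\s*['\"]?([^'\"]+?)['\"]?\\s*$",
        Pattern.CASE_INSENSITIVE);

    /** Returns the redirect target from a meta refresh content value, or null if there is none. */
    static String refreshTarget(String content) {
        if (content == null) {
            return null;
        }
        Matcher m = REFRESH.matcher(content);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Content value taken from the tag quoted in the message above.
        System.out.println(refreshTarget("0;url=http://www.myblog.net/thatpage/"));
        // prints http://www.myblog.net/thatpage/
    }
}
```

A content value with only a delay, such as "5", yields null, which a caller can treat as "reload, no redirect".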

790 messages have been excluded from this view by a project administrator.
