Thread: [Htmlparser-user] Reading HTML doc
From: <abh...@hs...> - 2006-03-03 11:03:27
Hi,

I am going through the HtmlParser classes to develop a utility that reads HTML from a Java program.

My HTML document has content like this:

<H2>My Name</H2><H3>Address</H3>
<P>It is not useful</P><H3>Age</H3>
<P>It is important</P>

I need to read the content inside the heading tags (<H1>, <H2>) and the corresponding <P> tags. How can I do this, or how should I get started?

Thanks in advance,
Abhijeet
From: Derrick O. <Der...@Ro...> - 2006-03-03 12:35:20
I would suggest trying the FilterBuilder utility. You'll want things like TagNameFilter to get the <H3> and HasParent/HasChild/HasSibling filters to navigate around the node tree.
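For illustration only, here is a dependency-free sketch of the pairing those filters are meant to achieve: find each <H3> and take the <P> that follows it as its sibling. It does not use the HTMLParser classes named above; it swaps in a java.util.regex pattern instead, which is only adequate for flat, well-formed markup like the sample in the question. The class name HeadingParagraphs and the method extract are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeadingParagraphs {
    // An <H3> heading followed (whitespace aside) by a <P> paragraph.
    // Assumption: tags carry no attributes and are not nested.
    private static final Pattern PAIR = Pattern.compile(
            "<h3>(.*?)</h3>\\s*<p>(.*?)</p>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Maps each <H3> heading text to the text of the <P> right after it. */
    static Map<String, String> extract(String html) {
        Map<String, String> result = new LinkedHashMap<>(); // keeps document order
        Matcher m = PAIR.matcher(html);
        while (m.find()) {
            result.put(m.group(1).trim(), m.group(2).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample markup taken from the question.
        String html = "<H2>My Name</H2><H3>Address</H3> <P>It is not useful</P>"
                + "<H3>Age</H3> <P>It is important</P>";
        extract(html).forEach((heading, text) ->
                System.out.println(heading + " -> " + text));
    }
}
```

Run against the sample from the question, this prints "Address -> It is not useful" and "Age -> It is important". For real-world HTML with attributes, nesting, or unclosed tags, the parser-based filters are the sturdier choice.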
From: Konstantine <lis...@gm...> - 2006-03-05 08:52:24
On 3/3/06, Derrick Oswald <Der...@ro...> wrote:
> I would suggest trying the FilterBuilder utility.
> You'll want things like TagNameFilter to get the <H3> and
> HasParent/HasChild/HasSibling filters to navigate around the node tree.
<snip>

thanks for the reply and your time. I was going through API, it's
pretty cool, although documentation is somewhat lacking.
From: Ian M. <ian...@gm...> - 2006-03-06 12:47:20
May I also suggest you have a look at the NodeTreeWalker class in CVS? It lets you navigate a Node tree iteratively in breadth-first or depth-first fashion.

Ian

On 05/03/06, Konstantine <lis...@gm...> wrote:
> On 3/3/06, Derrick Oswald <Der...@ro...> wrote:
> > I would suggest trying the FilterBuilder utility.
> > You'll want things like TagNameFilter to get the <H3> and
> > HasParent/HasChild/HasSibling filters to navigate around the node tree.
> <snip>
>
> thanks for the reply and your time. I was going through API, it's
> pretty cool, although documentation is somewhat lacking.
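To make the two traversal orders concrete, here is a small self-contained sketch of iterative depth-first and breadth-first walks over a node tree. It is not the NodeTreeWalker class and does not reproduce its API; SimpleNode, TreeWalkSketch, depthFirst, and breadthFirst are hypothetical names standing in for a parsed tree and its walker.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TreeWalkSketch {
    /** Hypothetical stand-in for a parsed node: a label plus child nodes. */
    static final class SimpleNode {
        final String label;
        final List<SimpleNode> children = new ArrayList<>();

        SimpleNode(String label, SimpleNode... kids) {
            this.label = label;
            for (SimpleNode kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Iterative depth-first (pre-order) walk using an explicit stack. */
    static List<String> depthFirst(SimpleNode root) {
        List<String> visited = new ArrayList<>();
        Deque<SimpleNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            SimpleNode node = stack.pop();
            visited.add(node.label);
            // Push children in reverse so the leftmost child is popped first.
            for (int i = node.children.size() - 1; i >= 0; i--) {
                stack.push(node.children.get(i));
            }
        }
        return visited;
    }

    /** Iterative breadth-first (level-order) walk using a FIFO queue. */
    static List<String> breadthFirst(SimpleNode root) {
        List<String> visited = new ArrayList<>();
        Deque<SimpleNode> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            SimpleNode node = queue.remove();
            visited.add(node.label);
            queue.addAll(node.children);
        }
        return visited;
    }

    public static void main(String[] args) {
        SimpleNode root = new SimpleNode("BODY",
                new SimpleNode("DIV", new SimpleNode("H3"), new SimpleNode("P")),
                new SimpleNode("TABLE", new SimpleNode("TR")));
        System.out.println(depthFirst(root));   // [BODY, DIV, H3, P, TABLE, TR]
        System.out.println(breadthFirst(root)); // [BODY, DIV, TABLE, H3, P, TR]
    }
}
```

The only difference between the two walks is the container: a stack yields depth-first order, a queue yields breadth-first order. Neither needs recursion, so deep trees do not risk a stack overflow.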