Thread: [Htmlparser-user] Extract Data from Table Row Question.
From: andrew d. <and...@ho...> - 2006-09-06 11:01:25
Hello all, and thanks for looking at my question.

I am still new to Java and HtmlParser. I have a series of web pages stored offline that I need to process. They are made up of tables. I can find the table tag and then all the table rows, but the next bit is stumping me: how do I read the TD values, or how do I check individual tags to see whether there is more processing to do? (See the source example below.)

Many thanks for any help.

    public static void process(NodeList listx) {
        // Scan for "tr" tags and extract info
        NodeList TableList = listx.extractAllNodesThatMatch(new TagNameFilter("tr"));
        for (int x = 0; x < TableList.size(); x++) {
            // Process nodes or tags; this is what is stumping me:
            // 1. How do I read all TDs from nodes with, say, the format
            //    <TD class="listi"> etc. and get their value?
            // 2. Or how do I get individual tags for further processing?
        }
    }

    public static void main(String[] args) {
        try {
            parser = new Parser("c:\\HtmlTest0002.htm");
            // Look for the table tag
            list = parser.parse(new TagNameFilter("table"));
            for (int x = 0; x < list.size(); x++) {
                // Is it the right table?
                if (list.elementAt(x).toString().contains("listme")) {
                    // Get all children and process them
                    process(list.elementAt(x).getChildren());
                }
            }
        } catch (ParserException ex) {
            ex.printStackTrace();
        }
    }
    }
From: Derrick O. <Der...@Ro...> - 2006-09-07 11:50:47
Andrew,

You could use a filter on the row NodeList, something like:

    NodeList td_tags = TableList.extractAllNodesThatMatch(
        new AndFilter(new TagNameFilter("TD"), new HasAttributeFilter("class", "listi")));

Once you have the tags you can fetch their text contents with a StringBean:

    StringBean sb = new StringBean();
    td_tags.visitAllNodesWith(sb);
    System.out.println(sb.getStrings());

Derrick
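
The two steps in the reply (select the TD cells whose class is "listi", then collect their text) can be tried without the HtmlParser jar. The sketch below is a stand-in that uses the JDK's built-in javax.swing.text.html parser, not the HtmlParser API discussed in this thread. The class name TdTextExtractor, the method extract, and the sample HTML are all hypothetical, and nested tables inside a matching cell are not handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: a JDK-only stand-in, not part of HtmlParser.
public class TdTextExtractor {

    /** Returns the trimmed text of every TD whose class attribute equals cssClass. */
    public static List<String> extract(String html, String cssClass) throws IOException {
        List<String> cells = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            // Non-null only while the parser is inside a matching TD.
            private StringBuilder current;

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                // Same test as AndFilter(TagNameFilter, HasAttributeFilter) in the reply.
                if (tag == HTML.Tag.TD
                        && cssClass.equals(attrs.getAttribute(HTML.Attribute.CLASS))) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (current != null) {
                    current.append(data);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.TD && current != null) {
                    cells.add(current.toString().trim());
                    current = null;
                }
            }
        };
        // The third argument tells the parser to ignore charset declarations.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return cells;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample input for illustration only.
        String html = "<table id=\"listme\">"
                + "<tr><td class=\"listi\">Alpha</td><td class=\"other\">skip</td></tr>"
                + "<tr><td class=\"listi\">Beta</td></tr>"
                + "</table>";
        System.out.println(extract(html, "listi"));
    }
}
```

On the HtmlParser side, NodeList.extractAllNodesThatMatch also has a two-argument overload that takes a boolean recursive flag. If the one-argument call in the reply comes back empty, that may be because the TD nodes are children of the TR nodes in the list rather than elements of the list itself, in which case passing true as the second argument is worth trying.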
From: andrew d. <and...@ho...> - 2006-09-07 18:06:16
Thank you for this, it was just what was needed.