[Htmlparser-user] How to save <TD> value to unique variables from html tables
From: Henry T. <htr...@ya...> - 2008-06-04 12:43:16
Hi All,

I have been successful in extracting almost all the table data using the following htmlparser statements in Java:

    import org.htmlparser.NodeFilter;
    import org.htmlparser.Parser;
    import org.htmlparser.filters.*;
    import org.htmlparser.nodes.TagNode;
    import org.htmlparser.util.NodeList;

    try {
        Parser parser = new Parser("http://www.abc.com/...");
        NodeList nl = parser.parse(null);

        // Match <td> cells that are class="even", class="odd",
        // or colspan="6" with a <strong> child.
        NodeFilter currenttabledatafilter =
            new AndFilter(
                new TagNameFilter("td"),
                new OrFilter(
                    new HasAttributeFilter("class", "even"),
                    new OrFilter(
                        new HasAttributeFilter("class", "odd"),
                        new AndFilter(
                            new HasAttributeFilter("colspan", "6"),
                            new HasChildFilter(new TagNameFilter("Strong"))))));

        // Search recursively through the whole parsed page.
        NodeList a1 = nl.extractAllNodesThatMatch(currenttabledatafilter, true);
        int len = a1.size();
        for (int i = 0; i < len; i += 1) {
            TagNode tag = (TagNode) a1.elementAt(i);
            System.out.println(tag.toPlainTextString());
            // System.out.println(tag.toHtml());
        }
    } catch (Exception pe) {
        pe.printStackTrace();
    }

This is great for retrieving all of the table data. However, I would like to save the value of each <td> to a unique variable so that the values can be used in the program and ultimately saved to a database.

So I am looking to structure a program that assigns each value to a unique variable (or inserts it into the database, which I can do once the values are available) for however many HTML tables there are on a web page. Each table has some distinct attributes, but the number of <td> cells in each table varies.

In other words, I am looking for something similar to looping through a text file, as follows:

While not end of page:
(i) identify a new table based on its unique attributes;
(ii) assign the value/content of each <td> in the current table to a unique variable;
(iii) repeat steps (i) and (ii) for the remaining tables.

Thanks a lot,
Henry
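Steps (i) to (iii) above could be sketched roughly as follows. This is only a minimal illustration, not htmlparser API: because the number of <td> cells varies per table, it keeps the values in a map of lists rather than in individually named variables. The class TableCellCollector, its methods, and the table ids "prices" and "summary" are all hypothetical names. It assumes the cell text has already been obtained elsewhere, for example from tag.toPlainTextString().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups cell text by table so that each value
// can be addressed as (tableId, index) instead of a unique variable.
class TableCellCollector {

    // LinkedHashMap keeps the tables in the order they were first seen.
    private final Map<String, List<String>> cellsByTable = new LinkedHashMap<>();

    // Steps (i) and (ii): file one cell's text under the table it belongs to.
    // The tableId is whatever distinguishes the table (an assumption here).
    public void addCell(String tableId, String cellText) {
        cellsByTable
            .computeIfAbsent(tableId, k -> new ArrayList<>())
            .add(cellText.trim());
    }

    // Look up a single stored value by table and position.
    public String get(String tableId, int index) {
        return cellsByTable.get(tableId).get(index);
    }

    public Map<String, List<String>> asMap() {
        return cellsByTable;
    }

    public static void main(String[] args) {
        TableCellCollector collector = new TableCellCollector();

        // Stand-in data; in the real program these strings would come
        // from the parsed <td> nodes of each table.
        collector.addCell("prices", " 1.25 ");
        collector.addCell("prices", "2.50");
        collector.addCell("summary", "Total");

        // Step (iii): walk every table and every cell, e.g. to build
        // database inserts.
        for (Map.Entry<String, List<String>> table : collector.asMap().entrySet()) {
            List<String> cells = table.getValue();
            for (int i = 0; i < cells.size(); i++) {
                System.out.println(table.getKey() + "[" + i + "] = " + cells.get(i));
            }
        }
    }
}
```

With this layout, a value that would otherwise need its own variable is read back as collector.get("prices", 0), and the nested loop in main is the place where a database insert per cell could go.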