[Htmlparser-user] How to save <TD> value to unique variables from html tables
From: Henry T. <htr...@ya...> - 2008-06-04 12:43:16
Hi All,
I have been successful in extracting almost all the table data using the following htmlparser statements in Java:
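import org.htmlparser.NodeFilter;
import org.htmlparser.Parser;
import org.htmlparser.filters.AndFilter;
import org.htmlparser.filters.HasAttributeFilter;
import org.htmlparser.filters.HasChildFilter;
import org.htmlparser.filters.OrFilter;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.nodes.TagNode;
import org.htmlparser.util.NodeList;

try {
    Parser parser = new Parser("http://www.abc.com/...");
    NodeList nl = parser.parse(null);
    // Match <td> cells with class "even" or "odd", or with colspan="6" and a <strong> child.
    NodeFilter currenttabledatafilter =
        new AndFilter(
            new TagNameFilter("td"),
            new OrFilter(
                new HasAttributeFilter("class", "even"),
                new OrFilter(
                    new HasAttributeFilter("class", "odd"),
                    new AndFilter(
                        new HasAttributeFilter("colspan", "6"),
                        new HasChildFilter(new TagNameFilter("Strong"))))));
    // true = search recursively through nested nodes
    NodeList a1 = nl.extractAllNodesThatMatch(currenttabledatafilter, true);
    int len = a1.size();
    for (int i = 0; i < len; i++)
    {
        TagNode tag = (TagNode) a1.elementAt(i);
        System.out.println(tag.toPlainTextString());
        // System.out.println(tag.toHtml());
    }
} catch (Exception pe) {
    pe.printStackTrace();
}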
This works well for retrieving all of the table data. However, I would like to save the value of each <td> to its own variable so that the values can be used in the program and ultimately saved to a database. So I am looking for a way to structure the program so that it assigns each value to a unique variable (or inserts it into the database, which I can do once the values are available), for however many HTML tables there are on a web page. Each table has some distinct attributes, but the number of <td> elements varies from table to table. In other words, I am looking for something similar to a loop through a text file, as follows:
While not end of file
(i) identify a new table based on its unique attributes.
(ii) assign the value/content of each <td> in the current table to a unique variable.
(iii) repeat steps (i) and (ii) for the remaining tables.
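Since the number of <td> cells differs per table, one possible way to get "a unique variable" per cell is to keep a list of cell values for each table, keyed by whatever attribute identifies that table. The following is only a minimal sketch of that idea using plain Java collections; the class name TableCellStore, its methods, and the idea of using a string key per table are assumptions for illustration and are not part of htmlparser. The text passed to add() would come from something like tag.toPlainTextString() in the loop above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical helper (not part of htmlparser): groups cell text by table,
 * so each value is addressable as (table key, cell index) instead of
 * needing a separately named variable.
 */
final class TableCellStore {
    // LinkedHashMap keeps tables in the order they were first seen.
    private final Map<String, List<String>> cellsByTable = new LinkedHashMap<>();

    /** Step (ii): record one cell value under the table it belongs to. */
    void add(String tableKey, String cellText) {
        cellsByTable
            .computeIfAbsent(tableKey, k -> new ArrayList<>())
            .add(cellText.trim());
    }

    /** Returns the cells stored for a table, or an empty list if the key is unknown. */
    List<String> cells(String tableKey) {
        List<String> cells = cellsByTable.get(tableKey);
        return cells == null
            ? Collections.<String>emptyList()
            : Collections.unmodifiableList(cells);
    }

    /** Steps (i) and (iii): the keys of all tables seen so far, in order. */
    Set<String> tableKeys() {
        return Collections.unmodifiableSet(cellsByTable.keySet());
    }
}
```

With something like this, the third cell of a table stored under the (made-up) key "results" would be store.cells("results").get(2), and a database insert could loop over tableKeys() and then over cells(key) for each table.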
Thanks a lot,
Henry