Re: [Htmlparser-user] How to get the content of the column
From: Derrick O. <Der...@Ro...> - 2005-12-13 03:00:08
Dink,
The text of the column's children can be retrieved in two ways:
System.out.println (column.getChildren ().asString ());
or
StringBean sb = new StringBean ();
column.getChildren ().visitAllNodesWith (sb);
System.out.println (sb.getStrings ());
The second way has better handling of line breaks and other whitespace.
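For reference, here is a minimal sketch of both approaches run against the table from the question. It assumes the HTMLParser jar (the org.htmlparser packages) is on the classpath; the class name ColumnText and its helper methods are made up for illustration, and Parser.createParser is used only so the markup can be fed in as a string.

```java
import org.htmlparser.Parser;
import org.htmlparser.beans.StringBean;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.tags.TableColumn;
import org.htmlparser.tags.TableRow;
import org.htmlparser.tags.TableTag;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

// Hypothetical demo class; only the org.htmlparser calls come from the library.
public class ColumnText {
    // Sample markup copied from the question.
    static final String HTML = "<table><TR><TD><b>HTML</b></TD></TR></table>";

    // Locates the first TD of the first TR of the first TABLE.
    static TableColumn firstColumn(String html) throws ParserException {
        Parser parser = Parser.createParser(html, "UTF-8");
        NodeList tables = parser.parse(new TagNameFilter("TABLE"));
        TableTag table = (TableTag) tables.elementAt(0);
        TableRow row = table.getRow(0);
        return row.getColumns()[0];
    }

    // First way: concatenate the plain text of every child node.
    static String viaAsString(TableColumn column) {
        return column.getChildren().asString();
    }

    // Second way: let a StringBean visit the children and collect their text.
    static String viaStringBean(TableColumn column) throws ParserException {
        StringBean sb = new StringBean();
        column.getChildren().visitAllNodesWith(sb);
        return sb.getStrings();
    }

    public static void main(String[] args) throws ParserException {
        TableColumn column = firstColumn(HTML);
        System.out.println(viaAsString(column));
        System.out.println(viaStringBean(column));
    }
}
```

For a cell this small the two results should look the same; the difference only shows up once the cell contains line breaks or runs of whitespace.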
As for attributes of tags, there are several ways, depending on whether
you want all the attributes or one in particular.
Look at Tag.getAttribute (String) for a single value, or
Tag.getAttributesEx () along with the Attribute class for the full list.
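A rough sketch of both attribute lookups, using the <b attribute=xxx> example from the question. As before, it assumes the HTMLParser jar is on the classpath; the class name BoldAttributes and the helpers firstBold and namedAttributes are invented for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.htmlparser.Attribute;
import org.htmlparser.Parser;
import org.htmlparser.Tag;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

// Hypothetical demo class; only the org.htmlparser calls come from the library.
public class BoldAttributes {
    // The question's table, with an attribute added to the <b> tag.
    static final String HTML =
        "<table><TR><TD><b attribute=xxx>HTML</b></TD></TR></table>";

    // Finds the first <b> start tag anywhere in the markup.
    static Tag firstBold(String html) throws ParserException {
        Parser parser = Parser.createParser(html, "UTF-8");
        NodeList bolds = parser.parse(new TagNameFilter("B"));
        return (Tag) bolds.elementAt(0);
    }

    // Copies every name/value pair out of Tag.getAttributesEx().
    // Entries lacking a name or a value (the tag name itself, whitespace,
    // valueless attributes) are skipped.
    static Map<String, String> namedAttributes(Tag tag) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Object o : tag.getAttributesEx()) {
            Attribute attribute = (Attribute) o;
            if (attribute.getName() != null && attribute.getValue() != null) {
                result.put(attribute.getName(), attribute.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) throws ParserException {
        Tag bold = firstBold(HTML);
        // One particular attribute, looked up by name.
        System.out.println(bold.getAttribute("attribute"));
        // All attributes that have both a name and a value.
        System.out.println(namedAttributes(bold));
    }
}
```

Tag.getAttribute (String) is the shorter route when the attribute name is known in advance; walking getAttributesEx () is only needed when the names are not known up front.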
Derrick
dink wrote:
> Hello,
> I am new to the HTML parser and would like to thank the
> contributors to this tool.
> I have run into a problem getting the content of a table cell.
> The table I want to parse looks like this:
> <table>
> <TR>
> <TD><b>HTML</b></TD>
> </TR>
> </table>
> The code used is:
> NodeList tables = parser.parse (new TagNameFilter ("TABLE"));
> TableTag table = (TableTag) tables.elementAt (0); // there is only one table
> TableRow row = table.getRow (0); // there is only one TR
> TableColumn column = row.getColumns ()[0]; // get the TD
> System.out.println (column.getChildren ()); // print the content of the TD
> The output is:
> 0 tag:b
> 1 txt:HTML
> 2 end:/b
> Can somebody tell me how to get only the content, "HTML"? And if the
> tag <b> has attributes, e.g. <b attribute=xxx>, how can I
> get the attribute value "xxx"?
> Thanks in advance.
> Dink Lo