Re: [Htmlparser-user] How to get the content of the column

Dink,

The text of the column's children can be retrieved in two ways:

System.out.println (column.getChildren ().asString ());

or

StringBean sb = new StringBean ();
column.getChildren ().visitAllNodesWith (sb);
System.out.println (sb.getStrings ());

The second way handles line breaks and other whitespace better, because
StringBean collapses runs of whitespace and starts a new line at tags
that break the text flow.
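
The two approaches above can be put together as a small standalone program. This is only a sketch, assuming an HTMLParser release (1.5 or later) is on the classpath; the class name ColumnText and its helper methods are made up for illustration, and the TD is located directly with a TagNameFilter rather than through TableTag.

```java
import org.htmlparser.Parser;
import org.htmlparser.beans.StringBean;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.tags.TableColumn;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

class ColumnText {
    // Hypothetical helper: parse the markup and return the children of the first TD.
    static NodeList firstCellChildren(String html) throws ParserException {
        Parser parser = Parser.createParser(html, "UTF-8");
        NodeList cells = parser.parse(new TagNameFilter("TD"));
        TableColumn column = (TableColumn) cells.elementAt(0);
        return column.getChildren();
    }

    // First way: concatenate the plain text of every child node.
    static String viaAsString(String html) throws ParserException {
        return firstCellChildren(html).asString();
    }

    // Second way: let a StringBean visit the children and collect their text.
    static String viaStringBean(String html) throws ParserException {
        StringBean sb = new StringBean();
        firstCellChildren(html).visitAllNodesWith(sb);
        return sb.getStrings();
    }

    public static void main(String[] args) throws ParserException {
        String html = "<table><TR><TD><b>HTML</b></TD></TR></table>";
        System.out.println(viaAsString(html));
        System.out.println(viaStringBean(html));
    }
}
```

Both methods skip the <b> and </b> tag nodes, since only text nodes contribute to the result.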

As for tag attributes, the approach depends on whether you want all of
the attributes or one in particular. For a single attribute, use
Tag.getAttribute (String); for all of them, use Tag.getAttributesEx ()
together with the Attribute class.
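
For illustration, here is a rough sketch of both attribute lookups, again assuming HTMLParser 1.5 or later on the classpath. TagAttributes and its methods are hypothetical names. The loop assumes that the Vector returned by getAttributesEx () also holds entries for the tag name and for whitespace, so it keeps only entries that have both a name and a value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.htmlparser.Attribute;
import org.htmlparser.Parser;
import org.htmlparser.Tag;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

class TagAttributes {
    // Hypothetical helper: return the first start tag with the given name.
    static Tag firstTag(String html, String tagName) throws ParserException {
        Parser parser = Parser.createParser(html, "UTF-8");
        NodeList matches = parser.parse(new TagNameFilter(tagName));
        return (Tag) matches.elementAt(0);
    }

    // One particular attribute: Tag.getAttribute (String).
    static String oneAttribute(String html, String tagName, String attrName)
            throws ParserException {
        return firstTag(html, tagName).getAttribute(attrName);
    }

    // All attributes: walk Tag.getAttributesEx () and keep name=value pairs.
    // Stand-alone attributes (no value) are skipped in this sketch.
    static Map<String, String> valuedAttributes(String html, String tagName)
            throws ParserException {
        Map<String, String> result = new LinkedHashMap<String, String>();
        for (Object o : firstTag(html, tagName).getAttributesEx()) {
            Attribute attr = (Attribute) o;
            if (attr.getName() != null && attr.getValue() != null) {
                result.put(attr.getName(), attr.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) throws ParserException {
        String html = "<b attribute=xxx>HTML</b>";
        System.out.println(oneAttribute(html, "B", "attribute"));
        System.out.println(valuedAttributes(html, "B"));
    }
}
```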

Derrick

dink wrote:

> Hello,
> I am new to the HTML parser and would like to thank the
> contributors to this tool.
> I ran into a problem while trying to get the content of a table.
> The table I want to parse looks like this:
> <table>
> <TR>
> <TD><b>HTML</b></TD>
> </TR>
> </table>
> The code used is:
> NodeList tables = parser.parse (new TagNameFilter ("TABLE"));
> TableTag table = (TableTag) tables.elementAt(0); //There is only one Table
> TableRow row = table.getRow (0); //There is only one TR
> TableColumn column = row.getColumns ()[0]; //get the TD
> System.out.println (column.getChildren ()); //print the content of TD
> The output is:
> 0 tag:b
> 1 txt:HTML
> 2 end:/b
> Can somebody tell me how to get only the content, "HTML"? And if the
> <b> tag has attributes, e.g. <b attribute=xxx>, how can I get the
> attribute value "xxx"?
> Thanks in advance.
> Dink Lo