Re: [Htmlparser-user] Only extract text from div tag with specific attribute
From: Jumbo P. <jum...@gm...> - 2008-04-01 22:06:27
Thanks for the reply, Joshua. I think that's what I'm trying to do. The
part I'm stuck on is how to select only the div tag that has the
attribute class="body". Here is my code:
    String contents = null;
    Parser parser = new Parser(url);
    ObjectFindingVisitor visitor = new ObjectFindingVisitor(Div.class);
    parser.visitAllNodesWith(visitor);
    Node[] nodes = visitor.getTags(); // do I really want to use getTags() here?
    for (int i = 0; i < nodes.length; i++)
    {
        // if nodes[i] has attribute class="body", then get the page text
        // enclosed in the div tags
        // what to do here?
    }
    return contents;
Obviously I am new to htmlparser, so many thanks in advance.
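For reference, one direction that might replace the loop above is to let a filter do the selection instead of a visitor. The following is only a sketch under assumptions, not a tested answer: it assumes the htmlparser 1.6 filter classes (AndFilter, TagNameFilter, HasAttributeFilter) are on the classpath, and the class name DivBodyText and the method extract are hypothetical names made up for illustration. It parses an HTML string rather than a URL so that it can be tried without a network connection.

```java
import org.htmlparser.NodeFilter;
import org.htmlparser.Parser;
import org.htmlparser.filters.AndFilter;
import org.htmlparser.filters.HasAttributeFilter;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

// Hypothetical helper class; the name is an assumption for illustration only.
public class DivBodyText {

    // Returns the plain text of every <div class="body"> found in the given HTML.
    public static String extract(String html) throws ParserException {
        // Parse from a string; new Parser(url) could be used for a live page instead.
        Parser parser = Parser.createParser(html, "UTF-8");

        // Accept only nodes that are div tags AND carry class="body".
        NodeFilter filter = new AndFilter(
                new TagNameFilter("div"),
                new HasAttributeFilter("class", "body"));

        // Walks the whole document and collects the matching nodes.
        NodeList matches = parser.extractAllNodesThatMatch(filter);

        StringBuilder contents = new StringBuilder();
        for (int i = 0; i < matches.size(); i++) {
            // toPlainTextString() drops the markup and keeps the enclosed text.
            contents.append(matches.elementAt(i).toPlainTextString());
        }
        return contents.toString();
    }
}
```

If a matching div is nested inside another matching div, its text would presumably be appended twice, so that case may need extra handling.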