[Htmlparser-user] Can't extract any div's, redux.
From: Jan S. <net...@gm...> - 2011-07-30 20:21:07
Thanks for answering! However, I'm afraid it didn't help me much :(

All I've changed in the code is the nodeFilter object, which is now constructed as

    new AndFilter(new TagNameFilter("div"), new HasAttributeFilter("storytext"));

Then I do

    for (NodeIterator e = parser.elements(); e.hasMoreNodes();) {
        e.nextNode().collectInto(nodeList, nodeFilter);
    }

and according to nodeList.toNodeArray().length, there are no matching nodes. So I have nothing to pass to any of the things you suggested. On top of that, I don't know, for example, what a StringBean is. I've read the javadoc on your page, but I don't have the foggiest idea how to use it here.

And why couldn't I use the toPlainTextString() method? I'd like to get the inner HTML of the div without removing any of the tags inside it. As far as I can tell, StringBean removes them, unless I've misunderstood it :(

I'd be very thankful if you could elaborate on what I should do to make this work, please.

By the way, how do I respond to posts on this mailing list? I can't find a reply option anywhere.