[Htmlparser-user] Can't extract any div's, redux.
                
      From: Jan S. <net...@gm...> - 2011-07-30 20:21:07
      
Thanks for answering! However, I'm afraid it didn't help me much :(
So, all I've changed in the code is the nodeFilter object, which is now
constructed as
    new AndFilter(new TagNameFilter("div"),
                  new HasAttributeFilter("storytext"));
Then I do
    for (NodeIterator e = parser.elements(); e.hasMoreNodes();) {
        e.nextNode().collectInto(nodeList, nodeFilter);
    }
And according to nodeList.toNodeArray().length, there are no matching nodes.
So I have nothing to pass to any of the things you suggested, not to
mention that I don't know what a StringBean is, for example. (I've read
the Javadoc on your page, but I don't have the foggiest idea how to use
it here.) And why couldn't I use the toPlainTextString() method? I'd
like to get the inner HTML of the div without removing any of the tags
inside it, and StringBean removes them, as far as I've noticed, unless
I've misunderstood it :(
I'd be very thankful if you could elaborate on what I should do to make
it work, please.
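The setup described above can be sketched as a small, self-contained
program. This is only a sketch under assumptions: the class name
DivExtractSketch, the helper innerHtml, and the sample markup are
hypothetical (the real page is not shown in this thread), and it assumes
the HTML Parser jar is on the classpath. It compares the one-argument
HasAttributeFilter, which tests for an attribute with that name, against
the two-argument form, which tests an attribute's value, and it reads the
inner HTML through getChildren().toHtml() rather than toPlainTextString().

```java
import org.htmlparser.NodeFilter;
import org.htmlparser.Parser;
import org.htmlparser.filters.AndFilter;
import org.htmlparser.filters.HasAttributeFilter;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

public class DivExtractSketch {
    // Hypothetical sample markup; the real page is not shown in this thread.
    static final String SAMPLE =
        "<html><body><div id=\"storytext\">Hello <b>world</b></div></body></html>";

    /** Returns the inner HTML (tags kept) of every node the filter accepts. */
    static String[] innerHtml(String html, NodeFilter filter) throws ParserException {
        Parser parser = Parser.createParser(html, "UTF-8");
        // Visits the whole tree, like the collectInto loop, but in one call.
        NodeList matches = parser.extractAllNodesThatMatch(filter);
        String[] result = new String[matches.size()];
        for (int i = 0; i < matches.size(); i++) {
            NodeList children = matches.elementAt(i).getChildren();
            // getChildren() may be null for a tag without children.
            result[i] = (children == null) ? "" : children.toHtml();
        }
        return result;
    }

    public static void main(String[] args) throws ParserException {
        // One argument: accepts tags that have an attribute NAMED "storytext".
        NodeFilter byName = new AndFilter(new TagNameFilter("div"),
                                          new HasAttributeFilter("storytext"));
        // Two arguments: accepts tags whose "id" attribute has the VALUE "storytext".
        NodeFilter byValue = new AndFilter(new TagNameFilter("div"),
                                           new HasAttributeFilter("id", "storytext"));

        System.out.println("matches by attribute name:  "
                           + innerHtml(SAMPLE, byName).length);
        for (String inner : innerHtml(SAMPLE, byValue)) {
            System.out.println("inner HTML by id value: " + inner);
        }
    }
}
```

If storytext on the real page is an attribute value (for example an id or
class) rather than an attribute name, the one-argument filter would be
expected to match nothing, which would be consistent with the empty node
list reported above. Whether that is the case here depends on markup that
is not shown.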
By the way, how do I respond to posts on this mailing list? I can't
find a reply option anywhere.