I was able to fix the problem with the code below, adapted from the body of
the extractAllNodesThatMatch method itself. It walks the parser's nodes once
and offers each node to both filters:
=========================
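// Needs org.htmlparser.Node, org.htmlparser.util.NodeIterator and NodeList
NodeList titleList = new NodeList();
NodeList summaryTableList = new NodeList();
// One pass over the page; each top-level node is offered to both filters
for (NodeIterator e = parser.elements(); e.hasMoreNodes(); ) {
    Node currentNode = e.nextNode();
    // collectInto() appends this node and any matching descendants
    currentNode.collectInto(titleList, titleFilter);
    currentNode.collectInto(summaryTableList, summaryTableFilter);
}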
===================
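
My best guess at why the original version failed: the first
extractAllNodesThatMatch call reads the page to the end, so the second call
starts with nothing left to read. The sketch below reproduces that behaviour
with plain java.util classes only. It does not use HTMLParser, and the class
name SinglePassDemo, the helper names and the "a:"/"table:" strings are made
up for illustration. If I remember the API correctly, calling parser.reset()
between the two calls should also work, but I have not tried that here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

public class SinglePassDemo {

    /** Drains the iterator and keeps the elements accepted by the filter. */
    static List<String> extractAll(Iterator<String> source, Predicate<String> filter) {
        List<String> matches = new ArrayList<>();
        while (source.hasNext()) {
            String node = source.next();
            if (filter.test(node)) {
                matches.add(node);
            }
        }
        return matches;
    }

    /** Drains the iterator once, offering every element to both filters. */
    static void collectBoth(Iterator<String> source,
                            Predicate<String> firstFilter, List<String> firstOut,
                            Predicate<String> secondFilter, List<String> secondOut) {
        while (source.hasNext()) {
            String node = source.next();
            if (firstFilter.test(node)) {
                firstOut.add(node);
            }
            if (secondFilter.test(node)) {
                secondOut.add(node);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the parsed page: alternating links and tables.
        List<String> page = List.of("a:Title 1", "table:Summary 1",
                                    "a:Title 2", "table:Summary 2");
        Predicate<String> isLink = s -> s.startsWith("a:");
        Predicate<String> isTable = s -> s.startsWith("table:");

        // Two back-to-back extractions on the same iterator:
        // the second one finds the iterator already exhausted.
        Iterator<String> shared = page.iterator();
        List<String> titles = extractAll(shared, isLink);
        List<String> summaries = extractAll(shared, isTable);
        System.out.println(titles.size() + " " + summaries.size()); // prints "2 0"

        // One pass with both filters applied to each element.
        List<String> titles2 = new ArrayList<>();
        List<String> summaries2 = new ArrayList<>();
        collectBoth(page.iterator(), isLink, titles2, isTable, summaries2);
        System.out.println(titles2.size() + " " + summaries2.size()); // prints "2 2"
    }
}
```

The single-pass version also avoids reading the page twice, which is why I
kept it even though a reset would be shorter.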
-Daniel
On Tue, Mar 11, 2008 at 9:03 AM, Daniel Dixon <me...@cr...> wrote:
> Hello,
>
> Anyone know why I can't use two extractAllNodesThatMatch(filter)
> methods back-to-back on the same Parser instance?
>
> More specifically I have this code:
>
> ========================================
> Parser parser = new Parser(google);
>
> NodeList titleList = parser.extractAllNodesThatMatch(titleFilter);
> NodeList summaryTableList = parser.extractAllNodesThatMatch(summaryTableFilter);
> ========================================
>
> The Google search results page I'm parsing has a series of these:
>
> <a href="blah">Title</a>
> <table><tr><td>.....Summary info....</td></tr></table>
>
> The two filters above, when independent, work fine. Run them
> back-to-back and the second will come up empty. I don't see where the
> extractAllNodesThatMatch method literally pulls the nodes out of the
> captured source, thus affecting the second filter. Here are my
> filters:
>
> ========================================
> // filter to pull out titles (all links that are next to a table)
> NodeFilter titleFilter = new AndFilter (
> new NodeClassFilter (LinkTag.class),
> new HasSiblingFilter (new NodeClassFilter(TableTag.class))
> );
> // filter to pull out summaries (all tables that are next to a title link)
> NodeFilter summaryTableFilter = new AndFilter (
> new NodeClassFilter (TableTag.class),
> new NodeClassFilterOnPreviousSibling (LinkTag.class)
> // custom filter
> );
> ========================================
>
> Thanks for the help. I've already tried subclassing the Parser so
> that I could implement the clone() method, but got the same result.
>
> -Daniel
>
--
-------------------------------
Daniel
me...@da...
www.OneDanShow.com
-------------------------------