Re: [Htmlparser-user] Help on extracting clean body content from web page
From: James M. <jam...@a-...> - 2007-11-16 00:27:19
I found the solution to my problem!

    lNodes = lDocumentNodeList.extractAllNodesThatMatch(new TagNameFilter("BODY"), true);

The BODY tag was buried underneath another element, and by default the boolean recursive flag is set to false, meaning that nested elements will not be returned. After setting this flag -- the second parameter -- to true, the problem was resolved and I was able to retrieve my content.

Hope this helps someone!

--
James Mortensen
A-CTI Development Team
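
For anyone wondering why the recursive flag makes the difference, here is a rough stand-alone sketch of the idea. It is a toy model only: the Node class, extract() and sampleDocument() are hypothetical stand-ins written for illustration, not HTMLParser classes or its actual implementation. It assumes a document whose only top-level node is HTML, with BODY nested one level down.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: Node and extract() are hypothetical stand-ins, not HTMLParser classes.
class RecursiveMatchSketch {

    // A minimal tag node with a name and child nodes.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Collects nodes whose name matches; descends into children only if recursive is true.
    static List<Node> extract(List<Node> nodes, String name, boolean recursive) {
        List<Node> matches = new ArrayList<>();
        for (Node node : nodes) {
            if (node.name.equalsIgnoreCase(name)) {
                matches.add(node);
            }
            if (recursive) {
                matches.addAll(extract(node.children, name, true));
            }
        }
        return matches;
    }

    // Builds a document whose only top-level node is HTML, with BODY nested inside.
    static List<Node> sampleDocument() {
        Node html = new Node("HTML", new Node("HEAD"), new Node("BODY", new Node("P")));
        List<Node> top = new ArrayList<>();
        top.add(html);
        return top;
    }

    public static void main(String[] args) {
        List<Node> doc = sampleDocument();
        System.out.println("non-recursive matches: " + extract(doc, "BODY", false).size());
        System.out.println("recursive matches: " + extract(doc, "BODY", true).size());
    }
}
```

With the flag set to false only the top-level HTML node is checked, so nothing matches; with true the search walks down into the children and finds the nested BODY node.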