Re: [Htmlparser-user] Help on extracting clean body content from web page

I found the solution to my problem!

lNodes = lDocumentNodeList.extractAllNodesThatMatch(new TagNameFilter("BODY"), true);

The BODY tag was nested inside another element. By default the boolean
recursive flag is false, so only the top-level nodes in the list are tested
against the filter and nested elements are never returned.

After setting this flag -- the second parameter -- to true, the problem was
resolved and I was able to retrieve my content.
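
For anyone who wants to see why the flag matters, here is a small
self-contained sketch. It is NOT the HTMLParser source: Node,
sampleDocument and this extractAllNodesThatMatch are hypothetical stand-ins
I made up to model a node list where BODY sits underneath HTML. It only
illustrates the difference between a top-level search and a recursive one.

```java
import java.util.ArrayList;
import java.util.List;

public class RecursiveMatchSketch {

    /** Hypothetical stand-in for a parsed tag: a name plus child nodes. */
    static final class Node {
        final String tagName;
        final List<Node> children;

        Node(String tagName, Node... children) {
            this.tagName = tagName;
            this.children = List.of(children);
        }
    }

    /**
     * Toy model of a filtered extraction (an assumption, not library code).
     * With recursive == false only the nodes in the given list are tested;
     * with recursive == true their descendants are searched as well.
     */
    static List<Node> extractAllNodesThatMatch(List<Node> nodes, String tagName,
                                               boolean recursive) {
        List<Node> matches = new ArrayList<>();
        for (Node node : nodes) {
            if (node.tagName.equalsIgnoreCase(tagName)) {
                matches.add(node);
            }
            if (recursive) {
                matches.addAll(extractAllNodesThatMatch(node.children, tagName, true));
            }
        }
        return matches;
    }

    /** A made-up document: HTML at the top level, BODY nested inside it. */
    static List<Node> sampleDocument() {
        Node body = new Node("BODY", new Node("P"));
        Node html = new Node("HTML", new Node("HEAD"), body);
        return List.of(html);
    }

    public static void main(String[] args) {
        List<Node> topLevel = sampleDocument();
        // Non-recursive: only HTML is examined, so BODY is not found. Prints 0.
        System.out.println(extractAllNodesThatMatch(topLevel, "BODY", false).size());
        // Recursive: the search descends into HTML and finds BODY. Prints 1.
        System.out.println(extractAllNodesThatMatch(topLevel, "BODY", true).size());
    }
}
```

In the toy document the only top-level node is HTML, which is the same
situation I had: the non-recursive call comes back empty even though BODY
is in the page.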

Hope this helps someone!

-- 

James Mortensen
A-CTI Development Team