[Htmlparser-user] parser help
From: ernest c. <ern...@gm...> - 2011-08-17 20:25:40
Hi,

I have been trying to use the parser for some time, but I have been unable to get it to do exactly what I want, which is to gather only the plain text of a page, without any JavaScript or style content. Here is the code I've been running:

    import org.htmlparser.Parser;
    import org.htmlparser.util.ParserException;
    import org.htmlparser.visitors.TextExtractingVisitor;

    public class Test {
        public static void main(String[] args) {
            try {
                // args[0] is the URL or file to parse
                Parser parser = new Parser(args[0]);
                TextExtractingVisitor visitor = new TextExtractingVisitor();
                parser.visitAllNodesWith(visitor);
                String textInPage = visitor.getExtractedText();
                System.out.println(textInPage);
            } catch (ParserException pe) {
                pe.printStackTrace();
            }
        }
    }

I could really use some help with this!

Thanks,
Ernest