[Htmlparser-user] parser help
From: ernest c. <ern...@gm...> - 2011-08-17 20:25:40
Hi,
I have been trying to use the parser for some time, but I have been unable to
get it to do exactly what I want: extract only the plain text of a page,
without any JavaScript or style content. Here is the code I've been running:
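import org.htmlparser.Parser;
import org.htmlparser.util.ParserException;
import org.htmlparser.visitors.TextExtractingVisitor;

public class Test
{
    public static void main (String[] args)
    {
        try
        {
            // args[0] is the URL or file name of the page to parse
            Parser parser = new Parser (args[0]);
            // Collect the text nodes encountered while walking the page
            TextExtractingVisitor visitor = new TextExtractingVisitor ();
            parser.visitAllNodesWith (visitor);
            String textInPage = visitor.getExtractedText ();
            System.out.println (textInPage);
        }
        catch (ParserException pe)
        {
            pe.printStackTrace ();
        }
    }
}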
I could really use some help with this!
Thanks,
Ernest
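
For illustration only, the goal described in this message (page text with script and style content left out) can be sketched with the Java standard library alone. This is not the HTMLParser API and not a fix for the code above: it swaps in regular expressions, which are fragile on real-world HTML, and the class name PlainTextSketch and the method extractText are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in, NOT part of HTMLParser: a regex-based sketch of
// "plain text without script or style content". Assumes reasonably
// well-formed markup; character entities such as &amp; are not decoded.
public class PlainTextSketch {
    // (?is) = case-insensitive, and '.' also matches line breaks.
    // Matches a whole <script>...</script> or <style>...</style> element.
    private static final Pattern SCRIPT_OR_STYLE =
        Pattern.compile("(?is)<(script|style)\\b[^>]*>.*?</\\1\\s*>");
    // Matches HTML comments, which may span several lines.
    private static final Pattern COMMENT = Pattern.compile("(?s)<!--.*?-->");
    // Matches any remaining tag.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    public static String extractText(String html) {
        // Drop script and style elements together with their contents.
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        // Drop comments, then strip the remaining tags but keep their text.
        text = COMMENT.matcher(text).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        // Collapse runs of whitespace into single spaces.
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p { color: red; }</style>"
            + "<script>var x = 1;</script></head>"
            + "<body><p>Hello <b>world</b></p></body></html>";
        System.out.println(extractText(html));
    }
}
```

The sketch removes script and style elements before stripping the remaining tags, so their contents never reach the output; doing the steps in the opposite order would leave the JavaScript and CSS source in the extracted text.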