[Htmlparser-user] How can I improve the speed of extracting the information of a html page through
From: sajid k. <ass...@gm...> - 2007-02-26 11:41:13
Hi,
I am using HTMLParser to extract the text content of HTML pages. I
have noticed that most of the running time is spent extracting the text
rather than processing it.
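The code looks like this:
import org.htmlparser.Parser;
import org.htmlparser.beans.StringBean;
import org.htmlparser.lexer.Lexer;
import org.htmlparser.lexer.Page;

// inputStream is an InputStream carrying the source of an HTML page.
// Passing null as the charset lets Page fall back to its default encoding.
Page page = new Page(inputStream, null);
Lexer lexer = new Lexer(page);
Parser parser = new Parser(lexer);
// StringBean is a node visitor that collects the text of the page.
StringBean sb = new StringBean();
parser.visitAllNodesWith(sb);
String text = sb.getStrings();
// Doing something with text.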
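One way to check where the time actually goes is to time the extraction and processing phases separately. The following is only a minimal sketch using the standard library: PhaseTimer and its time helper are hypothetical names, not part of HTMLParser, and the two lambdas are stand-ins for the real extraction and processing steps.

```java
import java.util.function.Supplier;

public class PhaseTimer {
    /** Value produced by a timed phase plus its elapsed wall-clock time. */
    public static final class Timed<T> {
        public final T value;
        public final long nanos;

        Timed(T value, long nanos) {
            this.value = value;
            this.nanos = nanos;
        }
    }

    /** Runs one phase and records how long it took (hypothetical helper). */
    public static <T> Timed<T> time(Supplier<T> phase) {
        long start = System.nanoTime();
        T value = phase.get();
        return new Timed<>(value, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Stand-in for the extraction step (the StringBean code above).
        Timed<String> extracted = time(() -> "placeholder text from a page");
        // Stand-in for whatever is done with the text afterwards.
        Timed<Integer> processed = time(() -> extracted.value.split("\\s+").length);

        System.out.printf("extract: %d ns, process: %d ns%n",
                extracted.nanos, processed.nanos);
    }
}
```

Summing these timings over many pages, rather than looking at a single page, would give a steadier picture, since the first few calls include JVM warm-up.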
Note that the pages were fetched earlier by a crawler, so the HTML files
are already on my hard disk.
Can anybody please help me improve the speed of my program?
regards
Sajid Khan.