[Htmlparser-user] How can I improve the speed of extracting the information of a html page through
From: sajid k. <ass...@gm...> - 2007-02-26 11:41:13
Hi,

I am using HTMLParser to extract the content of an HTML page. I have noticed that the bulk of the time is spent extracting the information rather than processing the data. The code looks like this:

    // inputStream is of type InputStream. It carries the page source of an HTML page.
    Page page = new Page(inputStream, null);
    Lexer lexer = new Lexer(page);
    Parser parser = new Parser(lexer);
    StringBean sb = new StringBean();
    parser.visitAllNodesWith(sb);
    String text = sb.getStrings();
    // Doing something with text.

Note that I have already crawled a few pages with the help of a crawler, so the HTML pages are on my hard disk.

Can anybody please help me improve the speed of my program?

regards
Sajid Khan.
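
To narrow down where the time goes, one option is to time the disk read separately from the parse. The following is only a minimal sketch using the JDK, not part of the HTMLParser API: the class name StageTimer and the helper readAll are hypothetical, and an in-memory sample stands in for a crawled file. The parse step is left as a comment where the Page/Lexer/Parser/StringBean code above would go.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of HTMLParser.
public class StageTimer {

    /** Reads the whole stream into memory through a 64 KB buffer. */
    public static byte[] readAll(InputStream in) throws IOException {
        try (InputStream buffered = new BufferedInputStream(in, 64 * 1024)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = buffered.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a small in-memory page stands in for a file on disk.
        // For a real page, pass a FileInputStream for the crawled file instead.
        byte[] sample = "<html><body>Hello</body></html>"
                .getBytes(StandardCharsets.UTF_8);
        InputStream inputStream = new ByteArrayInputStream(sample);

        long t0 = System.nanoTime();
        byte[] bytes = readAll(inputStream);   // stage 1: I/O only
        long t1 = System.nanoTime();

        // Stage 2 would go here: build the Page from
        // new ByteArrayInputStream(bytes) and run the parser on it,
        // so that the parse is timed without any disk access.
        long t2 = System.nanoTime();

        System.out.println("bytes read: " + bytes.length);
        System.out.println("read  ms: " + (t1 - t0) / 1_000_000.0);
        System.out.println("parse ms: " + (t2 - t1) / 1_000_000.0);
    }
}
```

If the read stage dominates, buffering or reading each file in one pass may help. If the parse stage dominates, the cost lies in the parser itself and the I/O changes will not matter much.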