[Htmlparser-developer] htmlparser 1.0
From: Kaarle K. <kaa...@ik...> - 2002-01-07 22:06:18
I tried the example applications using the bat files with htmlparser 1.0, without much success.

1) runCrawler http://www.google.com 1

This gives a list of links on the above-mentioned page, I assume.

2) (Finnish broadcasting company)
runCrawler http://www.yle.fi 1

This throws:
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 27

3) (Finnish commercial TV station)
runCrawler http://www.mtv3.fi 1

This throws:
Exception in thread "main" java.lang.OutOfMemoryError
<<no stack trace available>>

4) My own simple homepage

After a rather long time it throws:

Crawling to http://www.microsoft.com/ContentRedirect.asp?prd=iis&sbp=&pver=5.0&pid=&ID=404&cat=web&os=&over=&hrd=&Opt1=&Opt2=&Opt3=
crawlDepth = 0
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 23
at java.lang.String.substring(Unknown Source)
........

I don't think I have such Microsoft links on my page. Probably something to do with activeisp.com, which provides me with disk space?

I get a similar result from my software page at www.kk-software.fi

--------------------

After these experiments I still do not understand what the Robot is trying to do. Can anyone explain?

regards
Kaarle

---------------------------------------------
Kaarle Kaila
http://www.iki.fi/kaila
mailto:kaa...@ik...
tel: +358 50 3725844