[Htmlparser-user] Link Location resolving
From: Jurgen V. <j.e...@st...> - 2007-11-23 16:13:00
List,

I've recently started using htmlparser as part of a web-spidering tool that I have written, and I've run into a small problem.

My spider downloads files from web servers using HttpClient from the Apache Commons project. These files are then stored locally in a temporary location. If a file contains HTML, it is then parsed by htmlparser. During parsing, the parser resolves relative links to other files by prepending the location of the file being parsed, which in my case is the local temporary path rather than the original URL. That, of course, completely breaks the links.

Is there any way to turn this feature off, or some way of telling the parser that the location of the data is not where it gets the data from?

Thanks,
Jurgen Voorneveld
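
To make the problem concrete, here is a minimal sketch of the two outcomes, using only java.net.URI from the standard library. It does not use htmlparser's own API. The class name, the example.com URL, and the temporary file path are all hypothetical and only illustrate what happens when a relative link is resolved against the local copy instead of the address the page was downloaded from.

```java
import java.net.URI;

// Hypothetical illustration only: shows how the choice of base location
// changes the result of resolving a relative link.
class LinkResolveSketch {

    // Resolve a relative link against a base location (RFC 3986 style).
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String relative = "../img/logo.png";

        // Base = where the spider stored the file (assumed path).
        String localBase = "file:/tmp/spider/page123.html";
        // Base = where the page was originally fetched from (assumed URL).
        String originalBase = "http://example.com/docs/index.html";

        // Resolving against the temporary file points into the local disk.
        System.out.println("local:    " + resolve(localBase, relative));
        // Resolving against the original URL gives the link the spider needs.
        System.out.println("original: " + resolve(originalBase, relative));
    }
}
```

So what I am after is a way to hand the parser the second base (the original URL) while still feeding it the bytes from the local file.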