Re: [Htmlparser-user] Link Location resolving
From: Jurgen V. <ri...@pl...> - 2007-11-23 18:07:46
Thanks, that worked.

Jurgen

Derrick Oswald wrote:
> You should be able to use the Page.setBaseUrl (String base) method to
> set the URL used as a prefix for relative links, i.e.
>
>     parser.getLexer ().getPage ().setBaseUrl ("http://yadda.yadda");
>
> ----- Original Message ----
> From: Jurgen Voorneveld <j.e...@st...>
> To: htm...@li...
> Sent: Friday, November 23, 2007 11:13:33 AM
> Subject: [Htmlparser-user] Link Location resolving
>
> List,
>
> I've recently started using htmlparser as part of a webspidering tool
> that I have written and I've run into a small problem.
>
> My spider downloads files from webservers using HttpClient from the
> Apache Commons project. These files are then stored locally in a
> temporary location. If a file contains HTML it is then parsed by
> htmlparser.
>
> During parsing the parser resolves relative links to other files by
> adding the location of the file to the relative link, which of course
> completely screws up the links. Is there any way to turn this feature
> off, or some way of telling the parser that the location of the data is
> not where it gets the data from?
>
> thanks
> Jurgen Voorneveld
>
> _______________________________________________
> Htmlparser-user mailing list
> Htm...@li...
> https://lists.sourceforge.net/lists/listinfo/htmlparser-user
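
To illustrate why the base URL matters here, the following is a minimal sketch of relative-link resolution using only the JDK's java.net.URI, not htmlparser itself. The temporary file path, the original page URL, and the BaseUrlDemo class are hypothetical, chosen only for the example. It shows how the same relative link comes out differently depending on whether the base is the local temporary file or the page's original address, which is the difference that Page.setBaseUrl is meant to control according to the reply above.

```java
import java.net.URI;

class BaseUrlDemo {
    // Resolve a (possibly relative) link against a base URL,
    // following the standard URI reference-resolution rules.
    static String resolve(String base, String link) {
        return URI.create(base).resolve(link).toString();
    }

    public static void main(String[] args) {
        // Hypothetical locations, for illustration only.
        String tempFile = "file:/tmp/spider/page123.html";       // where the spider stored the page
        String original = "http://yadda.yadda/docs/index.html";  // where the page was downloaded from

        String link = "../img/logo.png";

        // Base = temporary file: the link ends up pointing into the local file system.
        System.out.println(resolve(tempFile, link));

        // Base = original URL: the link points back at the web server.
        System.out.println(resolve(original, link));
        // prints "http://yadda.yadda/img/logo.png"
    }
}
```

In the spider described in the thread, the value passed to setBaseUrl would presumably be the URL the page was fetched from with HttpClient, so that links extracted from the temporary copy still resolve against the server.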