Re: [Htmlparser-user] Link Location resolving
From: Jurgen V. <ri...@pl...> - 2007-11-23 18:07:46
Thanks, that worked.
Jurgen
Derrick Oswald wrote:
>
> You should be able to use the Page.setBaseUrl (String base) method to
> set the URL used as a prefix for relative links, for example:
> parser.getLexer ().getPage ().setBaseUrl ("http://yadda.yadda");
>
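To illustrate why setting the base URL fixes the links, here is a minimal sketch of relative-link resolution using only the standard java.net.URI class. It is not htmlparser's own implementation; the class name BaseUrlDemo, the helper resolve, and the example page URL and link values are hypothetical, chosen only for illustration. The point is that relative links should be resolved against the URL the page was originally fetched from, not against the temporary local file.

```java
import java.net.URI;

// Hypothetical demo class, not part of htmlparser.
class BaseUrlDemo {

    // Resolve a link found in a page against the URL the page was
    // originally downloaded from (the "base URL").
    static String resolve(String baseUrl, String link) {
        URI base = URI.create(baseUrl);
        // URI.resolve returns absolute links unchanged and merges
        // relative ones with the base, removing "." and ".." segments.
        return base.resolve(link).toString();
    }

    public static void main(String[] args) {
        // Assumed original location of the downloaded page.
        String base = "http://yadda.yadda/dir/page.html";

        System.out.println(resolve(base, "img/a.png"));
        System.out.println(resolve(base, "../x.html"));
        System.out.println(resolve(base, "/top.html"));
        System.out.println(resolve(base, "http://other.example/a"));
    }
}
```

In the spider described in this thread, the value passed to setBaseUrl would presumably be the URL that HttpClient fetched the file from, so that the parser resolves links the same way.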
>
> ----- Original Message ----
> From: Jurgen Voorneveld <j.e...@st...>
> To: htm...@li...
> Sent: Friday, November 23, 2007 11:13:33 AM
> Subject: [Htmlparser-user] Link Location resolving
>
> List,
>
> I've recently started using htmlparser as part of a web-spidering tool
> that I have written, and I've run into a small problem.
> My spider downloads files from webservers using HttpClient from the
> Apache Commons project. These files are then stored locally in a
> temporary location. If a file contains HTML it is then parsed by
> htmlparser.
> During parsing, the parser resolves relative links to other files by
> prepending the location of the local file to the relative link, which of
> course completely breaks the links. Is there any way to turn this feature
> off, or some way of telling the parser that the original location of the
> data is not the location it reads the data from?
>
> thanks
> Jurgen Voorneveld
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2005.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Htmlparser-user mailing list
> Htm...@li...
> <mailto:Htm...@li...>
> https://lists.sourceforge.net/lists/listinfo/htmlparser-user
>