Re: [Htmlparser-user] set URL for relative links
From: Jeffrey B. <jb...@cs...> - 2006-09-11 22:28:08
Thanks Derrick,
Your suggestion worked perfectly!
-Jeff
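
For anyone finding this thread in the archive: what setting the base URL buys you is ordinary relative-link resolution against the page's original address. The sketch below illustrates that resolution step with the standard java.net.URI class only. It is not HTMLParser code and not necessarily how the library does it internally; the class name RelativeLinks and the helper resolve are made up for the illustration.

```java
import java.net.URI;

final class RelativeLinks {
    // Hypothetical helper: resolve a link found in a cached page
    // against the URL the page was originally fetched from.
    static String resolve(String pageUrl, String link) {
        return URI.create(pageUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        String page = "http://www.bar.com/foo.html";
        // A relative src replaces the last path segment of the page URL.
        System.out.println(resolve(page, "foo.jpg"));       // http://www.bar.com/foo.jpg
        // A root-relative link replaces the whole path.
        System.out.println(resolve(page, "/img/logo.png")); // http://www.bar.com/img/logo.png
        // An absolute link is returned unchanged.
        System.out.println(resolve(page, "http://other.example/x.gif"));
    }
}
```

With the base URL set on the Page as Derrick suggested, the image from the original example should be reported the same way as the first case above.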
On 9/11/06, Derrick Oswald <Der...@ro...> wrote:
>
> I believe you need to use setBaseUrl on the Page object.
> parser.getLexer ().getPage ().setBaseUrl ("http://www.bar.com");
>
> Jeffrey Bigham wrote:
>
> >On 9/11/06, Garry Huang <ga...@gm...> wrote:
> >
> >
> >>Did you try my_parser.setURL("http://www.bar.com/"); ?
> >>
> >>
> >
> >Yeah, I tried that.
> >
> >If I call it before extractAllNodesThatMatch(img_filter), then
> >http://www.bar.com is downloaded. If I call it afterwards, the
> >relative links aren't fixed.
> >
> >It's possible that there's something subtle with the ordering that I
> >could change, but I couldn't get it to work and it seems like it would
> >be a hack...
> >
> >Thanks for the suggestion though.
> >
> >-Jeff
> >
> >
> >
> >>Just a thought.
> >>
> >>Cheers,
> >>Garry
> >>
> >>On Sep 12, 2006, at 12:58 AM, jpdogg wrote:
> >>
> >>
> >>
> >>>Hello,
> >>>
> >>>I've cached some HTML pages in local files and would like to tell the
> >>>Parser object what the original URLs were so that it can correctly
> >>>interpret relative links.
> >>>
> >>>As a simple example, say I do this:
> >>>
> >>>Parser my_parser = new Parser("<html><img src='foo.jpg'></html>");
> >>>
> >>>If I construct a filter to give me all of the ImageTags in this simple
> >>>document, I get one. Unfortunately, it has the URL foo.jpg. If I
> >>>know that this file was originally located at
> >>>http://www.bar.com/foo.html, how do I inform the parser module? I
> >>>want it to be able to report that the above image is located at
> >>>http://www.bar.com/foo.jpg.
> >>>
> >>>Thanks!
> >>>Jeff
> >>>
> >>>----------------------------------------------------------------------
> >>>---
> >>>Using Tomcat but need to do more? Need to support web services,
> >>>security?
> >>>Get stuff done quickly with pre-integrated technology to make your
> >>>job easier
> >>>Download IBM WebSphere Application Server v.1.0.1 based on Apache
> >>>Geronimo
> >>>http://sel.as-us.falkag.net/sel?
> >>>cmd=lnk&kid=120709&bid=263057&dat=121642
> >>>_______________________________________________
> >>>Htmlparser-user mailing list
> >>>Htm...@li...
> >>>https://lists.sourceforge.net/lists/listinfo/htmlparser-user
> >>>
> >>>
> >>
> >>
> >>
> >
> >
> >
> >
>
>
>