I can see there is tons of code for retrieving attributes and tags from HTML.
Is there any sample code that shows how to:
1. Retrieve HTML from the Internet, e.g. http://www.apache.org
2. Rewrite the src attribute of every image so that each image path is an absolute path
3. Regenerate the HTML for the whole Apache page
I would be grateful for any attachment or URL reference.
I am looking forward to your reply.
Many thanks
URL rewriting is covered in the SiteCapturer (and WikiCapturer) sample applications. http://htmlparser.sourceforge.net/javadoc/org/htmlparser/parserapplications/SiteCapturer.html
In this case it's used to copy a site to local storage, but the same idea can be used to rewrite the URLs however you want.
Derrick
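For anyone who only needs step 2 and wants to see the idea in isolation, here is a minimal sketch. It does not use the HTML Parser API or the SiteCapturer code at all: it swaps in a plain regular expression plus java.net.URI from the JDK, so it only handles quoted src values and is no substitute for a real parser. The class name ImgSrcRewriter and the method absolutize are made up for this illustration, and fetching the page (step 1) is left out; the HTML is assumed to be already in a String.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the HTML Parser library.
final class ImgSrcRewriter {

    // group 1: "<img ... src=" up to the quote, group 2: the quote character,
    // group 3: the URL between the quotes. Unquoted src values are not matched.
    private static final Pattern IMG_SRC = Pattern.compile(
            "(<img\\b[^>]*?\\ssrc\\s*=\\s*)([\"'])(.*?)\\2",
            Pattern.CASE_INSENSITIVE);

    // Returns html with every quoted <img src> resolved against baseUrl.
    // Assumption: baseUrl includes a path, at least a trailing "/".
    static String absolutize(String html, String baseUrl) {
        URI base = URI.create(baseUrl);
        Matcher m = IMG_SRC.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String src = m.group(3);
            String resolved;
            try {
                // Relative paths are resolved; already-absolute URLs come back unchanged.
                resolved = base.resolve(src.trim()).toString();
            } catch (IllegalArgumentException e) {
                resolved = src; // leave values that are not valid URIs untouched
            }
            String replacement = m.group(1) + m.group(2) + resolved + m.group(2);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copy the rest of the page unchanged
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<p><img src=\"images/logo.gif\" alt=\"logo\">"
                + "<img src='/icons/a.png'></p>";
        System.out.println(absolutize(html, "http://www.apache.org/foundation/"));
    }
}
```

Everything outside the matched src values is copied through as-is, which covers step 3 in the simplest possible way. For real pages, the parser-based approach in SiteCapturer is the safer route, since a regex will miss unquoted attributes and can be fooled by markup inside comments or scripts.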