I'm new to Jericho and can't find any examples that show how to modify a document. I want to rewrite all the URL links in HTML documents downloaded using Apache HttpRequest, so that they point to proxy URLs. The modified document will then be fed into a Google WebView. How do I modify the links after loading the body of the HTML document into the Source object? And how do I output the document with only the link changes applied, for use in the WebView? I will have no knowledge of the document contents beforehand.
See the documentation of the OutputDocument class for sample code to modify a document. A working version of the code is supplied in the file samples/console/src/ConvertStyleSheets.java.
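For the link-rewriting case in the question, a minimal sketch along the same lines as that sample might look like the following. The class name LinkRewriter, the proxyPrefix parameter and the "prefix plus URL-encoded target" proxy scheme are assumptions made for illustration, not part of Jericho. Only href attributes on A elements are handled here; other URI attributes (img src, form action and so on) would need the same treatment.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

import net.htmlparser.jericho.Attribute;
import net.htmlparser.jericho.Attributes;
import net.htmlparser.jericho.CharacterReference;
import net.htmlparser.jericho.HTMLElementName;
import net.htmlparser.jericho.OutputDocument;
import net.htmlparser.jericho.Source;
import net.htmlparser.jericho.StartTag;

public class LinkRewriter {

    // Hypothetical proxy scheme: proxyPrefix followed by the URL-encoded absolute target.
    static String toProxyUrl(String proxyPrefix, String absoluteUrl) {
        try {
            return proxyPrefix + URLEncoder.encode(absoluteUrl, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Returns the document text with every http/https <a href> pointed at the proxy.
    // Everything else in the document is copied through unchanged.
    public static String rewriteLinks(Source source, String baseUrl, String proxyPrefix) {
        OutputDocument out = new OutputDocument(source);
        URI base = URI.create(baseUrl);
        for (StartTag tag : source.getAllStartTags(HTMLElementName.A)) {
            Attributes attrs = tag.getAttributes();
            if (attrs == null) continue;
            Attribute href = attrs.get("href");
            if (href == null || href.getValue() == null) continue;

            URI target;
            try {
                // getValue() is already entity-decoded; resolve relative links against the page URL.
                target = base.resolve(href.getValue().trim());
            } catch (IllegalArgumentException e) {
                continue; // malformed URL: leave the original attribute alone
            }
            String scheme = target.getScheme();
            if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
                continue; // skip mailto:, javascript:, etc.
            }

            // Replace the whole attribute so the new value is always quoted and escaped.
            String newValue = toProxyUrl(proxyPrefix, target.toString());
            out.replace(href, "href=\"" + CharacterReference.encode(newValue) + "\"");
        }
        return out.toString();
    }
}
```

The returned string can then be passed to the WebView. Because OutputDocument only records replacements against the original source text, no prior knowledge of the document is needed and untouched markup comes out exactly as it went in.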
If you're using the Apache HttpClient to load the document, use the Source class' InputStream constructor so that Jericho HTML Parser can determine the correct character encoding automatically.
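As a rough sketch of that advice: the helper below builds the Source directly from the raw response bytes instead of from a String that has already been decoded. The class and method names are made up for the example. With HttpClient 4.x the stream would typically come from response.getEntity().getContent(); that part is an assumption about your setup, so a ByteArrayInputStream stands in for it here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

import net.htmlparser.jericho.Source;

public class SourceFromStream {

    // Hands the undecoded bytes to Jericho so it can work out the encoding
    // itself (for example from a byte order mark or a META tag).
    public static Source parse(InputStream body) throws IOException {
        return new Source(body);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the HTTP response body.
        byte[] bytes = ("<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=UTF-8\"></head>"
                + "<body>caf\u00e9</body></html>").getBytes("UTF-8");

        try (InputStream in = new ByteArrayInputStream(bytes)) {
            Source source = parse(in);
            System.out.println("Detected encoding: " + source.getEncoding());
            System.out.println(source);
        }
    }
}
```

The resulting Source can be passed straight to an OutputDocument for modification.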