I was wondering if any work has been done to support rendering an HTML document to a BufferedImage, much like what is now supported for PDF. I know there are many other ways of doing this without Multivalent. But if it were possible, I could use one API to render thumbnails of different documents without having to support a separate API for each document type. I am already planning to use Multivalent for rendering PDF thumbnails, so it would be nice if it could handle HTML as well.
Most of the architecture is general across document types. The main differences are the document parser and the type of nodes put into the document tree. Depending on how you're talking to the system, you just give it the URL of an HTML page instead of a PDF document.
Can you post a complete example of rendering the HTML from a given site? I haven't been able to find any docs or code that do this. I can't get past how to parse the document and generate its Nodes.
Rendering is almost the same for any document type. You set the "genre" (like "AdobePDF") or let the system compute it from the file suffix. For paginated documents, you set the page number. For flowed documents, you set a page width. Beyond the PDF code that you already have, you can look at the classes Browser and MediaLoader.
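To make the suffix-to-genre idea a bit more concrete, here is a rough sketch in plain Java. It does not call Multivalent at all: the class ThumbnailSketch and its methods guessGenre, isPaginated, and scaleToWidth are hypothetical names made up for illustration, and the genre string "HTML" is an assumption (only "AdobePDF" is mentioned above). It only shows the decision described above (pick a genre from the suffix, then choose a page number or a page width) plus a standard java.awt way to shrink an already rendered BufferedImage to thumbnail size.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.Locale;

// Hypothetical helper class; none of these names come from Multivalent.
class ThumbnailSketch {

    // Guess a genre name from the URL's file suffix.
    // "AdobePDF" is the name quoted in the thread; "HTML" is an assumed name.
    static String guessGenre(String url) {
        String path = url.toLowerCase(Locale.ROOT);
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query); // ignore any query string
        }
        if (path.endsWith(".pdf")) return "AdobePDF";
        if (path.endsWith(".html") || path.endsWith(".htm")) return "HTML";
        return null; // unknown suffix: caller has to set the genre explicitly
    }

    // Paginated documents need a page number; flowed ones need a page width.
    static boolean isPaginated(String genre) {
        return "AdobePDF".equals(genre);
    }

    // Scale a rendered page down to the given width, keeping the aspect ratio.
    static BufferedImage scaleToWidth(BufferedImage page, int width) {
        int height = Math.max(1, Math.round(page.getHeight() * (width / (float) page.getWidth())));
        BufferedImage thumb = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = thumb.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                           RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(page, 0, 0, width, height, null);
        g.dispose();
        return thumb;
    }

    public static void main(String[] args) {
        String genre = guessGenre("http://example.com/index.html");
        System.out.println(genre + (isPaginated(genre) ? ": set a page number" : ": set a page width"));
    }
}
```

The actual parsing and painting would still go through whatever Multivalent code you already use for PDF; this sketch only covers the parts around it.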