DocSedoter

DocSedoter downloads (collects) documents from the internet. For web documents (HTML/CSS), it also downloads all resources linked from them (images, scripts, HTML, CSS, etc.). This is done recursively, a process also known as web crawling. All link URIs are rewritten to local paths so that the collected web documents can be browsed offline.
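The link-rewriting step can be illustrated with a minimal sketch. This is not DocSedoter's actual code: the function names (local_path_for, rewrite_links), the host/path directory layout, and the use of Python are all assumptions made for illustration. It also ignores query strings and fragments, and uses a regular expression where a real crawler would use an HTML parser.

```python
import posixpath
import re
from urllib.parse import urljoin, urlsplit


def local_path_for(url: str) -> str:
    """Map an absolute URL to a relative local path (hypothetical layout: host/path)."""
    parts = urlsplit(url)
    path = parts.path
    if not path or path.endswith("/"):
        path += "index.html"  # directory-style URLs get a default file name
    return posixpath.join(parts.netloc, path.lstrip("/"))


# Matches href="..." and src="..." attributes written with double quotes.
LINK_ATTR = re.compile(r'(?P<attr>\b(?:href|src))="(?P<url>[^"]*)"', re.IGNORECASE)


def rewrite_links(html: str, page_url: str) -> tuple[str, list[str]]:
    """Return the HTML with links rewritten to local paths, plus the URLs found."""
    page_dir = posixpath.dirname(local_path_for(page_url))
    found: list[str] = []

    def replace(match: re.Match) -> str:
        # Resolve the link against the page URL so relative links become absolute.
        target = urljoin(page_url, match.group("url"))
        if urlsplit(target).scheme not in ("http", "https"):
            return match.group(0)  # leave mailto:, javascript:, etc. untouched
        found.append(target)
        # Express the target's local path relative to the page's local directory.
        relative = posixpath.relpath(local_path_for(target), page_dir)
        return f'{match.group("attr")}="{relative}"'

    return LINK_ATTR.sub(replace, html), found
```

In a recursive crawl, each URL in the returned list would be downloaded in turn and passed through the same function, with a set of already-visited URLs to avoid loops.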

This project consists of two subprojects. The first is a library that can be used by different applications with different UI designs. The second is an example application with a simple UI that uses this library.

The library provides an interface for setting where the collected documents/resources are saved. Currently, it only provides a class that saves them to files. However, you may implement your own class that saves the documents/resources to a database, which would resemble the document store used by a search engine.
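As a rough sketch of this design, a storage interface with one file-based and one database-backed implementation might look like the following. The names DocumentStore, FileStore, and SqliteStore, the save(name, content) signature, and the choice of Python and SQLite are all hypothetical; the library's real interface is not described in this README.

```python
import os
import sqlite3
from abc import ABC, abstractmethod


class DocumentStore(ABC):
    """Hypothetical storage interface: decides where collected documents go."""

    @abstractmethod
    def save(self, name: str, content: bytes) -> None:
        """Persist one document or resource under the given name."""


class FileStore(DocumentStore):
    """Saves each document as a file below a root directory."""

    def __init__(self, root: str) -> None:
        self.root = root

    def save(self, name: str, content: bytes) -> None:
        path = os.path.join(self.root, name)
        # Create intermediate directories such as example.com/img/ on demand.
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "wb") as f:
            f.write(content)


class SqliteStore(DocumentStore):
    """Saves each document as a row in an SQLite table (assumed schema)."""

    def __init__(self, db_path: str) -> None:
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS documents "
            "(name TEXT PRIMARY KEY, content BLOB)"
        )

    def save(self, name: str, content: bytes) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO documents (name, content) VALUES (?, ?)",
                (name, content),
            )
```

Because the crawler would only call save(), switching from files to a database means passing a different store object, without changing the crawling code.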