To see the sample, please refer to http://dev.stela.org.br/retriever/samples.html
Retriever is an open-source crawler, released under the Apache License, Version 2.0, that collects information reachable through a variety of protocols (e.g. http, smb, file) and lets users manipulate that information as needed.
From now on, you can follow Retriever's news on its website: http://retriever.sourceforge.net/.
With Retriever 0.5.2 you can collect data from a local hard disk and from the web, parse that data, and persist it as you need. It also comes with a scheduling mechanism and built-in persistence backed by Lucene. Future releases are expected to add more data sources, such as feeds, relational databases (RDBMS), and folders shared with Samba.
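
The collect, parse, and persist stages described above can be pictured with a minimal sketch. This is not Retriever's API: the function names (collect, parse, crawl, demo) and the record fields are hypothetical, chosen only to illustrate the flow for the local-disk case, and persistence is reduced to a plain callback where Retriever would offer its own mechanisms, such as the Lucene one.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterator


def collect(root: Path) -> Iterator[Path]:
    """Walk a local directory tree and yield every file found (hypothetical stage)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            yield Path(dirpath) / name


def parse(path: Path) -> dict:
    """Turn one collected file into a small record (hypothetical schema)."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return {
        "source": str(path),
        "size": path.stat().st_size,
        "words": len(text.split()),
    }


def crawl(root: Path, persist: Callable[[dict], None]) -> int:
    """Run collect -> parse -> persist and return the number of records handled."""
    count = 0
    for path in collect(root):
        persist(parse(path))
        count += 1
    return count


def demo() -> list:
    """Crawl a throwaway directory and persist the records into a list."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.txt").write_text("hello crawler world", encoding="utf-8")
        (root / "sub").mkdir()
        (root / "sub" / "b.txt").write_text("two words", encoding="utf-8")
        records: list = []
        crawl(root, records.append)
        return records


if __name__ == "__main__":
    for record in demo():
        print(json.dumps(record))
```

Passing persist as a callback keeps the storage choice separate from collection and parsing, which mirrors the "persist it as you need" idea: the same crawl could write to a list, a file, or a search index.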