Retriever: a light, extensible crawler / News: Recent posts

Added sample that shows how to crawl documents in a LAN

To see the sample, please refer to

Posted by Lucas Nazario 2009-02-20

Retriever: a light, extensible crawler

Retriever is an open source crawler, released under the Apache License V2.0, that collects information reachable through a variety of protocols (e.g. HTTP, SMB, file) and lets users manipulate that information as needed.

Posted by Lucas Nazario 2009-02-05

News will be posted on Retriever's website

From now on, you can follow Retriever's news on its website.

Posted by Lucas Nazario 2007-12-29

First version of Retriever available!

With Retriever 0.5.2 you can collect data from a local hard disk and the web, parse that data, and persist it as you need. It also comes with a scheduling mechanism and built-in persistence to Lucene. In the future, more data sources will be available, such as feeds, RDBMSs, folders shared with Samba, and many more!

Posted by Lucas Nazario 2007-12-14