Free tool that extracts emails, phone numbers, and custom text from the Web using Java regex
An object-relational mapping (ORM) library for Java
Digital Library Software
Self-hosted search engine with web service to share discoveries with
Easy Spider is a distributed Perl Web Crawler Project from 2006
An Open Source "product catalogue" that is customizable and versatile.
Decentralized Web Search Engine
Very configurable web downloader
An open source search engine with RESTful API and crawlers
Very simple implementation of the list poisoning idea.
Framework for search and display of heterogeneous document collections.
WebCollector is an open source web crawler framework based on Java.
Simple Semantic Web Architecture and Protocol