This is a self-education project. Its aim is to teach ourselves to work together and improve our programming skills.
A C Implementation of an OAI-PMH Static Repository Gateway.
Distributed search engine for the Internet and intranets. Parallel search in heterogeneous indexes, topic-oriented harvester, CORBA interface to legacy document databases. Document clustering with neural networks.
Okapi is an IR system, built to do experiments in interactive search and investigate theories in IR. The core of Okapi is a Basic Search System (BSS) which allows programmers to build search applications. See http://www.soi.city.ac.uk/~andym/OKAPI-PACK/
open-search is a framework to build a p2p web search engine, whereby people mutually form a search engine without the intervention of central servers or a central actor.
OpenAnonymity consists of a module for the Apache 2.0 web server and a framework that lets you control search engine spider indexing at the word level, rather than at the file level as in Robots exclusion. OA could force spiders to follow these rules.
OpenFTS (Open Source Full Text Search engine) is an advanced PostgreSQL-based search engine that provides online indexing of data and relevance ranking for database searching. Close integration with database allows use of metadata to restrict search re
Virtuoso is a scalable cross-platform server that combines Relational, Graph, and Document Data Management with Web Application Server and Web Services Platform functionality.
The goal of OpenParentalControls is to provide a user-contributed database of website age ratings, as well as a series of extensions for popular web browsers to honor, update and vote on these ratings.
PHP Wrapper Class For ht://Dig is a class I developed while desperately searching for something with similar capabilities. This class is intended to be much more thorough, making it easy to change headers, footers, and templates. htdig + PHP = htPHP
A PHP extension to Swish-e
This will be an implementation of the Google PageRank algorithm. To speed up the PR calculation, several PCs could be used.
Pansophica is an intelligent web search agent that presents results in a dynamic and interactive virtual reality. Twist, fly and play the net.
A function-testing, performance-measuring, site-mirroring web spider that is widely portable and capable of using scenarios to process a wide range of web transactions, including SSL and forms.
PySMBSearch is a crawler and search engine for SMB shares. It consists of a crawler script, which creates an index and stores it in an SQL database, and a CGI script that can be used to query the database.
Creates really cool and useful hypermaps from an SQL database schema. It consists of a small PL/SQL metadata extractor and a Python (or C) postprocessor file.
This will be a generic indexing system for the Python language with pluggable engines to store the resulting indexes in either ZODB, MySQL or (at a later date) its own proprietary format.
Remora provides local document search capabilities to the iPhone and iTouch. The project uses the open source search engine Hyper Estraier together with a live search powered by Yahoo!
SEARCHTHELAN is a searcher/indexer of files and directories in a local area network....
The project provides an incubator for intelligent agent-assisted, AR gaming-oriented BI applications generated through the STALEMATE Knowledge-based System Design Environment (KBSDE), integrating Web-enabled knowledge bases, data mining and warehousing and directed at asset management and investment banking.
SWISH-Enhanced is a fast, powerful, *flexible*, free, and easy to use system for indexing collections of Web pages or other files. Key features include the ability to limit searches to certain HTML tags (META, TITLE, comments, etc.).
SYRAH aims to discover and represent concepts expressed in natural languages. NLP, lemma, lemma dictionary, Italian, network, semantics, clustering, semantic
A next-generation spinoff of the All-Seeing Eye server browser, which is now down. It aims to be a multiplatform server browser for as many games as possible.
A robust website scraping framework that uses XML, XPath, RegEx and scripting to consume, parse, normalize and traverse HTML based on a set of seed URLs. Scrape.NET is built using C#, TidyForNet (the p-invoke only version) and HTML Tidy.
The Python Search Engine