Digital Library Software
Greenstone is a complete digital library creation, management and distribution package created and distributed by the New Zealand Digital Library Project. There are two major versions of the software. Greenstone 3 is under active development, and is recommended for download. We also provide maintenance releases for its forerunner, Greenstone 2.
Virtuoso is a scalable cross-platform server that combines Relational, Graph, and Document Data Management with Web Application Server and Web Services Platform functionality.
The stuff here has no documentation and some of it may never be completed. This is my playground, use at your own risk.
The ht://Dig system is a complete indexing and searching system for a domain or intranet. This system is not meant to replace the need for powerful internet-wide search systems like Lycos, Infoseek, Google and AltaVista.
A function-testing, performance-measuring, site-mirroring web spider that is widely portable and capable of using scenarios to process a wide range of web transactions, including SSL and forms.
3store is an RDF "triple store", written in C and backed by MySQL and Berkeley DB. It is an optimisation and port of an older triple store (WebKBC). It provides access to the RDF data via RDQL or SPARQL over HTTP, on the command line or via a C API.
Methanol is a scriptable multi-purpose web crawling system with an extensible configuration system and speed-optimized architectural design. Methabot is the web crawler of Methanol.
CNSearch - Web site search engine. CNSearch is a full-text search system that is easy to install and set up.
Harvest is a distributed search engine framework. It collects data using various methods like HTTP, FTP, News, local files etc., extracts relevant information, creates indexes and makes them searchable using a Web interface. All of the collecting, extracti
Hyper Estraier is a full-text search system. It works like Google, but is based on a peer-to-peer architecture. Using Hyper Estraier, we can construct a large-scale search engine with cheap computers.
OpenFTS (Open Source Full Text Search engine) is an advanced PostgreSQL-based search engine that provides online indexing of data and relevance ranking for database searching. Close integration with the database allows use of metadata to restrict search re
A threaded C application that searches torrent trackers/indexers for .torrent files and sorts the results according to user-defined criteria. Uses glib2.0 and libcurl4.
Software, information, data sets and documentation for the Web as Corpus community.
Amberfish is general-purpose text retrieval software. It supports nested queries of semi-structured text in XML format and traditional unstructured searching.
ht://Check is more than a link checker. It's particularly suitable for checking broken links, anchors and web accessibility barriers, but retrieved data can also be used for Web structure mining. Uses a MySQL backend. Derived from ht://Dig.
Sgrep (sorted grep) is a much faster alternative to traditional Unix grep when searching large files, because sgrep searches sorted input files using a fast binary search to find matching lines.
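The binary-search idea behind a sorted grep can be sketched in a few lines. This is only an illustration of the general technique, not sgrep's actual code: the function name `find_prefix` is hypothetical, and it assumes the lines are already held in memory in sorted order and that matching means "line starts with the key".

```python
from bisect import bisect_left


def find_prefix(sorted_lines, key):
    """Return every line that starts with key.

    Assumes sorted_lines is sorted in ascending order, so all lines
    sharing the prefix sit next to each other.
    """
    # Binary search for the first line that is >= key: O(log n) comparisons
    # instead of scanning every line as a plain grep would.
    start = bisect_left(sorted_lines, key)

    matches = []
    for line in sorted_lines[start:]:
        # Lines with the prefix are contiguous; stop at the first miss.
        if not line.startswith(key):
            break
        matches.append(line)
    return matches


if __name__ == "__main__":
    lines = ["apple", "banana", "band", "bandit", "cat"]
    print(find_prefix(lines, "ban"))
```

On a large sorted file, the same approach would seek by byte offset rather than load all lines, but the search logic is the same.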
The BeeGram library is a portable open source search engine toolkit written in C. BeeGram provides a number of building blocks for the construction of powerful general-purpose text-based search tools.
BeeSeek is a project to build a free, open-source search engine based on peer-to-peer technology. Code and bug reports are available at https://launchpad.net/beeseek-project
DocTaur is a Web-based searchable directory of reference manuals. You can freely download, install, and administer it on your local Linux intranet server. It is powered by the ht://Dig search engine and contains reference manuals for developers.
FileStructureToHTML creates an .html file listing the MP3s, videos and other files on a selected drive or in a specific directory. The files can be organized as a tree, a list or a table. Gain a whole new way of viewing your file list.
When released, FilmSearch will save you a huge amount of time: you will no longer have to scan the schedules of dozens of TV channels every day just to find, once a month, something interesting enough to turn on the TV. Each user will be able t
Fleming File Sharing System is a networked file sharing system. It should be much more reliable and user-friendly than FTP or NetBIOS-based sharing.
PAD stands for Portable Application Description. PAD is an XML-based open format to describe downloadable applications. By using the PAD system, developers save time by having to create a description of their software packages only once.
Harvest is a web indexing package. Originally designed for distributed indexing, it can form a powerful system for indexing both large and small web sites. It now also includes Harvest-NG, a highly efficient, modular, Perl-based web crawler.
A collection of software to implement search engine technology. The overall search technology is built on the individual components of this project. Each component is released under the BSD License and is written in the language most suited to its task.