Digital Library Software
Greenstone is a complete digital library creation, management and distribution package created and distributed by the New Zealand Digital Library Project. There are two major versions of the software. Greenstone 3 is under active development, and is recommended for download. We also provide maintenance releases for its forerunner, Greenstone 2. Featured download not what you're looking for? Click "Browse all files" to access binaries and source releases of both versions.
Search engine and data mining applications and ClueWeb datasets.
The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software, including the Indri search engine in C++, the Galago search engine research framework in Java, the RankLib learning to rank library, ClueWeb09 and ClueWeb12 datasets and the Sifaka data mining application.
Google Hacks is a compilation of carefully crafted Google searches that expose novel functionality from Google's search and map services.
Imgur Gallery Downloader
Users can now search Imgur for any phrase and ImgurDL/Loadur will automatically search for matching images. ImgurDL/Loadur will download the images while displaying the progress to the user.
Amberfish is general-purpose text retrieval software. It supports nested queries of semi-structured text in XML format as well as traditional unstructured searching.
IRToolkit is an attempt to build and develop a generic search engine that integrates state-of-the-art Information Retrieval (IR) models. Furthermore, it offers a capability to compare the performance (in terms of precision, recall, index size, search response time and so on) between several open source IR applications. If you use the IRToolkit please cite the following work: https://sites.google.com/site/dinhbaduy/bibtex#Dinh-Phdthesis-2012
Hyper Estraier is a full-text search system. It works like Google, but is based on a peer-to-peer architecture. With Hyper Estraier, you can build a large-scale search engine from inexpensive computers.
The Netjuke is a Web-based audio streaming jukebox powered by PHP 4, a database, and all the MP3, Ogg Vorbis and other format files that constitute your digital music collection. It supports images, language packs, multi-level security, random playlists, etc.
CaC is an application for easily downloading and converting videos from video sites such as YouTube and Google Video. It's written in Lazarus / FreePascal and is available for Linux, Windows and Mac OS X systems.
ARADO RSS Feed Reader is a URL database for web search and RSS feed reading. It saves the bookmarks and RSS feeds you add and syncs the newest URLs with your connected devices. Store and search all your URLs in ARADO. It is built with the C++ / Qt framework.
Google() meets the Matrix. Red Piranha combines Lucene (Searching Ability), XML-RDF (ability to learn), Tomcat (for P2P Power) and Spring (Ease of use) to not only let you find anything, anywhere, but to actually understand what you are looking for.
This is an ***old archive*** of tools developed for facilitating the use of Creative Commons licenses and metadata. --- For the most up to date representation of any of the projects listed here, please see: http://creativecommons.org/project/Developer.
HTTP Directory Index is a PHP script that acts as a friendly graphical interface for indexing Web directories.
The Medlane project is an attempt to create a set of tools that will enable librarians to move from the standard MARC (MAchine Readable Cataloging) format to a new library/museum XML format. This move will ensure traditional library/museum data remains
Online news and newspaper harvester, like an RSS newsreader with a database. Covers national and international news. Very detailed; catches hard-to-find news articles. Allows reposting of summaries with comments to Usenet newsgroups, complex searches, and more.
Zope is an open source application server specializing in content management, intranets, and custom web applications. Zope is written in Python and has a large, global community of developers and companies.
Simple application for downloading pictures from Zerochan.net
Simple Java application for downloading high-quality pictures from Zerochan.net. You can find images by size or by tag. It's simple. And flat. All you need to do is download the .jar file and run it with the Oracle JVM (or any other JVM that supports image decoding).
webExtractor is a Java application that is used for extracting specific content from web based HTML, XML, CSV, and free form text. The extracted data can be used for data gathering and mining purposes.
Scalable command-line interface to BitTorrent trackers that aims to be fully functional and as usable as browsing the trackers in a web browser.
Caissfind is expected to be an independent web searching application based on Google API in Java.
Cheshire3 is a fast Z39.50, SRW, XML search engine, written in Python for extensibility and using C libraries for speed. It is the next generation of the Cheshire system (http://cheshire.berkeley.edu) and is designed around a distributable, object-oriented model.
Coherence is an advanced Content Management System built on top of Zope. Coherence has site, user and file management. Some of the special features are a WYSIWYG page editor with a drag-and-drop interface, version control, workflow and link management.
DCTViewer is a robust web-based solution, sponsored by Document Conversions Technology (http://docconversions.com), for digital document searching and viewing in an intranet environment. Features include document storage, indexing, searching and viewing,
=DOES NOT WORK ANYMORE AS DSA HAS PUT CAPTCHA= DSA Practical Driving Test Monitor helps you find any available practical driving test slot within a specified date range. It runs on Linux/Mac/Windows and automates the manual task of finding a test slot.
Fire.now is a Firefox plugin that automatically adds your documents to the WhereIsNow latest-version discovery service. Every time you upload a document somewhere, Fire.now integrates the WhereIsNow keys into the file and adds its URL to WhereIsNow.