Digital Library Software
Greenstone is a complete digital library creation, management and distribution package created and distributed by the New Zealand Digital Library Project. There are two major versions of the software. Greenstone 3 is under active development, and is recommended for download. We also provide maintenance releases for its forerunner, Greenstone 2. Featured download not what you're looking for? Click "Browse all files" to access binaries and source releases of both versions.
Search engine and data mining applications and ClueWeb datasets.
The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software, including the Indri search engine in C++, the Galago search engine research framework in Java, the RankLib learning to rank library, ClueWeb09 and ClueWeb12 datasets and the Sifaka data mining application.
Google Hacks is a compilation of carefully crafted Google searches that expose novel functionality from Google's search and map services.
Torrent search utility.
Imgur Gallery Downloader
Users can now search Imgur for any phrase and ImgurDL/Loadur will automatically search for matching images. ImgurDL/Loadur will download the images while displaying the progress to the user.
Amberfish is general purpose text retrieval software. It supports nested queries of semi-structured text in XML format and traditional unstructured searching.
Hyper Estraier is a full-text search system. It works like Google, but is based on a peer-to-peer architecture. Using Hyper Estraier, you can build a large-scale search engine from inexpensive computers.
ARADO RSS Feed Reader is a URL database for web search and RSS feed reading, which saves your added bookmarks and RSS feeds and syncs the newest URLs with your connected devices. Store and search all your URLs in ARADO. It is built with the C++ / Qt framework.
This is an ***old archive*** of tools developed for facilitating the use of Creative Commons licenses and metadata. --- For the most up to date representation of any of the projects listed here, please see: http://creativecommons.org/project/Developer.
The Netjuke is a web-based audio streaming jukebox powered by PHP 4, a database, and all the MP3, Ogg Vorbis and other format files that constitute your digital music collection. Supports images, language packs, multi-level security, random playlists, etc.
webExtractor is a Java application that is used for extracting specific content from web based HTML, XML, CSV, and free form text. The extracted data can be used for data gathering and mining purposes.
IRToolkit is an attempt to build and develop a generic search engine that integrates state-of-the-art Information Retrieval (IR) models. Furthermore, it offers a capability to compare the performance (in terms of precision, recall, index size, search response time and so on) between several open source IR applications. If you use the IRToolkit please cite the following work: https://sites.google.com/site/dinhbaduy/bibtex#Dinh-Phdthesis-2012
DuckDuckGo from the terminal
ddgr is a cmdline utility to search DuckDuckGo from the terminal. While googler is highly popular among cmdline users, in many forums the need for a similar utility for privacy-aware DuckDuckGo came up. DuckDuckGo Bangs are super-cool too! So here's ddgr for you! Unlike the web interface, you can specify the number of search results you would like to see per page. It's more convenient than skimming through 30-odd search results per page. The default interface is carefully designed to use minimum space without sacrificing readability. ddgr isn't affiliated with DuckDuckGo in any way. Demo: https://asciinema.org/a/151849
CaC is an application to easily download and convert videos from video sites like YouTube, Google Video, etc. It's written in Lazarus / FreePascal and available for Linux, Windows and Mac OS X systems.
=DOES NOT WORK ANYMORE AS DSA HAS ADDED A CAPTCHA= DSA Practical Driving Test Monitor helps you find any available practical driving test slot within a specified date range. It runs on Linux/Mac/Windows and automates the manual task of finding a test slot.
Google() meets the Matrix. Red Piranha combines Lucene (Searching Ability), XML-RDF (ability to learn), Tomcat (for P2P Power) and Spring (Ease of use) to not only let you find anything, anywhere, but to actually understand what you are looking for.
Analysis and visualization of the social structure of "lastfm.de", based on data containing users, friends lists, groups, group members and musical neighbours.
Scalable CLI to BitTorrent trackers, aiming to be completely functional and as usable as a web browser.
System-wide utility to retrieve information about given data from several sources, either online or offline. A typical use would be the translation of selected text. This is the implementation of the idea with the same name presented on My Dream App in the summer of 2006.
Caissfind is intended to be an independent web search application, written in Java and based on the Google API.
Cheshire3 is a fast Z39.50, SRW, XML search engine, written in Python for extensibility and using C libraries for speed. It is the next generation of the Cheshire system (http://cheshire.berkeley.edu) and is designed around a distributable, object-oriented model.
CockyContacts is a Mac OS X Dashboard widget that searches and displays contact information from the University of South Carolina's online directory of students and faculty members.
Coherence is an advanced Content Management System built on top of Zope. Coherence has site, user and file management. Some of the special features are a WYSIWYG page editor with a drag-and-drop interface, version control, workflow and link management.
DCTViewer is a robust web-based solution, sponsored by Document Conversions Technology (http://docconversions.com), for digital document searching and viewing in an intranet environment. Features include document storage, indexing, searching and viewing.
A combination web and desktop application for cataloging and organizing your books, CDs and DVDs.