Hunspell is a spell checker and morphological analyzer library and program designed for languages with rich morphology and complex compounding or character encoding. Hunspell interfaces: Curses, Ispell-compatible pipe interface, OpenOffice.org UNO module.
Digital Library Software
Greenstone is a complete digital library creation, management and distribution package created and distributed by the New Zealand Digital Library Project. There are two major versions of the software. Greenstone 3 is under active development, and is recommended for download. We also provide maintenance releases for its forerunner, Greenstone 2. Featured download not what you're looking for? Click "Browse all files" to access binaries and source releases of both versions.
Search engine and data mining applications and ClueWeb datasets.
The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software, including the Indri search engine in C++, the Galago search engine research framework in Java, the RankLib learning to rank library, ClueWeb09 and ClueWeb12 datasets and the Sifaka data mining application.
CLucene is a C++ port of Lucene, the high-performance, full-featured text search engine written in Java. CLucene is faster than Lucene as it is written in C++.
An open source search engine with RESTful API and crawlers
OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you will be able to quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows and Linux/Unix/BSD.
The stuff here has no documentation and some of it may never be completed. This is my playground, use at your own risk.
The ht://Dig system is a complete indexing and searching system for a domain or intranet. This system is not meant to replace the need for powerful internet-wide search systems like Lycos, Infoseek, Google and AltaVista.
SWISH++ is a Unix-based file indexing and searching engine (typically used to index and search files on web sites). It's very fast, robust, and can index several file formats including text, HTML, mail, news, LaTeX, and MP3, and apply filters.
Lurker is a mailing list archiver designed for capacity, speed, simplicity, and configurability, in that order. Noteworthy features include Google-style searching on all fields, chronology-preserving threads, multilingual support, and attachment support.
Amberfish is general-purpose text retrieval software. It supports nested queries of semi-structured text in XML format and traditional unstructured searching.
Dave's Quick Search Deskbar is an add-on for the Windows Desktop Taskbar that lets you launch searches quickly. With almost 400 searches, a calculator, clock, calendar, and much more in one little textbox, it's monster functionality in a flea-sized
ARADO RSS Feed Reader is a URL database for web search and RSS feed reading, which saves your added bookmarks and RSS feeds and syncs the newest URLs with your connected devices. Store and search all your URLs in ARADO. It is built with the C++/Qt framework.
Desktop Search - Speed up searching your Windows PC and Outlook emails
Copernic Desktop Search helps you search within your computer for documents, files & emails. Download the best Desktop Search today for free. Copernic Desktop Search allows you to centralize your document, file & email searches in one unique interface. You can search your files & documents on your computer, external and network drives. Increase your search speed and your organization's productivity while reducing the time lost trying to find those documents. Join the largest community of Desktop Search users, with 4,000,000+ users in more than 125 countries.
WallPaper (alias crawlpaper) is a desktop changer (NOT a screensaver) which includes a web crawler for picture download, an audio stream ripper, an audio player, a mini mp3 tag editor, etc. It also includes support for .zip and .rar files.
This project aims to provide a clean API, and the corresponding C++ implementation, for parsing travel-focused requests (e.g., "washington dc beijing monday r/t +aa -ua 1 week 2 adults 1 dog").
A full-featured Internet search engine, specifically designed to power vertical search, enterprise search, or a knowledge area search. Can index 2.5 million documents per 24 hours on a single Dell server. Clean C++/STL code written from scratch.
High performance distributed in-memory key/value store
Infinispan is an open source, Java-based data grid platform. ***IMPORTANT*** Starting with Infinispan 5.0.0.FINAL, Infinispan releases are no longer hosted on SourceForge. They can now be found at www.jboss.org/infinispan/downloads
SNT is a search engine for SMB and FTP shares with a crawler that runs on Win32. A web interface is provided for searching files and browsing share contents. A shared film list with user ratings and comments is also provided.
CNSearch - Website Search Engine. CNSearch is a full-text search system that is easy to install and set up.
CoverYourASP.com - complete Active Server Pages source (JScript) for this popular web site. Includes full membership system, diary, online db admin, banner ad system and loads more.
The DocConversion project provides a distributed document conversion solution with a well-defined API which makes use of existing conversion tools and/or a centralized conversion server. This is part of the PRONIR research at http://www.pronir.nl
Grub is a distributed internet crawler/indexer designed to run on multi-platform systems, interfacing with a central server/database.
A new web crawler with a sophisticated search process specialized by language!
It is basically a program that lets you build a search engine. It includes a web crawler, all the website source code (in ASP, soon in PHP as well), and a MySQL database.
ALJ is a program for exporting a user's complete journal, including all comments, to plain files. Once the journal is downloaded, the program can check whether new entries are available, download them, and add them to the plain file. http://s93143383.onlinehome.us/