An open source search engine with RESTful API and crawlers
OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows and Linux/Unix/BSD.
Search engine and data mining applications and ClueWeb datasets.
The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software, including the Indri search engine in C++, the Galago search engine research framework in Java, the RankLib learning to rank library, ClueWeb09 and ClueWeb12 datasets and the Sifaka data mining application.
ARADO RSS Feed Reader is a URL database for web search and RSS feed reading, which saves your added bookmarks and RSS feeds and syncs the newest URLs with your connected devices. Store and search all your URLs in ARADO. It is built on the C++ / Qt framework.
Digital Library Software
Greenstone is a complete digital library creation, management and distribution package created and distributed by the New Zealand Digital Library Project. There are two major versions of the software. Greenstone 3 is under active development, and is recommended for download. We also provide maintenance releases for its forerunner, Greenstone 2. Featured download not what you're looking for? Click "Browse all files" to access binaries and source releases of both versions.
The "Netiquette abolishment project"! Replace content RATING by content POSITIONING. This project's main goal is to create an online 'real place': it will work like 3D visualisation software: you select your interests by getting close to them,
Seeks is a free and open technical design and application for enabling social web search. Its specific purpose is to group users whose queries are similar so they can share both the query results and their experience with these results.
In the current phase, the author is concentrating on the following work: 1. To provide an OO framework over the Linux POSIX interfaces. 2. To provide a basic encapsulation of the HTTP protocol. 3. To provide a high-performance URL robot.
SF FTP Search Engine: high speed, open source, and no database required. Demo: http://sf.hit.edu.cn/
crowdspider is a multi-threaded web crawler. crowdspider is (just) a web crawler, NOT an indexer. You have to write some code yourself in order to save pages or index them in a database.
The Jobcrawler search engine is a research project that aims to index the available applications on the internet. Our mission is to really help people who seek a job or an employee on a one-to-one basis and to cut out mediators (job agencies).
Uni-wordsplit aims to provide a Unicode lexical analysis (word splitting) system, designed especially for CJK (China/Japan/Korea) users. The code is based on Mozilla XPCOM code.
filofant is an archiving and indexing server for e-mails, attachments and other documents stored in various locations in your company. The indexed documents are accessible through a customizable web frontend, much like an internet search engine.
POPsearch is a desktop search engine that's designed to help you find information on your computer. This information can then be accessed remotely with RSS feeds, email feeds, or from any computer that has a web browser.
Xyzse implements the essential functions of a general web search engine. It is developed for students or anyone who is interested in search engines. More features will be added in the following releases.
PMS (Perl Managed Streaming). A set of tools and a background daemon written in Perl to help manage and control online radios and Icecast streams. Includes searches, uploads, song permissions, playlists, shoutouts, requests, and IRC and web-based controls, etc.
HooDoo is designed to provide most of the same functionality as Google, but available to all for their own websites.
You can 'wear' any clothes found on the internet 'virtually' on an image of your body. The application stores the URL where you can buy the clothes, and you can share information such as your best-looking shot or your ratings of the clothes over the internet.
A fast way to rate the reading difficulty level of a book or text. Unlike well-known readability metrics such as Fog, Kincaid, SMOG, ARI, Flesch, and Coleman-Liau, this metric takes into account far more factors and is standardized against a corpus
A system to retrieve and display in 3D the structure of the Internet (or as much as can be analysed). It should allow for an interesting perspective of the way pages are linked and clustered. It will hopefully also provide a more intuitive way of browsing
KPhotools is a Qt and Imlib2 based Web Album (image gallery) generation tool for KDE. It can: resize images, rotate images, blend logos on images, create web albums with slideshows, and take screenshots from your desktop(s). No PHP needed on your server.
Amberfish is general purpose text retrieval software. It supports nested queries of semi-structured text in XML format and traditional unstructured searching.
SYRAH aims to discover and represent concepts expressed in natural languages. NLP, lemma, lemma dictionary, Italian, network, semantics, clustering
GUI frontend for the full-text search engine namazu (www.namazu.org)
Neko is a GUI for namazu: a full-text search engine (www.namazu.org).
Hunspell is a spell checker and morphological analyzer library and program designed for languages with rich morphology and complex compounding or character encoding. Hunspell interfaces: Curses, Ispell-compatible pipe interface, OpenOffice.org UNO module