Hunspell is a spell checker and morphological analyzer library and program designed for languages with rich morphology and complex compounding or character encoding. Hunspell interfaces: Curses, Ispell-compatible pipe interface, OpenOffice.org UNO module.
CLucene is a C++ port of Lucene, the high-performance, full-featured text search engine written in Java. CLucene is faster than Lucene, as it is written in C++.
Annotea Ubimarks is an application of Annotea shared bookmarks in Mozilla. It helps users to easily organize Web information by using familiar concepts, share findings with trusted peer groups and benefit from the underlying Semantic Web technologies.
TagHybrida is a French hybrid syntactic parser. TagHybrida is a four-stage parser combining hand-written and corpus-based information.
Job publishing and search engine based on Java2EE, Hibernate, PostgreSQL, and Jersey, with a web interface based on jQuery.
MAOS (Meta-Attribute Object Store) is a lightweight Java library/framework implementing simple object persistence using search-engine technology.
Framework (scripts, configuration, code) to build free and public services around travel and leisure data. The project makes extensive use of existing data sources such as GeoNames and DBpedia, and adds some glue around them (e.g., links).
This project aims to provide a clean API, and the corresponding C++ implementation, for parsing travel-focused requests (e.g., "washington dc beijing monday r/t +aa -ua 1 week 2 adults 1 dog").
The goal of OpenParentalControls is to provide a user-contributed database of website age ratings, as well as a series of extensions for popular web browsers to honor, update and vote on these ratings.
The Overmind is a "Sentient Component Framework Glue" for rapidly developing web applications. Overmind adds intelligent merging for various existing frameworks and predictively retrieves application data and rules using JIT dependency injection.
RSS spider for gathering multiple RSS feeds into a single place with search capabilities.
Relational storage for tagged documents
Restad is an indexing and querying tool for tagged documents. It uses a relational database for storage and querying. See the latest news on the blog: https://sourceforge.net/p/restad/blog/ The first Ruby prototype can be found at: https://github.com/ymoreau/Restad
A fast way to rate the reading difficulty of a book or text. Unlike well-known readability metrics such as Fog, Kincaid, SMOG, ARI, Flesch, and Coleman-Liau, this metric takes into account far more factors and is standardized against a corpus.
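For context, one of the established metrics named above, the Automated Readability Index (ARI), can be sketched in a few lines. This is not this project's own metric, whose factors are not described here; it only illustrates the kind of surface-level formula the project contrasts itself with. The function name and the simple regex-based sentence and word splitting are assumptions made for illustration.

```python
import re


def automated_readability_index(text: str) -> float:
    """Return the ARI score for `text`; higher scores mean harder text.

    Assumption: sentences end at '.', '!' or '?', and a word is a run of
    letters, digits, or apostrophes. Real tools use more careful tokenizers.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")

    # ARI counts only letters and digits as characters.
    characters = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)

    return (
        4.71 * characters / len(words)
        + 0.5 * len(words) / len(sentences)
        - 21.43
    )
```

Because ARI depends only on characters per word and words per sentence, it ignores vocabulary frequency, syntax, and other factors that a corpus-standardized metric could take into account.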
SphinxSearch is an extension for MediaWiki to replace the built-in search engine with a Sphinx search engine backend.
Spider is a web crawler written in Java. Based on a regular expression string, the spider searches the internet for web pages matching this string and stores them in a MySQL database.
Generates a dynamic web application supporting CRUD actions from XML documents. Credits to Ministry of Culture and Communication, France; UNESCO; Ecole Nationale des Chartes, France; PASS-TECH, France.
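The match-and-store step described for Spider can be sketched as follows. This is a minimal illustration and not the project's code: it is written in Python instead of Java, uses the standard-library `sqlite3` module in place of MySQL, and takes already-fetched pages as a dictionary instead of crawling. The function name `store_matching_pages` and the table layout are hypothetical.

```python
import re
import sqlite3


def store_matching_pages(pages, pattern, conn):
    """Store every page whose body matches `pattern`; return the stored URLs.

    `pages` maps URL -> page text (assumed to be fetched elsewhere).
    `conn` is an open sqlite3 connection standing in for the MySQL database.
    """
    regex = re.compile(pattern)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body TEXT)"
    )

    stored = []
    for url, body in pages.items():
        # Keep the page only if the regular expression occurs in its text.
        if regex.search(body):
            conn.execute(
                "INSERT OR REPLACE INTO pages (url, body) VALUES (?, ?)",
                (url, body),
            )
            stored.append(url)

    conn.commit()
    return stored
```

A caller could pass `sqlite3.connect(":memory:")` as `conn` to try the function without creating a database file.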
Clucened is a project to build a daemon around CLucene, which is a C++ implementation of the Lucene search engine. This is *not* the CLucene project, but is a separate project to write a generic daemon based on CLucene.
jBingAPI is a Java library for querying the Microsoft search engine Bing (http://www.bing.com/) using its public API. jBingAPI simply makes it much easier to communicate with this API.