Project moved to GitHub!
https://github.com/carrot2/carrot2
Carrot2 is an Open Source Search Results Clustering Engine. It can automatically organize small collections of documents, e.g. search results, into thematic categories. Carrot2 integrates very well with both Open Source and proprietary search engines.
This package contains tools that add NLP capabilities to Lucene 4.x (tested with Lucene versions 4.6.x through 4.8.1). Although it was originally developed for German, it is mostly language independent.
It allows the user to lemmatize words to be indexed, to weight terms by their parts of speech (e.g. weighting nouns more heavily than pronouns), and to add synonyms taken from GermaNet or a list you provide to the search index, thereby increasing Lucene's recall.
SSWAP (Simple Semantic Web Architecture and Protocol; pronounced "swap") is an architecture, protocol, and platform that uses reasoning to semantically integrate disparate data and services on the web. Running live at http://sswap.info.
Please click on "BROWSE ALL FILES" to open a README!
What is this program?
This program searches texts and suggests sentences that might follow any input sentence. It was meant to help with composing written work.
How was it made?
First, I made a chatterbot to understand the concept of "predicting" the next sentence (it worked as well as Cleverbot); then I made this program, in which a user guides the composition of new writing.
SAMPLE:
Input sentence was "Learning...
This project aims to build a suite of Natural Language Processing tools. Modules will include corpus indexing and access tools, a part-of-speech tagger, tokenisers, text classification software, etc.
This project provides cross-forge semantic search for the Qualipso Forge. It integrates the A4 AdvDoc prototype (semantic search GUI and engine) with the A3 homogeneous and heterogeneous cross-forge semantic search capabilities. See Qualipso.org for details.
TestEl is a Java-based learning analyzer for HTML (and possibly other) structured documents. It can be trained to detect structures in such documents and renders hits in XML.
The Wikipedia Miner toolkit provides simplified access to Wikipedia. This open encyclopedia represents a vast, constantly evolving multilingual database of concepts and semantic relations, making it a promising resource for NLP and related research.
OpenEphyra is an open framework for question answering (QA). It retrieves answers to natural language questions from the Web and other sources. Visit http://www.ephyra.info/ for more details and information on joining this open research initiative.
A true-hybrid web search engine designed to organize web-based information by making heavy use of a mutually beneficial collaboration between humans and artificial intelligence.
Web-as-corpus tools in Java.
* Simple Crawler (and also integration with Nutch and Heritrix)
* HTML cleaner to remove boilerplate code
* Language recognition
* Corpus builder
An implementation of the Bee Hive @ Work algorithm, which simulates the foraging behavior of honey bees in nature. The aim is to provide an extensible framework that researchers can use to easily create new applications of this algorithm.
This project tries to find geographical locations (expressed as GPS positions) for specific queries (e.g. 'art') in specific environments (e.g. 'Maastricht') by analyzing web pages.
S3B - Social Semantic Search and Browsing - is middleware that delivers a set of search and browsing components that can be used in J2EE web applications to deliver user-oriented features based on semantic descriptions and social networking.
SYRAH aims to discover and represent concepts expressed in natural languages. Keywords: NLP, lemma, lemma dictionary, Italian, network, semantics, clustering.
DOSE: a distributed platform for semantic elaboration that provides semantic services such as automatic annotation of web resources at the document substructure level, semantic search facilities, semantic annotation storage and retrieval.
Open Source Semantic Web Search Engine Software: If two machines anywhere on the web can agree on the same definition of a digital service or digital good, then machine-to-machine transactions can use this lingua franca to transact on the user's behalf.
This project intends to create an indexing search engine for knowledge management. The primary objective is to apply an information retrieval core and to implement knowledge discovery techniques such as data mining and text mining algorithms.
SENTENSA Knowledge Miner is a platform-independent tool for searching any text. SENTENSA uses robust methods of indexing and searching text, drawing on more than 20 years of experience in information retrieval.
Catalogo is a system for cataloguing resources on a web site. It allows semantic search of information on an intranet using metadata, RDF and ontology concepts. It provides a Catalog server (Java web applications) and a Catalog client (Firefox plug-in).
The Jorne project develops software and open standards for linking Lojban text with WWW and Semantic Web metadata (e.g. RDF/N3, RSS, XML). Lojban is an artificial spoken and written language based on predicate logic.
The OpenBorges project intends to provide a humble place to experiment with, and debate, what an open, distributed, adaptive, and collaborative semantic virtual library could be. Inspirations are: As We May Think, The Library of Babel, and Weaving the Web.