The NITE XML Toolkit supports the creation, analysis, and browsing of annotated multimodal, text, or spoken language corpora, and represents both timing and rich linguistic structure. It contains libraries for developers and some end user tools.
Ontea - Pattern-based Semantic Annotation Platform. Ontea searches for or creates semantic metadata from text or documents using pattern-based approaches. The implementation currently includes regular expression (regex) patterns.
A lyrical analysis and classification tool focused specifically on rhyming style in rap lyrics. Functions include phonetic transcription, rhyme visualization, and rapper classification.
Contextor is a lightweight, simple-to-use Java-based library that helps developers and researchers work with the general concept of a resource; for example, resources can be text resources, web resources, images, and videos.
Optex Analyzer is software for analyzing and comparing algorithms that approximately solve optimization problems. Its GUI lets you select a set of input files containing raw algorithm results. The analysis is shown in tables and charts.
This project is a compilation of tools and libraries, mainly in Java, that help with tasks related to text analytics. These tools range from simple wrappers to sophisticated mining tasks that can improve the productivity of researchers and engineers.
OpenDMAP (Open Source Direct Memory Access Parser) is a natural language processing (text mining) application: a semantic parser for information extraction.
* Java classes for parsing text and converting it to XML or evaluating it in Java. The parser is controlled by a textual script with a syntax close to Backus-Naur Form, named ZBNF. * Some conversion routines: C header or Java to XMI, XML documentation generation,
DawNLITE is a Natural-Language-based Image Transmoding Engine. The software transforms an image to a video as recorded by a virtual camera panning and zooming over the image, following a natural language text description of the image.
Provides a GUI for the grammatical structure and relations (as parsed by the Stanford Parser) of any text.
Contains a grammatical relation editor to modify, import, and export grammatical relation definitions (tregex patterns and features).
The Fiber project seeks to create a modular open source text mining tool that provides a contextual foundation for analysis in the dissemination of large quantities of text data.
T-Rex (Trainable Relation Extraction) is a highly configurable, machine learning-based framework for information extraction from text, which includes tools for document classification, entity extraction, and relation extraction.
The main purpose of AMATOOL is to provide an application for semiautomatic markup of text using XML tags. Typical texts are archaeological reports or texts from the Middle Ages.
It is a semiautomatic editor.
The Java Text Categorizing Library (JTCL) is a pure Java implementation of libTextCat, which in turn is "a library that was primarily developed for language guessing, a task on which it is known to perform with near-perfect accuracy."
LACE means "Lucene Analyzer for CJK (Chinese/Japanese/Korean) & English". It's a simple tokenizer that can handle mixed English-CJK text. Chinese words are handled using a dictionary-based method.
hypKNOWsys aims at developing a Java-based workbench for knowledge discovery and knowledge management. Currently, hypKNOWsys has released two intermediate tools: DIAsDEM Workbench (text mining for semantic tagging) and WUMprep (Web mining pre-processing)
The UIMA Annotator (called BRUTUS - Business Rules from Unstructured Text and Unstructured Sources) is a component for the UIMA Framework that allows for capturing business knowledge formalized in Structured English syntax (based on OMG's SBVR) with MOF
An approximate gazetteer for GATE (General Architecture for Text Engineering), based on Levenshtein distance. Strings can be matched and found even in texts containing noise and errors. More info: http://bruno-wp.blogspot.com/search/label/Software
Java Expert Rule Based Inference Language. Jerbil is an open source rule processing engine written in Java. Currently Jerbil supports a full set of processing functions with text-based and XML interfaces; a Java interface is planned.
The Word Vector Tool is a simple but flexible Java library for creating word vector representations of text documents. Word vectors can be used for various text processing tasks, such as text classification, text clustering, or information retrieval.