Ontea - Pattern-based Semantic Annotation Platform. Ontea finds or creates semantic metadata from text or documents using pattern-based approaches. The implementation currently includes regular expression (regex) patterns
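As a rough illustration of pattern-based annotation with regular expressions, the sketch below scans text with a small table of patterns and emits typed annotations. It is not Ontea's code or API: the `PATTERNS` table, the annotation types, and the `annotate` function are all hypothetical names chosen for this example.

```python
import re

# Hypothetical pattern table (annotation type -> regex); not Ontea's real patterns.
PATTERNS = {
    "Email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "Year": re.compile(r"\b(?:19|20)\d{2}\b"),
}


def annotate(text):
    """Return (type, matched text, start offset) tuples sorted by offset."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((label, match.group(0), match.start()))
    return sorted(found, key=lambda item: item[2])


if __name__ == "__main__":
    for annotation in annotate("Contact jane@example.org, joined 2009."):
        print(annotation)
```

Each tuple could then be turned into a metadata record (for example an RDF triple or an XML tag), depending on the target format.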
A lyrical analysis and classification tool focused specifically on rhyming style in rap lyrics. Functions include phonetic transcription, rhyme visualization, and rapper classification.
Contextor is a lightweight, simple-to-use Java-based library that helps developers and researchers work with the general concept of a resource; for example, resources can be text resources, web resources, images, and videos.
sigMan is a utility for the analysis of time-dependent signals, especially electropherogram/chromatogram data. It is no longer being actively updated or maintained by me (Nathan Cermak) as of January 2011, for lack of any users.
Feed State is used to view (and store) log files of different formats from many different processes over a network. A massive variety of logs are supported: XML, database, all ASCII log files, all parsed into a common format for viewing and analysis.
Optex Analyzer is software for analyzing and comparing algorithms that approximately solve optimization problems. Its GUI lets you select a set of input files containing raw algorithm results. The analysis is presented as tables and charts.
This project is a compilation of tools/libraries to help with tasks related to Text Analytics mainly in Java. These tools range from simple wrappers to sophisticated mining tasks that can improve the productivity of researchers and engineers.
OpenDMAP (Open Source Direct Memory Access Parser) is a natural language processing (text mining) application: a semantic parser for information extraction.
* Java classes for parsing text and either converting it to XML or evaluating it in Java. The parser is controlled by a textual script whose syntax, named ZBNF, is close to Backus-Naur Form.
* Some routines for conversion: C header or Java to XMI, XML documentation generation,
DawNLITE is a Natural-Language-based Image Transmoding Engine. The software transforms an image to a video as recorded by a virtual camera panning and zooming over the image, following a natural language text description of the image.
Provides a GUI for the grammatical structure and relations (as parsed by the Stanford Parser) of any text.
Contains a grammatical relation editor to modify, import, and export grammatical relation definitions (tregex patterns and features).
The Vodoo/Stream project lets users define transducers dedicated to document analysis. Such transducers describe how fragments are matched and transformed. A document can be an XML fragment, free text, or something else, depending on extensions
The Fiber project seeks to create a modular open source text mining tool that provides a contextual foundation for analysis in the dissemination of large quantities of text data.
T-Rex (Trainable Relation Extraction) is a highly configurable machine learning-based Information Extraction from Text framework, which includes tools for document classification, entity extraction and relation extraction.
Like Unix-Tail BUT:
- Runs with or without GUI
- Suspend and resume tailing at runtime
- Can monitor a set of Files
- Print output to a textfield, stdout or file
- Runs in "Grep" mode, too (Read files once)
- (Almost) the same options as Unix-Tail
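The two reading modes in the list above can be sketched in a few lines. This is only an illustration of the general idea, not the tool's implementation; `tail_lines` and `grep_lines` are hypothetical helper names.

```python
import re
from collections import deque


def tail_lines(lines, n=10):
    """Keep only the last n lines of an iterable, similar to `tail -n`."""
    # A bounded deque discards older lines as new ones arrive.
    return list(deque(lines, maxlen=n))


def grep_lines(lines, pattern):
    """Read the input once and keep the lines that match a regex ("grep" mode)."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]


if __name__ == "__main__":
    log = ["start", "error: disk", "ok", "error: net", "done"]
    print(tail_lines(log, 2))
    print(grep_lines(log, r"^error"))
```

Both helpers accept any iterable of strings, so an open file object can be passed directly.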
The main purpose of AMATOOL is to provide an application for semiautomatic markup of text using XML tags. The texts are typically archaeological reports or medieval text scripts.
It is a semiautomatic editor.
DuMP3 is a duplicate and similar file finder. It finds exact duplicate binaries by hash, similar text files by substring content, images (JPG, BMP, GIF, PNG, etc) by color and audio files (MP3, WAV, OGG, etc) by wave data. Future: fonts, video.
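Finding exact duplicates by hash usually means grouping items by a digest of their bytes. The sketch below shows that idea only; it is not DuMP3's code, the choice of SHA-256 is an assumption, and `group_duplicates` is a hypothetical name. It works on in-memory byte strings so it runs without touching the file system.

```python
import hashlib
from collections import defaultdict


def group_duplicates(blobs):
    """Group names whose byte content is identical.

    blobs: dict mapping a name to its bytes.
    Returns a sorted list of sorted name groups with more than one member.
    """
    by_digest = defaultdict(list)
    for name, data in blobs.items():
        # SHA-256 is an assumption here; any strong hash would serve.
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return sorted(sorted(names) for names in by_digest.values() if len(names) > 1)


if __name__ == "__main__":
    files = {"a.txt": b"same", "b.txt": b"same", "c.txt": b"other"}
    print(group_duplicates(files))
```

For real files, the bytes would be read in chunks and fed to the hash object incrementally rather than loaded whole.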
The Java Text Categorizing Library (JTCL) is a pure Java implementation of libTextCat, which in turn is "a library that was primarily developed for language guessing, a task on which it is known to perform with near-perfect accuracy."
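Language guessing of this kind is commonly done by comparing ranked character n-gram profiles with an "out-of-place" distance. The sketch below illustrates that general technique under the assumption that it resembles what such libraries do; it is not JTCL's or libTextCat's API, and `profile`, `out_of_place`, and `guess` are hypothetical names. Real profiles are built from much larger training texts.

```python
from collections import Counter


def profile(text, n_max=3, top=300):
    """Map each of the most frequent character n-grams (1..n_max) to its rank."""
    padded = "_" + "_".join(text.lower().split()) + "_"
    counts = Counter(
        padded[i:i + n]
        for n in range(1, n_max + 1)
        for i in range(len(padded) - n + 1)
    )
    # Sort by descending frequency, breaking ties alphabetically for determinism.
    ranked = sorted(counts, key=lambda gram: (-counts[gram], gram))[:top]
    return {gram: rank for rank, gram in enumerate(ranked)}


def out_of_place(doc, lang):
    """Sum rank differences; n-grams missing from the language get a fixed penalty."""
    penalty = len(lang)
    return sum(abs(rank - lang[gram]) if gram in lang else penalty
               for gram, rank in doc.items())


def guess(text, profiles):
    """Return the name of the language profile closest to the text."""
    doc = profile(text)
    return min(profiles, key=lambda name: out_of_place(doc, profiles[name]))


if __name__ == "__main__":
    samples = {
        "en": "the quick brown fox jumps over the lazy dog",
        "de": "der schnelle braune fuchs springt über den faulen hund",
    }
    profiles = {name: profile(text) for name, text in samples.items()}
    print(guess(samples["en"], profiles))
```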
LACE means "Lucene Analyzer for CJK (Chinese/Japanese/Korean) & English". It's a simple tokenizer that can handle English-CJK mixed text. Chinese words are handled using a dictionary based method.
hypKNOWsys aims at developing a Java-based workbench for knowledge discovery and knowledge management. Currently, hypKNOWsys has released two intermediate tools: DIAsDEM Workbench (text mining for semantic tagging) and WUMprep (Web mining pre-processing)
The UIMA Annotator (called BRUTUS - Business Rules from Unstructured Text and Unstructured Sources) is a component for the UIMA Framework that allows for capturing business knowledge formalized in Structured English syntax (based on OMG's SBVR) with MOF