EyeMap is a visualization and analysis tool for eye movement data recorded during text reading. It handles Unicode text, proportional and non-proportional fonts, and spaced and unspaced reading materials, so it supports a wide range of languages and experimental methods.
The Deep Email Miner Application is a software solution for the multistage analysis of an email corpus. It combines social network analysis and text mining techniques to give an in-depth view of the underlying information.
The self-executable Version 1.1 jar file will now run on Java 1.5 or higher.
A Windows executable file of Version 1.1 is also provided in the Files section.
Documentation can be found on the project homepage.
SemaRule Navigator is an Integrated Suite of Open-Source and Free-License Software, placing Semantic and TextAnalysis Technologies in the toolbox of Researchers, Students, and Enterprises.
This project aims to implement the following text mining techniques in Java: text language detection, keyword and keyphrase extraction, text classification, text clustering, single- and multi-document summarization, and plagiarism detection.
Apolda is a plugin for the GATE framework (see http://sourceforge.net/projects/gate/) that annotates texts with labels of concepts from an arbitrary OWL ontology.
A multi-agent architecture for building interactive dramas. It uses Jason's BDI engine, and Jason's agent-oriented programming language is used both for drama management and for authoring character behaviors.
TextMarker is now developed and hosted at Apache UIMA (http://uima.apache.org/textmarker.html). TextMarker is a UIMA-based tool for information extraction and more. The full-featured editor for the rule language and the build process for UIMA descriptors are complemented by components for visualization, explanation, testing, and rule learning.
SEMANTIXS is a semantic information extraction system that can extract, represent, and visualize domain-specific information from free text in the form of complex (and simple) relationships. See http://www.cs.iastate.edu/~semantix/ for more information.
OSCAR (Open Source Chemistry Analysis Routines) is software for the semantic annotation of chemistry papers. The modules OPSIN (a name-to-structure converter) and ChemTok (a tokeniser for chemical text) are also available as standalone libraries.
The NITE XML Toolkit supports the creation, analysis, and browsing of annotated multimodal, text, or spoken language corpora, and represents both timing and rich linguistic structure. It contains libraries for developers and some end user tools.
LDIFF is an enhanced, language-independent line differencing tool built on Unix diff. It overcomes diff's limitations in determining whether an artifact line has been changed or is the result of additions and removals.
Ontea - Pattern-based Semantic Annotation Platform. Ontea finds or creates semantic metadata from text or documents using pattern-based approaches. The implementation currently includes regular expression (regex) patterns.
A lyrical analysis and classification tool focused specifically on rhyming style in rap lyrics. Functions include phonetic transcription, rhyme visualization, and rapper classification.
Contextor is a lightweight, simple-to-use Java-based library that helps developers and researchers work with the general concept of a resource; for example, resources can be text resources, web resources, images, and videos.
sigMan is a utility for the analysis of time-dependent signals, especially electropherogram/chromatogram data. It is no longer being actively updated or maintained by me (Nathan Cermak) as of January 2011, for lack of any users.
Feed State is used to view (and store) log files of different formats from many different processes over a network. A massive variety of logs are supported: XML, database, all ASCII log files, all parsed into a common format for viewing and analysis.
Optex Analyzer is software for analyzing and comparing algorithms that approximately solve optimization problems. Its GUI lets you select a set of input files containing raw algorithm results. The analysis is presented as tables and charts.
This project is a compilation of tools and libraries, mainly in Java, that help with tasks related to text analytics. These tools range from simple wrappers to sophisticated mining tasks that can improve the productivity of researchers and engineers.
OpenDMAP (Open Source Direct Memory Access Parser) is a natural language processing (text mining) application: a semantic parser for information extraction.