This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read BioNLP shared task format (http://2011.bionlp-st.org/) and i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and the parameters for an event in a text.
The method is based on SVM but other ML algorithms can be adopted. The method details are explained in the...
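As a rough illustration of what "events and the parameters for an event" look like in data, here is a minimal sketch of reading BioNLP-style standoff annotations. It is not this project's Java code. It assumes the tab-separated layout commonly used in the shared task files (`T` lines for text-bound spans, `E` lines for events with `Role:Id` arguments), and `parse_standoff` is a hypothetical helper name.

```python
def parse_standoff(lines):
    """Split standoff annotation lines into text-bound spans and events.

    Assumed layout (an assumption, not taken from the project):
      T1<TAB>Protein 0 5<TAB>IL-2
      E1<TAB>Phosphorylation:T2 Theme:T1
    """
    spans, events = {}, {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T"):
            # Text-bound annotation: type, character offsets, covered text.
            ann_type, start, end = fields[1].split()
            spans[ann_id] = {
                "type": ann_type,
                "start": int(start),
                "end": int(end),
                "text": fields[2],
            }
        elif ann_id.startswith("E"):
            # Event: first item is type:trigger, the rest are role:argument.
            trigger, *args = fields[1].split()
            event_type, trigger_id = trigger.split(":")
            events[ann_id] = {
                "type": event_type,
                "trigger": trigger_id,
                "args": [tuple(a.split(":")) for a in args],
            }
    return spans, events
```

Each event then points to its trigger span and to its arguments by identifier, which is the structure an extraction system has to predict.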
Charface is a GUI for OCR engines. It is currently under development.
It supports automatic detection of the following installed engines:
- cuneiform with its languages
- tesseract with language database files
- gocr
Supports
- adding custom engines
- batch processing of images
- text postprocessing
EXTE recognizes temporal expressions in a text. Temporal expressions are any references to time, e.g. "today", "12th march 2006", "as soon as he woke up". This analyzer will recognize three languages: Spanish, English and French.
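To make the task concrete, here is a minimal sketch of pattern-based temporal expression spotting. It is not EXTE's method: the patterns are hypothetical, English-only, and cover only deictic words and simple calendar dates.

```python
import re

# Hypothetical, English-only patterns; not EXTE's actual rules.
MONTHS = (
    "january|february|march|april|may|june|july|august|"
    "september|october|november|december"
)
PATTERNS = [
    # Deictic words such as "today" or "tomorrow".
    r"\b(?:today|tomorrow|yesterday|tonight)\b",
    # Calendar dates such as "12th march 2006" or "3 May 1999".
    rf"\b\d{{1,2}}(?:st|nd|rd|th)?\s+(?:{MONTHS})\s+\d{{4}}\b",
]
TEMPORAL_RE = re.compile("|".join(PATTERNS), re.IGNORECASE)


def find_temporal_expressions(text):
    """Return (start, end, matched_text) for each pattern match in text."""
    return [(m.start(), m.end(), m.group(0)) for m in TEMPORAL_RE.finditer(text)]
```

Clause-level expressions such as "as soon as he woke up" are out of reach for fixed patterns like these and would need syntactic analysis.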
Wikipedia Concept Association Map (WCAM) is a new approach to textual knowledge representation and understanding. All concepts and associations are stored in a graph database for better performance and easy distribution.
SemNotes is a semantic note-taking tool for KDE4, built on top of Nepomuk-KDE. The tool is still under development, but it is already usable, provided that KDE4 is installed and Nepomuk is running.
OCR C++ library. Includes: contour recognition; vectorisation; matrix letter feature recognition; automatic page segmentation and rotation detection; SS3 ASM core; XML base; web-based GUI; 99.6% printed Unicode text recognition; letter base of up to 1200 letters.
TextMine is for the Perl hacker who is grappling with the problems of managing unstructured text from various sources. You can use these text mining tools to search the Web, index text, extract entities, categorize your e-mail, and summarize documents.
The purpose of this program is to take metadata and full-text OCR from ContentDM and export them into a database for use in other applications. The application is set up to generate a JPG derivative from either a TIF or JP2 associated with an object.
Tank Arena X is a 2-D arcade game for one or more players. It loads all the game data from simple text files, allowing for easy modifications and creation of new levels and mods. It also supports AI tanks through its AI engine.
This project aims to distribute a facial animation system with speech, developed for Brazilian Portuguese. The system is composed of several modules: movement extraction, facial animation, and speech through a text-to-speech system.
Voice Conference Manager uses VoiceXML and CCXML to control speech recognition, text to speech, and voice biometrics for a telephone conference service. Say the names or numbers of people and VCM places them into the call. It can be hosted on public servers.
SENTENSA Knowledge Miner is a platform-independent tool for searching any text. SENTENSA uses robust methods of indexing and searching text, drawing on more than 20 years of experience in information retrieval.
It is a universal language translator written in Java. Every language is translated into a single intermediate language (interlingua), from which any native language is generated. The wordbooks are XML. It uses the context of a text, rules, and a grammar.
FramerD is a distributed semi-structured object database originally developed at MIT. It provides an internationalized Scheme-based scripting language, built-in text analysis tools, and special support for web scripting.
MUD engine that enables interaction with text-only and XML clients. Has online, in-game world creation capability. Based on WotC's OGL SRD and Java, capable of scalable worlds across distributed servers. In short, a next-generation MMORPG engine.
ClinicalBERT model trained on MIMIC notes for clinical NLP tasks
Bio_ClinicalBERT is a domain-specific language model tailored for clinical natural language processing (NLP), extending BioBERT with additional training on clinical notes. It was initialized from BioBERT-Base v1.0 and further pre-trained on all clinical notes from the MIMIC-III database (~880M words), which includes ICU patient records. The training focused on improving performance in tasks like named entity recognition and natural language inference within the healthcare domain. Notes were...