Semantic Assistants support users in content retrieval, analysis, and development by offering context-sensitive NLP services integrated directly into standard desktop clients, such as word processors, and web information systems, such as wikis.
...The WNLT project delivers four core NLP modules:
a) Word Segmentation, for separating text into words
b) Sentence Boundary Disambiguation, for finding sentence boundaries
c) Part of Speech Tagger, for determining the part of speech of each word
d) Morphological Analyser, for identifying the root form (lemma) of words
The modules are written in Java and ‘wrapped’ for execution under the General Architecture for Text Engineering (GATE) framework.
The project also includes CYMRIE, a Welsh adaptation of the GATE ANNIE Named Entity Recognition (NER) application, covering entities such as persons, organisations, locations, and date and time expressions.
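As a rough illustration of what the first two module types do (word segmentation and sentence boundary detection), the sketch below uses the JDK's generic `java.text.BreakIterator`. This is a stand-in technique, not WNLT's implementation or its GATE API: the class name `SegmentationSketch`, the sample Welsh text, and the use of locale-neutral rules are all assumptions made for the example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: generic JDK boundary rules, not WNLT code.
class SegmentationSketch {

    // Split text into sentences using the JDK's locale-neutral sentence rules.
    static List<String> sentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ROOT);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String s = text.substring(start, end).trim();
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }

    // Split one sentence into word tokens, dropping whitespace and punctuation segments.
    static List<String> words(String sentence) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(sentence);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String w = sentence.substring(start, end);
            if (!w.isEmpty() && Character.isLetterOrDigit(w.codePointAt(0))) {
                out.add(w);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Mae Caerdydd yn brifddinas Cymru. Mae Bangor yn y gogledd.";
        for (String s : sentences(text)) {
            System.out.println(s + " -> " + words(s));
        }
    }
}
```

A language-specific module would typically add rules for abbreviations, clitics, and similar cases that generic boundary rules do not handle.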
Version 2.x
The CYMRIE pipeline is accessible via an API, a standalone GUI, and a CLI. It has also been adapted for Twitter.
WNLT is a suite of open source natural language modules for the Welsh
...Using CONLL-Evaluation:
processed 32065 tokens with 3656 phrases; found: 3251 phrases; correct: 2786.
accuracy: 95.25%; precision: 85.70%; recall: 76.20%; FB1: 80.67
Using GATE Corpus Benchmark:
Strict: P: 0.65 R: 0.73 F1: 0.69
Lenient: P: 0.74 R: 0.84 F1: 0.78
For details on how to reproduce the evaluation, see the README.
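The precision, recall, and FB1 figures in the CoNLL-style line above follow directly from the three phrase counts it reports. The sketch below recomputes them, assuming the usual reading of that line: 3656 gold phrases, 3251 phrases found by the system, and 2786 of those correct. The class and method names are illustrative only.

```java
import java.util.Locale;

// Recomputes phrase-level scores from the counts in the CoNLL-style report.
class PrfSketch {

    // Percentage of found phrases that are correct.
    static double precision(int correct, int found) {
        return 100.0 * correct / found;
    }

    // Percentage of gold phrases that were found correctly.
    static double recall(int correct, int gold) {
        return 100.0 * correct / gold;
    }

    // Harmonic mean of precision and recall.
    static double f1(double p, double r) {
        return 2 * p * r / (p + r);
    }

    static String report(int gold, int found, int correct) {
        double p = precision(correct, found);
        double r = recall(correct, gold);
        return String.format(Locale.ROOT,
                "precision: %.2f%%; recall: %.2f%%; FB1: %.2f", p, r, f1(p, r));
    }

    public static void main(String[] args) {
        // Counts taken from the report: gold = 3656, found = 3251, correct = 2786.
        System.out.println(report(3656, 3251, 2786));
        // prints "precision: 85.70%; recall: 76.20%; FB1: 80.67"
    }
}
```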
To use the standalone version for tagging, download DrugExtractionStandalone.tar.gz from Files.
NLPTools-ES is a Spanish plugin for GATE (General Architecture for Text Engineering). It includes a tokenizer, sentence splitter, gazetteer, and POS tagger.
TBLTools is a set of GATE processing resources that implement the Fast Transformation Based Learning Algorithm. You can train it to learn rules for NLP tasks such as named entity recognition and shallow parsing.
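To give a feel for the kind of rule transformation-based learning produces, the sketch below applies one Brill-style rule ("retag `from` as `to` when the previous tag is `prev`") to an initial tag sequence. It shows rule application only, not the learning step, and it is not the TBLTools API: the `Rule` class, the rule template, and the sample tags are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of applying learned transformation rules in order.
class TblRuleSketch {

    // One rule template: change tag `from` to `to` if the preceding tag is `prev`.
    static final class Rule {
        final String from;
        final String to;
        final String prev;

        Rule(String from, String to, String prev) {
            this.from = from;
            this.to = to;
            this.prev = prev;
        }
    }

    // Apply each rule in sequence, scanning left to right and updating tags in place.
    static List<String> apply(List<Rule> rules, List<String> initialTags) {
        List<String> tags = new ArrayList<>(initialTags);
        for (Rule r : rules) {
            for (int i = 1; i < tags.size(); i++) {
                if (tags.get(i).equals(r.from) && tags.get(i - 1).equals(r.prev)) {
                    tags.set(i, r.to);
                }
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        // "expected to race tomorrow" with a naive initial tagging.
        List<String> initial = Arrays.asList("VBN", "TO", "NN", "NN");
        List<Rule> rules = Collections.singletonList(new Rule("NN", "VB", "TO"));
        System.out.println(apply(rules, initial)); // prints [VBN, TO, VB, NN]
    }
}
```

In training, a learner would repeatedly pick the rule that most reduces errors against an annotated corpus and append it to the ordered rule list; that search is omitted here.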
The program provides a Java interface (to a C++ lemmatizer via XML-RPC) for lemmatizing Russian, English, and German text (a lemma is the canonical form of a lexeme). RussianPOSTagger can also work as a GATE module.