The Welsh Natural Language Toolkit (WNLT) project provides a set of NLP tools for Welsh language technology, intended to support the development of text analysis applications. It delivers four core NLP modules:
a) Word Segmentation for separating text into words
b) Sentence Boundary Disambiguation for finding sentence boundaries
c) Part of Speech Tagger for determining the part of speech of each word
d) Morphological Analyser for identifying the root form (lemma) of words
The modules are written in Java and wrapped for execution under the General Architecture for Text Engineering (GATE) framework.
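To make the first two module roles concrete, here is a minimal, self-contained Java sketch of rule-based word segmentation and sentence splitting. It is an illustration only and is not the WNLT implementation: the class name `SegmentationSketch`, the methods `tokenize` and `splitSentences`, and both regular expressions are assumptions made for this example, and the sketch does not use GATE at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; names and rules are assumptions, not WNLT code.
public class SegmentationSketch {

    // A token is either a run of letters/digits (optionally joined by an
    // internal apostrophe or hyphen) or a single punctuation character.
    private static final Pattern TOKEN =
        Pattern.compile("[\\p{L}\\p{N}]+(?:['-][\\p{L}\\p{N}]+)*|[^\\s\\p{L}\\p{N}]");

    // Word segmentation: collect every regex match in order.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Naive sentence splitting: break after '.', '!' or '?' when the next
    // non-space character is an uppercase letter. Abbreviations are not handled.
    public static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String s : text.split("(?<=[.!?])\\s+(?=\\p{Lu})")) {
            String trimmed = s.trim();
            if (!trimmed.isEmpty()) {
                sentences.add(trimmed);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "Mae hi'n braf heddiw. Ydy hi'n oer?";
        for (String sentence : splitSentences(text)) {
            System.out.println(sentence + " -> " + tokenize(sentence));
        }
    }
}
```

This sketch keeps a contracted form such as "hi'n" as a single token and treats every full stop before a capital letter as a sentence boundary. A real Welsh pipeline would need language-specific rules for both cases, which is what dedicated segmentation and sentence boundary disambiguation modules are for.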
The project also includes CYMRIE, a Welsh adaptation of GATE's ANNIE Named Entity Recognition (NER) application, which recognises entities such as persons, organisations, locations, and date and time expressions.
Welsh Natural Language Toolkit
WNLT is a suite of open source natural language modules for the Welsh
Brought to you by:
avlachid