Unicode XML TEI text analysis platform
A toolkit for managing and manipulating text annotations
A 50-million-token corpus of Classical Arabic
A Rule-based Part-of-Speech and Morphological Tagging Toolkit
Statistical phrase-based machine translation system
Nigerian component of the International Corpus of English
Drug name extraction
NLTK-based Python package for shallow parsing of Brazilian Portuguese
Basic NLP and DSP techniques for text-to-speech