Indexing and query tools for very large text corpora
Aligns tokens in two versions of a text with differing tokenization.
Unicode XML TEI text analysis platform
Linking Language to Knowledge with Distributional Semantics
The Linguistic Analyzer is a tool for corpus analysis and comparison
Phrase-Based & Neural Unsupervised Machine Translation
Text categorization, Arabic language processing, language modeling
Powerful search library, best suited for computer-aided translation
An open source system for Arabic corpora processing
Search engine for natural language corpora
We describe a simple XML format to share text documents and annotations
Dialogue Similarity
Cross-language resources
This project has migrated to https://gitlab.com/mwetoolkit/mwetoolkit3/
NLTK-based Python package for shallow parsing of Brazilian Portuguese
An Arabic Corpora Processing Tool
A repository of software, documentation and data for NLP