Search Results for "corpora"
Unsupervised text tokenizer focused on computational efficiency
Text categorization, Arabic language processing, language modeling
We describe a simple XML format to share text documents and annotation