Unsupervised text tokenizer focused on computational efficiency
A Chinese information extraction tool
Basic Utilities for PyTorch Natural Language Processing (NLP)
The open-source virtual assistant for Ubuntu-based Linux distributions
A 50-million-token corpus of Classical Arabic.
Toolkit for Machine Learning and Natural Language Processing
Probabilistic Noising of Natural Language
Resources for speech processing in Brazilian Portuguese
A smart search engine for medical documents
Safe Harbor Deidentification for medical documents
Text Analytics Platform
Library to scrape and clean web pages to create massive datasets
Named-entity recognition using neural networks
TextRank implementation for Python 3
Text categorization, Arabic language processing, language modeling
Analyze text. Diagonally read subject, predicate, object. Search other PDFs.
Implementation of research papers on Deep Learning + NLP + CV in Python
Quick lemmatizer for text files
Turku Event Extraction System
We describe a simple XML format to share text documents and annotations
Ansj word segmentation
A text-parsing library that matches text with concepts.