Java-based XSLT processor extension for syntax highlighting
A 50-million-token corpus of Classical Arabic.
Toolkit for Machine Learning, Natural Language Processing
Easy-to-use TensorFlow Wrapper for GPT-2 117M, 345M, 774M, etc.
Budou is an automatic organizer tool for beautiful line breaking in CJK
Facilitating the design, comparison and sharing of deep text models
Text-to-Speech (TTS) for Basque, Spanish, Catalan, Galician and English
Probabilistic Noising of Natural Language
Resources for speech processing in Brazilian Portuguese
From finding text to search and replace
Deal with bad samples in your dataset dynamically
Go-based automation utility that downloads YouTube videos
Simple SQL-like syntax on top of Perl text processing
Clojure library that parses text into structured data
Text Analytics Platform
Download websites as e-books: PDF, TXT, EPUB.
Simple JavaScript library for automatic transliteration; an input tool.
Library to scrape and clean web pages to create massive datasets
Named-entity recognition using neural networks
TextRank implementation for Python 3