A Repo For Document AI
Readest is a modern, feature-rich ebook reader
A free, open source, and extensible speech-to-text application
Open source semantic search and text analytics for large document sets
Easy-to-use and powerful NLP library with an awesome model zoo
The most accurate natural language detection library for Rust
A high-quality PDF-to-Markdown tool based on large language models
Easy-to-use and high-performance NLP and LLM framework
Efficient multilingual NLP and text segmentation in Go
AI-powered tool for generating, optimizing, and translating subtitles
Generate audiobooks from EPUBs, PDFs and text with captions
Enhances Tesseract OCR output using LLMs (local or API)
Open source libraries and APIs to build custom preprocessing pipelines
OCR software, free and offline
Easily compute CLIP embeddings and build a CLIP retrieval system
Automatic speech recognition with word-level timestamps
A fast, helpful, and open-source document parser
Python binding to the Apache Tika™ REST services
Apache OpenNLP
Agent harness to make your slop code well-engineered and beautiful
Advanced NLP with spaCy: A free online course
Audiocraft is a library for audio processing and generation
A multimodal model for brain response prediction
A very simple framework for state-of-the-art NLP
Lightning-fast, on-device TTS, running natively via ONNX