Audiocraft is a library for audio processing and generation
A robust, efficient, low-latency speech-to-text library
Ready-to-use OCR with 80+ supported languages
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
EPUB to audiobook converter, optimized for Audiobookshelf
Official inference repo for FLUX.1 models
Offline inference engine for art, real-time voice conversations
Offline text-to-speech synthesis for Python
Converts text to speech in real time
Library for OCR-related tasks powered by Deep Learning
Tools to ease the creation of snippets, syntax definitions, etc.
Use Microsoft Edge's online text-to-speech service from Python
Implementation of Phenaki Video, which uses MaskGIT
Generate audiobooks from e-books, voice cloning & 1107+ languages
A simple native web interface that uses ChatTTS to synthesize text
A TTS that fits in your CPU (and pocket)
Python library and CLI tool to interface with Google Translate
A simple, high-quality voice conversion tool focused on ease of use
ASCII art library for Python
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Claude Code skill implementing Manus-style persistent planning
CLIP: predict the most relevant text snippet given an image
Comprehensive Markdown plugin built for Django
Easy-to-use and powerful NLP library with an awesome model zoo