A robust, efficient, low-latency speech-to-text library
Rich is a Python library for rich text and beautiful formatting
Windows GUI Automation with Python (based on text properties)
CLI tool and Python library
Code for running inference and finetuning with SAM 3 model
High-Quality Voice Cloning TTS for 600+ Languages
A fast, helpful, and open-source document parser
Extensions for Python Markdown
A generative speech model for daily dialogue
File Parser optimised for LLM Ingestion with no loss
Offline inference engine for art, real-time voice conversations
Open source healthcare AI
A simple native web interface that uses ChatTTS to synthesize text into speech
Cut videos with a text editor
Speech recognition module for Python
Generate audiobooks from EPUBs, PDFs and text with captions
Qwen3-TTS is an open-source series of TTS models
Robust Speech Recognition via Large-Scale Weak Supervision
A simple tool for reading in poorly redacted documents
OCR software, free and offline
Edit PDF files with Nano Banana
A high-quality rapid TTS voice cloning model
State-of-the-art TTS model under 25MB
A TTS that fits in your CPU (and pocket)
Official MiniMax Model Context Protocol (MCP) server