A text-to-speech, speech-to-text and speech-to-speech library
Contexts Optical Compression
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML
Code for running inference and finetuning with SAM 3 model
Video-based AI memory library. Store millions of text chunks in MP4
Generate audiobooks from EPUBs, PDFs and text with captions
Offline inference engine for art, real-time voice conversations
Open source no-code system for text annotation and building of text
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
A TTS that fits in your CPU (and pocket)
Qwen3-TTS is an open-source series of TTS models
The Python library for real-time communication
Robust Speech Recognition via Large-Scale Weak Supervision
Persian NLP Toolkit
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
High-Quality Voice Cloning TTS for 600+ Languages
Automatic Speech Recognition with Word-level Timestamps
An open-source toolkit for monitoring Large Language Models (LLMs)
OCR software, free and offline
Official inference repo for FLUX.2 models
Official MiniMax Model Context Protocol (MCP) server
A Family of Open-Source Music Foundation Models
Converts text to speech in real time
State-of-the-art TTS model under 25MB
A lightweight text-to-speech model with zero-shot voice cloning