Qwen3-ASR is an open-source series of ASR models
Nexa SDK is a comprehensive toolkit supporting ONNX and GGML models
EPUB to audiobook converter, optimized for Audiobookshelf
Persian NLP Toolkit
Offline inference engine for art, real-time voice conversations
Controllable and fast Text-to-Speech for over 7000 languages
SOTA discrete acoustic codec models with 40/75 tokens per second
Audio foundation model excelling in audio understanding
Real-time voice interactive digital human
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Repo of Qwen2-Audio chat & pretrained large audio language model
Management of Yandex Station and other smart home devices
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
A 0.1B Omni model trained from scratch
Framework for building real-time voice and multimodal AI agents
NLP Cloud serves high-performance pre-trained or custom models for NER
Framework for building neural networks
Open-source industrial-grade ASR models
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Underthesea - Vietnamese NLP Toolkit
FAIR Sequence Modeling Toolkit 2
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Automatically translates the text of a video based on a subtitle file
Bailing is a voice dialogue robot similar to GPT-4o
Official PyTorch Implementation