Toolkit for conversational AI
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Fast multimodal LLM for real-time voice interaction and AI apps
Underthesea - Vietnamese NLP Toolkit
Multi-modal large language model designed for audio understanding
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Translate a video from one language to another and add dubbing
Voice Recognition to Text Tool
Replace OpenAI GPT with another LLM in your app
Open source AI VTuber platform with voice chat and Live2D avatars
End-to-end speech processing toolkit
Training data (data labeling, annotation, workflow) for all data types
SoTA open-source TTS
Framework for building real-time voice and multimodal AI agents
Omnilingual ASR: Open-Source Multilingual Speech Recognition
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Capable of understanding text, audio, vision, and video
The behavior guidance framework for customer-facing LLM agents
Real-time voice interactive digital human
Han Language Processing
NLP Cloud serves high-performance pre-trained or custom models for NER
Large language model for live-stream sales hosts
Large Audio Language Model built for natural interactions
Persian NLP Toolkit
Qwen3-ASR is an open-source series of ASR models