Qwen3-ASR is an open-source series of ASR models
Open source AI VTuber platform with voice chat and Live2D avatars
Real-time voice interactive digital human
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Han Language Processing
NLP Cloud serves high-performance pre-trained or custom models for NER
Qwen3-omni is a natively end-to-end, omni-modal LLM
Large language model for live-stream sales hosts
Large Audio Language Model built for natural interactions
Framework for building neural networks
Open source AI wearable platform for recording and summarizing speech
AI-powered tool for generating, optimizing, and translating subtitles
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Conversational voice AI agents
Bailing is a voice dialogue robot similar to GPT-4o
A web UI for easy subtitle generation using the Whisper model
Powerful Android AI agent with tools, automation, and Linux shell
Python Audio Analysis Library: Feature Extraction, Classification
Multi-modal large language model designed for audio understanding
Towards Studio-Grade Character Animation via In-Context Learning of 3D
Build voice-based LLM agents. Modular + open source
Framework for building AI-powered interactive digital humans and agent
Pre-trained Deep Learning models and demos
Models for the spaCy Natural Language Processing (NLP) library
Data manipulation and transformation for audio signal processing