Qwen3-Omni is a natively end-to-end, omni-modal LLM
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Toolkit for conversational AI
Capable of understanding text, audio, vision, and video
Repo of Qwen2-Audio chat & pretrained large audio language model
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
LLM for a livestream sales host
A high-quality PDF-to-Markdown tool based on a large language model
Chat & pretrained large audio language model proposed by Alibaba Cloud
Multilingual sentence & image embeddings with BERT
Using AI models to automatically provide commentary and edit videos
Generative AI reference workflows
Open-weight, large-scale hybrid-attention reasoning model
AIlice is a fully autonomous, general-purpose AI agent
Framework that is dedicated to making neural data processing
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)