Repo of Qwen2-Audio chat & pretrained large audio language model
Chat & pretrained large audio language model proposed by Alibaba Cloud
Large Audio Language Model built for natural interactions
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Capable of understanding text, audio, vision, video
AudioMuse-AI is an open-source Dockerized environment
LLM-based data scientist, AI-native data application
Code and models for ICML 2024 paper, NExT-GPT
Specify a GitHub or local repo, GitHub pull request
Data Infrastructure providing an approach to multimodal AI workloads
Build multimodal language agents for fast prototype and production
Framework and no-code GUI for fine-tuning LLMs
Collect, organize, use, and share, all in OmniBox
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Large language model for live-stream sales hosts
Data Lake for Deep Learning. Build, manage, and query datasets
Serving multiple LoRA-finetuned LLMs as one
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Community for applying LLMs to robotics and a robot simulator