The most powerful local music generation model
Long-form streaming TTS system for multi-speaker dialogue generation
High-Resolution 3D Asset Generation with Large-Scale Diffusion Models
Wan2.1: Open and Advanced Large-Scale Video Generative Model
A Systematic Framework for Interactive World Modeling
From Images to High-Fidelity 3D Assets
Industrial-grade controllable zero-shot text-to-speech system
Controllable & emotion-expressive zero-shot TTS
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
State-of-the-art TTS model under 25MB
Qwen3-TTS is an open-source series of TTS models
Capable of understanding text, audio, images, and video
High-Fidelity and Controllable Generation of Textured 3D Assets
Open-source framework for intelligent speech interaction
HY-Motion model for 3D character animation generation
Qwen-Image is a powerful image generation foundation model
RGBD video generation model conditioned on camera input
High-Resolution Image Synthesis with Latent Diffusion Models
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
CogView4, CogView3-Plus, and CogView3 (ECCV 2024)
GLM-4-Voice | End-to-End Chinese-English Conversational Model
State-of-the-art (SoTA) pre-trained text-to-video model
A Unified Framework for Text-to-3D and Image-to-3D Generation
Real-time behaviour synthesis with MuJoCo, using Predictive Control
Tongyi Deep Research, the leading open-source deep research agent