Autoregressive Model Beats Diffusion
Diffusion Transformer with Fine-Grained Chinese Understanding
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Open-source multi-speaker long-form text-to-speech model
Official SeedVR2 Video Upscaler for ComfyUI
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image
Collection of CVPR 2026 Papers and Open Source Projects
Multimodal Diffusion with Representation Alignment
HY-Motion model for 3D character animation generation
A unified library of SOTA model optimization techniques
PyTorch implementation of JiT
Repo for SeedVR2 & SeedVR
All-in-one WebUI for AI generative image and video creation
Image inpainting tool powered by SOTA AI models
Cosmos-RL is a flexible and scalable Reinforcement Learning framework
Personalize Any Characters with a Scalable Diffusion Transformer
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Official Python inference and LoRA trainer package
Towards Human-Level Text-to-Speech through Style Diffusion
Code and models for ICML 2024 paper, NExT-GPT
Inference script for Oasis 500M
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Official PyTorch Implementation
High-Fidelity and Controllable Generation of Textured 3D Assets
State-of-the-art (SoTA) text-to-video pre-trained model