Qwen-Image is a powerful image generation foundation model
HY-Motion model for 3D character animation generation
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Long-form streaming TTS system for multi-speaker dialogue generation
Qwen3-ASR is an open-source series of ASR models
Release for Improved Denoising Diffusion Probabilistic Models
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Easy Docker setup for Stable Diffusion with user-friendly UI
Foundation Models for Time Series
Hackable and optimized Transformers building blocks
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Unified Multimodal Understanding and Generation Models
DeepSeek Coder: Let the Code Write Itself
Inference framework for 1-bit LLMs
Accurate × Fast × Comprehensive
LTX-Video Support for ComfyUI
Official implementation of Watermark Anything with Localized Messages
Tool for exploring and debugging transformer model behaviors
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Repo for SeedVR2 & SeedVR
State-of-the-art (SoTA) text-to-video pre-trained model
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Pretrained time-series foundation model developed by Google Research
Recovering the Visual Space from Any Views