Open image model at the forefront of design
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Official Python inference and LoRA trainer package
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Controllable & emotion-expressive zero-shot TTS
Python inference and LoRA trainer package for the LTX-2 audio–video
Open-source framework for intelligent speech interaction
Official implementation of DreamCraft3D
Diffusion Transformer with Fine-Grained Chinese Understanding
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Advancing Open-source World Models
MOSS‑TTS Family open‑source speech and sound generation model
Foundation model for image generation
Achieving 3x+ generation speedup on reasoning tasks
LLM-based Reinforcement Learning audio edit model
Qwen-Image is a powerful image generation foundation model
Renderer for the harmony response format to be used with gpt-oss
A Pragmatic VLA Foundation Model
Hunyuan Translation Model Version 1.5
Bidirectional token-classification model for identifiable info
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Uncommon Objects in 3D dataset
High-Fidelity and Controllable Generation of Textured 3D Assets