Open image model at the forefront of design
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Official Python inference and LoRA trainer package
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Continuous Autonomy for the AI SDK
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Controllable & emotion-expressive zero-shot TTS
Super-expressive prompting model based on ltx2.3
Python inference and LoRA trainer package for the LTX-2 audio–video
Claude Code action for GitHub PRs
Pure C inference for the Flux 2 image generation model
Advancing Open-source World Models
Official implementation of DreamCraft3D
Diffusion Transformer with Fine-Grained Chinese Understanding
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
MOSS‑TTS Family open‑source speech and sound generation model
Achieving 3+ generation speedup on reasoning tasks
Foundation model for image generation
Renderer for the harmony response format to be used with gpt-oss
Qwen-Image is a powerful image generation foundation model
Hunyuan Translation Model Version 1.5
Bidirectional token-classification model for identifiable info
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Safety reasoning models built upon gpt-oss