Qwen-Image is a powerful image generation foundation model
Foundation model for image generation
General-purpose image editing model that delivers high-fidelity
Implementation of Make-A-Video, a new SOTA text-to-video generator
Focus on prompting and generating
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Comprehensive Markdown plugin built for Django
A Powerful Native Multimodal Model for Image Generation
OCRmyPDF adds an OCR text layer to scanned PDF files
Official inference repo for FLUX.2 models
Implementation of Imagen, Google's Text-to-Image Neural Network
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
This repo contains the code for 1D tokenizer and generator
Official MiniMax Model Context Protocol (MCP) server
Label Studio is a multi-type data labeling and annotation tool
Official inference repo for FLUX.1 models
CLIP: predict the most relevant text snippet given an image
Stable Diffusion WebUI optimized for AMD GPUs with editing tools
Stable Diffusion web UI
A Python tool that uses GPT-4, FFmpeg, and OpenCV
Wan2.1: Open and Advanced Large-Scale Video Generative Model
InvokeAI is a leading creative engine for Stable Diffusion models
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Easily compute CLIP embeddings and build a CLIP retrieval system
Collection of Gemma 3 variants that are trained for performance