Qwen-Image is a powerful image generation foundation model
Foundation model for image generation
General-purpose image editing model that delivers high-fidelity
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Focus on prompting and generating
OCRmyPDF adds an OCR text layer to scanned PDF files
A Powerful Native Multimodal Model for Image Generation
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Official inference repo for FLUX.2 models
CLIP: Predict the most relevant text snippet given an image
Official inference repo for FLUX.1 models
ComfyUI wrapper nodes for HunyuanVideo
Label Studio is a multi-type data labeling and annotation tool
Official MiniMax Model Context Protocol (MCP) server
Comprehensive Markdown plugin built for Django
Stable Diffusion web UI
A Unified Framework for Text-to-3D and Image-to-3D Generation
Collection of Gemma 3 variants that are trained for performance
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Stable Diffusion WebUI optimized for AMD GPUs with editing tools
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Text and image to video generation: CogVideoX and CogVideo
Implementation of Imagen, Google's Text-to-Image Neural Network
Ready-to-use OCR with 80+ supported languages
Image inpainting tool powered by a SOTA AI model