A playground to generate images from any text prompt using SD
Qwen-Image is a powerful image generation foundation model
Image generation model with single-stream diffusion transformer
Foundation model for image generation
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image
OCRmyPDF adds an OCR text layer to scanned PDF files
General-purpose image editing model that delivers high-fidelity
Official inference repo for FLUX.2 models
Readest is a modern, feature-rich ebook reader
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Open Source OCR Engine
Multimodal-Driven Architecture for Customized Video Generation
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Chat & pretrained large vision language model
Focus on prompting and generating
A Powerful Native Multimodal Model for Image Generation
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
A pure JavaScript Multilingual OCR
CLIP: Predict the most relevant text snippet given an image
Capable of understanding text, audio, vision, video
Diffusion Bee is the easiest way to run Stable Diffusion locally
An easy 1-click way to create beautiful artwork on your PC using AI
A Unified Framework for Text-to-3D and Image-to-3D Generation
Label Studio is a multi-type data labeling and annotation tool