Browse free open source Python AI Image Generators and projects below. Use the toggles on the left to filter open source Python AI Image Generators by OS, license, language, programming language, and project status.

  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 1
    AUTOMATIC1111 Stable Diffusion web UI
    AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux, and Apple Silicon, plus support for GPUs and CPUs, it caters to a wide range of users—from hobbyists to professionals. The interface also supports prompt editing, batch processing, custom scripts, and many community extensions, making it a highly customizable and continually evolving platform for creative AI art generation.
    Downloads: 293 This Week
    Last Update:
    See Project
  • 2
    Fooocus

    Fooocus

    Focus on prompting and generating

    Fooocus is an open-source image generation software that simplifies the process of creating images from text prompts. Built on Gradio and leveraging Stable Diffusion XL, Fooocus eliminates the need for manual parameter tweaking, allowing users to focus solely on crafting prompts. It offers a user-friendly interface with minimal setup, making advanced image synthesis accessible to a broader audience.
    Downloads: 287 This Week
    Last Update:
    See Project
  • 3
    ComfyUI

    ComfyUI

    The most powerful and modular diffusion model GUI, api and backend

    The most powerful and modular diffusion model is GUI and backend. This UI will let you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart-based interface. We are a team dedicated to iterating and improving ComfyUI, supporting the ComfyUI ecosystem with tools like node manager, node registry, cli, automated testing, and public documentation. Open source AI models will win in the long run against closed models and we are only at the beginning. Our core mission is to advance and democratize AI tooling. We believe that the future of AI tooling is open-source and community-driven.
    Downloads: 222 This Week
    Last Update:
    See Project
  • 4
    FLUX.2

    FLUX.2

    Official inference repo for FLUX.2 models

    FLUX.2 is a state-of-the-art open-weight image generation and editing model released by Black Forest Labs aimed at bridging the gap between research-grade capabilities and production-ready workflows. The model offers both text-to-image generation and powerful image editing, including editing of multiple reference images, with fidelity, consistency, and realism that push the limits of what open-source generative models have achieved. It supports high-resolution output (up to ~4 megapixels), which allows for photography-quality images, detailed product shots, infographics or UI mockups rather than just low-resolution drafts. FLUX.2 is built with a modern architecture (a flow-matching transformer + a revamped VAE + a strong vision-language encoder), enabling strong prompt adherence, correct rendering of text/typography in images, reliable lighting, layout, and physical realism, and consistent style/character/product identity across multiple generations or edits.
    Downloads: 58 This Week
    Last Update:
    See Project
  • 8 Monitoring Tools in One APM. Install in 5 Minutes. Icon
    8 Monitoring Tools in One APM. Install in 5 Minutes.

    Errors, performance, logs, uptime, hosts, anomalies, dashboards, and check-ins. One interface.

    AppSignal works out of the box for Ruby, Elixir, Node.js, Python, and more. 30-day free trial, no credit card required.
    Start Free
  • 5
    FastSD CPU

    FastSD CPU

    Fast stable diffusion on CPU and AI PC

    FastSD CPU is an optimized fork of Stable Diffusion designed to run efficiently on CPUs and devices without dedicated GPUs by leveraging Latent Consistency Models and Adversarial Diffusion Distillation techniques that accelerate inference. It focuses on bringing fast text-to-image generation to mainstream hardware like desktop CPUs, lower-end laptops, or edge devices without requiring high-end graphics processors. The repository contains multiple interfaces including a desktop GUI for simple generation, an advanced web-based UI with support for extensions like LoRA and ControlNet, and a command-line interface for scripted usage or server deployments. With support for performance-oriented libraries such as OpenVINO and hardware acceleration on platforms like Intel AI PCs, FastSD CPU aims to shrink generation times dramatically compared with naive CPU implementations.
    Downloads: 36 This Week
    Last Update:
    See Project
  • 6
    FLUX.1

    FLUX.1

    Official inference repo for FLUX.1 models

    FLUX.1 repository contains inference code and tooling for the FLUX.1 text-to-image diffusion models, enabling developers and researchers to generate and edit images from natural-language prompts using open-weight versions of the model on their own hardware or within custom applications. The project is part of a larger family of FLUX models developed by Black Forest Labs, designed to produce high-quality, detailed visuals from text descriptions with competitive prompt adherence and artistic fidelity. This repo focuses on running the open-source model variants efficiently, providing scripts, model loading logic, and examples for local installations, and supports integration with Python toolchains like PyTorch and popular generative pipelines. Users can launch CLI tools to generate images, experiment with different FLUX variants, and extend the base code for research-oriented applications.
    Downloads: 35 This Week
    Last Update:
    See Project
  • 7
    Stable Diffusion

    Stable Diffusion

    High-Resolution Image Synthesis with Latent Diffusion Models

    Stable Diffusion Version 2. The Stable Diffusion project, developed by Stability AI, is a cutting-edge image synthesis model that utilizes latent diffusion techniques for high-resolution image generation. It offers an advanced method of generating images based on text input, making it highly flexible for various creative applications. The repository contains pretrained models, various checkpoints, and tools to facilitate image generation tasks, such as fine-tuning and modifying the models. Stability AI's approach to image synthesis has contributed to creating detailed, scalable images while maintaining efficiency.
    Downloads: 288 This Week
    Last Update:
    See Project
  • 8
    Qwen-Image

    Qwen-Image

    Qwen-Image is a powerful image generation foundation model

    Qwen-Image is a powerful 20-billion parameter foundation model designed for advanced image generation and precise editing, with a particular strength in complex text rendering across diverse languages, especially Chinese. Built on the MMDiT architecture, it achieves remarkable fidelity in integrating text seamlessly into images while preserving typographic details and layout coherence. The model excels not only in text rendering but also in a wide range of artistic styles, including photorealistic, impressionist, anime, and minimalist aesthetics. Qwen-Image supports sophisticated editing tasks such as style transfer, object insertion and removal, detail enhancement, and even human pose manipulation, making it suitable for both professional and casual users. It also includes advanced image understanding capabilities like object detection, semantic segmentation, depth and edge estimation, and novel view synthesis.
    Downloads: 20 This Week
    Last Update:
    See Project
  • 9
    KoboldCpp

    KoboldCpp

    Run GGUF models easily with a UI or API. One File. Zero Install.

    KoboldCpp is an easy-to-use AI text-generation software for GGML and GGUF models, inspired by the original KoboldAI. It's a single self-contained distributable that builds off llama.cpp and adds many additional powerful features.
    Leader badge
    Downloads: 420 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 10
    InvokeAI

    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac and Linux machines, and runs on GPU cards with as little as 4 GB or RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products. This fork is supported across Linux, Windows and Macintosh. Linux users can use either an Nvidia-based card (with CUDA support) or an AMD card (using the ROCm driver). We do not recommend the GTX 1650 or 1660 series video cards. They are unable to run in half-precision mode and do not have sufficient VRAM to render 512x512 images.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 11
    HunyuanImage-3.0

    HunyuanImage-3.0

    A Powerful Native Multimodal Model for Image Generation

    HunyuanImage-3.0 is a powerful, native multimodal text-to-image generation model released by Tencent’s Hunyuan team. It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter counts without linear inference cost explosion. The model is intended to be competitive with closed-source image generation systems, aiming for high fidelity, prompt adherence, fine detail, and even “world knowledge” reasoning (i.e. leveraging context, semantics, or common sense in generation). The GitHub repo includes code, scripts, model loading instructions, inference utilities, prompt handling, and integration with standard ML tooling (e.g. Hugging Face / Transformers).
    Downloads: 11 This Week
    Last Update:
    See Project
  • 12
    Dream Textures

    Dream Textures

    Stable Diffusion built-in to Blender

    Create textures, concept art, background assets, and more with a simple text prompt. Use the 'Seamless' option to create textures that tile perfectly with no visible seam. Texture entire scenes with 'Project Dream Texture' and depth to image. Re-style animations with the Cycles render pass. Run the models on your machine to iterate without slowdowns from a service. Create textures, concept art, and more with text prompts. Learn how to use the various configuration options to get exactly what you're looking for. Texture entire models and scenes with depth to image. Inpaint to fix up images and convert existing textures into seamless ones automatically. Outpaint to increase the size of an image by extending it in any direction. Perform style transfer and create novel animations with Stable Diffusion as a post processing step. Dream Textures has been tested with CUDA and Apple Silicon GPUs. Over 4GB of VRAM is recommended.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 13
    Diffusers

    Diffusers

    State-of-the-art diffusion models for image and audio generation

    Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you're looking for a simple inference solution or training your own diffusion models, Diffusers is a modular toolbox that supports both. Our library is designed with a focus on usability over performance, simple over easy, and customizability over abstractions. State-of-the-art diffusion pipelines that can be run in inference with just a few lines of code. Interchangeable noise schedulers for different diffusion speeds and output quality. Pretrained models that can be used as building blocks, and combined with schedulers, for creating your own end-to-end diffusion systems. We recommend installing Diffusers in a virtual environment from PyPi or Conda. For more details about installing PyTorch and Flax, please refer to their official documentation.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 14
    Hunyuan3D-1

    Hunyuan3D-1

    A Unified Framework for Text-to-3D and Image-to-3D Generation

    Hunyuan3D-1 is an earlier version in the same 3D generation line (the unified framework for text-to-3D and image-to-3D tasks) by Tencent Hunyuan. It provides a framework combining shape generation and texture synthesis, enabling users to create 3D assets from images or text conditions. While less advanced than version 2.1, it laid the foundations for the later PBR, higher resolution, and open-source enhancements. (Note: less detailed public documentation was found for Hunyuan3D-1 compared to 2.1.). Community and ecosystem support (e.g. usage via Blender addon for geometry/texture). Integration into user-friendly tools/platforms.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    ImageReward

    ImageReward

    [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences

    ImageReward is the first general-purpose human preference reward model (RM) designed for evaluating text-to-image generation, introduced alongside the NeurIPS 2023 paper ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation. Trained on 137k expert-annotated image pairs, ImageReward significantly outperforms existing scoring methods like CLIP, Aesthetic, and BLIP in capturing human visual preferences. It is provided as a Python package (image-reward) that enables quick scoring of generated images against textual prompts, with APIs for ranking, scoring, and filtering outputs. Beyond evaluation, ImageReward supports Reward Feedback Learning (ReFL), a method for directly fine-tuning diffusion models such as Stable Diffusion using human-preference feedback, leading to demonstrable improvements in image quality.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    GLIDE (Text2Im)

    GLIDE (Text2Im)

    GLIDE: a diffusion-based text-conditional image synthesis model

    glide-text2im is an open source implementation of OpenAI’s GLIDE model, which generates photorealistic images from natural language text prompts. It demonstrates how diffusion-based generative models can be conditioned on text to produce highly detailed and coherent visual outputs. The repository provides both model code and pretrained checkpoints, making it possible for researchers and developers to experiment with text-to-image synthesis. GLIDE includes advanced techniques such as classifier-free guidance, which improves the quality and alignment of generated images with the input text. The project also offers sampling scripts and utilities for exploring how diffusion models can be applied to multimodal tasks. As one of the early diffusion-based text-to-image systems, glide-text2im laid important groundwork for later advances in generative AI research.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    VQGAN-CLIP web app

    VQGAN-CLIP web app

    Local image generation using VQGAN-CLIP or CLIP guided diffusion

    VQGAN-CLIP has been in vogue for generating art using deep learning. Searching the r/deepdream subreddit for VQGAN-CLIP yields quite a number of results. Basically, VQGAN can generate pretty high-fidelity images, while CLIP can produce relevant captions for images. Combined, VQGAN-CLIP can take prompts from human input, and iterate to generate images that fit the prompts. Thanks to the generosity of creators sharing notebooks on Google Colab, the VQGAN-CLIP technique has seen widespread circulation. However, for regular usage across multiple sessions, I prefer a local setup that can be started up rapidly. Thus, this simple Streamlit app for generating VQGAN-CLIP images on a local environment. Be advised that you need a beefy GPU with lots of VRAM to generate images large enough to be interesting. (Hello Quadro owners!).
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    Free AI Watermark Remover - FreeRepair

    Free AI Watermark Remover - FreeRepair

    AI-powered tool to quickly remove watermarks from images flawlessly

    AI Watermark Remover (Free And Open-Source) & Make Blurry Images Clearer Or Larger Tool - FreeRepair, Simulation IOPaint Based On The Django Of Python With No Sign-Up. As a free, open-source, AI-powered tool, FreeRepair makes it easy to remove watermarks, logos, text or clutter from images, and blurry images can be made clearer or larger. No installation, no internet connection, it works out of the box, safe and secure, unlimited.
    Downloads: 15 This Week
    Last Update:
    See Project
  • 19
    Core ML Stable Diffusion

    Core ML Stable Diffusion

    Stable Diffusion with Core ML on Apple Silicon

    Run Stable Diffusion on Apple Silicon with Core ML. python_coreml_stable_diffusion, a Python package for converting PyTorch models to Core ML format and performing image generation with Hugging Face diffusers in Python. StableDiffusion, a Swift package that developers can add to their Xcode projects as a dependency to deploy image generation capabilities in their apps. The Swift package relies on the Core ML model files generated by python_coreml_stable_diffusion. Hugging Face ran the conversion procedure on the following models and made the Core ML weights publicly available on the Hub. If you would like to convert a version of Stable Diffusion that is not already available on the Hub, please refer to the Converting Models to Core ML. Log in to or register for your Hugging Face account, generate a User Access Token and use this token to set up Hugging Face API access by running huggingface-cli login in a Terminal window.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    DALL-E 2 - Pytorch

    DALL-E 2 - Pytorch

    Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis

    Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch. The main novelty seems to be an extra layer of indirection with the prior network (whether it is an autoregressive transformer or a diffusion network), which predicts an image embedding based on the text embedding from CLIP. Specifically, this repository will only build out the diffusion prior network, as it is the best performing variant (but which incidentally involves a causal transformer as the denoising network) To train DALLE-2 is a 3 step process, with the training of CLIP being the most important. To train CLIP, you can either use x-clip package, or join the LAION discord, where a lot of replication efforts are already underway. Then, you will need to train the decoder, which learns to generate images based on the image embedding coming from the trained CLIP.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Deep Daze

    Deep Daze

    Simple command line tool for text to image generation

    Simple command-line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). In true deep learning fashion, more layers will yield better results. Default is at 16, but can be increased to 32 depending on your resources. Technique first devised and shared by Mario Klingemann, it allows you to prime the generator network with a starting image, before being steered towards the text. Simply specify the path to the image you wish to use, and optionally the number of initial training steps. We can also feed in an image as an optimization goal, instead of only priming the generator network. Deepdaze will then render its own interpretation of that image. The regular mode for texts only allows 77 tokens. If you want to visualize a full story/paragraph/song/poem, set create_story to True.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    Disco Diffusion

    Disco Diffusion

    Notebooks, models and techniques for the generation of AI Art

    A frankensteinian amalgamation of notebooks, models, and techniques for the generation of AI art and animations. This project uses a special conversion tool to convert the Python files into notebooks for easier development. What this means is you do not have to touch the notebook directly to make changes to it. The tool being used is called Colab-Convert. Initial QoL improvements added, including user-friendly UI, settings+prompt saving, and improved google drive folder organization. Now includes sizing options, intermediate saves and fixed image prompts and Perlin inits. the unexposed batch option since it doesn't work.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    PaddleGAN

    PaddleGAN

    PaddlePaddle GAN library, including lots of interesting applications

    PaddlePaddle GAN library, including lots of interesting applications like First-Order motion transfer, Wav2Lip, picture repair, image editing, photo2cartoon, image style transfer, GPEN, and so on. PaddleGAN provides developers with high-performance implementation of classic and SOTA Generative Adversarial Networks, and supports developers to quickly build, train and deploy GANs for academic, entertainment, and industrial usage. GAN-Generative Adversarial Network, was praised by "the Father of Convolutional Networks" Yann LeCun (Yang Likun) as [One of the most interesting ideas in the field of computer science in the past decade]. It's the one research area in deep learning that AI researchers are most concerned about.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Deface GUI -  Face Anonymization Tool

    Deface GUI - Face Anonymization Tool

    Graphical User Interface Face Anonymization Tool

    This application is a professional tool with a graphical user interface that enables anonymization of faces using the Deface Engine. Cross-Platform Compatible (Linux-Windows) NOTE: To use on Windows, first install Python. Then, if necessary, install “pip install deface” (only if necessary).
    Downloads: 6 This Week
    Last Update:
    See Project
  • 25
    macara

    macara

    A converter for seamless transformation of files, data, and media ...

    This application consolidates various scripts, including an AI feature (rembg), into a singular platform. The design of this software is evolutionary, allowing for the seamless integration of additional scripts, menus, or windows as needed. Serving as a versatile tool, it facilitates efficient file management, especially when handling a substantial volume of images, whether sorting by name or other attributes. These scripts are crafted to complement generative art AI technologies like Dall-e or stable diffusion.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB