Showing 333 open source projects for "z-image"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Total Network Visibility for Network Engineers and IT Managers Icon
    Total Network Visibility for Network Engineers and IT Managers

    Network monitoring and troubleshooting is hard. TotalView makes it easy.

    This means every device on your network, and every interface on every device is automatically analyzed for performance, errors, QoS, and configuration.
    Learn More
  • 1
    Qwen-Image

    Qwen-Image

    Qwen-Image is a powerful image generation foundation model

    Qwen-Image is a powerful 20-billion parameter foundation model designed for advanced image generation and precise editing, with a particular strength in complex text rendering across diverse languages, especially Chinese. Built on the MMDiT architecture, it achieves remarkable fidelity in integrating text seamlessly into images while preserving typographic details and layout coherence.
    Downloads: 21 This Week
    Last Update:
    See Project
  • 2
    scikit-image

    scikit-image

    Image processing in Python

    scikit-image is a collection of algorithms for image processing. It is available free of charge and free of restriction. We pride ourselves on high-quality, peer-reviewed code, written by an active community of volunteers. scikit-image builds on scipy.ndimage to provide a versatile set of image processing routines in Python. This library is developed by its community, and contributions are most welcome!
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Qwen-Image-Layered

    Qwen-Image-Layered

    Qwen-Image-Layered: Layered Decomposition for Inherent Editablity

    ...By combining text and structured image representations, it aims to facilitate tasks where both descriptive and structural understanding are important, such as detailed image QA, interactive image editing via prompt layers, and image-conditioned generation with structural control. The layered approach supports training signals that help the model learn how visual elements relate to each other and to textual context, rather than simply learning global image embeddings.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 4
    PyTorch Image Models

    PyTorch Image Models

    The largest collection of PyTorch image encoders / backbones

    timm (PyTorch Image Models) is a premier library hosting a vast collection of state-of-the-art image classification models and backbones such as ResNet, EfficientNet, NFNet, Vision Transformer, ConvNeXt, and more. Created by Ross Wightman and now maintained by Hugging Face, it includes pretrained weights, data loaders, augmentations, optimizers, schedulers, and reference scripts for training, evaluation, inference, and model export.
    Downloads: 1 This Week
    Last Update:
    See Project
  • QA Wolf | We Write, Run and Maintain Tests Icon
    QA Wolf | We Write, Run and Maintain Tests

    For developer teams searching for a testing software

    QA Wolf is an AI-native service that delivers 80% automated E2E test coverage for web & mobile apps in weeks not years.
    Learn More
  • 5
    labelme Image Polygonal Annotation

    labelme Image Polygonal Annotation

    Image polygonal annotation with Python

    Labelme is a graphical image annotation tool. It is written in Python and uses Qt for its graphical interface. Image annotation for polygon, rectangle, circle, line and point. Image flag annotation for classification and cleaning. Video annotation. (video annotation). GUI customization (predefined labels / flags, auto-saving, label validation, etc). Exporting VOC-format dataset for semantic/instance segmentation.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 6
    Deep-Live-Cam

    Deep-Live-Cam

    Real time face swap and one-click video deepfake

    Real time face swap and one-click video deepfake with only a single image. Choose a face (image with the desired face) and the target image/video (image/video in which you want to replace the face) and click on Start. Open File Explorer and navigate to the directory you select your output to be in. You will find a directory named <video_title> where you can see the frames being swapped in real time. Once the processing is done, it will create the output file.
    Downloads: 727 This Week
    Last Update:
    See Project
  • 7
    Fooocus

    Fooocus

    Focus on prompting and generating

    Fooocus is an open-source image generation software that simplifies the process of creating images from text prompts. Built on Gradio and leveraging Stable Diffusion XL, Fooocus eliminates the need for manual parameter tweaking, allowing users to focus solely on crafting prompts. It offers a user-friendly interface with minimal setup, making advanced image synthesis accessible to a broader audience.
    Downloads: 180 This Week
    Last Update:
    See Project
  • 8
    AUTOMATIC1111 Stable Diffusion web UI
    AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux, and Apple Silicon, plus support for GPUs and CPUs, it caters to a wide range of users—from hobbyists to professionals. ...
    Downloads: 241 This Week
    Last Update:
    See Project
  • 9
    Stable Diffusion WebUI

    Stable Diffusion WebUI

    Web interface for generating images using Stable Diffusion models

    This project provides a powerful web-based interface for running Stable Diffusion, a text-to-image generation model. Developed by AUTOMATIC1111, it supports numerous features like model customization, prompt history, image upscaling, inpainting, and batch processing. The WebUI is beginner-friendly yet powerful enough for advanced users, becoming one of the most popular community-run UIs for AI image generation.
    Downloads: 40 This Week
    Last Update:
    See Project
  • Start building your dream online with an easy-to-use and affordable website builder | one.com Icon
    Start building your dream online with an easy-to-use and affordable website builder | one.com

    For companies and brands seeking a provider of website tools, hosting, and personalized email solutions

    Website tools, hosting, and personalized email all in one plan. We’ll help you every step of the way. Find or transfer your domain name, build your site, and make it a success. Kick-start your success today by registering the perfect domain name. If you already own a domain name, we’ll help you transfer it. Build your website with the simple Website Builder or more advanced WordPress. Create a beautiful, responsive site in just a few steps. Grow your customer base. You’ve put in the effort of creating something you are proud of, and now you want the world to see it. To get you started, all our plans include one free domain for a whole year. Start building your dream online with our easy-to-use website builder. Grow your website traffic with Google Ads. Get 1 month free when you sign up. Our friendly support team is available 24/7, every day of the year. All our plans include a free SSL certificate. Your website is secure from day 1.
    Learn More
  • 10
    FLUX.2

    FLUX.2

    Official inference repo for FLUX.2 models

    FLUX.2 is a state-of-the-art open-weight image generation and editing model released by Black Forest Labs aimed at bridging the gap between research-grade capabilities and production-ready workflows. The model offers both text-to-image generation and powerful image editing, including editing of multiple reference images, with fidelity, consistency, and realism that push the limits of what open-source generative models have achieved.
    Downloads: 61 This Week
    Last Update:
    See Project
  • 11
    Map-Anything

    Map-Anything

    MapAnything: Universal Feed-Forward Metric 3D Reconstruction

    ...Its inference path is fully feed-forward with optional mixed-precision and memory-efficient modes, making it practical to scale to long image sequences while keeping latency predictable.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    OCRmyPDF

    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 113 This Week
    Last Update:
    See Project
  • 13
    HunyuanImage-3.0

    HunyuanImage-3.0

    A Powerful Native Multimodal Model for Image Generation

    HunyuanImage-3.0 is a powerful, native multimodal text-to-image generation model released by Tencent’s Hunyuan team. It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter counts without linear inference cost explosion. ...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 14
    Lama Cleaner

    Lama Cleaner

    Image inpainting tool powered by SOTA AI Model

    ...Many AICG creators are using Lama Cleaner to clean-up their work. Completely free and open-source, fully self-hosted, supports CPU & GPU. Windows 1-Click Installer, classical image inpainting algorithm powered by cv2. Multiple SOTA AI models, and various inpainting strategies. Run as a desktop application. Interactive Segmentation on any object.
    Downloads: 66 This Week
    Last Update:
    See Project
  • 15
    SAM 3D Objects

    SAM 3D Objects

    Models for object and human mesh reconstruction

    SAM 3D Objects is a foundation model that reconstructs full 3D geometry, texture, and spatial layout of objects and scenes from a single image. Given one RGB image and object masks (for example, from the Segment Anything family), it can generate a textured 3D mesh for each object, including pose and approximate scene layout. The model is specifically designed to be robust in real-world images with clutter, occlusions, small objects, and unusual viewpoints, where many earlier 3D-from-image systems struggle. ...
    Downloads: 32 This Week
    Last Update:
    See Project
  • 16
    DeiT (Data-efficient Image Transformers)
    DeiT (Data-efficient Image Transformers) shows that Vision Transformers can be trained competitively on ImageNet-1k without external data by using strong training recipes and knowledge distillation. Its key idea is a specialized distillation strategy—including a learnable “distillation token”—that lets a transformer learn effectively from a CNN or transformer teacher on modest-scale datasets.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Wan2.2

    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    ...The model is trained on significantly larger datasets than its predecessor, greatly enhancing motion complexity, semantic understanding, and aesthetic diversity. Wan2.2 also open-sources a 5-billion parameter high-compression VAE-based hybrid text-image-to-video (TI2V) model that supports 720P video generation at 24fps on consumer-grade GPUs like the RTX 4090. It supports multiple video generation tasks including text-to-video.
    Downloads: 228 This Week
    Last Update:
    See Project
  • 18
    Dream Textures

    Dream Textures

    Stable Diffusion built-in to Blender

    Create textures, concept art, background assets, and more with a simple text prompt. Use the 'Seamless' option to create textures that tile perfectly with no visible seam. Texture entire scenes with 'Project Dream Texture' and depth to image. Re-style animations with the Cycles render pass. Run the models on your machine to iterate without slowdowns from a service. Create textures, concept art, and more with text prompts. Learn how to use the various configuration options to get exactly what you're looking for. Texture entire models and scenes with depth to image. Inpaint to fix up images and convert existing textures into seamless ones automatically. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    ComfyUI

    ComfyUI

    The most powerful and modular diffusion model GUI, api and backend

    The most powerful and modular diffusion model is GUI and backend. This UI will let you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart-based interface. We are a team dedicated to iterating and improving ComfyUI, supporting the ComfyUI ecosystem with tools like node manager, node registry, cli, automated testing, and public documentation. Open source AI models will win in the long run against closed models and we are only at the beginning. Our core mission...
    Downloads: 415 This Week
    Last Update:
    See Project
  • 20
    Label Studio

    Label Studio

    Label Studio is a multi-type data labeling and annotation tool

    The most flexible data annotation tool. Quickly installable. Build custom UIs or use pre-built labeling templates. Detect objects on image, bboxes, polygons, circular, and keypoints supported. Partition image into multiple segments. Use ML models to pre-label and optimize the process. Label Studio is an open-source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. ...
    Downloads: 15 This Week
    Last Update:
    See Project
  • 21
    HunyuanCustom

    HunyuanCustom

    Multimodal-Driven Architecture for Customized Video Generation

    HunyuanCustom is a multimodal video customization framework by Tencent Hunyuan, aimed at generating customized videos featuring particular subjects (people, characters) under flexible conditions, while maintaining subject/identity consistency. It supports conditioning via image, audio, video, and text, and can perform subject replacement in videos, generate avatars speaking given audio, or combine multiple subject images. The architecture builds on HunyuanVideo, with added modules for identity reinforcement and modality-specific condition injection. Text-image fusion module based on LLaVA for improved multimodal understanding. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 22
    InvokeAI

    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac and Linux machines, and runs on GPU cards with as little as 4 GB or RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. ...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 23
    Qwen-VL

    Qwen-VL

    Chat & pretrained large vision language model

    ...Chinese, English), and is aimed at tasks like image captioning, question answering on images (VQA, DocVQA), grounding (detecting objects or regions from textual queries), etc.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    CogVideo

    CogVideo

    text and image to video generation: CogVideoX (2024) and CogVideo

    CogVideo is an open source text-/image-/video-to-video generation project that hosts the CogVideoX family of diffusion-transformer models and end-to-end tooling. The repo includes SAT and Diffusers implementations, turnkey demos, and fine-tuning pipelines (including LoRA) designed to run across a wide range of NVIDIA GPUs, from desktop cards (e.g., RTX 3060) to data-center hardware (A100/H100).
    Downloads: 24 This Week
    Last Update:
    See Project
  • 25
    EasyOCR

    EasyOCR

    Ready-to-use OCR with 80+ supported languages

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc. EasyOCR is a python module for extracting text from image. It is a general OCR that can read both natural scene text and dense text in document. We are currently supporting 80+ languages and expanding. Second-generation models: multiple times smaller size, multiple times faster inference, additional characters and comparable accuracy to the first generation models. EasyOCR will choose the latest model by default but you can also specify which model to use. ...
    Downloads: 39 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next