Search Results for "stable-diffusion-webui"

13972 projects for "stable-diffusion-webui" with 1 filter applied:

Filter: ChromeOS

1

Stable Diffusion WebUI Forge

Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion

Stable Diffusion WebUI Forge is a performance- and feature-oriented fork of the popular AUTOMATIC1111 interface that experiments with new backends, memory optimizations, and UX improvements. It targets heavy users and researchers who push large models, ControlNets, and high-resolution pipelines where default settings can become bottlenecks.

Downloads: 0 This Week

Last Update: 2025-10-21
2

Stable Diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Stable Diffusion Version 2. The Stable Diffusion project, developed by Stability AI, is a cutting-edge image synthesis model that utilizes latent diffusion techniques for high-resolution image generation. It offers an advanced method of generating images based on text input, making it highly flexible for various creative applications. The repository contains pretrained models, various checkpoints, and tools to facilitate image generation tasks, such as fine-tuning and modifying the models. ...

2 Reviews

Downloads: 229 This Week

Last Update: 2025-02-28
3

Stable Diffusion Rembg

Removes backgrounds from pictures. Extension for webui

This project is an extension for the Stable Diffusion Web UI that removes backgrounds from images directly inside the interface. It wraps popular background-removal models so creators can take a generated or uploaded image and isolate the subject with a single click. The workflow is designed to be non-destructive: you can preview, tweak thresholds, and export either a transparent PNG or a masked layer for further editing.

Downloads: 3 This Week

Last Update: 2025-10-23
4

Open WebUI

User-friendly AI Interface

...Additionally, Open WebUI offers a Progressive Web App (PWA) for mobile devices, providing offline access and a native app-like experience. The platform also includes a Model Builder, allowing users to create custom models from base Ollama models directly within the interface. With over 156,000 users, Open WebUI is a versatile solution for deploying and managing AI models in a secure, offline environment.

Downloads: 81 This Week

Last Update: 2026-01-10
5

TTS WebUI

A single Gradio + React WebUI with extensions for ACE-Step

TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis.

Downloads: 12 This Week

Last Update: 2025-11-28
6

Pixelization

Stable-diffusion-webui-pixelization

This is a specialized extension for the popular Stable Diffusion Web UI (AUTOMATIC1111) that focuses on converting or “pixelizing” images into a pixel-art aesthetic. It's designed as a plugin you install into the Web UI so that in the “Extras” or “Pixelization” tab you can drag in an input image and produce a stylized, block-based version with control over cell size, color depth, and segmentation.

Downloads: 0 This Week

Last Update: 2025-10-21
7

Z-Image

Image generation model with single-stream diffusion transformer

Z-Image is an efficient, open-source image generation foundation model built to make high-quality image synthesis more accessible. With just 6 billion parameters — far fewer than many large-scale models — it uses a novel “single-stream diffusion Transformer” architecture to deliver photorealistic image generation, demonstrating that excellence does not always require extremely large model sizes. The project includes several variants: Z-Image-Turbo, a distilled version optimized for speed and low resource consumption; Z-Image-Base, the full-capacity foundation model; and Z-Image-Edit, fine-tuned for image editing tasks. ...

Downloads: 148 This Week

Last Update: 6 days ago
8

StyleTTS 2

Towards Human-Level Text-to-Speech through Style Diffusion

StyleTTS2 is a state-of-the-art text-to-speech system that aims for human-level naturalness by combining style diffusion, adversarial training, and large speech language models. It extends the original StyleTTS idea by introducing a style diffusion model that can sample rich, realistic speaking styles conditioned on reference speech, allowing highly expressive and diverse prosody. The architecture uses a two-stage training process and leverages an auxiliary speech language model to guide generation toward more natural and coherent utterances. ...

Downloads: 6 This Week

Last Update: 2025-11-28
9

VoxCPM

TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

VoxCPM is a tokenizer-free text-to-speech system that models speech in a continuous space, aiming for extremely realistic, context-aware synthesis and true-to-life zero-shot voice cloning. Instead of converting speech into discrete tokens, it uses an end-to-end diffusion-autoregressive architecture built on the MiniCPM-4 backbone, combining hierarchical language modeling, finite scalar quantization (FSQ), and local Diffusion Transformers. This design helps decouple semantic and acoustic information while preserving fine-grained prosody, leading to more stable and expressive generation than many discrete-token systems. ...

Downloads: 10 This Week

Last Update: 2025-12-05
10

HunyuanWorld-Voyager

RGBD video generation model conditioned on camera input

HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks.

Downloads: 49 This Week

Last Update: 2025-12-17
11

GLM-Image

GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image

GLM-Image is an open-source generative AI model designed to create high-fidelity images from text prompts using a hybrid architecture that combines autoregressive semantic understanding with diffusion-based detail refinement. It excels at generating images that include complex layouts and detailed text content, making it especially useful for posters, diagrams, infographics, social media graphics, and visual content that requires precise text placement and semantic alignment. Because it blends linguistic reasoning with image synthesis, GLM-Image produces visual outputs where semantic relationships and textual accuracy are prioritized alongside artistic style and realism, and its model structure enables it to handle dense visual knowledge tasks that challenge many pure diffusion models. ...

Downloads: 11 This Week

Last Update: 2026-01-16
12

HunyuanDiT

Diffusion Transformer with Fine-Grained Chinese Understanding

HunyuanDiT is a high-capability text-to-image diffusion transformer with bilingual (Chinese/English) understanding and multi-turn dialogue capability. It trains a diffusion model in latent space using a transformer backbone and integrates a Multimodal Large Language Model (MLLM) to refine captions and support conversational image generation. It supports adapters such as ControlNet, IP-Adapter, and LoRA, and can run under constrained VRAM via distilled versions.

Downloads: 9 This Week

Last Update: 2025-11-27
13

tinygrad

Deep learning framework

This may not be the best deep learning framework, but it is a deep learning framework. Due to its extreme simplicity, it aims to be the easiest framework to add new accelerators to, with support for both inference and training. If XLA is CISC, tinygrad is RISC.

Downloads: 1 This Week

Last Update: 2026-01-12
14

mcpo

A simple, secure MCP-to-OpenAPI proxy server

...This design lets you reuse a growing library of MCP servers with platforms that only understand HTTP+OpenAPI, unifying tool access across ecosystems. The project emphasizes “dead-simple” setup and pairs with Open WebUI documentation that shows end-to-end integration. It supports running multiple tools and makes them discoverable to clients that expect Swagger/JSON schemas. In practice, mcpo shortens the path from a local MCP tool to a shareable, network-accessible microservice.

Downloads: 0 This Week

Last Update: 2025-10-14
15

VibeVoice

Open-source multi-speaker long-form text-to-speech model

...A key innovation is its use of continuous acoustic and semantic speech tokenizers operating at an ultra-low frame rate of 7.5 Hz, enabling high audio fidelity with efficient processing of long sequences. The model integrates a Qwen2.5-based large language model with a diffusion head to produce realistic acoustic details and capture conversational context. Training involved curriculum learning with increasing sequence lengths up to 65K tokens, allowing VibeVoice to handle very long dialogues effectively. Safety mechanisms include an audible disclaimer and imperceptible watermarking in all generated audio to mitigate misuse risks.

Downloads: 12 This Week

Last Update: 2025-12-17
16

HunyuanVideo-Avatar

Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT) model by Tencent Hunyuan for animating static avatar images into dynamic, emotion-controllable, and multi-character dialogue videos, conditioned on audio. It addresses challenges of motion realism, identity consistency, and emotional alignment. Innovations include a character image injection module, an Audio Emotion Module for transferring emotion cues, and a Face-Aware Audio Adapter to isolate audio effects on faces, enabling multiple characters to be animated in a scene. ...

Downloads: 5 This Week

Last Update: 2025-12-16
17

StabilityMatrix

Multi-Platform Package Manager for Stable Diffusion

StabilityMatrix is a multi-platform package manager and launcher for Stable Diffusion tools. It handles installing, updating, and launching web interfaces such as AUTOMATIC1111's Stable Diffusion WebUI and ComfyUI from a single desktop application, and lets installed packages share one model directory. ...

Downloads: 121 This Week

Last Update: 2025-12-29
18

HunyuanVideo-Foley

Multimodal Diffusion with Representation Alignment

HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks.

Downloads: 2 This Week

Last Update: 2025-09-28
19

Oasis

Inference script for Oasis 500M

Open-Oasis provides inference code and released weights for Oasis 500M, an interactive world model that generates gameplay frames conditioned on user keyboard input. Instead of rendering a pre-built game world, the system produces the next visual state via a diffusion-transformer approach, effectively “imagining” the world response to your actions in real time. The project focuses on enabling action-conditional frame generation so developers can experiment with interactive, model-generated environments rather than static video generation alone. Because it’s an inference-focused repository, it’s especially useful as a practical reference for running the model, wiring inputs, and producing the autoregressive sequence of gameplay frames. ...

Downloads: 0 This Week

Last Update: 2026-01-06
20

Chat Nio

Next Generation AI One-Stop Internationalization Solution

Chat Nio is described as a next-generation, all-in-one AI platform that serves as an end-to-end solution for both B2B and B2C use cases. It supports dozens of underlying AI providers (OpenAI, Claude, Stable Diffusion, DALL·E, Midjourney, and many Chinese models), giving users flexibility in backend selection and switching. It offers a full stack: model management, channel/provider integration, a model marketplace, caching, subscription and billing support, dashboard analytics, and a web/admin UI. The platform supports model caching so repeated queries or similar inputs may be accelerated, and has mechanisms for elastic billing/subscription models to monetize usage. ...

Downloads: 1 This Week

Last Update: 2025-10-23
21

Roadmap To Learn Generative AI In 2025

Basic Machine Learning Natural Language Processing Roadmap

Roadmap To Learn Generative AI In 2025 is a curated learning path focused on contemporary generative AI — covering large language models (LLMs), diffusion-based image generation, prompt engineering, multi-modal AI, fine-tuning techniques, and the practical considerations for deploying generative models. It’s aimed at learners and developers who already have some programming or ML basics and wish to specialize in generative AI, offering a modern, structured plan that reflects the state of the art as of 2025. ...

Downloads: 0 This Week

Last Update: 2025-12-02
22

DreamO

A Unified Framework for Image Customization

DreamO is a unified, open-source framework from ByteDance for advanced image customization and generation that consolidates multiple “image manipulation” tasks into a single system, rather than requiring separate specialized models. Built on a diffusion-transformer (DiT) backbone, it supports a diverse set of tasks — including identity preservation, virtual “try-on” (e.g. clothing, accessories), style transfer, IP adaptation (objects/characters), and layout/condition-aware customizations — all handled within the same unified architecture. DreamO’s design introduces a feature routing constraint that helps disentangle different control conditions (like identity, style, clothing) when more than one is specified, which significantly reduces conflicts and artifacts when combining controls. ...

Downloads: 0 This Week

Last Update: 2025-12-02
23

Flow Matching

A PyTorch library for implementing flow matching algorithms

flow_matching is a PyTorch library implementing flow matching algorithms in both continuous and discrete settings, enabling generative modeling via matching vector fields rather than diffusion. The underlying idea is to parameterize a flow (a time-dependent vector field) that transports samples from a simple base distribution to a target distribution, and train via matching of flows without requiring score estimation or noisy corruption—this can lead to more efficient or stable generative training. The library supports both continuous-time flows (via differential equations) and discrete-time analogues, giving flexibility in design and tradeoffs. ...
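
The idea in this description can be illustrated with a minimal sketch in plain Python. It assumes the common straight-line (linear interpolation) path between a base sample and a target sample; the function names `conditional_path`, `flow_matching_loss`, and `euler_sample` are hypothetical and are not the flow_matching library's API. The model is any callable that maps a point and a time to a velocity.

```python
def conditional_path(x0, x1, t):
    """Point on the straight line from x0 (base) to x1 (target) at time t,
    plus the velocity along that line, which is the regression target."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - a for a, b in zip(x0, x1)]
    return x_t, u_t


def flow_matching_loss(model, x0, x1, t):
    """Mean squared error between the model's velocity at (x_t, t) and the
    target velocity. No score estimation or noise schedule is involved."""
    x_t, u_t = conditional_path(x0, x1, t)
    v = model(x_t, t)
    return sum((vi - ui) ** 2 for vi, ui in zip(v, u_t)) / len(u_t)


def euler_sample(model, x0, steps=100):
    """Generate a sample by integrating dx/dt = model(x, t) from t=0 to t=1
    with fixed-step Euler updates."""
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        v = model(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In training, `x0` would be drawn from the base distribution, `x1` from the data, and `t` uniformly from [0, 1]; the loss is then averaged over many such draws. Sampling only needs the learned vector field and an ODE solver, of which Euler is the simplest choice.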

Downloads: 1 This Week

Last Update: 2026-01-05
24

FLUX.1

Official inference repo for FLUX.1 models

FLUX.1 repository contains inference code and tooling for the FLUX.1 text-to-image diffusion models, enabling developers and researchers to generate and edit images from natural-language prompts using open-weight versions of the model on their own hardware or within custom applications. The project is part of a larger family of FLUX models developed by Black Forest Labs, designed to produce high-quality, detailed visuals from text descriptions with competitive prompt adherence and artistic fidelity. ...

Downloads: 12 This Week

Last Update: 6 days ago
25

HunyuanImage-3.0

A Powerful Native Multimodal Model for Image Generation

...It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter counts without linear inference cost explosion. The model is intended to be competitive with closed-source image generation systems, aiming for high fidelity, prompt adherence, fine detail, and even “world knowledge” reasoning (i.e. leveraging context, semantics, or common sense in generation). ...
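
The "subset of experts per token" idea can be sketched generically in plain Python. This is a textbook top-k gating example under stated assumptions, not HunyuanImage-3.0's actual routing: the names `top_k_route` and `moe_forward` are hypothetical, experts are plain callables, and the gate logits are passed in rather than computed by a learned router.

```python
import math


def top_k_route(gate_logits, k):
    """Pick the k experts with the highest gate logits and return a dict
    {expert_index: weight}, with weights from a softmax over those k only."""
    order = sorted(range(len(gate_logits)),
                   key=lambda i: gate_logits[i], reverse=True)
    chosen = order[:k]
    peak = max(gate_logits[i] for i in chosen)  # subtract for stability
    exps = {i: math.exp(gate_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(token, experts, gate_logits, k=2):
    """Run only the selected experts on the token and mix their outputs.
    Compute cost grows with k, not with the total number of experts."""
    weights = top_k_route(gate_logits, k)
    out = [0.0] * len(token)
    for i, w in weights.items():
        y = experts[i](token)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, sorted(weights)
```

With four experts and `k=2`, only two expert functions are evaluated per token, which is why total parameter count can grow without a matching growth in per-token inference cost.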

1 Review

Downloads: 17 This Week

Last Update: 2025-10-31