Search Results for "automatic1111-stable-diffusion" - Page 4

Showing 23843 open source projects for "automatic1111-stable-diffusion"

1

HunyuanVideo-Avatar

Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT) model by Tencent Hunyuan for animating static avatar images into dynamic, emotion-controllable, and multi-character dialogue videos, conditioned on audio. It addresses challenges of motion realism, identity consistency, and emotional alignment. Innovations include a character image injection module, an Audio Emotion Module for transferring emotion cues, and a Face-Aware Audio Adapter to isolate audio effects on faces, enabling multiple characters to be animated in a scene. ...

Downloads: 0 This Week

Last Update: 2025-12-16
2

MegaTTS 3

Official PyTorch Implementation

MegaTTS3 is an open-source text-to-speech (TTS) and voice-cloning system from ByteDance that aims to deliver high-quality, expressive speech synthesis, including zero-shot voice cloning of previously unseen speakers. Its backbone is a lightweight diffusion transformer (roughly 0.45B parameters), which enables efficient inference while still producing high-fidelity audio. Given a reference audio sample (and its corresponding latent representation), MegaTTS3 can generate speech in the style and voice timbre of that speaker, which is useful for personalized TTS, voice-overs, dubbing, or multi-speaker applications. ...

Downloads: 0 This Week

Last Update: 2025-12-02
3

FLUX.1

Official inference repo for FLUX.1 models

The FLUX.1 repository contains inference code and tooling for the FLUX.1 text-to-image diffusion models, enabling developers and researchers to generate and edit images from natural-language prompts using open-weight versions of the model on their own hardware or within custom applications. The project is part of a larger family of FLUX models developed by Black Forest Labs, designed to produce high-quality, detailed visuals from text descriptions with competitive prompt adherence and artistic fidelity. ...

Downloads: 32 This Week

Last Update: 2026-01-19
4

Roadmap To Learn Generative AI In 2025

Basic Machine Learning Natural Language Processing Roadmap

Roadmap To Learn Generative AI In 2025 is a curated learning path focused on contemporary generative AI — covering large language models (LLMs), diffusion-based image generation, prompt engineering, multi-modal AI, fine-tuning techniques, and the practical considerations for deploying generative models. It’s aimed at learners and developers who already have some programming or ML basics and wish to specialize in generative AI, offering a modern, structured plan that reflects the state of the art as of 2025. ...

Downloads: 0 This Week

Last Update: 2025-12-02
5

DreamO

A Unified Framework for Image Customization

DreamO is a unified, open-source framework from ByteDance for advanced image customization and generation that consolidates multiple “image manipulation” tasks into a single system, rather than requiring separate specialized models. Built on a diffusion-transformer (DiT) backbone, it supports a diverse set of tasks — including identity preservation, virtual “try-on” (e.g. clothing, accessories), style transfer, IP adaptation (objects/characters), and layout/condition-aware customizations — all handled within the same unified architecture. DreamO’s design introduces a feature routing constraint that helps disentangle different control conditions (like identity, style, clothing) when more than one is specified, which significantly reduces conflicts and artifacts when combining controls. ...

Downloads: 0 This Week

Last Update: 2025-12-02
6

Flow Matching

A PyTorch library for implementing flow matching algorithms

flow_matching is a PyTorch library implementing flow matching algorithms in both continuous and discrete settings, enabling generative modeling via matching vector fields rather than diffusion. The underlying idea is to parameterize a flow (a time-dependent vector field) that transports samples from a simple base distribution to a target distribution, and train via matching of flows without requiring score estimation or noisy corruption—this can lead to more efficient or stable generative training. The library supports both continuous-time flows (via differential equations) and discrete-time analogues, giving flexibility in design and tradeoffs. ...

Downloads: 2 This Week

Last Update: 2026-01-05
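
The training idea in the Flow Matching description can be illustrated with a minimal sketch. This is plain Python and does not use the flow_matching library's API. The straight-line path between samples, the toy 1-D distributions, and all function names are assumptions chosen for illustration.

```python
import random

def sample_pair(rng):
    """Draw x0 from a base N(0, 1) and x1 from a toy target N(3, 0.5^2) (both assumed)."""
    return rng.gauss(0.0, 1.0), rng.gauss(3.0, 0.5)

def interpolate(x0, x1, t):
    """Point on the straight-line path from x0 (t=0) to x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1

def target_velocity(x0, x1):
    """Velocity of the straight-line path; it is constant in t."""
    return x1 - x0

def flow_matching_loss(velocity_fn, n_samples=1000, seed=0):
    """Monte Carlo estimate of the squared error between a candidate
    vector field and the path velocity, averaged over random pairs and times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x0, x1 = sample_pair(rng)
        t = rng.random()
        x_t = interpolate(x0, x1, t)
        err = velocity_fn(x_t, t) - target_velocity(x0, x1)
        total += err * err
    return total / n_samples

def zero_field(x, t):
    """A field that never moves samples."""
    return 0.0

def constant_field(x, t):
    """A field that pushes every sample toward the target mean."""
    return 3.0

if __name__ == "__main__":
    print(flow_matching_loss(zero_field), flow_matching_loss(constant_field))
```

In this toy setup the constant field scores a lower loss than the zero field, because it matches the average displacement from base to target. A real model would replace these hand-written fields with a trained network.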
7

HunyuanWorld 1.0

Generating Immersive, Explorable, and Interactive 3D Worlds

HunyuanWorld-1.0 is an open-source, simulation-capable 3D world generation model developed by Tencent Hunyuan that creates immersive, explorable, and interactive 3D environments from text or image inputs. It combines the strengths of video-based diversity and 3D-based geometric consistency through a novel framework using panoramic world proxies and semantically layered 3D mesh representations. This approach enables 360° immersive experiences, seamless mesh export for graphics pipelines, and...

Downloads: 3 This Week

Last Update: 2026-04-15
8

MethodOfLines.jl

Automatic Finite Difference PDE solving with Julia SciML

MethodOfLines.jl is a Julia package for automated finite difference discretization of symbolically defined PDEs in N dimensions. It uses symbolic expressions for systems of partial differential equations as defined with ModelingToolkit.jl, and Interval from DomainSets.jl to define the space(time) over which the simulation runs. This project is under active development, so the interface is subject to change. The docs will be updated to reflect any changes; please check back for current...

Downloads: 1 This Week

Last Update: 2024-10-12
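
The finite difference discretization that the MethodOfLines.jl description refers to can be sketched by hand for one simple case. This is plain Python, not the MethodOfLines.jl interface. The 1-D heat equation, the zero boundary values, the explicit Euler time stepper, and all names are assumptions chosen for illustration.

```python
import math

def heat_rhs(u, dx, diffusivity):
    """Semi-discrete right-hand side: central second difference at interior points."""
    du = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        du[i] = diffusivity * (u[i + 1] - 2.0 * u[i] + u[i - 1]) / dx**2
    return du  # endpoint entries stay 0.0, which keeps the boundary values fixed

def solve_heat(n_points=21, diffusivity=1.0, t_end=0.1):
    """Integrate u_t = D * u_xx on [0, 1] with u = 0 at both ends."""
    dx = 1.0 / (n_points - 1)
    dt = 0.25 * dx**2 / diffusivity      # inside the explicit Euler limit dx^2 / (2 D)
    n_steps = math.ceil(t_end / dt)
    dt = t_end / n_steps                 # land exactly on t_end
    xs = [i * dx for i in range(n_points)]
    u = [math.sin(math.pi * x) for x in xs]  # initial condition (assumed)
    u[0] = u[-1] = 0.0
    for _ in range(n_steps):
        du = heat_rhs(u, dx, diffusivity)
        u = [ui + dt * dui for ui, dui in zip(u, du)]
    return xs, u

if __name__ == "__main__":
    xs, u = solve_heat()
    exact_mid = math.exp(-math.pi**2 * 0.1) * math.sin(math.pi * xs[10])
    print(u[10], exact_mid)
```

Space is discretized first, which turns the PDE into a system of ODEs in time, one per grid point. Any ODE integrator could then replace the Euler loop.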
9

ComfyUI SUPIR

SUPIR upscaling wrapper for ComfyUI

The ComfyUI-SUPIR project is a ComfyUI integration of the SUPIR model, which is designed for high-quality image restoration and super-resolution. It enables users to enhance low-resolution or degraded images using advanced diffusion-based techniques. The integration provides nodes that allow users to control parameters such as noise levels, guidance strength, and output quality. It is particularly useful for workflows that require upscaling or restoring images before further processing. The project leverages modern generative models to produce sharp, detailed outputs while preserving the original structure of the image. ...

Downloads: 5 This Week

Last Update: 5 days ago
10

Grounded-Segment-Anything

Marrying Grounding DINO with Segment Anything & Stable Diffusion

Grounded-Segment-Anything is a research-oriented project that combines powerful open-set object detection with pixel-level segmentation and subsequent creative workflows, effectively enabling detection, segmentation, and high-level vision tasks guided by free-form text prompts. The core idea behind the project is to pair Grounding DINO — a zero-shot object detector that can locate objects described by natural language — with Segment Anything Model (SAM), which can produce detailed masks for...

Downloads: 0 This Week

Last Update: 2026-02-03
11

NetherSX2 Classic

Continuation of NetherSX2 based on AetherSX2 3668

NetherSX2-classic is a companion and variant of NetherSX2 that targets a specific older base version of the AetherSX2 emulator (based on the 3668 branch), applying similar custom patches to provide a stable and performant PS2 emulation environment on Android devices. The project stitches in anti-tampering modifications, RetroAchievements notification fixes, and controller and GameDB updates while maintaining the legacy behavior of the classic build for compatibility with titles that might perform better on the older codebase. Because this classic branch starts from a slightly different upstream version than NetherSX2-patch, users often choose it for performance reasons on lower-power devices or for games with known regressions in newer builds. ...

Downloads: 3,208 This Week

Last Update: 2026-01-05
12

CogVideo

Text and image to video generation: CogVideoX and CogVideo

CogVideo is an open-source family of advanced video generation models that can create videos from text, images, or existing video inputs. Built on large-scale Transformer and diffusion architectures, it enables multimodal generation across text-to-video, image-to-video, and video continuation tasks. The latest CogVideoX models offer higher resolution outputs, longer video durations, and improved controllability through prompt engineering. The project includes tools for inference, fine-tuning, and optimization, making it suitable for both research and production use. ...

Downloads: 22 This Week

Last Update: 2025-10-04
13

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper

WhisperSpeech is an open-source text-to-speech system created by “inverting” OpenAI’s Whisper, reusing its strengths as a semantic audio model to generate speech instead of only transcribing it. The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS: Whisper is used to produce semantic tokens, EnCodec compresses the waveform into acoustic tokens, and Vocos reconstructs high-fidelity audio from those tokens. ...

Downloads: 3 This Week

Last Update: 2025-11-28
14

Linfa

A Rust machine learning framework

linfa aims to provide a comprehensive toolkit to build Machine Learning applications with Rust. Kin in spirit to Python's scikit-learn, it focuses on common preprocessing tasks and classical ML algorithms for your everyday ML tasks.

Downloads: 0 This Week

Last Update: 2025-12-23
15

101-0250-00

ETH course - Solving PDEs in parallel on GPUs

This course aims to cover state-of-the-art methods in modern parallel Graphics Processing Unit (GPU) computing, supercomputing and code development with applications to natural sciences and engineering.

Downloads: 0 This Week

Last Update: 2026-01-05
16

PEFT

State-of-the-art Parameter-Efficient Fine-Tuning

Parameter-Efficient Fine-Tuning (PEFT) methods enable efficient adaptation of pre-trained language models (PLMs) to various downstream applications without fine-tuning all the model's parameters. Fine-tuning large-scale PLMs is often prohibitively costly. In this regard, PEFT methods only fine-tune a small number of (extra) model parameters, thereby greatly decreasing the computational and storage costs. Recent State-of-the-Art PEFT techniques achieve performance comparable to that of full...

Downloads: 4 This Week

Last Update: 2026-04-16
17

BepInEx

Unity / XNA game patcher and plugin framework

BepInEx is a plugin / modding framework for Unity Mono, IL2CPP, and .NET framework games (XNA, FNA, MonoGame, etc.). Stable builds are released once a new iteration of BepInEx is considered feature-complete. They have the fewest bugs, but some of the newest features might not be available. Bleeding edge builds are available on BepisBuilds and are always the latest builds of the source code. They are therefore the opposite of stable builds: they have the newest features and bugfixes, but tend to be the most buggy. ...

Downloads: 370 This Week

Last Update: 2026-02-09
18

OmniVoice

High-Quality Voice Cloning TTS for 600+ Languages

The OmniVoice project is a cutting-edge multilingual text-to-speech system designed to generate high-quality speech across more than 600 languages. Built on a diffusion language model-style architecture, it combines scalability with strong performance, enabling both natural-sounding voice synthesis and efficient inference speeds. One of its most notable capabilities is zero-shot voice cloning, allowing users to replicate a speaker’s voice using only a short reference audio clip. In addition, it supports voice design through configurable attributes such as gender, accent, pitch, and speaking style, giving users fine-grained control over generated speech. ...

Downloads: 8 This Week

Last Update: 6 days ago
19

HunyuanImage-3.0

A Powerful Native Multimodal Model for Image Generation

...It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter counts without linear inference cost explosion. The model is intended to be competitive with closed-source image generation systems, aiming for high fidelity, prompt adherence, fine detail, and even “world knowledge” reasoning (i.e. leveraging context, semantics, or common sense in generation). ...

1 Review

Downloads: 10 This Week

Last Update: 2026-02-03
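
The phrase "deploying only a subset of experts per token" in the HunyuanImage-3.0 description can be illustrated with a generic top-k gating sketch. The listing does not describe the model's actual router, so the softmax gate, the renormalized weights, the toy scalar experts, and all names here are assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Return (expert_index, weight) for the k highest-scoring experts,
    with weights renormalized to sum to 1."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Evaluate only the selected experts and mix their outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Four toy "experts" acting on a scalar token (hypothetical stand-ins for subnetworks).
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
gate_scores = [0.1, 2.0, -1.0, 2.0]

if __name__ == "__main__":
    print(route_top_k(gate_scores, 2))
    print(moe_forward(3.0, experts, gate_scores, k=2))
```

Only two of the four experts run for this token, so compute per token scales with k rather than with the total number of experts.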
20

LiveAvatar

Streaming Real-time Audio-Driven Avatar Generation

LiveAvatar is an open-source research and implementation project that provides a unified framework for real-time, streaming, interactive avatar video generation driven by audio and other control signals. It implements techniques from state-of-the-art diffusion-based avatar modeling to support infinite-length continuous video generation with low latency, enabling interactive AI avatars that maintain continuity and realism over extended sessions. The project co-designs algorithms and system optimizations, such as block-wise autoregressive processing and fast sampling strategies, to deliver real-time frame rates (e.g., ~45 FPS on appropriate GPU clusters) while handling non-stop generation without quality degradation. ...

Downloads: 3 This Week

Last Update: 2026-04-08
21

TensorRT Node for ComfyUI

Enables the best performance on NVIDIA RTX Graphics Cards

ComfyUI_TensorRT is an extension that lets ComfyUI run AI inference through NVIDIA’s TensorRT, aiming to get faster, more efficient execution on supported GPUs. It bridges the gap between ComfyUI’s flexible, node-based workflows and TensorRT’s highly optimized engine format. The result is that complex diffusion or image-processing graphs can be accelerated without the user having to rewrite the pipeline. The repo typically includes instructions for converting models to TensorRT engines and for wiring those engines into ComfyUI nodes. This is particularly attractive for power users who run many generations or who host ComfyUI on dedicated hardware and want to squeeze out every bit of GPU performance. ...

Downloads: 3 This Week

Last Update: 2025-10-30
22

cc-switch

A cross-platform desktop All-in-One assistant tool for Claude Code

...Built as a modern desktop app using Tauri and web technologies, it enables users to manage credentials, sessions, and tool settings without manually editing configuration files. The project also includes advanced reliability features such as automatic failover, local proxy routing, and usage monitoring to help maintain stable AI tool operations. With ongoing updates adding session management, backup controls, and expanded provider support, cc-switch is positioned as a power-user control center for AI-assisted development environments.

Downloads: 1,503 This Week

Last Update: 2026-04-23
23

Lax 6.0.6 Stable

Official Lax 6.0.6 on SourceForge. All Lax versions are here.

Downloads: 0 This Week

Last Update: 2026-01-14
24

rembg-stable-projectorz

For integration of REMBG with StableProjectorz (https://stableprojectorz.com). Helps detect and remove backgrounds from 2D images. A second mirror, in addition to the GitHub repository.

Downloads: 0 This Week

Last Update: 2025-02-10
25

trellis-stable-projectorz

A one-click installer for Windows (Python 3.11, CUDA 11.8, Torch 2.1.2). Repository for integration with StableProjectorz, a free AI-texturing tool (https://stableprojectorz.com). Our Discord server: https://discord.gg/aWbnX2qan2 . Supports float16 and int32 optimizations.

1 Review

Downloads: 0 This Week

Last Update: 2025-08-08