Showing 281 open source projects for "automatic1111-stable-diffusion"

  • 1
    Applio

    A simple, high-quality voice conversion tool focused on ease of use

    ...The project is structured to be flexible through plugins and configurations so users can extend functionality without touching the core code. Applio is considered stable and mature; ongoing development is now centered on security patches, dependency maintenance, and occasional improvements, which makes it attractive for production or repeatable workflows. It also includes TensorBoard helper scripts so people training custom models can monitor metrics and experiment more systematically.
    Downloads: 83 This Week
  • 2
    GLM-4.7

    Advanced language and coding AI model

    GLM-4.7 is an advanced agent-oriented large language model designed as a high-performance coding and reasoning partner. It delivers significant gains over GLM-4.6 in multilingual agentic coding, terminal-based workflows, and real-world developer benchmarks such as SWE-bench and Terminal Bench 2.0. The model introduces stronger “thinking before acting” behavior, improving stability and accuracy in complex agent frameworks like Claude Code, Cline, and Roo Code. GLM-4.7 also advances “vibe...
    Downloads: 79 This Week
  • 3
    AnimateDiff

    Plug-n-play module turning text-to-image models into animation

    AnimateDiff is an open-source project designed to enhance text-to-image diffusion models by adding animation capabilities. It allows users to turn static images generated by popular text-to-image models into animated sequences without requiring additional model training. This plug-and-play tool is compatible with a wide range of community models and facilitates the generation of animation directly from pre-existing text-to-image models.
    Downloads: 44 This Week
  • 4
    TensorFlow

    TensorFlow is an open source library for machine learning

    Originally developed by Google for internal use, TensorFlow is an open source platform for machine learning. Available across all common operating systems (desktop, server and mobile), TensorFlow provides stable APIs for Python and C as well as APIs that are not guaranteed to be backwards compatible or are 3rd party for a variety of other languages. The platform can be easily deployed on multiple CPUs, GPUs and Google's proprietary chip, the tensor processing unit (TPU). TensorFlow expresses its computations as dataflow graphs, with each node in the graph representing an operation. ...
    Downloads: 32 This Week
  • 5
    DINOv3

    Reference PyTorch implementation and models for DINOv3

    DINOv3 is the third-generation iteration of Meta’s self-supervised visual representation learning framework, building upon the ideas from DINO and DINOv2. It continues the paradigm of learning strong image representations without labels using teacher–student distillation, but introduces a simplified and more scalable training recipe that performs well across datasets and architectures. DINOv3 removes the need for complex augmentations or momentum encoders, streamlining the pipeline while...
    Downloads: 23 This Week
  • 6
    Chatterbox

    SoTA open-source TTS

    Chatterbox is Resemble AI's first production-grade open source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs and is consistently preferred in side-by-side evaluations. Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out. Try it now on our...
    Downloads: 17 This Week
  • 7
    LeWorldModel

    Official code base for LeWorldModel: Stable End-to-End Joint-Embedding

    LeWorldModel is a minimalist tiling window manager designed for the X11 windowing system, focusing on simplicity, performance, and efficient use of screen space. It provides automatic window tiling behavior, organizing application windows into structured layouts without requiring manual resizing or positioning. The project emphasizes a lightweight design, minimizing resource usage while maintaining responsiveness and stability. It is highly configurable through source code or configuration...
    Downloads: 4 This Week
  • 8
    Materials Discovery: GNoME

    AI discovers 520,000 stable inorganic crystal structures for research

    Materials Discovery (GNoME) is a large-scale research initiative by Google DeepMind focused on applying graph neural networks to accelerate the discovery of stable inorganic crystal materials. The project centers on Graph Networks for Materials Exploration (GNoME), a message-passing neural network architecture trained on density functional theory (DFT) data to predict material stability and energy formation. Using GNoME, DeepMind identified 381,000 new stable materials, later expanding the dataset to include over 520,000 materials within 1 meV/atom of the convex hull as of August 2024. ...
    Downloads: 2 This Week
  • 9
    GPT Computer Assistant

    GPT-4o for Windows, macOS and Linux

    This project is an alternative that brings the ChatGPT macOS app experience to Windows and Linux, and it aims to be a fresh, stable implementation. For now it can be installed as a Python library; a pipeline for native install scripts (.exe) is planned.
    Downloads: 4 This Week
  • 10
    RL Baselines3 Zoo

    Training framework for Stable Baselines3 reinforcement learning agents

    rl-baselines3-zoo is a collection of pre-trained models, benchmarks, and hyperparameter tuning tools built on top of Stable Baselines3, a reinforcement learning library. It provides an easy way to test, evaluate, and train RL agents across a wide variety of environments.
    Downloads: 0 This Week
  • 11
    LongCat-Image

    Foundation model for image generation

    LongCat-Image is an open-source foundation model for image generation and editing created by the LongCat team at Meituan, designed to deliver high-quality visual outputs while remaining efficient and accessible for developers and researchers. Rather than relying on massive parameter counts typical of many cutting-edge models, LongCat-Image achieves strong photorealism, stable structure, and accurate bilingual (Chinese and English) text rendering with a more compact ~6-billion parameter architecture, making it competitive with much larger alternatives despite its relatively lean design. The model excels at both text-to-image generation and instruction-guided image editing, offering users versatile capabilities for creative and practical tasks—whether generating art, mockups, or adjusting existing visuals with fine control.
    Downloads: 1 This Week
  • 12
    Chitu

    High-performance inference framework for large language models

    ...The system also includes performance optimizations for large models, including support for quantized formats and efficient computation operators that reduce memory usage and latency. Its architecture aims to support enterprise adoption by ensuring stable long-term operation under production workloads.
    Downloads: 1 This Week
  • 13
    vJEPA-2

    PyTorch code and models for VJEPA2 self-supervised learning from video

    ...The architecture is designed to scale: spatiotemporal ViT backbones, flexible masking schedules, and efficient sampling let it train on long clips while remaining stable. Trained representations transfer well to downstream tasks such as action recognition, temporal localization, and video retrieval, often with simple linear probes or light fine-tuning. The repository typically includes end-to-end recipes—data pipelines, augmentation policies, training scripts, and evaluation harnesses.
    Downloads: 1 This Week
  • 14
    Godot RL Agents

    An Open Source package that allows video game creators

    godot_rl_agents is a reinforcement learning integration for the Godot game engine. It allows AI agents to learn how to interact with and play Godot-based games using RL algorithms. The toolkit bridges Godot with Python-based RL libraries like Stable-Baselines3, making it possible to create complex and visually rich RL environments natively in Godot.
    Downloads: 0 This Week
  • 15
    Klavis AI

    MCP integration platforms for AI agents to use tools at any scale

    Klavis AI is a Y Combinator X25-backed open-source infrastructure platform that enables AI agents to reliably connect with external tools and services at scale through Model Context Protocol (MCP). Founded by ex-Google DeepMind and ex-Lyft engineers, Klavis provides 50+ production-ready MCP servers with enterprise OAuth support for GitHub, Slack, Gmail, Salesforce, Linear, Notion, and more. The flagship product Strata solves tool overload through progressive discovery, achieving +13% higher...
    Downloads: 1 This Week
  • 16
    JEPA

    PyTorch code and models for V-JEPA self-supervised learning from video

    ...The repository provides training recipes, data pipelines, and evaluation utilities for image JEPA variants and often includes ablations that illuminate which masking and architectural choices matter. Because the objective is non-autoregressive and operates in embedding space, JEPA tends to be compute-efficient and stable at scale. The approach has become a strong alternative to contrastive or pixel-reconstruction methods for representation learning.
    Downloads: 0 This Week
  • 17
    ManiSkill

    SAPIEN Manipulation Skill Framework

    ManiSkill is a benchmark platform for training and evaluating reinforcement learning agents on dexterous manipulation tasks using physics-based simulations. Developed by Hao Su Lab, it focuses on robotic manipulation with diverse, high-quality 3D tasks designed to challenge perception, control, and planning in robotics. ManiSkill provides both low-level control and visual observation spaces for realistic learning scenarios.
    Downloads: 1 This Week
  • 18
    pix2pixHD

    Synthesizing and manipulating 2048x1024 images with conditional GANs

    ...It is widely used to convert structured inputs such as semantic label maps into realistic images, making it particularly valuable in applications like autonomous driving simulation, face synthesis, and scene generation. The model improves upon earlier GAN approaches by introducing multi-scale generators and discriminators that enable stable training and fine detail generation at large resolutions. It also supports interactive editing, allowing users to modify semantic regions and regenerate images with realistic adjustments.
    Downloads: 1 This Week
  • 19
    OpenHome Abilities

    Open-source abilities for OpenHome agents

    ...The system is meant to support a wide range of voice-driven actions, from API calls and media playback to quiz flows, device control, and multi-turn conversations, so it functions as a practical extension framework rather than a narrow template library. The repository includes official abilities maintained by the OpenHome team as well as community-contributed ones, creating both a stable baseline and a path for outside developers to publish their own work.
    Downloads: 1 This Week
  • 20
    DeepChem

    Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry, etc

    DeepChem aims to provide a high-quality open-source toolchain that democratizes the use of deep learning in drug discovery, materials science, quantum chemistry, and biology. DeepChem currently supports Python 3.7 through 3.9 and has a set of packages that are always required, plus a number of "soft" requirements. If you encounter an error such as ImportError: This class requires XXXX, you may need to install additional packages. DeepChem provides support for TensorFlow, PyTorch, JAX and each...
    Downloads: 1 This Week
  • 21
    FATE

    An industrial grade federated learning framework

    FATE (Federated AI Technology Enabler) is the world's first industrial grade federated learning open source framework to enable enterprises and institutions to collaborate on data while protecting data security and privacy. It implements secure computation protocols based on homomorphic encryption and multi-party computation (MPC). Supporting various federated learning scenarios, FATE now provides a host of federated learning algorithms, including logistic regression, tree-based algorithms,...
    Downloads: 1 This Week
  • 22
    DocsGPT

    Private AI platform for agents, enterprise search and RAG pipelines

    DocsGPT is an open-source AI platform for deploying private RAG pipelines, AI agents, and enterprise search on your own infrastructure. Connect any data source (PDFs, DOCX, CSV, Excel, HTML, audio, GitHub, databases, URLs) and get accurate, hallucination-free answers with source citations. Choose your LLM: OpenAI, Anthropic, Google Gemini, or local models. Works with Qdrant, MongoDB, and Elasticsearch and more. Deploy via Docker or Kubernetes with full data sovereignty. Build...
    Downloads: 0 This Week
  • 23
    FireRedTTS-2

    Long-form streaming TTS system for multi-speaker dialogue generation

    FireRedTTS2 is a next-generation open-source text-to-speech (TTS) system focused on long-form, streaming speech synthesis for multi-speaker dialogue, delivering stable natural speech with context-aware prosody and reliable speaker transitions that support real-time and conversational applications. It features a specialized streaming speech tokenizer and a dual-transformer architecture that enables low latency and high-quality synthesis, making it suitable for interactive systems like chatbots, podcasts, and applications where dynamic turn-taking between speakers is essential. ...
    Downloads: 0 This Week
  • 24
    VibeTensor

    Our first fully AI generated deep learning system

    VibeTensor is a groundbreaking open-source research system software stack for deep learning that was uniquely generated almost entirely by AI coding agents under guided human supervision, demonstrating a new frontier in AI-assisted software engineering. It implements a PyTorch-style eager tensor library with a modern C++20 core that supports both CPU and CUDA backends, giving it the ability to manage tensors, automatic differentiation (autograd), and complex computation flows similar to...
    Downloads: 0 This Week
  • 25
    Dolphin

    Document Image Parsing via Heterogeneous Anchor Prompting

    ...Because multimedia delivery requirements vary widely (adaptive streaming, live feeds, cross-platform compatibility, custom UI, performance constraints), Dolphin aims to offer a foundation that developers can build upon or adapt to their needs. It is designed to integrate with other tools and libraries and provide stable playback or media-processing pipelines, while remaining open-source so that users can inspect, extend, and adapt it.
    Downloads: 0 This Week