Showing 323 open source projects for "images"

View related business solutions
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 1
    Improved Diffusion

    Improved Diffusion

    Release for Improved Denoising Diffusion Probabilistic Models

    ...These models, also known as score-based generative models, are a class of generative models that have shown strong performance in producing high-quality synthetic data such as images. The repository provides code for training and sampling diffusion models with improved techniques that enhance stability, efficiency, and output fidelity. It includes scripts for setting up training runs, generating samples, and reproducing results from OpenAI’s research on diffusion-based generation. The implementation is intended for researchers and practitioners who want to explore the theoretical and practical aspects of diffusion models in deep learning. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    Recurrent Interface Network (RIN)

    Recurrent Interface Network (RIN)

    Implementation of Recurrent Interface Network (RIN)

    Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in Pytorch. The author unawaredly reinvented the induced set-attention block from the set transformers paper. They also combine this with the self-conditioning technique from the Bit Diffusion paper, specifically for the latents. The last ingredient seems to be a new noise function based around the sigmoid, which the author claims is better than cosine scheduler for larger images. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    Autodistill

    Autodistill

    Images to inference with no labeling

    Autodistill uses big, slower foundation models to train small, faster supervised models. Using autodistill, you can go from unlabeled images to inference on a custom model running at the edge with no human intervention in between. You can use Autodistill on your own hardware, or use the Roboflow hosted version of Autodistill to label images in the cloud.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    SSD in PyTorch 1.0

    SSD in PyTorch 1.0

    High quality, fast, modular reference implementation of SSD in PyTorch

    This repository implements SSD (Single Shot MultiBox Detector). The implementation is heavily influenced by the projects ssd.pytorch, pytorch-ssd and maskrcnn-benchmark. This repository aims to be the code base for research based on SSD. Multi-GPU training and inference: We use DistributedDataParallel, you can train or test with arbitrary GPU(s), the training schema will change accordingly. Add your own modules without pain. We abstract backbone, Detector, BoxHead, BoxPredictor, etc. You can...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 99.99% Uptime for MySQL and PostgreSQL Databases Icon
    99.99% Uptime for MySQL and PostgreSQL Databases

    Sub-second maintenance. 2x read/write performance. Built-in vector search for AI apps.

    Cloud SQL Enterprise Plus delivers near-zero downtime with 35 days of point-in-time recovery. Supports MySQL, PostgreSQL, and SQL Server.
    Try Free
  • 5
    Extended Dreambooth How-To Guides

    Extended Dreambooth How-To Guides

    Implementation of Dreambooth

    Extended Dreambooth How-To Guides is an implementation and extended toolkit for fine-tuning Stable Diffusion models using the DreamBooth technique, enabling users to train AI image generators to reproduce specific subjects, styles, or identities from a small set of reference images. The project adapts and expands upon earlier DreamBooth research by providing practical scripts, notebooks, and workflows that allow users to train personalized models on local machines, cloud environments, or platforms such as Google Colab. It focuses heavily on usability, offering detailed guides for different setups while still exposing advanced configuration options for experienced users. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    CLIP-as-service

    CLIP-as-service

    Embed images and sentences into fixed-length vectors

    CLIP-as-service is a low-latency high-scalability service for embedding images and text. It can be easily integrated as a microservice into neural search solutions. Serve CLIP models with TensorRT, ONNX runtime and PyTorch w/o JIT with 800QPS[*]. Non-blocking duplex streaming on requests and responses, designed for large data and long-running tasks. Horizontally scale up and down multiple CLIP models on single GPU, with automatic load balancing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    DPM-Solver

    DPM-Solver

    Fast ODE Solver for Diffusion Probabilistic Model Sampling

    ...This approach significantly reduces the computational cost required to generate images while maintaining strong generation quality.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    towhee

    towhee

    Framework that is dedicated to making neural data processing

    Towhee is an open-source machine-learning pipeline that helps you encode your unstructured data into embeddings. You can use our Python API to build a prototype of your pipeline and use Towhee to automatically optimize it for production-ready environments. From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding, to model inference, making your pipeline execution 10x faster. Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    OpenAssistant

    OpenAssistant

    Chat-based assistant that understands tasks

    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so. In the same way that Stable Diffusion helped the world make art and images in new ways, we want to improve the world by providing amazing conversational AI. We are in the early stages of development, working from established research in applying RLHF to large language models. Open Assistant is a project organized by LAION and individuals around the world interested in bringing this technology to everyone. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Cut Data Warehouse Costs by 54% Icon
    Cut Data Warehouse Costs by 54%

    Easily migrate from Snowflake, Redshift, or Databricks with free tools.

    BigQuery delivers 54% lower TCO with exabyte scale and flexible pricing. Free migration tools handle the SQL translation automatically.
    Try Free
  • 10
    Shap-E

    Shap-E

    Generate 3D objects conditioned on text or images

    The shap-e repository provides the official code and model release for Shap-E, a conditional generative model designed to produce 3D assets (implicit functions, meshes, neural radiance fields) from text or image prompts. The model is built with a two-stage architecture: first an encoder that maps existing 3D assets into parameterizations of implicit functions, and then a conditional diffusion model trained on those parameterizations to generate new assets. Because it works at the level of...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    DALL-E 2 - Pytorch

    DALL-E 2 - Pytorch

    Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis

    ...To train CLIP, you can either use x-clip package, or join the LAION discord, where a lot of replication efforts are already underway. Then, you will need to train the decoder, which learns to generate images based on the image embedding coming from the trained CLIP.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    ConsistencyDecoder

    ConsistencyDecoder

    Consistency Distilled Diff VAE

    ConsistencyDecoder is a Python package from OpenAI that introduces an improved decoding method for variational autoencoders (VAEs) used in Stable Diffusion pipelines. Instead of relying solely on the standard GAN or VAE decoder, this approach leverages a Consistency Distilled Diff VAE, designed to produce higher-quality and more stable outputs from encoded latents. The project provides a simple API for encoding with a Stable Diffusion VAE and decoding using the new consistency model,...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    CogView

    CogView

    Text-to-Image generation. The repo for NeurIPS 2021 paper

    CogView is a large-scale pretrained text-to-image transformer model, introduced in the NeurIPS 2021 paper CogView: Mastering Text-to-Image Generation via Transformers. With 4 billion parameters, it was one of the earliest transformer-based models to successfully generate high-quality images from natural language descriptions in Chinese, with partial support for English via translation. The model incorporates innovations such as PB-relax and Sandwich-LN to enable stable training of very deep transformers without NaN loss issues. CogView supports multiple tasks beyond text-to-image, including image captioning, post-selection (ranking candidate images by relevance to a prompt), and super-resolution (upscaling model-generated images). ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    DragGAN

    DragGAN

    Official Code for DragGAN (SIGGRAPH 2023)

    DragGAN is a research-driven image editing system that enables precise manipulation of GAN-generated images through interactive point dragging. The project introduces a novel workflow where users move specific points in an image and the model intelligently deforms the content while preserving realism. Built on top of StyleGAN architectures, the tool operates directly on the learned generative manifold to maintain photorealistic consistency.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    ChatFred

    ChatFred

    Alfred workflow using ChatGPT, DALL·E 2 and other models for chatting

    Alfred workflow using ChatGPT, DALL·E 2 and other models for chatting, image generation and more. Access ChatGPT, DALL·E 2, and other OpenAI models. Language models often give wrong information. Verify answers if they are important. Talk with ChatGPT via the cf keyword. Answers will show as Large Type. Alternatively, use the Universal Action, Fallback Search, or Hotkey. To generate text with InstructGPT models and see results in-line, use the cft keyword. ⤓ Install on the Alfred Gallery or...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    YoloV3 Implemented in TensorFlow 2.0

    YoloV3 Implemented in TensorFlow 2.0

    YoloV3 Implemented in Tensorflow 2.0

    YoloV3 Implemented in TensorFlow 2.0 is built using TensorFlow 2.0. The project provides a modern deep learning implementation of the popular YOLOv3 algorithm, which is widely used for real-time object detection in images and video streams. YOLOv3 works by dividing an image into grid regions and predicting bounding boxes and class probabilities simultaneously, allowing objects to be detected quickly and efficiently. The repository includes training scripts, inference tools, and configuration files that make it possible to train custom object detection models on user-defined datasets. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    ControlNet

    ControlNet

    Let us control diffusion models

    ControlNet is a neural network architecture designed to add conditional control to text-to-image diffusion models. Rather than training from scratch, ControlNet “locks” the weights of a pre-trained diffusion model and introduces a parallel trainable branch that learns additional conditions—like edges, depth maps, segmentation, human pose, scribbles, or other guidance signals. This allows the system to control where and how the model should focus during generation, enabling users to steer...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    StoryTeller

    StoryTeller

    Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.

    ...To develop locally, install dev dependencies and install pre-commit hooks. This will automatically trigger linting and code quality checks before each commit. The final video will be saved as /out/out.mp4, alongside other intermediate images, audio files, and subtitles. For more advanced use cases, you can also directly interface with Story Teller in Python code.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    Basaran

    Basaran

    Basaran, an open-source alternative to the OpenAI text completion API

    ...Multi-GPU support with optional 8-bit quantization. Real-time partial progress using server-sent events. Compatible with OpenAI API and client libraries. Comes with a fancy web-based playground. Docker images are available on Docker Hub and GitHub Packages.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    find-similar

    find-similar

    User-friendly library to find similar objects

    The mission of the FindSimilar project is to provide a powerful and versatile open source library that empowers developers to efficiently find similar objects and perform comparisons across a variety of data types. Whether dealing with texts, images, audio, or more, our project aims to simplify the process of identifying similarities and enhancing decision-making. https://github.com/findsimilar/find-similar - GitHub repo http://demo.findsimilar.org/ - Demo project and tutorial https://docs.findsimilar.org/ - Documentation
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Nougat

    Nougat

    Implementation of Nougat Neural Optical Understanding

    Nougat is a multi-modal generative modeling framework that bridges vision and text modalities with structured generation control (e.g. layout, scene composition) rather than treating images as flat contexts. It combines object-centric modules with transformer-based reasoning to propose, refine, and render scenes in a generative pipeline. The architecture allows you to specify or prompt a layout (which objects should be where) and then the model fills in appearance, context, lighting, and relations coherently. The design supports interactive editing: you could adjust object positions or types and have the model adapt generation accordingly. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    OpenFlamingo

    OpenFlamingo

    An open-source framework for training large multimodal models

    Welcome to our open source version of DeepMind's Flamingo model! In this repository, we provide a PyTorch implementation for training and evaluating OpenFlamingo models. We also provide an initial OpenFlamingo 9B model trained on a new Multimodal C4 dataset (coming soon). Please refer to our blog post for more details. This repo is still under development, and we hope to release better-performing and larger OpenFlamingo models soon. If you have any questions, please feel free to open an...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    macara

    macara

    A converter for seamless transformation of files, data, and media ...

    ...The design of this software is evolutionary, allowing for the seamless integration of additional scripts, menus, or windows as needed. Serving as a versatile tool, it facilitates efficient file management, especially when handling a substantial volume of images, whether sorting by name or other attributes. These scripts are crafted to complement generative art AI technologies like Dall-e or stable diffusion.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    hloc

    hloc

    Visual localization made easy with hloc

    ...The notebook pipeline_InLoc.ipynb shows the steps for localizing with InLoc. It's much simpler since a 3D SfM model is not needed. We show in pipeline_SfM.ipynb how to run 3D reconstruction for an unordered set of images. This generates reference poses, and a nice sparse 3D model suitable for localization with the same pipeline as Aachen.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    TaskMatrix

    TaskMatrix

    Enable sending and receiving images during chatting

    TaskMatrix is an experimental AI ecosystem designed to connect large language models with visual foundation models, APIs, and external systems in order to complete multimodal tasks collaboratively. The project expands beyond traditional chatbot behavior by enabling AI systems to process, generate, edit, and reason about images while coordinating multiple specialized models simultaneously. Originally introduced alongside the Visual ChatGPT concept, TaskMatrix acts as an orchestration framework where a central language model delegates subtasks to domain-specific AI systems such as image generators, segmentation tools, or recognition models. The architecture focuses on modularity, allowing new APIs and foundation models to be integrated as interchangeable task-solving components. ...
    Downloads: 0 This Week
    Last Update:
    See Project
Auth0 Logo