Showing 158 open source projects for "gpu image"

  • 1
    HunyuanDiT

    Diffusion Transformer with Fine-Grained Chinese Understanding

    HunyuanDiT is a high-capability text-to-image diffusion transformer with bilingual (Chinese/English) understanding and multi-turn dialogue capability. It trains a diffusion model in latent space using a transformer backbone and integrates a Multimodal Large Language Model (MLLM) to refine captions and support conversational image generation. It supports adapters such as ControlNet, IP-Adapter, and LoRA, and its distilled versions can run under constrained VRAM. LoRA, ControlNet (pose, depth,...
    Downloads: 0 This Week
  • 2
    Faiss

    Library for efficient similarity search and clustering of dense vectors

    Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research. Faiss contains several methods for similarity search. It...
    Downloads: 0 This Week
  • 3
    HunyuanVideo-Avatar

    Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

    HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT) model by Tencent Hunyuan for animating static avatar images into dynamic, emotion-controllable, and multi-character dialogue videos, conditioned on audio. It addresses challenges of motion realism, identity consistency, and emotional alignment. Innovations include a character image injection module, an Audio Emotion Module for transferring emotion cues, and a Face-Aware Audio Adapter to isolate audio effects on faces,...
    Downloads: 1 This Week
  • 4
    Open-Sora

    Open-Sora: Democratizing Efficient Video Production for All

    Open-Sora is an open-source initiative aimed at democratizing high-quality video production. It offers a user-friendly platform that simplifies the complexities of video generation, making advanced video techniques accessible to everyone. The project embraces open-source principles, fostering creativity and innovation in content creation. Open-Sora provides tools, models, and resources to create high-quality videos, aiming to lower the entry barrier for video production and support diverse...
    Downloads: 12 This Week
  • 5
    DreamO

    A Unified Framework for Image Customization

    DreamO is a unified, open-source framework from ByteDance for advanced image customization and generation that consolidates multiple “image manipulation” tasks into a single system, rather than requiring separate specialized models. Built on a diffusion-transformer (DiT) backbone, it supports a diverse set of tasks — including identity preservation, virtual “try-on” (e.g. clothing, accessories), style transfer, IP adaptation (objects/characters), and layout/condition-aware customizations —...
    Downloads: 0 This Week
  • 6
    marqo

    Tensor search for humans

    A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and...
    Downloads: 0 This Week
  • 7
    Halide

    A language for fast, portable data-parallel computation

    Halide is a programming language for fast, portable data-parallel computation. It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and targets several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU compute APIs (CUDA, OpenCL, OpenGL, among others). Halide is not a standalone programming language, however; it is embedded in C++, which means that you write C++ code that builds an in-memory representation of a Halide pipeline using Halide's C++ API. ...
    Downloads: 0 This Week
  • 8
    Anime4KCPP

    A high performance anime upscaler

    Anime4KCPP provides an optimized implementation of bloc97's Anime4K algorithm version 0.9 as well as its own CNN algorithm, ACNet. It can be used in a variety of ways, including preprocessing and real-time playback, and aims to be a high-performance tool for processing both images and video. The project is for learning and for the exploration task of the algorithm course at SWJTU. Anime4K is a simple, high-quality anime upscaling algorithm. Version 0.9 does not use any machine learning approaches and can be...
    Downloads: 24 This Week
  • 9
    'Lightweight' GAN

    Implementation of 'lightweight' GAN, proposed in ICLR 2021

    ...The main contribution of the paper is a skip-layer excitation in the generator, paired with autoencoding self-supervised learning in the discriminator. Quoting the one-line summary: "converge on single gpu with few hours' training, on 1024 resolution sub-hundred images". Augmentation is essential for Lightweight GAN to work effectively in a low-data setting. You can test and see how your images will be augmented before they pass into the neural network (if you use augmentation). The general recommendation is to use as many augmentations suited to your data as possible, then, after some time of training, disable the ones most destructive to the image. ...
    Downloads: 0 This Week
  • 10
    AWS Deep Learning Containers

    A set of Docker images for training and serving models in TensorFlow

    AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet. Deep Learning Containers provide optimized environments with TensorFlow and MXNet, Nvidia CUDA (for GPU instances), and Intel MKL (for CPU instances) libraries and are available in the Amazon Elastic Container Registry (Amazon ECR). The AWS DLCs are used in Amazon SageMaker as the default vehicles for your SageMaker jobs such as training, inference,...
    Downloads: 1 This Week
  • 11
    MiniMind-V

    "Big Model" trains a visual multimodal VLM with 26M parameters

    MiniMind-V is an experimental open-source project that aims to train a very small multimodal vision–language model (VLM) from scratch with extremely low compute and cost, making research and experimentation accessible to more people. The repository showcases training workflows and code designed to produce a 26-million parameter model—including both image and text capabilities—using minimal resources in very little time, reflecting a trend toward democratizing AI research. MiniMind-V combines techniques from modern vision-language modeling but focuses on efficiency and simplicity so that individuals or small teams can explore multimodal learning without massive GPU clusters. It includes training scripts, model definitions, and associated tooling that illustrate how to build and evaluate such lightweight models. ...
    Downloads: 0 This Week
  • 12
    RamaLama

    Simplifies the local serving of AI models from any source

    RamaLama is an open-source developer tool that simplifies working with and serving AI models locally or in production by leveraging container technologies like Docker, Podman, and OCI registries, allowing AI inference workflows to be treated like standard container deployments. It abstracts away much of the complexity of configuring AI runtimes, dependencies, and hardware optimizations by detecting available GPUs (or falling back to CPU) and automatically pulling a container image...
    Downloads: 0 This Week
  • 13
    Denoising Diffusion Probabilistic Model

    Implementation of Denoising Diffusion Probabilistic Model in PyTorch

    Implementation of Denoising Diffusion Probabilistic Model in PyTorch. It is a new approach to generative modeling that may have the potential to rival GANs. It uses denoising score matching to estimate the gradient of the data distribution, followed by Langevin sampling to sample from the true distribution. If you simply want to pass in a folder name and the desired image dimensions, you can use the Trainer class to easily train a model.
    Downloads: 0 This Week
  • 14
    FLUX.2-klein-4B

    Flux 2 image generation model pure C inference

    FLUX.2-klein-4B is a compact, pure C inference implementation of the FLUX.2-klein-4B image generation model. Written with a strong emphasis on simplicity, correctness, and performance, it exposes a minimal C API that can be embedded in broader applications without pulling in heavy dependencies. Because...
    Downloads: 5 This Week
  • 15
    DeepSpeed MII

    MII makes low-latency and high-throughput inference possible

    ...The Deep Learning (DL) open-source community has seen tremendous growth in the last few months. Incredibly powerful text generation models such as BLOOM 176B, or image generation models such as Stable Diffusion, are now available to anyone with access to a handful of GPUs, or even a single GPU, through platforms such as Hugging Face. While open-sourcing has democratized access to AI capabilities, their application is still restricted by two critical factors: inference latency and cost. DeepSpeed-MII is a new open-source Python library from DeepSpeed, aimed at making low-latency, low-cost inference of powerful models not only feasible but also easily accessible. ...
    Downloads: 0 This Week
  • 16
    Tiny CUDA Neural Networks

    Lightning fast C++/CUDA neural network framework

    This is a small, self-contained framework for training and querying neural networks. Most notably, it contains a lightning-fast "fully fused" multi-layer perceptron (technical paper), a versatile multiresolution hash encoding (technical paper), as well as support for various other input encodings, losses, and optimizers. We provide a sample application where an image function (x,y) -> (R,G,B) is learned. The fully fused MLP component of this framework requires a very large amount of shared...
    Downloads: 2 This Week
  • 17
    Qwen2.5-Omni

    Capable of understanding text, audio, vision, video

    Qwen2.5-Omni is an end-to-end multimodal flagship model in the Qwen series by Alibaba Cloud, designed to process multiple modalities (text, images, audio, video) and generate responses as both text and natural speech in real-time streaming. It uses a “Thinker-Talker” architecture and introduces innovations for aligning modalities over time (for example, synchronizing video and audio), robust speech generation, and low-VRAM/quantized versions to make usage more accessible. It holds...
    Downloads: 0 This Week
  • 18
    Jupyter Docker Stacks

    Ready-to-run Docker images containing Jupyter applications

    Jupyter Docker Stacks provides a curated set of ready-to-run Docker container images that bundle Jupyter applications with popular data science and computing tools, enabling users to quickly start working in a reproducible environment. These stacks support a range of use cases, from lightweight base notebook images to full-featured environments that include scientific computing libraries, machine learning tools, and IDE-like notebook interfaces, all within Docker containers that run...
    Downloads: 3 This Week
  • 19
    UNO

    A Universal Customization Method for Single and Multi-Subject Conditioning

    UNO is a project by ByteDance introduced in 2025, titled “A Universal Customization Method for Both Single and Multi-Subject Conditioning.” It suggests a framework for image (or more general generative) modeling where the model can be conditioned either on a single subject or multiple subjects — which may correspond to generating or customizing images featuring specific people, styles, or objects, possibly with fine-grained control over subject identity or composition. Because the project is...
    Downloads: 0 This Week
  • 20
    DocArray

    The data structure for multimodal data

    DocArray is a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer multimodal data with a Pythonic API. A door to the multimodal world: a super-expressive data structure for representing complicated, mixed, or nested text, image, video, audio, and 3D mesh data. The foundation data structure of Jina, CLIP-as-service, DALL·E Flow, DiscoArt, etc. ...
    Downloads: 0 This Week
  • 21
    LingBot-World

    Advancing Open-source World Models

    LingBot-World is an open-source, high-fidelity world simulator designed to advance the state of world models through video generation. Built on top of Wan2.2, it enables realistic, dynamic environment simulation across diverse styles, including real-world, scientific, and stylized domains. LingBot-World supports long-term temporal consistency, maintaining coherent scenes and interactions over minute-level horizons. With real-time interactivity and sub-second latency at 16 FPS, it is...
    Downloads: 4 This Week
  • 22
    Stable Diffusion WebUI Docker

    Easy Docker setup for Stable Diffusion with user-friendly UI

    Stable Diffusion WebUI Docker is a Docker-based repository that simplifies running Stable Diffusion with rich user interfaces by packaging multiple popular web UIs into an easy-to-deploy containerized solution. It integrates leading community UIs like AUTOMATIC1111 and ComfyUI into a Docker Compose setup that can be started with a single command, abstracting away dependency installation and environment configuration. Users can choose which UI profile they want to run — for example, full...
    Downloads: 0 This Week
  • 23
    Contour

    Modern C++ Terminal Emulator

    ...Blur behind a transparent background when using Windows 10 or the KDE window manager on Linux. Blurrable background image support. Runtime configuration reload. 256-color and Truecolor support. Key binding customization.
    Downloads: 1 This Week
  • 24
    PyTorch Geometric

    Geometric deep learning extension library for PyTorch

    ...We have outsourced a lot of the functionality of PyTorch Geometric to other packages, which need to be installed separately. These packages come with their own CPU and GPU kernel implementations based on C++/CUDA extensions. We do not recommend installing as the root user on your system Python. Please set up an Anaconda/Miniconda environment or create a Docker image. We provide pip wheels for all major OS/PyTorch/CUDA combinations.
    Downloads: 1 This Week
  • 25
    HunyuanWorld-Voyager

    RGBD video generation model conditioned on camera input

    HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks. At its core, Voyager integrates a world-consistent video...
    Downloads: 0 This Week