Showing 175 open source projects for "gpu image"

  • 1
    emgucv

    Cross platform .Net wrapper to the OpenCV image processing library

    Emgu CV is a cross-platform .NET wrapper for the OpenCV image processing library, allowing OpenCV functions to be called from .NET-compatible languages. The wrapper can be compiled with Visual Studio and Unity, and it runs on Windows, Linux, macOS, iOS, and Android.
    Downloads: 1 This Week
  • 2
    Android Emulator Container Scripts

    Minimal scripts to run the emulator in a container for various systems

    ...A built-in WebRTC bridge lets you stream the emulator screen to a browser with interactive input, which is ideal for CI dashboards, remote debugging, or demo environments. The project focuses on reproducibility and scale: you define which system image to boot, how to persist or reset data, and how many instances to run, then schedule them like any other workload. GPU acceleration, audio, and sensors can be enabled depending on your host and cluster capabilities, while fallbacks like SwiftShader keep things usable when no GPU is available.
    Downloads: 5 This Week
  • 3
    ArrayFire

    ArrayFire, a general purpose GPU library

    ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if...
    Downloads: 1 This Week
  • 4
    Wan2.1

    Wan2.1: Open and Advanced Large-Scale Video Generative Model

    ...Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports text-to-video and image-to-video generation tasks with flexible resolution options suitable for various GPU hardware configurations. Wan2.1’s architecture balances generation quality and inference cost, paving the way for later improvements seen in Wan2.2 such as Mixture-of-Experts and enhanced aesthetics. It was trained on large-scale video and image datasets, providing generalization across diverse scenes and motion patterns.
    Downloads: 69 This Week
  • 5
    ML Sharp

    Sharp Monocular View Synthesis in Less Than a Second

    ML Sharp is a research code release that turns a single 2D photograph into a photorealistic 3D representation that can be rendered from nearby viewpoints. Instead of requiring multi-view input, it predicts the parameters of a 3D Gaussian scene representation directly from one image using a single forward pass through a neural network. The core idea is speed: the 3D representation is produced in under a second on a standard GPU, and then the resulting scene can be rendered in real time to generate new views interactively. The representation is metric, meaning it supports camera movements with an absolute scale rather than only relative depth cues, which is useful for consistent viewpoint changes and downstream spatial tasks. ...
    Downloads: 2 This Week
  • 6
    DALI

    A GPU-accelerated library containing highly optimized building blocks

    ...DALI addresses the problem of the CPU bottleneck by offloading data preprocessing to the GPU. Additionally, DALI relies on its own execution engine, built to maximize the throughput of the input pipeline.
    Downloads: 2 This Week
  • 7
    COLMAP

    Structure-from-Motion and Multi-View Stereo

    COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline with a graphical and command-line interface. It offers a wide range of features for the reconstruction of ordered and unordered image collections. The software is licensed under the new BSD license.
    Downloads: 27 This Week
  • 8
    WanGP

    AI video generator optimized for low VRAM and older GPUs

    Wan2GP is an open source AI video generation toolkit designed to make modern generative models accessible on consumer-grade hardware with limited GPU memory. It acts as a unified interface for running multiple video, image, and audio generation models, including Wan-based models as well as other systems like Hunyuan Video, Flux, and Qwen. A key focus of the project is reducing VRAM requirements, enabling some workflows to run on as little as 6 GB while still supporting older Nvidia and certain AMD GPUs. ...
    Downloads: 39 This Week
  • 9
    ComfyUI

    The most powerful and modular diffusion model GUI, API and backend

    The most powerful and modular diffusion model GUI and backend. This UI lets you design and execute advanced Stable Diffusion pipelines using a graph/nodes/flowchart-based interface. We are a team dedicated to iterating and improving ComfyUI, supporting the ComfyUI ecosystem with tools like node manager, node registry, CLI, automated testing, and public documentation. Open source AI models will win in the long run against closed models and we are only at the beginning. Our core mission...
    Downloads: 222 This Week
  • 10
    TensorRT Node for ComfyUI

    Enables the best performance on NVIDIA RTX Graphics Cards

    ComfyUI_TensorRT is an extension that lets ComfyUI run AI inference through NVIDIA’s TensorRT, aiming to get faster, more efficient execution on supported GPUs. It bridges the gap between ComfyUI’s flexible, node-based workflows and TensorRT’s highly optimized engine format. The result is that complex diffusion or image-processing graphs can be accelerated without the user having to rewrite the pipeline. The repo typically includes instructions for converting models to TensorRT engines and for wiring those engines into ComfyUI nodes. This is particularly attractive for power users who run many generations or who host ComfyUI on dedicated hardware and want to squeeze out every bit of GPU performance. ...
    Downloads: 0 This Week
  • 11
    Skiko

    Kotlin Multiplatform bindings to Skia

    Skiko is an open-source graphics library from JetBrains that provides lightweight, cross-platform bindings for the Skia graphics engine tailored specifically for Kotlin Multiplatform and Compose applications. It serves as the low-level rendering backbone for Kotlin UI frameworks like Compose for Desktop and Compose for Web, enabling smooth, GPU-accelerated 2D graphics across Windows, macOS, Linux, and other supported targets without writing native code. Skiko abstracts away platform-specific rendering details while exposing Skia’s powerful features such as high-quality text shaping, image filters, path operations, and hardware accelerated canvases, making it ideal for building rich UI components, animations, games, or custom drawing surfaces. ...
    Downloads: 1 This Week
  • 12
    JiT

    PyTorch implementation of JiT

    JiT is an open-source PyTorch implementation of a state-of-the-art image diffusion model designed around a minimalist yet powerful architecture for pixel-level generative modeling, based on the paper Back to Basics: Let Denoising Generative Models Denoise. Rather than predicting noise, JiT models directly predict clean image data, which the research suggests aligns better with the manifold structure of natural images and leads to stronger generative performance at high resolution. This...
    Downloads: 3 This Week
  • 13
    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    Wan2.2 is a major upgrade to the Wan series of open and advanced large-scale video generative models, incorporating cutting-edge innovations to boost video generation quality and efficiency. It introduces a Mixture-of-Experts (MoE) architecture that splits the denoising process across specialized expert models, increasing total model capacity without raising computational costs. Wan2.2 integrates meticulously curated cinematic aesthetic data, enabling precise control over lighting,...
    Downloads: 99 This Week
  • 14
    UpscalerJS

    Image upscaling in JavaScript. Increase image resolution up to 4x

    Image upscaling in JavaScript. Increase image resolution up to 4x using TensorFlow.js. Open source, compatible with the browser and Node, and completely free to use under the MIT license. Scale images up to 4x their original size, all in JavaScript. UpscalerJS ships with pre-trained models in the box covering a wide variety of use cases. Or bring your own!
    Downloads: 0 This Week
  • 15
    DINOv3

    Reference PyTorch implementation and models for DINOv3

    DINOv3 is the third-generation iteration of Meta’s self-supervised visual representation learning framework, building upon the ideas from DINO and DINOv2. It continues the paradigm of learning strong image representations without labels using teacher–student distillation, but introduces a simplified and more scalable training recipe that performs well across datasets and architectures. DINOv3 removes the need for complex augmentations or momentum encoders, streamlining the pipeline while...
    Downloads: 19 This Week
  • 16
    Face Alignment

    2D and 3D face alignment library built using PyTorch

    Detect facial landmarks from Python using the world's most accurate face alignment network, capable of detecting points in both 2D and 3D coordinates. Built using FAN's state-of-the-art deep learning-based face alignment method. For numerical evaluations, it is highly recommended to use the Lua version, which uses the same models as those evaluated in the paper. More models will be added soon. By default, the package will use the SFD face detector. However, the users can alternatively...
    Downloads: 0 This Week
  • 17
    MuJoCo Playground

    An open source library for GPU-accelerated robot learning

    MuJoCo Playground, developed by Google DeepMind, is a GPU-accelerated suite of simulation environments for robot learning and sim-to-real research, built on top of MuJoCo MJX. It unifies a range of control, locomotion, and manipulation tasks into a consistent and scalable framework optimized for JAX and Warp backends. The project includes classic control benchmarks from dm_control, advanced quadruped and bipedal locomotion systems, and dexterous as well as non-prehensile manipulation setups. ...
    Downloads: 0 This Week
  • 18
    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac, and Linux machines, and on GPU cards with as little as 4 GB of RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. ...
    Downloads: 6 This Week
  • 19
    CogVideo

    Text and image to video generation: CogVideoX and CogVideo

    CogVideo is an open-source family of advanced video generation models that can create videos from text, images, or existing video inputs. Built on large-scale Transformer and diffusion architectures, it enables multimodal generation across text-to-video, image-to-video, and video continuation tasks. The latest CogVideoX models offer higher resolution outputs, longer video durations, and improved controllability through prompt engineering. The project includes tools for inference,...
    Downloads: 9 This Week
  • 20
    Nexa SDK

    Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML

    Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), speech-to-text (ASR), and text-to-speech (TTS). Additionally, it offers an OpenAI-compatible API server with JSON schema mode for function calling and streaming support, and a user-friendly Streamlit UI. Users can run Nexa SDK on any device with a Python environment, and GPU acceleration is supported, including CUDA, Metal, and ROCm. ...
    Downloads: 11 This Week
  • 21
    Stable Diffusion Version 2

    High-Resolution Image Synthesis with Latent Diffusion Models

    Stable Diffusion (the stablediffusion repo by Stability-AI) is an open-source implementation and reference codebase for high-resolution latent diffusion image models that power many text-to-image systems. The repository provides code for training and running Stable Diffusion-style models, instructions for installing dependencies (with notes about performance libraries like xformers), and guidance on hardware/driver requirements for efficient GPU inference and training. It’s organized as a practical, developer-focused toolkit: model code, scripts for inference, and examples for using memory-efficient attention and related optimizations are included so researchers and engineers can run or adapt the model for their own projects. ...
    Downloads: 8 This Week
  • 22
    DeepDetect

    Deep Learning API and Server in C++14 support for Caffe, PyTorch

    ...While the Open Source Deep Learning Server is the core element, with REST API, and multi-platform support that allows training & inference everywhere, the Deep Learning Platform allows higher level management for training neural network models and using them as if they were simple code snippets. Ready for applications of image tagging, object detection, segmentation, OCR, Audio, Video, Text classification, CSV for tabular data and time series. Neural network templates for the most effective architectures for GPU, CPU, and Embedded devices. Training in a few hours and with small data thanks to 25+ pre-trained models. Full Open Source, with an ecosystem of tools (API clients, video, annotation, ...) ...
    Downloads: 1 This Week
  • 23
    Depth Pro

    Sharp Monocular Metric Depth in Less Than a Second

    Depth Pro is a foundation model for zero-shot metric monocular depth estimation, producing sharp, high-frequency depth maps with absolute scale from a single image. Unlike many prior approaches, it does not require camera intrinsics or extra metadata, yet still outputs metric depth suitable for downstream 3D tasks. Apple highlights both accuracy and speed: the model can synthesize a ~2.25-megapixel depth map in around 0.3 seconds on a standard GPU, enabling near real-time applications. The repo and research page emphasize boundary fidelity and crisp geometry, addressing a common weakness in monocular depth where edges can blur. ...
    Downloads: 0 This Week
  • 24
    Spartan Engine

    A game engine with an emphasis on real-time cutting-edge solutions

    ...The engine implements a wide range of advanced graphics features, such as atmospheric scattering, physically based shading, screen-space shadows and ambient occlusion, screen-space reflections, sophisticated shadow mapping, volumetric fog, and HDR output. It supports next-gen performance and image quality technologies including variable rate shading, dynamic resolution scaling, temporal anti-aliasing, and upscaling via XeSS 2 and FSR 3. Beyond rendering, SpartanEngine offers PhysX-powered physics, CPU and GPU profiling, and a thread pool for parallel workloads.
    Downloads: 2 This Week
  • 25
    Servo

    Embed web technologies in applications

    Servo is an experimental, highly parallel, and embeddable browser rendering engine written in Rust. It leverages Rust’s memory-safety and concurrency strengths, supports modern GPU-powered rendering (WebGL/WebGPU), and serves as a research-forward alternative to traditional browser engines. It is currently developed on 64-bit macOS, 64-bit Linux, 64-bit Windows, 64-bit OpenHarmony, and Android. Open governance under Linux...
    Downloads: 1 This Week