Showing 94 open source projects for "gpu"

  • 1
    CuPy

    A NumPy-compatible array library accelerated by CUDA

    CuPy is an open source implementation of a NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class, and many functions that operate on it. CuPy offers GPU-accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can speed up some operations by more than 100x. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases. CuPy is easy to install through pip or through precompiled binary packages (wheels) for recommended environments. ...
    Downloads: 17 This Week
  • 2
    ComputeSharp

    .NET library to run C# code in parallel on the GPU through DX12

    ComputeSharp is a .NET library to run C# code in parallel on the GPU through DX12 and dynamically generated HLSL compute shaders. The available APIs let you access GPU devices, allocate GPU buffers and textures, move data between them and the RAM, write compute shaders entirely in C# and have them run on the GPU. The goal of this project is to make GPU computing easy to use for all .NET developers!
    Downloads: 1 This Week
  • 3
    cuDF

    GPU DataFrame Library

    ...The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
    Downloads: 4 This Week
  • 4
    Shumai

    Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun

    ...It can automatically leverage GPU acceleration on Linux (via CUDA) and CPU computation on macOS.
    Downloads: 3 This Week
  • 5
    Khronos KTX

    KTX (Khronos Texture) Library and Tools

    KTX-Software is a suite of tools and libraries for working with Khronos Texture (KTX) files, designed and maintained by the Khronos Group. KTX is a container format for storing textures that are optimized for GPU upload, supporting modern formats like Basis Universal and ASTC. This repository includes tools for creating, validating, inspecting, and converting KTX and KTX2 files, making it essential for developers working in 3D engines, games, and visualization tools where texture streaming and compression are key.
    Downloads: 41 This Week
  • 6
    DeepSpeed

    Deep learning optimization library making distributed training easy

    ...DeepSpeed delivers extreme-scale model training for everyone, from data scientists training on massive supercomputers to those training on low-end clusters or even on a single GPU. Using the current generation of GPU clusters with hundreds of devices, DeepSpeed's 3D parallelism can efficiently train deep learning models with trillions of parameters. With just a single GPU, DeepSpeed's ZeRO-Offload can train models with over 10B parameters, 10x bigger than the state of the art, democratizing multi-billion-parameter model training so that many deep learning scientists can explore bigger and better models. ...
    Downloads: 2 This Week
  • 7
    PortableGL

    An implementation of OpenGL 3.x-ish in clean C

    PortableGL is a single-header, software-only implementation of a subset of OpenGL (specifically the GL 2.1 pipeline), designed to run entirely on the CPU. This lightweight graphics library allows OpenGL-style rendering without GPU acceleration, making it ideal for educational use, debugging, embedded systems, and retro-style software rendering. Because it mirrors OpenGL syntax and design, it can act as a drop-in CPU renderer for testing or deploying 3D graphics on platforms without GPU support.
    Downloads: 9 This Week
  • 8
    libplacebo

    Official mirror of libplacebo

    libplacebo is a flexible, high-performance graphics library built on top of Vulkan, designed to provide reusable GPU-accelerated components for media applications. It originated as a core part of the rendering pipeline for the mpv media player and has since grown into a standalone library used for tone mapping, dithering, color space conversion, and more. libplacebo is ideal for developers looking to integrate sophisticated video rendering and post-processing into their own applications with full control over shaders and rendering stages.
    Downloads: 7 This Week
  • 9
    MuJoCo Playground

    An open source library for GPU-accelerated robot learning

    MuJoCo Playground, developed by Google DeepMind, is a GPU-accelerated suite of simulation environments for robot learning and sim-to-real research, built on top of MuJoCo MJX. It unifies a range of control, locomotion, and manipulation tasks into a consistent and scalable framework optimized for JAX and Warp backends. The project includes classic control benchmarks from dm_control, advanced quadruped and bipedal locomotion systems, and dexterous as well as non-prehensile manipulation setups. ...
    Downloads: 1 This Week
  • 10
    Faiss

    Library for efficient similarity search and clustering of dense vectors

    ...It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research. Faiss contains several methods for similarity search. It assumes that the instances are represented as vectors and are identified by an integer, and that the vectors can be compared with L2 (Euclidean) distances or dot products. Vectors that are similar to a query vector are those that have the lowest L2 distance or the highest dot product with the query vector. ...
    Downloads: 3 This Week
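
The similarity criterion described above (lowest L2 distance or highest dot product, with vectors identified by integer ids) can be illustrated with a minimal brute-force sketch. This is plain Python and does not use the Faiss API; the function names `l2_distance`, `dot`, and `search` and the sample data are hypothetical, chosen only for illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def search(vectors, query, k, metric="l2"):
    """Return the integer ids of the k stored vectors most similar to query.

    Exhaustive comparison against every vector; a real library would use
    far more efficient index structures than this (assumption: illustration only).
    """
    ids = range(len(vectors))
    if metric == "l2":
        # Smaller distance means more similar.
        ranked = sorted(ids, key=lambda i: l2_distance(vectors[i], query))
    elif metric == "ip":
        # Larger dot product means more similar.
        ranked = sorted(ids, key=lambda i: dot(vectors[i], query), reverse=True)
    else:
        raise ValueError(f"unknown metric: {metric}")
    return ranked[:k]


# Hypothetical data: four 2-D vectors, identified by their list index.
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [2.0, 2.0]]
query = [0.9, 0.1]

nearest_l2 = search(vectors, query, k=2, metric="l2")
nearest_ip = search(vectors, query, k=2, metric="ip")
```

Note that the two metrics can rank the same data differently: a long vector pointing roughly in the query's direction scores well under the dot product even when it is far away in L2 terms.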
  • 11
    Images.jl

    An image library for Julia

    ...Julia is well-suited to image processing because it is a modern and elegant high-level language that is a pleasure to use, while also allowing you to write "inner loops" that compile to efficient machine code (i.e., it is as fast as C). Julia supports multithreading and, through add-on packages, GPU processing. JuliaImages is a collection of packages specifically focused on image processing. It is not yet as complete as some toolkits for other programming languages, but it has many useful algorithms. It is focused on clean architecture and is designed to unify "machine vision" and "biomedical 3d image processing" communities.
    Downloads: 3 This Week
  • 12
    DeepEP

    DeepEP: an efficient expert-parallel communication library

    DeepEP is a communication library designed specifically to support Mixture-of-Experts (MoE) and expert parallelism (EP) deployments. Its core role is to implement high-throughput, low-latency all-to-all GPU communication kernels, which handle the dispatching of tokens to different experts (or shards) and then combining expert outputs back into the main data flow. Because MoE architectures require routing inputs to different experts, communication overhead can become a bottleneck — DeepEP addresses that by providing optimized GPU kernels and efficient dispatch/combining logic. ...
    Downloads: 0 This Week
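
The dispatch and combine steps described above can be sketched conceptually in a single process. This is not DeepEP's API and involves no GPU communication; it assumes top-1 routing (each token goes to exactly one expert), and the names `dispatch`, `combine`, and `run_moe` are hypothetical.

```python
def dispatch(tokens, expert_ids, num_experts):
    """Group tokens by assigned expert, remembering each token's original position."""
    buckets = [[] for _ in range(num_experts)]
    for position, (token, expert) in enumerate(zip(tokens, expert_ids)):
        buckets[expert].append((position, token))
    return buckets


def combine(expert_outputs, num_tokens):
    """Scatter per-expert outputs back into the original token order."""
    output = [None] * num_tokens
    for bucket in expert_outputs:
        for position, value in bucket:
            output[position] = value
    return output


def run_moe(tokens, expert_ids, experts):
    """Dispatch tokens, apply each expert to its bucket, then combine the results."""
    buckets = dispatch(tokens, expert_ids, len(experts))
    processed = [
        [(position, experts[e](token)) for position, token in bucket]
        for e, bucket in enumerate(buckets)
    ]
    return combine(processed, len(tokens))


# Hypothetical experts: plain functions standing in for expert networks.
experts = [lambda x: x + 1, lambda x: x * 10]
result = run_moe([1, 2, 3, 4], [0, 1, 1, 0], experts)
```

In a distributed deployment the buckets would live on different devices, so both steps become all-to-all exchanges; that exchange is the communication cost the description refers to.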
  • 13
    DGL

    Python package built to ease deep learning on graphs

    ...We also want to make the combination of graph-based modules and tensor-based modules (PyTorch or MXNet) as smooth as possible. DGL provides a powerful graph object that can reside on either CPU or GPU. It bundles structural data as well as features for better control. We provide a variety of functions for computing with graph objects, including efficient and customizable message-passing primitives for Graph Neural Networks.
    Downloads: 0 This Week
  • 14
    Kornia

    Open Source Differentiable Computer Vision Library

    Kornia is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer vision problems. At its core, the package uses PyTorch as its main backend both for efficiency and to take advantage of reverse-mode auto-differentiation to define and compute the gradient of complex functions. Inspired by existing packages, this library is composed of a subset of packages containing operators that can be inserted within...
    Downloads: 4 This Week
  • 15
    XFrames

    GPU-accelerated GUI development for Node.js and the browser

    xframes is a high-performance library that empowers developers to build native desktop applications using familiar web technologies, specifically Node.js and React, without the overhead of the DOM. xframes serves as a streamlined alternative to Electron, designed for developers looking to maximize performance and efficiency.
    Downloads: 5 This Week
  • 16
    webgl-plot

    A high-performance real-time 2D plotting library based on native WebGL

    ...Unlike traditional canvas or SVG-based charting libraries, webgl-plot is optimized for streaming and dynamic updates, making it ideal for oscilloscope-style data, biomedical signals, or any application where data updates hundreds of times per second. Its minimal memory footprint and GPU acceleration ensure excellent performance even with tens of thousands of data points, and its simple API allows developers to get started quickly.
    Downloads: 0 This Week
  • 17
    Tracy Profiler

    Frame profiler

    ...Tracy supports profiling of the CPU (with direct support for C, C++, Lua, and Python integration, plus third-party bindings for many other languages such as Rust, Zig, C#, OCaml, and Odin), the GPU (all major graphics APIs: OpenGL, Vulkan, Direct3D 11/12, OpenCL), memory allocations, locks, and context switches. It can also automatically attribute screenshots to captured frames, and much more.
    Downloads: 8 This Week
  • 18
    RLax

    Library of JAX-based building blocks for reinforcement learning agents

    ...It supports both on-policy and off-policy learning, as well as value-based, policy-based, and model-based approaches. RLax is fully JIT-compilable with JAX, enabling high-performance execution across CPU, GPU, and TPU backends. The library implements tools for Bellman equations, return distributions, general value functions, and policy optimization in both continuous and discrete action spaces. It integrates seamlessly with DeepMind’s Haiku (for neural network definition) and Optax (for optimization), making it a key component in modular RL pipelines.
    Downloads: 0 This Week
  • 19
    Theseus

    A library for differentiable nonlinear optimization

    ...Because solves are differentiable, you can backpropagate through optimization to learn cost weights, feature extractors, or initialization networks end-to-end. The implementation supports batched optimization on GPU, robust losses, damping strategies, and custom factors, making it practical for real-time systems. Helper packages provide geometry primitives and utilities for composing priors, relative constraints, and measurement models. Theseus bridges the gap between classical optimization and deep learning, enabling hybrid systems that learn components.
    Downloads: 0 This Week
  • 20
    GPUPixel

    Real-time image and video processing library similar to GPUImage

    GPUPixel is a real-time image and video processing library written in C++11, based on OpenGL/ES. It offers functionalities similar to GPUImage, including built-in beauty filters, enabling efficient processing and rendering of visual effects on images and videos.
    Downloads: 4 This Week
  • 21
    CUDA API Wrappers

    Thin, unified, C++-flavored wrappers for the CUDA APIs

    CUDA API Wrappers is a C++ library providing high-level, modern wrappers for NVIDIA’s CUDA runtime and driver APIs, enhancing usability and efficiency. It is intended for those who would otherwise use these APIs directly, to make working with them more intuitive and consistent, making use of modern C++ language capabilities, programming idioms, and best practices. In a nutshell - making CUDA API work more fun.
    Downloads: 4 This Week
  • 22
    AlphaZero.jl

    A generic, simple and fast implementation of DeepMind's AlphaZero

    Beyond its much publicized success in attaining superhuman level at games such as Chess and Go, DeepMind's AlphaZero algorithm illustrates a more general methodology of combining learning and search to explore large combinatorial spaces effectively. We believe that this methodology can have exciting applications in many different research areas. Because AlphaZero is resource-hungry, successful open-source implementations (such as Leela Zero) are written in low-level languages (such as C++)...
    Downloads: 46 This Week
  • 23
    FlashMLA

    FlashMLA: Efficient Multi-head Latent Attention Kernels

    FlashMLA is a high-performance decoding kernel library designed especially for Multi-Head Latent Attention (MLA) workloads, targeting NVIDIA Hopper GPU architectures. It provides optimized kernels for MLA decoding, including support for variable-length sequences, helping reduce latency and increase throughput in model inference systems using that attention style. The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to efficiently manage memory during decoding. ...
    Downloads: 0 This Week
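
To make the paged KV cache idea concrete, here is a minimal sketch of how a token position could be mapped to a cache block when the block size is 64. It illustrates generic paged-cache indexing and is not FlashMLA's implementation or API; the block table contents and the names `blocks_needed` and `locate` are hypothetical.

```python
BLOCK_SIZE = 64  # block size stated in the description above


def blocks_needed(seq_len, block_size=BLOCK_SIZE):
    """Number of fixed-size blocks required to hold seq_len tokens (ceiling division)."""
    return (seq_len + block_size - 1) // block_size


def locate(position, block_table, block_size=BLOCK_SIZE):
    """Map a token position to (physical block id, offset within that block).

    block_table[i] holds the physical block id backing the i-th logical block
    of one sequence, so blocks need not be contiguous in memory.
    """
    logical_block, offset = divmod(position, block_size)
    return block_table[logical_block], offset


# Hypothetical block table for a single 130-token sequence.
block_table = [7, 2, 9]
```

Because each sequence only claims whole blocks as it grows, variable-length sequences can share one memory pool without reserving space for the maximum length up front.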
  • 24
    Floem

    A native Rust UI library with fine-grained reactivity

    Floem is a cross-platform GUI framework for Rust. It aims to be extremely performant while providing world-class developer ergonomics. Supporting both GPU and CPU rendering, Floem gives you performance that's closest to bare metal. Primitives are also provided to help developers write performant UI code without too much effort.
    Downloads: 0 This Week
  • 25
    PyOpenCL

    OpenCL integration for Python, plus shiny features

    PyOpenCL is a Python wrapper for the OpenCL framework, providing seamless access to parallel computing on CPUs, GPUs, and other accelerators. It enables developers to harness the full power of heterogeneous computing directly from Python, combining Python’s ease of use with the performance benefits of OpenCL. PyOpenCL also includes convenient features for managing memory, compiling kernels, and interfacing with NumPy, making it a preferred choice in scientific computing, data analysis, and...
    Downloads: 5 This Week