Showing 71 open source projects for "gpu"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Automate contact and company data extraction Icon
    Automate contact and company data extraction

    Build lead generation pipelines that pull emails, phone numbers, and company details from directories, maps, social platforms. Full API access.

    Generate leads at scale without building or maintaining scrapers. Use 10,000+ ready-made tools that handle authentication, pagination, and anti-bot protection. Pull data from business directories, social profiles, and public sources, then export to your CRM or database via API. Schedule recurring extractions, enrich existing datasets, and integrate with your workflows.
    Explore Apify Store
  • 1
    CuPy

    CuPy

    A NumPy-compatible array library accelerated by CUDA

    CuPy is an open source implementation of NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class and many functions on it. CuPy offers GPU accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can even speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases. CuPy is very easy to install through pip or through precompiled binary packages called wheels for recommended environments. ...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 2
    Shumai

    Shumai

    Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun

    ...It can automatically leverage GPU acceleration on Linux (via CUDA) and CPU computation on macOS.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    DeepSeed

    DeepSeed

    Deep learning optimization library making distributed training easy

    ...DeepSpeed delivers extreme-scale model training for everyone, from data scientists training on massive supercomputers to those training on low-end clusters or even on a single GPU. Using current generation of GPU clusters with hundreds of devices, 3D parallelism of DeepSpeed can efficiently train deep learning models with trillions of parameters. With just a single GPU, ZeRO-Offload of DeepSpeed can train models with over 10B parameters, 10x bigger than the state of arts, democratizing multi-billion-parameter model training such that many deep learning scientists can explore bigger and better models. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Video-subtitle-extractor

    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitle (hardsub) from videos

    ...Use local OCR recognition, no need to set up and call any API, and do not need to access online OCR services such as Baidu and Ali to complete text recognition locally. Support GPU acceleration, after GPU acceleration, you can get higher accuracy and faster extraction speed. (CLI version) No need for users to manually set the subtitle area, the project automatically detects the subtitle area through the text detection model. Filter the text in the non-subtitle area and remove the watermark (station logo) text.
    Downloads: 71 This Week
    Last Update:
    See Project
  • G-P - Global EOR Solution Icon
    G-P - Global EOR Solution

    Companies searching for an Employer of Record solution to mitigate risk and manage compliance, taxes, benefits, and payroll anywhere in the world

    With G-P's industry-leading Employer of Record (EOR) and Contractor solutions, you can hire, onboard and manage teams in 180+ countries — quickly and compliantly — without setting up entities.
    Learn More
  • 5
    MuJoCo Playground

    MuJoCo Playground

    An open source library for GPU-accelerated robot learning

    MuJoCo Playground, developed by Google DeepMind, is a GPU-accelerated suite of simulation environments for robot learning and sim-to-real research, built on top of MuJoCo MJX. It unifies a range of control, locomotion, and manipulation tasks into a consistent and scalable framework optimized for JAX and Warp backends. The project includes classic control benchmarks from dm_control, advanced quadruped and bipedal locomotion systems, and dexterous as well as non-prehensile manipulation setups. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    DeepEP

    DeepEP

    DeepEP: an efficient expert-parallel communication library

    DeepEP is a communication library designed specifically to support Mixture-of-Experts (MoE) and expert parallelism (EP) deployments. Its core role is to implement high-throughput, low-latency all-to-all GPU communication kernels, which handle the dispatching of tokens to different experts (or shards) and then combining expert outputs back into the main data flow. Because MoE architectures require routing inputs to different experts, communication overhead can become a bottleneck — DeepEP addresses that by providing optimized GPU kernels and efficient dispatch/combining logic. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    TorchQuantum

    TorchQuantum

    A PyTorch-based framework for Quantum Classical Simulation

    ...Researchers on quantum algorithm design, parameterized quantum circuit training, quantum optimal control, quantum machine learning, and quantum neural networks. Dynamic computation graph, automatic gradient computation, fast GPU support, batch model terrorized processing.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    Meridian

    Meridian

    Meridian is an MMM framework

    ...The framework provides a robust foundation for constructing in-house MMM pipelines capable of handling both national and geo-level data, with built-in support for calibration using experimental data or prior knowledge. Meridian uses the No-U-Turn Sampler (NUTS) for Markov Chain Monte Carlo (MCMC) sampling to produce statistically rigorous results, and it includes GPU acceleration to significantly reduce computation time.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    DGL

    DGL

    Python package built to ease deep learning on graph

    ...We also want to make the combination of graph based modules and tensor based modules (PyTorch or MXNet) as smooth as possible. DGL provides a powerful graph object that can reside on either CPU or GPU. It bundles structural data as well as features for a better control. We provide a variety of functions for computing with graph objects including efficient and customizable message passing primitives for Graph Neural Networks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • AI-First Supply Chain Management Icon
    AI-First Supply Chain Management

    Supply chain managers, executives, and businesses seeking AI-powered solutions to optimize planning, operations, and decision-making across the supply

    Logility is a market-leading provider of AI-first supply chain management solutions engineered to help organizations build sustainable digital supply chains that improve people’s lives and the world we live in. The company’s approach is designed to reimagine supply chain planning by shifting away from traditional “what happened” processes to an AI-driven strategy that combines the power of humans and machines to predict and be ready for what’s coming. Logility’s fully integrated, end-to-end platform helps clients know faster, turn uncertainty into opportunity, and transform the supply chain from a cost center to an engine for growth.
    Learn More
  • 10
    ComfyUI

    ComfyUI

    The most powerful and modular diffusion model GUI, api and backend

    The most powerful and modular diffusion model is GUI and backend. This UI will let you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart-based interface. We are a team dedicated to iterating and improving ComfyUI, supporting the ComfyUI ecosystem with tools like node manager, node registry, cli, automated testing, and public documentation. Open source AI models will win in the long run against closed models and we are only at the beginning. Our core mission...
    Downloads: 263 This Week
    Last Update:
    See Project
  • 11
    DualPipe

    DualPipe

    A bidirectional pipeline parallelism algorithm

    DualPipe is a bidirectional pipeline parallelism algorithm open-sourced by DeepSeek, introduced in their DeepSeek-V3 technical framework. The main goal of DualPipe is to maximize overlap between computation and communication phases during distributed training, thus reducing idle GPU time (i.e. “pipeline bubbles”) and improving cluster efficiency. Traditional pipeline parallelism methods (e.g. 1F1B or staggered pipelining) leave gaps because forward and backward phases can’t fully overlap with communication. DualPipe addresses that by scheduling micro-batches from both ends of the pipeline in a bidirectional fashion—i.e. some micro-batches flow forward while others flow backward—so that computation on one partition can coincide with communication for another.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    KServe

    KServe

    Standardized Serverless ML Inference Platform on Kubernetes

    ...It aims to solve production model serving use cases by providing performant, high abstraction interfaces for common ML frameworks like Tensorflow, XGBoost, ScikitLearn, PyTorch, and ONNX. It encapsulates the complexity of autoscaling, networking, health checking, and server configuration to bring cutting edge serving features like GPU Autoscaling, Scale to Zero, and Canary Rollouts to your ML deployments. It enables a simple, pluggable, and complete story for Production ML Serving including prediction, pre-processing, post-processing and explainability. KServe is being used across various organizations.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    Kornia

    Kornia

    Open Source Differentiable Computer Vision Library

    Kornia is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer vision problems. At its core, the package uses PyTorch as its main backend both for efficiency and to take advantage of the reverse-mode auto-differentiation to define and compute the gradient of complex functions. Inspired by existing packages, this library is composed by a subset of packages containing operators that can be inserted within...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 14
    AWS Deep Learning Containers

    AWS Deep Learning Containers

    A set of Docker images for training and serving models in TensorFlow

    AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet. Deep Learning Containers provide optimized environments with TensorFlow and MXNet, Nvidia CUDA (for GPU instances), and Intel MKL (for CPU instances) libraries and are available in the Amazon Elastic Container Registry (Amazon ECR). The AWS DLCs are used in Amazon SageMaker as the default vehicles for your SageMaker jobs such as training, inference, transforms etc. They've been tested for machine learning workloads on Amazon EC2, Amazon ECS and Amazon EKS services as well. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    TensorRT Node for ComfyUI

    TensorRT Node for ComfyUI

    Enables the best performance on NVIDIA RTX Graphics Cards

    ...The repo typically includes instructions for converting models to TensorRT engines and for wiring those engines into ComfyUI nodes. This is particularly attractive for power users who run many generations or who host ComfyUI on dedicated hardware and want to squeeze out every bit of GPU performance. In short, it’s about taking ComfyUI from “it runs” to “it runs fast” on NVIDIA GPUs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    RLax

    RLax

    Library of JAX-based building blocks for reinforcement learning agents

    ...It supports both on-policy and off-policy learning, as well as value-based, policy-based, and model-based approaches. RLax is fully JIT-compilable with JAX, enabling high-performance execution across CPU, GPU, and TPU backends. The library implements tools for Bellman equations, return distributions, general value functions, and policy optimization in both continuous and discrete action spaces. It integrates seamlessly with DeepMind’s Haiku (for neural network definition) and Optax (for optimization), making it a key component in modular RL pipelines.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Theseus

    Theseus

    A library for differentiable nonlinear optimization

    ...Because solves are differentiable, you can backpropagate through optimization to learn cost weights, feature extractors, or initialization networks end-to-end. The implementation supports batched optimization on GPU, robust losses, damping strategies, and custom factors, making it practical for real-time systems. Helper packages provide geometry primitives and utilities for composing priors, relative constraints, and measurement models. Theseus bridges the gap between classical optimization and deep learning, enabling hybrid systems that learn components.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    FairChem

    FairChem

    FAIR Chemistry's library of machine learning methods for chemistry

    ...Tasks span heterogeneous domains—catalysis (OC20-style), inorganic materials (OMat), molecules (OMol), MOFs (ODAC), and molecular crystals (OMC)—allowing one model family to serve many simulations. The README provides quick paths for pulling models (e.g., via Hugging Face access), then running energy/force predictions on GPU or CPU.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    PyOpenCL

    PyOpenCL

    OpenCL integration for Python, plus shiny features

    PyOpenCL is a Python wrapper for the OpenCL framework, providing seamless access to parallel computing on CPUs, GPUs, and other accelerators. It enables developers to harness the full power of heterogeneous computing directly from Python, combining Python’s ease of use with the performance benefits of OpenCL. PyOpenCL also includes convenient features for managing memory, compiling kernels, and interfacing with NumPy, making it a preferred choice in scientific computing, data analysis, and...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    Numba

    Numba

    NumPy aware dynamic Python compiler using LLVM

    Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. Numba translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library. Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN. You don't need to replace the Python interpreter, run a separate compilation step, or even have a C/C++ compiler installed. Just apply one of the Numba decorators to your...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 21
    BentoML

    BentoML

    Unified Model Serving Framework

    ...Adaptive batching dynamically groups inference requests for optimal performance. Orchestrate distributed inference graph with multiple models via Yatai on Kubernetes. Easily configure CUDA dependencies for running inference with GPU. Automatically generate docker images for production deployment.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 22
    Best-of Machine Learning with Python

    Best-of Machine Learning with Python

    A ranked list of awesome machine learning Python libraries

    This curated list contains 900 awesome open-source projects with a total of 3.3M stars grouped into 34 categories. All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers. If you like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome! General-purpose machine learning and deep learning...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    TensorFlow Model Garden

    TensorFlow Model Garden

    Models and examples built with TensorFlow

    ...A flexible and lightweight library that users can easily use or fork when writing customized training loop code in TensorFlow 2.x. It seamlessly integrates with tf.distribute and supports running on different device types (CPU, GPU, and TPU).
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    PennyLane

    PennyLane

    A cross-platform Python library for differentiable programming

    ...Support for hybrid quantum and classical models, and compatible with existing machine learning libraries. Quantum circuits can be set up to interface with either NumPy, PyTorch, JAX, or TensorFlow, allowing hybrid CPU-GPU-QPU computations. The same quantum circuit model can be run on different devices. Install plugins to run your computational circuits on more devices, including Strawberry Fields, Amazon Braket, Qiskit and IBM Q, Google Cirq, Rigetti Forest, and the Microsoft QDK.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 25
    DocTR

    DocTR

    Library for OCR-related tasks powered by Deep Learning

    DocTR provides an easy and powerful way to extract valuable information from your documents. Seemlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters. User-friendly, 3 lines of code to load a document and extract text with a predictor. State-of-the-art performances on public document...
    Downloads: 3 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next