Showing 147 open source projects for "cuda gpu"

View related business solutions
  • Small Business HR Management Software Icon
    Small Business HR Management Software

    Get a unified timekeeping, scheduling, payroll, HR and benefits portal with WorkforceHub.

    WorkforceHub is the instantly useful, delightfully simple to use, small business solution for tracking time, scheduling and hiring. It scales as your business grows while delivering the mission-critical features an organization needs. It is tailored to, built for, and priced for small business employers.
    Learn More
  • Intelligent network automation for businesses and organizations Icon
    Intelligent network automation for businesses and organizations

    Network automation for the hybrid multi-cloud era

    BackBox seamlessly integrates with network monitoring and NetOps platforms and automates configuration backups, restores, and change detection. BackBox also provides before and after config diffs for change management, and automated remediation of discovered network security issues.
    Get a Free Trial
  • 1
    CUDA.jl

    CUDA.jl

    CUDA programming in Julia

    High-performance GPU programming in a high-level language. JuliaGPU is a GitHub organization created to unify the many packages for programming GPUs in Julia. With its high-level syntax and flexible compiler, Julia is well-positioned to productively program hardware accelerators like GPUs without sacrificing performance. The latest development version of CUDA.jl requires Julia 1.8 or higher. If you are using an older version of Julia, you need to use a previous version of CUDA.jl...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    NVIDIA GPU Operator

    NVIDIA GPU Operator

    NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

    ... software components needed to provision GPU. These components include the NVIDIA drivers (to enable CUDA), Kubernetes device plugin for GPUs, the NVIDIA Container Runtime, automatic node labeling, DCGM-based monitoring, and others.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    CV-CUDA

    CV-CUDA

    CV-CUDA™ is an open-source, GPU accelerated library

    CV-CUDA is an open-source project that enables building efficient cloud-scale Artificial Intelligence (AI) imaging and computer vision (CV) applications. It uses graphics processing unit (GPU) acceleration to help developers build highly efficient pre- and post-processing pipelines. CV-CUDA originated as a collaborative effort between NVIDIA and ByteDance.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Tiny CUDA Neural Networks

    Tiny CUDA Neural Networks

    Lightning fast C++/CUDA neural network framework

    ... memory in its default configuration. It will likely only work on an RTX 3090, an RTX 2080 Ti, or high-end enterprise GPUs. Lower-end cards must reduce the n_neurons parameter or use the CutlassMLP (better compatibility but slower) instead. tiny-cuda-nn comes with a PyTorch extension that allows using the fast MLPs and input encodings from within a Python context. These bindings can be significantly faster than full Python implementations; in particular for the multiresolution hash encoding.
    Downloads: 1 This Week
    Last Update:
    See Project
  • All-in-One Payroll and HR Platform Icon
    All-in-One Payroll and HR Platform

    For small and mid-sized businesses that need a comprehensive payroll and HR solution with personalized support

    We design our technology to make workforce management easier. APS offers core HR, payroll, benefits administration, attendance, recruiting, employee onboarding, and more.
    Learn More
  • 5
    CuPy

    CuPy

    A NumPy-compatible array library accelerated by CUDA

    CuPy is an open source implementation of NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class and many functions on it. CuPy offers GPU accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can even speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 6
    XMRig

    XMRig

    RandomX, KawPow, CryptoNight, AstroBWT and GhostRider unified miner

    High performance, open-source, cross-platform RandomX, KawPow, CryptoNight, and AstroBWT CPU/GPU miner, RandomX benchmark, and stratum proxy. XMRig is a high-performance, open-source, cross-platform RandomX, KawPow, CryptoNight, and AstroBWT unified CPU/GPU miner and RandomX benchmark. Official binaries are available for Windows, Linux, macOS, and FreeBSD. The preferred way to configure the miner is the JSON config file as it is more flexible and human-friendly. The command-line interface does...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 7
    emgucv

    emgucv

    Cross platform .Net wrapper to the OpenCV image processing library

    Emgu CV is a cross platform .Net wrapper to the OpenCV image processing library. Allowing OpenCV functions to be called from .NET compatible languages. The wrapper can be compiled by Visual Studio and Unity, it can run on Windows, Linux, Mac OS, iOS and Android.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 8
    InvokeAI

    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac and Linux machines, and runs on GPU cards with as little as 4 GB or RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies...
    Downloads: 18 This Week
    Last Update:
    See Project
  • 9
    TensorRT

    TensorRT

    C++ library for high performance inference on NVIDIA GPUs

    ..., embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.
    Downloads: 8 This Week
    Last Update:
    See Project
  • Digital Payments by Deluxe Payment Exchange Icon
    Digital Payments by Deluxe Payment Exchange

    A single integrated payables solution that takes manual payment processes out of the equation, helping reduce risk and cutting costs for your business

    Save time, money and your sanity. Deluxe Payment Exchange+ (DPX+) is our integrated payments solution that streamlines and automates your accounts payable (AP) disbursements. DPX+ ensures secure payments and offers suppliers alternate ways to receive funds, including mailed checks, ACH, virtual credit cards, debit cards, or eCheck payments. By simply integrating with your existing accounting software like QuickBooks®, you’ll implement efficient payment solutions for AP with ease—without costly development fees or untimely delays.
    Learn More
  • 10
    TSNE-CUDA

    TSNE-CUDA

    GPU Accelerated t-SNE for CUDA with Python bindings

    This repo is an optimized CUDA version of FIt-SNE algorithm with associated python modules. We find that our implementation of t-SNE can be up to 1200x faster than Sklearn, or up to 50x faster than Multicore-TSNE when used with the right GPU. You can install binaries with anaconda for CUDA version 10.1 and 10.2 using conda install tsnecuda -c conda-forge. Tsnecuda supports CUDA versions 9.0 and later through source installation, check out the wiki for up to date installation instructions. Time...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Stable Diffusion in Docker

    Stable Diffusion in Docker

    Run the Stable Diffusion releases in a Docker container

    Run the Stable Diffusion releases in a Docker container with txt2img, img2img, depth2img, pix2pix, upscale4x, and inpaint. Run the Stable Diffusion releases on Huggingface in a GPU-accelerated Docker container. By default, the pipeline uses the full model and weights which requires a CUDA capable GPU with 8GB+ of VRAM. It should take a few seconds to create one image. On less powerful GPUs you may need to modify some of the options; see the Examples section for more details. If you lack...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 12
    NVIDIA Container Toolkit

    NVIDIA Container Toolkit

    Build and run Docker containers leveraging NVIDIA GPUs

    The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs. Make sure you have installed the NVIDIA driver and Docker engine for your Linux distribution Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver needs to be installed. The NVIDIA Container Toolkit supports different container engines...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    Real-ESRGAN Video Enhance

    Real-ESRGAN Video Enhance

    Real-ESRGAN video upscaler with resumability

    REVE (Real-ESRGAN Video Enhance) is a small, fast application written in Rust that is used for upscaling animated video content. It utilizes Real-ESRGAN-can-Vulkan, FFmpeg and MediaInfo under the hood. REVE employs a segment-based approach to video upscaling, allowing it to simultaneously upscale and encode videos. This results in a notable enhancement in performance and enables the feature of reusability. You can download Windows executable file for Intel/AMD/Nvidia GPU. This executable file...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 14
    Faiss

    Faiss

    Library for efficient similarity search and clustering dense vectors

    Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research. Faiss contains several methods for similarity search. It assumes...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    ImplicitGlobalGrid.jl

    ImplicitGlobalGrid.jl

    Distributed parallelization of stencil-based GPU and CPU applications

    ... MPI wrapper (MPI.jl) to perform halo updates close to hardware limit and leverages CUDA-aware or ROCm-aware MPI for GPU-applications. The communication can straightforwardly be hidden behind computation [1, 3] (how this can be done automatically when using ParallelStencil.jl is shown in; a general approach particularly suited for CUDA C applications is explained in.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    GFPGAN

    GFPGAN

    GFPGAN aims at developing Practical Algorithms

    GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration. Colab Demo for GFPGAN; (Another Colab Demo for the original paper model) Online demo: Huggingface (return only the cropped face) Online demo: Replicate.ai (may need to sign in, return the whole image). Online demo: Baseten.co (backed by GPU, returns the whole image). We provide a clean version of GFPGAN, which can run without CUDA extensions. So that it can run in Windows or on CPU mode. GFPGAN aims at developing...
    Downloads: 27 This Week
    Last Update:
    See Project
  • 17
    cuDF

    cuDF

    GPU DataFrame Library

    Built based on the Apache Arrow columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data. cuDF provides a pandas-like API that will be familiar to data engineers & data scientists, so they can use it to easily accelerate their workflows without going into the details of CUDA programming. For additional examples, browse our complete API documentation, or check out our more detailed notebooks. cuDF can be installed...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    TorchAudio

    TorchAudio

    Data manipulation and transformation for audio signal processing

    The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. The benefits of PyTorch can be seen in torchaudio through having all the computations be through PyTorch...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    PyTorch Geometric

    PyTorch Geometric

    Geometric deep learning extension library for PyTorch

    ... of functionality of PyTorch Geometric to other packages, which needs to be additionally installed. These packages come with their own CPU and GPU kernel implementations based on C++/CUDA extensions. We do not recommend installation as root user on your system python. Please setup an Anaconda/Miniconda environment or create a Docker image. We provide pip wheels for all major OS/PyTorch/CUDA combinations.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    NNlib.jl

    NNlib.jl

    Neural Network primitives with multiple backends

    This package provides a library of functions useful for neural networks, such as softmax, sigmoid, batched multiplication, convolutions and pooling. Many of these are used by Flux.jl, which loads this package, but they may be used independently.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    cuML

    cuML

    RAPIDS Machine Learning Library

    cuML is a suite of libraries that implement machine learning algorithms and mathematical primitives functions that share compatible APIs with other RAPIDS projects. cuML enables data scientists, researchers, and software engineers to run traditional tabular ML tasks on GPUs without going into the details of CUDA programming. In most cases, cuML's Python API matches the API from scikit-learn. For large datasets, these GPU-based implementations can complete 10-50x faster than their CPU...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    TIGRE

    TIGRE

    TIGRE: Tomographic Iterative GPU-based Reconstruction Toolbox

    TIGRE is an open-source toolbox for fast and accurate 3D tomographic reconstruction for any geometry. Its focus is on iterative algorithms for improved image quality that have all been optimized to run on GPUs (including multi-GPUs) for improved speed. It combines the higher-level abstraction of MATLAB or Python with the performance of CUDA at a lower level in order to make it both fast and easy to use. TIGRE is free to download and distribute: use it, modify it, add to it, and share it. Our...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    ParallelStencil.jl

    ParallelStencil.jl

    Package for writing high-level code for parallel stencil computations

    ParallelStencil empowers domain scientists to write architecture-agnostic high-level code for parallel high-performance stencil computations on GPUs and CPUs. Performance similar to CUDA C / HIP can be achieved, which is typically a large improvement over the performance reached when using only CUDA.jl or AMDGPU.jl GPU Array programming. For example, a 2-D shallow ice solver presented at JuliaCon 2020 [1] achieved a nearly 20 times better performance than a corresponding GPU Array programming...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Knet

    Knet

    Koç University deep learning framework

    Knet.jl is a deep learning package implemented in Julia, so you should be able to run it on any machine that can run Julia. It has been extensively tested on Linux machines with NVIDIA GPUs and CUDA libraries, and it has been reported to work on OSX and Windows. If you would like to try it on your own computer, please follow the instructions on Installation. If you would like to try working with a GPU and do not have access to one, take a look at Using Amazon AWS or Using Microsoft Azure...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25

    Halide

    A language for fast, portable data-parallel computation

    Halide is a programming language for fast, portable data-parallel computation. It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and with several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU Compute APIs (CUDA, OpenCL, OpenGL, among others). It isn't a standalone programming language however; rather it is embedded in C++ which means that you write C++ code, building...
    Downloads: 0 This Week
    Last Update:
    See Project