Showing 307 open source projects for "cuda"

  • 1
    CUDA.jl

    CUDA programming in Julia

    High-performance GPU programming in a high-level language. JuliaGPU is a GitHub organization created to unify the many packages for programming GPUs in Julia. With its high-level syntax and flexible compiler, Julia is well-positioned to productively program hardware accelerators like GPUs without sacrificing performance. The latest development version of CUDA.jl requires Julia 1.8 or higher. If you are using an older version of Julia, you need to use a previous version of CUDA.jl. This will...
    Downloads: 0 This Week
  • 2
    CV-CUDA

    CV-CUDA™ is an open-source, GPU-accelerated library

    CV-CUDA is an open-source project that enables building efficient cloud-scale Artificial Intelligence (AI) imaging and computer vision (CV) applications. It uses graphics processing unit (GPU) acceleration to help developers build highly efficient pre- and post-processing pipelines. CV-CUDA originated as a collaborative effort between NVIDIA and ByteDance.
    Downloads: 1 This Week
  • 3
    Tiny CUDA Neural Networks

    Lightning fast C++/CUDA neural network framework

    ... memory in its default configuration. It will likely only work on an RTX 3090, an RTX 2080 Ti, or high-end enterprise GPUs. Lower-end cards must reduce the n_neurons parameter or use the CutlassMLP (better compatibility but slower) instead. tiny-cuda-nn comes with a PyTorch extension that allows using the fast MLPs and input encodings from within a Python context. These bindings can be significantly faster than full Python implementations; in particular for the multiresolution hash encoding.
    Downloads: 0 This Week
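
    A hedged sketch of the PyTorch extension mentioned above, assuming the tinycudann bindings are installed and a supported NVIDIA GPU is present; the config keys follow the project's documented JSON format and may differ between versions.

```python
# Hypothetical minimal sketch of the tiny-cuda-nn PyTorch bindings (assumed API).
import torch
import tinycudann as tcnn

# Multiresolution hash encoding feeding a fully fused MLP; values are illustrative.
encoding_cfg = {"otype": "HashGrid", "n_levels": 16, "n_features_per_level": 2,
                "log2_hashmap_size": 19, "base_resolution": 16, "per_level_scale": 2.0}
network_cfg = {"otype": "FullyFusedMLP", "activation": "ReLU",
               "output_activation": "None", "n_neurons": 64, "n_hidden_layers": 2}

model = tcnn.NetworkWithInputEncoding(3, 1, encoding_cfg, network_cfg)

x = torch.rand(1024, 3, device="cuda")   # inputs must live on the GPU
y = model(x)                             # differentiable like any torch.nn.Module
print(y.shape)
```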
  • 4
    GLM

    OpenGL Mathematics (GLM)

    OpenGL Mathematics (GLM) is a header-only C++ mathematics library for graphics software based on the OpenGL Shading Language (GLSL) specifications. GLM provides classes and functions designed and implemented with the same naming conventions and functionality as GLSL, so that anyone who knows GLSL can also use GLM in C++. This project isn't limited to GLSL features. An extension system, based on the GLSL extension conventions, provides extended capabilities: matrix transformations,...
    Downloads: 143 This Week
  • 5
    Downloads: 7 This Week
  • 6
    TSNE-CUDA

    GPU Accelerated t-SNE for CUDA with Python bindings

    This repo is an optimized CUDA implementation of the FIt-SNE algorithm with associated Python modules. We find that our implementation of t-SNE can be up to 1200x faster than scikit-learn, or up to 50x faster than Multicore-TSNE, when used with the right GPU. You can install binaries with Anaconda for CUDA versions 10.1 and 10.2 using conda install tsnecuda -c conda-forge. tsnecuda supports CUDA versions 9.0 and later through source installation; check out the wiki for up-to-date installation instructions. Time...
    Downloads: 0 This Week
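
    A hedged sketch of the scikit-learn-style interface implied above, assuming the tsnecuda package is installed and a CUDA-capable GPU is available; class and argument names may vary by version.

```python
# Hypothetical usage sketch of tsnecuda's sklearn-like TSNE class (assumed API).
import numpy as np
from tsnecuda import TSNE

X = np.random.rand(5000, 128).astype(np.float32)      # toy high-dimensional data
embedding = TSNE(n_components=2, perplexity=30).fit_transform(X)
print(embedding.shape)                                 # expected: (5000, 2)
```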
  • 7
    GFPGAN

    GFPGAN aims at developing Practical Algorithms

    GFPGAN aims at developing practical algorithms for real-world face restoration. Demos are available on Colab (plus another Colab demo for the original paper model), Hugging Face (returns only the cropped face), Replicate.ai (may need to sign in; returns the whole image), and Baseten.co (GPU-backed; returns the whole image). A clean version of GFPGAN is provided that can run without custom CUDA extensions, so it also works on Windows or in CPU-only mode.
    Downloads: 68 This Week
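
    A hedged sketch of GFPGAN's Python inference interface, assuming the gfpgan package, OpenCV, and a downloaded checkpoint; the GFPGANv1.3.pth path, input file name, and argument names here are assumptions and may differ by release.

```python
# Hypothetical face-restoration sketch with the gfpgan package (assumed API).
import cv2
from gfpgan import GFPGANer

restorer = GFPGANer(model_path="GFPGANv1.3.pth",   # assumed local checkpoint path
                    upscale=2, arch="clean", channel_multiplier=2, bg_upsampler=None)

img = cv2.imread("degraded_face.jpg", cv2.IMREAD_COLOR)   # placeholder input image
_, _, restored = restorer.enhance(img, has_aligned=False,
                                  only_center_face=False, paste_back=True)
cv2.imwrite("restored_face.jpg", restored)
```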
  • 8
    XMRig

    RandomX, KawPow, CryptoNight, AstroBWT and GhostRider unified miner

    XMRig is a high-performance, open-source, cross-platform RandomX, KawPow, CryptoNight, and AstroBWT unified CPU/GPU miner, RandomX benchmark, and stratum proxy. Official binaries are available for Windows, Linux, macOS, and FreeBSD. The preferred way to configure the miner is the JSON config file, as it is more flexible and human-friendly. The command-line interface...
    Downloads: 16 This Week
  • 9
    Dream Textures

    Stable Diffusion built-in to Blender

    ... you're looking for. Texture entire models and scenes with depth-to-image. Inpaint to fix up images and convert existing textures into seamless ones automatically. Outpaint to increase the size of an image by extending it in any direction. Perform style transfer and create novel animations with Stable Diffusion as a post-processing step. Dream Textures has been tested with CUDA and Apple Silicon GPUs. Over 4 GB of VRAM is recommended.
    Downloads: 11 This Week
  • 10
    opencvsharp

    OpenCV wrapper for .NET

    OpenCV wrapper for .NET. Some Docker images are provided for using OpenCvSharp with AppEngine Flexible. The native binding (libOpenCvSharpExtern) is already built into the Docker image, so you don't need to worry about it. OpenCvSharp won't work on the Unity and Xamarin platforms; for Unity, please consider using OpenCV for Unity or some other solution. OpenCvSharp does not support CUDA; if you want to use the CUDA features, you need to customize the native bindings yourself. Objects of classes...
    Downloads: 11 This Week
  • 11
    TensorRT

    C++ library for high performance inference on NVIDIA GPUs

    ..., embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.
    Downloads: 8 This Week
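
    A hedged sketch of building a TensorRT engine from an ONNX file with the Python API, assuming a TensorRT 8.x-era installation; the model.onnx and model.plan paths are placeholders.

```python
# Hypothetical engine-build sketch with the TensorRT Python API (TensorRT 8.x assumed).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:                  # placeholder ONNX model
    if not parser.parse(f.read()):
        raise RuntimeError("failed to parse ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)                # optional reduced precision
engine_bytes = builder.build_serialized_network(network, config)

with open("model.plan", "wb") as f:                  # serialized engine for deployment
    f.write(engine_bytes)
```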
  • 12
    CuPy

    A NumPy-compatible array library accelerated by CUDA

    CuPy is an open-source implementation of a NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class, and many functions that operate on it. CuPy offers GPU-accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can speed up some operations by more than 100x. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases...
    Downloads: 1 This Week
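
    A minimal sketch of the drop-in NumPy replacement described above, assuming CuPy is installed against a local CUDA toolkit.

```python
# Minimal CuPy sketch: NumPy-style array math executed on the GPU.
import cupy as cp

x = cp.arange(1_000_000, dtype=cp.float32)   # array allocated on the GPU
y = cp.sqrt(x) + x.mean()                    # elementwise op + reduction via CUDA kernels
host = cp.asnumpy(y)                         # copy the result back to a NumPy array
print(host[:5])
```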
  • 13
    NVIDIA Container Toolkit

    Build and run Docker containers leveraging NVIDIA GPUs

    The NVIDIA Container Toolkit allows users to build and run GPU-accelerated Docker containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs. Make sure you have installed the NVIDIA driver and Docker engine for your Linux distribution. Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver does need to be installed. The NVIDIA Container Toolkit supports different container engines...
    Downloads: 5 This Week
  • 14
    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    .... InvokeAI offers an industry-leading web interface and an interactive command-line interface, and also serves as the foundation for multiple commercial products. This fork is supported across Linux, Windows, and macOS. Linux users can use either an NVIDIA-based card (with CUDA support) or an AMD card (using the ROCm driver). We do not recommend the GTX 1650 or 1660 series video cards; they are unable to run in half-precision mode and do not have sufficient VRAM to render 512x512 images.
    Downloads: 8 This Week
  • 15
    Real-ESRGAN Video Enhance

    Real-ESRGAN video upscaler with resumability

    ... is portable and includes all the binaries and models required. No CUDA or PyTorch environment is needed.
    Downloads: 6 This Week
  • 16
    VLLM

    A high-throughput and memory-efficient inference and serving engine

    vLLM is a fast and easy-to-use library for LLM inference and serving. High-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more.
    Downloads: 2 This Week
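
    A hedged sketch of vLLM's offline inference interface, assuming the vllm package and a GPU with enough memory for the chosen model; the model name here is just an example.

```python
# Hypothetical offline-generation sketch with vLLM (model name is an example).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")                     # any Hugging Face model id
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
outputs = llm.generate(["The fastest way to serve an LLM is"], params)
print(outputs[0].outputs[0].text)
```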
  • 17
    Torch-TensorRT

    PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

    Torch-TensorRT is a compiler for PyTorch/TorchScript, targeting NVIDIA GPUs via NVIDIA’s TensorRT Deep Learning Optimizer and Runtime. Unlike PyTorch’s Just-In-Time (JIT) compiler, Torch-TensorRT is an Ahead-of-Time (AOT) compiler, meaning that before you deploy your TorchScript code, you go through an explicit compile step to convert a standard TorchScript program into a module targeting a TensorRT engine. Torch-TensorRT operates as a PyTorch extension and compiles modules that integrate...
    Downloads: 3 This Week
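
    A hedged sketch of the ahead-of-time compile step described above, assuming torch, torchvision, and torch_tensorrt are installed on a CUDA machine; argument names follow recent releases and may differ between versions.

```python
# Hypothetical AOT compilation sketch with torch_tensorrt (recent API assumed).
import torch
import torch_tensorrt
import torchvision.models as models

model = models.resnet18(weights=None).eval().cuda()      # example PyTorch model

trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224), dtype=torch.float32)],
    enabled_precisions={torch.float32},                  # add torch.half for FP16
)

x = torch.rand(1, 3, 224, 224, device="cuda")
print(trt_model(x).shape)                                # runs through a TensorRT engine
```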
  • 18
    Implicit

    Fast Python collaborative filtering for implicit feedback datasets

    This project provides fast Python implementations of several popular recommendation algorithms for implicit feedback datasets. All models have multi-threaded training routines, using Cython and OpenMP to fit the models in parallel across all available CPU cores. In addition, the ALS and BPR models both have custom CUDA kernels, enabling fitting on compatible GPUs. This library also supports using approximate nearest-neighbour libraries such as Annoy, NMSLIB, and Faiss for speeding up...
    Downloads: 2 This Week
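
    A hedged sketch of fitting an ALS model with implicit, assuming a recent release that expects a user-item CSR matrix (older versions used an item-user layout); passing use_gpu=True would enable the CUDA kernels mentioned above on a supported build.

```python
# Hypothetical ALS recommendation sketch with the implicit library (recent API assumed).
import numpy as np
import scipy.sparse as sparse
import implicit

# Toy user-item interaction matrix: rows are users, columns are items.
user_items = sparse.random(100, 50, density=0.05, format="csr", dtype=np.float32)

model = implicit.als.AlternatingLeastSquares(factors=32, iterations=10, use_gpu=False)
model.fit(user_items)

ids, scores = model.recommend(0, user_items[0], N=5)     # top-5 items for user 0
print(ids, scores)
```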
  • 19
    NVIDIA GPU Operator

    NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

    ... software components needed to provision GPUs. These components include the NVIDIA drivers (to enable CUDA), the Kubernetes device plugin for GPUs, the NVIDIA Container Runtime, automatic node labeling, DCGM-based monitoring, and others.
    Downloads: 2 This Week
  • 20
    emgucv

    Cross platform .Net wrapper to the OpenCV image processing library

    Emgu CV is a cross-platform .Net wrapper to the OpenCV image processing library, allowing OpenCV functions to be called from .NET-compatible languages. The wrapper can be compiled by Visual Studio and Unity, and it can run on Windows, Linux, macOS, iOS, and Android.
    Downloads: 1 This Week
  • 21
    Bullet Physics SDK

    Real-time collision detection and multi-physics simulation for VR

    ... using Deep Reinforcement Learning, or using gradient-based optimization (for example, L-BFGS). In addition, the simulator can be run entirely on CUDA for fast rollouts, in combination with Augmented Random Search. This allows for 1 million simulation steps per second. It is highly recommended to use the PyBullet Python bindings for improved support for robotics, reinforcement learning, and VR. Use pip install pybullet and check out the PyBullet Quickstart Guide.
    Downloads: 2 This Week
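
    A minimal PyBullet sketch following the pip install pybullet route mentioned above; the URDF assets used here ship with the pybullet_data package.

```python
# Minimal PyBullet sketch: headless simulation of a body dropping onto a plane.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                                      # use p.GUI for a visual window
p.setAdditionalSearchPath(pybullet_data.getDataPath())   # bundled URDF assets
p.setGravity(0, 0, -9.8)
p.loadURDF("plane.urdf")
robot = p.loadURDF("r2d2.urdf", basePosition=[0, 0, 1])

for _ in range(240):                                     # one second at the default 240 Hz
    p.stepSimulation()

print(p.getBasePositionAndOrientation(robot))
p.disconnect()
```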
  • 22
    Jittor

    Jittor is a high-performance deep learning framework

    ... learning, etc. The front-end language is Python. Module design and dynamic graph execution are used in the front-end, which is the most popular interface design for deep learning frameworks. The back-end is implemented in high-performance languages such as CUDA and C++. Jittor's ops are similar to NumPy's. Let's try some operations: we create Var a and b via the operation jt.float32 and add them. Printing those variables shows they have the same shape and dtype.
    Downloads: 1 This Week
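
    A minimal sketch of the Var example described above, assuming jittor is installed (it can fall back to CPU when no CUDA device is available).

```python
# Minimal Jittor sketch: create Vars with the jt.float32 op and add them.
import jittor as jt

a = jt.float32([1.0, 2.0, 3.0])
b = jt.float32([4.0, 5.0, 6.0])
c = a + b                         # lazily built op, executed when the value is needed
print(c)                          # prints the computed values
print(c.shape, c.dtype)           # same shape and dtype as a and b
```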
  • 23
    CUTLASS

    CUDA Templates for Linear Algebra Subroutines

    CUTLASS is a collection of CUDA C++ template abstractions for implementing high-performance matrix-multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS and cuDNN. CUTLASS decomposes these "moving parts" into reusable, modular software components abstracted by C++ template classes. These thread-wide, warp-wide, block-wide, and device-wide...
    Downloads: 0 This Week
  • 24
    TensorRT Backend For ONNX

    ONNX-TensorRT: TensorRT backend for ONNX

    ... the Docker containers as instructed in the main TensorRT repository. Note that this project has a dependency on CUDA. By default the build will look in /usr/local/cuda for the CUDA toolkit installation; if your CUDA path is different, override the default path. ONNX models can be converted to serialized TensorRT engines using the onnx2trt executable.
    Downloads: 0 This Week
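
    A hedged sketch of the project's Python backend interface, assuming the onnx and onnx_tensorrt packages are installed; the onnx2trt executable mentioned above is the command-line alternative, and the model path and input shape below are placeholders.

```python
# Hypothetical sketch of running an ONNX model through the TensorRT backend (assumed API).
import numpy as np
import onnx
import onnx_tensorrt.backend as backend

model = onnx.load("model.onnx")                          # placeholder model path
engine = backend.prepare(model, device="CUDA:0")         # builds a TensorRT engine

x = np.random.rand(1, 3, 224, 224).astype(np.float32)    # placeholder input shape
output = engine.run(x)[0]
print(output.shape)
```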
  • 25
    Thrust

    The C++ parallel algorithms library

    Thrust is the C++ parallel algorithms library which inspired the introduction of parallel algorithms to the C++ Standard Library. Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. It builds on top of established parallel programming frameworks (such as CUDA, TBB, and OpenMP). It also provides a number of general-purpose facilities similar to those found in the C++ Standard Library. The NVIDIA C...
    Downloads: 0 This Week