cuda gpu free download

Showing 147 open source projects for "cuda gpu"

1

CUDA.jl

CUDA programming in Julia

High-performance GPU programming in a high-level language. JuliaGPU is a GitHub organization created to unify the many packages for programming GPUs in Julia. With its high-level syntax and flexible compiler, Julia is well-positioned to productively program hardware accelerators like GPUs without sacrificing performance. The latest development version of CUDA.jl requires Julia 1.8 or higher. If you are using an older version of Julia, you need to use a previous version of CUDA.jl...

Downloads: 0 This Week

Last Update: 2024-09-26
2

NVIDIA GPU Operator

NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

... software components needed to provision GPUs. These components include the NVIDIA drivers (to enable CUDA), the Kubernetes device plugin for GPUs, the NVIDIA Container Runtime, automatic node labeling, DCGM-based monitoring, and others.

Downloads: 2 This Week

Last Update: 2024-09-24
3

CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library

CV-CUDA is an open-source project that enables building efficient cloud-scale Artificial Intelligence (AI) imaging and computer vision (CV) applications. It uses graphics processing unit (GPU) acceleration to help developers build highly efficient pre- and post-processing pipelines. CV-CUDA originated as a collaborative effort between NVIDIA and ByteDance.

Downloads: 0 This Week

Last Update: 2024-09-27
4

Tiny CUDA Neural Networks

Lightning fast C++/CUDA neural network framework

... memory in its default configuration. It will likely only work on an RTX 3090, an RTX 2080 Ti, or high-end enterprise GPUs. Lower-end cards must reduce the n_neurons parameter or use the CutlassMLP (better compatibility but slower) instead. tiny-cuda-nn comes with a PyTorch extension that allows using the fast MLPs and input encodings from within a Python context. These bindings can be significantly faster than full Python implementations; in particular for the multiresolution hash encoding.

Downloads: 1 This Week

Last Update: 2022-12-01
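
The advice above for lower-end cards (reduce n_neurons or switch to CutlassMLP) can be sketched as a small configuration helper. This is only an illustration: the helper name `make_network_config` and its `high_end_gpu` flag are hypothetical, and the key names ("otype", "n_neurons", "n_hidden_layers", and so on) are assumed to follow the project's JSON configuration format, so check them against the tiny-cuda-nn documentation before relying on them.

```python
import json


def make_network_config(high_end_gpu: bool, n_neurons: int = 64) -> dict:
    """Build a network config dict (hypothetical helper, not part of tiny-cuda-nn).

    Assumption: the library reads a JSON-style config whose "otype" key selects
    the MLP implementation. The fully fused MLP is chosen only for high-end
    cards; otherwise the slower but more compatible CutlassMLP is used.
    """
    return {
        "otype": "FullyFusedMLP" if high_end_gpu else "CutlassMLP",
        "activation": "ReLU",
        "output_activation": "None",
        "n_neurons": n_neurons,      # reduce this on cards with less memory
        "n_hidden_layers": 2,
    }


# A lower-end card: fall back to CutlassMLP and a narrower network.
low_end_config = make_network_config(high_end_gpu=False, n_neurons=32)
print(json.dumps(low_end_config, indent=2))
```

The resulting dictionary is plain data, so it can be written to a JSON file or handed to whichever binding of the library is in use.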
5

CuPy

A NumPy-compatible array library accelerated by CUDA

CuPy is an open source implementation of NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class and many functions on it. CuPy offers GPU accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can even speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases...

Downloads: 4 This Week

Last Update: 2024-08-22
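
As a rough illustration of the drop-in claim above, the sketch below runs the same array code on CuPy when a CUDA device is visible and on NumPy otherwise. The helper names `get_array_module` and `normalize_rows` are made up for this example, and it assumes NumPy is installed as the fallback.

```python
def get_array_module():
    """Return CuPy if it imports and sees a CUDA device, otherwise NumPy."""
    try:
        import cupy
        if cupy.cuda.runtime.getDeviceCount() > 0:
            return cupy
    except Exception:
        # CuPy missing, or no usable CUDA driver/device: fall through to NumPy.
        pass
    import numpy
    return numpy


xp = get_array_module()


def normalize_rows(matrix):
    """Scale each row to unit Euclidean length using the selected backend."""
    a = xp.asarray(matrix, dtype=xp.float64)
    norms = xp.linalg.norm(a, axis=1, keepdims=True)
    return a / norms


unit = normalize_rows([[3.0, 4.0], [6.0, 8.0]])
row_lengths = xp.linalg.norm(unit, axis=1)

# float() copies each scalar back to the host when the backend is CuPy.
print(xp.__name__, [round(float(v), 6) for v in row_lengths])
```

Only the module bound to `xp` changes between the two backends; the array expressions themselves are identical, which is what "NumPy-compatible" means in practice for code like this.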
6

XMRig

RandomX, KawPow, CryptoNight, AstroBWT and GhostRider unified miner

XMRig is a high-performance, open-source, cross-platform RandomX, KawPow, CryptoNight, and AstroBWT unified CPU/GPU miner, RandomX benchmark, and stratum proxy. Official binaries are available for Windows, Linux, macOS, and FreeBSD. The preferred way to configure the miner is the JSON config file, as it is more flexible and human-friendly. The command-line interface does...

1 Review

Downloads: 10 This Week

Last Update: 2024-08-11
7

emgucv

Cross-platform .NET wrapper for the OpenCV image processing library

Emgu CV is a cross-platform .NET wrapper for the OpenCV image processing library, allowing OpenCV functions to be called from .NET-compatible languages. The wrapper can be compiled by Visual Studio and Unity, and it can run on Windows, Linux, macOS, iOS, and Android.

Downloads: 8 This Week

Last Update: 2024-08-13
8

InvokeAI

InvokeAI is a leading creative engine for Stable Diffusion models

InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac, and Linux machines, and on GPU cards with as little as 4 GB of RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies...

2 Reviews

Downloads: 18 This Week

Last Update: 3 days ago
9

TensorRT

C++ library for high performance inference on NVIDIA GPUs

..., embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.

Downloads: 8 This Week

Last Update: 2 days ago
10

TSNE-CUDA

GPU Accelerated t-SNE for CUDA with Python bindings

This repo is an optimized CUDA version of the FIt-SNE algorithm with associated Python modules. We find that our implementation of t-SNE can be up to 1200x faster than Sklearn, or up to 50x faster than Multicore-TSNE, when used with the right GPU. You can install binaries with Anaconda for CUDA versions 10.1 and 10.2 using conda install tsnecuda -c conda-forge. Tsnecuda supports CUDA versions 9.0 and later through source installation; check out the wiki for up-to-date installation instructions. Time...

Downloads: 0 This Week

Last Update: 2022-07-14
11

Stable Diffusion in Docker

Run the Stable Diffusion releases in a Docker container

Run the Stable Diffusion releases in a Docker container with txt2img, img2img, depth2img, pix2pix, upscale4x, and inpaint. Run the Stable Diffusion releases on Huggingface in a GPU-accelerated Docker container. By default, the pipeline uses the full model and weights which requires a CUDA capable GPU with 8GB+ of VRAM. It should take a few seconds to create one image. On less powerful GPUs you may need to modify some of the options; see the Examples section for more details. If you lack...

Downloads: 5 This Week

Last Update: 2023-09-22
12

NVIDIA Container Toolkit

Build and run Docker containers leveraging NVIDIA GPUs

The NVIDIA Container Toolkit allows users to build and run GPU-accelerated Docker containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs. Make sure you have installed the NVIDIA driver and Docker engine for your Linux distribution. Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver needs to be installed. The NVIDIA Container Toolkit supports different container engines...

Downloads: 4 This Week

Last Update: 2023-04-26
13

Real-ESRGAN Video Enhance

Real-ESRGAN video upscaler with resumability

REVE (Real-ESRGAN Video Enhance) is a small, fast application written in Rust that is used for upscaling animated video content. It utilizes Real-ESRGAN-ncnn-vulkan, FFmpeg, and MediaInfo under the hood. REVE employs a segment-based approach to video upscaling, allowing it to simultaneously upscale and encode videos. This results in a notable enhancement in performance and enables resumability. You can download a Windows executable file for Intel/AMD/Nvidia GPUs. This executable file...

Downloads: 6 This Week

Last Update: 2023-03-30
14

Faiss

Library for efficient similarity search and clustering of dense vectors

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research. Faiss contains several methods for similarity search. It assumes...

Downloads: 4 This Week

Last Update: 2024-10-04
15

ImplicitGlobalGrid.jl

Distributed parallelization of stencil-based GPU and CPU applications

... MPI wrapper (MPI.jl) to perform halo updates close to the hardware limit and leverages CUDA-aware or ROCm-aware MPI for GPU applications. The communication can straightforwardly be hidden behind computation [1, 3] (how this can be done automatically when using ParallelStencil.jl is shown in; a general approach particularly suited for CUDA C applications is explained in).

Downloads: 0 This Week

Last Update: 2024-09-27
16

GFPGAN

GFPGAN aims at developing Practical Algorithms

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration. Colab Demo for GFPGAN (another Colab Demo is available for the original paper model). Online demo: Huggingface (returns only the cropped face). Online demo: Replicate.ai (may need to sign in, returns the whole image). Online demo: Baseten.co (backed by GPU, returns the whole image). We provide a clean version of GFPGAN that can run without CUDA extensions, so it can run on Windows or in CPU mode...

Downloads: 27 This Week

Last Update: 2022-09-16
17

cuDF

GPU DataFrame Library

Built on the Apache Arrow columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data. cuDF provides a pandas-like API that will be familiar to data engineers and data scientists, so they can use it to easily accelerate their workflows without going into the details of CUDA programming. For additional examples, browse our complete API documentation, or check out our more detailed notebooks. cuDF can be installed...

Downloads: 0 This Week

Last Update: 3 days ago
18

TorchAudio

Data manipulation and transformation for audio signal processing

The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. The benefits of PyTorch can be seen in torchaudio through having all the computations be through PyTorch...

Downloads: 1 This Week

Last Update: 2024-08-22
19

PyTorch Geometric

Geometric deep learning extension library for PyTorch

... of functionality of PyTorch Geometric to other packages, which need to be additionally installed. These packages come with their own CPU and GPU kernel implementations based on C++/CUDA extensions. We do not recommend installation as a root user on your system Python. Please set up an Anaconda/Miniconda environment or create a Docker image. We provide pip wheels for all major OS/PyTorch/CUDA combinations.

Downloads: 1 This Week

Last Update: 2024-09-26
20

NNlib.jl

Neural Network primitives with multiple backends

This package provides a library of functions useful for neural networks, such as softmax, sigmoid, batched multiplication, convolutions and pooling. Many of these are used by Flux.jl, which loads this package, but they may be used independently.

Downloads: 0 This Week

Last Update: 2024-09-18
21

cuML

RAPIDS Machine Learning Library

cuML is a suite of libraries that implement machine learning algorithms and mathematical primitive functions that share compatible APIs with other RAPIDS projects. cuML enables data scientists, researchers, and software engineers to run traditional tabular ML tasks on GPUs without going into the details of CUDA programming. In most cases, cuML's Python API matches the API from scikit-learn. For large datasets, these GPU-based implementations can complete 10-50x faster than their CPU...

Downloads: 0 This Week

Last Update: 3 days ago
22

TIGRE

TIGRE: Tomographic Iterative GPU-based Reconstruction Toolbox

TIGRE is an open-source toolbox for fast and accurate 3D tomographic reconstruction for any geometry. Its focus is on iterative algorithms for improved image quality that have all been optimized to run on GPUs (including multi-GPUs) for improved speed. It combines the higher-level abstraction of MATLAB or Python with the performance of CUDA at a lower level in order to make it both fast and easy to use. TIGRE is free to download and distribute: use it, modify it, add to it, and share it. Our...

Downloads: 1 This Week

Last Update: 1 day ago
23

ParallelStencil.jl

Package for writing high-level code for parallel stencil computations

ParallelStencil empowers domain scientists to write architecture-agnostic high-level code for parallel high-performance stencil computations on GPUs and CPUs. Performance similar to CUDA C / HIP can be achieved, which is typically a large improvement over the performance reached when using only CUDA.jl or AMDGPU.jl GPU Array programming. For example, a 2-D shallow ice solver presented at JuliaCon 2020 [1] achieved a nearly 20 times better performance than a corresponding GPU Array programming...

Downloads: 0 This Week

Last Update: 2024-09-27
24

Knet

Koç University deep learning framework

Knet.jl is a deep learning package implemented in Julia, so you should be able to run it on any machine that can run Julia. It has been extensively tested on Linux machines with NVIDIA GPUs and CUDA libraries, and it has been reported to work on OSX and Windows. If you would like to try it on your own computer, please follow the instructions on Installation. If you would like to try working with a GPU and do not have access to one, take a look at Using Amazon AWS or Using Microsoft Azure...

Downloads: 0 This Week

Last Update: 2023-02-26
25

Halide

A language for fast, portable data-parallel computation

Halide is a programming language for fast, portable data-parallel computation. It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and with several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU compute APIs (CUDA, OpenCL, OpenGL, among others). It isn't a standalone programming language, however; rather, it is embedded in C++, which means that you write C++ code, building...

Downloads: 0 This Week

Last Update: 2024-07-17