gpu cuda free download

Showing 147 open source projects for "gpu cuda"

1

CUDA.jl

CUDA programming in Julia

High-performance GPU programming in a high-level language. JuliaGPU is a GitHub organization created to unify the many packages for programming GPUs in Julia. With its high-level syntax and flexible compiler, Julia is well-positioned to productively program hardware accelerators like GPUs without sacrificing performance. The latest development version of CUDA.jl requires Julia 1.8 or higher. If you are using an older version of Julia, you need to use a previous version of CUDA.jl...

Downloads: 0 This Week

Last Update: 3 days ago
2

CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library

CV-CUDA is an open-source project that enables building efficient cloud-scale Artificial Intelligence (AI) imaging and computer vision (CV) applications. It uses graphics processing unit (GPU) acceleration to help developers build highly efficient pre- and post-processing pipelines. CV-CUDA originated as a collaborative effort between NVIDIA and ByteDance.

Downloads: 1 This Week

Last Update: 2024-09-05
3

NVIDIA GPU Operator

NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

... software components needed to provision GPUs. These components include the NVIDIA drivers (to enable CUDA), the Kubernetes device plugin for GPUs, the NVIDIA Container Runtime, automatic node labeling, DCGM-based monitoring, and others.

Downloads: 0 This Week

Last Update: 2024-08-08
4

Tiny CUDA Neural Networks

Lightning fast C++/CUDA neural network framework

... memory in its default configuration. It will likely only work on an RTX 3090, an RTX 2080 Ti, or high-end enterprise GPUs. Lower-end cards must reduce the n_neurons parameter or use the CutlassMLP (better compatibility but slower) instead. tiny-cuda-nn comes with a PyTorch extension that allows using the fast MLPs and input encodings from within a Python context. These bindings can be significantly faster than full Python implementations, in particular for the multiresolution hash encoding.

Downloads: 1 This Week

Last Update: 2022-12-01
5

TSNE-CUDA

GPU Accelerated t-SNE for CUDA with Python bindings

This repo is an optimized CUDA version of the FIt-SNE algorithm with associated Python modules. We find that our implementation of t-SNE can be up to 1200x faster than Sklearn, or up to 50x faster than Multicore-TSNE when used with the right GPU. You can install binaries with anaconda for CUDA versions 10.1 and 10.2 using conda install tsnecuda -c conda-forge. Tsnecuda supports CUDA versions 9.0 and later through source installation; check out the wiki for up-to-date installation instructions. Time...

Downloads: 0 This Week

Last Update: 2022-07-14
6

GFPGAN

GFPGAN aims at developing Practical Algorithms

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration. Colab Demo for GFPGAN (another Colab Demo for the original paper model). Online demo: Huggingface (returns only the cropped face). Online demo: Replicate.ai (may need to sign in, returns the whole image). Online demo: Baseten.co (backed by GPU, returns the whole image). We provide a clean version of GFPGAN that does not require CUDA extensions, so it can run on Windows or in CPU mode. GFPGAN aims at developing...

Downloads: 53 This Week

Last Update: 2022-09-16
7

XMRig

RandomX, KawPow, CryptoNight, AstroBWT and GhostRider unified miner

High performance, open-source, cross-platform RandomX, KawPow, CryptoNight, and AstroBWT CPU/GPU miner, RandomX benchmark, and stratum proxy. XMRig is a high-performance, open-source, cross-platform RandomX, KawPow, CryptoNight, and AstroBWT unified CPU/GPU miner and RandomX benchmark. Official binaries are available for Windows, Linux, macOS, and FreeBSD. The preferred way to configure the miner is the JSON config file as it is more flexible and human-friendly. The command-line interface does...

1 Review

Downloads: 17 This Week

Last Update: 2024-08-11
8

CuPy

A NumPy-compatible array library accelerated by CUDA

CuPy is an open source implementation of NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class and many functions on it. CuPy offers GPU accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can even speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases...

Downloads: 2 This Week

Last Update: 2024-08-22
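
As a rough illustration of the "drop-in replacement" idea in the CuPy entry above, the following minimal sketch writes array code against a module alias that is CuPy when a GPU is usable and NumPy otherwise. The alias xp, the helper normalize_rows, and the sample data are hypothetical names chosen for this example, and the NumPy fallback is an assumption made so the sketch runs on machines without CUDA; it is not taken from the CuPy project page.

```python
# Assumption: CuPy (and a working GPU) may be absent, so fall back to NumPy.
try:
    import cupy as xp
    xp.zeros(1)  # probe: raises if no usable CUDA device or runtime
except Exception:
    import numpy as xp


def normalize_rows(a):
    """Scale each row to unit L2 norm using only NumPy-compatible calls."""
    norms = xp.linalg.norm(a, axis=1, keepdims=True)
    return a / norms


# Hypothetical sample data: two rows with norms 5 and 10.
data = xp.asarray([[3.0, 4.0], [6.0, 8.0]])
result = normalize_rows(data)
```

Because normalize_rows only touches the shared NumPy-style API, the same function body runs on either backend; only the import decides where the arrays live.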
9

Real-ESRGAN Video Enhance

Real-ESRGAN video upscaler with resumability

REVE (Real-ESRGAN Video Enhance) is a small, fast application written in Rust that is used for upscaling animated video content. It utilizes Real-ESRGAN-ncnn-Vulkan, FFmpeg and MediaInfo under the hood. REVE employs a segment-based approach to video upscaling, allowing it to simultaneously upscale and encode videos. This results in a notable enhancement in performance and enables resumability. You can download a Windows executable file for Intel/AMD/Nvidia GPUs. This executable file...

Downloads: 7 This Week

Last Update: 2023-03-30
10

InvokeAI

InvokeAI is a leading creative engine for Stable Diffusion models

InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac and Linux machines, and runs on GPU cards with as little as 4 GB of RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies...

2 Reviews

Downloads: 10 This Week

Last Update: 2024-09-05
11

TensorRT

C++ library for high performance inference on NVIDIA GPUs

..., embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.

Downloads: 4 This Week

Last Update: 2024-09-12
12

NVIDIA Container Toolkit

Build and run Docker containers leveraging NVIDIA GPUs

The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs. Make sure you have installed the NVIDIA driver and Docker engine for your Linux distribution. Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver needs to be installed. The NVIDIA Container Toolkit supports different container engines...

Downloads: 2 This Week

Last Update: 2023-04-26
13

ImplicitGlobalGrid.jl

Distributed parallelization of stencil-based GPU and CPU applications

... MPI wrapper (MPI.jl) to perform halo updates close to the hardware limit and leverages CUDA-aware or ROCm-aware MPI for GPU applications. The communication can straightforwardly be hidden behind computation [1, 3] (how this can be done automatically when using ParallelStencil.jl is shown in; a general approach particularly suited for CUDA C applications is explained in.

Downloads: 0 This Week

Last Update: 2024-08-12
14

emgucv

Cross-platform .NET wrapper to the OpenCV image processing library

Emgu CV is a cross-platform .NET wrapper to the OpenCV image processing library, allowing OpenCV functions to be called from .NET-compatible languages. The wrapper can be compiled by Visual Studio and Unity; it can run on Windows, Linux, Mac OS, iOS and Android.

Downloads: 1 This Week

Last Update: 2024-08-13
15

Knet

Koç University deep learning framework

Knet.jl is a deep learning package implemented in Julia, so you should be able to run it on any machine that can run Julia. It has been extensively tested on Linux machines with NVIDIA GPUs and CUDA libraries, and it has been reported to work on OSX and Windows. If you would like to try it on your own computer, please follow the instructions on Installation. If you would like to try working with a GPU and do not have access to one, take a look at Using Amazon AWS or Using Microsoft Azure...

Downloads: 1 This Week

Last Update: 2023-02-26
16

cuDF

GPU DataFrame Library

Built based on the Apache Arrow columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data. cuDF provides a pandas-like API that will be familiar to data engineers & data scientists, so they can use it to easily accelerate their workflows without going into the details of CUDA programming. For additional examples, browse our complete API documentation, or check out our more detailed notebooks. cuDF can be installed...

Downloads: 0 This Week

Last Update: 4 days ago
17

Stable Diffusion in Docker

Run the Stable Diffusion releases in a Docker container

Run the Stable Diffusion releases in a Docker container with txt2img, img2img, depth2img, pix2pix, upscale4x, and inpaint. Run the Stable Diffusion releases on Huggingface in a GPU-accelerated Docker container. By default, the pipeline uses the full model and weights which requires a CUDA capable GPU with 8GB+ of VRAM. It should take a few seconds to create one image. On less powerful GPUs you may need to modify some of the options; see the Examples section for more details. If you lack...

Downloads: 1 This Week

Last Update: 2023-09-22
18

Face Alignment

2D and 3D Face alignment library built using PyTorch

... dlib, BlazeFace, or pre-existing ground truth bounding boxes. While not required, for optimal performance (especially for the detector) it is highly recommended to run the code using a CUDA-enabled GPU. While here the work is presented as a black box, if you want to know more about the intricacies of the method please check the original paper either on arXiv or my webpage.

Downloads: 1 This Week

Last Update: 2023-08-16
19

NNlib.jl

Neural Network primitives with multiple backends

This package provides a library of functions useful for neural networks, such as softmax, sigmoid, batched multiplication, convolutions and pooling. Many of these are used by Flux.jl, which loads this package, but they may be used independently.

Downloads: 0 This Week

Last Update: 2 days ago
20

cuML

RAPIDS Machine Learning Library

cuML is a suite of libraries that implement machine learning algorithms and mathematical primitive functions that share compatible APIs with other RAPIDS projects. cuML enables data scientists, researchers, and software engineers to run traditional tabular ML tasks on GPUs without going into the details of CUDA programming. In most cases, cuML's Python API matches the API from scikit-learn. For large datasets, these GPU-based implementations can complete 10-50x faster than their CPU...

Downloads: 0 This Week

Last Update: 2024-08-08
21

ParallelStencil.jl

Package for writing high-level code for parallel stencil computations

ParallelStencil empowers domain scientists to write architecture-agnostic high-level code for parallel high-performance stencil computations on GPUs and CPUs. Performance similar to CUDA C / HIP can be achieved, which is typically a large improvement over the performance reached when using only CUDA.jl or AMDGPU.jl GPU Array programming. For example, a 2-D shallow ice solver presented at JuliaCon 2020 [1] achieved a nearly 20 times better performance than a corresponding GPU Array programming...

Downloads: 0 This Week

Last Update: 2024-08-16
22

Halide

A language for fast, portable data-parallel computation

Halide is a programming language for fast, portable data-parallel computation. It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and with several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU Compute APIs (CUDA, OpenCL, OpenGL, among others). It isn't a standalone programming language however; rather it is embedded in C++ which means that you write C++ code, building...

Downloads: 0 This Week

Last Update: 2024-07-17
23

MegEngine

Easy-to-use deep learning framework with 3 key features

MegEngine is a fast, scalable and easy-to-use deep learning framework with 3 key features. You can represent quantization/dynamic shape/image pre-processing and even derivation in one model. After training, just put everything into your model and run inference on any platform with ease. Speed and precision problems won't bother you anymore due to the same core inside. In training, GPU memory usage could go down to one-third at the cost of only one additional line, which enables the DTR algorithm...

Downloads: 1 This Week

Last Update: 2024-04-30
24

Simple StyleGan2 for Pytorch

Simplest working implementation of Stylegan2

Simple PyTorch implementation of StyleGAN2 that can be completely trained from the command-line, no coding needed. You will need a machine with a GPU and CUDA installed. You can also specify the location where intermediate results and model checkpoints should be stored. You can increase the network capacity (which defaults to 16) to improve generation results, at the cost of more memory. By default, if the training gets cut off, it will automatically resume from the last checkpointed file. Once...

Downloads: 0 This Week

Last Update: 2024-08-23
25

Faiss

Library for efficient similarity search and clustering of dense vectors

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research. Faiss contains several methods for similarity search. It assumes...

Downloads: 0 This Week

Last Update: 2024-02-29