cuda gpu free download

Showing 18 open source projects for "cuda gpu"

1

cuda-oxide

cuda-oxide is an experimental Rust-to-CUDA compiler

cuda-oxide is an experimental NVIDIA Labs project that brings Rust closer to native CUDA GPU development. It works as a Rust-to-CUDA compiler path that lets developers write SIMT GPU kernels in idiomatic Rust instead of using a separate CUDA C++ workflow. The project compiles standard Rust code directly to PTX, avoiding DSLs, source-to-source translation, or foreign-language bindings.

Downloads: 2 This Week

Last Update: 2026-06-10
2

CUDA Core Compute Libraries (CCCL)

CUDA Core Compute Libraries

CCCL, or CUDA Core Compute Libraries, is a unified repository that consolidates several foundational CUDA C++ libraries into a single, cohesive development platform. It brings together Thrust, CUB, and libcudacxx, which collectively provide high-level abstractions, low-level performance primitives, and a CUDA-compatible standard library for GPU programming.

Downloads: 1 This Week

Last Update: 6 days ago
3

CUDA API Wrappers

Thin, unified, C++-flavored wrappers for the CUDA APIs

CUDA API Wrappers is a C++ library providing high-level, modern wrappers for NVIDIA’s CUDA runtime and driver APIs, enhancing usability and efficiency. It is intended for those who would otherwise use these APIs directly, to make working with them more intuitive and consistent, making use of modern C++ language capabilities, programming idioms, and best practices. In a nutshell - making CUDA API work more fun.

Downloads: 2 This Week

Last Update: 2026-02-09
4

CuPy

A NumPy-compatible array library accelerated by CUDA

CuPy is an open source implementation of a NumPy-compatible multi-dimensional array, accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class, and many functions that operate on it. CuPy offers GPU-accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can speed up some operations by more than 100x. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases. ...
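
The "drop-in replacement" idea can be illustrated with a minimal sketch. It assumes either CuPy (with a usable CUDA device) or NumPy is installed; the helper name `normalized_rows` is hypothetical and only NumPy-compatible calls are used, so the same function runs on either backend.

```python
# Sketch only: prefer CuPy, fall back to NumPy when CuPy or a CUDA device
# is unavailable. `xp` is just a local alias for whichever module loaded.
try:
    import cupy as xp
    xp.zeros(1)  # forces CUDA initialization; raises if no device is usable
except Exception:
    import numpy as xp


def normalized_rows(matrix):
    """Scale each row to unit L2 norm using NumPy-compatible calls only."""
    data = xp.asarray(matrix, dtype=xp.float64)
    norms = xp.linalg.norm(data, axis=1, keepdims=True)
    return data / norms


rows = normalized_rows([[3.0, 4.0], [6.0, 8.0]])
# float() copies a single element back to the host when running on CuPy.
print(float(rows[0, 0]), float(rows[0, 1]))
```

Writing code against the shared alias keeps the GPU dependency optional, which is one common way such compatibility is exploited.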

Downloads: 1 This Week

Last Update: 2026-06-01
5

Tiny CUDA Neural Networks

Lightning fast C++/CUDA neural network framework

This is a small, self-contained framework for training and querying neural networks. Most notably, it contains a lightning-fast "fully fused" multi-layer perceptron (technical paper), a versatile multiresolution hash encoding (technical paper), as well as support for various other input encodings, losses, and optimizers. We provide a sample application where an image function (x,y) -> (R,G,B) is learned. The fully fused MLP component of this framework requires a very large amount of shared...

Downloads: 0 This Week

Last Update: 2025-07-08
6

MuJoCo Playground

An open source library for GPU-accelerated robot learning

MuJoCo Playground, developed by Google DeepMind, is a GPU-accelerated suite of simulation environments for robot learning and sim-to-real research, built on top of MuJoCo MJX. It unifies a range of control, locomotion, and manipulation tasks into a consistent and scalable framework optimized for JAX and Warp backends. The project includes classic control benchmarks from dm_control, advanced quadruped and bipedal locomotion systems, and dexterous as well as non-prehensile manipulation setups....

Downloads: 1 This Week

Last Update: 2026-04-28
7

Faiss

Library for efficient similarity search and clustering of dense vectors

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research. Faiss contains several methods for similarity search. It...
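
To make "similarity search" concrete, here is a minimal pure-Python sketch of exact nearest-neighbor search by squared L2 distance. It is not Faiss's API; the function names and sample vectors are hypothetical, and it only shows the brute-force computation that a library of this kind is designed to speed up.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(database, query, k):
    """Return the k (distance, index) pairs closest to the query vector."""
    scored = ((squared_l2(vec, query), idx) for idx, vec in enumerate(database))
    return heapq.nsmallest(k, scored)


# Hypothetical toy data: four 2-D vectors and one query.
database = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
neighbors = search(database, [1.0, 1.0], k=2)
print(neighbors)
```

The scan touches every stored vector for every query, which is why large collections call for indexing structures or GPU implementations.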

Downloads: 5 This Week

Last Update: 2026-06-13
8

TensorRT

C++ library for high performance inference on NVIDIA GPUs

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers,...

Downloads: 31 This Week

Last Update: 2026-06-02
9

PyTorch Geometric

Geometric deep learning extension library for PyTorch

...We have outsourced a lot of functionality of PyTorch Geometric to other packages, which need to be installed separately. These packages come with their own CPU and GPU kernel implementations based on C++/CUDA extensions. We do not recommend installing as the root user into your system Python. Please set up an Anaconda/Miniconda environment or create a Docker image. We provide pip wheels for all major OS/PyTorch/CUDA combinations.

Downloads: 1 This Week

Last Update: 2026-06-05
10

Face Alignment

2D and 3D face alignment library built using PyTorch

...However, users can alternatively use dlib, BlazeFace, or pre-existing ground truth bounding boxes. While not required, for optimal performance (especially for the detector) it is highly recommended to run the code on a CUDA-enabled GPU. The work is presented here as a black box; if you want to know more about the intricacies of the method, please check the original paper, either on arXiv or on my webpage.

Downloads: 0 This Week

Last Update: 2026-04-06
11

Jupyter Docker Stacks

Ready-to-run Docker images containing Jupyter applications

Jupyter Docker Stacks provides a curated set of ready-to-run Docker container images that bundle Jupyter applications with popular data science and computing tools, enabling users to quickly start working in a reproducible environment. These stacks support a range of use cases, from lightweight base notebook images to full featured environments that include scientific computing libraries, machine learning tools, and IDE-like notebook interfaces, all within Docker containers that run...

Downloads: 2 This Week

Last Update: 2 days ago
12

ArrayFire

ArrayFire, a general purpose GPU library

ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if...

Downloads: 0 This Week

Last Update: 2025-09-05
13

Multimodal

TorchMultimodal is a PyTorch library

This project, also known as TorchMultimodal, is a PyTorch library for building, training, and experimenting with multimodal, multi-task models at scale. The library provides modular building blocks such as encoders, fusion modules, loss functions, and transformations that support combining modalities (vision, text, audio, etc.) in unified architectures. It includes a collection of ready model classes—like ALBEF, CLIP, BLIP-2, COCA, FLAVA, MDETR, and Omnivore—that serve as reference...

Downloads: 0 This Week

Last Update: 2026-01-12
14

QtAV

A multimedia framework based on Qt and FFmpeg

QtAV is a cross-platform, high-performance multimedia playback framework based on Qt and FFmpeg. Features include timeline preview, GPU decoding, and more.

5 Reviews

Downloads: 30 This Week

Last Update: 16 hours ago
15

Darknet

Convolutional Neural Networks

...With GPU acceleration via CUDA and OpenCV integration, it achieves high performance in image recognition tasks. Its simplicity, combined with powerful capabilities, has made Darknet one of the most influential projects in the computer vision community.

Downloads: 26 This Week

Last Update: 6 hours ago
16

Flux3D.jl

3D computer vision library in Julia

...This package uses Flux.jl and Zygote.jl as its building blocks for training 3D vision models and for supporting differentiation. It also supports CUDA GPU acceleration through CUDA.jl.

Downloads: 0 This Week

Last Update: 2023-12-07
17

YOLO ROS

YOLO ROS: Real-Time Object Detection for ROS

...Darknet on the CPU is fast (approximately 1.5 seconds on an Intel Core i7-6700HQ CPU @ 2.60GHz × 8), but it is roughly 500 times faster on a GPU. You need an NVIDIA GPU and an installation of CUDA. The CMakeLists.txt file automatically detects whether CUDA is installed. CUDA is a parallel computing platform and application programming interface (API) model created by NVIDIA.
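
The project's actual CMake detection logic is not shown here. As a rough analogue, the following Python sketch shows how a script might check for a CUDA toolchain before choosing a GPU or CPU path. It assumes that finding the `nvcc` compiler on `PATH`, or under the conventional `CUDA_HOME` / `CUDA_PATH` environment variables, is a good enough signal; the helper names `find_nvcc` and `select_backend` are hypothetical.

```python
import os
import shutil


def find_nvcc(env=None, which=shutil.which):
    """Return a path to the nvcc compiler if one can be located, else None."""
    env = os.environ if env is None else env
    found = which("nvcc")
    if found:
        return found
    # Assumption: these variables, when set, point at the toolkit root.
    for var in ("CUDA_HOME", "CUDA_PATH"):
        root = env.get(var)
        if root:
            candidate = os.path.join(root, "bin", "nvcc")
            if os.path.isfile(candidate) or os.path.isfile(candidate + ".exe"):
                return candidate
    return None


def select_backend(nvcc_path):
    """Pick the GPU path only when a CUDA compiler was found."""
    return "gpu" if nvcc_path else "cpu"


print(select_backend(find_nvcc()))
```

Finding the compiler says nothing about whether a working GPU and driver are present, so a real build would still need a runtime check.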

Downloads: 0 This Week

Last Update: 2022-08-12
18

CUJ2K: JPEG2000 encoder on CUDA

A fast JPEG2000 encoder running on CUDA-capable GPUs. Supports lossless and lossy encoding. Available both as a command-line tool with a GUI and as a CUDA/C++ library. Visit http://cuj2k.sourceforge.net for further information.

Downloads: 11 This Week

Last Update: 2013-04-02