Performance meets Productivity
C++ and Python support for the CUDA Quantum programming model
Accelerated libraries for quantum-classical computing built on CUDA-Q
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
CV-CUDA™ is an open-source, GPU-accelerated library
CUDA programming in Julia
The CUDA target for Numba
Thin, unified, C++-flavored wrappers for the CUDA APIs
CUDA Core Compute Libraries
How to optimize algorithms in CUDA
Lightning-fast C++/CUDA neural network framework
Build an automated pipeline that converts CUDA APIs into Numba
A NumPy-compatible array library accelerated by CUDA
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
The best AI Aimbot for Fortnite, Valorant, CS2, R6, COD, Apex, & more
RandomX, KawPow, CryptoNight, AstroBWT and GhostRider unified miner
Solving the Satoshi Puzzle
Solve puzzles. Learn CUDA
ONNX-TensorRT: TensorRT backend for ONNX
CUDA Templates for Linear Algebra Subroutines
A Python framework for accelerated simulation, data generation
Distributed parallelization of stencil-based GPU and CPU applications
Please do not feed the models
Prevent PyTorch's `CUDA error: out of memory` in just 1 line of code
Productive, portable, and performant GPU programming in Python