Solve puzzles. Learn CUDA
Performance meets Productivity
The CUDA target for Numba
How to optimize algorithms in CUDA
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
A NumPy-compatible array library accelerated by CUDA
Development repository for the Triton language and compiler
A Python framework for accelerated simulation, data generation
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
An interactive NVIDIA-GPU process viewer and beyond
Prevent PyTorch's `CUDA error: out of memory` in just 1 line of code
An opinionated CLI to transcribe audio files with Whisper on-device
Rembg is a tool to remove image backgrounds
Fast and memory-efficient exact attention
GPU-accelerated decision optimization
Our first fully AI-generated deep learning system
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML
Data manipulation and transformation for audio signal processing
An open source library for GPU-accelerated robot learning
Simplest working implementation of StyleGAN2
Generate audiobooks from e-books
2D and 3D face alignment library built using PyTorch
Geometric deep learning extension library for PyTorch
A set of Docker images for training and serving models in TensorFlow
Low-latency REST API for serving text embeddings