Solve puzzles. Learn CUDA
Performance meets Productivity
The CUDA target for Numba
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
How to optimize some algorithms in CUDA
A NumPy-compatible array library accelerated by CUDA
Development repository for the Triton language and compiler
A Python framework for accelerated simulation, data generation
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun
An interactive NVIDIA-GPU process viewer and beyond
Prevent PyTorch's `CUDA error: out of memory` in just 1 line of code
An opinionated CLI to transcribe audio files w/ Whisper on-device
Rembg is a tool to remove image backgrounds
Fast and memory-efficient exact attention
GPU-accelerated decision optimization
Our first fully AI-generated deep learning system
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML
Generate audiobooks from e-books
High-Resolution Image Synthesis with Latent Diffusion Models
An open source library for GPU-accelerated robot learning
Simplest working implementation of StyleGAN2
2D and 3D face alignment library built using PyTorch
Package and deploy machine learning models using Docker containers
Image-to-Image Translation in PyTorch