Solve puzzles. Learn CUDA
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
How to optimize some algorithms in CUDA
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
Fast LLM speculative inference server for consumer hardware
Please do not feed the models
Prevent PyTorch's `CUDA error: out of memory` in just 1 line of code
Our first fully AI-generated deep learning system
Instant neural graphics primitives: lightning-fast NeRF and more
Synchronized Translation for Videos
Serving multiple LoRA-finetuned LLMs as one
A computer vision framework to create and deploy apps in minutes
FAIR's research platform for object detection research
Rust language bindings for TensorFlow
Transformer-related optimization, including BERT and GPT
Fast GPU-accelerated feature extraction software for speech analysis