CV-CUDA™ is an open-source, GPU-accelerated library
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
How to optimize some algorithms in CUDA
Lightning fast C++/CUDA neural network framework
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
ONNX-TensorRT: TensorRT backend for ONNX
Solve puzzles. Learn CUDA
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference
Clean and efficient FP8 GEMM kernels with fine-grained scaling
Please do not feed the models
Our first fully AI-generated deep learning system
Prevent PyTorch's `CUDA error: out of memory` in just 1 line of code
Package and deploy machine learning models using Docker containers
An RWKV management and startup tool, fully automated, only 8 MB
Cross-platform .NET wrapper for the OpenCV image processing library
Self-host the powerful Chatterbox TTS model
A high-throughput and memory-efficient inference and serving engine
Geometric deep learning extension library for PyTorch
Fast LLM speculative inference server for consumer hardware
Apple Silicon (MLX) port of Karpathy's autoresearch
Stable Diffusion built into Blender
Fast Python collaborative filtering for implicit feedback datasets
Open Source Computer Vision Library
Jittor is a high-performance deep learning framework
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML