C++ library for high-performance inference on NVIDIA GPUs
A NumPy-compatible array library accelerated by CUDA
FlashMLA: Efficient Multi-head Latent Attention Kernels
Thin, unified, C++-flavored wrappers for the CUDA APIs
AWS Libfabric
oneAPI Deep Neural Network Library (oneDNN)
GPU DataFrame Library
Lightning-fast C++/CUDA neural network framework
A library for deep learning end-to-end dialog systems and chatbots
Transformers4Rec is a flexible and efficient library
Build and run Docker containers leveraging NVIDIA GPUs
The C++ parallel algorithms library
GUI for training neural network models for GuitarML Proteus
Facebook AI Research Sequence-to-Sequence Toolkit written in Python
YOLO ROS: Real-Time Object Detection for ROS
Polyhedral compiler for expressing fast and portable data algorithms
A fast open framework for deep learning
OpenCV Pre-built CUDA binaries
CUDA-enabled machine learning library for recurrent neural networks