A NumPy-compatible array library accelerated by CUDA
C++ library for high-performance inference on NVIDIA GPUs
GPU DataFrame Library
FlashMLA: Efficient Multi-head Latent Attention Kernels
Thin, unified, C++-flavored wrappers for the CUDA APIs
oneAPI Deep Neural Network Library (oneDNN)
Lightning-fast C++/CUDA neural network framework
Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation
Build and run Docker containers leveraging NVIDIA GPUs
Facebook AI Research Sequence-to-Sequence Toolkit written in Python
YOLO ROS: Real-Time Object Detection for ROS
A fast open framework for deep learning