C++ library for high-performance inference on NVIDIA GPUs
FlashMLA: Efficient Multi-head Latent Attention Kernels
A lightweight header-only library for using Keras (TensorFlow) models
Official inference framework for 1-bit LLMs
Library for reading and writing large multi-dimensional arrays
Visual SLAM/odometry package based on NVIDIA-accelerated cuVSLAM
A GPU-accelerated library containing highly optimized building blocks
XLS: Accelerated HW Synthesis
Industrial-grade RPC framework used throughout Baidu
Fast strong hash functions: SipHash/HighwayHash
An FHE compiler for C++
SageMaker-specific extensions to TensorFlow
Parallel Processing for Next-Generation Sequencing (NGS) Analysis
A Compiler Development Toolkit