Integrate cutting-edge LLM technology quickly and easily into your app
FlashMLA: Efficient Multi-head Latent Attention Kernels
C++ library for high performance inference on NVIDIA GPUs
The Compute Library is a set of computer vision and machine learning
Clean and efficient FP8 GEMM kernels with fine-grained scaling
oneAPI Deep Neural Network Library (oneDNN)
Toolkit for making machine learning and data analysis applications
Geometric deep learning extension library for PyTorch
Deep learning inference framework optimized for mobile platforms
Fast and user-friendly runtime for transformer inference
Machine learning, computer vision, statistics and computing for .NET
Machine learning with Gaussian kernels
Calculates similarity between neighborhoods of two vertices in a graph
Computer vision and image processing library for Qt