A library for accelerating Transformer models on NVIDIA GPUs
A high-throughput and memory-efficient inference and serving engine
User-friendly AI interface
Open-source AI camera that empowers any camera or CCTV system
Fast inference engine for Transformer models
Lightweight, standalone C++ inference engine for Google's Gemma models
Tensor search for humans
Superduper: Integrate AI models and machine learning workflows
A GPU-accelerated library containing highly optimized building blocks
A real-time inference engine for temporal logical specifications
Lightweight inference library for ONNX files, written in C++
Toolbox of models, callbacks, and datasets for AI/ML researchers
Deep learning inference framework optimized for mobile platforms