A Pythonic framework to simplify AI service building
The Triton Inference Server provides an optimized cloud and edge inferencing solution
Superduper: Integrate AI models and machine learning workflows
Sparsity-aware deep learning inference runtime for CPUs
Phi-3.5 for Mac: Locally-run Vision and Language Models
Bring the notion of Model-as-a-Service to life
Easy-to-use Speech Toolkit including Self-Supervised Learning models
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Tensor search for humans
MII makes low-latency and high-throughput inference possible
Neural Network Compression Framework for enhanced OpenVINO inference
Database system for building simpler and faster AI-powered applications
Toolbox of models, callbacks, and datasets for AI/ML researchers
Implementation of model parallel autoregressive transformers on GPUs