A toolkit to optimize ML models for deployment for Keras & TensorFlow
Replace OpenAI GPT with another LLM in your app
The Triton Inference Server provides an optimized cloud and edge inferencing solution
Standardized Serverless ML Inference Platform on Kubernetes
Sparsity-aware deep learning inference runtime for CPUs
A Pythonic framework to simplify AI service building
Easy-to-use Speech Toolkit including Self-Supervised Learning model
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Tensor search for humans
A unified framework for scalable computing
Images to inference with no labeling
A computer vision framework to create and deploy apps in minutes