AIMET is a library that provides advanced quantization and compression
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
A unified framework for scalable computing
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
PyTorch domain library for recommendation systems
Lightweight Python library for adding real-time multi-object tracking
A high-performance ML model serving framework that offers dynamic batching
Large Language Model Text Generation Inference
Easiest and laziest way to build multi-agent LLM applications
Open platform for training, serving, and evaluating language models
Bring the notion of Model-as-a-Service to life
Phi-3.5 for Mac: Locally-run Vision and Language Models
Multilingual Automatic Speech Recognition with word-level timestamps
MII makes low-latency and high-throughput inference possible
High quality, fast, modular reference implementation of SSD in PyTorch
Serve machine learning models within a Docker container
A computer vision framework to create and deploy apps in minutes
Lightweight anchor-free object detection model