AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
A unified framework for scalable computing
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
PyTorch domain library for recommendation systems
Lightweight Python library for adding real-time multi-object tracking to any detector
A high-performance ML model serving framework that offers dynamic batching
Large Language Model Text Generation Inference
Easiest and laziest way to build multi-agent LLM applications
Fast inference engine for Transformer models
Open platform for training, serving, and evaluating language models
Phi-3.5 for Mac: Locally-run Vision and Language Models
Multilingual Automatic Speech Recognition with word-level timestamps
MII makes low-latency and high-throughput inference possible
High-quality, fast, modular reference implementation of SSD in PyTorch
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing
Serve machine learning models within a Docker container
A computer vision framework to create and deploy apps in minutes