MII makes low-latency and high-throughput inference possible
Libraries for applying sparsification recipes to neural networks
Superduper: Integrate AI models and machine learning workflows
A high-performance ML model serving framework that offers dynamic batching
Phi-3.5 for Mac: Locally-run Vision and Language Models
A set of Docker images for training and serving models in TensorFlow
State-of-the-art diffusion models for image and audio generation
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
Neural Network Compression Framework for enhanced OpenVINO
OpenAI-style API for open large language models
Sparsity-aware deep learning inference runtime for CPUs
Large Language Model Text Generation Inference
Standardized Serverless ML Inference Platform on Kubernetes
Trainable models and NN optimization tools
Probabilistic reasoning and statistical analysis in TensorFlow
Efficient few-shot learning with Sentence Transformers
Multilingual Automatic Speech Recognition with word-level timestamps
Replace OpenAI GPT with another LLM in your app
Easy-to-use Speech Toolkit including Self-Supervised Learning models
PyTorch extensions for fast R&D prototyping and Kaggle farming
Official inference library for Mistral models
Open-source tool designed to enhance the efficiency of workloads
20+ high-performance LLMs with recipes to pretrain, finetune at scale
Powering Amazon custom machine learning chips
A Unified Library for Parameter-Efficient Learning