An MLOps framework to package, deploy, monitor and manage models
Simplifies the local serving of AI models from any source
LLM training code for MosaicML foundation models
Optimizing inference proxy for LLMs
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
Neural Network Compression Framework for enhanced OpenVINO
OpenAI-style API for open large language models
Operating LLMs in production
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Probabilistic reasoning and statistical analysis in TensorFlow
Libraries for applying sparsification recipes to neural networks
Single-cell analysis in Python
Training and deploying machine learning models on Amazon SageMaker
Sparsity-aware deep learning inference runtime for CPUs
Large Language Model Text Generation Inference
Easiest and laziest way to build multi-agent LLM applications
Efficient few-shot learning with Sentence Transformers
Superduper: Integrate AI models and machine learning workflows
MII makes low-latency and high-throughput inference possible
Official inference library for Mistral models
20+ high-performance LLMs with recipes to pretrain, finetune at scale
Adversarial Robustness Toolbox (ART) - Python Library for ML security
A Unified Library for Parameter-Efficient Learning
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
Replace OpenAI GPT with another LLM in your app