A library for accelerating Transformer models on NVIDIA GPUs
Data manipulation and transformation for audio signal processing
GPU environment management and cluster orchestration
MII makes low-latency and high-throughput inference possible
Library for OCR-related tasks powered by Deep Learning
Phi-3.5 for Mac: Locally-run Vision and Language Models
A Unified Library for Parameter-Efficient Learning
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
State-of-the-art Parameter-Efficient Fine-Tuning
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
Simplifies the local serving of AI models from any source
Uncover insights, surface problems, monitor, and fine-tune your LLM
Easy-to-use Speech Toolkit including Self-Supervised Learning model
Unified Model Serving Framework
Low-latency REST API for serving text embeddings
Standardized Serverless ML Inference Platform on Kubernetes
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
Replace OpenAI GPT with another LLM in your app
LLM training code for MosaicML foundation models
An MLOps framework to package, deploy, monitor and manage models
A lightweight vision library for performing large-scale object detection
Library for serving Transformers models on Amazon SageMaker
A set of Docker images for training and serving models in TensorFlow
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Images to inference with no labeling