Run local LLMs on any device; open-source
Simplifies the local serving of AI models from any source
A Pythonic framework to simplify AI service building
Phi-3.5 for Mac: Locally-run Vision and Language Models
Standardized Serverless ML Inference Platform on Kubernetes
Libraries for applying sparsification recipes to neural networks
Data manipulation and transformation for audio signal processing
Tensor search for humans
20+ high-performance LLMs with recipes to pretrain and finetune at scale
A Unified Library for Parameter-Efficient Learning
A library for accelerating Transformer models on NVIDIA GPUs
DoWhy is a Python library for causal inference
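A minimal sketch of DoWhy's model-identify-estimate-refute workflow on its built-in synthetic dataset; the calls follow the documented CausalModel API, and the dataset parameters and estimator choice are illustrative.

```python
import dowhy.datasets
from dowhy import CausalModel

# Synthetic data with a known true causal effect (beta=10).
data = dowhy.datasets.linear_dataset(
    beta=10, num_common_causes=3, num_samples=1000, treatment_is_binary=True
)

# 1. Model: encode the assumed causal graph.
model = CausalModel(
    data=data["df"],
    treatment=data["treatment_name"],
    outcome=data["outcome_name"],
    graph=data["gml_graph"],
)

# 2. Identify the target estimand (e.g. a backdoor adjustment) from the graph.
estimand = model.identify_effect()

# 3. Estimate the effect with a concrete statistical method.
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
print(estimate.value)  # should be close to the true effect of 10

# 4. Refute: test robustness by replacing the treatment with a placebo.
refutation = model.refute_estimate(
    estimand, estimate, method_name="placebo_treatment_refuter"
)
print(refutation)
```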
A high-throughput and memory-efficient inference and serving engine
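This description matches engines such as vLLM; assuming that API, batched offline generation looks like the following (the model name and sampling values are illustrative).

```python
from vllm import LLM, SamplingParams

# Load a small model; the engine manages KV-cache memory and batches
# requests internally for throughput.
llm = LLM(model="facebook/opt-125m")

params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
prompts = ["The capital of France is", "Explain KV caching in one sentence:"]

# generate() runs all prompts as one batched job and returns per-prompt outputs.
for output in llm.generate(prompts, params):
    print(output.prompt, "->", output.outputs[0].text)
```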
LLM training code for MosaicML foundation models
An MLOps framework to package, deploy, monitor and manage models
A unified framework for scalable computing
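If this entry refers to Ray, the core primitive is turning ordinary Python functions into distributed tasks; a minimal sketch assuming Ray's remote-task API:

```python
import ray

ray.init()  # starts a local cluster; pass an address to join an existing one

@ray.remote
def square(x: int) -> int:
    # Runs as a task on any worker in the cluster.
    return x * x

# .remote() returns futures (object refs) immediately; tasks run in parallel.
futures = [square.remote(i) for i in range(8)]
print(ray.get(futures))  # [0, 1, 4, 9, 16, 25, 36, 49]
```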
Everything you need to build state-of-the-art foundation models
Powering Amazon's custom machine learning chips
A deep learning optimization library that makes distributed training easy
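Assuming this entry is DeepSpeed, distributed training is enabled by wrapping an existing PyTorch model with deepspeed.initialize and a config dictionary; a minimal single-process sketch (the config values are illustrative, and real runs are usually started with the deepspeed launcher):

```python
import torch
import deepspeed

model = torch.nn.Linear(128, 2)

ds_config = {
    "train_batch_size": 32,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
    "zero_optimization": {"stage": 2},  # partition optimizer state and gradients
}

# Returns an engine that wraps the model and handles data parallelism, ZeRO, etc.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(32, 128).to(engine.device)
y = torch.randint(0, 2, (32,)).to(engine.device)
loss = torch.nn.functional.cross_entropy(engine(x), y)
engine.backward(loss)  # DeepSpeed manages scaling and gradient accumulation
engine.step()
```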
State-of-the-art diffusion models for image and audio generation
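This description fits the Hugging Face diffusers library; assuming that is the project meant, a text-to-image run is a short pipeline call (the model id and prompt are illustrative).

```python
import torch
from diffusers import DiffusionPipeline

# Download a pretrained diffusion pipeline from the Hub (illustrative model id).
pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.to("cuda")

# One denoising run: text prompt in, PIL image out.
image = pipe("an astronaut riding a horse on the moon").images[0]
image.save("astronaut.png")
```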
The official Python client for the Huggingface Hub
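A minimal sketch of two common client operations, downloading a single file and listing models; the function names follow the huggingface_hub API, while the repo id and filters are illustrative.

```python
from huggingface_hub import HfApi, hf_hub_download

# Download one file from a repo; returns the path in the local cache.
config_path = hf_hub_download(repo_id="bert-base-uncased", filename="config.json")
print(config_path)

# Query the Hub's model index for a given task (illustrative filter).
api = HfApi()
for model in api.list_models(task="text-classification", limit=5):
    print(model.id)
```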
Uncover insights, surface problems, monitor, and fine-tune your LLM
GPU environment management and cluster orchestration
A set of Docker images for training and serving models in TensorFlow
Optimizing inference proxy for LLMs