Efficient few-shot learning with Sentence Transformers
An easy-to-use LLM quantization package with user-friendly APIs
An unofficial Python package that returns responses from Google Bard
State-of-the-art diffusion models for image and audio generation
PyTorch extensions for fast R&D prototyping and Kaggle farming
Easy-to-use deep learning framework with 3 key features
A set of Docker images for training and serving models in TensorFlow
A library for accelerating Transformer models on NVIDIA GPUs
Standardized Serverless ML Inference Platform on Kubernetes
MII makes low-latency and high-throughput inference possible
20+ high-performance LLMs with recipes to pretrain and finetune at scale
GPU environment management and cluster orchestration
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Powering Amazon custom machine learning chips
Phi-3.5 for Mac: Locally-run Vision and Language Models
A Unified Library for Parameter-Efficient Learning
Library for serving Transformers models on Amazon SageMaker
Deep learning optimization library: makes distributed training easy
PyTorch library of curated Transformer models and their components
State-of-the-art Parameter-Efficient Fine-Tuning
Simplifies the local serving of AI models from any source
Unified Model Serving Framework
Low-latency REST API for serving text-embeddings
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
LLM training code for MosaicML foundation models
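Several entries above describe quantization libraries. As a minimal, hedged sketch of the core idea (not any specific library's API), here is symmetric per-tensor int8 quantization in pure Python; real packages operate on tensors and add calibration, zero-points, and per-channel scales:

```python
# Symmetric per-tensor quantization: map floats to signed ints with one scale.
def quantize(values, num_bits=8):
    qmax = 2 ** (num_bits - 1) - 1                 # 127 for int8
    scale = max(abs(v) for v in values) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats from the integer codes.
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.02, 1.27]
q, scale = quantize(weights)
restored = dequantize(q, scale)   # close to the original weights
```

The round trip loses at most half a quantization step per value, which is why int8 weight storage is usually near-lossless for well-scaled tensors.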
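The parameter-efficient fine-tuning entries above mostly rely on low-rank adapters (LoRA): rather than updating a full d x d weight matrix, train two small matrices B (d x r) and A (r x d) and add their scaled product to the frozen weight. A pure-Python sketch with illustrative numbers (the matrices and scaling here are assumptions, not taken from any listed library):

```python
# Low-rank adapter update: W_adapted = W + scaling * (B @ A).
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

d, r = 4, 1                                        # hidden size, adapter rank
W = [[1.0 if i == j else 0.0 for j in range(d)]    # frozen base weight
     for i in range(d)]
B = [[1.0] for _ in range(d)]                      # d x r, trainable
A = [[0.1] * d]                                    # r x d, trainable
scaling = 0.5                                      # alpha / r in LoRA terms

delta = matmul(B, A)                               # rank-r update, d x d
W_adapted = [[W[i][j] + scaling * delta[i][j] for j in range(d)]
             for i in range(d)]
```

The adapter trains 2*d*r parameters instead of d*d, which is the memory saving these libraries exploit at larger d and small r.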
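The sentence-embedding and embedding-serving entries above ultimately expose vector similarity. As a small illustration of the operation involved (the vectors here are made up, not real model outputs), cosine similarity in the standard library:

```python
# Cosine similarity between two embedding vectors.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

a = [1.0, 0.0, 1.0]
b = [1.0, 1.0, 0.0]
sim = cosine(a, b)   # 0.5
```

A serving layer adds batching and an HTTP front end, but the scoring step reduces to this dot product over normalized vectors.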