AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
20+ high-performance LLMs with recipes to pretrain and finetune at scale
A library for accelerating Transformer models on NVIDIA GPUs
Efficient few-shot learning with Sentence Transformers
PyTorch library of curated Transformer models and their components
An easy-to-use LLM quantization package with user-friendly APIs
Open platform for training, serving, and evaluating language models
Visual Instruction Tuning: Large Language-and-Vision Assistant
Run any Llama 2 model locally with a Gradio UI, on GPU or CPU, from anywhere