AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
An easy-to-use LLM quantization package with user-friendly APIs
A library for accelerating Transformer models on NVIDIA GPUs
PyTorch library of curated Transformer models and their components
Open platform for training, serving, and evaluating language models
Visual Instruction Tuning: Large Language-and-Vision Assistant
A graphical manager for Ollama that can manage your LLMs
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere