A state-of-the-art open visual language model
Accessible large language models via k-bit quantization for PyTorch
ChatGLM3 series: Open Bilingual Chat LLMs
PyTorch library of curated Transformer models and their components
[NeurIPS2025 Spotlight] Quantized Attention
Low-code framework for building custom LLMs and neural networks
High-performance Inference and Deployment Toolkit for LLMs and VLMs
Capable of understanding text, audio, vision, and video
Visual Instruction Tuning: Large Language-and-Vision Assistant
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere