MemoryOS is designed to provide a memory operating system
Parallax is a distributed model serving framework
LLM training in simple, raw C/CUDA
AirLLM: 70B inference with a single 4GB GPU
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
On the Structural Pruning of Large Language Models
DepGraph: Towards Any Structural Pruning
The official implementation of RAPTOR
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Driving with Graph Visual Question Answering
A tension reasoning engine over 131 S-class problems
Code and models for the ICML 2024 paper NExT-GPT
Open-weight, large-scale hybrid-attention reasoning model
Serving multiple LoRA-finetuned LLMs as one
An open-source framework for training large multimodal models