AirLLM: 70B inference on a single 4GB GPU
LM Studio CLI
TONL (Token-Optimized Notation Language)
A New Axis of Sparsity for Large Language Models
Cache-Augmented Generation: A Simple, Efficient Alternative to RAG
High-speed Large Language Model Serving for Local Deployment
Unifying 3D Mesh Generation with Language Models
Diversity-driven optimization and large-model reasoning ability
Implements a reference architecture for creating information systems