ChatGLM2-6B: An Open Bilingual Chat LLM
The official repo of Qwen chat & pretrained large language model
DepGraph: Towards Any Structural Pruning
Data Lake for Deep Learning. Build, manage, and query datasets
Make your agents learn from experience
High-performance inference framework for large language models
[NeurIPS2025 Spotlight] Quantized Attention
A Next-Generation Training Engine Built for Ultra-Large MoE Models
Open-source, local-first memory for any tool-capable LLM agent
Accelerate local LLM inference and finetuning
Get started with building Fullstack Agents using Gemini 2.5 & LangGraph
Find the local LLM that actually runs and performs best
GPT4V-level open-source multi-modal model based on Llama3-8B
A New Axis of Sparsity for Large Language Models
On the Structural Pruning of Large Language Models
The Cradle framework is a first attempt at General Computer Control
Implementation for MatMul-free LM
MobileLLM: Optimizing Sub-billion Parameter Language Models
Diversity-driven optimization and large-model reasoning ability
Chinese and English multimodal conversational language model
Tensor search for humans
Capable of understanding text, audio, vision, video
Open-source, high-performance Mixture-of-Experts large language model
Run Mixtral-8x7B models in Colab or consumer desktops
AIlice is a fully autonomous, general-purpose AI agent