Technical principles related to large models
Unified KV Cache Compression Methods for Auto-Regressive Models
Redundancy-aware KV Cache Compression for Reasoning Models
Data Lake for Deep Learning. Build, manage, and query datasets
A tension reasoning engine over 131 S-class problems
Compress tool outputs, logs, files, and RAG chunks
DepGraph: Towards Any Structural Pruning
On the Structural Pruning of Large Language Models
Advanced techniques for RAG systems
The official repository for ERNIE 4.5 and ERNIEKit