Technical principles related to large models
Unified KV Cache Compression Methods for Auto-Regressive Models
Redundancy-aware KV Cache Compression for Reasoning Models
A tension reasoning engine over 131 S-class problems
Data Lake for Deep Learning. Build, manage, and query datasets
Compress tool outputs, logs, files, and RAG chunks
On the Structural Pruning of Large Language Models
DepGraph: Towards Any Structural Pruning
Advanced techniques for RAG systems
The official repository for ERNIE 4.5 and ERNIEKit