TileKernels is a DeepSeek kernel library written in TileLang for high-performance AI and machine-learning workloads. It contains specialized kernels for mixture-of-experts routing, quantization, batched transpose operations, Engram gating, and Manifold HyperConnection components. Each optimized kernel ships alongside a PyTorch reference implementation, so its output can be compared against a known-good baseline. The library is aimed at developers and researchers who work close to model internals and need efficient low-level building blocks. TileKernels also includes testing and benchmarking utilities for evaluating correctness and performance. Its main value is a set of reusable TileLang-based kernels for experimental and production-adjacent deep-learning systems.
Features
- TileLang-based AI kernel library
- Mixture-of-experts routing kernels
- FP8, FP4, and E5M6 quantization support
- Batched transpose kernel implementations
- PyTorch reference implementations
- Testing and benchmarking utilities
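
The pairing of optimized kernels with reference implementations suggests a simple validation pattern: run both on the same inputs and compare the results within a tolerance. The sketch below illustrates that pattern in plain Python for top-k mixture-of-experts routing. It is not the TileKernels API: the names `topk_route_reference`, `allclose`, and `check_against_reference` are hypothetical, and the softmax-then-renormalize routing rule is an assumption chosen for illustration.

```python
import math


def topk_route_reference(logits, k):
    """Hypothetical reference router for a single token.

    Applies a softmax over the expert logits, keeps the k most probable
    experts, and renormalizes the kept weights so they sum to 1.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Highest probability first; ties go to the lower expert index.
    experts = sorted(range(len(probs)), key=lambda i: (-probs[i], i))[:k]
    kept = sum(probs[i] for i in experts)
    weights = [probs[i] / kept for i in experts]
    return experts, weights


def allclose(a, b, rtol=1e-5, atol=1e-8):
    """Element-wise closeness check with relative and absolute tolerance."""
    return len(a) == len(b) and all(
        abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b)
    )


def check_against_reference(candidate, reference, cases, k):
    """Return True if candidate matches reference on every test case."""
    for logits in cases:
        got_idx, got_w = candidate(logits, k)
        ref_idx, ref_w = reference(logits, k)
        if got_idx != ref_idx or not allclose(got_w, ref_w):
            return False
    return True
```

In practice the `candidate` argument would be a wrapper around an optimized kernel and the comparison would run on tensors, but the structure of the check stays the same: identical inputs, exact agreement on the selected experts, and tolerance-based agreement on the weights.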