Clean and efficient FP8 GEMM kernels with fine-grained scaling
FlashMLA: Efficient Multi-head Latent Attention Kernels
An experimental version of the DeepSeek model
A Powerful Native Multimodal Model for Image Generation
Runtime extension of Proximus enabling deployment on AMD Ryzen™ AI