Port of Facebook's LLaMA model in C/C++
Open-source large language model family from Tencent Hunyuan
Clean and efficient FP8 GEMM kernels with fine-grained scaling
FlashMLA: Efficient Multi-head Latent Attention Kernels
A Customizable Image-to-Video Model based on HunyuanVideo
Pretrained time-series foundation model developed by Google Research
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Towards self-verifiable mathematical reasoning
Open-weight, large-scale hybrid-attention reasoning model
Let us control diffusion models
Code release for "Masked-attention Mask Transformer"
4-bit Command A+ model for enterprise agents and multilingual tasks
Efficient MoE model for reasoning, coding, and AI agent workflows
High-performance MoE model with MLA, MTP, and multilingual reasoning