Fast, Sharp & Reliable Agentic Intelligence
Towards self-verifiable mathematical reasoning
Z80-μLM is a 2-bit quantized language model
A theoretical reconstruction of the Claude Mythos architecture
FlashMLA: Efficient Multi-head Latent Attention Kernels
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Open-weight, large-scale hybrid-attention reasoning model
DeepSeek Coder: Let the Code Write Itself
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Ling is a MoE LLM provided and open-sourced by InclusionAI
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
New set of lightweight state-of-the-art, open foundation models
ICLR 2024 Spotlight: curation/training code, metadata, and distribution
Analyze computation-communication overlap in V3/R1
Safety reasoning models built upon gpt-oss
Diversity-driven optimization and large-model reasoning ability
Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
This repository contains the official implementation of our research
The official PyTorch implementation of our paper
Generate embeddings from large-scale graph-structured data
Large language model developed and released by NVIDIA
High-compute ultra-reasoning model surpassing GPT-5
Flagship MoE model for advanced reasoning, coding, and agents
Custom BLEURT model for evaluating text similarity using PyTorch