Towards self-verifiable mathematical reasoning
A theoretical reconstruction of the Claude Mythos architecture
FlashMLA: Efficient Multi-head Latent Attention Kernels
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Open-weight, large-scale hybrid-attention reasoning model
DeepSeek Coder: Let the Code Write Itself
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
New set of lightweight state-of-the-art, open foundation models
ICLR2024 Spotlight: curation/training code, metadata, distribution
Analyze computation-communication overlap in V3/R1
Safety reasoning models built upon gpt-oss
Diversity-driven optimization and large-model reasoning ability
Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
This repository contains the official implementation of research
The official PyTorch implementation of our paper
Generate embeddings from large-scale graph-structured data
Large language model developed and released by NVIDIA
High-compute ultra-reasoning model surpassing GPT-5
Flagship MoE model for advanced reasoning, coding, and agents
Custom BLEURT model for evaluating text similarity using PyTorch
Efficient MoE reasoning model for coding and math workloads
CLIP ViT-bigG/14: Zero-shot image-text model trained on LAION-2B
Tiny pre-trained IBM model for multivariate time series forecasting