New family of code large language models (LLMs)
Controllable & emotion-expressive zero-shot TTS
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Pokee Deep Research Model Open Source Repo
Unified Multimodal Understanding and Generation Models
DeepMind model for tracking arbitrary points across videos & robotics
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Language modeling in a sentence representation space
Advancing Formal Mathematical Reasoning via Reinforcement Learning
Clean and efficient FP8 GEMM kernels with fine-grained scaling
FlashMLA: Efficient Multi-head Latent Attention Kernels
Renderer for the harmony response format to be used with gpt-oss
A Powerful Native Multimodal Model for Image Generation
Implementation of the Surya Foundation Model for Heliophysics
A SOTA open-source image editing model
Safety reasoning models built upon gpt-oss
Diversity-driven optimization and large-model reasoning ability
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Multi-modal large language model designed for audio understanding
Open-source framework for intelligent speech interaction
Large Multimodal Models for Video Understanding and Editing
MiniMax-M2, a model built for Max coding & agentic workflows
LLM-based Reinforcement Learning audio edit model
Open-weight, large-scale hybrid-attention reasoning model