tiktoken is a fast BPE tokeniser for use with OpenAI's models
Clean and efficient FP8 GEMM kernels with fine-grained scaling
Repo for SeedVR2 & SeedVR
Large Multimodal Models for Video Understanding and Editing
A SOTA open-source image editing model
MOSS‑TTS Family: open‑source speech and sound generation models
Collection of Gemma 3 variants that are trained for performance
Accurate × Fast × Comprehensive
MiniMax-M2, a model built for Max coding & agentic workflows
Qwen3-VL, the multimodal large language model series by Alibaba Cloud
Implementation of the Surya Foundation Model for Heliophysics
OCR expert VLM powered by Hunyuan's native multimodal architecture
The official PyTorch implementation of Google's Gemma models
Long-form streaming TTS system for multi-speaker dialogue generation
Block Diffusion for Ultra-Fast Speculative Decoding
A Powerful Native Multimodal Model for Image Generation
Instructions on how to use the Realtime API on Microcontrollers
Pretrained time-series foundation model developed by Google Research
Inference script for Oasis 500M
New set of lightweight state-of-the-art, open foundation models
Official implementation of DreamCraft3D
Global weather forecasting model using graph neural networks and JAX
Code for Mesh R-CNN, ICCV 2019
Production-tested AI infrastructure tools
LLM-based reinforcement learning audio editing model