Claude Code image, a one-stop open-source transit service
Large Multimodal Models for Video Understanding and Editing
Repo of Qwen2-Audio chat & pretrained large audio language model
New set of lightweight, state-of-the-art open foundation models
Pretrained time-series foundation model developed by Google Research
Inference script for Oasis 500M
Official implementation of DreamCraft3D
Collection of Gemma 3 variants that are trained for performance
The official PyTorch implementation of Google's Gemma models
A 0.1B Omni model trained from scratch
Block Diffusion for Ultra-Fast Speculative Decoding
A Powerful Native Multimodal Model for Image Generation
Instructions on how to use the Realtime API on Microcontrollers
Long-form streaming TTS system for multi-speaker dialogue generation
Implementation of the Surya Foundation Model for Heliophysics
Production-tested AI infrastructure tools
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
ChatGPT interface with better UI
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Open-source, high-performance Mixture-of-Experts large language model
The ChatGPT Retrieval Plugin lets you easily find personal documents
Open-source large language model by Alibaba
Release for Improved Denoising Diffusion Probabilistic Models
Powerful open-source image generation model