The Gemini Cookbook is the official repository of examples and guides for Google’s Gemini API. It is organized as a learning path: quick-start tutorials introduce the API basics, and practical examples show more advanced use. The repository covers text, image, video, audio, robotics, and multimodal use cases. It highlights recently introduced features such as the Gemini 2.5 models (Flash and Pro), Gemini’s native image generation, Veo for video generation, robotics-focused reasoning models, Lyria for music generation, and Gemini’s text-to-speech (TTS) models. The Cookbook also includes tutorials on advanced API workflows such as grounding answers with external tools, batch-mode request handling, and real-time multimodal interaction with the Live API. Designed as a hands-on resource, it lets developers try Gemini’s capabilities quickly and serves as a reference when integrating multimodal AI into applications.
Features
- Step-by-step quick-start tutorials for Gemini API basics
- Examples showcasing multimodal workflows (text, image, video, audio)
- Support for Gemini 2.5 models including Flash and Pro variants
- Guides for Veo (video generation), Lyria (music generation), text-to-speech (TTS), and robotics models
- Advanced workflows such as grounding, batch mode, and Live API streaming
- Demo applications referenced from companion repositories