Robust Speech Recognition via Large-Scale Weak Supervision
Contexts Optical Compression
Port of OpenAI's Whisper model in C/C++
Run local LLMs like Llama, DeepSeek, Kokoro, etc. inside your browser
Foundational Models for State-of-the-Art Speech and Text Translation
StreamSpeech is a seamless model for offline speech recognition
Real-time, voice-interactive digital human
Image generation model with single-stream diffusion transformer
Toolkit for audio, music, and speech generation
Documentation for Google's Gen AI site - including Gemini API & Gemma
End-to-end speech processing toolkit
Workflow and speech recognition app
Framework for building neural networks
Bailing is a voice dialogue robot similar to GPT-4o
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Lightning-fast, on-device TTS, running natively via ONNX
Refer and Ground Anything Anywhere at Any Granularity
Sample code and notebooks for Generative AI on Google Cloud
C++ inference library for multiple SVC/TTS
Language modeling in a sentence representation space
The TypeScript AI agent framework
Provides CTP stock options and Zhongtai Securities XTP
Fast C++ library for linear algebra & scientific computing