High-performance inference server and API layer for text embedding models
Hypernetworks that adapt LLMs for specific benchmark tasks
Document (PDF, Word, PPTX ...) extraction and parsing API
A playground to generate images from any text prompt using SD
TTS with Kokoro and ONNX Runtime
Use Microsoft Edge's online text-to-speech service from Python
Python library and CLI tool to interface with Google Translate
Contexts Optical Compression
Official inference repo for FLUX.1 models
A robust, efficient, low-latency speech-to-text library
Code for running inference and finetuning with the SAM 3 model
High-Quality Voice Cloning TTS for 600+ Languages
Qwen3-TTS is an open-source series of TTS models
Offline inference engine for art, real-time voice conversations
A TTS that fits in your CPU (and pocket)
Robust Speech Recognition via Large-Scale Weak Supervision
A Family of Open-Source Music Foundation Models
A high-quality rapid TTS voice cloning model
A lightweight text-to-speech model with zero-shot voice cloning
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Tokenizer-Free TTS for Multilingual Speech Generation
Official inference repo for FLUX.2 models
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Google Gen AI Python SDK provides an interface for developers
The Python library for real-time communication