OpenAI-Compatible Edge-TTS API is a locally hosted text-to-speech server that exposes an OpenAI-compatible API and uses edge-tts, a client for Microsoft Edge's online TTS service, as its backend. The project emulates OpenAI's /v1/audio/speech endpoint, so any client that can talk to the OpenAI TTS API can be redirected to this service with minimal changes. It accepts the same parameters as the OpenAI interface (input text, voice, audio format, and playback speed) and maps popular OpenAI voice names to equivalent Edge voices. The server itself runs locally, but synthesis is performed by Edge's online TTS service, so generating audio costs nothing and the project essentially acts as a proxy that handles formatting and streaming. The server can stream audio over Server-Sent Events (SSE), enabling low-latency playback in chat UIs and other interactive tools. A Docker image is provided for one-command deployment, and environment variables configure the default voice, language, response format, authentication, and logging options.
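As a rough illustration of the endpoint described above, the following Python sketch builds and sends an OpenAI-style speech request using only the standard library. The host, port (5050), and API key are placeholder assumptions, not values confirmed by this description, and the helper names `build_speech_request` and `synthesize_to_file` are hypothetical. The JSON fields follow the OpenAI speech API, which this service mirrors.

```python
import json
import urllib.request

# Assumed values: the host, port, and API key are placeholders,
# not settings taken from the project.
BASE_URL = "http://localhost:5050"
API_KEY = "your_api_key_here"


def build_speech_request(text, voice="alloy", response_format="mp3", speed=1.0,
                         base_url=BASE_URL, api_key=API_KEY):
    """Build a POST request for the OpenAI-style /v1/audio/speech endpoint."""
    payload = {
        "model": "tts-1",  # field that OpenAI clients normally send
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def synthesize_to_file(text, path, **options):
    """Send the request and write the returned audio bytes to `path`."""
    request = build_speech_request(text, **options)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

With a server running at the assumed address, `synthesize_to_file("Hello from Edge TTS", "hello.mp3")` would save the generated audio. Existing OpenAI client libraries should work the same way once their base URL points at the local server.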
Features
- Local OpenAI-compatible /v1/audio/speech endpoint backed by Microsoft Edge TTS
- Support for multiple audio formats including mp3, opus, aac, flac, wav, and pcm
- SSE streaming mode for real-time text-to-speech in chat and assistant interfaces
- Mapping of OpenAI voice names to Edge voices, plus the option to select any Edge TTS voice directly
- Simple Docker-based deployment and optional Python development setup
- Configurable defaults for voice, language, speed, response format, and API key enforcement via environment variables
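
To make the SSE streaming mode more concrete, here is a minimal client-side sketch in Python that reassembles audio from a stream of SSE lines. It assumes the events follow the OpenAI-style shape (`speech.audio.delta` events carrying base64 audio, ended by `speech.audio.done`); that shape is an assumption and is not confirmed by this description. The function name `collect_sse_audio` is hypothetical.

```python
import base64
import json


def collect_sse_audio(lines):
    """Join base64 audio chunks from an iterable of SSE text lines.

    Assumed event shape (OpenAI-style, not confirmed for this project):
      data: {"type": "speech.audio.delta", "audio": "<base64>"}
      data: {"type": "speech.audio.done"}
    """
    audio = bytearray()
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        event = json.loads(line[len("data:"):].strip())
        if event.get("type") == "speech.audio.delta":
            audio.extend(base64.b64decode(event["audio"]))
        elif event.get("type") == "speech.audio.done":
            break  # the stream is complete; ignore anything after it
    return bytes(audio)
```

This version buffers everything for simplicity. An interactive client would instead pass each decoded chunk to its audio player as soon as it arrives, which is where the latency benefit of streaming comes from.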