FastKoko
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
...It supports multiple languages and voicepacks and allows phoneme based generation for more accurate pronunciation and prosody. The server also offers per-word timestamped captions, which makes it useful for creating subtitles or aligning audio with text. A built in web UI, API documentation, and debug endpoints for monitoring system status help users explore voices, test requests, and integrate the service into larger systems.