Supertonic is a fast, on-device text-to-speech system built on ONNX Runtime for speed and portability. It runs entirely locally, so it needs no cloud API, keeps latency low, and keeps data on the device for privacy, even on constrained hardware such as Raspberry Pi boards and e-readers. The core model is compact at around 66 million parameters, yet benchmarks show it can generate speech up to 167× faster than real time on modern consumer hardware and significantly outpace popular cloud TTS APIs in throughput and real-time factor. Supertonic handles real-world text, including numbers, dates, currency symbols, abbreviations, and technical units, without heavy pre-processing or custom text normalization. The repository provides complete reference implementations across many programming ecosystems: Python, Node.js, browser (WebGPU/WASM), Java, C++, C#, Go, Swift, iOS, Rust, and Flutter.
Features
- Ultra-fast on-device TTS with up to 167× real-time generation
- Compact ~66M parameter model optimized for CPU and WebGPU inference
- Full offline operation with no cloud calls or external APIs
- Robust handling of numbers, dates, currency, abbreviations, and technical text
- Reference implementations for Python, Node.js, browser (WebGPU/WASM), Java, C++, C#, Go, Swift, iOS, Rust, and Flutter
- Pre-built demos for Raspberry Pi, e-readers, and browser-based playback
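
To put the "up to 167× real-time" figure in concrete terms, the short sketch below converts between a speedup factor, the real-time factor (RTF), and wall-clock synthesis time. It is plain arithmetic, not part of Supertonic's API: the function names are hypothetical, and 167× is the best-case figure quoted above, so actual numbers will depend on hardware and input.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of audio produced.

    Values below 1.0 mean faster than real time; lower is faster.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def speedup(synthesis_seconds: float, audio_seconds: float) -> float:
    """Speedup over real time, the reciprocal of the RTF."""
    if synthesis_seconds <= 0:
        raise ValueError("synthesis_seconds must be positive")
    return audio_seconds / synthesis_seconds


def synthesis_time_for(audio_seconds: float, speedup_factor: float) -> float:
    """Wall-clock seconds needed to generate audio_seconds of speech."""
    if speedup_factor <= 0:
        raise ValueError("speedup_factor must be positive")
    return audio_seconds / speedup_factor


if __name__ == "__main__":
    # Assumption: the best-case 167x speedup quoted in the description.
    best_case = 167.0
    one_minute = synthesis_time_for(60.0, best_case)
    print(f"60 s of audio at {best_case:.0f}x: {one_minute:.3f} s of compute")
    print(f"Equivalent RTF: {real_time_factor(one_minute, 60.0):.5f}")
```

At the quoted best case, one minute of speech takes roughly 0.36 seconds to generate, which corresponds to an RTF of about 0.006.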