FastKoko - Browse /v0.2.0 at SourceForge.net

Name	Modified	Size	Downloads / Week
README.md	2025-02-07	1.2 kB	0
v0.2.0 source code.tar.gz	2025-02-07	44.4 MB	0
v0.2.0 source code.zip	2025-02-07	44.5 MB	0
Totals: 3 Items		88.9 MB	0

Complete Model Overhaul:
- Upgraded to Kokoro v1.0 model architecture; v0.19 support is deprecated
- Integration with hexgrad/kokoro and hexgrad/misaki packages
- Pre-installed all multi-language support from Misaki:
  - English (en), Japanese (ja), Korean (ko), Chinese (zh), Vietnamese (vi)
  - Note: This will likely be controlled via an environment variable in upcoming versions
- All voice packs included for supported languages, along with the original versions
Enhanced Audio Generation Features:
- Per-word timestamped caption generation
- Phoneme generation and phoneme-based audio generation (510 token cap)
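
Because phoneme-based audio generation is capped at 510 tokens, longer inputs have to be split before they are submitted. The following is a minimal sketch of that splitting, assuming the phonemes are already available as a list of tokens. The helper `chunk_tokens` and the constant `MAX_PHONEME_TOKENS` are hypothetical names used for illustration and are not part of FastKoko's API.

```python
from typing import List, Sequence

# Cap stated in the release notes for phoneme-based audio generation.
MAX_PHONEME_TOKENS = 510


def chunk_tokens(tokens: Sequence[str], limit: int = MAX_PHONEME_TOKENS) -> List[List[str]]:
    """Split a token sequence into consecutive chunks of at most `limit` tokens."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Step through the sequence in strides of `limit`; the last chunk may be shorter.
    return [list(tokens[i:i + limit]) for i in range(0, len(tokens), limit)]
```

A real client would probably prefer to cut at word or sentence boundaries instead of at a fixed offset, so that no chunk ends in the middle of a word.
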
Web UI Improvements:
- Weighted voice mixing
- Text file upload support
- Improved text editor and other user interface changes
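
To illustrate the idea behind weighted voice mixing, here is a minimal sketch that blends voices as a normalized weighted average. It assumes each voice can be represented as a fixed-length list of floats. The release stores voices as .pt files, so the actual implementation presumably works on PyTorch tensors and may combine them differently. The function `mix_voices` and the voice names in the usage note are hypothetical.

```python
from typing import Dict, List


def mix_voices(voices: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Return the weighted average of the named voice vectors.

    Weights are normalized by their sum, so {"a": 2, "b": 1} means
    two thirds of voice "a" and one third of voice "b".
    """
    if not weights:
        raise ValueError("at least one voice weight is required")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Use the first weighted voice to fix the expected vector length.
    dim = len(voices[next(iter(weights))])
    mixed = [0.0] * dim
    for name, weight in weights.items():
        vector = voices[name]
        if len(vector) != dim:
            raise ValueError(f"voice {name!r} has a different length")
        share = weight / total
        for i, value in enumerate(vector):
            mixed[i] += share * value
    return mixed
```

For example, `mix_voices({"voice_a": [1.0, 0.0], "voice_b": [0.0, 1.0]}, {"voice_a": 2, "voice_b": 1})` blends two thirds of the first voice with one third of the second.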

What's Changed
- The Combine Voices endpoint now returns a .pt file; otherwise, voice combinations are generated on the fly
- Bumped PyTorch to 2.6.0 with CUDA 12.4
- Adjusted the Docker workflows and incorporated Docker Bake

Contributors
- @fireblade2534
- @eschmidbauer
- @jteijema
- @dino65-dev
- @Galunid
- @JoshRosen
- @richardr1126

Full Changelog: v0.1.4...v0.2.0

Source: README.md, updated 2025-02-07

FastKoko Files

Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
