| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2025-02-07 | 1.2 kB | |
| v0.2.0 source code.tar.gz | 2025-02-07 | 44.4 MB | |
| v0.2.0 source code.zip | 2025-02-07 | 44.5 MB | |
| Totals: 3 Items | | 88.9 MB | 0 |
- Complete Model Overhaul:
  - Upgraded to Kokoro v1.0 model architecture; deprecated v0.19 support
  - Integration with hexgrad/kokoro and hexgrad/misaki packages
  - Pre-installed all multi-language support from Misaki:
    - English (en), Japanese (ja), Korean (ko), Chinese (zh), Vietnamese (vi)
    - Note: This will likely be controlled via an environment variable in upcoming versions
  - All voice packs included for supported languages, along with the original versions
- Enhanced Audio Generation Features:
  - Per-word timestamped caption generation
  - Phoneme generation and phoneme-based audio generation (510 token cap)
- Web UI Improvements:
  - Weighted voice mixing
  - Text file upload support
  - Improved text editor and user interface changes
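
The 510-token cap on phoneme-based audio generation means longer phoneme sequences need to be handled in pieces. The notes do not say whether the server splits, truncates, or rejects oversized input, so the following is only a client-side sketch under that uncertainty. The function name `split_tokens` and the constant `MAX_PHONEME_TOKENS` are hypothetical and not part of the project.

```python
from typing import List

# Cap taken from the release notes; the constant name is illustrative.
MAX_PHONEME_TOKENS = 510


def split_tokens(tokens: List[str], limit: int = MAX_PHONEME_TOKENS) -> List[List[str]]:
    """Split a token sequence into consecutive chunks of at most `limit` tokens."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [tokens[i:i + limit] for i in range(0, len(tokens), limit)]


# Example: 1200 placeholder tokens become three chunks.
chunks = split_tokens(["a"] * 1200)
print([len(c) for c in chunks])  # [510, 510, 180]
```

This naive split can cut in the middle of a word or phrase; a real client would probably prefer to break at pauses or sentence boundaries while still respecting the cap.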
What's Changed
- Combine Voices endpoint now returns a .pt file; otherwise, voice combinations are generated on the fly
- Bumped PyTorch to 2.6.0 with CUDA 12.4
- Adjusted Docker workflows and incorporated Docker Bake
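
To make the idea of combining voices with weights concrete, here is a minimal sketch that assumes a mix is a normalized weighted average of per-voice vectors. That assumption, the function `mix_voices`, and the voice names are illustrative only; the notes do not describe how the project computes combinations or what the returned .pt file contains.

```python
from typing import Dict, List


def mix_voices(voices: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Return the weighted average of the named voice vectors.

    Weights are normalized by their sum, so {"a": 3, "b": 1} means 75% / 25%.
    """
    if not weights:
        raise ValueError("at least one voice weight is required")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Use the first weighted voice to fix the expected vector length.
    dim = len(voices[next(iter(weights))])
    mixed = [0.0] * dim
    for name, weight in weights.items():
        vector = voices[name]
        if len(vector) != dim:
            raise ValueError(f"voice {name!r} has a different vector length")
        for i, value in enumerate(vector):
            mixed[i] += (weight / total) * value
    return mixed


# Hypothetical two-dimensional voices, mixed 3:1.
voices = {"voice_a": [1.0, 0.0], "voice_b": [0.0, 1.0]}
mixed = mix_voices(voices, {"voice_a": 3.0, "voice_b": 1.0})
print(mixed)  # [0.75, 0.25]
```

Normalizing by the weight sum keeps the result on the same scale as the inputs regardless of how the weights are written.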
Contributors
- @fireblade2534
- @eschmidbauer
- @jteijema
- @dino65-dev
- @Galunid
- @JoshRosen
- @richardr1126
Full Changelog: v0.1.4...v0.2.0