File Date Author Commit
 .circleci 2021-07-29 BenAAndrew [cd51ad] Add config and badge
 application 2021-08-09 BenAAndrew [2d7193] Add disk usage label
 dataset 2021-07-31 BenAAndrew [3ee50d] Fix create dataset CLI
 synthesis 2021-07-29 BenAAndrew [59774b] Improve comment coverage
 tests 2021-08-09 BenAAndrew [46cd87] Add backup checkpoint slider
 training 2021-08-09 BenAAndrew [2d7193] Add disk usage label
 .coveragerc 2021-07-28 BenAAndrew [42c4ca] Add silence padding slider to synthesis
 .gitignore 2021-07-26 BenAAndrew [1a2486] extend_existing_dataset
 Dockerfile 2021-03-10 BenAAndrew [ccafa8] init
 LICENSE 2021-03-10 BenAAndrew [dbcce7] Initial commit
 README.md 2021-08-01 BenAAndrew [6e0034] Add extra badges
 faqs.md 2021-07-23 BenAAndrew [7d77ca] Update faqs.md
 glow.py 2021-06-04 BenAAndrew [fc6080] Remove unused imports and variables
 install.md 2021-07-23 BenAAndrew [027e3d] Update python versions to 3.6
 install.sh 2021-07-23 BenAAndrew [027e3d] Update python versions to 3.6
 libdeepspeech.so 2021-07-24 BenAAndrew [b89995] Fix compiling issues
 main.py 2021-07-17 BenAAndrew [918c17] Use both Silero and DeepSpeech
 maintenance.md 2021-07-26 BenAAndrew [d1ed25] Remove synonym suggestion
 preview.png 2021-03-10 BenAAndrew [ccafa8] init
 pyproject.toml 2021-07-29 BenAAndrew [59774b] Improve comment coverage
 requirements-cpu.txt 2021-07-28 BenAAndrew [b94fb6] Update requirements-cpu.txt
 requirements.txt 2021-07-28 BenAAndrew [533956] Add multi-line synthesis

Read Me

Voice Cloning App


A Python/PyTorch app for easily synthesising human voices

Preview

Discord Server

Video guide

Voice Sharing Hub

FAQs

System Requirements

  • Windows 10 or Ubuntu 20.04+ operating system
  • 5 GB+ of free disk space
  • NVIDIA GPU with at least 4 GB of memory & driver version 450.36+ (optional)
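
As a rough pre-install check, the disk and GPU requirements above can be sketched in Python. This is a minimal sketch and not part of the app: the function names `has_enough_disk` and `gpu_is_usable` are hypothetical, "GB" is assumed to mean 1024³ bytes, and the driver version is not checked. The GPU check relies on PyTorch being installed and returns False otherwise, which is consistent with the GPU being optional.

```python
import shutil

# Thresholds taken from the requirements list; treating 1 GB as 1024**3 bytes
# is an assumption.
MIN_FREE_BYTES = 5 * 1024**3
MIN_GPU_BYTES = 4 * 1024**3


def has_enough_disk(path=".", min_free=MIN_FREE_BYTES):
    """Return True if the filesystem holding `path` has at least `min_free` bytes free."""
    return shutil.disk_usage(path).free >= min_free


def gpu_is_usable(min_memory=MIN_GPU_BYTES):
    """Return True if PyTorch sees a CUDA GPU with at least `min_memory` bytes.

    Returns False when PyTorch is not installed or no CUDA device is visible.
    The NVIDIA driver version is not checked here.
    """
    try:
        import torch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    # Only the first visible device (index 0) is inspected.
    return torch.cuda.get_device_properties(0).total_memory >= min_memory


if __name__ == "__main__":
    print("Disk OK:", has_enough_disk())
    print("GPU OK:", gpu_is_usable())
```

Run it from the directory where the app will be installed so that the disk check inspects the right filesystem.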

Key features

  • Automatic dataset generation
  • Additional language support
  • Local & remote training
  • Easy train start/stop
  • Tools for extracting Kindle & Audible content as data sources
  • Data importing/exporting
  • Word replacement suggestion
  • Multi-GPU support

Manual Guides

Future Improvements

  • Add support for alternative models
  • Improved batch size estimation
  • AMD GPU support

Other resources

Acknowledgements

This project uses a reworked version of Tacotron2 & Waveglow. All rights belong to NVIDIA and follow the requirements of their BSD-3 licence.

Additionally, the project uses DSAlign, Silero, DeepSpeech & HiFi-GAN.

Thank you to Dr. John Bustard at Queen's University Belfast for his support throughout the project.

Supported by uberduck.ai. Reach out to them for live model hosting.

Also a big thanks to the members of the VocalSynthesis subreddit for their feedback.

Finally, thank you to everyone raising issues and contributing to the project.
