[WIP] VoiceSmith makes training text-to-speech models easy
...This is untested since I don't currently own a Mac. An NVIDIA GPU with CUDA support is highly recommended. You can train on CPU otherwise, but it will take days if not weeks. VoiceSmith currently uses a two-stage modified DelightfulTTS and UnivNet pipeline.
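
To check ahead of time whether training would land on a GPU or fall back to CPU, something like the sketch below may help. It assumes a PyTorch-based setup, which this README does not state, and `pick_device` is a hypothetical helper name, not part of VoiceSmith.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a CUDA GPU, else "cpu"."""
    # Assumption: PyTorch is the training backend. If it is not installed,
    # report "cpu" rather than raising an ImportError.
    if importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    # torch.cuda.is_available() is True only when a CUDA-capable GPU and a
    # working driver are both present.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Training would run on: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA card, the usual suspects are a missing driver or a CPU-only PyTorch build.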