Tacotron-2 is a TensorFlow implementation of Google’s Tacotron 2 end-to-end text-to-speech architecture, which predicts mel spectrograms from raw text and then feeds them to a neural vocoder such as WaveNet. It ships paper_hparams.py, which mirrors the hyperparameters of the original paper, alongside a tuned hparams.py whose extra improvements often yield better audio quality in practice. The repository is structured as a full training pipeline: dataset preparation, preprocessing into spectrograms, Tacotron training, WaveNet vocoder training (or training-free Griffin-Lim inversion), and final waveform synthesis. It includes directory layouts and logging directories for datasets such as LJSpeech and M-AILABS en_US/en_UK, which makes it easier to adapt to new English corpora. Separate log trees track mel spectrograms, attention plots, evaluation audio, and vocoder outputs, so you can inspect how alignment and audio quality evolve during training.
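
As a rough illustration of what the "mel" in mel spectrogram means, the sketch below computes where the filter centers of a mel filterbank fall on the frequency axis. It is not taken from this repository: the function names are hypothetical, the conversion uses the common HTK-style mel formula, and the defaults (80 channels spanning 125 Hz to 7600 Hz) are the values reported in the Tacotron 2 paper, assumed here rather than read from hparams.py.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels=80, fmin=125.0, fmax=7600.0):
    """Return n_mels center frequencies (Hz), evenly spaced on the mel scale.

    Defaults follow the Tacotron 2 paper; the repository's own hparams
    may use different values.
    """
    lo, hi = hz_to_mel(fmin), hz_to_mel(fmax)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


centers = mel_center_frequencies()
# Neighbouring centers sit close together at low frequencies and spread
# out at high frequencies, roughly matching human pitch resolution.
low_gap = centers[1] - centers[0]
high_gap = centers[-1] - centers[-2]
```

Because the spacing is uniform in mels rather than in Hz, low_gap comes out much smaller than high_gap, which is why a mel spectrogram devotes more of its channels to the lower part of the spectrum.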

Features

  • Full TensorFlow implementation of Tacotron-2 with paper-accurate and enhanced hyperparameter sets
  • End-to-end pipeline from raw audio datasets (e.g., LJSpeech, M-AILABS) through preprocessing, Tacotron training, and vocoder training
  • Support for both WaveNet vocoder and Griffin-Lim inversion for mel-to-waveform synthesis
  • Detailed repository structure with logs for mel-spectrograms, attention plots, evaluation audio, and vocoder outputs
  • Modular training scripts (preprocess.py, train.py, synthesize.py, wavenet_preprocess.py) for flexible experimentation
  • Example configurations that replicate the original paper results and variants that push for improved stability and quality

Categories

Text to Speech

License

MIT License

Additional Project Details

Programming Language

Python

Registered

2025-11-28