We introduce a language modeling approach for text-to-speech (TTS) synthesis. Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and treat TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech, which is hundreds of times more than existing systems use. VALL-E exhibits emergent in-context learning capabilities and can synthesize high-quality personalized speech using only a 3-second enrolled recording of an unseen speaker as an acoustic prompt. Experimental results show that VALL-E significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity. In addition, we find that VALL-E can preserve the speaker's emotion and the acoustic environment of the acoustic prompt in synthesis.

Features

  • The pipeline of VALL-E is phoneme → discrete code → waveform
  • VALL-E generates the discrete audio codec codes based on phoneme and acoustic code prompts
  • VALL-E directly enables various speech synthesis applications, such as zero-shot TTS, speech editing, and content creation combined with other generative AI models like GPT-3
  • VALL-E can synthesize personalized speech while maintaining the acoustic environment of the speaker prompt
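
The phoneme → discrete code → waveform pipeline listed above can be illustrated with a toy sketch. This is not VALL-E's implementation: every function name, the codebook size, and the fixed frames-per-phoneme rule are assumptions made up for illustration, and the "model" is a seeded random generator rather than a trained network. It only shows how the three stages connect and where the acoustic prompt enters.

```python
import random
from typing import List

CODEBOOK_SIZE = 1024    # assumed codebook size, for illustration only
FRAMES_PER_PHONEME = 3  # assumed fixed duration; a trained model decides length itself


def text_to_phonemes(text: str) -> List[str]:
    """Hypothetical stand-in for a phonemizer: one 'phoneme' per letter."""
    return [ch for ch in text.lower() if ch.isalpha()]


def generate_codes(phonemes: List[str], prompt_codes: List[int], seed: int = 0) -> List[int]:
    """Toy stand-in for the codec language model.

    Continues the acoustic prompt's code sequence one frame at a time,
    conditioned on the current phoneme and the previous code.
    """
    rng = random.Random(seed)
    codes = list(prompt_codes)  # the prompt is the start of the sequence
    for i in range(len(phonemes) * FRAMES_PER_PHONEME):
        phoneme = phonemes[i // FRAMES_PER_PHONEME]
        # A trained model would predict a distribution over codes here;
        # this sketch just mixes the context into a pseudo-random draw.
        context = ord(phoneme) + (codes[-1] if codes else 0)
        codes.append((context + rng.randrange(CODEBOOK_SIZE)) % CODEBOOK_SIZE)
    return codes[len(prompt_codes):]  # return only the newly generated frames


def decode_to_waveform(codes: List[int]) -> List[float]:
    """Toy stand-in for the codec decoder: one sample in [-1, 1) per code."""
    return [(c / CODEBOOK_SIZE) * 2.0 - 1.0 for c in codes]


def synthesize(text: str, prompt_codes: List[int], seed: int = 0) -> List[float]:
    """Run the three stages in order: phonemes, discrete codes, waveform."""
    phonemes = text_to_phonemes(text)
    codes = generate_codes(phonemes, prompt_codes, seed)
    return decode_to_waveform(codes)
```

Calling `synthesize("Hi there", [1, 2, 3])` runs all three stages and returns one toy sample per generated code frame. In the real system the prompt codes would come from encoding a short recording of the target speaker with the audio codec, which is how the speaker's voice and acoustic environment carry over into the output.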

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Large Language Models (LLM), Python Generative AI, Python AI Models

Registered

2023-03-22