VibeVoice-1.5B is Microsoft’s frontier open-source text-to-speech (TTS) model for generating expressive, long-form, multi-speaker conversational audio such as podcasts. It targets the areas where traditional TTS systems tend to fall short: scalability, speaker consistency, and natural turn-taking. It can synthesize up to 90 minutes of continuous speech with as many as four distinct speakers. A key innovation is its pair of continuous speech tokenizers, one acoustic and one semantic, which operate at an ultra-low frame rate of 7.5 Hz. This keeps long sequences short enough to process efficiently while preserving audio fidelity. The model combines a Qwen2.5-based large language model, which captures conversational context, with a diffusion head that produces realistic acoustic detail. Training used curriculum learning with progressively longer sequences, up to 65K tokens, so the model can handle very long dialogues. As safety mechanisms, all generated audio carries an audible disclaimer and an imperceptible watermark to mitigate misuse risks.

Features

  • Open-source TTS model for expressive, long-form conversational speech
  • Generates up to 90 minutes of audio with up to 4 distinct speakers
  • Continuous acoustic & semantic tokenizers at 7.5 Hz for fidelity and efficiency
  • Integrates Qwen2.5-1.5B LLM with a diffusion head for context and realism
  • Curriculum-trained on sequences up to 65K tokens for long dialogues
  • Embedded audible disclaimer and imperceptible watermark in all outputs
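
The figures above (7.5 Hz frame rate, 90 minutes of audio, 65K-token training sequences) can be related with a little arithmetic. The sketch below is only an illustration, not part of VibeVoice: the function name is hypothetical, "65K" is read as 65,536, and it assumes one sequence position per tokenizer frame, which the description does not confirm.

```python
import math

FRAME_RATE_HZ = 7.5        # tokenizer frame rate, from the description
MAX_MINUTES = 90           # maximum audio length, from the description
CONTEXT_TOKENS = 65_536    # assumption: "65K tokens" read as 65,536


def speech_frames(minutes: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Return the number of tokenizer frames needed to cover `minutes` of audio."""
    return math.ceil(minutes * 60 * frame_rate_hz)


# Frames for a full 90-minute session, assuming one position per frame.
frames = speech_frames(MAX_MINUTES)
# Positions left over in the assumed 65,536-token window (e.g. for the script text).
headroom = CONTEXT_TOKENS - frames

print(frames)    # 40500
print(headroom)  # 25036
```

Under these assumptions, 90 minutes of speech occupies about 40,500 positions, which fits inside a 65K-token sequence with room left for the text script. The same call with a higher frame rate, for example speech_frames(90, 50.0), shows how quickly a less compact tokenizer would exceed that window.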

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python AI Models, Python Text-to-Speech (TTS) Models

Registered

2025-12-08