DiffSinger is an open-source PyTorch implementation of a diffusion-based acoustic model for singing-voice synthesis (SVS), with a companion text-to-speech (TTS) variant (DiffSpeech). The core idea is to treat mel-spectrogram generation as a diffusion process: starting from Gaussian noise, the model iteratively denoises while being conditioned on the music score (lyrics, pitch, and note timing). This sidesteps typical failure modes of earlier SVS models, such as the over-smoothed output of simple regression decoders and the unstable training of GANs, and yields more realistic, expressive, and natural-sounding singing. The method also introduces a shallow diffusion mechanism: rather than running the full reverse trajectory from pure noise, a simple mel-spectrogram decoder first produces a coarse prediction, that prediction is forward-diffused to an intermediate ("shallow") step, and denoising starts from there. This reuses the prior knowledge captured by the simple decoder and speeds up inference.
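The shallow diffusion idea can be illustrated with a toy DDPM-style reverse process. This is a minimal sketch, not DiffSinger's actual implementation: the beta schedule, the step count `T`, the shallow step `k`, and the dummy noise predictor are all placeholder assumptions, and NumPy stands in for PyTorch so the snippet runs standalone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear noise schedule (hypothetical values; the real config differs).
T = 100
betas = np.linspace(1e-4, 0.06, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def q_sample(x0, t):
    """Forward-diffuse a clean mel x0 to noise level t (closed form)."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise

def denoise_step(x_t, t, eps_hat):
    """One DDPM reverse step, given the model's predicted noise eps_hat."""
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    mean = (x_t - coef * eps_hat) / np.sqrt(alphas[t])
    if t > 0:  # no noise is added at the final step
        mean += np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)
    return mean

def shallow_diffusion_infer(coarse_mel, k, eps_model):
    """Shallow diffusion: diffuse the simple decoder's coarse mel to step k,
    then run only k+1 reverse steps instead of the full T."""
    x = q_sample(coarse_mel, k)
    for t in range(k, -1, -1):
        x = denoise_step(x, t, eps_model(x, t))
    return x

# Dummy stand-ins for the trained denoiser and the decoder output (assumptions).
dummy_eps = lambda x, t: np.zeros_like(x)
coarse = np.zeros((80, 10))  # 80 mel bins x 10 frames
mel = shallow_diffusion_infer(coarse, k=30, eps_model=dummy_eps)
print(mel.shape)  # (80, 10)
```

With `k=30` the loop runs 31 reverse steps instead of 100, which is where the speedup comes from; the starting point is informative (the diffused coarse mel) rather than pure noise.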
Features
- Diffusion-based singing voice synthesis (SVS) conditioned on musical score
- Support for multiple input modalities: lyrics + pitch (F0), lyrics + MIDI
- Shallow diffusion mechanism for faster inference without compromising quality
- Built-in vocoder integration (HiFiGAN / NSF-HiFiGAN) to convert mel-spectrogram to waveform
- Also supports conventional text-to-speech (TTS), not just singing
- Pretrained models and example workflows to simplify getting started
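On the vocoder bullet: the acoustic model stops at mel-spectrograms, and a separate neural vocoder (HiFiGAN / NSF-HiFiGAN) turns them into audio. The sketch below only illustrates the shape contract of that stage, not real synthesis; `HOP_LENGTH` and `N_MELS` are typical values assumed for illustration, and the "vocoder" is a trivial frame upsampler.

```python
import numpy as np

HOP_LENGTH = 256  # audio samples per spectrogram frame (assumed typical value)
N_MELS = 80       # mel bins (assumed typical value)

def toy_vocoder(mel):
    """Stand-in for a neural vocoder: maps a mel-spectrogram of shape
    (N_MELS, n_frames) to a waveform of shape (n_frames * HOP_LENGTH,).
    Here we just average over mel bins and repeat each frame."""
    n_mels, n_frames = mel.shape
    assert n_mels == N_MELS
    frame_energy = mel.mean(axis=0)
    return np.repeat(frame_energy, HOP_LENGTH)

mel = np.zeros((N_MELS, 100))  # 100 frames of "silence"
wav = toy_vocoder(mel)
print(wav.shape)  # (25600,)
```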