audio free download - SourceForge

OpenVoice

Instant voice cloning by MIT and MyShell. Audio foundation model

OpenVoice is a versatile instant voice cloning system that can replicate a speaker’s tone color from just a short audio clip and then generate speech in multiple languages. It is designed not only to match the timbre of the reference voice, but also to give granular control over style parameters such as emotion, accent, rhythm, pauses, and intonation. The model supports cross-lingual and even zero-shot cross-lingual voice cloning, so a speaker recorded in one language can be made to speak naturally in others. ...

Downloads: 26 This Week

Last Update: 2025-11-28

See Project

Real-Time Voice Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Real-Time Voice Cloning is an influential deep-learning repository that demonstrates how to clone a voice from just a few seconds of audio and then generate arbitrary speech in that voice in near real time. It implements the SV2TTS pipeline (“Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis”) in three stages: a speaker encoder, a synthesizer, and a vocoder. In the first stage, short audio clips are converted into a fixed-dimensional speaker embedding that captures voice characteristics; this embedding is then used by a Tacotron-style synthesizer to generate spectrograms from text, which a WaveRNN-based vocoder finally turns into audio. ...

Downloads: 8 This Week

Last Update: 2026-03-09

See Project

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model

PaddleSpeech is an open-source toolkit on PaddlePaddle platform for a variety of critical tasks in speech and audio, with state-of-art and influential models. Via the easy-to-use, efficient, flexible and scalable implementation, our vision is to empower both industrial application and academic research, including training, inference & testing modules, and deployment process. Low barriers to install, CLI, Server, and Streaming Server is available to quick-start your journey.

Downloads: 0 This Week

Last Update: 2025-03-04

See Project

Coqui TTS

A deep learning toolkit for Text-to-Speech, battle-tested in research

TTS is a library for advanced Text-to-Speech generation. It's built on the latest research, was designed to achieve the best trade-off among ease-of-training, speed and quality. TTS comes with pre-trained models, tools for measuring dataset quality and is already used in 20+ languages for products and research projects. High-performance Deep Learning models for Text2Speech tasks. Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech). Speaker Encoder to compute speaker embeddings...

Downloads: 30 This Week

Last Update: 2023-12-12

See Project

Mocking Bird

Clone a voice in 5 seconds to generate arbitrary speech in real-time

MockingBird is an open-source voice cloning and real-time speech generation toolkit that lets you clone a speaker’s voice from a short audio sample (reportedly as little as 5 seconds) and then synthesize arbitrary speech in that voice. It builds on deep-learning based TTS / voice-cloning technology (in the lineage of projects such as Real-Time-Voice-Cloning), but extends it with support for Mandarin Chinese and multiple Chinese speech datasets — broadening its applicability beyond English. ...

1 Review

Downloads: 2 This Week

Last Update: 2023-03-23

See Project

Search Results for "audio"

Showing 5 open source projects for "audio"

OpenVoice

Real-Time Voice Cloning

PaddleSpeech

Coqui TTS

Mocking Bird

Search Results for "audio"

Showing 5 open source projects for "audio"

OpenVoice

Real-Time Voice Cloning

PaddleSpeech

Coqui TTS

Mocking Bird

Related Searches

Related Categories