two free download - SourceForge

FireRedTTS-2

Long-form streaming TTS system for multi-speaker dialogue generation

FireRedTTS2 is a next-generation open-source text-to-speech (TTS) system focused on long-form, streaming speech synthesis for multi-speaker dialogue, delivering stable natural speech with context-aware prosody and reliable speaker transitions that support real-time and conversational applications. It features a specialized streaming speech tokenizer and a dual-transformer architecture that enables low latency and high-quality synthesis, making it suitable for interactive systems like...

Downloads: 0 This Week

Last Update: 2026-02-16

MARS5

MARS5 speech model (TTS) from CAMB.AI

...To control speaker identity, MARS5 uses a short reference audio clip, typically between 2 and 12 seconds, from which it learns the voice characteristics. It supports two main inference modes: shallow clone, which is faster and only needs the reference audio, and deep clone, which additionally uses the transcript of the reference audio to increase similarity and naturalness at the cost of more computation.

Downloads: 0 This Week

Last Update: 2025-11-28

Kitten TTS

State-of-the-art TTS model under 25MB

KittenTTS is an open-source, ultra-lightweight text-to-speech model with just 15 million parameters and a binary size under 25 MB. It is designed for real-time, CPU-based deployment across diverse platforms and runs without a GPU. Several premium voice options are available, and inference is optimized for real-time speech synthesis.

Downloads: 12 This Week

Last Update: 2026-02-24

GLM-TTS

Controllable & emotion-expressive zero-shot TTS

GLM-TTS is a text-to-speech synthesis system built on large language model technology that focuses on producing high-quality, expressive, and controllable spoken output, including features like emotion modulation and zero-shot voice cloning. It uses a two-stage architecture: a generative LLM first converts text into intermediate speech token sequences, and a flow-based neural model then converts those tokens into natural audio waveforms, enabling rich prosody and voice character even for unseen speakers. The system introduces a multi-reward reinforcement learning framework that jointly optimizes for voice similarity, emotional expressiveness, pronunciation, and intelligibility, yielding output that can rival commercial options in naturalness and expressiveness. ...

Downloads: 0 This Week

Last Update: 2026-04-10

fairseq2

FAIR Sequence Modeling Toolkit 2

...It supports multi-GPU and multi-node distributed training using DDP, FSDP, and tensor parallelism, capable of scaling up to 70B+ parameter models. The framework integrates seamlessly with PyTorch 2.x features such as torch.compile, Fully Sharded Data Parallel (FSDP), and modern configuration management.

Downloads: 0 This Week

Last Update: 2026-03-26

Dia2

TTS model capable of streaming conversational audio in realtime

...The model supports audio conditioning, allowing generated speech to follow a reference voice or conversational style more naturally. Dia2 provides 1B and 2B model checkpoints along with inference code for research and experimentation. It currently focuses on English generation and supports up to two minutes of generated audio. Its main value is enabling low-latency, dialogue-oriented TTS workflows where timing, turn-taking, and natural conversation matter.

Downloads: 0 This Week

Last Update: 2026-06-08

StyleTTS 2

Towards Human-Level Text-to-Speech through Style Diffusion

...It extends the original StyleTTS idea by introducing a style diffusion model that can sample rich, realistic speaking styles conditioned on reference speech, allowing highly expressive and diverse prosody. The architecture uses a two-stage training process and leverages an auxiliary speech language model to guide generation toward more natural and coherent utterances. StyleTTS2 supports both single-speaker and multi-speaker configurations, with the ability to sample or transfer styles from reference audio, making it powerful for expressive TTS and character voices. ...

Downloads: 5 This Week

Last Update: 2025-11-28

TITTSE

Two Integrated Text To Speech Engines, using MMS & Silero

TITTSE is a Python application that converts text to speech in 15 languages (more can be added easily) using two TTS engines. The input is a text file with the tittse extension that starts with four header lines: the TITTSE language code (see the documentation for your language), the base file name for the audio files TITTSE creates, the voice gender (girl or boy), and the offset (the number at which the file numbers appended to the base file name start). After those four lines, each paragraph is rendered as a single audio file. ...
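The four-line header described above can be sketched as a small parser. This is an illustration only and is not taken from TITTSE itself: the names `TittseJob`, `parse_tittse`, and `output_names`, the sample language code `eng`, the assumption that blank lines separate paragraphs, and the way the file number is appended to the base name are all hypothetical. The TITTSE documentation is the authority on the real format.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class TittseJob:
    """Hypothetical in-memory view of a .tittse input file."""
    language_code: str
    base_name: str
    gender: str
    offset: int
    paragraphs: List[str]


def parse_tittse(text: str) -> TittseJob:
    """Split the four header lines from the paragraphs that follow them."""
    lines = text.splitlines()
    if len(lines) < 4:
        raise ValueError("expected at least four header lines")
    language_code, base_name, gender, offset = (s.strip() for s in lines[:4])
    if gender not in ("girl", "boy"):
        raise ValueError(f"gender must be 'girl' or 'boy', got {gender!r}")
    body = "\n".join(lines[4:])
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [
        " ".join(chunk.split())
        for chunk in re.split(r"\n\s*\n", body)
        if chunk.strip()
    ]
    return TittseJob(language_code, base_name, gender, int(offset), paragraphs)


def output_names(job: TittseJob) -> List[str]:
    """One name per paragraph, numbered from the offset (naming is assumed)."""
    return [f"{job.base_name}{job.offset + i}" for i in range(len(job.paragraphs))]
```

Under these assumptions, a file with the header `eng`, `chapter`, `girl`, `5` followed by two paragraphs would map to two audio files numbered 5 and 6.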

Downloads: 11 This Week

Last Update: 7 days ago

VITS

Conditional Variational Autoencoder with Adversarial Learning

VITS is a foundational research implementation of “VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech,” a well-known neural TTS architecture. Unlike traditional two-stage systems that separately train an acoustic model and a vocoder, VITS trains an end-to-end model that maps text directly to waveform using a conditional variational autoencoder combined with normalizing flows and adversarial training. This architecture enables parallel generation (fast inference) while achieving speech quality that rivals or surpasses many two-stage systems. ...

Downloads: 1 This Week

Last Update: 2025-11-28

Transformer TTS

Implementation of a Transformer based neural network

TransformerTTS is an implementation of a non-autoregressive Transformer-based neural network for text-to-speech, built with TensorFlow 2. It takes inspiration from architectures like FastSpeech, FastSpeech 2, FastPitch, and Transformer TTS, and extends them with its own aligner and forward models. The system separates alignment learning and acoustic modeling: an autoregressive Transformer is used as an aligner to extract phoneme-to-frame durations, while a non-autoregressive “ForwardTransformer” generates mel-spectrograms conditioned on text and durations. ...
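One common way phoneme-to-frame durations are consumed in FastSpeech-style models is length regulation: each phoneme is repeated for as many frames as its duration, so the expanded sequence has the same length as the target mel-spectrogram. The sketch below illustrates that idea in plain Python. It is not code from the TransformerTTS repository, and `expand_by_duration` and the sample phoneme labels are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def expand_by_duration(items: Sequence[T], durations: Sequence[int]) -> List[T]:
    """Repeat each item by its duration (in frames), preserving order."""
    if len(items) != len(durations):
        raise ValueError("items and durations must have the same length")
    if any(d < 0 for d in durations):
        raise ValueError("durations must be non-negative")
    frames: List[T] = []
    for item, duration in zip(items, durations):
        frames.extend([item] * duration)
    return frames


# Two phonemes lasting 2 and 3 frames expand to a 5-frame sequence.
frames = expand_by_duration(["HH", "AY"], [2, 3])
```

In a real model the items would be encoder output vectors rather than strings, but the bookkeeping is the same: the output length equals the sum of the durations.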

Downloads: 0 This Week

Last Update: 2025-11-28

Resemblyzer

A python package to analyze and compare voices with deep learning

...The project is useful for researchers and developers who need a practical way to reason about speaker identity without building a voice encoder from scratch. It can help identify whether two recordings sound like the same speaker or visualize voice relationships across many samples. Its main value is making speaker representation accessible through a simple Python workflow.
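Deciding whether two recordings sound like the same speaker usually comes down to comparing their speaker embeddings. The following is a minimal sketch, assuming each recording has already been converted to a fixed-length embedding vector by some voice encoder. The helper names `cosine_similarity` and `same_speaker` and the 0.75 threshold are illustrative assumptions, not Resemblyzer's API or recommended values.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(a: Sequence[float], b: Sequence[float],
                 threshold: float = 0.75) -> bool:
    """Treat two embeddings as the same speaker above an assumed threshold."""
    return cosine_similarity(a, b) >= threshold
```

Any real threshold would need to be tuned on labeled recordings for the encoder in use.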

Downloads: 1 This Week

Last Update: 2026-06-10

DC-TTS

TensorFlow Implementation of DC-TTS: yet another text-to-speech model

...It follows the “Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention” paper, but the author adapts and extends the design to make it practical for real experiments. The model is split into two networks: Text2Mel, which maps text to mel-spectrograms, and SSRN (spectrogram super-resolution network), which converts low-resolution mel-spectrograms into high-resolution magnitude spectrograms suitable for waveform synthesis. Training scripts, data loaders, and hyperparameter configurations are provided to reproduce results on several datasets, including LJ Speech for English, a Korean single-speaker dataset, and audiobook data from Nick Offerman and Kate Winslet.

Downloads: 0 This Week

Last Update: 2025-11-28

Search Results for "two"

Showing 12 open source projects for "two"

FireRedTTS-2

MARS5

Kitten TTS

GLM-TTS

fairseq2

Dia2

StyleTTS 2

TITTSE

VITS

Transformer TTS

Resemblyzer

DC-TTS
