Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence
Text to Speech Software
Search Results

Search Results for "python text" - Page 4

x

Sort By:

Relevance

Clear All Filters

OS

Linux 100
Windows 94
Mac 93
More...
BSD 57
ChromeOS 57
Desktop Operating Systems 1
Mobile Operating Systems 1

Category

Artificial Intelligence 100
Multimedia 4
Communications 1
Desktop Environment 1
Education 1
Games 1
System 1
Terminals 1

License

OSI-Approved Open Source 96
Public Domain 1

Translations

German 2
Bengali 1
English 1

Programming Language

Python 92
C 3
JavaScript 3
BASIC 1
More...
C++ 1
C# 1
Java 1
PHP 1
TypeScript 1
Unix Shell 1

Status

Beta 5
Alpha 3
Production/Stable 2

Showing 100 open source projects for "python text"

View related business solutions

Text to Speech Linux Clear Filters & Widen Search

Build Securely on AWS with Proven Frameworks
Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.

Download Now
Forever Free Full-Stack Observability | Grafana Cloud
Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.

Create free account
1

wukong-robot

Chinese voice dialogue robot/smart speaker project

wukong-robot is a Chinese voice assistant / smart speaker project built to let makers and hackers design highly customizable voice-controlled devices. It combines wake-word detection, automatic speech recognition, natural language understanding, and text-to-speech into a single framework aimed at the Chinese-speaking ecosystem. The project is positioned as a simple, flexible, and elegant platform that can run on devices like Raspberry Pi and other Linux-based boards, making it suitable for...

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
2

Audio Webui

A webui for different audio related Neural Networks

Audio Webui is a Gradio-based web user interface that unifies a wide range of audio-related neural networks under a single, accessible front end. It is designed as an “all-in-one” environment where users can experiment with text-to-speech, voice cloning, generative music, and other neural audio models without writing boilerplate code. The project supports multiple back-end models and toolchains (such as Bark, RVC, AudioLDM, Audiocraft, and other text-to-audio or voice-cloning tools),...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
3

DiffSinger

Singing Voice Synthesis via Shallow Diffusion Mechanism

DiffSinger is an open-source PyTorch implementation of a diffusion-based acoustic model for singing-voice synthesis (SVS) and also text-to-speech (TTS) in a related variant. The core idea is to view generation of a sung voice (mel-spectrogram) as a diffusion process: starting from noise, the model iteratively “denoises” while being conditioned on a music score (lyrics, pitch, musical timing). This avoids some of the typical problems of prior SVS models — like over-smoothing or unstable GAN...

Downloads: 32 This Week

Last Update: 2025-11-28
See Project
4

WaveRNN

WaveRNN Vocoder + TTS

WaveRNN is a PyTorch implementation of DeepMind’s WaveRNN vocoder, bundled with a Tacotron-style TTS front end to form a complete text-to-speech stack. As a vocoder, WaveRNN models raw audio with a compact recurrent neural network that can generate high-quality waveforms more efficiently than many traditional autoregressive models. The repository includes scripts and code for preprocessing datasets such as LJSpeech, training Tacotron to produce mel spectrograms, training WaveRNN on those...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
Earn up to 16% annual interest with Nexo.
Let your crypto work for you

Put idle assets to work with competitive interest rates, borrow without selling, and trade with precision. All in one platform. Geographic restrictions, eligibility, and terms apply.

Get started with Nexo.
5

Mocking Bird

Clone a voice in 5 seconds to generate arbitrary speech in real-time

MockingBird is an open-source voice cloning and real-time speech generation toolkit that lets you clone a speaker’s voice from a short audio sample (reportedly as little as 5 seconds) and then synthesize arbitrary speech in that voice. It builds on deep-learning based TTS / voice-cloning technology (in the lineage of projects such as Real-Time-Voice-Cloning), but extends it with support for Mandarin Chinese and multiple Chinese speech datasets — broadening its applicability beyond English....

1 Review

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
6

VoiceFixer

General Speech Restoration

VoiceFixer is a machine-learning framework for “speech restoration”: given a degraded or distorted audio recording — with noise, clipping, low sampling rate, reverberation, or other artifacts — it attempts to recover high-fidelity, clean speech. The architecture works in two stages: first an analysis stage that tries to extract “clean” intermediate features from the noisy audio (e.g. removing noise, denoising, dereverberation, upsampling), and then a neural vocoder-based synthesis stage that...

Downloads: 13 This Week

Last Update: 2025-11-28
See Project
7

TensorFlowTTS

Real-Time State-of-the-art Speech Synthesis for Tensorflow 2

TensorFlowTTS is a state-of-the-art, open-source speech synthesis library built on TensorFlow 2. It offers a variety of architectures for text-to-speech, including classic and modern models such as Tacotron‑2, FastSpeech / FastSpeech2, and neural vocoders like MelGAN and Multiband‑MelGAN. Because it’s based on TensorFlow 2, it can leverage optimizations such as fake-quantization aware training and pruning — which allow models to run faster than real time and to be deployable on mobile or...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
8

VITS

Conditional Variational Autoencoder with Adversarial Learning

VITS is a foundational research implementation of “VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech,” a well-known neural TTS architecture. Unlike traditional two-stage systems that separately train an acoustic model and a vocoder, VITS trains an end-to-end model that maps text directly to waveform using a conditional variational autoencoder combined with normalizing flows and adversarial training. This architecture enables parallel generation...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
9

Transformer TTS

Implementation of a Transformer based neural network

TransformerTTS is an implementation of a non-autoregressive Transformer-based neural network for text-to-speech, built with TensorFlow 2. It takes inspiration from architectures like FastSpeech, FastSpeech 2, FastPitch, and Transformer TTS, and extends them with its own aligner and forward models. The system separates alignment learning and acoustic modeling: an autoregressive Transformer is used as an aligner to extract phoneme-to-frame durations, while a non-autoregressive...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now
10

PaddlePaddle models

Pre-trained and Reproduced Deep Learning Models

Pre-trained and Reproduced Deep Learning Models ("Flying Paddle" official model library, including a variety of academic frontier and industrial scene verification of deep learning models) Flying Paddle's industrial-level model library includes a large number of mainstream models that have been polished by industrial practice for a long time and models that have won championships in international competitions; it provides many scenarios for semantic understanding, image classification,...

Downloads: 2 This Week

Last Update: 2022-08-01
See Project
11

HiFi-GAN

Generative Adversarial Networks for Efficient and High Fidelity Speech

HiFi-GAN is a GAN-based neural vocoder designed to generate high-fidelity speech waveforms from mel spectrograms with exceptional efficiency. It introduces a generator architecture tailored to model the periodic structure of speech and a set of discriminators that focus on different scales and periods of the waveform to better capture naturalness. The model targets a sweet spot between sample quality and generation speed, outperforming many previous GAN vocoders while being far faster than...

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
12

Bangla TTS

Bangla text to speech synthesis in python

Bangla text to speech Multilingual (Bangla, English) real-time ([almost] in a GPU) speech synthesis library. Installation -------------------------------------- * Install Anaconda * conda create -n new_virtual_env python==3.6.8 * conda activate new_virtual_env * pip install -r requirements.txt * While running for the first time, keep your internet connection on to download the weights of the speech synthesis models (>500 MB) * For fast inference, you must install tensorflow-gpu and have a NVidia GPU. ...

Downloads: 0 This Week

Last Update: 2020-09-03
See Project
13

Dragonfire

The open-source virtual assistant for Ubuntu based Linux distributions

Dragonfire is the open-source virtual assistant project for Ubuntu-based Linux distributions. Her main objective is to serve as a command and control interface to the helmet user. So that you will be able to give orders just by using your voice commands and your eye movements. That makes the helmet handsfree. We are planning to ship Dragonfire as a preinstalled software package on DragonOS Linux Distribution. DragonOS will be a Linux distribution specially designed for the helmet. It will...

Downloads: 1 This Week

Last Update: 2022-01-13
See Project
14

Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation

Tacotron-2 is a TensorFlow implementation of DeepMind’s Tacotron-2 end-to-end text-to-speech architecture, which predicts mel spectrograms from raw text and then feeds them to a neural vocoder such as WaveNet. It reproduces the original paper’s hyperparameters exactly via paper_hparams.py, while also offering a tuned hparams.py with extra improvements that often yield better audio quality in practice. The repository is structured as a full training pipeline: dataset preparation,...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
15

OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition

OpenSeq2Seq is a TensorFlow-based toolkit for efficient experimentation with sequence-to-sequence models across speech and NLP tasks. Its core goal is to give researchers a flexible, modular framework for building and training encoder–decoder architectures while fully leveraging distributed and mixed-precision training. The toolkit includes ready-made models for neural machine translation, automatic speech recognition, speech synthesis, language modeling, and additional NLP tasks such as...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
16

DC-TTS

TensorFlow Implementation of DC-TTS: yet another text-to-speech model

DC-TTS is a TensorFlow implementation of the DC-TTS architecture, a fully convolutional text-to-speech system designed to be efficiently trainable while producing natural speech. It follows the “Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention” paper, but the author adapts and extends the design to make it practical for real experiments. The model is split into two networks: Text2Mel, which maps text to mel-spectrograms, and SSRN...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
17

vinuxproject

Vinux is an Ubuntu derived distribution for blind & visually impaired.

Vinux supports software text to speech and Braille support from boot-up to shutdown. Users can use installation medium to install independently with no sighted assistance required. Vinux supports command line environment speech, Desktop environment speech and magnification features. Vinux comes with an accessible suite of software and has an excellent mailing list support group.

Downloads: 8 This Week

Last Update: 2017-01-20
See Project
18

Text to Speech for Video

create wav files for video character speech by typing in dialogue

Choose from the "voices" available, and type in what you want the computer to say. A wave file called sounds.wav is stored to the output sub folder. Output is intended primarily for users who need speech for animated characters in videos.

Downloads: 1 This Week

Last Update: 2015-10-16
See Project
19

Steel TTS

A cross-platform wrapper for common text-to-speech engines in Python

Steel is a cross-platform package for using common text-to-speech (speech synthesis) engines in Python. Steel currently supports the following TTS software: - Microsoft Speech API 5 (SAPI5) - eSpeak - NS Speech Synthesis - FreeTTS Documentation: http://sourceforge.net/p/steeltts/wiki/ Bug Tracker: http://sourceforge.net/p/steeltts/tickets/ If you are interested in contributing to the Steel TTS codebase, or would like to make a feature-request, please contact the lead developer, Jasper Danielson, at jrd4@rice.edu.

Downloads: 0 This Week

Last Update: 2016-03-15
See Project
20

AarTon

AarTon is an automated text-to-speech application. It allows user to enter text in a web-based front-end and render these texts via a multi-channel sound card.

Downloads: 0 This Week

Last Update: 2013-11-14
See Project
21

Speect

Speect is a multilingual TTS system. It offers a full text-to-speech system with various API's, as well as an environment for research and development of TTS systems and voices. It is written in ANSI C and uses a plug-in mechanism for extensions. Speect also includes an extensive set of Python bindings for quick implementation of new ideas, these bindings are derived from SWIG interface files and can easily be extended for other languages supported by SWIG.

Downloads: 0 This Week

Last Update: 2013-05-30
See Project
22

voicecommand

Run Bash commands using Google voice recognition

This simple pygtk application uses ffmpeg and arecord; to record sound Google's unofficial text to speech service; to convert sound to text The python subprocess module to run the text as a shell command. The text to speech service used by this application is unofficial, and this program should therefore be considered a complete hack.

Downloads: 0 This Week

Last Update: 2013-05-30
See Project
23

HeadCalc

A program for school children to practice mental calculation. The output can be text or spoken using text to speech.

Downloads: 0 This Week

Last Update: 2013-05-16
See Project
24

Voice Conference Manager

Voice Conference Manager uses VoiceXML and CCXML to control speech recognition, text to speech, and voice biometrics for a telephone conference service. Say the names or numbers of people and VCM places them into the call. Can be hosted on public servers

Downloads: 0 This Week

Last Update: 2013-04-17
See Project
25

Story Talker

Story Talker is a GUI frontend to the Festival Text-to-speech Synthesis engine. It is designed for speaking long plain text or HTML files, such as e-books and web articles. It is not a screen reader for the blind. (Python, wxPython)

Downloads: 0 This Week

Last Update: 2013-03-20
See Project

Previous
1
2
3
You're on page 4
Next

Related Searches

voice cloning

bangla text to speech

text to speech

offline artificial intelligence assistant

blind

sapi5

tts voices

mega voice command

voice biometrics

voice changer software

Related Categories

Artificial Intelligence

Multimedia

Communications

Desktop Environment

Education

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise