python voice synthesis free download

Showing 72 open source projects for "python voice synthesis"


1

FFsubsync

Automagically synchronize subtitles with video

...First, make sure ffmpeg is installed. Make sure ffmpeg is on your path and can be referenced from the command line! Next, grab the script. It should work with both Python 2 and Python 3. There may be occasions where you have a correctly synchronized srt file in a language you are unfamiliar with, as well as an unsynchronized srt file in your native language. In this case, you can use the correctly synchronized srt file directly as a reference for synchronization, instead of using the video as the reference. ffsubsync uses the file extension to decide whether to perform voice activity detection on the audio or to directly extract speech from an srt file. ffsubsync usually finishes in 20 to 30 seconds, depending on the length of the video.

Downloads: 39 This Week

Last Update: 2026-07-24
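
The FFsubsync description above mentions two prerequisites: ffmpeg must be reachable from the command line, and the reference file's extension determines whether it is treated as subtitles or as audio/video. The following is a minimal sketch of those two checks, not ffsubsync's own code. The helper names `ffmpeg_on_path` and `reference_kind` are hypothetical, and only the `.srt` extension is assumed because it is the only one the description names.

```python
import shutil
from pathlib import Path

# Hypothetical helpers for illustration; they are not part of ffsubsync.
# Only ".srt" is listed because it is the only extension the description names.
SUBTITLE_EXTENSIONS = {".srt"}


def ffmpeg_on_path() -> bool:
    """Return True if an `ffmpeg` executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def reference_kind(reference: str) -> str:
    """Classify a reference file by its extension.

    Returns "subtitle" for an already-synchronized subtitle file and
    "media" for anything else (assumed to need voice activity detection).
    """
    suffix = Path(reference).suffix.lower()
    return "subtitle" if suffix in SUBTITLE_EXTENSIONS else "media"


if __name__ == "__main__":
    print("ffmpeg found:", ffmpeg_on_path())
    for name in ("movie.mkv", "reference.srt"):
        print(name, "->", reference_kind(name))
```

A check like this could run before invoking the tool, so that a missing ffmpeg is reported early rather than partway through synchronization.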
2

PersonaPlex

PersonaPlex code

PersonaPlex is an open-source real-time conversational speech AI model that goes beyond traditional text chat by providing full-duplex speech-to-speech interaction, meaning it can listen and talk at the same time instead of waiting for you to finish speaking before responding. This architectural approach eliminates awkward pauses and makes conversations feel much more human-like, with natural behaviors such as overlapping speech, interruptions, and fluent turn-taking, traits that traditional...

Downloads: 0 This Week

Last Update: 2026-03-02
3

Speakr

Speakr is a personal, self-hosted web application

Speakr is an open-source, real-time text-to-speech (TTS) web application that allows users to convert written text into natural-sounding speech in just a few clicks. It provides a clean, user-friendly interface where users can input text, choose a voice style or language, and immediately hear the output, making it ideal for accessibility, content creation, and learning applications. Behind the scenes, Speakr leverages modern TTS engines and streaming audio technologies to deliver smooth and...

Downloads: 0 This Week

Last Update: 2026-07-15
4

ML Sharp

Sharp Monocular View Synthesis in Less Than a Second

ML Sharp is a research code release that turns a single 2D photograph into a photorealistic 3D representation that can be rendered from nearby viewpoints. Instead of requiring multi-view input, it predicts the parameters of a 3D Gaussian scene representation directly from one image using a single forward pass through a neural network. The core idea is speed: the 3D representation is produced in under a second on a standard GPU, and then the resulting scene can be rendered in real time to...

Downloads: 3 This Week

Last Update: 2026-01-29
5

SpotSeekBot

Spotify music downloader Telegram bot (tracks, albums, playlists)

SpotSeekBot is a Discord music bot designed to stream and control Spotify-based playback within voice channels. It allows users to search for tracks, play songs, and manage queues directly through chat commands. The bot integrates with Spotify APIs to retrieve track information and playlists while using external sources for actual audio playback. It supports common playback controls such as pause, skip, and seek, enabling interactive music sessions in real time. The system is designed to...

Downloads: 0 This Week

Last Update: 2026-04-27
6

Podcastfy.ai

Transforming Multimodal Content into Captivating Multilingual Audio

Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multilingual audio conversations using GenAI. Input content includes websites, PDFs, YouTube videos, and images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources, enabling customization and scale.

Downloads: 0 This Week

Last Update: 2024-11-16
7

3D Gaussian Splatting

Original reference implementation of "3D Gaussian Splatting"

Gaussian Splatting is the official implementation of “3D Gaussian Splatting for Real-Time Radiance Field Rendering,” a research project for reconstructing and rendering 3D scenes from collections of images. The system represents scenes as millions of optimized 3D Gaussians rather than traditional meshes or neural fields, allowing high-quality novel view synthesis with real-time rendering performance. It includes training scripts, rendering tools, scene conversion utilities, and viewers for...

Downloads: 7 This Week

Last Update: 2026-05-08
8

Pixal3D

Pixel-Aligned 3D Generation from Images

Pixal3D is a TencentARC research project for generating high-fidelity 3D assets from a single input image. It addresses a key weakness in image-to-3D generation: many models produce plausible 3D shapes but fail to preserve pixel-level faithfulness to the original image. Pixal3D improves this by explicitly lifting image features into 3D through back-projection, creating clearer correspondences between the input pixels and the generated asset. The system is designed to produce detailed...

Downloads: 2 This Week

Last Update: 2026-06-23
9

Music Assistant

Music Assistant is a free, open-source media library manager

Music Assistant Server is the core backend for Music Assistant, a free and open-source music library manager for local and online music sources. It connects streaming services, local files, metadata providers, and many speaker ecosystems into one centralized music system. The server is designed to run on an always-on device such as a Raspberry Pi, NAS, Intel NUC, or similar home server. It can work as a standalone product, but it is especially tailored for Home Assistant users who want...

Downloads: 5 This Week

Last Update: 3 days ago
10

AudioNotes

Extract audio and video content and organize it into a Markdown note

AudioNotes is an application (or proof-of-concept) that likely combines audio recording or playback with note-taking or annotation functionality — enabling users to record voice or audio and attach textual or timestamped notes, making it ideal for lectures, interviews, meetings, or personal memos. Such a tool offers a more expressive and flexible way to capture and revisit information: instead of just typed notes or raw audio, users get both audio context and structured notes. As an...

Downloads: 1 This Week

Last Update: 2026-07-22
11

Loris

C++ class library for sound analysis, synthesis, and morphing

Loris is a library for sound analysis, synthesis, and morphing, developed by Kelly Fitz and Lippold Haken at the CERL Sound Group. Loris includes a C++ class library, Python module, C-linkable interface, command line utilities, and documentation. Loris development has moved to GitHub: https://github.com/kellyfitz/loris. This SourceForge project is no longer maintained. Its files remain available as an archive of past releases.

1 Review

Downloads: 62 This Week

Last Update: 2026-07-16
12

gmf_synth

A GUI for the FluidSynth SoundFont player

A graphical interface for software synthesizers and sound samplers. FluidSynth is the only backend currently supported. It can be used to play SoundFonts, SF2 and MIDI files. An installation of FluidSynth is required. Written in Python / Qt4.

Downloads: 0 This Week

Last Update: 2026-06-21
13

Free Karaoke File Maker

Free Karaoke File Maker

You can remove the singer's voice from music files that do not otherwise let you mute the vocals on the computer. By default, the output is saved with 2 audio tracks (singer + melody). To save only the melody without the singer's voice, select the No Vocal option. To choose where the output file is saved, click Save Folder and pick a location (default: Desktop). If you are sure of the above preparations, you can change the file you want to change by holding down the mouse and...

Downloads: 2 This Week

Last Update: 2024-12-24
14

Color to Waveform

Convert colors to synth presets

The purpose of the program is to convert a color to a waveform you can use as a synthesizer oscillator inside a DAW such as FL Studio from Image Line. Many synths are provided with an option to load your own waveform, to replace the basic saw, square and sine waveforms commonly used to create synth sounds. The waveform generated by the program will correspond to the subliminal synesthetic sensation of the selected color. You can create your own synth presets to use in a track using color as a base.

Downloads: 0 This Week

Last Update: 2024-09-16
15

VCClient

Software that uses AI to perform real-time voice conversion

VCClient is a real-time voice conversion system that uses machine learning models to transform a speaker’s voice into another voice with minimal latency. It is designed for live applications such as streaming, gaming, and virtual communication, where immediate feedback is essential. The system supports multiple voice conversion models, including RVC and other neural network-based approaches, allowing users to switch between different voices or customize their output. It provides both a...

Downloads: 50 This Week

Last Update: 2026-03-23
16

ritaos

RitaOS is a free and open-source software collection featuring offline

RitaOS is an open-source project providing free software for education, healthcare, multimedia and communication. Current projects include:
• RitaOS MediaConverter
• RitaOS Visual Voice
• RitaOS BridgeCopy
• RitaOS Care Communicator
• RitaWiki
• RitaOS Toolkit
Website: https://ritaos.de
All software is provided free of charge. Use at your own risk.

Downloads: 5 This Week

Last Update: 2026-07-26
17

Internet DJ Console

A feature packed DJ console and internet radio client for Linux users

Conceived as an internet radio Shoutcast/Icecast client and DJ console, IDJC has two main media players, a background track player, effects buttons, a crossfader, WebM, AAC, Ogg, and MP3 streaming, stream automation timers, aux input, and voice and VoIP integration. Supported media file formats include mp3, ogg, flac, wma, wav, m4a, m3u, xspf, pls, and cue sheets. It also offers IRC track and station announcements and uses the JACK Audio Connection Kit to provide a flexible audio chain. This list of features is by no...

32 Reviews

Downloads: 7 This Week

Last Update: 2026-01-10
18

FluidPatcher

A performance-oriented patch interface for FluidSynth

FluidPatcher is a performance-oriented interface for FluidSynth, built using wxPython to create a simple GUI that allows live editing, selecting, and playing of patches. A patch is a collection of settings such as soundfont presets for each MIDI channel, control-change/sysex messages to send when the patch is selected, and MIDI router or effects settings. Groups of patches are stored in banks, which are saved as human-readable and -editable YAML files. This allows a musician to easily create...

Downloads: 0 This Week

Last Update: 2026-05-06
19

DALL-E 2 - Pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch. The main novelty seems to be an extra layer of indirection with the prior network (whether it is an autoregressive transformer or a diffusion network), which predicts an image embedding based on the text embedding from CLIP. Specifically, this repository will only build out the diffusion prior network, as it is the best performing variant (but which incidentally involves a causal transformer as...

Downloads: 2 This Week

Last Update: 2023-10-19
20

Text to Waveform

Create synth presets from words

Convert words to waveforms you can load into a synthesizer oscillator to create synth presets. Have fun turning your name, your friends' names, your city name, your pet's name, your team's name into synth presets you can use to produce a track.

Downloads: 0 This Week

Last Update: 2023-12-09
21

PicResize

A simple pic resizer

A simple pic resizer working with drag and drop. Drag and drop an image file on a shortcut to the program, input width or height, confirm, find your resized image in the same folder with new dimensions in the file name.

Downloads: 2 This Week

Last Update: 2023-12-09
22

Telegram WebRTC (VoIP)

Voice chats, private incoming and outgoing calls in Telegram

Telegram WebRTC (VoIP) is a Python and C++ library that enables real-time voice and video communication features for Telegram bots and clients. It provides an interface for joining, managing, and streaming audio or video in Telegram group calls and voice chats. The library is built on top of low-level communication protocols, ensuring efficient handling of real-time media streams.

Downloads: 0 This Week

Last Update: 2026-05-01
23

Spleeter

Deezer source separation library including pretrained models

Spleeter is the Deezer source separation library with pretrained models, written in Python and using TensorFlow. It makes it easy to train music source separation models (assuming you have a dataset of isolated sources), and provides already-trained state-of-the-art models for performing various flavours of separation. The 2-stem and 4-stem models have state-of-the-art performance on the musdb dataset. Spleeter is also very fast as it can perform separation of audio files to 4 stems 100x...

1 Review

Downloads: 38 This Week

Last Update: 2021-09-03
24

Swami Project

A SoundFont editor and other software for editing, managing and sharing sample based MIDI instrument files for computer music composition. Support for other formats is planned.

3 Reviews

Downloads: 4 This Week

Last Update: 2019-03-09
25

JAVT - Just Another Voice Transformer

Just Another Speech Recognition and Text to Speech software.

JAVT, or Just Another Voice Transformer (formerly called Just Another Video Transcriber), is speech recognition software that also supports text-to-speech and simple media conversion. JAVT lets you convert video files to WAV audio using ffmpeg and then transcribe the audio to text using either Microsoft SAPI or CMU Sphinx. You can also open a text file and have JAVT read it aloud through text-to-speech conversion.

Downloads: 0 This Week

Last Update: 2020-08-19
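
The first step the JAVT description mentions, converting a video file to a WAV file with ffmpeg, can be sketched as below. This is an illustrative sketch, not JAVT's own code. The function name `wav_extraction_command` is hypothetical, and the 16 kHz mono 16-bit PCM output is an assumption chosen because it is a common input format for speech recognizers. The listing does not state which settings JAVT uses.

```python
from pathlib import Path


def wav_extraction_command(video: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg argument list that extracts audio from `video` as WAV.

    Hypothetical helper: the sample rate and mono downmix are assumptions,
    not settings taken from JAVT.
    """
    wav = str(Path(video).with_suffix(".wav"))
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", video,             # input video file
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM audio
        "-ar", str(sample_rate), # output sample rate in Hz
        "-ac", "1",              # downmix to a single (mono) channel
        wav,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(wav_extraction_command("lecture.mp4")))
```

To actually run the conversion, the returned list could be passed to `subprocess.run(..., check=True)`, provided ffmpeg is installed and on the path.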