Browse free open source Speech software and projects below.

  • 1
    eSpeak: speech synthesis
    Text to Speech engine for English and many other languages. Compact size with clear but artificial pronunciation. Available as a command-line program with many options, a shared library for Linux, and a Windows SAPI5 version.
    Downloads: 2,246 This Week
  • 2
    Buzz

    Transcribe and translate audio offline on your personal computer

    Buzz transcribes and translates audio to text offline using OpenAI's Whisper. Import audio and video files into Buzz and export the transcripts as TXT, SRT, or VTT files. Buzz supports Whisper, Whisper.cpp, Faster Whisper, Whisper-compatible models from the Hugging Face repository, and the OpenAI Whisper API.

    Linux versions are available from https://flathub.org/apps/io.github.chidiwilliams.Buzz and https://snapcraft.io/buzz. Home page: https://github.com/chidiwilliams/buzz

    Note for Windows: the app is not signed, so you will get a warning when you install it. Select More info -> Run anyway.
    Downloads: 2,863 This Week
  • 3
    WaveSurfer
    WaveSurfer is an open source tool for sound visualization and manipulation. Typical applications are speech/sound analysis and sound annotation/transcription. WaveSurfer may be extended by plug-ins as well as embedded in other applications.
    Downloads: 269 This Week
  • 4
    NoiseGator (Noise Gate)

    A simple noise gate app intended for use with VoIP applications such as Skype.

    Ever wanted to cut out background noise when talking with others on Skype? Now it's possible! NoiseGator is a lightweight noise gate application that routes audio from an audio input to an audio output. The audio level is analysed in real time: if the average level is higher than the threshold, the audio passes through as normal; if the average level goes below the threshold, the gate closes and the audio is cut. When used with a virtual audio cable, it can act as a noise gate for either a sound input (microphone) or a sound output (speakers). It can also be used to gate noise from your own mic or to play your microphone through your speakers.

    REQUIREMENTS:
    - Java 7 or higher for Windows.
    - Java 6 or higher for Mac. Java 7 recommended.
    - A virtual audio cable is required for use with VoIP applications. For Windows users I recommend the VB-Cable driver (http://vb-audio.pagesperso-orange.fr/Cable/index.htm). Mac users can use SoundFlower.
    Downloads: 486 This Week
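The gating rule in the NoiseGator description can be sketched in a few lines. This is a minimal illustration, not NoiseGator's actual code: the function name `noise_gate`, the fixed block size, and the use of the mean absolute sample value as the "average level" are all assumptions made for the example.

```python
def noise_gate(samples, threshold, window=4):
    """Return a gated copy of `samples` (hypothetical helper, not NoiseGator's API).

    The signal is processed in blocks of `window` samples. A block whose
    average level (assumed here: mean absolute value) is higher than
    `threshold` passes through unchanged; otherwise the gate closes and
    the block is replaced with silence.
    """
    gated = []
    for start in range(0, len(samples), window):
        block = samples[start:start + window]
        level = sum(abs(s) for s in block) / len(block)
        if level > threshold:
            gated.extend(block)                # gate open: pass audio through
        else:
            gated.extend([0.0] * len(block))   # gate closed: cut audio
    return gated


quiet_then_loud = [0.01, -0.02, 0.01, 0.0, 0.5, -0.6, 0.4, -0.5]
result = noise_gate(quiet_then_loud, threshold=0.1)
```

A real-time gate would usually also smooth the opening and closing (attack and release times) to avoid clicks; that is left out here to keep the threshold logic visible.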
  • 5
    RHVoice

    Free open source speech synthesizer for Russian and other languages

    RHVoice is a free and open-source multilingual speech synthesizer. Its developers hope to give more visually impaired people the ability to use a good free synthesis voice reading in their native language with their screen reader. We are especially interested in supporting those languages for which there are currently no good voices that could be used with a screen reader. The creator of RHVoice, Olga Yakovleva, is blind herself. Many of the contributors to the RHVoice project, both programmers and non-programmers, are blind or partially sighted.
    Downloads: 35 This Week
  • 6
    eGuideDog free software for the blind
    The eGuideDog project develops free software for the blind. Currently, we focus on WebSpeech, Ekho TTS and WebAnywhere.
    Downloads: 163 This Week
  • 7
    Mumble

    Low-latency, high quality voice chat for gamers

    Mumble is open source, low-latency, high-quality voice chat software primarily intended for use while gaming. It includes game linking, so voice from other players comes from the direction of their characters, and has echo cancellation so the sound from your loudspeakers won't be audible to other players.
    Downloads: 129 This Week
  • 8
    Open JTalk is a Japanese text-to-speech synthesis system. This software is released under the Modified BSD license.
    Downloads: 527 This Week
  • 9
    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained with machine learning techniques based on Baidu's Deep Speech research paper, and the project uses Google's TensorFlow to make the implementation easier. A pre-trained English model, along with other important inference material, can be downloaded from the DeepSpeech releases page by following the instructions in the usage docs.
    Downloads: 18 This Week
  • 10
    FreeTTS is a speech synthesis engine written entirely in the Java(tm) programming language. FreeTTS was written by the Sun Microsystems Laboratories Speech Team and is based on CMU's Flite engine. FreeTTS also includes a partial JSAPI 1.0
    Downloads: 236 This Week
  • 11
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. It can recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file, show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details), and listen to a microphone in the background, among other useful recognizer features. The easiest way to install it is with pip install SpeechRecognition. The library requires Python 2.6, 2.7, or Python 3.3+. PyAudio is required if and only if you want to use microphone input (Microphone); PyAudio version 0.2.11+ is required, as earlier versions have known memory management bugs when recording from microphones in certain situations. To hack on this library, first make sure you have all the requirements listed in the "Requirements" section.
    Downloads: 15 This Week
  • 12
    Simple TTS Reader

    A small clipboard reader

    Simple TTS Reader is a small utility that reads text from your clipboard using Microsoft Speech API. Whenever you copy any text, the app instantly converts it into spoken words. Select your preferred speech engine from those installed on your system, such as Microsoft Zira, and adjust speed and volume for personalized playback. The application can also be minimized to the system tray. Plus, it is free and comes with an intuitive interface that makes it accessible to everyone.
    Downloads: 98 This Week
  • 13
    MMDAgent is a toolkit for building voice interaction systems. Users can design their own dialog scenarios, 3D agents, and voices. This software is released under the Modified BSD license.
    Downloads: 99 This Week
  • 14
    hts_engine is software for synthesizing speech waveforms from HMMs trained by the HMM-based speech synthesis system (HTS). This software is released under the Modified BSD license.
    Downloads: 220 This Week
  • 15
    TTS

    Deep learning for text to speech

    TTS is a library for advanced Text-to-Speech generation. It is built on the latest research and was designed to achieve the best trade-off among ease of training, speed, and quality. TTS comes with pre-trained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects.
    - Released models in PyTorch, TensorFlow, and TFLite.
    - Tools to curate Text2Speech datasets under dataset_analysis.
    - Demo server for model testing.
    - Notebooks for extensive model benchmarking.
    - Modular (but not too much) code base enabling easy testing of new ideas.
    - Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech).
    - Speaker Encoder to compute speaker embeddings efficiently.
    - Vocoder models (MelGAN, Multiband-MelGAN, GAN-TTS, ParallelWaveGAN, WaveGrad, WaveRNN).
    If you are only interested in synthesizing speech with the released TTS models, installing from PyPI is the easiest option.
    Downloads: 6 This Week
  • 16
    SPTK is a suite of speech signal processing tools for UNIX environments, e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, vector quantization techniques, and other extended versions of them.
    Downloads: 19 This Week
  • 17
    A tool for segmenting, labeling, and transcribing speech.
    Downloads: 51 This Week
  • 18
    Voxal voice changer

    Transform your voice in real time with Voxal Voice Changer

    Voxal Voice Changer is a program that allows you to modify your voice by applying various effects (e.g. pitch change, echo, etc.) in real-time. Effects can be added in any sequence and in any combination, allowing you to distort your voice beyond recognition. Take your audio to the next level! Our powerful Voice Changer software lets you morph your voice in real-time with stunning AI-powered quality. Whether you're looking to have fun, protect your privacy, or create engaging content, we have the perfect voice for you. Audio can be captured from various sources, pre-listening is available, and the most popular audio formats are supported.
    Downloads: 45 This Week
  • 19
    The project provides a ready-to-use interface to the Julius CSR engine for a handicapped child who is not able to use the keyboard well. It integrates into X11 and Windows. Find out how you can help: http://simon-listens.org/index.php?support
    Downloads: 8 This Week
  • 20
    OpenOffice.org Export As DAISY
    odt2daisy is an OpenOffice.org Writer extension that exports documents to DAISY XML, Full DAISY (XML + audio), and Audiobook formats. DAISY is a NISO Z39.86 standard for blind, visually impaired, print-disabled, and learning-disabled people.
    Downloads: 14 This Week
  • 21
    TranscriberAG is designed for assisting the manual annotation of speech signals. It provides a user-friendly GUI for segmenting long duration speech recordings, transcribing them, labeling speech turns, topic changes and acoustic conditions.
    Downloads: 24 This Week
  • 22
    Audiobook Cutter is an easy-to-use tool which splits large speech MP3 files into smaller ones without re-encoding. The split points are determined by silent parts. The main purpose is to make audiobooks or podcasts more manageable in a user-friendly way.
    Downloads: 13 This Week
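Choosing split points from silent parts, as Audiobook Cutter is described as doing, might look roughly like the sketch below. It is not the tool's implementation: the function `find_split_points`, its parameters, and the idea of working on a precomputed list of per-frame loudness values are assumptions for illustration. Decoding the MP3 and cutting on frame boundaries without re-encoding are not shown.

```python
def find_split_points(levels, threshold=0.05, min_silence=3):
    """Return frame indices at which a recording could be split.

    `levels` is a list of per-frame loudness values (hypothetical input).
    A split point is placed in the middle of every run of at least
    `min_silence` consecutive frames quieter than `threshold`. Silence at
    the very start or end of the recording produces no split point.
    """
    points = []
    run_start = None  # index where the current silent run began, if any
    for i, level in enumerate(levels):
        if level < threshold:
            if run_start is None:
                run_start = i
        else:
            long_enough = run_start is not None and i - run_start >= min_silence
            if long_enough and run_start > 0:
                points.append((run_start + i) // 2)
            run_start = None
    return points


levels = [0.5, 0.6, 0.0, 0.01, 0.0, 0.0, 0.7, 0.4, 0.0, 0.5]
split_points = find_split_points(levels)
```

Here the four quiet frames at indices 2 to 5 yield one split point, while the single quiet frame at index 8 is too short to count as a pause.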
  • 23

    Sinsy

    HMM-based singing voice synthesis system

    Sinsy is an HMM-based singing voice synthesis system. This software is released under the Modified BSD license.
    Downloads: 14 This Week
  • 24
    Speech Recognition in English & Polish

    Speech recognition software for English & Polish languages

    Software for speech recognition in English & Polish languages.

    Basic versions of SkryBot:
    1. SkryBot Home Speech (English Language) - https://sourceforge.net/projects/skrybotdomowy/files/ReleasesEnglish/InstalatorSkryBotHomeSpeechDemo-2.6.9.18117.exe/download
    2. SkryBot DoMowy (Polish Language) - https://sourceforge.net/projects/skrybotdomowy/files/ReleasesPolish/InstalatorSkryBotDoMowyDemo-2.4.9.18117.exe/download

    More help: https://sourceforge.net/p/skrybotdomowy/wiki/

    Domain advanced versions (Polish Language):
    1. SkryBot Prawo - for judicial professionals.
    2. SkryBot Administracyjny - for civil and government administration.
    3. SkryBot Medycyna Rodzinna - for physicians.

    Professional version of SkryBot (commercial) offers you:
    1. Audio conversion and cutting sound files into smaller ones.
    2. Searching for words or phrases in sound files (recognized by SkryBot).
    3. Editing sound files and automatically cutting long silent parts out of audio files.
    Downloads: 9 This Week
  • 25
    GRANULE is a flashcards program based on Leitner cardfile methodology for learning new words. It features long-term memory training capabilities with scheduling, integrated pictures, sound, and full-screen mode.
    Downloads: 24 This Week
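For readers unfamiliar with the Leitner cardfile method that GRANULE is based on, here is a minimal sketch of the general idea. It does not reflect GRANULE's own code or settings: the `Card` class, the five boxes, and the review intervals in `INTERVALS` are hypothetical choices for the example.

```python
from dataclasses import dataclass

# Hypothetical review intervals in days for boxes 1..5.
INTERVALS = (1, 2, 4, 8, 16)


@dataclass
class Card:
    front: str
    back: str
    box: int = 1      # every new card starts in the first box
    due_day: int = 0  # day number on which the card is next due


def review(card, correct, today):
    """Apply the Leitner rule: promote on success, demote to box 1 on failure."""
    if correct:
        card.box = min(card.box + 1, len(INTERVALS))
    else:
        card.box = 1
    card.due_day = today + INTERVALS[card.box - 1]
    return card


def due_cards(cards, today):
    """Return the cards scheduled for review on or before `today`."""
    return [card for card in cards if card.due_day <= today]


card = Card("Hund", "dog")
review(card, correct=True, today=0)   # moves to box 2, due on day 2
review(card, correct=True, today=2)   # moves to box 3, due on day 6
review(card, correct=False, today=6)  # back to box 1, due on day 7
```

Well-known cards drift into higher boxes and are shown less often, while a missed card returns to the first box and comes up again soon, which is what gives the method its long-term memory training effect.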