Search Results for "open source speech to text software"

Showing 24 open source projects for "open source speech to text software"

1

PersonaPlex

PersonaPlex code

PersonaPlex is an open-source real-time conversational speech AI model that goes beyond traditional text chat by providing full-duplex speech-to-speech interaction, meaning it can listen and talk at the same time instead of waiting for you to finish speaking before responding. This architectural approach eliminates awkward pauses and makes conversations feel much more human-like, with natural behaviors such as overlapping speech, interruptions, and fluent turn-taking, traits that traditional AI assistants typically lack. ...

Downloads: 3 This Week

Last Update: 2026-03-02
See Project
2

Moshi

A speech-text foundation model for real time dialogue

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi processes 24 kHz audio, down to a 12.5 Hz representation with a bandwidth of 1.1 kbps, in a fully streaming manner (latency of 80ms, the frame size), yet performs better than existing, non-streaming, codecs like SpeechTokenizer (50 Hz, 4kbps), or SemantiCodec (50 Hz, 1.3kbps). Moshi models two streams of audio: one corresponds to Moshi, and...

Downloads: 1 This Week

Last Update: 2024-11-05
See Project
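
The Mimi figures quoted in the Moshi description above can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `codec_frame_stats` and its field names are hypothetical and are not part of Moshi or Mimi. It derives the frame duration, the number of input samples per frame, and the bits spent per frame from the quoted frame rate and bitrate. The 24 kHz input sample rate is stated only for Mimi; applying it to the other two codecs is an assumption made purely for comparison.

```python
def codec_frame_stats(sample_rate_hz, frame_rate_hz, bitrate_bps):
    """Derive per-frame quantities from a codec's headline numbers.

    Hypothetical helper for illustration; not an API of Moshi or Mimi.
    """
    return {
        # Duration of one frame, which is also the minimum streaming latency.
        "frame_ms": 1000 / frame_rate_hz,
        # Raw audio samples summarized by a single frame.
        "samples_per_frame": sample_rate_hz / frame_rate_hz,
        # Bits available to encode each frame at the quoted bitrate.
        "bits_per_frame": bitrate_bps / frame_rate_hz,
    }


# Figures as quoted in the description (kbps converted to bits per second).
mimi = codec_frame_stats(24_000, 12.5, 1_100)
# Assumption: 24 kHz input is used here only to make the comparison uniform.
speech_tokenizer = codec_frame_stats(24_000, 50, 4_000)
semanticodec = codec_frame_stats(24_000, 50, 1_300)

for name, stats in [("Mimi", mimi),
                    ("SpeechTokenizer", speech_tokenizer),
                    ("SemantiCodec", semanticodec)]:
    print(name, stats)
```

For Mimi this gives an 80 ms frame, matching the stated latency, with 1920 input samples and 88 bits per frame.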
3

SPPAS

SPPAS - the automatic annotation and analyses of speech

SPPAS is a scientific computer software package written and maintained by Brigitte Bigi of the Laboratoire Parole et Langage, in Aix-en-Provence, France. Available for free, with open source code, there is simply no other package as simple for linguists to use for the automatic annotation of speech, the analysis of any kind of annotated data, and the conversion of annotated files.

Downloads: 20 This Week

Last Update: 2026-04-06
See Project
4

DeepSpeech

Open source embedded speech-to-text engine

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier.

Downloads: 3 This Week

Last Update: 2021-04-08
See Project
5

Defox text to speech and downloader

Reads written or imported text offline, or downloads the speech online.

This software is designed to convert text to speech and download the converted speech. Description: • Installation setup in two languages (English, French) • Two areas, text reading and speech downloading • Many languages supported in the download center. Note 1: I'm still a student and I'm not in the software design industry, so my software development skills may be limited.

1 Review

Downloads: 0 This Week

Last Update: 2019-09-27
See Project
6

Vapp IVR framework

A Python library to create sophisticated multilingual IVR applications. NOTICE: The repository is frozen; please find the latest version of the software at https://github.com/sippy/vapp

Downloads: 0 This Week

Last Update: 2015-04-14
See Project
7

Steel TTS

A cross-platform wrapper for common text-to-speech engines in Python

Steel is a cross-platform package for using common text-to-speech (speech synthesis) engines in Python. Steel currently supports the following TTS software:
- Microsoft Speech API 5 (SAPI5)
- eSpeak
- NS Speech Synthesis
- FreeTTS

Documentation: http://sourceforge.net/p/steeltts/wiki/
Bug Tracker: http://sourceforge.net/p/steeltts/tickets/

If you are interested in contributing to the Steel TTS codebase, or would like to make a feature-request, please contact the lead...

Downloads: 0 This Week

Last Update: 2016-03-15
See Project
8

AarTon

AarTon is an automated text-to-speech application. It allows users to enter text in a web-based front end and renders this text via a multi-channel sound card.

Downloads: 0 This Week

Last Update: 2013-11-14
See Project
9

RNNLIB

RNNLIB is a recurrent neural network library for sequence learning problems. Applicable to most types of spatiotemporal data, it has proven particularly effective for speech and handwriting recognition. Full installation and usage instructions are given at http://sourceforge.net/p/rnnl/wiki/Home/

2 Reviews

Downloads: 0 This Week

Last Update: 2016-11-28
See Project
10

Speect

...Speect is free and open source software. As a collection it is distributed under an MIT license.

Downloads: 0 This Week

Last Update: 2013-05-30
See Project
11

VoiceCode Programming by Voice Toolbox

VoiceCode is an Open Source initiative started by the National Research Council of Canada to develop a programming-by-voice toolbox. The aim of the project is to make programming through voice input as easy and productive as with mouse and keyboard. To install, use Subversion, as described on this page: http://sourceforge.net/apps/mediawiki/voicecode/index.php?title=VCode_1_Doc/InstallationManual.

5 Reviews

Downloads: 0 This Week

Last Update: 2013-03-10
See Project
12

VEDICS

VEDICS (Voice Enabled Desktop Interaction and Control System) is assistive software that lets the user interact with the OS using voice commands. With this software, the user can access any element found on the screen.

1 Review

Downloads: 0 This Week

Last Update: 2013-05-28
See Project
13

QWave

QWave: Qt-based waveform display and audio playback class library.

Downloads: 0 This Week

Last Update: 2013-05-01
See Project
14

uListen

uListen is a TTS (text-to-speech) application. It can read aloud web pages, CHM files, PDF files, Word files, and plain text files.

Downloads: 0 This Week

Last Update: 2013-04-05
See Project
15

Annotation Graph Toolkit

AGTK is a suite of software components for building tools for annotating linguistic signals, time-series data which documents any kind of linguistic behavior (e.g. audio, video). The internal data structures are based on annotation graphs.

Downloads: 1 This Week

Last Update: 2013-04-25
See Project
16

Fala - A simple text reader

Simple software that speaks a text. You can type the text or point it to a file. Fala is just a front end to Festival. It's designed for GNOME, but if you have GTK, Python, and Festival, you are able to run it.

1 Review

Downloads: 0 This Week

Last Update: 2015-09-29
See Project
17

ftw. Text Modeller

Software to fit whole-sentence language models using the principle of maximum entropy. For developers of speech recognizers, text prediction interfaces, OCR, machine translation software.

Downloads: 0 This Week

Last Update: 2013-03-20
See Project
18

DJBorg

DJBorg turns your MP3 playlist into a personalized radio station, adding randomly-generated DJ banter between tracks. Song information (based on ID3 tags), news, weather, and headlines are announced via a text-to-speech engine.

Downloads: 0 This Week

Last Update: 2013-03-22
See Project
19

PhoneBlogger

PhoneBlogger allows you to post to a weblog by phone. PhoneBlogger is written in VoiceXML, Python, and JavaScript.

Downloads: 0 This Week

Last Update: 2016-08-20
See Project
20

Python Gutenberg E-text Project

The PyGE (Python Gutenberg E-text) project is a suite of GUI desktop utilities written in Python to promote and facilitate awareness and enjoyment of works of literature that are available from the archives of Project Gutenberg.

Downloads: 0 This Week

Last Update: 2013-03-22
See Project
21

SoccerPhone

SoccerPhone provides live soccer scores by phone. The only league currently supported is US Major League Soccer. Support for Soccernet is under development. SoccerPhone is written in VoiceXML, Python, and JavaScript.

1 Review

Downloads: 0 This Week

Last Update: 2013-02-25
See Project
22

Sayz Me

Sayz Me is a text-to-speech application for Windows. Text can be typed in or read from clipboard. Words are highlighted when spoken. Select voice, adjust reading speed, voice pitch, font and color. Simple and easy to use.

2 Reviews

Downloads: 0 This Week

Last Update: 2013-04-11
See Project
23

Open Interface for Speech Synthesis

The Open Interface for Speech Synthesis (OISS) provides an interface to speech synthesis hardware and software for end-user applications under Unix.

Downloads: 0 This Week

Last Update: 2013-02-21
See Project
24

wxVoiceModem

This project is intended for users who want to get more out of the voice modem they may have. Why another modem project? Looking for good-quality software for voice communication through the modem, I could find only Win32-based options. Now there is one for Linux :)

Downloads: 0 This Week

Last Update: 2016-03-08
See Project

© 2026 Slashdot Media. All Rights Reserved.
