speech free download - SourceForge

195 projects for "speech" with 2 filters applied:

Filters applied: Multimedia, BSD

1

eGuideDog free software for the blind

eGuideDog project develops free software for the blind. Currently, we focus on WebSpeech, Ekho TTS and WebAnywhere.

16 Reviews

Downloads: 227 This Week

Last Update: 10 hours ago
See Project
2

opencore-amr

Audio codecs extracted from Android Open Source Project

Library of OpenCORE Framework implementation of Adaptive Multi Rate Narrowband and Wideband (AMR-NB and AMR-WB) speech codec. Library of VisualOn implementation of Adaptive Multi Rate Wideband (AMR-WB) encoder and Advanced Audio Coding (AAC) encoder. Modified library of Fraunhofer AAC decoder and encoder.

19 Reviews

Downloads: 5,580 This Week

Last Update: 2025-08-21
See Project
3

Google2SRT

Download, save and convert multiple subtitles from YouTube videos

Google2SRT allows you to download, save and convert multiple subtitles and translations from YouTube and Google Video to SubRip (.srt) format, which is recognized by most video players. You can download XML subtitles or simply type the video's URL, and Google2SRT will do the rest.

34 Reviews

Downloads: 83 This Week

Last Update: 2025-01-11
See Project
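
As a rough illustration of the SubRip (.srt) target format mentioned above, the following is a minimal sketch, not Google2SRT's own code. The function names `srt_timestamp` and `to_srt` and the `(start, end, text)` cue tuples are assumptions made for this example. Each SubRip cue is a sequence number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, and the text, with cues separated by a blank line.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a SubRip timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render (start_seconds, end_seconds, text) tuples as SubRip text.

    The tuple layout is a hypothetical input format chosen for this sketch.
    """
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 1.5, "Hello"), (2.0, 4.25, "World")]))
```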
4

sourcesinc

Source code from the Research Institute for Signals, Systems and Computational Intelligence http://fich.unl.edu.ar/sinc

Downloads: 17 This Week

Last Update: 2023-12-05
See Project
5

Speech Signal Processing Toolkit (SPTK)

SPTK is a suite of speech signal processing tools for UNIX environments, e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, vector quantization techniques, and other extended versions of them.

9 Reviews

Downloads: 8 This Week

Last Update: 2023-05-10
See Project
6

AhoTTS - TTS for Basque and Spanish

Text-to-Speech for Basque and Spanish

Text-to-speech converter for Basque and Spanish. It includes linguistic processing and built voices for both languages. Its acoustic engine is based on hts_engine and it uses a high-quality vocoder called AhoCoder. Developed by Aholab Signal Processing Laboratory: https://aholab.ehu.es/aholab/ http://aholab.ehu.es/ahocoder/

1 Review

Downloads: 2 This Week

Last Update: 2022-05-03
See Project
7

VAD

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM

This repository is a voice activity detection (VAD) toolkit that implements multiple models (DNN, bDNN, LSTM, ACAM) for detecting speech versus non-speech in audio. It also provides a dataset recorded in varied real-world settings (e.g. bus stop, construction site, park, room) with ground-truth labeling. It includes acoustic feature extraction (multi-resolution cochleagram, MRCG) and post-processing modules (e.g. smoothing, thresholds). The toolkit supports both MATLAB and Python/TensorFlow components (for feature extraction, classification, postprocessing). ...

Downloads: 0 This Week

Last Update: 2025-10-02
See Project
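
To illustrate what threshold-and-smoothing post-processing of frame-level VAD output can look like, here is a minimal sketch. It is not the toolkit's implementation: the function name `smooth_vad`, the default threshold of 0.5, and the "hangover" smoothing scheme are all assumptions chosen for this example. Frames at or above the threshold are marked as speech, and the speech label is held for a few frames afterwards so that short dips do not split an utterance.

```python
def smooth_vad(probs, threshold=0.5, hangover=2):
    """Turn per-frame speech probabilities into 0/1 labels.

    A frame is speech if its probability is >= threshold. After the last
    such frame, the speech label is held for `hangover` more frames.
    Both parameters are illustrative defaults, not values from the toolkit.
    """
    labels = []
    remaining = 0  # hangover frames still to be labeled as speech
    for p in probs:
        if p >= threshold:
            remaining = hangover
            labels.append(1)
        elif remaining > 0:
            remaining -= 1
            labels.append(1)
        else:
            labels.append(0)
    return labels


if __name__ == "__main__":
    print(smooth_vad([0.9, 0.2, 0.1, 0.1, 0.8]))
```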
8

OpenOffice.org Export As DAISY

odt2daisy is an OpenOffice.org Writer extension that exports documents to DAISY XML, Full DAISY (XML + audio) and Audiobook formats. DAISY is an NISO Z39.86 standard for blind, visually impaired, print-disabled, and learning-disabled people.

3 Reviews

Downloads: 8 This Week

Last Update: 2020-12-07
See Project
9

chatbot_chung

chatbot chung is a simple entertainment chatbot based on a keyword probabilities algorithm, with 3D talking OpenGL avatars, written in FreeBASIC. It can import AIML simple question/answer, question/random/answers, or single star/multi srai data saved from the "AIML_chung" open source application. An online HTML5 JavaScript version with automatic detection of 44 languages is available on the website (source included in the zip file). A SORT gentext text generation algorithm option has been added (desktop version).

Downloads: 0 This Week

Last Update: 2020-06-27
See Project
10

SmartBody

Character animation system for games and simulations.

...) * Steering - avoiding obstacles and moving objects * Object manipulation - reach, grasp, touch, pick up objects * Lip syncing - characters can speak with simultaneous lip-sync using text-to-speech or prerecorded audio * Gazing - robust gazing behavior that incorporates various parts of the body * Nonverbal behavior - gesturing, head nodding and shaking, eye saccades - Online and offline retargeting of motion - Automatic skinning and rigging. SmartBody is written in C++ and can be incorporated into most game engines. ...

4 Reviews

Downloads: 1 This Week

Last Update: 2020-06-12
See Project
11

Regulus

Regulus is a Prolog-based toolkit for building spoken dialogue systems.

Downloads: 0 This Week

Last Update: 2020-05-20
See Project
12

AhoTTS Multilingual, a Multilingual TTS

Text-to-Speech TTS for Basque, Spanish, Catalan, Galician and English

Text-to-speech converter for Basque, Spanish, Catalan, Galician and English. It includes linguistic processing and built voices for all of these languages. Its acoustic engine is based on hts_engine and it uses a high-quality vocoder called AhoCoder. Developed by Aholab Signal Processing Laboratory: https://aholab.ehu.es/aholab/ http://aholab.ehu.es/ahocoder/

1 Review

Downloads: 0 This Week

Last Update: 2019-11-29
See Project
13

Mumble

Low-latency, high quality voice chat for gamers

Mumble is an open source, low-latency, high quality voice chat software primarily intended for use while gaming. It includes game linking, so voice from other players comes from the direction of their characters, and has echo cancellation so the sound from your loudspeakers won't be audible to other players.

169 Reviews

Downloads: 72 This Week

Last Update: 2022-01-22
See Project
14

FolioReaderKit

A Swift ePub reader and parser framework for iOS

...Go to your storyboard file and choose or create the view controller that should present the epub reader. In the identity inspector, set StoryboardFolioReaderContrainer as the class. Features include media overlays (syncing text rendering with audio playback), TTS (text-to-speech) support, epub cover image parsing, RTL support, vertical and/or horizontal scrolling, sharing of custom image quotes, and support for multiple instances at the same time, such as parallel reading.

Downloads: 0 This Week

Last Update: 2023-06-07
See Project
15

ILA - teachable voice assistant

ILA is a fully customizable and teachable voice assistant for Java

...It is designed to integrate with your home environment and, for example, to build your own free and open Amazon Echo replacement ;-) Right now the key components of ILA are the open source speech recognition CMU Sphinx-4, Google (Speech Recognition/Text-To-Speech) and MaryTTS (Text-To-Speech). The goal is to make ILA completely free of Google by improving all aspects of the open source systems. Since version 3.3, users can also write their own add-ons to extend ILA. ILA's successor is the SEPIA Framework: https://sepia-framework.github.io/ Hope you enjoy ILA - Florian

4 Reviews

Downloads: 1 This Week

Last Update: 2018-07-23
See Project
16

eSpeak: speech synthesis

Text to Speech engine for English and many other languages. Compact size with clear but artificial pronunciation. Available as a command-line program with many options, a shared library for Linux, and a Windows SAPI5 version.

40 Reviews

Downloads: 2,348 This Week

Last Update: 2021-11-17
See Project
17

Transcriber

a tool for segmenting, labeling and transcribing speech

3 Reviews

Downloads: 40 This Week

Last Update: 2017-03-01
See Project
18

srt-translator

Subtitle translator from one natural language to another.

Translates subtitles in SubRip format from one natural language to another. It is based on Google Translate without the API and therefore without payment. The translator has automatic and manual spell checkers.

Downloads: 17 This Week

Last Update: 2016-07-19
See Project
19

emofilt - emotional speech synthesis

PROJECT DEVELOPMENT MOVED TO GITHUB! EmoFilt enables the free-for-non-commercial-use speech synthesis engine MBROLA to sound emotional by manipulating the phonetic description. It does so by modifying melody and rhythm of the speech, matching a target emotion. It is available for 34 languag

3 Reviews

Downloads: 0 This Week

Last Update: 2017-07-06
See Project
20

read_chung

read chung is a small txt reader with multilingual TTS (text-to-speech) voices from responsivevoice and yandextranslate and an animated 3D face avatar. It is written in HTML5 and JavaScript and uses jsc3D.

1 Review

Downloads: 0 This Week

Last Update: 2016-02-16
See Project
21

apr

Swiss knife Java library

Downloads: 0 This Week

Last Update: 2016-03-14
See Project
22

Modular Audio Recognition Framework

MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.

3 Reviews

Downloads: 1 This Week

Last Update: 2015-10-06
See Project
23

Jampal mp3 library

mp3 library, advanced ID3V1 and ID3V2 tagger, player. Organize a large mp3 library, over 40,000 songs. Speech synthesis and tag backup utilities. Scripts to maintain and organize song files.

Downloads: 1 This Week

Last Update: 2015-07-26
See Project
24

eNTranslator

To aid translation of satsangs of Paramhamsa Nithyananda

To aid translation of satsangs of Paramhamsa Nithyananda. Can be used for general purposes by others as well. This translator desktop app uses Google Translate to translate English text. The auto-generated translations are then enriched with human alterations using an easy graphical user interface. Timestamp information may be synced, and a subtitle file or a simple textual output may be generated. Additionally it is planned to use google voice tools to also add voice over from these...

Downloads: 0 This Week

Last Update: 2016-06-24
See Project
25

Accelerated Feature Extraction Tool

A fast GPU accelerated feature extraction software for speech analysis

A fast feature extraction software tool for speech analysis and processing. It incorporates standard MFCC, PLP, and TRAPS features. The tool is specially designed to process very large audio data sets. It uses GPU acceleration if a compatible GPU is available (CUDA as well as OpenCL; NVIDIA, AMD, and Intel GPUs are supported). The CPU SSE intrinsic instruction set is used when no compatible GPU is present.

1 Review

Downloads: 0 This Week

Last Update: 2015-05-25
See Project
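
For readers unfamiliar with MFCC features, one early step is to place filter bank centers evenly on the mel scale. The sketch below shows that step only, using the widely used conversion mel = 2595 * log10(1 + f / 700). It is not taken from the Accelerated Feature Extraction Tool, whose exact filter bank design is not stated here, and the function names and parameters are assumptions for illustration.

```python
import math


def hz_to_mel(hz: float) -> float:
    """Convert a frequency in Hz to mels (common 2595 * log10 formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float):
    """Return n_filters center frequencies (Hz), evenly spaced in mels.

    The centers lie strictly between f_min and f_max; the two band edges
    themselves are not included.
    """
    low, high = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (high - low) / (n_filters + 1)
    return [mel_to_hz(low + step * i) for i in range(1, n_filters + 1)]


if __name__ == "__main__":
    for f in mel_center_frequencies(10, 0.0, 8000.0):
        print(round(f, 1))
```

Because the spacing is uniform in mels, the centers are denser at low frequencies and sparser at high frequencies when viewed in Hz.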