Low-latency, high-quality voice chat for gamers
Mumble is open source, low-latency, high-quality voice chat software primarily intended for use while gaming. It includes game linking, so voice from other players comes from the direction of their characters, and has echo cancellation so the sound from your loudspeakers won't be audible to other players.
SPTK is a suite of speech signal processing tools for UNIX environments, e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, vector quantization techniques, and other extended versions of them.
WaveSurfer is an open source tool for sound visualization and manipulation. Typical applications are speech/sound analysis and sound annotation/transcription. WaveSurfer may be extended by plug-ins as well as embedded in other applications.
EMU is a collection of software tools for the creation, manipulation and analysis of speech databases. At the core of EMU is a database search engine which allows queries based on the sequential and hierarchical structure of the annotations.
Simple TTS Reader is a small clipboard reader. Simply copy any text, and it will be read aloud. You can choose any installed speech engine, e.g. Microsoft Anna. This text-to-speech utility can also be minimized to tray. Requires .NET Framework 2.0.
Implementation of a duration high-order hidden Markov model in Matlab.
Implementation of a duration high-order hidden Markov model (DHO-HMM) in Matlab, with an application in speech recognition.
MMDAgent is a toolkit for building voice interaction systems. Users can design their own dialog scenarios, 3D agents, and voices. This software is released under the Modified BSD license.
MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
A speech recognition system using Matlab/Simulink/Stateflow.
This project provides a hidden Markov model speech recognition system built with Matlab/Simulink/Stateflow.
A speech synthesis and recognition library that is cross-platform, accessible from Java and C++, and has a very small API. Uses CMU Sphinx4 and FreeTTS internally.
FreeTTS is a speech synthesis engine written entirely in the Java(tm) programming language. FreeTTS was written by the Sun Microsystems Laboratories Speech Team and is based on CMU's Flite engine. FreeTTS also includes a partial JSAPI 1.0
Sayz Me is a text-to-speech application for Windows. Text can be typed in or read from clipboard. Words are highlighted when spoken. Select voice, adjust reading speed, voice pitch, font and color. Simple and easy to use.
A patent-free audio codec designed especially for voice signals (unlike Vorbis, which targets general audio), providing good narrowband and wideband quality. This project is complementary to the Ogg Vorbis codec.
Implementation of a Media Resource Control Protocol (MRCP) client. Supports ASR and TTS functionality. Design pattern implementation. Includes documentation, a sample application, and library source code.
Speech Made Visible is an experiment in showing some of the qualities of speech in printed text. Analyze a recording for attributes like pitch, intensity (loudness), and speed; then style the words in a transcript to suggest those characteristics.
HMM-based singing voice synthesis system
Sinsy is an HMM-based singing voice synthesis system. This software is released under the Modified BSD license.
TclSpeech is an extension package for Tcl, written in C, that gives Mac OS Classic and Mac OS X users access to Apple's Speech Manager through scripting in Tcl.
OC Volume is a speech recognition engine written in Java for integration with other applications. It is currently a user-dependent isolated-word recognizer and can be expanded to include more recognition capabilities.
This is an open machine translation system that can translate any language pair.
A system intended to act as a prototype for a future CSS3/(X)HTML web browser that renders via computer-generated speech, instead of visually.
Howe is a spoken dialogue system project designed for browsing instructions. The project serves as a base for research on new dialogue system capabilities.
SAPI Lipsync (phoneme alignment) C++ Software
TTS-Cubed is not being developed any more. Please see Speect at http://speect.sourceforge.net
Speak clipboard is a Windows application written in C# that reads the content of the computer clipboard aloud with a natural voice.