SPTK is a suite of speech signal processing tools for UNIX environments, e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, vector quantization techniques, and other extended versions of them.
MMDAgent is a toolkit for building voice interaction systems. Users can design their own dialog scenarios, 3D agents, and voices. This software is released under the Modified BSD license.
WaveSurfer is an open source tool for sound visualization and manipulation. Typical applications are speech/sound analysis and sound annotation/transcription. WaveSurfer may be extended by plug-ins as well as embedded in other applications.
FreeTTS is a speech synthesis engine written entirely in the Java(tm) programming language. FreeTTS was written by the Sun Microsystems Laboratories Speech Team and is based on CMU's Flite engine. FreeTTS also includes a partial JSAPI 1.0
Liepa TTS Android engine
TTS engine for Lithuanian speech synthesis, based on the LIEPA project (https://www.xn--ratija-ckb.lt/liepa/infrastrukturines-paslaugos/elektroninio-teksto-skaitytuvas/7563)
EMU is a collection of software tools for the creation, manipulation and analysis of speech databases. At the core of EMU is a database search engine which allows queries based on the sequential and hierarchical structure of the annotations.
Open JTalk is a Japanese text-to-speech synthesis system. This software is released under the Modified BSD license.
hts_engine is software to synthesize speech waveform from HMMs trained by the HMM-based speech synthesis system (HTS). This software is released under the Modified BSD license.
A model for a web journal narrated in audio, for digital inclusion purposes.
A speech recognition system using Matlab/Simulink/Stateflow.
This project provides a hidden Markov model speech recognition system built with Matlab/Simulink/Stateflow.
A phrase-to-phoneme-code converter for the SpeakJet chip by Magnevation. Speakalator runs on Unix-like operating systems.
HMM-based singing voice synthesis system
Sinsy is an HMM-based singing voice synthesis system. This software is released under the Modified BSD license.
PHP-based viewer for voice servers such as Mumble.
A project that aims to convert a text page directly into MP3 or another audio format using the MBrola libraries.
A system intended to act as a prototype for a future CSS3/(X)HTML web browser that renders via computer-generated speech, instead of visually.
Create WAV files for video character speech by typing in dialogue
Choose from the available "voices" and type in what you want the computer to say. A wave file called sounds.wav is saved to the output subfolder. Output is intended primarily for users who need speech for animated characters in videos.
MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
This program converts text files with a text-to-speech (TTS) engine into audio files. Supports SAPI5 voices and MP3 output.
TTS-Cubed is not being developed any more. Please see Speect at http://speect.sourceforge.net
Implementation of a Media Resource Control Protocol (MRCP) client. Supports ASR and TTS functionality. Design pattern implementation. Documentation, sample application and library source code.
Low-latency, high quality voice chat for gamers
Mumble is open source, low-latency, high-quality voice chat software primarily intended for use while gaming. It includes game linking, so voice from other players comes from the direction of their characters, and has echo cancellation so the sound from your loudspeakers won't be audible to other players.
SAPI Lipsync (phoneme alignment) C++ Software
Speech Made Visible is an experiment in showing some of the qualities of speech in printed text. Analyze a recording for attributes like pitch, intensity (loudness), and speed; then style the words in a transcript to suggest those characteristics.
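The styling step described above could look roughly like the following Python sketch. It is only an illustration, not the project's actual code: the `Word` record, the value ranges, and the choice of CSS properties (baseline shift for pitch, font weight for intensity, letter spacing for speed) are all assumptions made here.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Word:
    """Hypothetical per-word measurements taken from a recording."""
    text: str
    pitch_hz: float      # mean fundamental frequency over the word
    intensity_db: float  # mean loudness over the word
    duration_s: float    # time taken to say the word


def scale(value, low, high):
    """Map value from [low, high] onto [0, 1], clamping out-of-range input."""
    if high <= low:
        return 0.5
    return min(1.0, max(0.0, (value - low) / (high - low)))


def style_word(word, pitch_range=(80.0, 300.0), db_range=(40.0, 80.0)):
    """Return an HTML span whose style hints at pitch, loudness and speed."""
    pitch = scale(word.pitch_hz, *pitch_range)
    loud = scale(word.intensity_db, *db_range)
    # Assumed mapping: higher pitch raises the baseline, louder speech gets
    # a heavier weight, slower speech (more seconds per character) gets
    # wider letter spacing.
    rise_em = round(pitch * 0.6 - 0.3, 2)
    weight = 300 + int(round(loud * 6)) * 100
    sec_per_char = word.duration_s / max(1, len(word.text))
    spacing_em = round(min(0.5, sec_per_char), 2)
    css = (f"position:relative;bottom:{rise_em}em;"
           f"font-weight:{weight};letter-spacing:{spacing_em}em")
    return f'<span style="{css}">{escape(word.text)}</span>'


def style_transcript(words):
    """Join the styled spans of all words into one HTML fragment."""
    return " ".join(style_word(w) for w in words)
```

Calling `style_transcript` on a list of `Word` records yields an HTML fragment that can be placed in a page; the pitch and intensity ranges would need tuning per speaker.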
Implementation of duration high-order hidden Markov model in Matlab.
Implementation of duration high-order hidden Markov model (DHO-HMM) in Matlab with application in speech recognition.
Extensible Gaussian Mixture Model
The main goal is to provide a common framework for building Gaussian Mixture Model (GMM) systems. The project aims to create a common, extensible GMM that can evolve and incorporate acceleration techniques like OpenCL, as well as methods like MMI, MAP, SVM and so on.
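For readers unfamiliar with GMMs, the core computation such a framework has to offer is the likelihood of a feature vector under the mixture. The Python sketch below shows that one step for diagonal covariances. It is not taken from the project; the function names and the diagonal-covariance restriction are assumptions made for illustration.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, weights, means, variances):
    """Log p(x) = log sum_k w_k N(x; mean_k, var_k), computed stably."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Log-sum-exp: subtract the largest term before exponentiating so that
    # very small likelihoods do not underflow to zero.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))
```

Training methods such as MAP or MMI, and accelerators such as OpenCL, would sit on top of or replace this evaluation step; none of that is shown here.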