1.) Investigation with cosine transform, and anti transform algorithm, with some voice recognition code. 2.) Translator: Croatian, English. 3.) 2D to 3D picture algorithm (principle) and new 2Dto3D video conversion code with AviSynth video scripting
Cairo sets out to provide an enterprise grade, MRCPv2 compliant speech solution utilizing existing opensource speech resources such as FreeTTS and Sphinx-4.
This is a fast C implementation of Arturo Camacho's SWIPE' pitch extraction algorithm. See the project homepage for more about the advantages of the SWIPE' algorithm. swipe-1.0.tar.gz contains the current source, which should compile quite neatly.
FreeTTS is a speech synthesis engine written entirely in the
Java(tm) programming language. FreeTTS was written by the Sun Microsystems Laboratories Speech Team and is based on CMU's
Flite engine. FreeTTS also includes a partial JSAPI 1.0
XVoice provides voice control of X applications using IBM's ViaVoice for Linux (free download at their web site). Both user-defined commands and dictation are supported. It can be used to write letters, write code, control netscape, etc.
Scalable Language API (SLAPI) The most comprehensive architecture for conversational natural-language applications including speech recognition/synthesis, semantics, & machine translation. Used on Android & other mobile app platforms.
Adds the ability to navigate within and between pages to the standard WIkipedia interface. The project won Best-in-contest for the AVIOS Speech Application contest in 2010.
Asterisk Dialplan application, which allows you to use Lia_Phon and Mbrola as a French speech synthesizer. Application du plan de numérotation d'Asterisk, qui permet d'utiliser Lia_Phon et Mbrola comme synthétiseur vocal français sous Asterisk.
'Text to Voice' or 'Text to Speech' is 1 of the coolest Firefox add-ons. It gives ur brwsr the pwr of speech. Select txt, clck the bttn on the bttm rite & this add-on spks the selectd txt 4 u. Isn't it brllant? Moreovr odio file cn b dnloaded as
Speech Made Visible is an experiment in showing some of the qualities of speech in printed text. Analyze a recording for attributes like pitch, intensity (loudness), and speed; then style the words in a transcript to suggest those characteristics.
Performs actions on detected volume threshold Examples : - Launch music on clap - Launch speech recording when you start speaking - Launch guard webcam when a significant sound is detected - Increase or decrease headphones volume when ambient noise pass
audacity-extra now provides a sleek dark themed version of the Audacity opensource sound editor. The project experiments with Audacity variations. There's a vowel-sound target-practice display for language learners and an analog waveform data logger for embedded systems.
Linux on Sound is a project to create a transparent console interface over eSpeak (http://espeak.sourceforge.net/), to increase linux console's accessibility.
Webvoice is a text to speech cgi program. You can embed a link in a html page to send things you want to say, via sound. No software is required on the client side. Festival and sox are needed on the server. Webvoice has its own interface (if needed).
A plugin for pidgin that interfaces with the popular program festival. It allows for instant messages to be spoken by festival so you can hear it through your speakers.
A collection of tools for generating audio and visual (PNG/HTML/WAVE) for use in web sites including CAPTCHA challenges and PNG image creation tools with Javascript mouse tracking support.
ASR-Builder provides an easy-to-use interface to the HTK toolkit, that allows users to build ASR systems. ASR-Builder provides a platform that performs house-keeping tasks when using HTK and also provides default training/testing/recognition scripts.
A text to speech converter which will be able to read any document(Presently it is reading text and .doc files).The main aim of the project is to make reading an interesting task and assist BLIND people.
Speech based User Interface Components Library for Java is a project to create Java controls and applications that can be used not only by literate people but also by non-literates. Speech and visual element with minimal text is used to create components
BladeWareVXML is a portable VoiceXML 2.1 interpreter that is an enhanced version (performance, usability and integration) of OpenVXI. A commercial version, with documentation, sample code, and support options, is available from the Commetrex Website.