speech recognition matlab free download

Showing 12 open source projects for "speech recognition matlab"

1

SpeechRecognition

Speech recognition module for Python

Library for performing speech recognition, with support for several engines and APIs, online and offline. It can recognize speech input from the microphone, transcribe an audio file, and save audio data to an audio file. It can also show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details), and listen to a microphone in the background, along with various other useful recognizer features. The easiest way to install this is using pip...

Downloads: 25 This Week

Last Update: 2024-05-05
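
The energy-threshold calibration mentioned in the SpeechRecognition description can be illustrated with a rough, standard-library-only sketch. This is not the library's implementation: the function names `rms` and `calibrate_threshold`, and the `initial`, `damping`, and `ratio` values, are hypothetical and chosen only for illustration. The idea is that audio frames louder than a threshold count as speech, and the threshold drifts toward a multiple of the measured ambient energy.

```python
import math


def rms(samples):
    """Root-mean-square energy of a frame of PCM sample values."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate_threshold(frames, initial=300.0, damping=0.85, ratio=1.5):
    """Move a speech/silence threshold toward ratio * ambient energy.

    All parameter names and defaults are illustrative assumptions,
    not values taken from any particular library.
    """
    threshold = initial
    for frame in frames:
        target = rms(frame) * ratio
        # Exponential smoothing: keep most of the old value, blend in the new target.
        threshold = threshold * damping + target * (1 - damping)
    return threshold


# Ambient noise with a constant RMS of 100 pulls the threshold toward 150.
ambient = [[100, -100, 100, -100]] * 100
print(calibrate_threshold(ambient))
```

A higher `damping` makes the threshold react more slowly to changes in background noise; a higher `ratio` makes the detector less sensitive.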
2

VATSG

Video automatic transcribe and translated subtitle generator

It generates an SRT subtitle file from a video file in any source language that Whisper supports, and then produces a translated subtitle file in a target language that DeepL supports. This subtitle generator (VATSG) uses [moviepy](https://github.com/Zulko/moviepy) to generate an MP3, then [faster-whisper](https://github.com/guillaumekln/faster-whisper) for speech recognition, and then deepl-api to generate the target-language subtitle file (SRT format). If you...

Downloads: 8 This Week

Last Update: 2023-09-19
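
The SRT output that the VATSG description refers to is a plain-text format: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the subtitle text, and a blank line between entries. The following is a minimal sketch of that last step, assuming the recognizer yields `(start_seconds, end_seconds, text)` tuples. The helper names `srt_timestamp` and `segments_to_srt` are hypothetical and are not taken from VATSG.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Entries are separated by one blank line.
    return "\n".join(blocks)


print(segments_to_srt([(0.0, 1.25, "Hello"), (1.25, 2.0, "World")]))
```

A translated subtitle file would reuse the same timestamps and replace only the text of each entry.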
3

DeepFaceLab

The leading software for creating deepfakes

... to strengthen their own pipeline with other features without having to write complicated boilerplate code. DeepFaceLab can achieve results with high fidelity that are indiscernible by mainstream forgery detection approaches. Apart from seamlessly swapping faces, it can also de-age faces, replace the entire head, and even manipulate speech (though this will require some skill in video editing).

Downloads: 371 This Week

Last Update: 2023-09-07
4

FM2TXT

Listens to radio via RTL-SDR, recognizes audio, and writes a text file log

Log speech from your favorite FM station to a text file using an rtl-sdr dongle and speech recognition. Cross-platform tool. Follow the README on the download page for Windows installation: https://sourceforge.net/projects/fm2txt-rtlsdr/files/ If you prefer the GitHub source over SourceForge: https://github.com/randaller/fm2txt For those who want to recognize audio from a sound card instead of rtl-sdr (this allows transcribing NFM, etc.): https://github.com/randaller/souncard2txt

Downloads: 0 This Week

Last Update: 2017-12-17
5

JAVT - Just Another Voice Transformer

Just Another Speech Recognition and Text to Speech software.

JAVT, or Just Another Voice Transformer (formerly called Just Another Video Transcriber), is speech recognition software that also supports text-to-speech and simple media conversion. JAVT lets you convert video files to WAV audio using ffmpeg, and then transcribe the audio to text using either Microsoft SAPI or CMU Sphinx. You can also open a text file and have JAVT read it aloud through text-to-speech conversion.

Downloads: 2 This Week

Last Update: 2020-08-19
6

Distant Speech Recognition

Beamforming and Speech Recognition Toolkit

BTK contains C++ and Python libraries that implement speech processing and microphone array techniques such as speech feature extraction, speech enhancement, speaker tracking, beamforming, dereverberation, and echo cancellation algorithms. The Millennium ASR provides C++ and Python libraries for automatic speech recognition. It implements a weighted finite state transducer (WFST) decoder, as well as training and adaptation methods. These toolkits are meant for facilitating research...

Downloads: 0 This Week

Last Update: 2019-08-21
7

yaafe

Yet Another Audio Feature Extractor is a toolbox for audio analysis. Easy to use and efficient at extracting a large number of audio features simultaneously. WAV and MP3 files supported, or embedding in C++, Python or Matlab applications.

1 Review

Downloads: 0 This Week

Last Update: 2016-02-25
8

InproTK

An Incremental Spoken Dialogue Processing Toolkit

InproTK is an Incremental Spoken Dialogue Processing Toolkit, that is, a toolkit to help you build dialogue systems that listen and talk incrementally, allowing for advanced interactional behaviour. Please see our Wiki for more information: http://sourceforge.net/p/inprotk/wiki/

Downloads: 0 This Week

Last Update: 2015-06-16
9

avimmir

(audio, video, image) Multimedia Multimodal Information Retrieval

audio classification; speaker segmentation; speaker clustering; speaker recognition; spoken document retrieval; image retrieval; video retrieval; etc.

Downloads: 0 This Week

Last Update: 2013-11-23
10

RNNLIB

RNNLIB is a recurrent neural network library for sequence learning problems. Applicable to most types of spatiotemporal data, it has proven particularly effective for speech and handwriting recognition. Full installation and usage instructions are given at http://sourceforge.net/p/rnnl/wiki/Home/

2 Reviews

Downloads: 0 This Week

Last Update: 2016-11-28
11

SWIPE' pitch extractor

This is a fast C implementation of Arturo Camacho's SWIPE' pitch extraction algorithm. See the project homepage for more about the advantages of the SWIPE' algorithm. swipe-1.0.tar.gz contains the current source, which should compile quite neatly.

Downloads: 0 This Week

Last Update: 2013-04-11
12

ASR-Builder

ASR-Builder provides an easy-to-use interface to the HTK toolkit that allows users to build ASR systems. It performs housekeeping tasks when using HTK and also provides default training, testing, and recognition scripts.

Downloads: 0 This Week

Last Update: 2013-04-26