speech recognize free download

Showing 19 open source projects for "speech recognize"

1

SpeechRecognition

Speech recognition module for Python

Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details).

Downloads: 9 This Week

Last Update: 2026-06-16
See Project
2

annyang!

Speech recognition for your site

annyang is a tiny JavaScript library that lets your visitors control your site with voice commands. annyang supports multiple languages, has no dependencies, weighs just 2 KB, and is free to use. annyang understands commands with named variables, splats, and optional words. Use named variables for one-word arguments in your command. Use splats to capture multi-word text at the end of your command (greedy). Use optional words or phrases to define a part of the command as optional. annyang plays...
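The three command features described above (named variables, splats, optional words) can be illustrated by translating a command pattern into a regular expression. The following is a minimal Python sketch of that idea, not annyang's own code (annyang is a JavaScript library). The function names `compile_command` and `match_command` and the example phrases are hypothetical, chosen only for illustration.

```python
import re

# A token is either an optional phrase in parentheses or a run of non-space text.
TOKEN = re.compile(r"\(([^)]*)\)|(\S+)")


def compile_command(pattern: str) -> re.Pattern:
    """Translate a command pattern into a compiled regular expression.

    Assumed syntax, modelled on the description above:
      :name    named variable, captures exactly one word
      *name    splat, greedily captures the rest of the utterance
      (words)  optional word or phrase
    """
    parts = []
    need_sep = False  # True once a mandatory token has been emitted
    for match in TOKEN.finditer(pattern):
        optional, word = match.groups()
        sep = r"\s+" if need_sep else ""
        if optional is not None:
            body = r"\s+".join(re.escape(w) for w in optional.split())
            # The optional phrase carries its own separator, so leaving it
            # out of the utterance does not leave a dangling space behind.
            parts.append(f"(?:{sep}{body})?" if need_sep else f"(?:{body}\\s+)?")
            continue
        if word.startswith(":"):
            parts.append(f"{sep}(?P<{word[1:]}>\\S+)")
        elif word.startswith("*"):
            parts.append(f"{sep}(?P<{word[1:]}>.+)")
        else:
            parts.append(sep + re.escape(word))
        need_sep = True
    return re.compile("".join(parts), re.IGNORECASE)


def match_command(pattern: str, utterance: str):
    """Return the captured variables as a dict, or None if there is no match."""
    match = compile_command(pattern).fullmatch(utterance.strip())
    return match.groupdict() if match else None


print(match_command("calculate :month stats", "calculate October stats"))
print(match_command("show me *query", "show me funny cat videos"))
print(match_command("say hello (to my little) friend", "say hello friend"))
```

A pattern without variables returns an empty dict on a match, so callers should compare the result against `None` rather than test its truthiness.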

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
3

stt

Voice Recognition to Text Tool

stt is a standalone speech recognition tool that locally converts spoken content in audio or video files into textual formats without requiring internet access, giving users control over their data and reducing reliance on external APIs. It leverages open-source speech models such as Faster-Whisper to recognize and transcribe human speech into plain text, structured JSON objects, or subtitle files with time codes, making it suitable for both personal and professional transcription tasks. ...
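To make the three output formats concrete, here is a small Python sketch that renders a list of transcribed segments as plain text, JSON, and an SRT subtitle file with time codes. It is not taken from stt itself. The segment shape (a dict with `start`, `end`, and `text`, times in seconds) and all function names are assumptions made for this example.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT time code, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)


def to_plain_text(segments) -> str:
    """Join the segment texts into one line of plain text."""
    return " ".join(seg["text"].strip() for seg in segments)


def to_json(segments) -> str:
    """Serialize the segments as a structured JSON document."""
    return json.dumps(segments, ensure_ascii=False, indent=2)


# Hypothetical recognizer output, used only to exercise the formatters.
segments = [
    {"start": 0.0, "end": 2.4, "text": " Hello and welcome."},
    {"start": 2.4, "end": 5.0, "text": " This is the second line."},
]
print(to_srt(segments))
```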

Downloads: 0 This Week

Last Update: 2026-02-17
See Project
4

Translate-Subtitle-File

Subtitle Creation Assistant

...You can configure your own API key to use your own account's free quota, such as Tencent's free translation quota of 5 million characters per month or IBM's 500-minute free speech-to-text quota. (The tern. best domain name has expired and will not be renewed.) The Azure speech-to-text and DeepL free-version integrations currently have problems, so it is normal that they do not work; please wait for the next version for a fix. Machine translation of subtitle files: use machine translation to process files.

Downloads: 1 This Week

Last Update: 2026-06-21
See Project
5

Stanza

Stanford NLP Python library for many human languages

...It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of speech and morphological features, to give a syntactic structure dependency parse, and to recognize named entities. The toolkit is designed to be parallel among more than 70 languages, using the Universal Dependencies formalism. Stanza is built with highly accurate neural network components that also enable efficient training and evaluation with your own annotated data.

Downloads: 1 This Week

Last Update: 2026-06-18
See Project
6

VideoSrt

Windows-GUI

This is an open source Windows GUI tool that can recognize speech in videos and automatically generate SRT subtitle files. VideoSrt is written in Go and built on the lxn/walk Windows GUI toolkit. It is suitable for business scenarios that need to generate Chinese/English subtitles and text files for media (video/audio) quickly and in batches. ...

Downloads: 14 This Week

Last Update: 2023-01-13
See Project
7

Jarvis Python AI Assistant

Python AI assistant

Jarvis is a voice-commanded assistant service written in Python 3.8. It can recognize human speech, talk to the user, and execute basic commands. Opens a web page (e.g. 'Jarvis open youtube'). Plays music on YouTube (e.g. 'Jarvis play mozart'). Increases or decreases the speakers' master volume, and can also set it to max or mute (e.g. 'Jarvis volume up!'). Opens LibreOffice suite applications (calc, writer, impress) (e.g. 'Jarvis open calc'). Tells about something by searching the internet (e.g. 'Jarvis tells me about oranges'). Tells the weather for a place (e.g. 'Jarvis tell me the weather in London'). Tells the current time and/or date (e.g. 'Jarvis tell me time or date'). Sets an alarm (e.g. 'Jarvis create a new alarm'). Tells the internet speed (ping, uplink, and downlink) (e.g. 'Jarvis tell me the internet speed'). Tells the internet availability (e.g. 'Jarvis is the internet connection ok?'). ...

Downloads: 6 This Week

Last Update: 2023-04-19
See Project
8

PseudonymizeSpeech

Praat script to pseudonymize speech.

A Praat script to pseudonymize speech. That is, Pseudonymize Speech tries to make it difficult to recognize a speaker while still retaining relevant (para-)linguistic features and intelligibility. There is a trade-off between the level of pseudonymization and the (para-)linguistic features retained. The approach is to manipulate the spectro-temporal structure of the speech to simulate a different length and structure of the vocal tract, as well as a different pitch and speaking rate. ...

Downloads: 0 This Week

Last Update: 2025-07-04
See Project
9

jieba

Stuttering Chinese word segmentation

"Jieba" Chinese word segmentation aims to be the best Python Chinese word segmentation component. Four word segmentation modes are supported. Precise mode tries to cut the sentence as precisely as possible and is suitable for text analysis. Full mode scans out every string in the sentence that can form a word; it is very fast but cannot resolve ambiguity. Search engine mode builds on precise mode and splits long words again to improve recall, which is suitable...

Downloads: 0 This Week

Last Update: 2022-02-18
See Project
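The segmentation modes in the jieba entry above can be illustrated with a toy dictionary. The sketch below is a simplified Python illustration and not jieba's implementation: the word list and frequencies in `FREQ` are made up, and the function names are hypothetical. Full mode lists every dictionary word found in the sentence, precise mode picks the single most probable split by dynamic programming over word frequencies, and search mode re-splits the long words from precise mode.

```python
import math

# Toy dictionary: word -> frequency. The values are invented for this example.
FREQ = {"我": 5, "来到": 3, "北京": 5, "清华": 3, "清华大学": 4, "华大": 1, "大学": 5}
TOTAL = sum(FREQ.values())
MAX_LEN = max(len(word) for word in FREQ)


def word_ends(sentence: str, i: int) -> list:
    """End positions of dictionary words starting at i (single char fallback)."""
    last = min(len(sentence), i + MAX_LEN)
    ends = [j for j in range(i + 1, last + 1) if sentence[i:j] in FREQ]
    return ends or [i + 1]


def full_mode(sentence: str) -> list:
    """Every dictionary word that occurs in the sentence, overlaps included."""
    return [
        sentence[i:j]
        for i in range(len(sentence))
        for j in word_ends(sentence, i)
        if sentence[i:j] in FREQ
    ]


def precise_mode(sentence: str) -> list:
    """The single most probable split, found right to left by dynamic programming."""
    n = len(sentence)
    # best[i] = (log probability of the best split of sentence[i:], end of its first word)
    best = [(0.0, n)] * (n + 1)
    for i in range(n - 1, -1, -1):
        best[i] = max(
            (math.log(FREQ.get(sentence[i:j], 1) / TOTAL) + best[j][0], j)
            for j in word_ends(sentence, i)
        )
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(sentence[i:j])
        i = j
    return words


def search_mode(sentence: str) -> list:
    """Precise mode, plus the 2- and 3-character dictionary words inside long words."""
    out = []
    for word in precise_mode(sentence):
        for size in (2, 3):
            if len(word) > size:
                grams = (word[k:k + size] for k in range(len(word) - size + 1))
                out.extend(g for g in grams if g in FREQ)
        out.append(word)
    return out


sentence = "我来到北京清华大学"
print("/".join(full_mode(sentence)))
print("/".join(precise_mode(sentence)))
print("/".join(search_mode(sentence)))
```

Full mode keeps the overlapping candidates 清华, 清华大学, 华大, and 大学, which is why it cannot resolve ambiguity, while precise mode commits to one split.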
10

Commander

Commander.exe is a speech recognition engine for Polaris.

...The current version supports simple but powerful commands such as opening search forms, changing the workspace, and copying and pasting code. Work continues daily to widen the range of functionality that can be controlled by voice. Commander recognizes speech and sends it to Polaris, which triggers actions in the Eclipse IDE.

Downloads: 0 This Week

Last Update: 2019-05-12
See Project
11

Yak

Yak finds and removes silence in a speech recording

Yak is built to remove silence in a speech audio recording and to recognize and export speech chapters. The command-line arguments are:

    yak INPUT.wav [OUTPUT.wav] [Options]

Options:
    -s   Scan only, don't produce any audio
    -m   Mute the audio between chapters, keeps original length

Export or Show:
    -ci  Export chapter list for the original file
    -co  Export chapter list for the trimmed file
    -d   Export 2s Dropped parts for each chapter
    -t   Export 2s Transitions for each chapter
    -v   Show chapters list
    -vm  Show chapters list in milliseconds
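One common way to find silence and speech chapters in a recording is to threshold the short-time energy of the signal. The Python sketch below shows that general technique under stated assumptions; it is not Yak's actual algorithm, which the description does not specify. The frame length, RMS threshold, minimum silence duration, and the function names are all hypothetical.

```python
import math


def find_chapters(samples, sample_rate, frame_ms=20, threshold=0.02, min_silence_s=0.5):
    """Return (start_s, end_s) spans of speech separated by long enough silence.

    samples is a sequence of floats in [-1.0, 1.0]. A frame counts as silent
    when its RMS level is below threshold; a chapter ends once at least
    min_silence_s of consecutive silent frames has been seen.
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    min_silent_frames = max(1, round(min_silence_s * 1000 / frame_ms))
    n_frames = math.ceil(len(samples) / frame_len)
    chapters = []
    start = None      # index of the first frame of the chapter in progress
    silent_run = 0    # consecutive silent frames seen inside that chapter
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silent_frames:
                end = i - silent_run + 1  # first frame of the silent run
                chapters.append((start * frame_len / sample_rate,
                                 end * frame_len / sample_rate))
                start, silent_run = None, 0
    if start is not None:
        # The recording ended mid-chapter; drop any trailing short silence.
        end_sample = min((n_frames - silent_run) * frame_len, len(samples))
        chapters.append((start * frame_len / sample_rate, end_sample / sample_rate))
    return chapters


def trim_silence(samples, sample_rate, **options):
    """Concatenate the chapter samples, dropping the silence between them."""
    trimmed = []
    for start_s, end_s in find_chapters(samples, sample_rate, **options):
        trimmed.extend(samples[round(start_s * sample_rate):round(end_s * sample_rate)])
    return trimmed


# Synthetic signal at 1 kHz: 1 s of sound, 1 s of silence, 0.5 s of sound.
signal = [0.5] * 1000 + [0.0] * 1000 + [0.5] * 500
print(find_chapters(signal, sample_rate=1000))
```

Muting instead of trimming (keeping the original length) would replace the samples outside the chapters with zeros rather than dropping them.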

Downloads: 0 This Week

Last Update: 2019-02-14
See Project
12

FM2TXT

Listens to radio with an RTL-SDR, recognizes audio, and writes a text log file

Log speech from your favorite FM station to a text file using an rtl-sdr dongle and speech recognition. Cross-platform tool. Follow the README on the download page for Windows installation. https://sourceforge.net/projects/fm2txt-rtlsdr/files/ If you prefer the GitHub source to SourceForge: https://github.com/randaller/fm2txt For those who want to recognize audio from a sound card instead of an rtl-sdr (this allows transcribing NFM, etc.): https://github.com/randaller/souncard2txt

Downloads: 1 This Week

Last Update: 2017-12-17
See Project
13

Interactive4J

The project aims to provide simple, easy-to-use APIs that let Java developers add interactive abilities to their Java applications, such as speech recognition, handwriting recognition, webcam use, sound recording and playback, decision trees, text to speech, and many others.

Downloads: 0 This Week

Last Update: 2014-07-15
See Project
14

Vok Meister

Going on a world trip? Vok Meister will help you speak the languages in no time. A complete studio for gathering and memorizing vocabulary. Focused on the structure of languages, for fast learning.

Downloads: 0 This Week

Last Update: 2013-05-02
See Project
15

Framework SRM

Framework SRM (Sound Recognizer ME) is written in Java Micro Edition and is capable of recognizing abstract sounds and speaker-dependent isolated words on mobile devices.

1 Review

Downloads: 0 This Week

Last Update: 2014-03-07
See Project
16

Natural Language Tools

The goal is to create tools for Latent Semantic Analysis (LSA) and Language Modeling. LSA allows things like automatic classification of documents based on their content (for example, telling spam from normal email). Language Modeling is necessary for continuous speech

Downloads: 0 This Week

Last Update: 2015-11-08
See Project
17

AIBO Pal

A speech recognition application. It uses Microsoft Speech SDK to recognize and speak words. It can Play Music, Read the News, Tell the Time, Open Apps and many other cool things only with voice commands.

Downloads: 0 This Week

Last Update: 2015-05-22
See Project
18

SAA

SAA (SSPLab Audio Analyzer) will be able to separate sources, recognize speech, and analyze the auditory scene. It can also synthesize spatialised sounds from mono recordings, and edit, analyze via spectrogram, filter, and re-sample signals.

1 Review

Downloads: 0 This Week

Last Update: 2013-03-19
See Project
19

wav2vec2-large-xlsr-53-russian

Russian ASR model fine-tuned on Common Voice and CSS10 datasets

wav2vec2-large-xlsr-53-russian is a fine-tuned automatic speech recognition (ASR) model based on Facebook’s wav2vec2-large-xlsr-53 and optimized for Russian. It was trained using Mozilla’s Common Voice 6.1 and CSS10 datasets to recognize Russian speech with high accuracy. The model operates best with audio sampled at 16kHz and can transcribe Russian speech directly without a language model. It achieves a Word Error Rate (WER) of 13.3% and Character Error Rate (CER) of 2.88% on the Common Voice test set, with even better results when used with a language model. ...
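For readers unfamiliar with the two metrics quoted above, WER and CER are conventionally defined as the edit distance between the reference transcript and the model output, divided by the reference length, counted in words and in characters respectively. The Python sketch below shows that standard definition. It is not the evaluation script used for this model, and details such as text normalization and whether spaces count as characters are assumptions that vary between evaluations.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of r
                curr[j - 1] + 1,         # insertion of h
                prev[j - 1] + (r != h),  # substitution, free if the items match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance over the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate; spaces are counted as characters in this sketch."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substitution (sat -> sit) and one deletion (the) against 6 reference words.
print(wer("the cat sat on the mat", "the cat sit on mat"))
print(cer("kitten", "sitting"))
```

Because insertions count as errors, both rates can exceed 100% when the hypothesis is much longer than the reference.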

Downloads: 0 This Week

Last Update: 2025-07-01
See Project