Speech to Text Software for Windows

View 37 business solutions

Browse free open source Speech to Text software and projects for Windows below. Use the toggles on the left to filter open source Speech to Text software by OS, license, language, programming language, and project status.

  • Achieve perfect load balancing with a flexible Open Source Load Balancer Icon
    Achieve perfect load balancing with a flexible Open Source Load Balancer

    Take advantage of Open Source Load Balancer to elevate your business security and IT infrastructure with a custom ADC Solution.

    Boost application security and continuity with SKUDONET ADC, our Open Source Load Balancer, that maximizes IT infrastructure flexibility. Additionally, save up to $470 K per incident with AI and SKUDONET solutions, further enhancing your organization’s risk management and cost-efficiency strategies.
  • Top-Rated Free CRM Software Icon
    Top-Rated Free CRM Software

    216,000+ customers in over 135 countries grow their businesses with HubSpot

    HubSpot is an AI-powered customer platform with all the software, integrations, and resources you need to connect your marketing, sales, and customer service. HubSpot's connected platform enables you to grow your business faster by focusing on what matters most: your customers.
  • 1
    CMU Sphinx

    CMU Sphinx

    Speech Recognition Toolkit

    Thank you for visiting! ----> Maintenance and improvement work has MOVED to https://cmusphinx.github.io/ Please go there for the most recent software and documentation. <---- CMUSphinx is a speaker-independent large vocabulary continuous speech recognizer released under BSD style license. It is also a collection of open source tools and resources that allows researchers and developers to build speech recognition systems.
    Leader badge
    Downloads: 812 This Week
    Last Update:
    See Project
  • 2
    DeepSpeech

    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following the instructions in the usage docs. If you want to use the pre-trained English model for performing speech-to-text, you can download it (along with other important inference material) from the DeepSpeech releases page.
    Downloads: 37 This Week
    Last Update:
    See Project
  • 3
    TTS Voice Wizard

    TTS Voice Wizard

    Speech to Text to Speech, sends text as OSC messages

    Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) Use TTS Voice Wizard's accessibility features to improve your VRChat experience (it works outside of VRChat too!) You can convert your Speech-to-Text and back to Speech through various Speech Recognition and Text-to-Speech methods. You can send what you say as OSC messages to VRChat to be displayed on your avatar using KillFrenzyAvatarText or VRChats Chatbox. The app can translate your speech from one language to over 20 other support languages. There are 100+ different voices with various customization options so you can pick a voice that best suits you. Display the current song you are listening to on Spotify or via your browser. Display tracker and controller battery life in conjunction with XSOverlay. Use in conjunction with HRtoVRChat_OSC to enable you to display your heartrate in VRChat's Chatbox.
    Downloads: 26 This Week
    Last Update:
    See Project
  • 4
    Whisper

    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.
    Downloads: 26 This Week
    Last Update:
    See Project
  • Digital Payments by Deluxe Payment Exchange Icon
    Digital Payments by Deluxe Payment Exchange

    A single integrated payables solution that takes manual payment processes out of the equation, helping reduce risk and cutting costs for your business

    Save time, money and your sanity. Deluxe Payment Exchange+ (DPX+) is our integrated payments solution that streamlines and automates your accounts payable (AP) disbursements. DPX+ ensures secure payments and offers suppliers alternate ways to receive funds, including mailed checks, ACH, virtual credit cards, debit cards, or eCheck payments. By simply integrating with your existing accounting software like QuickBooks®, you’ll implement efficient payment solutions for AP with ease—without costly development fees or untimely delays.
  • 5
    Coqui STT

    Coqui STT

    The deep learning toolkit for speech-to-text

    Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. Coqui STT is battle-tested in both production and research. Multiple possible transcripts, each with an associated confidence score. Experience the immediacy of script-to-performance. With Coqui text-to-speech, production times go from months to minutes. With Coqui, the post is a pleasure. Effortlessly clone the voices of your talent and have the clone handle the problems in post. With Coqui, dubbing is a delight. Effortlessly clone the voice of your talent into another language and let the clone do the dub. With text-to-speech, experience the immediacy of script-to-performance. Cast from a wide selection of high-quality, directable, emotive voices or clone a voice to suit your needs. With Coqui text-to-speech, production times go from months to minutes.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 6
    Translate-Subtitle-File

    Translate-Subtitle-File

    Subtitle Creation Assistant

    Subtitle group machine translation assistant - [Function 1: Translate subtitle file] .srt .ass .vtt [Function 2: Voice to text] (Drag in video or audio to recognize subtitles) (The latest version v4.1.0 Update time 2021 2 May 23) 12 translation service providers can be configured, such as Google, Baidu, Tencent, Caiyun, IBM, Azure, Amazon, etc. (6 voice service providers can be configured: Alibaba Cloud, Xunfei, Tencent Cloud, IBM, Azure, Amazon ) Advantages: 1. You can use multiple service providers, 2. You can configure your own API Key to use your own account's free quota, such as Tencent's free translation quota of 5 million characters per month, IBM's 500-minute speech-to-text free quota (tern. best The domain name has expired and I don't want to renew it.) Azure speech-to-text and DeepL free version have problems, it is normal to not use it, please wait for the next version to fix. Machine translation of subtitle files, use machine translation to process files.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 7
    sherpa-onnx

    sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without an Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8

    SoundTranscriber

    SoundTranscriber can be used to generate automatic transcription / aut

    SoundTranscriber can be used to generate automatic transcription / aut
    Downloads: 13 This Week
    Last Update:
    See Project
  • 9
    VATSG

    VATSG

    Video automatic transcribe and translated subtitle generator

    It generates srt format subtitle from videofile which can be any source language that whisper support , and then make translated subtitle file of your target language which deepl support. This is the subtitle generator(VATSG) which use [moviepy](https://github.com/Zulko/moviepy) to generate mp3 and then use [faster-whisper](https://github.com/guillaumekln/faster-whisper) to get text recognition and then use deepl-api to generate your target language subtitle file(srt format) If you are a general user who want to view any video file and mp3 file to your language, It will provide way. It's very easy to use because it has simple gui and very intuitive. So you can easily use it for any purpose. Now, you can choose to download either window installer setup type or uninstalled type. Enjoy and support my consistent development!
    Downloads: 11 This Week
    Last Update:
    See Project
  • EBizCharge Payment Platform for Accounts Receivable Icon
    EBizCharge Payment Platform for Accounts Receivable

    Getting paid has never been easier.

    Don’t let unpaid invoices limit your business’s growth. EBizCharge plugs directly into the tools your business already uses to speed up payment collection.
  • 10
    Anthromorphic Scribe

    Anthromorphic Scribe

    Provides speech to text gui to sphinx4

    It provides an interactive speech to text application that uses sphinx 4. With this you can use pre-recorded audio, record your own voice and convert incompatible audio/video to be compatible with sphinx 4. It currently supports U.S English by using hub4 acoustic and language model.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 11
    ILA - teachable voice assistant

    ILA - teachable voice assistant

    ILA is a fully customizable and teachable voice assistant for Java

    ILA stands for (kind of) intelligent, learning assistant and is a speech recognition system aka voice assistant very similar to Siri, Google Now and Cortana. ILA is fully customizable and you can teach her/him/it new things by yourself like executing system commands, opening web pages, programs and apps or just some basic conversation :-) ILA runs on Java und thus is compatible to Windows, Mac and Linux. It is designed to integrate with your home enviroment and for example build up your own, free and open Amazon Echo replacement ;-) Right now the key components of ILA are the open source speech recognition CMU Sphinx-4, Google (Speech Recognition/Text-To-Speech) and MaryTTS (Text-To-Speech). The goal is to make ILA completely free of Google by improving all aspects of the open source systems. Since version 3.3 users can also write own add-ons to extend ILA. ILA's successor is the SEPIA Framework: https://sepia-framework.github.io/ Hope you enjoy ILA - Florian
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Conversations

    Conversations

    App in java for chatting to a generative A.I. (involving tts and stt)

    Java application for chatting to generative AI Llama3. * The user can speak into the microphone (speechToText), edit the recognized text and send it to the AI. * The AI ​​responds and the server returns that response in real time, and the sentences converted to audio (textToSpeech), and the application broadcasts them through the speaker. The application is prepared so that only one user occupies the server's resources, so if the server is busy, in theory it will not let you connect. There is a demo video that shows how it works: https://frojasg1.com:8443/resource_counter/resourceCounter?operation=countAndForward&url=https%3A%2F%2Ffrojasg1.com%2Fdemos%2Faplicaciones%2Fchat%2F20240815.Demo.Chat.mp4%3Forigin%3Dsourceforge&origin=web
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    JAVT - Just Another Voice Transformer

    JAVT - Just Another Voice Transformer

    Just Another Speech Recognition and Text to Speech software.

    JAVT or Just Another Voice Transformer (formerly, it is called Just Another Video Transcriber) is a Speech Recognition software that also support text to Speech and simple media conversion. JAVT allows you to convert from video files to audio wav file using ffmpeg, and then transcribe the audio file to text using either Microsoft SAPI or CMU Sphinx. You can also open a text file and allow JAVT to read it out for you through text to speech conversion.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    Speech Recognition in English & Polish

    Speech Recognition in English & Polish

    Speech recognition software for English & Polish languages

    Software for speech recognition in English & Polish languages. Basic versions of SkryBot: 1. SkryBot Home Speech (English Language) - https://sourceforge.net/projects/skrybotdomowy/files/ReleasesEnglish/InstalatorSkryBotHomeSpeechDemo-2.6.9.18117.exe/download 2. SkryBot DoMowy (Polish Language) - https://sourceforge.net/projects/skrybotdomowy/files/ReleasesPolish/InstalatorSkryBotDoMowyDemo-2.4.9.18117.exe/download More help: https://sourceforge.net/p/skrybotdomowy/wiki/ Domain advanced versions (Polish Language) 1. SkryBot Prawo - for judicial professionals. 2. SkryBot Administracyjny - for civil and government administration. 3. SkryBot Medycyna Rodzinna - for physicians Professional version of SkryBot (commercial) offers you: 1. Audio conversion and cutting sound files into smaller ones. 2. Searching for words or phrases in sound files (recognized by SkryBot). 3. Editing sound files and automatic cutting off long silence parts in audio file.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    insofts player
    insofts-player Free media player, with which you can easily and conveniently view video and listen to audio files in various formats, without installing additional codecs. View streaming video, audio.   Constantly updating the online media library Additional features: sound recording, uart protocol support, speech to text
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    JKVSRG Tamil and Amharic Translator

    JKVSRG Tamil and Amharic Translator

    Attractive and user friendly Tamil and Amharic translator

    JKVSRG Tamil and Amharic translator translates between Tamil and Amharic. User can transliterate English to Tamil and English to Amharic with the help of Tamil and Amharic Letter System which is included in this software. In addition, User can also translate multilingual text file to Tamil and Amharic text file. It is an interactive and user friendly translator because of its special feature speech synthesize (text to speech). It runs on windows. It takes a smaller amount of memory and few seconds to install. Users make sure internet connection and speaker in their system while using it.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    JKVSRG Eng Amharic Translator

    JKVSRG Eng Amharic Translator

    Translation between English and Amharic

    JKVSRG English and Amharic translator translates English to Amharic and Amharic to English. User can transliterate English to Amharic with the help of Amharic Letter System which is included in this software. In addition, User can also translate English text file to Amharic text file and vice-versa while comparing it with other Amharic Translators. This translator is an interactive and user friendly translator because of its special feature speech synthesize (text to speech). It runs on all windows operation system. It takes less memory and few seconds to install. When using it, users ensure internet connection and speaker in their system.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    MDictate

    MDictate

    Speech to text using python, pocketsphinx, ready to deploy

    Automated speech recognition software is extremely cumbersome. This project's aim is to incrementally improve the quality of an open-source and ready to deploy speech to text recognition system. Runs on Windows using the mdictate.exe, but the core workings are found in the mdictate.py script which should work on Windows/Linux/OS X. In version 1.0, we use pocketsphinx' default setup with a basic graphic interface.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Responding Partner

    Responding Partner

    Control your PC computer with voice commands

    Responding Partner is a windows application that enables you free talking with your computer which equipped with spoken animation character. You will be surprised how smart responding partner robot is. It also enables voice commands and controls to your computer for small task like open media files, open and close program, shutdown and restart computer,open website, type in editor, text to speech,etc. You can extend the ability by installing new plugin which available at files tab. We will continuous to update new plugin and animation character. Engine inside: - Speech Recognition - Text to Speech Requirements - Microphone
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Sintetizador GMS

    Sintetizador GMS

    Herramienta para Sintetizar voces en la PC

    Herramienta para escuchar documentos, puede copiar y pegar, guardar documentos y generar audio hablado.
    Leader badge
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21

    Sluh

    creating user interface for converting speech to text and voice contro

    Проект по изучению пригодности\простоты к использованию CMUSphinx, pocketsphinx в пользовательских целях. Попытка создания программного интерфейса по распознаванию русской речи (преобразованию в текст) на базе CMUSphinx, pocketsphinx для голосового управления ПО и прочее.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    This software convert speech to text using Java and Android application. With this software you can also search for text in Google. You can use offline speech to text with this application if you don't have Internet, you can find the steps in guide file. How to use: ----------------- 1- Install a software to convert the PC as router (EX: My Wifi Router) then connect your mobile with PC via wifi. 2- Install Smart Text to Speech.apk file on your phone. 3- Open "Smart Speech to Text.jar" java application on PC. 4- Launch Smart Speech to Text on your phone. 5- Click on "Speak Now" button in java application. 6- After you speak click on red circle button on your phone to stop speaking and to convert it to text or you can wait few seconds notice: --------- Speech that will converted relied on the language that installed on your phone. © Copyright Khalid Kareem 2014, All Rights Reserved E-mail: khaled.atroshy@gmail.com Enjoy!
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Plugin for Windows XP that put an Icon in Windows Bar and allow read the copied text(By mean Ctrl+C) using Festival a Speech To Text Synthesis Software.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Speechalyzer

    Speechalyzer

    Process large speech data wrt transcription, labeling and annotation

    Speechalyzer: a tool for the daily work of a 'speech worker' It is optimized to process large speech data sets with respect to transcription, labeling and annotation. It is implemented as a client server based framework in Java and interfaces software for speech recognition, synthesis, speech classification and quality evaluation. The application is mainly the processing of training data for speech recognition and classification models and performing benchmarking tests on speech-to-text, text-to-speech and speech classification software systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Voice Conference Manager uses VoiceXML and CCXML to control speech recognition, text to speech, and voice biometrics for a telephone conference service. Say the names or numbers of people and VCM places them into the call. Can be hosted on public servers
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next