
Search Results for "intelligence"


OS

Mac 24
Linux 23
Windows 22
BSD 6
ChromeOS 6
Mobile Operating Systems 1

Category

Artificial Intelligence 24
Multimedia 7
Scientific/Engineering 2
System 2
Communications 1
Database 1
Text Editors 1

License

OSI-Approved Open Source 13
GNU Free Documentation License 1

Translations

English 4
Catalan 1
French 1
German 1
Spanish 1

Programming Language

Python 6
Java 4
C++ 3
TypeScript 3
C# 1
Go 1
JavaScript 1
PHP 1

Status

Beta 2
Alpha 1

Showing 24 open source projects for "intelligence"

Active filters: Speech to Text, Mac

1

sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, and Flutter.

Downloads: 276 This Week

Last Update: 2 days ago
2

Whisper

Robust Speech Recognition via Large-Scale Weak Supervision

OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...

Downloads: 77 This Week

Last Update: 2025-06-26
3

Vibe

Transcribe on your own

Vibe is an open-source project by thewh1teagle for transcribing audio and video locally with OpenAI's Whisper, so recordings stay on your own machine.

Downloads: 74 This Week

Last Update: 2026-03-13
4

Handy STT

A free, open source, and extensible speech-to-text application

Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active...

Downloads: 21 This Week

Last Update: 2026-04-27
5

SpeechRecognition

Speech recognition module for Python

Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...

Downloads: 6 This Week

Last Update: 3 days ago
6

Ito

Ito, smart dictation in every application

ito is an open‑source JavaScript library for serverless, browser‑to‑browser communication designed for use on devices with or without user input interfaces, such as IoT devices, mobile devices, tablets, and desktops, enabling peer messaging and data sharing via short passcodes and cloud‑backed pairing without an application server.

Downloads: 4 This Week

Last Update: 2025-12-05
7

TTS Voice Wizard

Speech to Text to Speech, sends text as OSC messages

Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) Use TTS Voice Wizard's accessibility features to improve your VRChat experience (it works outside of VRChat too!) You can convert your Speech-to-Text and back to Speech through various Speech Recognition and Text-to-Speech methods. You can send what you say as OSC messages to VRChat to be displayed on your avatar using KillFrenzyAvatarText or VRChats...

Downloads: 7 This Week

Last Update: 2026-05-08
8

Translate-Subtitle-File

Subtitle Creation Assistant

Machine translation assistant for subtitle groups. Function 1: translate subtitle files (.srt, .ass, .vtt). Function 2: voice to text (drag in a video or audio file to recognize subtitles). (The latest version v4.1.0 Update time 2021 2 May 23) 12 translation service providers can be configured, such as Google, Baidu, Tencent, Caiyun, IBM, Azure, and Amazon; 6 voice service providers can be configured: Alibaba Cloud, Xunfei, Tencent Cloud, IBM, Azure, and Amazon. Advantages: 1. You can use multiple service...

Downloads: 12 This Week

Last Update: 1 day ago
9

RealtimeSTT

A robust, efficient, low-latency speech-to-text library

RealtimeSTT is a Python-based realtime speech-to-text engine emphasizing low latency, wake-word detection, voice activity detection, and automatic speech segmentation. It provides asynchronous callbacks, nanosecond-precision timestamps, and CLI tools, suitable for building voice assistants, meeting transcribers, or live caption systems.

Downloads: 0 This Week

Last Update: 2026-05-31
10

Buzz

Transcribe and translate audio offline on your personal computer

Buzz transcribes and translates audio to text offline using OpenAI's Whisper. Import audio and video files into Buzz and export them as TXT, SRT, or VTT files. Buzz supports Whisper, Whisper.cpp, Faster Whisper, Whisper-compatible models from the Hugging Face repository, and the OpenAI Whisper API. Linux versions: https://flathub.org/apps/io.github.chidiwilliams.Buzz and https://snapcraft.io/buzz (home page: https://github.com/chidiwilliams/buzz). Note for...

2 Reviews

Downloads: 28,954 This Week

Last Update: 2026-03-14
11

Conversations

Java app for chatting with a generative AI (using TTS and STT)

Java application for chatting with the generative AI Llama3. The user can speak into the microphone (speechToText), edit the recognized text, and send it to the AI. The AI responds, the server returns that response in real time with the sentences converted to audio (textToSpeech), and the application plays them through the speaker. The application is designed so that only one user occupies the server's resources at a time, so if the server is busy, in theory it will not let you...

Downloads: 0 This Week

Last Update: 2026-03-05
12

Whishper

Transcribe any audio to text, translate and edit subtitles 100% locally

Open-source, local-first audio transcription and subtitling suite with a simple web UI. Thanks to open-source technologies, Whishper can run 100% offline. Your data never leaves your computer. Whishper allows you to translate your transcriptions to and from more than 60 languages thanks to Argos Translate and LibreTranslate. Download the transcriptions in many formats (json, txt, vtt, srt). Easily edit your subtitles right in the Web-UI.

Downloads: 4 This Week

Last Update: 2024-09-10
13

Coqui STT

The deep learning toolkit for speech-to-text

Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. Coqui STT is battle-tested in both production and research. Multiple possible transcripts, each with an associated confidence score. Experience the immediacy of script-to-performance. With Coqui text-to-speech, production times go from months to minutes. With Coqui, the post is a pleasure. Effortlessly clone the voices of your talent and have the clone handle the problems...

Downloads: 2 This Week

Last Update: 2022-09-03
14

DeepSpeech

Open source embedded speech-to-text engine

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following the...

Downloads: 4 This Week

Last Update: 2021-04-08
15

CMU Sphinx

Speech Recognition Toolkit

Maintenance and improvement work has moved to https://cmusphinx.github.io/ (go there for the most recent software and documentation). CMUSphinx is a speaker-independent, large-vocabulary continuous speech recognizer released under a BSD-style license. It is also a collection of open source tools and resources that allow researchers and developers to build speech recognition systems.

58 Reviews

Downloads: 241 This Week

Last Update: 2024-01-11
16

ILA - teachable voice assistant

ILA is a fully customizable and teachable voice assistant for Java

ILA stands for (kind of) intelligent, learning assistant. It is a speech recognition system, or voice assistant, very similar to Siri, Google Now, and Cortana. ILA is fully customizable, and you can teach her/him/it new things yourself, such as executing system commands, opening web pages, programs, and apps, or just holding a basic conversation :-) ILA runs on Java and is thus compatible with Windows, Mac, and Linux. It is designed to integrate with your home environment and for example build up your own,...

4 Reviews

Downloads: 0 This Week

Last Update: 2018-07-23
17

insofts player

insofts-player is a free media player that lets you easily and conveniently view video and listen to audio files in various formats without installing additional codecs. It plays streaming video and audio and has a constantly updated online media library. Additional features: sound recording, UART protocol support, and speech to text.

2 Reviews

Downloads: 0 This Week

Last Update: 2018-06-09
18

Sluh

Creating a user interface for converting speech to text and voice control

A project studying the suitability and ease of use of CMUSphinx and pocketsphinx for end-user purposes. It is an attempt to create a programming interface for recognizing Russian speech (converting it to text), based on CMUSphinx and pocketsphinx, for voice control of software and more.

Downloads: 0 This Week

Last Update: 2016-09-21
19

Speechalyzer

Process large speech data with respect to transcription, labeling and annotation

Speechalyzer is a tool for the daily work of a 'speech worker'. It is optimized to process large speech data sets with respect to transcription, labeling, and annotation. It is implemented as a client-server framework in Java and interfaces with software for speech recognition, synthesis, speech classification, and quality evaluation. Its main application is processing training data for speech recognition and classification models and performing benchmarking tests on...

Downloads: 3 This Week

Last Update: 2016-04-27
20

Anthromorphic Scribe

Provides a speech-to-text GUI for sphinx4

An interactive speech-to-text application that uses sphinx 4. With it you can use pre-recorded audio, record your own voice, and convert incompatible audio/video to be compatible with sphinx 4. It currently supports U.S. English using the hub4 acoustic and language models.

Downloads: 0 This Week

Last Update: 2013-05-05
21

Voice_Tic_Tac_Toe

Play Tic Tac Toe With Voice Input

Voice Tic Tac Toe lets you play Tic Tac Toe via voice input. The game engine is written in Python and uses Microsoft SAPI for speech-to-text conversion.

Downloads: 0 This Week

Last Update: 2012-08-26
22

Voice Conference Manager

Voice Conference Manager uses VoiceXML and CCXML to control speech recognition, text to speech, and voice biometrics for a telephone conference service. Say the names or numbers of people and VCM places them into the call. It can be hosted on public servers.

Downloads: 0 This Week

Last Update: 2013-04-17
23

ftw. Text Modeller

Software to fit whole-sentence language models using the principle of maximum entropy. For developers of speech recognizers, text prediction interfaces, OCR, machine translation software.

Downloads: 0 This Week

Last Update: 2013-03-20
24

im_narrator

im_narrator provides text-to-speech and speech-to-text services for Instant Messaging clients for the blind (and others). It currently supports AIM, MSN, ICQ, Exodus, Jabber Messenger, and Meca on win32. Ports to other platforms are planned.

Downloads: 0 This Week

Last Update: 2013-03-07
© 2026 Slashdot Media. All Rights Reserved.