Search Results for "open source speech to text software"

Showing 108 open source projects for "open source speech to text software"

  • 1
    Handy STT

    A free, open source, and extensible speech-to-text application

    Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active text field. ...
    Downloads: 51 This Week
  • 2
    OpenAI.fm

    Code for openai.fm, a demo for the OpenAI Speech API

    OpenAI.fm is an official interactive demo application built to showcase the OpenAI Speech API and its advanced text-to-speech capabilities, providing developers and creators with a hands-on web interface to convert text into high-quality, customizable audio using state-of-the-art TTS models. Developed using Next.js and the OpenAI Speech API, this demo illustrates how the latest neural voice models can produce natural, expressive speech with adjustable styles and voices, highlighting features...
    Downloads: 16 This Week
  • 3
    Voicebox

    The open-source voice synthesis studio powered by Qwen3-TTS

    ...It is API-first, meaning you can use it as an app for production work or integrate its speech generation into your own software via an API layer.
    Downloads: 90 This Week
  • 4
    Polyglot

    Cross-platform AI language practice app

    Polyglot is a cross-platform AI language practice application that runs as a desktop app and also offers a web version. It is built around conversational large language models and Azure-based text-to-speech services, turning them into an interactive environment for speaking practice in multiple languages. Users can define custom AI personas, choose languages, and configure their own OpenAI and Azure keys so they retain control over which backends they use. The app supports speech recognition...
    Downloads: 7 This Week
  • 5
    EasyVoice

    Open source text-to-speech tool, supports extra-long text

    easyVoice is an open-source text-to-speech platform aimed at turning long-form text and novels into high-quality audio, with a strong focus on usability and scalability. It provides a web interface where users can paste or upload large texts and generate speech and subtitles in a single workflow, even for works exceeding 100,000 characters. The system supports multi-role voice acting, letting users assign different neural voices to different characters or narrative roles and configure parameters such as rate, pitch, and volume per role. ...
    Downloads: 2 This Week
  • 6
    BrowserAI

    Run local LLMs like llama, deepseek, kokoro etc. inside your browser

    BrowserAI is a cutting-edge platform that allows users to run large language models (LLMs) directly in their web browser without the need for a server. It leverages WebGPU for accelerated performance and supports offline functionality, making it a highly efficient and privacy-conscious solution. The platform provides a developer-friendly SDK with pre-configured popular models, and it allows for seamless switching between MLC and Transformer engines. Additionally, it supports features such as...
    Downloads: 2 This Week
  • 7
    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all...
    Downloads: 5 This Week
  • 8
    Text Search Engine

    A text search engine that supports mixed Chinese and English search

    Text-Search-Engine is a JavaScript-based lightweight search engine that enables full-text search functionality. It allows developers to implement fast search indexing and retrieval in web applications.
    Downloads: 0 This Week
  • 9
    AutoSubs

    Instantly generate AI-powered subtitles on your device

    AutoSubs is an open-source, AI-powered subtitle generation tool that enables users to automatically transcribe audio and video content into accurate, editable subtitles directly on their device. It supports both standalone usage and integration with professional video editing software such as DaVinci Resolve, allowing creators to generate and edit subtitles within their existing workflows.
    Downloads: 20 This Week
  • 10
    Readest

    Readest is a modern, feature-rich ebook reader

    Readest is a project meant to facilitate reading, studying, or consuming content by integrating reading tools with AI-powered assistance. Although the repository is not as widely documented or popular as some, the idea is that Readest supports features to help with reading comprehension — likely combining OCR / text retrieval, translation, note-taking, or summarization for reading materials (eBooks, articles, PDFs). The goal appears to be to let users feed in arbitrary reading material and...
    Downloads: 7 This Week
  • 11
    Obsidian Text Generator Plugin

    Text generator is a handy plugin for Obsidian

    Text Generator is an open-source AI Assistant Tool that brings the power of Generative Artificial Intelligence to the power of knowledge creation and organization in Obsidian. For example, use Text Generator to generate ideas, attractive titles, summaries, outlines, and whole paragraphs based on your knowledge database.
    Downloads: 2 This Week
  • 12
    Slate Text Editor framework

    Completely customizable framework for building rich text editors

    you can do things like turn a selection of text bold, or add a semantically rendered block quote in the middle of the page. The most important part of Slate is that plugins are first-class entities. That means you can completely customize the editing experience, to build complex editors like Medium's or Dropbox's, without having to fight against the library's assumptions. Slate's core logic assumes very little about the schema of the data you'll be editing, which means that there are no...
    Downloads: 0 This Week
  • 13
    Vibe

    Transcribe on your own

    Vibe is an open-source project by thewh1teagle for transcribing audio and video offline on the user's own device, so recordings do not have to be uploaded to a third-party service. ...
    Downloads: 17 This Week
  • 14
    Ito

    Ito, smart dictation in every application

    ito is an open‑source JavaScript library for serverless, browser‑to‑browser communication designed for use on devices with or without user input interfaces, such as IoT devices, mobile devices, tablets, and desktops, enabling peer messaging and data sharing via short passcodes and cloud‑backed pairing without an application server.
    Downloads: 3 This Week
  • 15
    Deep Chat

    Customizable AI chat component for websites with API support

    Deep Chat is a highly customizable web component designed to simplify the integration of AI-powered chat interfaces into websites. It allows developers to embed a fully functional chatbot using minimal setup, while still offering extensive control over behavior, appearance, and integrations. Deep Chat supports connections to a wide range of AI services as well as custom backends, enabling flexible deployment for different use cases. It is built as a framework-agnostic solution, meaning it...
    Downloads: 0 This Week
  • 16
    Agili Hacker Podcast

    AI tool that turns Hacker News posts into daily podcast updates

    ...As an open-source tool, it also encourages community contributions and customization for developers who want to adapt or extend its workflow for similar AI-driven content pipelines.
    Downloads: 1 This Week
  • 17
    Editor.js

    A block-style editor with clean JSON output

    Editor.js is an open-source text editor offering a variety of features to help users create and format content efficiently. It has a modern, block-style interface that allows users to easily add and arrange different types of content, such as text, images, lists, quotes, etc. Each Block is provided via a separate plugin making Editor.js extremely flexible.
    Downloads: 7 This Week
  • 18
    VSCodium

    binary releases of VS Code without MS branding/telemetry/licensing

    Microsoft’s vscode source code is open source (MIT-licensed), but the product available for download (Visual Studio Code) is licensed under a non-FLOSS license and contains telemetry/tracking. The VSCodium project exists so that you don’t have to download and build from source. This project includes special build scripts that clone Microsoft’s vscode repo, run the build commands, and upload the resulting binaries for you to GitHub releases. These binaries are licensed under the MIT license....
    Downloads: 71 This Week
  • 19
    Lexical

    Lexical is an extensible text editor framework

    An extensible text editor framework that does things differently. Lexical is comprised of editor instances that each attach to a single content editable element. A set of editor states represent the current and pending states of the editor at any given time. Lexical is designed for everyone. It follows best practices established in WCAG and is compatible with screen readers and other assistive technologies. Lexical is minimal. It doesn't directly concern itself with UI components, toolbars...
    Downloads: 8 This Week
  • 20
    Tolaria

    Desktop app to manage markdown knowledge bases

    Tolaria is a platform designed to help developers understand, refactor, and improve codebases through structured analysis and transformation workflows. It focuses on breaking down complex systems into manageable components, making it easier to identify technical debt and architectural issues. The project emphasizes clarity, maintainability, and iterative improvement of software systems. It provides tools and patterns for analyzing dependencies, restructuring modules, and improving code...
    Downloads: 10 This Week
  • 21
    OpenAI Translator

    Browser extension and cross-platform desktop app based on ChatGPT API

    Browser extension and cross-platform desktop application for translation based on ChatGPT API. I have developed a Bob plugin that utilizes ChatGPT API to provide global word translation on macOS. However, since not all users have access to macOS to benefit from the plugin, I have created this project! What began as a Chrome extension has now evolved into a multi-platform desktop app that I am currently developing. The desktop application does not support the pop-up icon after word selection....
    Downloads: 6 This Week
  • 22
    Logseq

    A privacy-first, open-source platform for knowledge management

    Logseq is a privacy-first, open-source knowledge base that works on top of local plain-text Markdown and Org-mode files. Use it to write, organize and share your thoughts, keep your to-do list, and build your own digital garden. Logseq is a platform for knowledge management and collaboration. It focuses on privacy, longevity, and user control. The server will never store or analyze your private notes. Your data are plain text files and we currently support both Markdown and Emacs Org-mode...
    Downloads: 34 This Week
  • 23
    React Wrap Balancer

    Simple React Component That Makes Titles More Readable

    The React Wrap Balancer project is a React component that improves text readability by intelligently balancing line breaks in headings and other text elements. It addresses common layout issues where text wraps unevenly, such as leaving a single word on the last line, which can negatively impact visual design. The component dynamically adjusts how text is split across lines based on the available space, resulting in more aesthetically pleasing layouts. It uses modern browser APIs like...
    Downloads: 1 This Week
  • 24
    Choices.js

    A vanilla JS customizable select box/text input plugin

    Choices.js is a lightweight, configurable select box/text input plugin. It is similar to Select2 and Selectize but without the jQuery dependency. Choices is compiled using Babel, targeting browsers with more than 1% of global usage and expecting that the features listed below are available or polyfilled in the browser. You can see the exact list of target browsers by running npx browserslist within this repository folder. If you need to support a browser that does not have one of the features listed below,...
    Downloads: 6 This Week
  • 25
    OpenMAIC

    Open Multi-Agent Interactive Classroom

    OpenMAIC is an open-source multi-agent learning platform built to turn a topic or uploaded material into a fully interactive classroom experience with minimal setup. It is designed around coordinated AI roles, including teacher-like and classmate-like agents that can present information, respond in real time, and participate in live educational dialogue.
    Downloads: 7 This Week