Showing 297 open source projects for "speech to text in java"

  • 1
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...
    Downloads: 10 This Week
  • 2
    Moshi

    A speech-text foundation model for real time dialogue

    Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi processes 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps in a fully streaming manner (latency of 80 ms, the frame size), yet performs better than existing non-streaming codecs such as SpeechTokenizer (50 Hz, 4 kbps) or SemantiCodec (50 Hz, 1.3 kbps).
    Downloads: 0 This Week
  • 3
    Translate-Subtitle-File

    Subtitle Creation Assistant

    ...You can configure your own API key to use your own account's free quota, such as Tencent's free translation quota of 5 million characters per month or IBM's free speech-to-text quota of 500 minutes (the tern.best domain name has expired and will not be renewed). Azure speech-to-text and the DeepL free version currently have problems and may not work; please wait for the next version for a fix. Machine translation of subtitle files: use machine translation to process files.
    Downloads: 5 This Week
  • 4

    Omilo - a text to speech application

    Omilo is a simple text to speech application

    Omilo is a simple text to speech application for Windows and Linux using Festival, Flite, Marytts and Piper voices.
    Downloads: 13 This Week
  • 5
    Koodo Reader

    A modern ebook manager and reader with sync and backup

    ...Customize the source folder and synchronize among multiple devices using OneDrive, iCloud, Dropbox, etc. Single-column, two-column, or continuous scrolling layouts. Text-to-speech, translation, progress slider, touch screen support, batch import. Add bookmarks, notes, highlights to your books. Adjust font size, font family, line-spacing, paragraph spacing, background color, text color, margins, and brightness. Night mode and theme color. Text highlight, underline, boldness, italics and shadow.
    Downloads: 34 This Week
  • 6

    speech intonator

    The purpose of the project is to develop audio processing algorithms

    The initial version of the main branch of the project has been completed. The main name of the project is "Java audio mixer Summaha". The second name of the project is "Sound Arithmometer". Main purpose: production of musical sound remixes from a set of samples. The name "Summaha" rhymes well with "Yamaha" and creates motivation and inspiration to achieve a sound quality comparable to that of the well-known brand. Detailed documentation in 'read' signature files. Anyone who is...
    Downloads: 0 This Week
  • 7
    Transcoder

    Hardware-accelerated video transcoding using Android MediaCodec APIs

    Transcoder by DeepMedia is an AI-powered video-to-video speech translation engine that enables fully automated multilingual dubbing. Unlike traditional speech translation systems that rely on multi-stage pipelines, Transcoder directly translates one speaker’s video into another language while preserving facial expressions, lip-sync, and vocal identity. Designed for real-time use and production-grade pipelines, Transcoder combines advanced deep learning models with GPU acceleration to deliver...
    Downloads: 0 This Week
  • 8
    Buzz

    Transcribe and translate audio offline on your personal computer

    Buzz transcribes and translates audio to text offline using OpenAI's Whisper. Import audio and video files into Buzz and export them as TXT, SRT, or VTT files. Buzz supports Whisper, Whisper.cpp, Faster Whisper, Whisper-compatible models from the Hugging Face repository, and the OpenAI Whisper API. Get Linux versions from https://flathub.org/apps/io.github.chidiwilliams.Buzz or https://snapcraft.io/buzz . Home page of Buzz: https://github.com/chidiwilliams/buzz . Note for...
    Downloads: 2,810 This Week
  • 9
    eGuideDog free software for the blind
    eGuideDog project develops free software for the blind. Currently, we focus on WebSpeech, Ekho TTS and WebAnywhere.
    Downloads: 197 This Week
  • 10
    Anx Reader

    Featuring powerful AI capabilities and supporting e-book formats

    ...It supports major formats (EPUB, MOBI, AZW3, FB2, TXT) and integrates powerful AI tools for summarizing and intelligent navigation via OpenAI, Claude, Gemini, and DeepSeek. Anx also syncs progress, notes, and highlights over WebDAV, and offers rich analytics—including heatmaps and exportable reading summaries. UI customization and text-to-speech enhance the reading experience.
    Downloads: 18 This Week
  • 11
    Podcastfy.ai

    Transforming Multimodal Content into Captivating Multilingual Audio

    Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, and YouTube videos as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources enabling...
    Downloads: 0 This Week
  • 12
    react-use

    Component for React

    ...Tracks mouse hover state of some element. Display an element or video full-screen. Tracks location hash value. Tracks whether user is being inactive. Tracks an HTML element's intersection. Synthesizes speech from a text string. Tracks page navigation bar location state. Re-renders component, while tweening a number from 0 to 1. Tracks long press gesture of some element. Tracks state of a CSS media query. Tracks state of connected hardware devices. Returns a callback, which re-renders component when called. Tracks state of device's motion sensor. ...
    Downloads: 0 This Week
  • 13
    RemoteTTS

    Tool to remotely activate Text-To-Speech (TTS) on a server

    The tool provides a simple TCP/UDP interface to let a remote machine perform TTS outputs.
    Downloads: 0 This Week
  • 14
    Google2SRT

    Download, save and convert multiple subtitles from YouTube videos

    Google2SRT allows you to download, save and convert multiple subtitles and translations from YouTube and Google Video to SubRip (.srt) format, which is recognized by most video players. You can download XML subtitles or simply enter the video's URL, and Google2SRT will do the rest.
    Downloads: 69 This Week
  • 15
    htmid

    Generative Music For Beginners and Everyone Else

    Generative music is a fascinating and innovative approach to music creation that involves creating procedurally generated music that evolves and changes over time. Whether you're a beginner or a seasoned musician, this guide will introduce you to the world of generative music and show you how to create your own live music performances. Generative music is music that is ever-changing and created in real-time. It can be created by anyone, with or without musical experience. Learn how to...
    Downloads: 0 This Week
  • 16

    FOray

    Modular XSL-FO Implementation for Java.

    FOray is an open-source XSL-FO publishing system that is suitable for converting XML content into PDF and other document formats. Although not yet fully conformant with the XSL-FO standard, it is very useful for many applications.
    Downloads: 4 This Week
  • 17
    Provides optical character recognition (OCR) solutions for Vietnamese language.
    Downloads: 181 This Week
  • 18
    Time_limit

    A windowed/full-screen countdown timer.

    A windowed / full-screen countdown timer. Colour and font size changes are used as warnings. A progress bar gives a glance at the time stream. Three different modes are available: time left, time passed, and ordinary clock. When the time is over, several options are available: show the defined message, continue counting the time, launch another application, or close the countdown timer. Useful for speech, lecture or presentation timing. Colour / font...
    Downloads: 34 This Week
  • 19
    jfPaint

    Multi-Tabbed and Multi-Layered Paint program with a rich tool set.

    Multi-Tabbed and Multi-Layered Paint program with a rich tool set. Features: pencil, line, curve, fill, box, circle, text, substitute, transparency, gradients and Gaussian blur. Supports: JPG, PNG, BMP, TIFF, SVG, PGM, PPM, and a multi-layer proprietary file format (.jfpaint).
    Downloads: 8 This Week
  • 20
    threeddonut

    3D donut. Example of frojasg1.com libraries usage

    The application shows a 3D donut that can be rotated about both axes with two sliders. It is a simple example of what can be done with the frojasg1.com platform libraries: - Zoom option for components - Multi language - Dark mode option - Automatic Undo-Redo for text components, with popup menu included - Text Search/Replace window prepared to be used. - Base components for auto-completion windows. - Automatic component relocation after redimensioning a...
    Downloads: 0 This Week
  • 21
    Shutter Encoder

    Free professional video converter Windows|Mac|Linux

    Shutter Encoder is a video, audio and image converter based on FFmpeg and other great tools. It has been designed by video editors in order to be as accessible and efficient as possible. It's a Swiss Army knife for any video editor. Link to website & downloads: https://www.shutterencoder.com - Without conversion: Cut without re-encoding, Replace audio, Rewrap, Conform, Merge, Extract, Subtitling, Video inserts - Sound conversions: WAV, AIFF, FLAC, ALAC, MP3, AAC, AC3,...
    Downloads: 50 This Week
  • 22
    jPicEdt

    Another drawing editor for LaTeX with PSTricks & TikZ

    jPicEdt is an extensible internationalized vector-based drawing editor for LaTeX and related packages (TikZ, PsTricks,...), written in Java. It is also a library of reusable high-level graphic primitives.
    Downloads: 6 This Week
  • 23
    MyBox

    Easy Tools of PDF, Image, File, Network, Data, and Medias

    javafx-desktop-apps pdf image ocr icc barcode color-palette text bytes markdown html archive compress digest video audio editor converter media https://github.com/Mararsh/MyBox Self-contained packages need neither a Java environment nor installation. Jar packages need Java 16 or higher.
    Downloads: 4 This Week
  • 24
    Kisekae UltraKiss

    Kisekae UltraKiss is a full-featured integrated development environment

    UltraKiss is a computer program that implements the Kisekae Set system, KiSS, a Japanese graphics system originally developed to facilitate costume changes on virtual dolls. UltraKiss was developed to help artists build their KiSS sets. It is a full featured viewer for all KiSS dolls, games, and visual applications. It is also a complete graphical development environment for creating KiSS applications. It fully implements the FKiSS event driven programming language up to and including...
    Downloads: 13 This Week
  • 25
    Yaoqiang BPMN Editor

    An open source BPMN 2.0 / DMN 1.1 modeler

    Yaoqiang BPMN Editor is a graphical editor for business process diagrams, compliant with OMG specifications (BPMN 2.0 / DMN 1.1).
    Downloads: 55 This Week