Showing 25 open source projects for "transcribing"

  • 1
    WhisperJAV

    Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

    WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces a specialized pipeline that separates text generation from timestamp alignment, allowing the system to generate transcripts and then align them with audio using forced alignment techniques. ...
    Downloads: 25 This Week
  • 2
    Hyprnote

    Local-first AI Notepad for Private Meetings

    Hyprnote is an open-source, privacy-first AI notepad app for taking notes during meetings. It transcribes audio (microphone and system) and generates context-rich summaries using on-device AI models like Whisper and HyprLLM, all without any data leaving your machine. Listens to your meetings while you write. Crafts smart summaries based on your quick notes. Runs completely offline using open-source models like Whisper or HyprLLM.
    Downloads: 6 This Week
  • 3
    whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps

    ...Whisper models were trained to predict approximate timestamps on speech segments (most of the time with 1-second accuracy), but they cannot originally predict word timestamps. This repository proposes an implementation to predict word timestamps and provide a more accurate estimation of speech segments when transcribing with Whisper models. Besides, a confidence score is assigned to each word and each segment.
    Downloads: 0 This Week
  • 4
    Style-Bert-VITS2

    Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles

    Style-Bert-VITS2 is a text-to-speech system based on Bert-VITS2 that focuses on highly controllable voice styles and emotional expression. It takes the original Bert-VITS2 v2.1 and its Japanese-Extra variant and extends them so you can control emotion and speaking style with fine-grained intensity, not just choose a generic tone. The project targets both power users and beginners: Windows users without Git or Python can install and run it using bundled .bat scripts, while advanced users can...
    Downloads: 2 This Week
  • 5
    WhisperSpeech

    An Open Source text-to-speech system built by inverting Whisper

    WhisperSpeech is an open-source text-to-speech system created by “inverting” OpenAI’s Whisper, reusing its strengths as a semantic audio model to generate speech instead of only transcribing it. The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS: Whisper is used to produce semantic tokens, EnCodec compresses the waveform into acoustic tokens, and Vocos reconstructs high-fidelity audio from those tokens. ...
    Downloads: 0 This Week
  • 6

    Whisper-Studio

    Another whisper wrapper, built fully in C++, with some neat features.

    A native, lightweight C++ application for OpenAI's Whisper, with a few new things like transcribing audio in real time, identifying speakers, auto-pasting transcriptions, and a few other things. It's not the prettiest app, I suck at design, but it gets the job done.
    Downloads: 0 This Week
  • 7
    Radio Transcription Tool v3.1

    A professional Python application for recording and transcribing Dutch and Belgian radio streams using the OpenAI Whisper API, with advanced keyword extraction powered by KeyBERT.

    Features:
    - Live Radio Recording: Record streams from 40+ Dutch and Belgian radio stations
    - Live Stream Listening: Listen to radio streams without recording
    - AI Transcription: High-quality transcription using OpenAI Whisper API
    - Smart Keyword Extraction: Advanced phrase analysis with KeyBERT
    - Professional UI: Modern Tkinter interface with Bluvia branding
    - Organized Output: Timestamped folders with MP3 recordings and transcriptions
    - API Key Management: Built-in OpenAI API key configuration
    Downloads: 0 This Week
  • 8
    A2M — Audio to MIDI

    A2M is a desktop app that converts AUDIO TO MIDI in one click.

    A2M (Audio To MIDI) is a simple desktop tool for transcribing local audio files into MIDI files with one click. It is designed primarily for piano transcription and works best on solo piano recordings. Using A2M is straightforward: select an audio file, click Convert, and the application generates a MIDI file automatically in your Downloads/A2M folder. All processing is done locally on your device: no uploads, no accounts, and no telemetry.
    Downloads: 77 This Week
  • 9
    Piano transcription

    Task of transcribing piano recordings into MIDI files

    Piano transcription is an open-source high-resolution piano transcription system by ByteDance that converts raw audio recordings of piano performance into symbolic MIDI files — detecting note onsets, offsets, pitch, velocity, and even pedal usage. The system is implemented in Python (PyTorch) and is capable of accurate transcription of polyphonic piano recordings, even with complex passages and pedal techniques, making it suitable for classical piano music. By using this transcription tool,...
    Downloads: 6 This Week
  • 10
    AutoSub

    A CLI script to generate subtitle files (SRT/VTT/TXT) for any video

    AutoSub is a Python-based tool designed to automatically generate subtitles for video or audio content using speech recognition technology. It processes media files by extracting audio, transcribing spoken content, and generating subtitle files in standard formats. The tool supports multiple languages and can integrate with translation systems to produce subtitles in different languages. It is designed for automation, allowing batch processing of multiple media files. AutoSub leverages FFmpeg for media handling and integrates with speech recognition engines for transcription. ...
    Downloads: 13 This Week
  • 11
    Tengwar Editor

    Create, save, copy and edit tengwar texts with this application.

    Tengwar is the script of the Elvish languages, invented by J.R.R. Tolkien. The application writes sentences in different languages using Tengwar symbols.
    Downloads: 0 This Week
  • 12
    A tool for segmenting, labeling, and transcribing speech.
    Downloads: 24 This Week
  • 13

    FtlTranscribe

    For transcribing: Writing down the content of a sound file

    I made this program to help me transcribe the interviews for my master's thesis. Select a sound file, then select a text file. Play an interval of, e.g., 5 seconds of the sound file by pressing F8. Write into the text file. Then play the next interval. F9 repeats the last interval. To install: download the zip file, unpack it, and run the setup.
    Downloads: 0 This Week
  • 14
    This is a toolkit for transcribing a music audio file to common music notation. This is done by manually annotating a spectrogram or a similar representation and converting it to a MIDI file and to an ABC music notation file.
    Downloads: 0 This Week
  • 15
    The Tibetan and Himalayan Library has several open-sourced tools for inputting, manipulating, translating, and transcribing Tibetan-language text, audio, and video. We aim to make using a computer easier for Himalayan peoples and scholars.
    Downloads: 6 This Week
  • 16
    TranscriberAG is designed to assist with the manual annotation of speech signals. It provides a user-friendly GUI for segmenting long-duration speech recordings, transcribing them, and labeling speech turns, topic changes, and acoustic conditions.
    Downloads: 4 This Week
  • 17
    MusicAide is a tool to assist musicians in transcribing and typesetting music. It has MIDI support, can export to LilyPond, and can produce guitar tablature.
    Downloads: 0 This Week
  • 18
    Plone product for uploading, transcribing, indexing, and translating audio files.
    Downloads: 0 This Week
  • 19
    Keystroke enables the transcription and logging of continuous media such as audio or video. It features media control (pause/play/seek) through keys to increase efficiency while transcribing and/or logging.
    Downloads: 1 This Week
  • 20
    OpenTranscribe is software to aid musicians in transcribing music. It lets you slow down a part of the music without affecting pitch. It also lets you loop over a section that has some tricky parts.
    Downloads: 0 This Week
  • 21
    SignStream is an application for transcribing video and other time-based media. Its primary use is linguistic analysis, notably of signed languages and gesture, although it is intended to facilitate analysis of other types of time-based media.
    Downloads: 0 This Week
  • 22
    TransKribe is a very simple, rather unfinished KDE application designed to aid in the task of transcribing audio (speech) recordings. The most important features are playback control via easily accessible keys and automatic insertion of time-marks.
    Downloads: 0 This Week
  • 23
    Scribam is an application for transcribing and engraving Gregorian Chant. It assists in the initial entry steps, as well as in the final layout and pagination phase.
    Downloads: 0 This Week
  • 24
    GUIDE (GUgyeol Integrated Decipherment Environment) is a set of tools for browsing historical document images and transcribing, deciphering, and translating them. Developed for studying old Buddhist codices, it should also be serviceable for studying any papyri.
    Downloads: 0 This Week
  • 25
    Linguistics tool, mainly for discourse analysis. Trans-Scribe will provide users with an interface for transcribing .wav audio speech files in any language and creating documents that display, publish, or export the waveform along with transcriptions.
    Downloads: 0 This Week