transcribing free download

8 projects for "transcribing" with 1 filter applied:

ChromeOS Clear Filters & Widen Search

$300 Free Credits for Your Google Cloud Projects
Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.

Start Free Trial
Train ML Models With SQL You Already Know
BigQuery automates data prep, analysis, and predictions with built-in AI assistance.

Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.

Try Free
1

WhisperJAV

Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces a specialized pipeline that separates text generation from timestamp alignment, allowing the system to generate transcripts and then align them with audio using forced alignment techniques. ...

Downloads: 25 This Week

Last Update: 2026-05-11
See Project
2

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper

WhisperSpeech is an open-source text-to-speech system created by “inverting” OpenAI’s Whisper, reusing its strengths as a semantic audio model to generate speech instead of only transcribing it. The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS: Whisper is used to produce semantic tokens, EnCodec compresses the waveform into acoustic tokens, and Vocos reconstructs high-fidelity audio from those tokens. ...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
3

Piano transcription

Task of transcribing piano recordings into MIDI files

Piano transcription is an open-source high-resolution piano transcription system by ByteDance that converts raw audio recordings of piano performance into symbolic MIDI files — detecting note onsets, offsets, pitch, velocity, and even pedal usage. The system is implemented in Python (PyTorch) and is capable of accurate transcription of polyphonic piano recordings, even with complex passages and pedal techniques, making it suitable for classical piano music. By using this transcription tool,...

Downloads: 6 This Week

Last Update: 2025-12-02
See Project
4

AutoSub

A CLI script to generate subtitle files (SRT/VTT/TXT) for any video

AutoSub is a Python-based tool designed to automatically generate subtitles for video or audio content using speech recognition technology. It processes media files by extracting audio, transcribing spoken content, and generating subtitle files in standard formats. The tool supports multiple languages and can integrate with translation systems to produce subtitles in different languages. It is designed for automation, allowing batch processing of multiple media files. AutoSub leverages FFmpeg for media handling and integrates with speech recognition engines for transcription. ...

Downloads: 13 This Week

Last Update: 2026-04-28
See Project
Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure
Native application identity and user-based security for your Azure cloud

Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.

Get a free trial
5

THL Tools (Tibetan and Himalayan Librar)

The Tibetan and Himalayan Library has several open-sourced tools for inputting, manipulating, translating, and transcribing Tibetan-language text, audio, and video. We aim to make using a computer easier for Himalayan peoples and scholars.

Downloads: 6 This Week

Last Update: 2013-04-16
See Project
6

MusicAide

MusicAide is a tool to assist musicians in transcribing and typesetting music. It has MIDI support, can export to LilyPond, and can produce guitar tablature.

Downloads: 0 This Week

Last Update: 2016-07-28
See Project
7

MiniHAL

plone product for uploading, transcribing, indexing and translating of audio files

Downloads: 0 This Week

Last Update: 2013-04-22
See Project
8

Keystroke

Keystroke enables the transcription and logging of continuous media such as audio or video. It features media control (pause/play/seek) through keys to increase efficiency while transcribing and/or logging.

1 Review

Downloads: 1 This Week

Last Update: 2013-04-19
See Project