Showing 25 open source projects for "subtitle-workshop"

  • 1
    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitles (hardsubs) from videos

    Extracts hard-coded subtitles (hardsubs) from videos and generates SRT files. Text recognition runs locally, so there is no need to apply for a third-party API. The deep learning-based framework covers both subtitle region detection and subtitle content extraction.
    Downloads: 59 This Week
  • 2
    Video-subtitle-remover (VSR)

    AI tool that removes hardcoded subtitles and text from videos locally

    Video Subtitle Remover is an AI-based application designed to remove hardcoded subtitles from videos and generate new files without the embedded text. Video Subtitle Remover analyzes video frames and detects subtitle regions, then replaces the removed areas using an AI algorithm that fills the space with reconstructed visual content. This process aims to maintain the original resolution and visual continuity of the video after subtitle removal.
    Downloads: 115 This Week
  • 3
    Auto Synced & Translated Dubs

    Automatically translates the text of a video based on a subtitle file

    Auto-Synced-Translated-Dubs is a toolchain that automatically translates and re-dubs videos using AI voices while keeping the new speech aligned to the original timing via subtitle files. It assumes you have a human-made SRT (or similar) subtitle file; the script then uses translation services such as Google Cloud or DeepL to generate translated subtitle tracks in one or more target languages. Using the timestamps of each subtitle line, it computes the required duration of each spoken segment and synthesizes audio via neural TTS services, producing one audio clip per subtitle entry. ...
    Downloads: 1 This Week
  • 4
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps.
    Downloads: 24 This Week
  • 5
    AI YouTube Shorts Generator

    A Python tool that uses GPT-4, FFmpeg, and OpenCV

    ...It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies subtitle overlays, producing a polished short video without manual editing. The tool streamlines multiple steps of the tedious short-form video workflow: highlight detection, clipping, subtitle generation, cropping to vertical 9:16 format, and final rendering, reducing hours of editing to a mostly automated pipeline. Because it supports both local and online video sources, it's flexible whether you're working with your own recorded content or repurposing existing longer-form videos.
    Downloads: 5 This Week
  • 6
    Whisper-WebUI

    A web UI for easy subtitle generation using Whisper models

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. ...
    Downloads: 5 This Week
  • 7
    MoneyPrinterTurbo

    Generate short videos with one click using an LLM

    MoneyPrinterTurbo is an AI-driven tool that enables users to generate high-definition short videos with minimal input. By providing a topic or keyword, the system automatically creates video scripts, sources relevant media assets, adds subtitles, and incorporates background music, resulting in a polished video ready for distribution.
    Downloads: 251 This Week
  • 8
    edge-tts

    Use Microsoft Edge's online text-to-speech service from Python

    ...It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications. The tool lets you list available voices, specify locale and voice name, and generate audio files in common formats like MP3 or WAV. It also supports generating subtitle files (such as SRT or VTT) alongside the speech, which is handy for video narration, e-learning, or accessibility workflows. From the CLI you can adjust parameters such as speaking rate, volume, and pitch, giving you some control over prosody without diving into SSML. The library is asynchronous under the hood, which makes it efficient for batch jobs or web services that need to synthesize many utterances concurrently.
    Downloads: 24 This Week
  • 9
    stt

    Voice Recognition to Text Tool

    stt is a standalone speech recognition tool that locally converts spoken content in audio or video files into textual formats without requiring internet access, giving users control over their data and reducing reliance on external APIs. It leverages open-source speech models such as Faster-Whisper to recognize and transcribe human speech into plain text, structured JSON objects, or subtitle files with time codes, making it suitable for both personal and professional transcription tasks. The project is designed to be easy to deploy: you can run a local Python server that exposes an HTTP API for uploading audio/video files and retrieving transcriptions in different formats. It supports GPU acceleration if available, enabling faster processing on compatible hardware while still offering reliable performance on CPUs alone.
    Downloads: 0 This Week
  • 10
    AI-Media2Doc

    AI tool converting video/audio into structured documents instantly

    ...It separates client-side media handling from backend AI processing, reducing data exposure while still enabling transcription and document generation. AI-Media2Doc supports flexible customization through prompts, allowing users to tailor output styles based on their needs. It also includes features like subtitle export and AI-assisted follow-up questioning for deeper interaction with the generated content.
    Downloads: 0 This Week
  • 11
    Voice-Pro

    Comprehensive Gradio WebUI for audio processing

    Voice-Pro is a Gradio WebUI for transcription, translation, and text-to-speech. It can be installed with one click and creates a virtual environment using Miniconda that runs completely separate from the Windows system (fully portable). It supports real-time transcription and translation, as well as batch mode.
    Downloads: 17 This Week
  • 12
    WhisperJAV

    Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

    WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces...
    Downloads: 22 This Week
  • 13
    abogen

    Generate audiobooks from EPUBs, PDFs and text with captions

    abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format, likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on...
    Downloads: 5 This Week
  • 14
    The AI Scientist-v2

    Workshop-Level Automated Scientific Discovery via Agentic Tree Search

    AI-Scientist-v2 is an advanced autonomous research system designed to perform end-to-end scientific discovery using large language models and agent-based orchestration. The platform is capable of generating original research ideas, designing and executing experiments, analyzing and visualizing results, and producing full academic papers without direct human intervention. It introduces a generalized framework that removes reliance on predefined templates, enabling broader applicability across...
    Downloads: 2 This Week
  • 15
    Habit Tracker

    Habit Tracker for the AI Coding Workshop

    Habit Tracker is a personal habit-tracking web application designed to help users build and maintain daily habits through intuitive UI and analytics that visualize progress over time. It runs locally with a FastAPI backend (Python) and a React frontend, storing all data in a lightweight SQLite database so there’s no need for user accounts or cloud storage, which keeps habit data fully private and self-contained. The app provides streak tracking and completion rates for each habit, giving...
    Downloads: 1 This Week
  • 16
    Violin

    Open-source Video Translation Skill

    Violin is an open-source video translation and dubbing tool that turns existing videos into localized versions with translated voice-over and optional subtitles. It transcribes the original speech, translates the text, generates natural-sounding speech in the target language, and remuxes the new audio back into the video. The project is designed to keep the generated speech aligned with the original timing so the final result feels closer to a real dubbed video. It can be used from the...
    Downloads: 1 This Week
  • 17
    yt-fts

    Search all of YouTube from the command line

    yt-fts, short for YouTube Full Text Search, is an open-source command-line tool that enables users to search the spoken content of YouTube videos by indexing their subtitles. The program automatically downloads subtitles from a specified YouTube channel using the yt-dlp utility and stores them in a local SQLite database. Once indexed, users can perform full-text searches across all transcripts to quickly locate keywords or phrases mentioned within the videos. The tool returns search results...
    Downloads: 0 This Week
  • 18
    HunyuanOCR

    OCR expert VLM powered by Hunyuan's native multimodal architecture

    HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent-Hunyuan. It is designed to unify the entire OCR pipeline (detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation) into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a wide variety of OCR tasks, outperforming many traditional OCR systems and even other multimodal models on benchmark suites. ...
    Downloads: 1 This Week
  • 19
    WhisperJAV

    A subtitle generator for Japanese Adult Videos.

    Transformer-based ASR architectures like Whisper suffer significant performance degradation when applied to the spontaneous and noisy domain of JAV. This degradation is driven by specific acoustic and temporal characteristics that defy the statistical distributions of standard training data.
    Downloads: 87 This Week
  • 20
    SoniTranslate

    Synchronized Translation for Videos

    ...It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets the generated dub track stay in sync with the original video structure. The project supports a wide range of languages for translation, spanning major world languages (English, Spanish, French, German, Chinese, Arabic, etc.) and many regional or less widely spoken languages, making it suitable for broad internationalization. ...
    Downloads: 25 This Week
  • 21
    SoundTranscriber

    SoundTranscriber can be used to generate automatic transcription / aut
    Downloads: 0 This Week
  • 22
    YYeTsBot

    Renren Film and Television bot, fully connected to Renren resources

    ...You can directly send the name of the episode you want to watch, and you can choose to share the webpage or link (ed2k and magnet links). Resources are searched according to a predetermined priority (everyone video offline, subtitle man); you can also use commands to force a specific subtitle group. Because translated titles differ, it is recommended to enter part of the title and then select from the list. For example, if you want to watch the fourth season of Game of Thrones, just search for "Game of Thrones". Want to keep a resource for yourself, but don't know how to program? ...
    Downloads: 1 This Week
  • 23
    VATSG

    Automatic video transcription and translated subtitle generator

    It generates SRT subtitles from a video file in any source language that Whisper supports, and then creates a translated subtitle file in any target language that DeepL supports. This subtitle generator (VATSG) uses [moviepy](https://github.com/Zulko/moviepy) to generate an MP3, then [faster-whisper](https://github.com/guillaumekln/faster-whisper) for text recognition, and then the DeepL API to generate the target-language subtitle file (SRT format). If you are a general user who wants to view any video or MP3 file in your own language, it provides a way. ...
    Downloads: 0 This Week
  • 24
    commit-autosuggestions

    An AI tool that automatically recommends commit messages

    This is an implementation of CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model. CommitBERT was accepted at the ACL workshop NLP4Prog. Have you ever hesitated to write a commit message? Now get a commit message from artificial intelligence! CodeBERT: A Pre-Trained Model for Programming and Natural Languages introduces a pre-trained model for a combination of programming language and natural language (PL-NL). It also introduces the problem of converting code into natural language (Code Documentation Generation). ...
    Downloads: 0 This Week
  • 25
    A collection of software made by Milos Rancic.
    Downloads: 1 This Week