Showing 49 open source projects for "subtitle-workshop"

  • 1
    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitles (hardsubs) from videos

    Extracts hard-coded subtitles (hardsubs) from videos and generates SRT files. Text recognition runs locally, so there is no need to apply for a third-party API. The tool is a deep learning-based video subtitle extraction framework that covers subtitle region detection and subtitle content extraction, wrapped in a GUI.
    Downloads: 59 This Week
  • 2
    Video-subtitle-remover (VSR)

    AI tool that removes hardcoded subtitles and text from videos locally

    Video Subtitle Remover is an AI-based application designed to remove hardcoded subtitles from videos and generate new files without the embedded text. Video Subtitle Remover analyzes video frames and detects subtitle regions, then replaces the removed areas using an AI algorithm that fills the space with reconstructed visual content. This process aims to maintain the original resolution and visual continuity of the video after subtitle removal.
    Downloads: 115 This Week
  • 3
    Translate-Subtitle-File

    Subtitle Creation Assistant

    A machine translation assistant for subtitle groups. Function 1: translate subtitle files (.srt, .ass, .vtt). Function 2: voice to text (drag in a video or audio file to recognize subtitles). Latest version: v4.1.0 (update time: 2021 2 May 23). 12 translation service providers can be configured, such as Google, Baidu, Tencent, Caiyun, IBM, Azure, and Amazon, along with 6 voice service providers: Alibaba Cloud, Xunfei, Tencent Cloud, IBM, Azure, and Amazon. Advantages: 1. ...
    Downloads: 1 This Week
  • 4
    LLPlayer

    The media player for language learning, with dual subtitles

    ...Real-time translation capabilities enable subtitles to be translated using multiple translation engines and language models. Additional tools such as instant word lookup, contextual translation, and subtitle search allow learners to interact with the text while watching videos.
    Downloads: 44 This Week
  • 5
    Auto Synced & Translated Dubs

    Automatically translates the text of a video based on a subtitle file

    Auto-Synced-Translated-Dubs is a toolchain that automatically translates and re-dubs videos using AI voices while keeping the new speech aligned to the original timing via subtitle files. It assumes you have a human-made SRT (or similar) subtitle file; the script then uses translation services such as Google Cloud or DeepL to generate translated subtitle tracks in one or more target languages. Using the timestamps of each subtitle line, it computes the required duration of each spoken segment and synthesizes audio via neural TTS services, producing one audio clip per subtitle entry. ...
    Downloads: 1 This Week
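
The duration step described above (deriving each spoken segment's length from the subtitle timestamps) can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `segment_durations`, the regular expression, and the sample subtitle text are all assumptions made for the example.

```python
import re
from datetime import timedelta

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def segment_durations(srt_text: str) -> list[float]:
    """Return the duration in seconds of every subtitle entry (hypothetical helper)."""
    durations = []
    for match in TIMING.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
        start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
        end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
        durations.append((end - start).total_seconds())
    return durations


# Made-up sample input, used only to exercise the helper.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Hello there.

2
00:00:04,000 --> 00:00:06,250
Welcome back.
"""

print(segment_durations(SAMPLE))  # [2.5, 2.25]
```

A dubbing pipeline could use each duration as the target length for the synthesized clip of the matching subtitle entry.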
  • 6
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps.
    Downloads: 24 This Week
  • 7
    AutoSubs

    Instantly generate AI-powered subtitles on your device

    AutoSubs is an open-source, AI-powered subtitle generation tool that enables users to automatically transcribe audio and video content into accurate, editable subtitles directly on their device. It supports both standalone usage and integration with professional video editing software such as DaVinci Resolve, allowing creators to generate and edit subtitles within their existing workflows.
    Downloads: 14 This Week
  • 8
    KrillinAI

    Video translation and dubbing tool powered by LLMs

    KrillinAI is an end-to-end content localization, translation, and dubbing tool aimed at helping creators transform videos into multiple languages with minimal manual effort. It integrates several stages of the pipeline: video acquisition (either from local files or remote via download tools), speech recognition (ASR), subtitle segmentation and alignment, machine translation (with context-aware translation to preserve semantics), and voice cloning + text-to-speech (TTS) to produce dubbed audio tracks. KrillinAI supports both landscape and portrait videos, which makes it suitable for a wide range of platforms — from YouTube to TikTok or other vertical-video sites — and ensures correct formatting and layout for the final video. ...
    Downloads: 18 This Week
  • 9
    AI YouTube Shorts Generator

    A Python tool that uses GPT-4, FFmpeg, and OpenCV

    ...It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies subtitle overlays, producing a polished short video without manual editing. The tool streamlines multiple steps of the tedious short-form video workflow: highlight detection, clipping, subtitle generation, cropping to vertical 9:16 format, and final rendering — reducing hours of editing to a mostly automated pipeline. Because it supports both local and online video sources, it's flexible whether you're working with your own recorded content or repurposing existing longer-form videos.
    Downloads: 5 This Week
  • 10
    Whisper-WebUI

    A web UI for easy subtitle generation using Whisper models

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. ...
    Downloads: 5 This Week
  • 11
    MoneyPrinterTurbo

    Generate short videos with one click using an AI LLM

    MoneyPrinterTurbo is an AI-driven tool that enables users to generate high-definition short videos with minimal input. By providing a topic or keyword, the system automatically creates video scripts, sources relevant media assets, adds subtitles, and incorporates background music, resulting in a polished video ready for distribution.
    Downloads: 251 This Week
  • 12
    LLM From Scratch

    Build and train a GPT-style language model

    LLM From Scratch is a hands-on educational workshop project that teaches developers how to build and train a GPT-style language model entirely from scratch using PyTorch. Instead of relying on high-level abstractions or prebuilt frameworks, the project walks users through implementing every core component manually, including tokenization, transformer architecture, training loops, and autoregressive text generation.
    Downloads: 0 This Week
  • 13
    edge-tts

    Use Microsoft Edge's online text-to-speech service from Python

    ...It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications. The tool lets you list available voices, specify locale and voice name, and generate audio files in common formats like MP3 or WAV. It also supports generating subtitle files (such as SRT or VTT) alongside the speech, which is handy for video narration, e-learning, or accessibility workflows. From the CLI you can adjust parameters such as speaking rate, volume, and pitch, giving you some control over prosody without diving into SSML. The library is asynchronous under the hood, which makes it efficient for batch jobs or web services that need to synthesize many utterances concurrently.
    Downloads: 24 This Week
  • 14
    Short Video Factory

    AI tool for automatic batch short video creation and editing

    ...It enables users to generate product marketing clips and general content videos by combining simple prompt-based input with pre-prepared media assets. Short Video Factory integrates multiple stages of video production, including script generation, voice synthesis, video editing, and subtitle effects, into a single streamlined workflow. By leveraging AI technologies, it significantly reduces the manual effort required to produce high-quality short videos at scale. Short Video Factory supports batch processing, allowing users to automatically generate multiple videos based on predefined templates and configurations. It is built as a cross-platform desktop solution with a focus on usability, making it accessible to both beginners and content creators who need fast turnaround times.
    Downloads: 2 This Week
  • 15
    stt

    Voice Recognition to Text Tool

    stt is a standalone speech recognition tool that locally converts spoken content in audio or video files into textual formats without requiring internet access, giving users control over their data and reducing reliance on external APIs. It leverages open-source speech models such as Faster-Whisper to recognize and transcribe human speech into plain text, structured JSON objects, or subtitle files with time codes, making it suitable for both personal and professional transcription tasks. The project is designed to be easy to deploy: you can run a local Python server that exposes an HTTP API for uploading audio/video files and retrieving transcriptions in different formats. It supports GPU acceleration if available, enabling faster processing on compatible hardware while still offering reliable performance on CPUs alone.
    Downloads: 0 This Week
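
To make "subtitle files with time codes" concrete, here is a minimal sketch of turning transcribed segments into SRT text. It does not reflect stt's actual implementation or API: the helper names `srt_timestamp` and `to_srt`, and the `(start, end, text)` segment shape, are assumptions chosen for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT time code, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Render (start_seconds, end_seconds, text) segments as SRT text.

    The segment shape is hypothetical; a real recognizer may return
    richer objects that would need to be mapped to this form first.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Entries are separated by a blank line.
    return "\n".join(blocks)


print(to_srt([(0.0, 1.5, "Hello there."), (2.0, 4.25, "Welcome back.")]))
```

The same segment list could also be serialized as plain text or JSON, which is why keeping the timing data separate from the output format is convenient.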
  • 16
    EasyVoice

    Open source text-to-speech tool, supports extra-long text

    ...The system supports multi-role voice acting, letting users assign different neural voices to different characters or narrative roles and configure parameters such as rate, pitch, and volume per role. It offers streaming playback so audio starts almost immediately, even for very long inputs, and automatically generates subtitle files suitable for video production or translation workflows. Under the hood, EasyVoice uses a modern stack with Vue 3 and Element Plus on the front end, Node.js and Express on the back end, and TTS engines such as Microsoft Azure TTS and OpenAI-compatible APIs, orchestrated through ffmpeg.
    Downloads: 3 This Week
  • 17
    AI-Media2Doc

    AI tool converting video/audio into structured documents instantly

    ...It separates client-side media handling from backend AI processing, reducing data exposure while still enabling transcription and document generation. AI-Media2Doc supports flexible customization through prompts, allowing users to tailor output styles based on their needs. It also includes features like subtitle export and AI-assisted follow-up questioning for deeper interaction with the generated content.
    Downloads: 0 This Week
  • 18
    Voice-Pro

    Comprehensive Gradio WebUI for audio processing

    Voice-Pro is a Gradio WebUI for transcription, translation, and text-to-speech. It can be installed with one click and sets up a Miniconda virtual environment that runs completely separately from the Windows system (fully portable). It supports real-time transcription and translation, as well as batch mode.
    Downloads: 17 This Week
  • 19
    WhisperJAV

    Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

    WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces...
    Downloads: 22 This Week
  • 20
    abogen

    Generate audiobooks from EPUBs, PDFs and text with captions

    abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on...
    Downloads: 5 This Week
  • 21
    OpenShorts

    Free & open source AI video platform

    OpenShorts is an open-source, self-hosted AI video automation platform designed to generate, edit, and distribute short-form vertical content across social media platforms. It combines multiple tools into a single pipeline, including clip generation, AI-driven video creation, and YouTube optimization features. The system can transform long videos or uploaded files into short clips by detecting engaging moments, reframing content, and adding subtitles and visual effects. It also supports...
    Downloads: 5 This Week
  • 22
    Habit Tracker

    Habit Tracker for the AI Coding Workshop

    Habit Tracker is a personal habit-tracking web application designed to help users build and maintain daily habits through intuitive UI and analytics that visualize progress over time. It runs locally with a FastAPI backend (Python) and a React frontend, storing all data in a lightweight SQLite database so there’s no need for user accounts or cloud storage, which keeps habit data fully private and self-contained. The app provides streak tracking and completion rates for each habit, giving...
    Downloads: 1 This Week
  • 23
    xgplayer

    An HTML5 video player with a parser that saves traffic

    ...Because of its emphasis on modularity and extensibility, xgplayer can be embedded into modern web projects and customized — developers can add controls, custom buffering strategies, subtitle handling, adaptive bitrate streaming, or integrate with other web-based video infrastructures. It seeks to provide a smooth, stable viewing experience even on varied devices or network conditions, and is particularly appealing for web apps that need more control than vanilla video tags offer.
    Downloads: 5 This Week
  • 24
    Violin

    Open-source Video Translation Skill

    Violin is an open-source video translation and dubbing tool that turns existing videos into localized versions with translated voice-over and optional subtitles. It transcribes the original speech, translates the text, generates natural-sounding speech in the target language, and remuxes the new audio back into the video. The project is designed to keep the generated speech aligned with the original timing so the final result feels closer to a real dubbed video. It can be used from the...
    Downloads: 1 This Week
  • 25
    Wandesk

    Give intelligence shape. Let AI sculpt your desktop

    Wandesk is an open-source AI desktop environment that lets users build local apps by describing what they need in natural language. It combines chat, generated apps, shared workspace context, personal memory, files, tasks, and agent integrations into one desktop-like interface. Its App Workshop can generate a React UI, backend API, and SQLite storage together, allowing users to create usable local tools without manually writing the whole application. Wandesk is local-first, works without signup, and can connect to providers such as Claude, Codex, DeepSeek, OpenAI, Kimi, Qwen, or any OpenAI-compatible endpoint. Built-in apps include Chat, Notebook, Ledger, Memory, Files, Tasks, Settings, Claude Code, Codex, and Open Source Radar. ...
    Downloads: 0 This Week