pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the translated subtitles, and then merges that speech back into the video, creating a fully localized media file. The tool supports both command-line and GUI modes, making it accessible to developers and creatives needing batch or automated processing.
Features
- Automatic end-to-end video translation pipeline
- Speech transcription and subtitle generation
- AI voice synthesis and dubbing
- Interactive proofreading support
- Speaker diarization and role assignment
- Supports local and cloud model backends