AI-Media2Doc is a web-based application that uses large language models to convert video and audio into structured, readable documents in a single workflow. It turns multimedia input into knowledge notes, summaries, mind maps, and social-media-style articles, making the content easier to review and reuse. The application emphasizes privacy: media is processed locally in the browser with a WebAssembly build of ffmpeg, so original video files are not uploaded to an external server. Client-side media handling is kept separate from backend AI processing, which reduces data exposure while still enabling transcription and document generation. Users can customize prompts to tailor the output style to their needs. Additional features include subtitle export and AI-assisted follow-up questions for exploring the generated content in more depth.
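The prompt-based customization described above could look roughly like the following. This is a minimal sketch and not the project's actual code: the names `STYLE_PROMPTS` and `build_prompt`, the style keys, and the prompt wording are all hypothetical and chosen only for illustration.

```python
# Hypothetical built-in prompts, one per output style (not taken from the project).
STYLE_PROMPTS = {
    "notes": "Rewrite the transcript as structured knowledge notes with headings and bullet points.",
    "summary": "Summarize the transcript in a few concise paragraphs.",
    "mind_map": "Convert the transcript into a nested outline suitable for a mind map.",
}


def build_prompt(transcript, style, custom_prompts=None):
    """Combine a style instruction with the transcript text.

    custom_prompts lets a user override a built-in style or add a new one.
    """
    prompts = {**STYLE_PROMPTS, **(custom_prompts or {})}
    if style not in prompts:
        raise ValueError(f"unknown style: {style!r}")
    return f"{prompts[style]}\n\nTranscript:\n{transcript.strip()}"
```

A user-defined style is then just another dictionary entry, for example `build_prompt(text, "faq", {"faq": "Turn the transcript into a list of questions and answers."})`.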
Features
- Converts video and audio into multiple document styles such as notes and mind maps
- Local media processing in the browser using ffmpeg compiled to WebAssembly, with no installation required
- Privacy-focused design: no login required, and data is stored locally
- AI-powered transcription and document generation pipeline
- Subtitle export and intelligent screenshot insertion into documents
- Customizable prompts and optional AI chat for content interaction
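
As an illustration of the subtitle export feature, the sketch below converts timestamped transcript segments into the SubRip (SRT) format. It assumes the transcription step yields segments with start and end times in seconds. The `Segment` type and both function names are hypothetical, and the formats the project actually exports are not stated here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed utterance (hypothetical shape, times in seconds)."""
    start: float
    end: float
    text: str


def format_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}"
        blocks.append(f"{index}\n{timing}\n{seg.text.strip()}")
    return "\n\n".join(blocks) + "\n"
```

Calling `to_srt([Segment(0.0, 2.5, "Hello"), Segment(2.5, 4.0, "World")])` produces two numbered cues, each with its timing line followed by the cue text.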