Open Source Python Multimedia Software - Page 9

Python Multimedia Software

Browse free open source Python Multimedia Software and projects below.

  • 1
    YouTube-DL-PyTK

    Video Downloader - Cross Platform

    YouTube-DL-PyTK (formerly known as YouTube-DL-GTK) is a graphical launcher for yt-dlp. Its purpose is simple: to make it easier to download non-copyright-protected videos from certain websites, including YouTube. Source code is included and can be run directly if you have Python and the required dependencies installed.

    I digitally sign some files in my releases. To verify those signatures, you can find my PGP/GPG keys at: https://marcusadams.me/keys.html

    To read the source code without downloading anything, see the project's GitLab page: https://gitlab.com/gerowen/youtube-dl-pytk

    There are several ways to donate:
    PayPal: https://paypal.me/gerowen
    Bitcoin (BTC): bc1q86c5j7wvf6cw78tf8x3szxy5gnxg4gj8mw4sy2
    Monero (XMR): 42ho3m9tJsobZwQDsFTk92ENdWAYk2zL8Qp42m7pKmfWE7jzei7Fwrs87MMXUTCVifjZZiStt3E7c5tmYa9qNxAf3MbY7rD
    Downloads: 8 This Week
  • 2
    Video Frame Extractor

    Extracts semi-random frames from all MP4 videos

    This simple tool extracts frames from all MP4 videos in the same folder as the program.

    How to use:
    - Place this program in the folder containing your MP4 videos.
    - Double-click on VideoFrameExtractor.exe to run it.
    - When prompted, enter the number of frames you want to extract from each video.
    - Wait for the program to finish processing all videos.
    - Find your extracted frames in the 'extracted_frames' folder.

    The frames are extracted at evenly distributed points throughout each video. For example, if you choose 3 frames, they will be taken at the 25%, 50%, and 75% marks of each video. (Source code is included in the program's .zip file.)
    Downloads: 27 This Week
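
    The even spacing described for Video Frame Extractor can be sketched as follows. This is a minimal illustration, not the project's own code: the function names are hypothetical, and it assumes the sample points are placed at i/(N+1) of the video length, which reproduces the 25%, 50%, and 75% example for N = 3.

```python
def frame_fractions(count):
    """Return `count` evenly spaced positions strictly between 0 and 1."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # Dividing by count + 1 keeps the samples away from the first and last frame.
    return [i / (count + 1) for i in range(1, count + 1)]


def frame_indices(count, total_frames):
    """Map the fractional positions onto frame indices of one video."""
    return [int(fraction * total_frames) for fraction in frame_fractions(count)]


print(frame_fractions(3))      # [0.25, 0.5, 0.75]
print(frame_indices(3, 1200))  # [300, 600, 900]
```

    Whether the real tool rounds, truncates, or seeks by timestamp instead of frame index is not stated in the listing, so the truncation above is only one plausible choice.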
  • 3
    PixelToPath

    Convert PNG to SVG with a simple GUI tool.

    PixelToPath is an open-source application that converts PNG images into scalable vector graphics (SVG) using the Potrace engine. Designed with simplicity in mind, it provides an intuitive graphical interface to adjust vectorization settings such as smoothing, threshold, and curve precision. PixelToPath is available as a standalone executable for Windows (no Python or installation required) and as a source version for Linux and Windows users who prefer customization. Potrace is fully integrated, allowing offline usage with no extra configuration. Whether you're a designer, developer, or hobbyist, PixelToPath makes bitmap-to-vector conversion fast, accessible, and efficient. The project is hosted on GitHub with source code and releases available for download.
    Downloads: 15 This Week
  • 4
    easycap-app

    Capture your screen with unprecedented ease and quality.

    Welcome to EasyCap, your ultimate desktop screen recorder and screenshot editor. Designed with simplicity and power in mind, EasyCap is perfect for professionals, creators, and anyone looking to capture their PC activities with ease. Whether you're creating tutorials, recording gameplay, or capturing important moments, EasyCap makes it effortless.
    Downloads: 15 This Week
  • 5
    AI-Shorts-Creator

    Python-based tool that leverages the power of GPT-4

    AI-Shorts-Creator is a Python-based tool that automates the creation of short-form videos by analyzing long-form content and extracting the most engaging segments. It uses AI models to evaluate transcripts and identify highlight moments, then processes video clips accordingly. The system integrates FFmpeg and OpenCV to crop videos dynamically, often using face detection to keep subjects centered. It is designed for content creators who want to generate short videos quickly without manual editing. The workflow includes downloading source videos, analyzing content, segmenting highlights, and exporting finished clips. It supports multiple video formats and is configurable for different use cases. Overall, it provides an automated pipeline for transforming long videos into social media-ready shorts.
    Downloads: 1 This Week
  • 6
    BackgroundRemover

    Background Remover lets you remove the background from images and video

    BackgroundRemover is a command-line tool for removing the background from images and video, made by nadermx to power BackgroundRemoverAI. If you wonder why it was made, read this short blog post.
    Downloads: 1 This Week
  • 7
    Consistent Depth

    We estimate dense, flicker-free, geometrically consistent depth

    Consistent Depth is a research project developed by Facebook Research that presents an algorithm for reconstructing dense and geometrically consistent depth information for all pixels in a monocular video. The system builds upon traditional structure-from-motion (SfM) techniques to provide geometric constraints while integrating a convolutional neural network trained for single-image depth estimation. During inference, the model fine-tunes itself to align with the geometric constraints of a specific input video, ensuring stable and realistic depth maps even in less-constrained regions. This approach achieves improved geometric consistency and visual stability compared to prior monocular reconstruction methods. The project can process challenging hand-held video footage, including footage with moderate dynamic motion, making it practical for real-world usage.
    Downloads: 1 This Week
  • 8
    EnCodec

    State-of-the-art deep learning based audio codec

    EnCodec is a neural audio codec developed by Meta for high-fidelity, low-bitrate audio compression using end-to-end deep learning. Unlike traditional codecs (like MP3 or Opus), EnCodec uses a learned quantizer and decoder to reconstruct complex waveforms with remarkable accuracy at bitrates as low as 1.5 kbps. It employs a convolutional encoder–decoder architecture trained with perceptual loss functions that optimize for human auditory quality rather than raw waveform distance. The model can operate in real time and supports variable bandwidths, bitrates, and multi-band audio. EnCodec has applications in speech and music compression, generative modeling, and efficient data transmission for communication systems. The repository includes pretrained checkpoints, PyTorch inference code, and examples for integrating EnCodec as a module in downstream generative or streaming systems.
    Downloads: 1 This Week
  • 9
    Google Photos Sync

    Google Photos and Albums backup with Google Photos Library API

    Google Photos Sync is a backup tool for your Google Photos cloud storage. Google Photos Sync downloads all photos and videos the user has uploaded to Google Photos. It also organizes the media in the local file system using album information. Additional Google Photos 'Creations' such as animations, panoramas, movies, effects and collages are also backed up. This software is read only and never modifies your cloud library in any way, so there is no risk of damaging your data. There are a number of long standing issues with the Google Photos API that mean it is not possible to make a true backup of your media.
    Downloads: 1 This Week
  • 10
    Green Recorder

    A simple screen recorder for Linux desktop

    Green Recorder is a desktop screen recording application designed for Linux systems, providing a simple interface for capturing screen activity and audio. It supports recording in multiple formats by leveraging FFmpeg and other backend tools to encode output efficiently. The application allows users to record full screens or specific areas, making it suitable for tutorials and demonstrations. It includes options for selecting audio sources and controlling frame rates to balance quality and performance. Green Recorder is designed to be lightweight and user-friendly, minimizing system overhead during recording sessions. It also supports Wayland and Xorg environments, ensuring compatibility across different Linux setups. Overall, it offers an accessible solution for screen recording with essential customization options.
    Downloads: 1 This Week
  • 11
    Imogen

    GPU Texture Generator

    Imogen is a real-time, node-based procedural texture generation tool aimed at artists, developers, and shader enthusiasts. It allows users to build complex material textures using a graph-based interface, combining operations like blending, noise, filters, and color correction in a non-destructive workflow. Built with Vulkan and ImGui, Imogen provides immediate visual feedback and supports GPU acceleration for high-resolution texture output. It's particularly useful in game development, VFX, and digital art where procedural workflows are valued for their flexibility and speed.
    Downloads: 1 This Week
  • 12
    ML Sharp

    Sharp Monocular View Synthesis in Less Than a Second

    ML Sharp is a research code release that turns a single 2D photograph into a photorealistic 3D representation that can be rendered from nearby viewpoints. Instead of requiring multi-view input, it predicts the parameters of a 3D Gaussian scene representation directly from one image using a single forward pass through a neural network. The core idea is speed: the 3D representation is produced in under a second on a standard GPU, and then the resulting scene can be rendered in real time to generate new views interactively. The representation is metric, meaning it supports camera movements with an absolute scale rather than only relative depth cues, which is useful for consistent viewpoint changes and downstream spatial tasks. The project is structured for reproducibility, with code and assets aimed at demonstrating view synthesis quality, sharp details, and fine structures when rendering high-resolution images.
    Downloads: 1 This Week
  • 13
    Mesh R-CNN

    code for Mesh R-CNN, ICCV 2019

    Mesh R-CNN is a 3D reconstruction and object understanding framework developed by Facebook Research that extends Mask R-CNN into the 3D domain. Built on top of Detectron2 and PyTorch3D, Mesh R-CNN enables end-to-end 3D mesh prediction directly from single RGB images. The model learns to detect, segment, and reconstruct detailed 3D mesh representations of objects in natural images, bridging the gap between 2D perception and 3D understanding. Unlike voxel-based or point-based approaches, Mesh R-CNN uses a differentiable mesh representation, allowing it to efficiently refine surface geometry while maintaining high spatial detail. The system combines 2D detection from Mask R-CNN with 3D reasoning modules that output full mesh reconstructions aligned with the input image. It has been evaluated on datasets such as Pix3D, where it demonstrates state-of-the-art performance in reconstructing real-world object geometry.
    Downloads: 1 This Week
  • 14
    Mkchromecast

    Cast macOS and Linux Audio/Video to your Google Cast and Sonos Devices

    This is a program to cast audio and video from your macOS or Linux desktop to your Google Cast devices or Sonos speakers. It is written in Python, and it streams via node.js, ffmpeg, or avconv. Mkchromecast can use lossy and lossless audio formats, provided that ffmpeg, avconv (Linux), or parec (Linux) is installed. It also supports multi-room group playback and 24-bit/96 kHz high-resolution audio. Linux users can also configure ALSA to capture audio.
    Downloads: 1 This Week
  • 15
    OSXPhotos

    Python app to work with pictures and associated metadata

    OSXPhotos provides the ability to interact with and query Apple's Photos.app library on macOS and Linux. You can query the Photos library database for attributes such as file name, file path, and metadata (keywords/tags, persons/faces, albums, etc.). You can also easily export both the original and edited photos. OSXPhotos also works with iPhoto libraries, though support is limited to exporting photos and metadata, and some features are available only for Photos. Only iPhoto 9.6.1 (the final release) has been tested. This package will read Photos databases for any supported version on any supported macOS version. For example, you can read a database created with Photos 5.0 on macOS 10.15 on a machine running macOS 10.12, and vice versa.
    Downloads: 1 This Week
  • 16
    PML

    The easiest way to use deep metric learning in your application

    This library contains 9 modules, each of which can be used independently within your existing codebase, or combined for a complete train/test workflow. To compute the loss in your training loop, pass in the embeddings computed by your model and the corresponding labels. The embeddings should have size (N, embedding_size), and the labels should have size (N), where N is the batch size. The TripletMarginLoss computes all possible triplets within the batch, based on the labels you pass into it. Anchor-positive pairs are formed by embeddings that share the same label, and anchor-negative pairs are formed by embeddings that have different labels. Loss functions can be customized using distances, reducers, and regularizers. In a typical pair-based setup, a miner finds the indices of hard pairs within a batch. These are used to index into the distance matrix, computed by the distance object, and the loss function then computes a loss per pair.
    Downloads: 1 This Week
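
    The label-based triplet formation described for PML can be illustrated in plain Python. This is a sketch of the idea only and does not use the library's API: the function names, the Euclidean distance, the margin value, and the plain mean over all triplets are assumptions made for the example.

```python
import math
from itertools import permutations


def all_triplets(labels):
    """Return every (anchor, positive, negative) index triplet in a batch."""
    triplets = []
    for anchor, positive in permutations(range(len(labels)), 2):
        if labels[anchor] != labels[positive]:
            continue  # anchor and positive must share a label
        for negative, label in enumerate(labels):
            if label != labels[anchor]:
                triplets.append((anchor, positive, negative))
    return triplets


def triplet_margin_loss(embeddings, labels, margin=0.1):
    """Mean of max(d(a, p) - d(a, n) + margin, 0) over all triplets.

    The margin and the mean reduction are illustrative assumptions.
    """
    losses = []
    for a, p, n in all_triplets(labels):
        d_ap = math.dist(embeddings[a], embeddings[p])
        d_an = math.dist(embeddings[a], embeddings[n])
        losses.append(max(d_ap - d_an + margin, 0.0))
    return sum(losses) / len(losses) if losses else 0.0


labels = [0, 0, 1]
print(all_triplets(labels))  # [(0, 1, 2), (1, 0, 2)]
```

    Enumerating every triplet grows quickly with batch size, which is one reason a miner that selects only hard pairs or triplets is useful.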
  • 17
    PyLivestream

    Pure Python FFmpeg-based live video / audio streaming to YouTube

    PyLivestream is a Python-based tool that enables real-time video streaming from various input sources to platforms such as YouTube and Twitch. It acts as a wrapper around FFmpeg, allowing users to stream video from cameras, files, or screen capture devices with minimal configuration. The tool supports cross-platform operation and integrates easily into Python workflows, making it suitable for automation and scripting. It provides options for controlling streaming parameters such as bitrate, resolution, and codecs. PyLivestream is designed for reliability, handling streaming sessions with consistent performance across different environments. It is particularly useful for developers and researchers who need programmable access to live streaming capabilities. Overall, it simplifies the process of broadcasting live video using FFmpeg.
    Downloads: 1 This Week
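
    To give a sense of what "a wrapper around FFmpeg" involves, the sketch below assembles an FFmpeg command line for pushing a file to an RTMP endpoint. It is not PyLivestream's API: the function name, default values, input file name, and ingest URL are hypothetical, and only standard FFmpeg options are used.

```python
def build_stream_command(source, ingest_url, video_bitrate="2500k",
                         audio_bitrate="128k", size="1280x720"):
    """Build an FFmpeg argument list for streaming `source` over RTMP."""
    return [
        "ffmpeg",
        "-re",                  # read the input at its native frame rate
        "-i", source,           # input file or device
        "-c:v", "libx264",      # H.264 video encoder
        "-b:v", video_bitrate,  # target video bitrate
        "-s", size,             # output resolution
        "-c:a", "aac",          # AAC audio encoder
        "-b:a", audio_bitrate,  # target audio bitrate
        "-f", "flv",            # FLV container, as used for RTMP
        ingest_url,
    ]


# Hypothetical endpoint and stream key, for illustration only.
command = build_stream_command("talk.mp4", "rtmp://example.invalid/live/STREAM-KEY")
print(" ".join(command))
```

    The list can be passed to `subprocess.run(command, check=True)` on a machine with FFmpeg installed; building it as a list rather than a single string avoids shell quoting problems.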
  • 18
    Quod Libet

    Music player and music library manager for Linux, Windows, and macOS

    Quod Libet is a cross-platform audio/music management program. It provides many ways to view your local library, and supports streaming audio and feeds (podcasts, etc). It has extremely flexible metadata editing and searching capabilities. With over 90 plugins included, you can extend and integrate with almost anything, or write your own! Ex Falso is a bare-bones tag editor with the same editing interface as Quod Libet. Quod Libet is a GTK+-based audio player written in Python, using the Mutagen tagging library. It’s designed around the idea that you know how to organize your music better than we do. It lets you make playlists based on regular expressions (don’t worry, regular searches work too). It lets you display and edit any tags you want in the file, for all the file formats it supports. Unlike some, Quod Libet will scale to libraries with tens of thousands of songs. It also supports most of the features you’d expect from a modern media player.
    Downloads: 1 This Week
  • 19
    Tilf

    Tilf (Tiny Elf) is a free, simple yet powerful pixel art editor

    Tilf (Tiny Elf) is a lightweight, cross-platform pixel art editor developed in Python with PySide6, designed for simplicity, speed, and freedom from account systems or installation overhead. It focuses on enabling artists to create sprites, icons, and small 2D assets quickly, without requiring setup, dependencies, or internet connectivity. Tilf provides a familiar drawing environment with essential tools—such as pencil, eraser, fill, eyedropper, rectangle, and ellipse—along with zoom, grid display, real-time preview, and undo/redo capabilities. It supports importing and exporting images in PNG, JPG, and BMP formats, including transparency options. With its single-executable builds for Windows, macOS, and Linux, Tilf can be run instantly and is ideal for both hobbyist pixel artists and developers needing a quick sketching tool for sprite work. The project emphasizes accessibility and minimalism over complexity, making it approachable even for users with no technical background.
    Downloads: 1 This Week
  • 20
    Verticals v3

    Automated YouTube Shorts pipeline

    Verticals v3 is an automated content generation workflow designed to create and process YouTube Shorts videos programmatically. It combines multiple tools and scripts to handle tasks such as downloading source material, editing clips, adding subtitles, and formatting output for vertical video platforms. The pipeline emphasizes automation, allowing users to produce short-form content at scale with minimal manual intervention. It integrates FFmpeg and other media processing tools to handle video transformations, resizing, and encoding. The system also supports adding overlays, captions, and audio enhancements to improve engagement. Designed for creators and developers, it enables repeatable workflows for generating social media content efficiently. Its modular structure allows customization of each stage in the pipeline, making it adaptable to different content strategies.
    Downloads: 1 This Week
  • 21
    Web RPA

    Web Robotics Process Automation Tool

    Web RPA is a browser automation framework designed to perform robotic process automation tasks directly within web environments. It enables users to automate repetitive actions such as form filling, data extraction, and workflow execution through programmable scripts. The system focuses on simplicity and flexibility, allowing automation without requiring complex infrastructure. It supports interaction with web elements, navigation flows, and dynamic content handling, making it suitable for scraping and automation scenarios. Web RPA can be integrated into larger systems or used as a standalone tool for automating browser-based operations. Its lightweight design ensures efficient execution while maintaining adaptability for different use cases. Overall, it provides a practical solution for automating web workflows and repetitive tasks.
    Downloads: 1 This Week
  • 22
    Windrecorder

    Windrecorder is a memory search app that records everything

    Windrecorder is an open-source personal memory search engine that continuously records on-screen activity in a highly optimized and storage-efficient format. It captures screen content locally and builds a searchable database using OCR and image understanding, allowing users to rewind and rediscover anything they have previously seen. The system indexes only meaningful visual changes, extracting text, browser data, and contextual information to improve search accuracy and reduce storage overhead. It includes a web-based interface where users can browse timelines, analyze activity, and perform semantic queries on recorded content. The tool emphasizes privacy by running entirely offline, ensuring that all captured data remains on the user’s device without external transmission. It also provides analytical insights such as activity summaries, word clouds, and timelines, making it useful for productivity tracking and recall.
    Downloads: 1 This Week
  • 23
    YouTube-8M

    Starter code for working with the YouTube-8M dataset

    youtube-8m is Google’s open source starter code and reference implementation for training and evaluating machine learning models on the YouTube-8M dataset, one of the largest video understanding datasets publicly released. The repository provides a complete pipeline for video-level and frame-level modeling using TensorFlow, including data reading, model training, evaluation, and inference. It was developed to support the YouTube-8M Video Understanding Challenge (hosted on Kaggle and featured at ICCV 2019), enabling researchers and practitioners to benchmark video classification models on a large-scale dataset with millions of labeled videos. The code demonstrates how to process frame-level features, train logistic and deep learning models, evaluate them using metrics like global Average Precision (gAP) and mean Average Precision (mAP), and export trained models for MediaPipe inference.
    Downloads: 1 This Week
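
    The evaluation metrics named for YouTube-8M both build on average precision (AP). The sketch below shows the textbook AP computation and a per-class mean; it is not the repository's evaluation code, the function names are hypothetical, and details such as how many top predictions per video feed into gAP are deliberately left out.

```python
def average_precision(scores, labels):
    """AP of one ranked list: mean precision at each positive hit."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_class):
    """mAP: average the AP of each class, given {class: (scores, labels)}."""
    aps = [average_precision(scores, labels) for scores, labels in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Positives sit at ranks 1 and 3, so AP = (1/1 + 2/3) / 2.
print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

    A global variant pools the predictions of all videos into one ranked list before calling `average_precision`, instead of averaging per-class values.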
  • 24
    gopro-dashboard-overlay

    Programs to process GoPro MP4 & Generic GPX/FIT files

    gopro-dashboard-overlay is a multimedia processing toolkit that generates data-rich overlays on videos using telemetry data from GoPro cameras or external GPS sources. It processes video files alongside GPX or FIT data to render dashboards displaying metrics such as speed, distance, and location directly onto footage. The system supports a wide range of layouts, including maps, gauges, and charts, which can be customized through configuration files. It integrates FFmpeg for rendering and supports multiple resolutions and camera modes such as timelapse and timewarp. The tool can also convert metadata into formats like GPX or CSV for further analysis. It is designed for both post-processing workflows and automated video generation pipelines. Overall, it enhances action footage by adding synchronized visual data overlays.
    Downloads: 1 This Week
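
    A metric such as speed can be derived from consecutive GPS fixes like those stored in GPX data. The sketch below is a generic illustration using the haversine formula, not code from gopro-dashboard-overlay: the function names and the (latitude, longitude, seconds) tuple layout are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speed_kmh(fix_a, fix_b):
    """Average speed between two (lat, lon, time_in_seconds) fixes."""
    (lat1, lon1, t1), (lat2, lon2, t2) = fix_a, fix_b
    seconds = t2 - t1
    if seconds <= 0:
        raise ValueError("fixes must be in increasing time order")
    return haversine_m(lat1, lon1, lat2, lon2) / seconds * 3.6


# One degree of longitude along the equator, covered in one hour.
print(round(speed_kmh((0.0, 0.0, 0), (0.0, 1.0, 3600)), 1))  # 111.2
```

    Real telemetry is noisy, so an overlay would usually smooth such per-fix speeds before drawing them.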
  • 25
    nunif

    Misc; latest version of waifu2x; 2D video to stereo 3D video

    nunif is a deep learning–based image processing framework focused on image upscaling, restoration, denoising, and enhancement tasks using neural network models. The project provides a collection of AI-powered utilities designed primarily for anime-style artwork, illustrations, and high-quality image restoration workflows. It includes command-line tools and graphical interfaces for applying trained neural models to improve image resolution and visual clarity while minimizing artifacts. nunif supports GPU acceleration and batch processing, making it suitable for creators, archivists, and enthusiasts handling large image collections. The framework is highly modular, allowing developers to experiment with custom models, inference pipelines, and image-processing workflows. Its emphasis on anime and illustration enhancement has made it especially popular in digital art and media preservation communities.
    Downloads: 1 This Week