Open Source Python Multimedia Software - Page 9

Python Multimedia Software

Browse free open source Python Multimedia Software and projects below.

  • 1
    UniConvertor

    Universal graphics translator

    UniConvertor is a universal graphics translator. The project uses the sK1 engine to convert one format to another. It has import filters for SVG, CDR, CDT, CMX, AI, XAR, CGM, WMF, XFIG, SK, SK1, SK2, CPL, ASE, ACO, JCW, GPL, SOC, SKP, PSD, XCF, PNG, JPG, TIFF, WEBP, BMP, PCX, PPM, XBM, and XPM, and export filters for SVG, AI, CDR, CMX, PDF, SK, SK1, SK2, CGM, WMF, CPL, ASE, ACO, JCW, GPL, SOC, SKP, and PNG. This SourceForge project page is outdated. To download the latest UniConvertor binaries, please visit the official project site: https://sk1project.net/uc2/
    Downloads: 12 This Week
  • 2
    BackgroundRemover

    Background Remover lets you remove the background from images and video

    BackgroundRemover is a command-line tool for removing the background from images and video, made by nadermx to power BackgroundRemoverAI. A short blog post by the author explains why it was made.
    Downloads: 1 This Week
  • 3
    Consistent Depth

    We estimate dense, flicker-free, geometrically consistent depth

    Consistent Depth is a research project developed by Facebook Research that presents an algorithm for reconstructing dense and geometrically consistent depth information for all pixels in a monocular video. The system builds upon traditional structure-from-motion (SfM) techniques to provide geometric constraints while integrating a convolutional neural network trained for single-image depth estimation. During inference, the model fine-tunes itself to align with the geometric constraints of a specific input video, ensuring stable and realistic depth maps even in less-constrained regions. This approach achieves improved geometric consistency and visual stability compared to prior monocular reconstruction methods. The project can process challenging hand-held video footage, including those with moderate dynamic motion, making it practical for real-world usage.
    Downloads: 1 This Week
  • 4
    Google Photos Sync

    Google Photos and Albums backup with Google Photos Library API

    Google Photos Sync is a backup tool for your Google Photos cloud storage. It downloads all photos and videos the user has uploaded to Google Photos and organizes the media in the local file system using album information. Additional Google Photos 'Creations' such as animations, panoramas, movies, effects, and collages are also backed up. This software is read-only and never modifies your cloud library in any way, so there is no risk of damaging your data. Note that a number of long-standing issues with the Google Photos API mean it is not possible to make a true backup of your media.
    Downloads: 1 This Week
  • 5
    Green Recorder

    A simple screen recorder for Linux desktop

    Green Recorder is a desktop screen recording application designed for Linux systems, providing a simple interface for capturing screen activity and audio. It supports recording in multiple formats by leveraging FFmpeg and other backend tools to encode output efficiently. The application allows users to record full screens or specific areas, making it suitable for tutorials and demonstrations. It includes options for selecting audio sources and controlling frame rates to balance quality and performance. Green Recorder is designed to be lightweight and user-friendly, minimizing system overhead during recording sessions. It also supports Wayland and Xorg environments, ensuring compatibility across different Linux setups.
    Downloads: 1 This Week
  • 6
    Imogen

    GPU Texture Generator

    Imogen is a real-time, node-based procedural texture generation tool aimed at artists, developers, and shader enthusiasts. It allows users to build complex material textures using a graph-based interface, combining operations like blending, noise, filters, and color correction in a non-destructive workflow. Built with Vulkan and ImGui, Imogen provides immediate visual feedback and supports GPU acceleration for high-resolution texture output. It's particularly useful in game development, VFX, and digital art where procedural workflows are valued for their flexibility and speed.
    Downloads: 1 This Week
  • 7
    ML Sharp

    Sharp Monocular View Synthesis in Less Than a Second

    ML Sharp is a research code release that turns a single 2D photograph into a photorealistic 3D representation that can be rendered from nearby viewpoints. Instead of requiring multi-view input, it predicts the parameters of a 3D Gaussian scene representation directly from one image using a single forward pass through a neural network. The core idea is speed: the 3D representation is produced in under a second on a standard GPU, and then the resulting scene can be rendered in real time to generate new views interactively. The representation is metric, meaning it supports camera movements with an absolute scale rather than only relative depth cues, which is useful for consistent viewpoint changes and downstream spatial tasks. The project is structured for reproducibility, with code and assets aimed at demonstrating view synthesis quality, sharp details, and fine structures when rendering high-resolution images.
    Downloads: 1 This Week
  • 8
    MMDeploy

    OpenMMLab Model Deployment Framework

    MMDeploy is an open-source deep learning model deployment toolset and part of the OpenMMLab project. Models can be exported and run on several backends, with more to be supported. The modules in the SDK can be extended, such as Transform for image processing, Net for neural network inference, and Module for postprocessing. To use it, install and build your target backend; one example is ONNX Runtime, a cross-platform inference and training accelerator compatible with many popular ML/DNN frameworks. Please read getting_started for the basic usage of MMDeploy.
    Downloads: 1 This Week
  • 9
    Mesh R-CNN

    code for Mesh R-CNN, ICCV 2019

    Mesh R-CNN is a 3D reconstruction and object understanding framework developed by Facebook Research that extends Mask R-CNN into the 3D domain. Built on top of Detectron2 and PyTorch3D, Mesh R-CNN enables end-to-end 3D mesh prediction directly from single RGB images. The model learns to detect, segment, and reconstruct detailed 3D mesh representations of objects in natural images, bridging the gap between 2D perception and 3D understanding. Unlike voxel-based or point-based approaches, Mesh R-CNN uses a differentiable mesh representation, allowing it to efficiently refine surface geometry while maintaining high spatial detail. The system combines 2D detection from Mask R-CNN with 3D reasoning modules that output full mesh reconstructions aligned with the input image. It has been evaluated on datasets such as Pix3D, where it demonstrates state-of-the-art performance in reconstructing real-world object geometry.
    Downloads: 1 This Week
  • 10
    Mkchromecast

    Cast macOS and Linux Audio/Video to your Google Cast and Sonos Devices

    This is a program to cast audio and video from your macOS or Linux desktop to your Google Cast devices or Sonos speakers. It is written in Python, and it streams via node.js, ffmpeg, or avconv. Mkchromecast can use lossy and lossless audio formats provided that ffmpeg, avconv (Linux), or parec (Linux) is installed. It also supports multi-room group playback and 24-bit/96 kHz high-resolution audio. Linux users can also configure ALSA to capture audio.
    Downloads: 1 This Week
  • 11
    MystiQ

    Qt5/C++ FFmpeg Media Converter

    MystiQ is a cross-platform multimedia converter built with Qt and FFmpeg, designed to provide a modern graphical interface for video and audio processing tasks. It allows users to perform operations such as transcoding, trimming, and format conversion without needing to use command-line tools. The application supports a wide range of codecs and formats, enabling compatibility across devices and platforms. It includes batch processing capabilities, allowing multiple files to be converted simultaneously. MystiQ also provides customizable encoding parameters, giving users control over quality and performance. Its interface is designed to be intuitive while still exposing advanced features for experienced users. Overall, it combines ease of use with powerful multimedia processing capabilities.
    Downloads: 1 This Week
  • 12
    OSXPhotos

    Python app to work with pictures and associated metadata

    OSXPhotos provides the ability to interact with and query Apple's Photos.app library on macOS and Linux. You can query the Photos library database for items such as file name, file path, and metadata (keywords/tags, persons/faces, albums, etc.). You can also easily export both the original and edited photos. OSXPhotos also works with iPhoto libraries, though some features are available only for Photos. Limited support is also provided for exporting photos and metadata from iPhoto libraries. Only iPhoto 9.6.1 (the final release) has been tested. This package will read Photos databases for any supported version on any supported macOS version. For example, you can read a database created with Photos 5.0 on macOS 10.15 on a machine running macOS 10.12 and vice versa.
    Downloads: 1 This Week
  • 13
    PML

    The easiest way to use deep metric learning in your application

    This library contains 9 modules, each of which can be used independently within your existing codebase, or combined together for a complete train/test workflow. To compute the loss in your training loop, pass in the embeddings computed by your model and the corresponding labels. The embeddings should have size (N, embedding_size), and the labels should have size (N), where N is the batch size. The TripletMarginLoss computes all possible triplets within the batch, based on the labels you pass into it. Anchor-positive pairs are formed by embeddings that share the same label, and anchor-negative pairs are formed by embeddings that have different labels. Loss functions can be customized using distances, reducers, and regularizers. A miner finds the indices of hard pairs within a batch. These are used to index into the distance matrix computed by the distance object. When the loss function is pair-based, it computes a loss per pair.
    Downloads: 1 This Week
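
The triplet rule described in the PML entry can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the function names `all_triplets` and `triplet_margin_loss`, the Euclidean distance, and the margin value are all hypothetical choices made for this example.

```python
import math
from itertools import permutations


def all_triplets(labels):
    """Return every (anchor, positive, negative) index triplet in a batch.

    Anchor and positive share a label; the negative has a different label.
    """
    triplets = []
    for a, p in permutations(range(len(labels)), 2):
        if labels[a] != labels[p]:
            continue  # not an anchor-positive pair
        for n, label in enumerate(labels):
            if label != labels[a]:
                triplets.append((a, p, n))
    return triplets


def triplet_margin_loss(anchor, positive, negative, margin):
    """Textbook triplet margin loss for one triplet (Euclidean distance)."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(d_ap - d_an + margin, 0.0)


labels = [0, 0, 1, 1]  # one label per embedding, N = 4
triplets = all_triplets(labels)
print(len(triplets))  # 8

# The margin of 0.5 is an arbitrary value chosen for illustration.
print(triplet_margin_loss((0.0, 0.0), (0.0, 2.0), (1.0, 0.0), margin=0.5))  # 1.5
```

With four embeddings in two classes there are four ordered anchor-positive pairs, each with two possible negatives, hence eight triplets.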
  • 14
    Pydub

    Manipulate audio with a simple and easy high level interface

    Pydub lets you manipulate audio with a simple, high-level interface. You can pass an optional bitrate argument when exporting, using any syntax ffmpeg supports. Any further arguments supported by ffmpeg can be passed as a list in a 'parameters' argument, with switch first, argument second. Note that no validation takes place on these parameters, and you may be limited by what your particular build of ffmpeg/libav supports. You can open and save WAV files with pure Python. For opening and saving non-WAV files, like MP3, you'll need ffmpeg or libav. Any operations that combine multiple AudioSegment objects in any way will first ensure that they have the same number of channels, frame rate, sample rate, bit depth, etc.
    Downloads: 1 This Week
  • 15
    Quod Libet

    Music player and music library manager for Linux, Windows, and macOS

    Quod Libet is a cross-platform audio/music management program. It provides many ways to view your local library, and supports streaming audio and feeds (podcasts, etc). It has extremely flexible metadata editing and searching capabilities. With over 90 plugins included, you can extend and integrate with almost anything, or write your own! Ex Falso is a bare-bones tag editor with the same editing interface as Quod Libet. Quod Libet is a GTK+-based audio player written in Python, using the Mutagen tagging library. It’s designed around the idea that you know how to organize your music better than we do. It lets you make playlists based on regular expressions (don’t worry, regular searches work too). It lets you display and edit any tags you want in the file, for all the file formats it supports. Unlike some, Quod Libet will scale to libraries with tens of thousands of songs. It also supports most of the features you’d expect from a modern media player.
    Downloads: 1 This Week
  • 16
    Tilf

    Tilf (Tiny Elf) is a free, simple yet powerful pixel art editor

    Tilf (Tiny Elf) is a lightweight, cross-platform pixel art editor developed in Python with PySide6, designed for simplicity, speed, and freedom from account systems or installation overhead. It focuses on enabling artists to create sprites, icons, and small 2D assets quickly, without requiring setup, dependencies, or internet connectivity. Tilf provides a familiar drawing environment with essential tools—such as pencil, eraser, fill, eyedropper, rectangle, and ellipse—along with zoom, grid display, real-time preview, and undo/redo capabilities. It supports importing and exporting images in PNG, JPG, and BMP formats, including transparency options. With its single-executable builds for Windows, macOS, and Linux, Tilf can be run instantly and is ideal for both hobbyist pixel artists and developers needing a quick sketching tool for sprite work. The project emphasizes accessibility and minimalism over complexity, making it approachable even for users with no technical background.
    Downloads: 1 This Week
  • 17
    Verticals v3

    Automated YouTube Shorts pipeline

    Verticals v3 is an automated content generation workflow designed to create and process YouTube Shorts videos programmatically. It combines multiple tools and scripts to handle tasks such as downloading source material, editing clips, adding subtitles, and formatting output for vertical video platforms. The pipeline emphasizes automation, allowing users to produce short-form content at scale with minimal manual intervention. It integrates FFmpeg and other media processing tools to handle video transformations, resizing, and encoding. The system also supports adding overlays, captions, and audio enhancements to improve engagement. Designed for creators and developers, it enables repeatable workflows for generating social media content efficiently. Its modular structure allows customization of each stage in the pipeline, making it adaptable to different content strategies.
    Downloads: 1 This Week
  • 18
    Windrecorder

    Windrecorder is a memory search app that records everything

    Windrecorder is an open-source personal memory search engine that continuously records on-screen activity in a highly optimized and storage-efficient format. It captures screen content locally and builds a searchable database using OCR and image understanding, allowing users to rewind and rediscover anything they have previously seen. The system indexes only meaningful visual changes, extracting text, browser data, and contextual information to improve search accuracy and reduce storage overhead. It includes a web-based interface where users can browse timelines, analyze activity, and perform semantic queries on recorded content. The tool emphasizes privacy by running entirely offline, ensuring that all captured data remains on the user’s device without external transmission. It also provides analytical insights such as activity summaries, word clouds, and timelines, making it useful for productivity tracking and recall.
    Downloads: 1 This Week
  • 19
    asciinema

    Open source terminal session recorder

    asciinema is a free and open source terminal session recorder. It lets you easily record and play back terminal sessions in the terminal or in a web browser. Forget old screen recording methods and resulting blurry videos. asciinema lets you record your terminal sessions the right way, which is right where you work, in the terminal. Recording is as easy as running one command, and since it’s purely text-based you can copy and paste any content you want, simply pause the recording! You can also easily share your recordings on the web, embed an asciicast player in your blog post, project documentation page or in your conference talk slides. See plenty of example sessions recorded with asciinema here: https://asciinema.org/
    Downloads: 1 This Week
  • 20
    nunif

    Misc; latest version of waifu2x; 2D video to stereo 3D video

    nunif is a deep learning–based image processing framework focused on image upscaling, restoration, denoising, and enhancement tasks using neural network models. The project provides a collection of AI-powered utilities designed primarily for anime-style artwork, illustrations, and high-quality image restoration workflows. It includes command-line tools and graphical interfaces for applying trained neural models to improve image resolution and visual clarity while minimizing artifacts. nunif supports GPU acceleration and batch processing, making it suitable for creators, archivists, and enthusiasts handling large image collections. The framework is highly modular, allowing developers to experiment with custom models, inference pipelines, and image-processing workflows. Its emphasis on anime and illustration enhancement has made it especially popular in digital art and media preservation communities.
    Downloads: 1 This Week
  • 21
    pybaselines

    Library of algorithms for baseline correction of experimental data

    pybaselines is a Python library that provides many different algorithms for performing baseline correction on data from experimental techniques such as Raman, FTIR, NMR, XRD, XRF, PIXE, etc. The aim of the project is to provide a semi-unified API to allow quick testing and comparing multiple baseline correction algorithms to find the best one for a set of data. pybaselines has 50+ baseline correction algorithms. These include popular algorithms, such as AsLS, airPLS, ModPoly, and SNIP, as well as many lesser-known algorithms. Most algorithms are adapted directly from literature, although there are a few that are unique to pybaselines, such as penalized spline versions of Whittaker-smoothing-based algorithms. The full list of implemented algorithms can be found in the documentation.
    Downloads: 1 This Week
  • 22
    pyntcloud

    pyntcloud is a Python library for working with 3D point clouds

    This page will introduce the general concept of point clouds and illustrate the capabilities of pyntcloud as a point cloud processing tool. Point clouds are one of the most relevant entities for representing three-dimensional data these days, along with polygonal meshes (which are just a special case of point clouds with a connectivity graph attached). In its simplest form, a point cloud is a set of points in a Cartesian coordinate system. Accurate 3D point clouds can nowadays be (easily and cheaply) acquired from different sources. pyntcloud enables simple and interactive exploration of point cloud data, regardless of which sensor was used to generate it or what the use case is. Although it was built for use in Jupyter Notebooks, the library is suitable for other kinds of uses. pyntcloud is composed of several modules (as independent as possible) that encompass common point cloud processing operations.
    Downloads: 1 This Week
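
To make the "set of points in a Cartesian coordinate system" idea from the pyntcloud entry concrete, here is a minimal sketch in plain Python. It does not use the pyntcloud API; the sample coordinates and the helper names `centroid` and `bounding_box` are hypothetical and chosen only for illustration.

```python
# A point cloud in its simplest form: a collection of (x, y, z) coordinates.
points = [
    (0.0, 0.0, 0.0),
    (1.0, 2.0, 0.5),
    (2.0, 1.0, 1.5),
]


def centroid(points):
    """Return the mean position of all points as an (x, y, z) tuple."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def bounding_box(points):
    """Return the (min corner, max corner) of the axis-aligned bounding box."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


print(centroid(points))
print(bounding_box(points))  # ((0.0, 0.0, 0.0), (2.0, 2.0, 1.5))
```

A mesh would add a connectivity structure (for example, triples of point indices forming triangles) on top of the same list of points.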
  • 23
    yami

    An open-source music player with simple UI

    Yami is a lightweight, open-source music player built in Python. It focuses on simplicity and ease of use, providing an intuitive user interface (UI) for users to manage and play their music. Whether you're playing local files or downloading from online sources using spotdl, Yami offers a seamless experience. This project is designed for users who want a minimalistic, cross-platform music player with the ability to integrate external sources like Spotify/YouTube Music.
    Downloads: 1 This Week
  • 24
    easycap-app

    Capture your screen with unprecedented ease and quality.

    Welcome to EasyCap, your ultimate desktop screen recorder and screenshot editor. Designed with simplicity and power in mind, EasyCap is perfect for professionals, creators, and anyone looking to capture their PC activities with ease. Whether you're creating tutorials, recording gameplay, or capturing important moments, EasyCap makes it effortless.
    Downloads: 14 This Week
  • 25
    Video Frame Extractor

    Extracts semi-random frames from all MP4 videos

    This simple tool extracts frames from all MP4 videos in the same folder as this program.

    How to use:
    - Place this program in the folder containing your MP4 videos.
    - Double-click on VideoFrameExtractor.exe to run it.
    - When prompted, enter the number of frames you want to extract from each video.
    - Wait for the program to finish processing all videos.
    - Find your extracted frames in the 'extracted_frames' folder.

    The frames are extracted at evenly distributed points throughout each video. For example, if you choose 3 frames, they will be taken at the 25%, 50%, and 75% marks of each video. (Source code is included with the program .zip file.)
    Downloads: 24 This Week
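
The even spacing described for Video Frame Extractor can be sketched as follows. The tool's actual formula is not shown in the listing (and its tagline says "semi-random"), so the `i / (n + 1)` spacing here is an assumption that merely reproduces the 25%, 50%, 75% example; the function names are hypothetical.

```python
def frame_positions(num_frames):
    """Return evenly spaced fractional positions strictly between 0 and 1."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return [i / (num_frames + 1) for i in range(1, num_frames + 1)]


def frame_indices(total_frames, num_frames):
    """Map the fractional positions onto frame numbers of one video."""
    return [int(total_frames * pos) for pos in frame_positions(num_frames)]


print(frame_positions(3))      # [0.25, 0.5, 0.75]
print(frame_indices(1000, 3))  # [250, 500, 750]
```

Dividing by `num_frames + 1` rather than `num_frames` keeps every sample away from the first and last frame of the video, which matches the example in the description.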