Open Source Python Multimedia Software - Page 8

Python Multimedia Software

Browse free open source Python Multimedia Software and projects below.

  • 1
    GIF-Overlay

    Floating GIF Display Application

    GIF Overlay is a lightweight Windows application that displays GIF images floating on your screen, with useful, easy-to-use features. Website: https://duyxyz.github.io/GIF-Overlay/
    Downloads: 40 This Week
  • 2
    SPPAS

    SPPAS - the automatic annotation and analyses of speech

    SPPAS is a scientific software package written and maintained by Brigitte Bigi of the Laboratoire Parole et Langage in Aix-en-Provence, France. It is free and open source, and gives linguists a simple way to annotate speech automatically, analyze any kind of annotated data, and convert annotated files. SPPAS can automatically produce speech annotations from a recorded speech sound and its orthographic transcription. It helps with the analysis of any annotated data: estimating statistical distributions, making requests, managing files, and visualizing annotations. SPPAS also offers a file converter from and to a wide range of formats: xra, TextGrid, eaf, trs... <https://sppas.org>
    Downloads: 40 This Week
  • 3
    Canorus

    Music score editor

    Canorus is a free cross-platform music score editor. It supports an unlimited number and length of staves, polyphony, MIDI playback of notes, chord markings, lyrics, and import/export filters for formats such as MIDI, MusicXML, ABC Music, MusiXTeX and LilyPond.
    Downloads: 13 This Week
  • 4
    A2M — Audio to MIDI

    A2M is a desktop app that converts AUDIO TO MIDI in one click.

    A2M (Audio To MIDI) is a simple desktop tool for transcribing local audio files into MIDI files with one click. It is designed primarily for piano transcription and works best on solo piano recordings. Using A2M is straightforward: select an audio file, click Convert, and the application generates a MIDI file automatically in your Downloads/A2M folder. All processing is done locally on your device: no uploads, no accounts, and no telemetry. The app runs on CPU by default, with optional NVIDIA GPU acceleration for faster conversions. Project links: Website: justagwas.com/projects/a2m; GitHub: github.com/Justagwas/a2m; Documentation: https://github.com/Justagwas/a2m/wiki. A2M is fully open source and operates only on the files you choose. VirusTotal scan result: https://www.virustotal.com/gui/file/cc2a961baaaac2f8932c2e9ed04f0c27a55309cc03ed0825e44c8af18e263ce6
    Downloads: 38 This Week
  • 5
    YouTube Music Desktop Player

    Turns the YouTube Music site into a desktop application.

    Turns the YouTube Music site into a cross-platform desktop application for Windows and Linux using QtWebEngine.
    Downloads: 21 This Week
  • 6
    Ming is an SWF ("Flash") file format output library. It is written in C, with wrappers for C++, Python, and PHP, plus rudimentary support for Ruby and Perl.
    Downloads: 9 This Week
  • 7
    libiptcdata is a standalone C-library for reading and writing the International Press Telecommunications Council (IPTC) metadata contained in various data files such as images.
    Downloads: 19 This Week
  • 8
    rePear is a tool that allows using an Apple iPod audio player without iTunes. The user can manage the audio files on the iPod with a normal file manager and rePear takes care that they can be played on the device.
    Downloads: 33 This Week
  • 9
    itom

    itom - an Open Source Measurement, Automation and Evaluation Software

    itom is an open source software suite for operating measurement systems, laboratory automation and data evaluation. One main application of itom is the development and operation of sensor and measurement systems, for instance in a laboratory environment. The software therefore has to communicate with a wide range of hardware, such as cameras or actuators, and should provide a diverse and as complete as possible set of evaluation and data processing methods. Additionally, the rapid prototyping of modern measurement and inspection setups requires a system in which parameters or components can easily be changed at runtime, which calls for an embedded scripting language. Finally, when operating a measurement system, it is also desirable to extend the graphical user interface with system-specific dialogs and windows. The project moved to GitHub in mid-2023: https://itom-project.github.io and https://github.com/itom-project
    Downloads: 18 This Week
  • 10
    Comix is a user-friendly, customizable image viewer. It is specifically designed to handle comic books, but also serves as a generic viewer. It reads images in ZIP, RAR or tar archives (also gzip or bzip2 compressed) as well as plain image files.
    Downloads: 7 This Week
  • 11
    YouTube-DL-PyTK

    Video Downloader - Cross Platform

    YouTube-DL-PyTK (formerly known as YouTube-DL-GTK) is a graphical launcher for yt-dlp. Its purpose is simple: to facilitate the downloading of non-copyright-protected videos from certain internet websites, including YouTube. Source code is included and can be executed directly if you have Python and the proper dependencies available. I digitally sign some files in my releases. If you'd like to verify those signatures, you can find my PGP/GPG keys at: https://marcusadams.me/keys.html If you'd like to read the source code without downloading anything, you can do so on the project's GitLab at: https://gitlab.com/gerowen/youtube-dl-pytk If you'd like to donate, there are several ways to do so: PayPal: https://paypal.me/gerowen Bitcoin (BTC): bc1q86c5j7wvf6cw78tf8x3szxy5gnxg4gj8mw4sy2 Monero (XMR): 42ho3m9tJsobZwQDsFTk92ENdWAYk2zL8Qp42m7pKmfWE7jzei7Fwrs87MMXUTCVifjZZiStt3E7c5tmYa9qNxAf3MbY7rD
    Downloads: 9 This Week
  • 12
    xSTUDIO

    xSTUDIO is a high performance playback and review tool.

    xSTUDIO is a high performance playback and review tool designed by and for Visual Effects, Animation and Post Production professionals. The application can load and play large collections of media files. The efficient playback engine allows you to quickly load and play high resolution image formats with a wide range of file formats and encoding. Intuitive tools allow you to create and organise playlists and media sub-sets within playlists to build interactive review sessions, image and video reference libraries. A multi-track timeline editing interface provides the facility for loading or creating edits from simple to complex.
    Downloads: 31 This Week
  • 13
    A Python interface to the gnuplot plotting program.
    Downloads: 6 This Week
  • 14
    YehDown

    Yeahdown: Easy-to-use video downloader for Windows

    Yeahdown is a straightforward, user-friendly Windows application designed to simplify downloading videos and audio from popular websites like YouTube and Vimeo. Perfect for non-technical users, it offers an intuitive interface and fast, reliable downloads. Key features include improved download speeds, support for multiple major video platforms, and real-time updates for new features. Tested on Windows 11.
    Downloads: 29 This Week
  • 15
    MLT Multimedia Framework
    A multimedia authoring and processing framework and a video playout server for television broadcasting.
    Downloads: 5 This Week
  • 16
    SMILI

    Scientific Visualisation Made Easy

    The Simple Medical Imaging Library Interface (SMILI), pronounced 'smilie', is an open-source, lightweight and easy-to-use medical imaging viewer and library for all major operating systems. The main sMILX application supports viewing n-D images, vector images and DICOMs, anonymizing, shape analysis, and models/surfaces, all with easy drag-and-drop functions. It also features a number of standard processing algorithms for smoothing, thresholding, masking, etc. of images and models, available through graphical user interfaces and/or the command line. See our YouTube channel for tutorial videos via the homepage. The applications are all built on a uniform user-interface framework that provides a very high-level (Qt) interface to powerful image processing and scientific visualisation algorithms from the Insight Toolkit (ITK) and Visualisation Toolkit (VTK). The framework allows one to build stand-alone medical imaging applications quickly and easily.
    Downloads: 28 This Week
  • 17
    Video Frame Extractor

    Extracts semi-random frames from all MP4 videos

    This simple tool extracts frames from all MP4 videos in the same folder as the program.

    How to use:
    - Place this program in the folder containing your MP4 videos.
    - Double-click VideoFrameExtractor.exe to run it.
    - When prompted, enter the number of frames you want to extract from each video.
    - Wait for the program to finish processing all videos.
    - Find your extracted frames in the 'extracted_frames' folder.

    The frames are extracted at evenly distributed points throughout each video. For example, if you choose 3 frames, they will be taken at the 25%, 50%, and 75% marks of each video. (Source code is included with the program .zip file.)
    Downloads: 28 This Week
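The "evenly distributed points" rule described for Video Frame Extractor can be sketched in a few lines of Python. This is an illustration only, not the tool's actual source: the function names `sample_fractions` and `sample_frame_indices` are hypothetical, and the formula i / (count + 1) is an assumption chosen because it reproduces the 25%, 50%, 75% example for 3 frames.

```python
def sample_fractions(count):
    """Return `count` evenly spaced fractions strictly between 0 and 1.

    Assumed rule: the i-th sample sits at i / (count + 1), which gives
    0.25, 0.5, 0.75 when count is 3.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return [i / (count + 1) for i in range(1, count + 1)]


def sample_frame_indices(total_frames, count):
    """Map the fractions onto zero-based frame indices of one video."""
    return [int(total_frames * fraction) for fraction in sample_fractions(count)]


print(sample_fractions(3))             # [0.25, 0.5, 0.75]
print(sample_frame_indices(1200, 3))   # [300, 600, 900]
```

Because the first and last samples never land exactly on the start or end of the video, this spacing avoids black intro and outro frames, which may be why the tool's example uses it.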
  • 18
    Animated Drawings

    Code to accompany "A Method for Animating Children's Drawings"

    AnimatedDrawings is a framework that converts user sketches or line drawings into fully animated 2D motion sequences using learned motion priors. The idea is that you draw a simple static figure (stick figure, silhouette, or contour lines), and the system produces plausible skeletal motion (walking, jumping, dancing) that adheres to the drawn shape constraints. The architecture separates shape embedding (to understand user-drawn geometry) from motion embedding / generation (to produce temporally coherent movement). Users can provide rough keyframes or control constraints (pose anchors), and the system fills intermediate frames with fluid animation. The repository includes demonstration apps and notebooks where you can upload or draw shapes and watch animations play. Because the approach is data-driven, it generalizes to new drawings even with varying proportions or stylizations.
    Downloads: 1 This Week
  • 19
    AnimeGAN

    A simple PyTorch Implementation of Generative Adversarial Networks

    A simple PyTorch implementation of Generative Adversarial Networks, focusing on anime face drawing. The images are generated from a DCGAN model trained on 143,000 anime character faces for 100 epochs. Manipulating latent codes enables the transition from images in the first row to the last row. The images are not clean; some outliers can be observed, which degrade the quality of the generated images. Anime-style images of 126 tags are collected from danbooru.donmai.us using the crawler tool gallery-dl. The images are then processed by the anime face detector python-animeface. The resulting dataset contains ~143,000 anime faces. Note that some of the tags may no longer be meaningful after cropping, e.g. the cropped face images under the 'uniform' tag may not contain visible parts of uniforms.
    Downloads: 1 This Week
  • 20
    ChatterBot

    Machine learning, conversational dialog engine for creating chat bots

    ChatterBot is a Python library that makes it easy to generate automated responses to a user’s input. ChatterBot uses a selection of machine learning algorithms to produce different types of responses. This makes it easy for developers to create chat bots and automate conversations with users. For more details about the ideas and concepts behind ChatterBot see the process flow diagram. The language-independent design of ChatterBot allows it to be trained to speak any language. Additionally, the machine-learning nature of ChatterBot allows an agent instance to improve its own knowledge of possible responses as it interacts with humans and other sources of informative data. An untrained instance of ChatterBot starts off with no knowledge of how to communicate. Each time a user enters a statement, the library saves the text that they entered and the text that the statement was in response to. As ChatterBot receives more input, the number of responses that it can reply with increases.
    Downloads: 1 This Week
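The learning idea in the ChatterBot description (store each statement together with the statement it was a response to) can be illustrated with a toy sketch. This does not use or reproduce ChatterBot's API; `TinyResponder` and its methods are hypothetical names invented for this example, and the real library's storage and matching logic are more involved.

```python
from collections import defaultdict


class TinyResponder:
    """Toy model: remember which statements followed which."""

    def __init__(self):
        # Maps a statement to every statement seen in response to it.
        self.responses = defaultdict(list)
        self.previous = None

    def learn(self, statement):
        """Record `statement` as a response to the previous statement."""
        if self.previous is not None:
            self.responses[self.previous].append(statement)
        self.previous = statement

    def known_replies(self, statement):
        """Return the replies learned for `statement` (empty if untrained)."""
        return list(self.responses.get(statement, []))


bot = TinyResponder()
for line in ["Hi", "Hello", "How are you?"]:
    bot.learn(line)

print(bot.known_replies("Hi"))    # ['Hello']
print(bot.known_replies("Bye"))   # []
```

An untrained instance returns no replies at all, and each additional statement grows the set of replies it can choose from, which mirrors the behavior the description outlines.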
  • 21
    Crunch

    Insane(ly slow but wicked good) PNG image optimization

    Crunch is an image compression tool for lossy PNG image file optimization. Using a combination of selective bit depth, color palette reduction and color type, as well as zopfli DEFLATE compression algorithm encoding that employs the pngquant and zopflipng PNG optimization tools, Crunch is effectively able to optimize and compress images with minimal decrease in image quality. While it may produce file size gains larger than those produced by lossless approaches, the impact on image quality is often imperceptible, and optimized file sizes are still significantly lower than the original.
    Downloads: 1 This Week
  • 22
    Crunch PNG

    Insane(ly slow but wicked good) PNG image optimization

    Crunch is a tool for lossy PNG image file optimization. It combines selective bit depth, color type, and color palette reduction with zopfli DEFLATE compression algorithm encoding using the pngquant and zopflipng PNG optimization tools. This approach leads to a significant file size gain relative to lossless approaches at the expense of a relatively modest decrease in image quality. Continuous benchmark testing is available in our GitHub Actions CI. Please see the benchmarks directory of this repository for details about the benchmarking approach and instructions on how to execute benchmarks locally on the reference images distributed in this repository or with your own image files.
    Downloads: 1 This Week
  • 23
    DALL-E 2 - Pytorch

    Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis

    Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch. The main novelty seems to be an extra layer of indirection with the prior network (whether it is an autoregressive transformer or a diffusion network), which predicts an image embedding based on the text embedding from CLIP. Specifically, this repository will only build out the diffusion prior network, as it is the best performing variant (although it incidentally involves a causal transformer as the denoising network). Training DALLE-2 is a three-step process, with the training of CLIP being the most important. To train CLIP, you can either use the x-clip package or join the LAION discord, where a lot of replication efforts are already underway. Then, you will need to train the decoder, which learns to generate images based on the image embedding coming from the trained CLIP.
    Downloads: 1 This Week
  • 24
    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. It uses a model trained by machine learning techniques based on Baidu's Deep Speech research paper, and uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available and can be downloaded, along with other important inference material, from the DeepSpeech releases page by following the instructions in the usage docs.
    Downloads: 1 This Week
  • 25
    Emoji for Python

    emoji terminal output for Python

    Emoji for Python. This project was inspired by kyokomi. The entire set of Emoji codes as defined by the Unicode consortium is supported, in addition to a bunch of aliases. By default, only the official list is enabled, but calling emoji.emojize(language='alias') enables both the full list and aliases. By default, the language is English (language='en'), but Spanish ('es'), Portuguese ('pt'), Italian ('it'), French ('fr') and German ('de') are also supported. The utils/get-codes-from-unicode-consortium.py script may help when updating unicode_codes.py but is not guaranteed to work. Generally speaking, it scrapes a table on the Unicode Consortium's website with BeautifulSoup and prints the contents to stdout in a more useful format.
    Downloads: 1 This Week