Page 13 | video-making free download

Yukki Music Bot

Telegram Group Calls Streaming bot with some useful features

Yukki Music Bot is a powerful Telegram music and video bot written in Python using Pyrogram and Py-Tgcalls. It lets you stream songs, videos, and even live streams from various sources in your group calls.

Downloads: 4 This Week

Last Update: 2024-09-19

See Project

Apache MXNet (incubating)

A flexible and efficient library for deep learning

...It contains a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations. On top of this is a graph optimization layer, overall making MXNet highly efficient yet still portable, lightweight and scalable.

Downloads: 0 This Week

Last Update: 2023-12-13

See Project

MAE (Masked Autoencoders)

PyTorch implementation of MAE

...This forces the model to learn semantic structure and global context without supervision. The encoder processes only the visible patches, while a lightweight decoder reconstructs the full image—making pretraining computationally efficient. After pretraining, the encoder serves as a powerful backbone for downstream tasks like image classification, segmentation, and detection, achieving top performance with minimal fine-tuning. The repository provides pretrained models, fine-tuning scripts, evaluation protocols, and visualization tools for reconstruction quality and learned features.
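As a rough illustration of the "encoder processes only the visible patches" idea, the sketch below randomly splits patch indices into a visible set and a masked set. It is not taken from the MAE repository: the function name `split_patches`, the 196-patch grid, and the 0.75 mask ratio are assumptions chosen for the example.

```python
import random


def split_patches(num_patches, mask_ratio=0.75, seed=0):
    """Randomly split patch indices into (visible, masked) lists.

    Hypothetical helper for illustration only; mask_ratio=0.75 is an
    assumed default, not a value stated in the listing above.
    """
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_visible = int(num_patches * (1 - mask_ratio))
    visible = sorted(indices[:num_visible])  # fed to the encoder
    masked = sorted(indices[num_visible:])   # reconstructed by the decoder
    return visible, masked


# A 14 x 14 grid of patches (assumed size) gives 196 patches in total.
visible, masked = split_patches(num_patches=196)
print(len(visible), len(masked))  # prints "49 147"
```

Because the encoder in this sketch would see only a quarter of the patches, most of the per-image work falls on the lightweight decoder, which is the source of the efficiency the description mentions.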

Downloads: 0 This Week

Last Update: 2025-10-06

See Project

NWT - Pytorch (wip)

Implementation of NWT, audio-to-video generation, in Pytorch

Implementation of NWT, audio-to-video generation, in Pytorch. The paper proposes a new discrete latent representation named Memcodes, which can be succinctly described as a type of multi-head hard-attention to learned memory (codebook) keys/values. The authors claim that fewer codes and smaller codebook dimensions are needed to achieve better reconstructions.

Downloads: 0 This Week

Last Update: 2023-03-22

See Project

GLIDE (Text2Im)

GLIDE: a diffusion-based text-conditional image synthesis model

...It demonstrates how diffusion-based generative models can be conditioned on text to produce highly detailed and coherent visual outputs. The repository provides both model code and pretrained checkpoints, making it possible for researchers and developers to experiment with text-to-image synthesis. GLIDE includes advanced techniques such as classifier-free guidance, which improves the quality and alignment of generated images with the input text. The project also offers sampling scripts and utilities for exploring how diffusion models can be applied to multimodal tasks. ...
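Classifier-free guidance can be sketched in a few lines. The example below shows only the combination step, where the model's unconditional and text-conditional noise predictions are mixed with a guidance scale. The function name `guided_noise` and the plain-list representation are assumptions for illustration; this is not the GLIDE repository's API.

```python
def guided_noise(eps_uncond, eps_cond, guidance_scale):
    """Combine two noise predictions with classifier-free guidance.

    eps_uncond: prediction without the text prompt (list of floats)
    eps_cond:   prediction with the text prompt (list of floats)
    Returns eps_uncond + guidance_scale * (eps_cond - eps_uncond),
    element by element. Hypothetical helper, illustration only.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


eps_uncond = [0.0, 1.0]
eps_cond = [1.0, 3.0]

# A scale of 1.0 reproduces the conditional prediction; larger scales
# push the result further in the direction suggested by the text.
print(guided_noise(eps_uncond, eps_cond, 1.0))  # [1.0, 3.0]
print(guided_noise(eps_uncond, eps_cond, 3.0))  # [3.0, 7.0]
```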

Downloads: 8 This Week

Last Update: 1 day ago

See Project

Piano transcription

Task of transcribing piano recordings into MIDI files

...The system is implemented in Python (PyTorch) and is capable of accurate transcription of polyphonic piano recordings, even with complex passages and pedal techniques, making it suitable for classical piano music. By using this transcription tool, users can transform live performance audio (or recordings) into editable, machine-readable MIDI — enabling tasks such as analysis, editing, remixing, or generation of piano music. The authors used this system to build a large-scale classical piano MIDI dataset (see next project), but as a standalone tool it enables researchers, musicians, or hobbyists to transcribe their own piano recordings automatically.

Downloads: 0 This Week

Last Update: 2025-12-02

See Project

Face Mask Detection

Face Mask Detection system based on computer vision and deep learning

...Our face mask detector doesn't use any morphed masked images dataset and the model is accurate. Owing to the use of MobileNetV2 architecture, it is computationally efficient, thus making it easier to deploy the model to embedded systems (Raspberry Pi, Google Coral, etc.).

1 Review

Downloads: 0 This Week

Last Update: 2022-05-26

See Project

Ecco

Explain, analyze, and visualize NLP language models

Ecco is an interpretability tool for transformers that helps visualize and analyze how language models generate text, making model behavior more transparent.

Downloads: 0 This Week

Last Update: 2025-01-22

See Project

PaddleGAN

PaddlePaddle GAN library, including lots of interesting applications

PaddlePaddle GAN library, including lots of interesting applications like First-Order motion transfer, Wav2Lip, picture repair, image editing, photo2cartoon, image style transfer, GPEN, and so on. PaddleGAN provides developers with high-performance implementations of classic and SOTA Generative Adversarial Networks, and helps developers quickly build, train, and deploy GANs for academic, entertainment, and industrial usage. GAN-Generative Adversarial Network, was praised by "the Father...

Downloads: 1 This Week

Last Update: 2023-03-22

See Project

MuJoCo-py

mujoco-py allows using MuJoCo from Python 3

mujoco-py is a Python wrapper for MuJoCo, a high-performance physics engine widely used in robotics, reinforcement learning, and AI research. It allows developers and researchers to run detailed rigid body simulations with contacts directly from Python, making MuJoCo easier to integrate into machine learning workflows. The library is compatible with MuJoCo version 2.1 and supports Linux and macOS, while Windows support has been deprecated. It provides utilities for loading models, running simulations, and accessing simulation states in real time, along with visualization tools for rendering environments. ...

Downloads: 0 This Week

Last Update: 2025-10-03

See Project

twitchtube

Twitch YouTube bot. Automatically make video compilations

Automatically make video compilations of the most viewed Twitch clips, and upload them to YouTube using Python 3. twitchtube is currently being rewritten, which will include breaking changes. Every parameter that is not specified will default to an assigned value in config.py.

Downloads: 0 This Week

Last Update: 2023-04-19

See Project

TimeSformer

The official pytorch implementation of our paper

TimeSformer is a vision transformer architecture for video that extends the standard attention mechanism into spatiotemporal attention. The model alternates attention along spatial and temporal dimensions (or uses variants such as divided attention) so that it can capture both appearance and motion cues in video. Because the attention is global across frames, TimeSformer can reason about dependencies across long time spans, not just local neighborhoods.

Downloads: 0 This Week

Last Update: 2025-10-07

See Project

PyCls

Codebase for Image Classification Research, written in PyTorch

...Distributed training and mixed precision are first-class, enabling fast experiments on multi-GPU setups with simple, declarative configs. Model definitions are concise and modular, making it easy to prototype new blocks or swap backbones while keeping the rest of the pipeline unchanged. Pretrained weights and evaluation scripts cover common datasets, and the logging/metric stack is designed for quick comparison across runs. Practitioners use pycls both as a baseline factory and as a scaffold for new classification backbones.

Downloads: 0 This Week

Last Update: 2025-10-07

See Project

Machine Learning Collection

A resource for learning about Machine learning & Deep Learning

...I try to make the code as clear as possible, and the goal is for it to be used as a learning resource and a way to look up solutions to specific problems. For most of the code, I have also done video explanations on YouTube if you want a walkthrough.

Downloads: 0 This Week

Last Update: 2024-08-01

See Project

UniVL

Official implementation for UniVL video and language training models

UniVL is a video-language pre-training model. It is designed with four modules and five objectives for both video-language understanding and generation tasks. It is also a flexible model for most multimodal downstream tasks, balancing efficiency and effectiveness.

Downloads: 0 This Week

Last Update: 2024-07-12

See Project

Denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)

...The implementation includes data augmentation techniques applied to the raw waveforms (e.g. noise mixing, reverberation) to improve model robustness and generalization to diverse noise types. The project supports both offline denoising (batch inference) and live audio processing (e.g. via loopback audio interfaces), making it practical for real-time use in calls or recording. The codebase includes training and evaluation scripts, configuration management via Hydra, and pretrained models on standard noise datasets.

Downloads: 0 This Week

Last Update: 2025-10-07

See Project

DensePose

A real-time approach for mapping all human pixels of 2D RGB images

DensePose is a computer vision system that maps all human pixels in an RGB image to the 3D surface of a human body model. It extends human pose estimation from predicting joint keypoints to providing dense correspondences between 2D images and a canonical 3D mesh (such as the SMPL model). This enables detailed understanding of human shape, motion, and surface appearance directly from images or videos. The repository includes the DensePose network architecture, training code, pretrained...

Downloads: 6 This Week

Last Update: 2025-10-06

See Project

Objectron

A dataset of short, object-centric video clips

The Objectron dataset is a collection of short, object-centric video clips, which are accompanied by AR session metadata that includes camera poses, sparse point-clouds and characterization of the planar surfaces in the surrounding environment. In each video, the camera moves around the object, capturing it from different angles. The data also contain manually annotated 3D bounding boxes for each object, which describe the object’s position, orientation, and dimensions. ...

Downloads: 0 This Week

Last Update: 2022-02-21

See Project

Gluon CV Toolkit

...It features training scripts that reproduce SOTA results reported in the latest papers, a large set of pre-trained models, carefully designed APIs, easy-to-understand implementations, and community support. Coverage ranges from fundamental image classification, object detection, semantic segmentation, and pose estimation to instance segmentation and video action recognition. The model zoo is a one-stop shop for many of the models you might expect. GluonCV embraces a flexible development pattern while being easy to optimize and deploy without retaining a heavyweight deep learning framework.

Downloads: 0 This Week

Last Update: 2021-11-01

See Project

Deep Exemplar-based Video Colorization

The source code of CVPR 2019 paper "Deep Exemplar-based Colorization"

...To colorize your own video, you need to extract the video frames and provide a reference image as an example.

Downloads: 1 This Week

Last Update: 2023-03-23

See Project

Text2Video

Software tool that converts text to video for a more engaging experience

Text2Video is a software tool that converts text to video for a more engaging learning experience. I started this project because during this semester I was given many reading assignments and felt frustrated reading long texts. For me, learning something through reading was very time- and energy-consuming. So I imagined, "What if there was a tool that turns text into something more engaging such as a video, wouldn't it improve my learning experience?"

1 Review

Downloads: 4 This Week

Last Update: 2023-03-22

See Project

HiFi-GAN

Generative Adversarial Networks for Efficient and High Fidelity Speech

...In experiments on LJSpeech, HiFi-GAN was shown to achieve mean opinion scores close to human recordings while synthesizing 22.05 kHz audio up to ~168× faster than real time on an NVIDIA V100 GPU. A smaller configuration trades a bit of quality for even higher speed and can run more than 13× faster than real time on CPU, making it suitable for deployment scenarios without powerful GPUs.
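To make the speed figures concrete, here is a small arithmetic sketch. It treats the quoted speedups (about 168× on a V100 and 13× on CPU) as exact, which they are not, and the function names are illustrative rather than part of the HiFi-GAN code.

```python
def samples_per_second(sample_rate_hz, speedup):
    """Audio samples generated per wall-clock second at a given speedup."""
    return sample_rate_hz * speedup


def synthesis_time(audio_seconds, speedup):
    """Wall-clock seconds needed to synthesize `audio_seconds` of audio."""
    return audio_seconds / speedup


SAMPLE_RATE_HZ = 22050  # 22.05 kHz, as quoted above

# Treating the approximate GPU figure as exactly 168x real time:
print(samples_per_second(SAMPLE_RATE_HZ, 168))  # 3704400

# One minute of speech at the quoted GPU and CPU speedups (approximate):
print(round(synthesis_time(60, 168), 2), "s on GPU")
print(round(synthesis_time(60, 13), 2), "s on CPU")
```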

Downloads: 0 This Week

Last Update: 2025-11-28

See Project

Consistent Depth

We estimate dense, flicker-free, geometrically consistent depth

...This approach achieves improved geometric consistency and visual stability compared to prior monocular reconstruction methods. The project can process challenging hand-held video footage, including footage with moderate dynamic motion, making it practical for real-world usage.

Downloads: 3 This Week

Last Update: 7 days ago

See Project

VideoPose3D

Efficient 3D human pose estimation in video using 2D keypoint

VideoPose3D is a deep learning framework that reconstructs 3D human poses from 2D keypoint sequences extracted from videos. It builds on top of convolutional and temporal networks that map 2D joint coordinates over time to consistent 3D skeletons, enabling robust motion capture without specialized sensors. The model is trained on large motion capture datasets and can generalize well to unseen environments by leveraging temporal context for smoothing and error correction. By using only 2D...

Downloads: 2 This Week

Last Update: 2025-10-07

See Project

Linux-Intelligent-Ocr-Solution

Easy-OCR solution and Tesseract trainer for GNU/Linux

...The program is fully accessible to visually impaired users. A Tesseract Trainer GUI is also shipped with this package. A user guide is available on the download page.

Forum: https://groups.google.com/forum/#!forum/lios
Video Tutorial: https://www.youtube.com/playlist?list=PLn29o8rxtRe1zS1r2-yGm1DNMOZCgdU0i
Tesseract Training Tutorial (beta): https://www.youtube.com/watch?v=qLpCld4cdtk
Source Code (GitHub): https://github.com/Nalin-x-Linux/lios-3
Source Code (GitLab): https://gitlab.com/Nalin-x-Linux/lios-3

5 Reviews

Downloads: 8 This Week

Last Update: 2020-10-19

See Project

Search Results for "video-making" - Page 13

Showing 367 open source projects for "video-making"
