Showing 365 open source projects for "video-making"

  • 1
    Oasis

    Inference script for Oasis 500M

    ...Instead of rendering a pre-built game world, the system produces the next visual state via a diffusion-transformer approach, effectively “imagining” the world’s response to your actions in real time. The project focuses on enabling action-conditional frame generation so developers can experiment with interactive, model-generated environments rather than static video generation alone. Because it’s an inference-focused repository, it’s especially useful as a practical reference for running the model, wiring inputs, and producing the autoregressive sequence of gameplay frames. It also serves as a research sandbox for people exploring how far interactive generative models can go with smaller, more accessible checkpoints compared to massive internal systems.
    Downloads: 0 This Week
  • 2
    Fast3R

    Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

    ...It represents a next-generation feedforward 3D reconstruction model capable of producing dense point clouds and camera poses for hundreds to thousands of images or video frames in a single inference pass—eliminating the need for slow, iterative structure-from-motion pipelines. Built on PyTorch Lightning and extending concepts from DUSt3R and Spann3r, Fast3R unifies multi-view geometry, depth estimation, and camera registration within a single transformer-based architecture. It outputs high-quality 3D scene representations from unordered or sequential views, scaling to large datasets and varied camera intrinsics. ...
    Downloads: 0 This Week
  • 3
    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T N1.5 is the world's first open foundation model

    NVIDIA Isaac‑GR00T N1.5 is an open-source foundation model engineered for generalized humanoid robot reasoning and manipulation skills. It accepts multimodal inputs—such as language and images—and uses a diffusion transformer architecture built upon vision-language encoders, enabling adaptive robot behaviors across diverse environments. It is designed to be customizable via post-training with real or synthetic data. The vision-language model remains frozen during both pretraining and...
    Downloads: 0 This Week
  • 4
    PaSa

    An advanced paper search agent powered by large language models

    PaSa is an open-source “paper search agent” built around large language models (LLMs), designed to automate the process of academic literature retrieval with human-like decision making. Instead of simply translating a query into keywords and returning a flat list of matching papers, PaSa uses a dual-agent architecture (Crawler + Selector) that can iteratively search, read, analyze, and filter academic publications — simulating how a researcher might dig through citation networks, expand references, and evaluate relevance based on both metadata and content. ...
    Downloads: 0 This Week
  • 5
    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    ...The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. M1 is further trained with large-scale reinforcement learning over diverse tasks.
    Downloads: 0 This Week
  • 6
    Agent Payments Protocol (AP2)

    Building a Secure and Interoperable Future for AI-Driven Payments

    AP2 is a project released by Google’s “Agentic Commerce” initiative, focusing on a protocol and reference implementation for agent-driven or AI-mediated payments. In effect, AP2 aims to define a secure, interoperable protocol that allows software agents to act on behalf of users (making payments or shopping decisions autonomously) while preserving necessary security, auditability, and trust. The repository contains sample scenarios (in Python, Android, etc.) that illustrate how agents, servers, and payment flows would work under the protocol. It includes “types” definitions (the core message and object schema) and example agent implementations to demonstrate the mechanics of agent-to-agent and agent-to-server interactions. ...
    Downloads: 0 This Week
  • 7
    BWR Ai watermark remover

    AI-powered tool to quickly remove watermarks from videos flawlessly

    ...Its intuitive interface features white and blue design elements for easy navigation, making it ideal for content creators, video editors, social media managers, and marketers. Blue Wave Remover enhances video visuals by removing unwanted logos and overlays, ensuring professional, clean footage for repurposing, presentations, and online sharing. Key functions include automatic watermark detection, AI-powered inpainting, background reconstruction, and seamless integration into existing workflows. ...
    Downloads: 2 This Week
  • 8
    towhee

    Framework that is dedicated to making neural data processing

    ...From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding, to model inference, making your pipeline execution 10x faster. Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.
    Downloads: 0 This Week
  • 9
    Sonnet

    TensorFlow-based neural network library

    ...These modules can hold references to parameters, other modules and methods that apply some function on the user input. There are a number of predefined modules that already ship with Sonnet, making it quite powerful and yet simple at the same time. Users are also encouraged to build their own modules. Sonnet is designed to be extremely unopinionated about your use of modules. It is simple to understand, and offers clear and focused code.
    Downloads: 0 This Week
  • 10
    OpenCV

    Open Source Computer Vision Library

    The Open Source Computer Vision Library has >2500 algorithms, extensive documentation and sample code for real-time computer vision. It works on Windows, Linux, Mac OS X, Android, and iOS, and in your browser through JavaScript. Languages: C++, Python, Julia, JavaScript. Homepage: https://opencv.org Q&A forum: https://forum.opencv.org/ Documentation: https://docs.opencv.org Source code: https://github.com/opencv Please pay special attention to our tutorials!...
    Downloads: 3,164 This Week
  • 11
    EmotiVoice

    Multi-Voice and Prompt-Controlled TTS Engine

    EmotiVoice is a multi-voice, prompt-controlled text-to-speech engine designed to generate highly expressive speech across thousands of voices. It supports both English and Chinese and ships with over 2,000 preset voices, making it suitable for everything from characters and virtual anchors to narration and dialogue. The core idea is prompt-based emotional and style control: you can ask the engine to speak “happy,” “sad,” “excited,” or with other high-level style prompts that shape prosody, pitch, speed, and energy. EmotiVoice provides multiple ways to interact with it, including a web interface, a Docker image, an HTTP API (including an OpenAI-compatible TTS API), and Python scripts for batch synthesis. ...
    Downloads: 8 This Week
  • 12
    YYeTsBot

    Renren Film and Television bot, fully connected to Renren resources

    ...You can directly send the name of the episode you want to watch, and you can choose to share the webpage or link (ed2k and magnet links). Resources are searched according to my predetermined priority (everyone video offline, subtitle man); of course, you can also use commands to force a specific subtitle group. Because translated titles differ, it is recommended to enter part of the title and then select from the list. For example, if you want to watch the fourth season of Game of Thrones, just search for "Game of Thrones". Want to keep a resource for yourself, but don't know how to program? ...
    Downloads: 0 This Week
  • 13
    VideoCrafter2

    Overcoming Data Limitations for High-Quality Video Diffusion Models

    VideoCrafter is an open-source video generation and editing toolbox designed to create high-quality video content. It features models for both text-to-video and image-to-video generation. The system is optimized for generating videos from textual descriptions or still images, leveraging advanced diffusion models. VideoCrafter2, an upgraded version, improves on its predecessor by enhancing motion dynamics and concept combinations, especially in low-data scenarios. ...
    Downloads: 8 This Week
  • 14
    Errbot

    Chatbot daemon that connects to your favorite chat services

    ...The goal of the project is to make it easy for you to write your own plugins so you can make it do whatever you want: run a deployment, retrieve some information online, trigger a tool via an API, troll a co-worker, etc. Errbot is used in many different contexts: chatops (tools for devops), online gaming chatrooms like EVE, video streaming chatrooms like livecoding.tv, home security, etc. Extending Errbot and adding your own commands can be done by creating a plugin, which is simply a class derived from BotPlugin. The docstrings will be automatically reused by the !help command. We aim to give you all the tools you need to build a customized bot safely, without having to worry about basic functionality. ...
    Downloads: 0 This Week
  • 15
    HunyuanVideo-I2V

    A Customizable Image-to-Video Model based on HunyuanVideo

    HunyuanVideo-I2V is a customizable image-to-video generation framework developed by Tencent, extending the capabilities of HunyuanVideo. It allows for high-quality video creation from still images, using PyTorch and providing pre-trained model weights, inference code, and customizable training options. The system includes LoRA training code for adding special effects and enhancing video realism, aiming to offer versatile and scalable solutions for generating videos from static image inputs.
    Downloads: 9 This Week
  • 16
    Warlock-Studio

    Suite with Real-ESRGAN, BSRGAN, RealESRNet, IRCNN, GFPGAN & RIFE.

    v5.1.1. Warlock-Studio is a Windows application that uses Real-ESRGAN, BSRGAN, IRCNN, GFPGAN, RealESRNet, RealESRAnime and RIFE artificial intelligence models to upscale, restore faces, interpolate frames and reduce noise in images and videos. The application supports GPU acceleration (including multi-GPU setups) and offers batch processing for large workloads. It includes drag-and-drop handling for single or multiple files, optional pre-resize functions, and an automatic tiling system...
    Downloads: 23 This Week
  • 17
    MMEditing

    MMEditing is a low-level vision toolbox based on PyTorch

    MMEditing is an open-source, low-level vision toolbox based on PyTorch, supporting super-resolution, inpainting, matting, video interpolation, etc. We decompose the editing framework into different components, and one can easily construct a customized editor framework by combining different modules. The toolbox directly supports popular and contemporary inpainting, matting, super-resolution and generation tasks, and provides state-of-the-art methods for each. ...
    Downloads: 0 This Week
  • 18
    XAgent

    An Autonomous LLM Agent for Complex Task Solving

    XAgent is an AI-driven autonomous agent framework capable of handling multi-step tasks across different domains. It enables AI agents to perform decision-making, task planning, and self-learning based on user-defined objectives, making it ideal for automation and research applications.
    Downloads: 0 This Week
  • 19
    Stable Diffusion

    High-Resolution Image Synthesis with Latent Diffusion Models

    ...The Stable Diffusion project, developed by Stability AI, is a cutting-edge image synthesis model that utilizes latent diffusion techniques for high-resolution image generation. It offers an advanced method of generating images based on text input, making it highly flexible for various creative applications. The repository contains pretrained models, various checkpoints, and tools to facilitate image generation tasks, such as fine-tuning and modifying the models. Stability AI's approach to image synthesis has contributed to creating detailed, scalable images while maintaining efficiency.
    Downloads: 268 This Week
  • 20
    OpenFieldAI - AI Open Field Test Tracker

    OpenFieldAI is an AI-based Open Field Test Rodent Tracker

    ...The software generates a centroid graph, heat map, line path, and a spreadsheet containing all calculated parameters (speed, time in and out of the ROI, distance, and entries/exits) for single or multiple pre-recorded videos or live webcam video. The ROI is assigned automatically for multiple-video input and can be set manually for single-video input. For queries or bug reports, contact: kabeermuzammil614@gmail.com. Available on Windows. Software authorship: Muzammil Kabier and Shamili Mariya Varghese (sole authors).
    Downloads: 14 This Week
  • 21

    SoundTranscriber

    SoundTranscriber can be used to generate automatic transcription / aut
    Downloads: 9 This Week
  • 22
    Conscious Artificial Intelligence

    It's possible for machines to become self-aware.

    ...This project has 2 subprojects: the Object Pascal-based CAI NEURAL API (https://github.com/joaopauloschuler/neural-api) and the Python-based K-CAI NEURAL API (https://github.com/joaopauloschuler/k-neural-api). A video of the first prototype has been made: http://www.youtube.com/watch?v=qH-IQgYy9zg The video shows a Popperian agent collecting ore from 3 mining sites and bringing it to the base. When the agent is born, it doesn't know how to walk, nor does it know that it feels pleasure from mining. It has only the sense of touch (blind agent). The video shows learning, planning, executing and plan optimization.
    Downloads: 6 This Week
  • 23
    VALL-E X

    Open source implementation of Microsoft's VALL-E X zero-shot TTS model

    ...VALL-E-X supports zero-shot cross-lingual synthesis, meaning a monolingual speaker’s voice can be used to speak other languages without additional training. It also preserves aspects of the acoustic environment, such as background noise or reverb, making the generated audio feel more like it came from the same setting as the prompt. The repository includes Python APIs, sample scripts, ready-to-use voice presets, and demos hosted on Hugging Face Spaces and Google Colab so users can try it.
    Downloads: 3 This Week
  • 24

    Eng2BN CSV Translator

    Translate English to Bangla in CSV files, by row range.

    Eng2BN CSV Translator is a user-friendly Python tool that enables efficient translation of English text to Bangla within CSV files. The application supports large datasets and allows users to translate specific row ranges, making it ideal for batch processing.
    Downloads: 0 This Week
  • 25
    DiffRhythm

    Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation

    ...Focused on music creation, it combines advanced AI techniques to produce coherent and creative audio compositions. The model utilizes a latent diffusion architecture, making it capable of producing high-quality, long-form music. It can be accessed on Hugging Face, where users can interact with a demo or download the model for further use. DiffRhythm offers tools for both training and inference, and its flexibility makes it ideal for AI-based music production and research in music generation.
    Downloads: 9 This Week