Showing 1237 open source projects for "video-making"

  • 1
    verl

    Volcano Engine Reinforcement Learning for LLMs

    VERL is a reinforcement-learning–oriented toolkit designed to train and align modern AI systems, from language models to decision-making agents. It brings together supervised fine-tuning, preference modeling, and online RL into one coherent training stack so teams can move from raw data to aligned policies with minimal glue code. The library focuses on scalability and efficiency, offering distributed training loops, mixed precision, and replay/buffering utilities that keep accelerators busy. ...
    Downloads: 0 This Week
  • 2
    Neuroglancer

    WebGL-based viewer for volumetric data

    ...It allows users to interactively view arbitrary 2D and 3D cross-sections of volumetric data alongside 3D meshes and skeleton models, enabling precise examination of neural structures and biological imaging results. Its multi-pane interface synchronizes multiple orthogonal views with a central 3D viewport, making it ideal for analyzing complex brain imaging data such as connectomics datasets. Neuroglancer operates entirely client-side, fetching data over HTTP in a variety of supported formats including Neuroglancer precomputed, N5, Zarr, and NIfTI, among others. The viewer is built with a multi-threaded architecture, separating rendering and data processing to ensure smooth performance even with massive datasets. ...
    Downloads: 0 This Week
  • 3
    YAPF

    A formatter for Python files

    ...Instead of relying on a fixed set of heuristics, it explores formatting decisions and chooses the lowest-cost result, aiming to produce code a human would write when following a style guide. You can run it as a command-line tool or call it as a library via FormatCode / FormatFile, making it easy to embed in editors, CI, and custom tooling. Styles are highly configurable: start from presets like pep8, google, yapf, or facebook, then override dozens of options in .style.yapf, setup.cfg, or pyproject.toml. It supports recursive directory formatting, line-range formatting, and diff-only output so you can check or fix just the lines you touched.
    Downloads: 0 This Week
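The "choose the lowest-cost result" idea in the YAPF description can be illustrated with a toy sketch. This is not YAPF's actual algorithm or API: the function names `layout_cost` and `choose_layout` and the penalty values are hypothetical, picked only to show how competing layouts of one statement might be scored and compared.

```python
def layout_cost(lines, column_limit=79, split_penalty=10, overflow_penalty=100):
    """Toy cost: a fixed penalty per line split plus a penalty per overflowing character.

    The penalty values are illustrative assumptions, not YAPF's real knobs.
    """
    splits = len(lines) - 1
    overflow = sum(max(0, len(line) - column_limit) for line in lines)
    return splits * split_penalty + overflow * overflow_penalty


def choose_layout(candidates, column_limit=79):
    """Return the candidate layout (a list of lines) with the lowest toy cost."""
    return min(candidates, key=lambda lines: layout_cost(lines, column_limit))


# Two candidate layouts of the same statement.
one_line = ["result = compute(alpha, beta, gamma)"]
split = ["result = compute(", "    alpha, beta, gamma)"]

wide = choose_layout([one_line, split], column_limit=79)    # fits, so no split is needed
narrow = choose_layout([one_line, split], column_limit=30)  # overflow outweighs the split
```

With a generous column limit the unsplit line wins because it costs nothing; with a tight limit the overflow penalty makes the split layout cheaper.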
  • 4
    Large Concept Model

    Language modeling in a sentence representation space

    ...Probing tools help diagnose what the model knows—e.g., attribute recognition, relation understanding, or compositionality—so you can iterate on data and objectives. The design is modular, making it straightforward to swap backbones, change objectives, or integrate retrieval components.
    Downloads: 0 This Week
  • 5
    airda

    airda (Air Data Agent)

    airda (Air Data Agent) is a multi-agent system for data analysis. It can understand data development and analysis requirements, interpret data, and generate SQL and Python code for tasks such as data-oriented queries, data visualization, and machine learning.
    Downloads: 0 This Week
  • 6
    Anymail

    Django email backends and webhooks for Amazon SES, Mailgun, Mailjet

    Anymail lets you send and receive email in Django using your choice of transactional email service providers (ESPs). It extends the standard django.core.mail with many common ESP-added features, providing a consistent API that avoids locking your code to one specific ESP and makes it easier to change ESPs later if needed. It integrates each ESP’s sending APIs into Django’s built-in email package, including support for HTML, attachments, extra headers, and other standard email features. It also adds extensions that expose common ESP-added functionality, like tags, metadata, and tracking, with code that’s portable between ESPs. ...
    Downloads: 0 This Week
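As a rough illustration of how the Anymail description maps onto a Django project, here is a minimal settings fragment, assuming Mailgun as the chosen ESP. The sender domain, the from address, and the `MAILGUN_API_KEY` environment variable are placeholder assumptions; check the Anymail documentation for the settings your ESP actually requires.

```python
import os

# Fragment of a Django settings.py (a sketch, not a complete settings module).
INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "anymail",  # registers Anymail with Django
]

# Route django.core.mail through Anymail's Mailgun backend.
EMAIL_BACKEND = "anymail.backends.mailgun.EmailBackend"

ANYMAIL = {
    # Placeholder: read the key from the environment rather than hard-coding it.
    "MAILGUN_API_KEY": os.environ.get("MAILGUN_API_KEY", ""),
    # Placeholder domain, assumed for illustration.
    "MAILGUN_SENDER_DOMAIN": "mg.example.com",
}

DEFAULT_FROM_EMAIL = "noreply@example.com"  # placeholder address
```

Switching ESPs later should mostly come down to changing `EMAIL_BACKEND` and the keys inside `ANYMAIL`, which is the portability the description refers to.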
  • 7
    Audiomentations

    A Python library for audio data augmentation

    ...Can be integrated into training pipelines in, e.g., TensorFlow/Keras or PyTorch. It has helped people get world-class results in Kaggle competitions and is used by companies making next-generation audio products. One transform mixes in another sound, e.g. background noise, which is useful if your original sound is clean and you want to simulate an environment where background noise is present. A folder of background noise sounds to be mixed in must be specified, and these sounds should ideally be at least as long as the input sounds to be transformed. ...
    Downloads: 0 This Week
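The background-noise mixing described above can be sketched in plain Python. This is not the Audiomentations implementation or API: the helper names `rms` and `mix_background` are hypothetical, and the sketch assumes mono audio held as lists of floats and a target signal-to-noise ratio given in decibels.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_background(signal, noise, snr_db):
    """Mix `noise` into `signal` so the result has roughly the requested SNR.

    Simplification for this sketch: noise shorter than the signal is rejected,
    whereas a real library might loop or pad it instead.
    """
    if len(noise) < len(signal):
        raise ValueError("noise must be at least as long as the signal")
    noise = noise[: len(signal)]
    noise_rms = rms(noise)
    if noise_rms == 0:
        return list(signal)  # silent noise: nothing to mix in
    # SNR in dB is 20 * log10(signal_rms / noise_rms), solved for the noise level.
    target_noise_rms = rms(signal) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(signal, noise)]


clean = [1.0, -1.0, 1.0, -1.0]
background = [0.5, 0.5, -0.5, -0.5, 0.5]
noisy = mix_background(clean, background, snr_db=20)
```

The length check mirrors the advice that background sounds should be at least as long as the input being transformed.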
  • 8
    SageMaker Training Toolkit

    Train machine learning models within Docker containers

    ...A container provides an effectively isolated environment, ensuring a consistent runtime and reliable training process. The SageMaker Training Toolkit can be added to any Docker container to make it compatible with SageMaker for training models. If you use a prebuilt SageMaker Docker image for training, this library may already be included. To use it, write a training script (e.g., train.py), then define a container with a Dockerfile that includes the training script and any dependencies.
    Downloads: 0 This Week
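A training script of the kind mentioned above might start like the sketch below. It assumes the container exposes the `SM_MODEL_DIR` and `SM_CHANNEL_TRAINING` environment variables (the latter presumes an input channel named "training"). The `--epochs` argument, the fallback paths, and the `train` placeholder are illustrative assumptions, not part of the toolkit.

```python
import argparse
import os


def parse_args(argv=None):
    """Read hyperparameters from the command line and paths from the environment."""
    parser = argparse.ArgumentParser(description="Hypothetical train.py sketch")
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument(
        "--model-dir",
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"),
    )
    parser.add_argument(
        "--train",
        default=os.environ.get("SM_CHANNEL_TRAINING", "/opt/ml/input/data/training"),
    )
    return parser.parse_args(argv)


def train(args):
    """Placeholder for the real training loop; returns a summary of what would run."""
    return {"epochs": args.epochs, "model_dir": args.model_dir, "train_dir": args.train}


if __name__ == "__main__":
    print(train(parse_args()))
```

Keeping the paths as arguments with environment-based defaults lets the same script run locally with explicit paths and inside the container without changes.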
  • 9
    Oasis

    Inference script for Oasis 500M

    ...Instead of rendering a pre-built game world, the system produces the next visual state via a diffusion-transformer approach, effectively “imagining” the world response to your actions in real time. The project focuses on enabling action-conditional frame generation so developers can experiment with interactive, model-generated environments rather than static video generation alone. Because it’s an inference-focused repository, it’s especially useful as a practical reference for running the model, wiring inputs, and producing the autoregressive sequence of gameplay frames. It also serves as a research sandbox for people exploring how far interactive generative models can go with smaller, more accessible checkpoints compared to massive internal systems.
    Downloads: 0 This Week
  • 10
    Fast3R

    Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

    ...It represents a next-generation feedforward 3D reconstruction model capable of producing dense point clouds and camera poses for hundreds to thousands of images or video frames in a single inference pass—eliminating the need for slow, iterative structure-from-motion pipelines. Built on PyTorch Lightning and extending concepts from DUSt3R and Spann3r, Fast3R unifies multi-view geometry, depth estimation, and camera registration within a single transformer-based architecture. It outputs high-quality 3D scene representations from unordered or sequential views, scaling to large datasets and varied camera intrinsics. ...
    Downloads: 0 This Week
  • 11
    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T N1.5 is the world's first open foundation model

    NVIDIA Isaac‑GR00T N1.5 is an open-source foundation model engineered for generalized humanoid robot reasoning and manipulation skills. It accepts multimodal inputs—such as language and images—and uses a diffusion transformer architecture built upon vision-language encoders, enabling adaptive robot behaviors across diverse environments. It is designed to be customizable via post-training with real or synthetic data. The vision-language model remains frozen during both pretraining and...
    Downloads: 0 This Week
  • 12
    Bailing

    Bailing is a voice dialogue robot similar to GPT-4o

    ...The project is modular: each core function — ASR, VAD, LLM, TTS — exists as a separately replaceable component, which allows flexibility in picking your preferred models depending on resources or languages. It aims to be light enough to run without a GPU, making it usable on modest hardware or edge devices, while still maintaining low latency and smooth interaction. Bailing includes a memory system, giving the assistant the ability to remember user preferences and context across sessions, which enables more personalized and context-aware conversations.
    Downloads: 0 This Week
  • 13
    OuteTTS

    Interface for OuteTTS models

    OuteTTS is an interface library for running OuteTTS text-to-speech models across a range of backends, making it easier to deploy the same model on different hardware and runtimes. It provides a high-level Interface API that wraps model configuration, speaker handling, and audio generation so you can focus on integrating speech into your application rather than wiring up low-level engines. The project supports multiple backends, including llama.cpp (Python bindings and server), Hugging Face Transformers, ExLlamaV2, vLLM, and a JavaScript interface via Transformers.js, allowing it to run on CPUs, NVIDIA CUDA GPUs, AMD ROCm, Vulkan-capable GPUs, and Apple Metal. ...
    Downloads: 0 This Week
  • 14
    shot-scraper

    A command-line utility for taking automated screenshots of websites

    shot-scraper is a command-line utility for taking automated screenshots of web pages using a headless browser engine. After installation, a single command can capture a full-page screenshot of a URL and save it to a file, making it ideal for documentation, monitoring, and visual regression tasks. Under the hood it uses a modern browser (installed via a one-time shot-scraper install step) and exposes options for viewport size, full-page versus clipped screenshots, and device emulation. Beyond simple captures, it can run custom JavaScript before taking the shot, allowing you to open menus, scroll, or manipulate the DOM so the screenshot reflects the desired state. ...
    Downloads: 0 This Week
  • 15
    Code-Mode

    Plug-and-play library to enable agents to call MCP and UTCP tools

    Code-Mode is a plug-and-play library that lets AI agents call tools by executing TypeScript (or via a Python wrapper) instead of making many individual function calls. Its core philosophy is that language models are very good at writing code, so rather than exposing hundreds of separate tool endpoints, you give the model a single “code execution” tool that has access to your full toolkit through code. This approach can dramatically reduce the number of tool-call iterations needed in complex workflows, turning multi-step call chains into a single code execution with internal branching and loops. ...
    Downloads: 0 This Week
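The "single code execution tool" idea in the Code-Mode description can be sketched in a few lines of Python. This is not Code-Mode's API: the tools `get_price` and `apply_discount`, the `run_code` helper, and the snippet standing in for model-written code are all hypothetical. Note also that `exec` with stripped builtins is not a security sandbox.

```python
def get_price(item):
    """Hypothetical tool: look up a unit price."""
    return {"apple": 2, "pear": 3}[item]


def apply_discount(total, percent):
    """Hypothetical tool: apply a percentage discount."""
    return total * (100 - percent) / 100


TOOLS = {"get_price": get_price, "apply_discount": apply_discount}


def run_code(source, tools):
    """Run a code snippet with the tools in scope and return its `result` variable.

    A single namespace is used so loops and helper calls in the snippet resolve
    tool names as globals. This is for illustration only, not isolation.
    """
    namespace = {"__builtins__": {}, **tools}
    exec(source, namespace)
    return namespace.get("result")


# Stand-in for code a model might write: a loop and a branch in one execution,
# instead of four separate tool-call round trips.
MODEL_CODE = """
total = 0
for item in ["apple", "pear", "apple"]:
    total += get_price(item)
if total > 5:
    total = apply_discount(total, 10)
result = total
"""

answer = run_code(MODEL_CODE, TOOLS)
```

Here three price lookups and one conditional discount happen inside one execution, which is the reduction in tool-call iterations the description points to.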
  • 16
    DeepSpeed MII

    MII makes low-latency and high-throughput inference possible

    ...While open-sourcing has democratized access to AI capabilities, their application is still restricted by two critical factors: inference latency and cost. DeepSpeed-MII is an open-source Python library from DeepSpeed that aims to make low-latency, low-cost inference of powerful models not only feasible but also easily accessible. MII offers access to highly optimized implementations of thousands of widely used DL models. MII-supported models achieve significantly lower latency and cost compared to their original implementations.
    Downloads: 0 This Week
  • 17
    GLM-4.1V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    ...It represents a trade-off: somewhat reduced capacity compared to 4.5V or 4.6V, but with benefits in terms of speed, deployability, and lower hardware requirements — making it especially useful for developers experimenting locally, building lightweight agents, or deploying on limited infrastructure. Given its open-source availability under the same project repository, it provides an accessible entry point for testing multimodal reasoning and building proof-of-concept applications.
    Downloads: 0 This Week
  • 18
    UI-TARS

    UI-TARS-desktop version that can operate on your local personal device

    UI-TARS is an open-source multimodal “GUI agent” created by ByteDance: a model designed to perceive raw screenshots (or rendered UI frames), reason about what needs to be done, and then perform real interactions with graphical user interfaces (GUIs) — like clicking, typing, navigating menus — across desktop, browser, mobile, or game environments. Rather than relying on rigid, manually scripted UI automation, UI-TARS uses a unified vision-language model (VLM) that integrates perception,...
    Downloads: 0 This Week
  • 19
    Nextpy

    Self-Modifying Framework from the Future

    NextPy is a Python-based framework for building AI-powered automation agents, allowing developers to create intelligent, rule-based workflows.
    Downloads: 0 This Week
  • 20
    PaSa

    An advanced paper search agent powered by large language models

    PaSa is an open-source “paper search agent” built around large language models (LLMs), designed to automate the process of academic literature retrieval with human-like decision making. Instead of simply translating a query into keywords and returning a flat list of matching papers, PaSa uses a dual-agent architecture (Crawler + Selector) that can iteratively search, read, analyze, and filter academic publications — simulating how a researcher might dig through citation networks, expand references, and evaluate relevance based on both metadata and content. ...
    Downloads: 0 This Week
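The Crawler + Selector split in the PaSa description can be illustrated with a toy sketch. PaSa itself uses LLM agents. Here a breadth-first walk over a small hand-made citation graph stands in for the Crawler and a keyword match stands in for the Selector, and the paper records and function names are invented for illustration.

```python
from collections import deque

# Hypothetical citation graph: paper id -> title and the papers it cites.
PAPERS = {
    "A": {"title": "Sparse attention for long documents", "cites": ["B", "C"]},
    "B": {"title": "Efficient attention kernels", "cites": ["D"]},
    "C": {"title": "A survey of protein folding", "cites": []},
    "D": {"title": "Linear attention transformers", "cites": []},
}


def crawl(seed_ids, max_depth):
    """Crawler stand-in: expand citations breadth-first up to `max_depth` hops."""
    seen = set(seed_ids)
    queue = deque((pid, 0) for pid in seed_ids)
    order = []
    while queue:
        pid, depth = queue.popleft()
        order.append(pid)
        if depth >= max_depth:
            continue  # do not expand references beyond the depth budget
        for cited in PAPERS[pid]["cites"]:
            if cited not in seen:
                seen.add(cited)
                queue.append((cited, depth + 1))
    return order


def select(paper_ids, keyword):
    """Selector stand-in: keep papers whose title mentions the keyword."""
    return [pid for pid in paper_ids if keyword.lower() in PAPERS[pid]["title"].lower()]


def search(seed_ids, keyword, max_depth=2):
    """Crawl first, then filter the gathered candidates."""
    return select(crawl(seed_ids, max_depth), keyword)


deep_hits = search(["A"], "attention", max_depth=2)
shallow_hits = search(["A"], "attention", max_depth=1)
```

Raising the depth budget lets the search reach paper D through B's references, which a flat keyword lookup seeded only with A would miss.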
  • 21
    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    ...The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. M1 is further trained with large-scale reinforcement learning over diverse tasks.
    Downloads: 0 This Week
  • 22
    Agent Payments Protocol (AP2)

    Building a Secure and Interoperable Future for AI-Driven Payments

    AP2 is a project released by Google’s “Agentic Commerce” initiative, focusing on a protocol and reference implementation for agent-driven or AI-mediated payments. In effect, AP2 aims to define a secure, interoperable protocol that allows software agents to act on behalf of users (making payments or shopping decisions autonomously) while preserving the necessary security, auditability, and trust. The repository contains sample scenarios (in Python, Android, etc.) that illustrate how agents, servers, and payment flows would work under the protocol. It includes “types” definitions (the core message and object schema) and example agent implementations to demonstrate the mechanics of agent-to-agent and agent-to-server interactions. ...
    Downloads: 0 This Week
  • 23
    Mezzanine

    CMS framework for Django

    ...But Mezzanine is different in that it provides most of its functionality by default. While other platforms rely heavily on modules or reusable applications, Mezzanine comes ready with all the functionality you need, making it the more efficient choice. Mezzanine has a simple yet highly extensible architecture that lets you really get into the code. Apart from the features that come with Django such as MVC architecture, ORM, templating and caching, Mezzanine comes with a great many other features. This includes hierarchical page navigation, a simple drag-and-drop HTML5 forms builder with CSV export, scheduled publishing, easy page ordering, social media sharing, and so much more.
    Downloads: 0 This Week
  • 24
    Python Zero to Hero for DevOps Engineers

    Learn Python from a DevOps engineer's point of view

    Python Zero to Hero for DevOps Engineers is a structured course laid out as a day-by-day learning path. The repository is organized into Day-01 through Day-19 folders plus a small sample app, which makes it easy to follow in sequence like a bootcamp. The curriculum starts with Python installation, environment setup, and writing your first script, then quickly moves into data types, strings, regular expressions, variables, and functions. It...
    Downloads: 0 This Week
  • 25
    YouT Video Mp3 Downloader

    YouTube video and audio download tool

    YouT Video Mp3 Downloader is an open-source Python application that can download YouTube videos in the highest quality MP4 format and/or as MP3 audio files. The application supports adaptive video/audio streams such as HD, Full HD, 2K, and 4K, offered by YouTube, using the yt-dlp framework. The FFmpeg requirement is automatically resolved; if FFmpeg is not already present when the application is launched, it will download and configure it.
    Downloads: 27 This Week