Showing 1226 open source projects for "video-making"

  • 1
    ChatTTS_colab

    One-click deployment (including offline integration package)

    ...It provides an integrated offline bundle and scripts for Windows and macOS so users can run ChatTTS locally without wrestling with complex environment setup. The repository includes Colab notebooks that launch a Gradio-based web UI and expose streaming TTS, making it possible to listen to generated audio as it is produced. A distinctive feature is the “voice gacha” system, which batch-generates many distinct voice timbres and allows users to save the ones they like into a curated voice library. It has first-class support for long-form audio generation, making it suitable for audiobooks, podcasts, or long narration tasks. ...
    Downloads: 0 This Week
  • 2
    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    ...It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. Compared to many open-source TTS tools, IndexTTS emphasizes efficiency and controllability: it offers faster inference, simpler training pipelines, and controllable speech parameters (like duration, pitch, and prosody), which is critical for production use.
    Downloads: 11 This Week
  • 3
    GLM-V

    GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning

    ...The repository provides both GLM-4.5V and GLM-4.1V models, designed to advance beyond basic perception toward higher-level reasoning, long-context understanding, and agent-based applications. GLM-4.5V builds on the flagship GLM-4.5-Air foundation (106B parameters, 12B active), achieving state-of-the-art results on 42 benchmarks across image, video, document, GUI, and grounding tasks. It introduces hybrid training for broad-spectrum reasoning and a Thinking Mode switch to balance speed and depth of reasoning. GLM-4.1V-9B-Thinking incorporates reinforcement learning with curriculum sampling (RLCS) and Chain-of-Thought reasoning, outperforming models much larger in scale (e.g., Qwen-2.5-VL-72B) across many benchmarks.
    Downloads: 1 This Week
  • 4
    System Design Primer

    Learn how to design large-scale systems

    ...The repository also contains study guides for short, medium, and long interview timelines, allowing learners to focus on both breadth and depth depending on their preparation needs. In addition, it includes flashcard decks designed to reinforce learning through spaced repetition, making it easier to retain key system design knowledge.
    Downloads: 6 This Week
  • 5
    ModernGL

    Modern OpenGL binding for Python

    ModernGL is a Python wrapper over OpenGL, designed to simplify the creation of high-performance, modern graphics applications. It provides an intuitive API for rendering 2D and 3D graphics, making it accessible to both beginners and experienced developers. ModernGL is suitable for applications such as games, simulations, and data visualizations.
    Downloads: 1 This Week
  • 6
    ValueCell

    Community-driven, multi-agent platform for financial applications

    ValueCell is a community-driven multi-agent AI platform focused on financial research, analysis, and decision-making that lets users leverage multiple specialized AI agents for tasks like data retrieval, investment research, strategy execution, and market tracking. The system brings together a suite of collaborative agents—such as research agents that gather and interpret fundamentals, strategy agents that implement trading logic, and news agents that deliver personalized updates—to help users make more informed financial decisions across stocks, crypto, and other markets. ...
    Downloads: 4 This Week
  • 7
    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    ...The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and network streams such as RTSP and HLS, making it flexible for live events, monitoring, or accessibility workflows. Configuration options let you control the number of clients, maximum connection time, and threading behavior so the server can be tuned for different deployment environments. On the client side, you can set the language, whether to translate into English, model size, voice activity detection, and output recording behavior.
    Downloads: 10 This Week
  • 8
    TruLens

    Evaluation and Tracking for LLM Experiments

    ...Fine-grained, stack-agnostic instrumentation and comprehensive evaluations help identify failure modes and systematically iterate to improve applications. An easy-to-use interface lets developers compare different versions of their applications, facilitating informed decision-making and optimization. TruLens supports various use cases, including question-answering, summarization, retrieval-augmented generation, and agent-based applications.
    Downloads: 4 This Week
  • 9
    Label Studio

    Label Studio is a multi-type data labeling and annotation tool

    ...Configurable label formats let you customize the visual interface to meet your specific labeling needs. It supports multiple data types, including images, audio, text, HTML, time-series, and video.
    Downloads: 19 This Week
  • 10
    clone-voice

    A sound cloning tool with a web interface, using your voice

    ...The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control cloning and synthesis. It does not require an NVIDIA GPU to run basic tasks, although GPU acceleration can be used when available, making it accessible on modest machines. The tool supports around sixteen languages, including Chinese, English, Japanese, Korean, French, German, Italian, and others, and can capture reference voices directly from a microphone or from uploaded audio.
    Downloads: 12 This Week
  • 11
    CutLER

    Code release for Cut and Learn for Unsupervised Object Detection

    CutLER is an approach for unsupervised object detection and instance segmentation that trains detectors without human-annotated labels, and the repo also includes VideoCutLER for unsupervised video instance segmentation. The method follows a “Cut-and-LEaRn” recipe: bootstrap object proposals, refine them iteratively, and train detection/segmentation heads to discover objects across diverse datasets. The codebase provides training and inference scripts, model configs, and references to benchmarking results that report large gains over prior unsupervised baselines. ...
    Downloads: 0 This Week
  • 12
    Skyvern

    Automate browser-based workflows with LLMs and Computer Vision

    Skyvern uses a combination of computer vision and AI to understand content on a webpage, making it adaptable to any website. Skyvern takes instructions in natural language, allowing it to execute complex objectives with simple commands. Skyvern is an API-first product. Workflows execute in the cloud, so Skyvern can run hundreds of them at the same time. Skyvern's AI decisions come with built-in explanations, providing clear summaries and justifications for every action.
    Downloads: 5 This Week
  • 13
    LangChain

    ⚡ Building applications with LLMs through composability ⚡

    Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications.
    Downloads: 16 This Week
  • 14
    auto-cpufreq

    Automatic CPU speed & power optimizer for Linux

    auto-cpufreq is an automatic CPU speed and power optimizer for Linux. It actively monitors laptop battery state, CPU usage, CPU temperature, and system load, with the aim of improving battery life without making any compromises.
    Downloads: 2 This Week
  • 15
    pyttsx3

    Offline text-to-speech synthesis for Python

    pyttsx3 is an offline text-to-speech library for Python that wraps native speech engines instead of calling cloud APIs. It is designed to work entirely without an internet connection, making it suitable for local automation, kiosks, accessibility tools, and embedded applications. On Windows it uses SAPI5, on Linux it typically uses eSpeak or eSpeak-NG, and on macOS it can use NSSpeechSynthesizer or AVSpeechSynthesizer, giving it broad cross-platform compatibility. The library exposes a simple but flexible API for controlling voice selection, speaking rate, volume, and other synthesis parameters from Python code. ...
    Downloads: 8 This Week
  • 16
    Vanna

    Chat with your SQL database

    Vanna.AI is an AI-powered tool for natural language database querying, enabling users to interact with databases using simple English queries. It converts natural language questions into SQL queries, making data access more intuitive for non-technical users.
    Downloads: 1 This Week
  • 17
    TensorLy

    Tensor Learning in Python

    TensorLy is a Python library that aims to make tensor learning simple and accessible. It lets you easily perform tensor decomposition, tensor learning, and tensor algebra. Its backend system lets you seamlessly run computations with NumPy, PyTorch, JAX, TensorFlow, CuPy, or Paddle, and run methods at scale on CPU or GPU.
    Downloads: 0 This Week
  • 18
    PyTorch3D

    PyTorch3D is FAIR's library of reusable components for deep learning

    ...It’s designed to make it easy to build and train neural networks that work directly with 3D data such as meshes, point clouds, and implicit surfaces. The library provides fast GPU-accelerated implementations of rendering pipelines, transformations, rasterization, and lighting—making it possible to compute gradients through full 3D rendering processes. Researchers use it for tasks like shape generation, reconstruction, view synthesis, and visual reasoning. PyTorch3D also includes utilities for loading, transforming, and sampling 3D assets, so models can be trained end-to-end from 2D supervision or partial data. ...
    Downloads: 6 This Week
  • 19
    Aider

    Aider is AI pair programming in your terminal

    ...Aider creates a structured map of your entire repository, allowing it to handle large and complex projects effectively. It supports over 100 programming languages, making it flexible for nearly any development stack. With built-in Git integration, Aider keeps you in control by automatically committing clean, reversible changes. Whether you’re coding locally or in the cloud, Aider turns natural language requests into reliable, production-ready code.
    Downloads: 6 This Week
  • 20
    Mistral Vibe CLI

    Minimal CLI coding agent by Mistral

    Mistral Vibe is an AI-powered “vibe-coding” command-line interface (CLI) and coding-assistant framework built by Mistral AI to let developers write, refactor, search, and manage code through natural language and context-aware automation, rather than by manual typing alone. It aims to take developers out of repetitive boilerplate and let them stay “in the flow”: you can ask the tool to generate functions, refactor code, search across the codebase, manipulate files, commit changes via Git, or run...
    Downloads: 12 This Week
  • 21
    Motor

    The async Python driver for MongoDB and Tornado or asyncio

    Motor is an asynchronous Python driver for MongoDB that enables developers to work with MongoDB using non-blocking I/O patterns, making it ideal for high-performance and scalable applications. Built on top of Python’s Tornado and asyncio frameworks, Motor lets you issue database operations without blocking the event loop, enabling concurrency in web servers, real-time systems, and microservices. It provides a familiar API surface similar to the official synchronous PyMongo driver, so you can migrate or write MongoDB code in Python without having to learn a completely new interface. ...
    Downloads: 4 This Week
  • 22
    Memobase

    Fast backend for long-term AI user memory via structured profiles

    ...The system focuses on three principal performance metrics: high search performance, reduced large language model (LLM) costs through batch processing techniques, and low latency with minimal SQL operations. Memobase supports integration with existing LLM workflows via APIs and SDKs (including Python, Node, and Go), making it easy to adopt within diverse application stacks.
    Downloads: 7 This Week
  • 23
    Hello Python

    Comprehensive tutorial repository aimed at teaching the Python program

    Hello-Python is a comprehensive tutorial repository aimed at teaching the Python programming language from scratch for beginners. It includes over 100 classes and about 44 hours of video instruction, combined with code samples, projects, and a chat community for support. The material covers the fundamentals (variables, data types, loops, functions) as well as intermediate topics like date handling, list comprehensions, file I/O, regular expressions, modules, and packages. The course is designed to be accessible: no prior programming experience is required, and the resources are freely available. ...
    Downloads: 0 This Week
  • 24
    GLM-4.5V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    ...It embodies the design philosophy of mixing visual and textual modalities into a unified model capable of general-purpose reasoning, content understanding, and generation, while already supporting a wide variety of tasks: from image captioning and visual question answering to content recognition, GUI-based agents, video understanding, and long-document interpretation. GLM-4.5V emerged from a training framework that leverages scalable reinforcement learning (with curriculum sampling) to boost performance across tasks ranging from STEM problem solving to long-context reasoning, giving it broad applicability beyond narrow benchmarks. When it was released, it achieved state-of-the-art results on a large collection of public multimodal benchmarks for open-source models.
    Downloads: 1 This Week
  • 25
    Agent S2

    Agent S: an open agentic framework that uses computers like a human

    ...Through modular architecture, it efficiently handles complex tasks, such as navigating UIs, performing low-level actions like text selection, and executing high-level strategies like planning. Additionally, the system's proactive hierarchical planning allows for real-time adaptation, making it an ideal solution for businesses seeking to streamline operations and automate digital workflows. Agent S2 is designed with flexibility, enabling seamless scaling for future applications and tasks.
    Downloads: 5 This Week