Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence Software
Search Results

Search Results for "video-making" - Page 2

x

Sort By:

Relevance

Clear All Filters

OS

Linux 332
Windows 327
Mac 305
More...
BSD 126
ChromeOS 123
Mobile Operating Systems 14
Desktop Operating Systems 2

Category

Artificial Intelligence 358
Software Development 31
Multimedia 30
Scientific/Engineering 11
Business 10
System 6
Communications 2
Formats and Protocols 2
Database 1
Education 1
Internet 1
Productivity 1
Security 1

License

OSI-Approved Open Source 318
Creative Commons Attribution License 4
Other License 3
GNU Free Documentation License 1

Translations

English 14
Arabic 1
Chinese (Simplified) 1
Chinese (Traditional) 1
More...
French 1
German 1
Korean 1

Programming Language

Python 358
C++ 11
Unix Shell 11
JavaScript 7
C 4
More...
TypeScript 4
Rust 3
C# 2
Java 2
MATLAB 2
Delphi/Kylix 1
Go 1
Julia 1
Lazarus 1
Object Pascal 1
PL/SQL 1
PowerShell 1
R 1

Status

Production/Stable 14
Beta 5
Pre-Alpha 4
Alpha 2

Showing 358 open source projects for "video-making"

View related business solutions

Artificial Intelligence Python Clear Filters & Widen Search

Atera all-in-one platform IT management software with AI agents
Ideal for internal IT departments or managed service providers (MSPs)

Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.

Learn More
G-P - Global EOR Solution
Companies searching for an Employer of Record solution to mitigate risk and manage compliance, taxes, benefits, and payroll anywhere in the world

With G-P's industry-leading Employer of Record (EOR) and Contractor solutions, you can hire, onboard and manage teams in 180+ countries — quickly and compliantly — without setting up entities.

Learn More
1

MiniCPM-o

A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

MiniCPM-o 2.6 is a cutting-edge multimodal large language model (MLLM) designed for high-performance tasks across vision, speech, and video. Capable of running on end-side devices such as smartphones and tablets, it provides powerful features like real-time speech conversation, video understanding, and multimodal live streaming. With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. ...

Downloads: 0 This Week

Last Update: 2025-05-15
See Project
2

h2oGPT

Private chat with local GPT with document, images, video, etc.

h2oGPT is an open-source platform that allows users to interact with local GPT models in a completely private environment. It supports a variety of document types, including PDFs, Word files, images, video frames, and even audio, enabling users to query and analyze their documents or engage in a private chat with AI. The platform is designed to be secure and offline, ensuring that all data remains private and under the user's control. h2oGPT supports several AI models, including oLLaMa and Mixtral, making it a flexible tool for anyone needing advanced document analysis and AI-driven conversation in a secure, local setup.

Downloads: 2 This Week

Last Update: 2025-02-22
See Project
3

AutoClip

AI-powered video clipping and highlight generation

...Once highlights are identified, AutoClip can automatically cut those segments and optionally assemble them into a compilation, thus greatly reducing manual video editing effort. It uses a modern web application stack with a front end (React + TypeScript) for user interaction and a back end that handles downloading, processing, clipping, and queue management, allowing real-time progress feedback and easy deployment, e.g. via Docker.

Downloads: 4 This Week

Last Update: 2025-12-08
See Project
4

HunyuanOCR

OCR expert VLM powered by Hunyuan's native multimodal architecture

...HunyuanOCR handles complex documents: multi-column layouts, tables, mathematical formulas, mixed languages, handwritten or stylized fonts, receipts, tickets, and even video-frame subtitles. The project provides code, pretrained weights, and inference instructions, making it feasible to deploy locally or on a server, and to integrate with applications.

Downloads: 6 This Week

Last Update: 2026-01-13
See Project
Skillfully - The future of skills based hiring
Realistic Workplace Simulations that Show Applicant Skills in Action

Skillfully transforms hiring through AI-powered skill simulations that show you how candidates actually perform before you hire them. Our platform helps companies cut through AI-generated resumes and rehearsed interviews by validating real capabilities in action. Through dynamic job specific simulations and skill-based assessments, companies like Bloomberg and McKinsey have cut screening time by 50% while dramatically improving hire quality.

Learn More
5

Phenaki - Pytorch

Implementation of Phenaki Video, which uses Mask GIT

...This repository will also endeavor to allow the researcher to train on text-to-image and then text-to-video. Similarly, for unconditional training, the researcher should be able to first train on images and then fine tune on video.

Downloads: 1 This Week

Last Update: 2024-07-29
See Project
6

Vidi2

Large Multimodal Models for Video Understanding and Editing

Vidi is a family of large multimodal models developed for deep video understanding and editing tasks, integrating vision, audio, and language to allow sophisticated querying and manipulation of video content. It’s designed to process long-form, real-world videos and answer complex queries such as “when in this clip does X happen?” or “where in the frame is object Y during that moment?” — offering temporal retrieval, spatio-temporal grounding (i.e. locating objects over time + space), and even video question answering. ...

Downloads: 0 This Week

Last Update: 2026-01-12
See Project
7

Story Flicks

Generate high-definition story short videos with one click using AI

Story Flicks is another open-source project in the AI-assisted video generation / editing space, focused on creating short, story-style videos from script or prompt inputs. It aims to let users generate high-definition short movies or video stories with minimal manual effort, using AI models under the hood to assemble visuals, timing, and possibly narration or subtitles. For creators who want to produce narrative short-form content — whether for social media, storytelling, or prototyping video ideas — story-flicks offers a lightweight, code-backed alternative to complex video editing suites. ...

Downloads: 0 This Week

Last Update: 2025-12-14
See Project
8

HunyuanVideo-I2V

A Customizable Image-to-Video Model based on HunyuanVideo

HunyuanVideo-I2V is a customizable image-to-video generation framework from Tencent Hunyuan, built on their HunyuanVideo foundation. It extends video generation so that given a static reference image plus an optional prompt, it generates a video sequence that preserves the reference image’s identity (especially in the first frame) and allows stylized effects via LoRA adapters. The repository includes pretrained weights, inference and sampling scripts, training code for LoRA effects, and support for parallel inference via xDiT. ...

Downloads: 0 This Week

Last Update: 2025-09-23
See Project
9

HunyuanCustom

Multimodal-Driven Architecture for Customized Video Generation

HunyuanCustom is a multimodal video customization framework by Tencent Hunyuan, aimed at generating customized videos featuring particular subjects (people, characters) under flexible conditions, while maintaining subject/identity consistency. It supports conditioning via image, audio, video, and text, and can perform subject replacement in videos, generate avatars speaking given audio, or combine multiple subject images.

Downloads: 0 This Week

Last Update: 2025-10-15
See Project
The Most Powerful Software Platform for EHSQ and ESG Management
Addresses the needs of small businesses and large global organizations with thousands of users in multiple locations.

Choose from a complete set of software solutions across EHSQ that address all aspects of top performing Environmental, Health and Safety, and Quality management programs.

Learn More
10

Director

AI video agents framework for next-gen video interactions

Director is a video database management system designed to organize, search, and retrieve large collections of video content efficiently.

Downloads: 0 This Week

Last Update: 2025-01-29
See Project
11

SeedVR

Repo for SeedVR2 & SeedVR

SeedVR (from the ByteDance-Seed organization) is an open-source research and implementation repository focused on cutting-edge video restoration using diffusion transformer architectures. The project includes both the original SeedVR and its successor SeedVR2 models, which are designed to restore degraded or low-quality video content by learning to reconstruct high-fidelity frames with temporal coherence. These models leverage advanced techniques such as adaptive attention mechanisms and adversarial training to produce visually appealing results in a single inference step, pushing the boundaries of video restoration research. ...

Downloads: 2 This Week

Last Update: 2026-01-07
See Project
12

Fooocus

Focus on prompting and generating

...Built on Gradio and leveraging Stable Diffusion XL, Fooocus eliminates the need for manual parameter tweaking, allowing users to focus solely on crafting prompts. It offers a user-friendly interface with minimal setup, making advanced image synthesis accessible to a broader audience.

Downloads: 227 This Week

Last Update: 2025-06-03
See Project
13

Qwen2.5-Omni

Capable of understanding text, audio, vision, video

Qwen2.5-Omni is an end-to-end multimodal flagship model in the Qwen series by Alibaba Cloud, designed to process multiple modalities (text, images, audio, video) and generate responses both as text and natural speech in streaming real-time. It supports “Thinker-Talker” architecture, and introduces innovations for aligning modalities over time (for example synchronizing video/audio), robust speech generation, and low-VRAM/quantized versions to make usage more accessible. It holds state-of-the-art performance in many multimodal benchmarks, particularly spoken language understanding, audio reasoning, image/video understanding, etc. ...

Downloads: 1 This Week

Last Update: 2025-09-23
See Project
14

MiniMax-MCP

Official MiniMax Model Context Protocol (MCP) server

...It acts as a bridge between tools like Claude Desktop, Cursor, Windsurf, OpenAI Agents, and the MiniMax platform, exposing capabilities such as text-to-speech, voice cloning, image generation, text-to-image, video generation, image-to-video, text-to-video, and music generation. The server is written in Python and distributed under the MIT license, with a pyproject.toml and uv-based workflow that makes installation and execution reproducible. Configuration is handled through JSON files that tell MCP clients how to launch the server (typically via uvx minimax-mcp) and which environment variables to use for the API key, host, and output directory. ...

Downloads: 2 This Week

Last Update: 2026-01-07
See Project
15

HunyuanVideo-Avatar

Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

...Character image injection module for better consistency between training and inference conditioning. Emotion control by extracting emotion reference images and transferring emotional style into video sequences.

Downloads: 4 This Week

Last Update: 2025-12-16
See Project
16

highway-env

A minimalist environment for decision-making in autonomous driving

HighwayEnv is an OpenAI Gym-compatible environment focused on autonomous driving scenarios. It provides flexible simulations for testing decision-making algorithms in highway, intersection, and merging traffic situations.

Downloads: 1 This Week

Last Update: 2025-10-18
See Project
17

Godot RL Agents

An Open Source package that allows video game creators

...It allows AI agents to learn how to interact with and play Godot-based games using RL algorithms. The toolkit bridges Godot with Python-based RL libraries like Stable-Baselines3, making it possible to create complex and visually rich RL environments natively in Godot.

Downloads: 0 This Week

Last Update: 2025-03-13
See Project
18

MESHROOM

3D reconstruction software

Photogrammetry is the science of making measurements from photographs. It infers the geometry of a scene from a set of unordered photographies or videos. Photography is the projection of a 3D scene onto a 2D plane, losing depth information. The goal of photogrammetry is to reverse this process. The dense modeling of the scene is the result yielded by chaining two computer vision-based pipelines, “Structure-from-Motion” (SfM) and “Multi View Stereo” (MVS).

1 Review

Downloads: 197 This Week

Last Update: 2025-08-19
See Project
19

Sa2VA

Official Repo For "Sa2VA: Marrying SAM2 with LLaVA

..., or “track this object through the video,” outputting pixel-perfect masks or spoken/textual answers as appropriate.

Downloads: 0 This Week

Last Update: 2025-12-02
See Project
20

HunyuanVideo-Foley

Multimodal Diffusion with Representation Alignment

HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. Produces high-quality 48 kHz audio output suitable for professional use. ...

Downloads: 1 This Week

Last Update: 2025-09-28
See Project
21

AUTOMATIC1111 Stable Diffusion web UI

Stable Diffusion web UI

...With a flexible installation process across Windows, Linux, and Apple Silicon, plus support for GPUs and CPUs, it caters to a wide range of users—from hobbyists to professionals. The interface also supports prompt editing, batch processing, custom scripts, and many community extensions, making it a highly customizable and continually evolving platform for creative AI art generation.

1 Review

Downloads: 275 This Week

Last Update: 2025-06-02
See Project
22

fastdup

An unsupervised and free tool for image and video dataset analysis

fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. Assisting you to increase your dataset images & labels quality and reduce your data operations costs at an unparalleled scale.

Downloads: 2 This Week

Last Update: 2024-08-16
See Project
23

Clay Foundation Model

The Clay Foundation Model - An open source AI model and interface

The Clay Foundation Model is an open-source AI model and interface designed to provide comprehensive data and insights about Earth. It aims to serve as a foundational tool for environmental monitoring, research, and decision-making by integrating various data sources and offering an accessible platform for analysis.

Downloads: 1 This Week

Last Update: 2025-07-05
See Project
24

Agno

Lightweight framework for building Agents with memory, knowledge, etc.

Agno is a modular, open-source artificial general intelligence (AGI) research platform that allows developers to build, evaluate, and experiment with cognitive architectures in a composable way. It provides a flexible framework for modeling reasoning, memory, decision-making, and planning, aimed at long-term AI research beyond narrow learning. Agno embraces multi-agent environments and symbolic reasoning as part of its core design, enabling experiments with structured knowledge, goal-oriented behaviors, and meta-learning. It’s designed for researchers seeking an extensible platform to explore AGI components without being tied to black-box models.

Downloads: 5 This Week

Last Update: 1 day ago
See Project
25

GLM-4.6

Agentic, Reasoning, and Coding (ARC) foundation models

GLM-4.6 is the latest iteration of Zhipu AI’s foundation model, delivering significant advancements over GLM-4.5. It introduces an extended 200K token context window, enabling more sophisticated long-context reasoning and agentic workflows. The model achieves superior coding performance, excelling in benchmarks and practical coding assistants such as Claude Code, Cline, Roo Code, and Kilo Code. Its reasoning capabilities have been strengthened, including improved tool usage during inference...

Downloads: 172 This Week

Last Update: 2026-01-20
See Project

Previous
1
You're on page 2
3
4
5
6
Next

Related Searches

meshroom

fooocus

mega-voice

automatic1111

arabic text to speech

glm 4.6

speech synthesis

.mega-voice

ocr

video ai

Related Categories

Artificial Intelligence

Software Development

Multimedia

Scientific/Engineering

Business

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise

×

Thanks for helping keep SourceForge clean.

X

You seem to have CSS turned off. Please don't fill out this field.

You seem to have CSS turned off. Please don't fill out this field.

Briefly describe the problem (required):

Upload screenshot of ad (required):

Select a file, or drag & drop file here.

✔

✘

Screenshot instructions:

Click URL instructions:
Right-click on the ad, choose "Copy Link", then paste here →
(This may not be possible with some types of ads)

More information about our ad policies

Ad destination/click URL: