Search Results for "video-making" - Page 5

Showing 358 open source projects for "video-making"

1

Auto Synced & Translated Dubs

Automatically translates the text of a video based on a subtitle file

...The tool then time-stretches or compresses each TTS clip to match the original speech duration exactly, which preserves lip-sync and rhythm as closely as possible without manual editing. Finally, it combines all the clips into a single dubbed audio track that can be muxed with the original video, along with new translated subtitle files.

Downloads: 3 This Week

Last Update: 2025-11-28
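
The duration-matching step described above can be illustrated with a small calculation. This is only a sketch under stated assumptions, not the project's actual code: the `Cue` class, the `stretch_factor` helper, and the sample durations are hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle cue (hypothetical structure, times in seconds)."""
    start: float
    end: float
    text: str


def stretch_factor(cue: Cue, tts_duration: float) -> float:
    """Return the tempo multiplier that makes a TTS clip fill its cue exactly.

    A value above 1.0 means the clip must be sped up (compressed);
    a value below 1.0 means it must be slowed down (stretched).
    """
    target = cue.end - cue.start
    if target <= 0 or tts_duration <= 0:
        raise ValueError("durations must be positive")
    return tts_duration / target


# Assumed example data: two cues and the measured length of each TTS clip.
cues = [Cue(0.0, 2.0, "Hello"), Cue(2.5, 4.0, "World")]
tts_durations = [2.5, 0.75]

factors = [stretch_factor(c, d) for c, d in zip(cues, tts_durations)]
print(factors)
```

The resulting factor could then be handed to whatever audio tool performs the actual time-stretching; which tool the project uses is not stated in the listing.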
2

GLM-V

GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning

...The repository provides both GLM-4.5V and GLM-4.1V models, designed to advance beyond basic perception toward higher-level reasoning, long-context understanding, and agent-based applications. GLM-4.5V builds on the flagship GLM-4.5-Air foundation (106B parameters, 12B active), achieving state-of-the-art results on 42 benchmarks across image, video, document, GUI, and grounding tasks. It introduces hybrid training for broad-spectrum reasoning and a Thinking Mode switch to balance speed and depth of reasoning. GLM-4.1V-9B-Thinking incorporates reinforcement learning with curriculum sampling (RLCS) and Chain-of-Thought reasoning, outperforming models much larger in scale (e.g., Qwen-2.5-VL-72B) across many benchmarks.

Downloads: 1 This Week

Last Update: 6 days ago
3

pyttsx3

Offline text-to-speech synthesis for Python

pyttsx3 is an offline text-to-speech library for Python that wraps native speech engines instead of calling cloud APIs. It is designed to work entirely without an internet connection, making it suitable for local automation, kiosks, accessibility tools, and embedded applications. On Windows it uses SAPI5, on Linux it typically uses eSpeak or eSpeak-NG, and on macOS it can use NSSpeechSynthesizer or AVSpeechSynthesizer, giving it broad cross-platform compatibility. The library exposes a simple but flexible API for controlling voice selection, speaking rate, volume, and other synthesis parameters from Python code. ...

Downloads: 8 This Week

Last Update: 2025-11-28
4

PyTorch3D

PyTorch3D is FAIR's library of reusable components for deep learning

...It’s designed to make it easy to build and train neural networks that work directly with 3D data such as meshes, point clouds, and implicit surfaces. The library provides fast GPU-accelerated implementations of rendering pipelines, transformations, rasterization, and lighting—making it possible to compute gradients through full 3D rendering processes. Researchers use it for tasks like shape generation, reconstruction, view synthesis, and visual reasoning. PyTorch3D also includes utilities for loading, transforming, and sampling 3D assets, so models can be trained end-to-end from 2D supervision or partial data. ...

Downloads: 6 This Week

Last Update: 2025-11-27
5

Aider

Aider is AI pair programming in your terminal

...Aider creates a structured map of your entire repository, allowing it to handle large and complex projects effectively. It supports over 100 programming languages, making it flexible for nearly any development stack. With built-in Git integration, Aider keeps you in control by automatically committing clean, reversible changes. Whether you’re coding locally or in the cloud, Aider turns natural language requests into reliable, production-ready code.

Downloads: 6 This Week

Last Update: 2025-08-09
6

RamaLama

Simplifies the local serving of AI models from any source

...Developers can use familiar container commands to pull, run, and interact with AI models from any source, treating models similarly to how container images are handled in OCI workflows. RamaLama supports multiple model registries and offers a REST API or chatbot interface for interacting with running models, making it flexible for local development, testing, or integration into larger systems.

Downloads: 4 This Week

Last Update: 2026-01-16
7

CodeGeeX4

CodeGeeX4-ALL-9B, a versatile model for all AI software development

...It supports tasks such as code completion, generation from natural language descriptions, code translation, bug fixing, and explanation. The repository provides model checkpoints, inference examples, and fine-tuning guides, making it adaptable for both research and practical software development workflows. With its open release, CodeGeeX4 aims to provide a transparent alternative to proprietary coding assistants while advancing the field of AI-assisted programming.

Downloads: 4 This Week

Last Update: 5 days ago
8

Label Studio

Label Studio is a multi-type data labeling and annotation tool

...Configurable label formats let you customize the visual interface to meet your specific labeling needs. It supports multiple data types, including images, audio, text, HTML, time-series, and video.

Downloads: 19 This Week

Last Update: 2025-12-19
9

Audiblez

Generate audiobooks from e-books

Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained on under 100 hours of audio, and supports multiple languages, including English (US/UK), Spanish, French, Hindi, Italian, Japanese, Brazilian Portuguese, and Mandarin Chinese. ...

Downloads: 9 This Week

Last Update: 2025-11-30
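
The final merge step, where per-chapter audio becomes one file with chapter markers, can be sketched as follows. This is a minimal illustration that assumes the chapter markers are written in FFmpeg's FFMETADATA text format; the listing does not say how Audiblez builds its .m4b, and `chapter_metadata`, `escape`, and the sample chapter list are hypothetical.

```python
def escape(value: str) -> str:
    """Backslash-escape the characters FFMETADATA treats as special."""
    out = []
    for ch in value:
        if ch in "=;#\\\n":
            out.append("\\")
        out.append(ch)
    return "".join(out)


def chapter_metadata(chapters: list[tuple[str, float]]) -> str:
    """Build FFMETADATA text from (title, duration_in_seconds) pairs.

    Chapters are laid out back to back, so each one starts where the
    previous one ended. Times are written in milliseconds.
    """
    lines = [";FFMETADATA1"]
    start_ms = 0
    for title, duration in chapters:
        end_ms = start_ms + round(duration * 1000)
        lines += [
            "[CHAPTER]",
            "TIMEBASE=1/1000",
            f"START={start_ms}",
            f"END={end_ms}",
            f"title={escape(title)}",
        ]
        start_ms = end_ms
    return "\n".join(lines) + "\n"


# Assumed example: two synthesized chapters with known durations.
metadata = chapter_metadata([("Intro", 1.5), ("Chapter 1", 2.0)])
print(metadata)
```

A metadata file like this is typically passed to FFmpeg alongside the concatenated audio when producing the final container.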
10

WhisperLive

A nearly-live implementation of OpenAI's Whisper

...The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and network streams such as RTSP and HLS, making it flexible for live events, monitoring, or accessibility workflows. Configuration options let you control the number of clients, maximum connection time, and threading behavior so the server can be tuned for different deployment environments. On the client side, you can set the language, whether to translate into English, model size, voice activity detection, and output recording behavior.

Downloads: 7 This Week

Last Update: 2025-11-28
11

SAM 3D Body

Code for running inference with the SAM 3D Body Model 3DB

...The repository provides Python code to run inference, utilities to download checkpoints from Hugging Face, and demo scripts that turn images into 3D meshes and visualizations. There are Jupyter notebooks that walk you through setting up the model, running it on example images, and visualizing outputs in 3D, making it approachable even if you are not a 3D expert.

Downloads: 7 This Week

Last Update: 2025-12-19
12

CutLER

Code release for Cut and Learn for Unsupervised Object Detection

CutLER is an approach for unsupervised object detection and instance segmentation that trains detectors without human-annotated labels, and the repo also includes VideoCutLER for unsupervised video instance segmentation. The method follows a “Cut-and-LEaRn” recipe: bootstrap object proposals, refine them iteratively, and train detection/segmentation heads to discover objects across diverse datasets. The codebase provides training and inference scripts, model configs, and references to benchmarking results that report large gains over prior unsupervised baselines. ...

Downloads: 0 This Week

Last Update: 2025-10-09
13

ContextGem

ContextGem: Effortless LLM extraction from documents

...It provides a flexible, intuitive API that minimizes boilerplate code, enabling developers to build complex extraction workflows efficiently. ContextGem supports various document formats and integrates with multiple LLM providers, making it a versatile tool for tasks like contract analysis, anomaly detection, and information retrieval.

Downloads: 1 This Week

Last Update: 2025-12-19
14

CogView4

CogView4, CogView3-Plus and CogView3 (ECCV 2024)

...Compared to previous CogView versions, CogView4 introduces architectural upgrades, improved training pipelines, and larger-scale datasets, enabling stronger alignment between textual prompts and generated visual content. It emphasizes bilingual usability, making it well-suited for cross-lingual multimodal applications. The model also supports fine-tuning and downstream customization, extending its applicability to creative content generation, human–computer interaction, and research on vision-language alignment.

Downloads: 5 This Week

Last Update: 4 days ago
15

Agent S2

Agent S: an open agentic framework that uses computers like a human

...Through modular architecture, it efficiently handles complex tasks, such as navigating UIs, performing low-level actions like text selection, and executing high-level strategies like planning. Additionally, the system's proactive hierarchical planning allows for real-time adaptation, making it an ideal solution for businesses seeking to streamline operations and automate digital workflows. Agent S2 is designed with flexibility, enabling seamless scaling for future applications and tasks.

Downloads: 5 This Week

Last Update: 2025-12-16
16

Improved Diffusion

Release for Improved Denoising Diffusion Probabilistic Models

...The implementation is intended for researchers and practitioners who want to explore the theoretical and practical aspects of diffusion models in deep learning. By making this code available, OpenAI provides a foundation for further experimentation and development in generative modeling research.

Downloads: 2 This Week

Last Update: 6 days ago
17

MiniMind-V

"Big Model" trains a visual multimodal VLM with 26M parameters

MiniMind-V is an experimental open-source project that aims to train a very small multimodal vision–language model (VLM) from scratch with extremely low compute and cost, making research and experimentation accessible to more people. The repository showcases training workflows and code designed to produce a 26-million parameter model—including both image and text capabilities—using minimal resources in very little time, reflecting a trend toward democratizing AI research. MiniMind-V combines techniques from modern vision-language modeling but focuses on efficiency and simplicity so that individuals or small teams can explore multimodal learning without massive GPU clusters. ...

Downloads: 3 This Week

Last Update: 6 days ago
18

MedGemma

Collection of Gemma 3 variants that are trained for performance

...It includes multiple variants: a 4-billion-parameter multimodal model that can process both medical images and text, and 27-billion-parameter models (text-only and multimodal) that offer deeper clinical reasoning at higher capacity, making them suitable for complex tasks like medical question answering, summarization of clinical notes, or generating reports from radiology images. The multimodal versions use a SigLIP-based image encoder pre-trained on diverse de-identified medical imaging data.

Downloads: 3 This Week

Last Update: 2026-01-16
19

Transformer Debugger

Tool for exploring and debugging transformer model behaviors

...It combines automated interpretability methods with sparse autoencoders, enabling researchers to analyze how specific neurons, attention heads, and latent features contribute to a model’s outputs. TDB allows users to intervene directly in the forward pass of a model and observe how such interventions change predictions, making it possible to answer questions like why a token was selected or why an attention head focused on a certain input. It automatically identifies and explains the most influential components, highlights activation patterns, and maps relationships across circuits within the model. The tool includes both a React-based neuron viewer for exploring model components and a backend activation server for running inferences and serving data.

Downloads: 3 This Week

Last Update: 6 days ago
20

Skyvern

Automate browser-based workflows with LLMs and Computer Vision

Skyvern uses a combination of computer vision and AI to understand content on a webpage, making it adaptable to any website. Skyvern takes instructions in natural language, allowing it to execute complex objectives with simple commands. Skyvern is an API-first product. Workflows execute in the cloud, allowing it to run hundreds of workflows at the same time. Skyvern's AI decisions come with built-in explanations, providing clear summaries and justifications for every action.

Downloads: 3 This Week

Last Update: 5 days ago
21

Tianji

Evaluation suite designed to assess the performance of LLMs

...It focuses on measuring general capabilities such as reasoning, knowledge, commonsense, and language understanding. Tianji provides a curated set of benchmarks and a unified framework for systematically comparing LLMs, making it useful for research and model selection.

Downloads: 0 This Week

Last Update: 2025-04-29
22

TextWorld

TextWorld is a sandbox learning environment for the training

...Developed by Microsoft Research, TextWorld focuses on language understanding, planning, and interaction in complex, narrative-driven environments. It generates games procedurally, enabling scalable testing of agents’ natural language processing and decision-making abilities.

Downloads: 0 This Week

Last Update: 2025-05-21
23

Dolphin

Document Image Parsing via Heterogeneous Anchor Prompting

Dolphin, maintained by ByteDance, is a document image parsing model. As its title indicates, it parses document images via heterogeneous anchor prompting, targeting page elements such as text paragraphs, tables, and formulas. ...

Downloads: 0 This Week

Last Update: 2025-12-17
24

AskUI Vision Agent

Enable AI to control your desktop, mobile and HMI devices

AskUI’s Vision Agent is an automation framework that allows you—and AI agents—to control real desktops, mobile devices, and HMI systems by perceiving the UI and performing actions like clicking, typing, scrolling, and drag-and-drop. It is designed for multi-platform compatibility and supports multiple AI models so you can tailor perception and decision-making to your workload. The repository presents a feature overview, sample media, and frequent release notes, which show ongoing improvements such as CORS checks and other operational tweaks. The broader AskUI documentation covers the Python Vision Agent along with suite services and inference APIs, indicating a productized ecosystem rather than a single library. ...

Downloads: 5 This Week

Last Update: 6 days ago
25

Map-Anything

MapAnything: Universal Feed-Forward Metric 3D Reconstruction

...The model flexibly accepts different input combinations (images, intrinsics, poses, sparse or dense depth) and produces a rich set of outputs including per-pixel 3D points, camera intrinsics, camera poses, ray directions, confidence maps, and validity masks. Its inference path is fully feed-forward with optional mixed-precision and memory-efficient modes, making it practical to scale to long image sequences while keeping latency predictable.

Downloads: 5 This Week

Last Update: 2026-01-18

