Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence Software
Search Results

Search Results for "video-making" - Page 3

x

Sort By:

Relevance

Clear All Filters

OS

Linux 339
Windows 334
Mac 312
More...
BSD 132
ChromeOS 129
Mobile Operating Systems 14
Desktop Operating Systems 2

Category

Artificial Intelligence 365
Software Development 31
Multimedia 30
Scientific/Engineering 11
Business 10
System 6
Communications 2
Formats and Protocols 2
Database 1
Education 1
Internet 1
Productivity 1
Security 1

License

OSI-Approved Open Source 323
Creative Commons Attribution License 4
Other License 3
GNU Free Documentation License 1

Translations

English 14
Arabic 1
Chinese (Simplified) 1
Chinese (Traditional) 1
More...
French 1
German 1
Korean 1

Programming Language

Python 365
C++ 11
Unix Shell 11
JavaScript 7
C 4
More...
TypeScript 4
Rust 3
C# 2
Java 2
MATLAB 2
Delphi/Kylix 1
Go 1
Julia 1
Lazarus 1
Object Pascal 1
PL/SQL 1
PowerShell 1
R 1

Status

Production/Stable 14
Beta 5
Pre-Alpha 4
Alpha 2

Showing 365 open source projects for "video-making"

View related business solutions

Artificial Intelligence Python Clear Filters & Widen Search

Atera all-in-one platform IT management software with AI agents
Ideal for internal IT departments or managed service providers (MSPs)

Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.

Learn More
Desktop and Mobile Device Management Software
It's a modern take on desktop management that can be scaled as per organizational needs.

Desktop Central is a unified endpoint management (UEM) solution that helps in managing servers, laptops, desktops, smartphones, and tablets from a central location.

Learn More
1

Umi-OCR

OCR software, free and offline

...The software supports flexible usage patterns including screenshot capture OCR, batch processing of large sets of images or documents, PDF parsing, QR code detection, and layout-aware paragraph output. Users can interact with Umi-OCR through a graphical interface, command-line options, or HTTP interfaces, making it adaptable to both casual desktop usage and programmatic automation. Because the project is open source, developers can inspect, modify, and extend its capabilities, and plugins allow for different recognition engines or enhanced features.

Downloads: 68 This Week

Last Update: 2026-01-15
See Project
2

labelme Image Polygonal Annotation

Image polygonal annotation with Python

...It is written in Python and uses Qt for its graphical interface. Image annotation for polygon, rectangle, circle, line and point. Image flag annotation for classification and cleaning. Video annotation. (video annotation). GUI customization (predefined labels / flags, auto-saving, label validation, etc). Exporting VOC-format dataset for semantic/instance segmentation. (semantic segmentation, instance segmentation). Exporting COCO-format dataset for instance segmentation. (instance segmentation). The first time you run labelme, it will create a config file in ~/.labelmerc. ...

Downloads: 10 This Week

Last Update: 2025-11-29
See Project
3

ComfyUI

The most powerful and modular diffusion model GUI, api and backend

The most powerful and modular diffusion model is GUI and backend. This UI will let you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart-based interface. We are a team dedicated to iterating and improving ComfyUI, supporting the ComfyUI ecosystem with tools like node manager, node registry, cli, automated testing, and public documentation. Open source AI models will win in the long run against closed models and we are only at the beginning. Our core mission...

Downloads: 286 This Week

Last Update: 2 days ago
See Project
4

Perception Models

State-of-the-art Image & Video CLIP, Multimodal Large Language Models

Perception Models is a state-of-the-art framework developed by Facebook Research for advanced image and video perception tasks. It introduces two primary components: the Perception Encoder (PE) for visual feature extraction and the Perception Language Model (PLM) for multimodal decoding and reasoning. The PE module is a family of vision encoders designed to excel in image and video understanding, surpassing models like SigLIP2, InternVideo2, and DINOv2 across multiple benchmarks. ...

Downloads: 0 This Week

Last Update: 6 days ago
See Project
Smart Business Texting that Generates Pipeline
Create and convert pipeline at scale through industry leading SMS campaigns, automation, and conversation management.

TextUs is the leading text messaging service provider for businesses that want to engage in real-time conversations with customers, leads, employees and candidates. Text messaging is one of the most engaging ways to communicate with customers, candidates, employees and leads. 1:1, two-way messaging encourages response and engagement. Text messages help teams get 10x the response rate over phone and email. Business text messaging has become a more viable form of communication than traditional mediums. The TextUs user experience is intentionally designed to resemble the familiar SMS inbox, allowing users to easily manage contacts, conversations, and campaigns. Work right from your desktop with the TextUs web app or use the Chrome extension alongside your ATS or CRM. Leverage the mobile app for on-the-go sending and responding.

Learn More
5

Android Use

Automate native Android apps with AI using accessibility APIs

...The project works by using Android’s accessibility API to extract structured UI state (as XML) from the device, which is then fed to a large language model (LLM) like OpenAI’s models for decision-making, and actions are executed via the Android Debug Bridge (ADB). This approach bypasses expensive vision-based models and provides faster, cheaper automation with fine-grained interaction capabilities (for example, tapping buttons, typing text, navigating screens).

Downloads: 5 This Week

Last Update: 5 days ago
See Project
6

BitNet

Inference framework for 1-bit LLMs

BitNet (bitnet.cpp) is a high-performance inference framework designed to optimize the execution of 1-bit large language models, making them more efficient for edge devices and local deployment. The framework offers significant speedups and energy reductions, achieving up to 6.17x faster performance on x86 CPUs and 70% energy savings, allowing the running of models such as the BitNet b1.58 100B with impressive efficiency. With support for lossless inference and enhanced processing power, BitNet enables faster AI applications while minimizing resource usage. ...

Downloads: 5 This Week

Last Update: 2025-06-03
See Project
7

LightRAG

"LightRAG: Simple and Fast Retrieval-Augmented Generation"

LightRAG is a lightweight Retrieval-Augmented Generation (RAG) framework designed for efficient document retrieval and response generation. It is optimized for speed and lower resource consumption, making it ideal for real-time applications.

Downloads: 5 This Week

Last Update: 2026-01-15
See Project
8

Norfair

Lightweight Python library for adding real-time multi-object tracking

...Any detector expressing its detections as a series of (x, y) coordinates can be used with Norfair. This includes detectors performing tasks such as object or keypoint detection. It can easily be inserted into complex video processing pipelines to add tracking to existing projects. At the same time, it is possible to build a video inference loop from scratch using just Norfair and a detector. Supports moving camera, re-identification with appearance embeddings, and n-dimensional object tracking. Norfair provides several predefined distance functions to compare tracked objects and detections. ...

Downloads: 1 This Week

Last Update: 2025-04-30
See Project
9

VGGSfM

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

...The system combines learned feature matching and geometric optimization to generate high-quality camera calibrations, sparse/dense point clouds, and depth maps in standard COLMAP format. Version 2.0 adds support for dynamic scene handling, dense point cloud export, video-based reconstruction (1000+ frames), and integration with Gaussian Splatting pipelines. It leverages tools like PyCOLMAP, poselib, LightGlue, and PyTorch3D for feature matching, pose estimation, and visualization. With minimal configuration, users can process single scenes or full video sequences, apply motion masks to exclude moving objects, and train neural radiance or splatting models directly from reconstructed outputs.

Downloads: 3 This Week

Last Update: 4 hours ago
See Project
The most trusted software in construction
HCSS is the gold standard software solution for winning, planning, and managing construction projects by connecting the office to the field.

HCSS provides easy-to-use software built for construction companies that want to win more work, work smarter, and boost profits. For nearly 40 years, we've helped heavy civil contractors, infrastructure builders, and utility companies improve operations, from estimating and project management to field tracking, equipment maintenance, and safety. Tools like HeavyBid, HeavyJob, and HCSS Safety are built for the field and designed to work together, giving your team real-time visibility, tighter cost control, and better job outcomes. With 45+ accounting integrations and customizable APIs, HCSS fits seamlessly into your tech stack. We regularly update our software based on feedback from real crews, ensuring it fits the way your team works. Backed by award-winning 24/7/365 support and a proven implementation process, HCSS helps reduce risk, cut inefficiencies, and deliver fast ROI. If you're ready to grow your business and gain a competitive edge, HCSS is the partner that gets you there.

Learn More
10

Recurrent Interface Network (RIN)

Implementation of Recurrent Interface Network (RIN)

Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in Pytorch. The author unawaredly reinvented the induced set-attention block from the set transformers paper. They also combine this with the self-conditioning technique from the Bit Diffusion paper, specifically for the latents. The last ingredient seems to be a new noise function based around the sigmoid, which the author claims is better than cosine scheduler for larger images. ...

Downloads: 0 This Week

Last Update: 2024-02-14
See Project
11

SAHI

A lightweight vision library for performing large object detection

...Detection of small objects and objects far away in the scene is a major challenge in surveillance applications. Such objects are represented by small number of pixels in the image and lack sufficient details, making them difficult to detect using conventional detectors. In this work, an open-source framework called Slicing Aided Hyper Inference (SAHI) is proposed that provides a generic slicing aided inference and fine-tuning pipeline for small object detection.

Downloads: 0 This Week

Last Update: 2025-09-28
See Project
12

Semantic Router

Superfast AI decision making and processing of multi-modal data

Semantic Router is a superfast decision-making layer for your LLMs and agents. Rather than waiting for slow, unreliable LLM generations to make tool-use or safety decisions, we use the magic of semantic vector space — routing our requests using semantic meaning. Combining LLMs with deterministic rules means we can be confident that our AI systems behave as intended. Cramming agent tools into the limited context window is expensive, slow, and fundamentally limited.

Downloads: 1 This Week

Last Update: 2025-11-18
See Project
13

SAM 3

Code for running inference and finetuning with SAM 3 model

SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an...

Downloads: 109 This Week

Last Update: 2026-01-12
See Project
14

vJEPA-2

PyTorch code and models for VJEPA2 self-supervised learning from video

VJEPA2 is a next-generation self-supervised learning framework for video that extends the “predict in representation space” idea from i-JEPA to the temporal domain. Instead of reconstructing pixels, it predicts the missing high-level embeddings of masked space-time regions using a context encoder and a slowly updated target encoder. This objective encourages the model to learn semantics, motion, and long-range structure without the shortcuts that pixel-level losses can invite. ...

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
15

Qwen3-Omni

Qwen3-omni is a natively end-to-end, omni-modal LLM

...It achieves state-of-the-art results: across 36 audio and audio-visual benchmarks, it hits open-source SOTA on 32 and overall SOTA on 22, outperforming or matching strong closed-source models such as Gemini-2.5 Pro and GPT-4o. To reduce latency, especially in audio/video streaming, Talker predicts discrete speech codecs via a multi-codebook scheme and replaces heavier diffusion approaches.

Downloads: 4 This Week

Last Update: 2026-01-08
See Project
16

Letta

Letta (formerly MemGPT) is a framework for creating LLM services

Letta is an AI-powered task automation framework designed to handle workflow automation, natural language commands, and AI-driven decision-making.

Downloads: 6 This Week

Last Update: 2026-01-12
See Project
17

nanocode

Minimal Claude Code alternative. Single Python file, zero dependencies

nanocode is a minimalist coding agent implementation designed as a compact alternative to Claude Code, packaged in a single Python file with no external dependencies and totaling around 250 lines of code. It implements a full agentic loop where the model can reason, decide when to use tools, execute those tools, and iterate until producing a final answer, making it useful for simple AI-assisted coding workflows. It includes a set of integrated tools such as read, write, edit, glob, grep, and bash that let the agent interact with the file system and shell commands directly from the terminal, and it keeps a conversation history with colored terminal output for readability. The project exemplifies how lightweight architectures can still support practical agent workflows without complex infrastructure, making it suitable for developers exploring agent frameworks or building custom coding assistants.

Downloads: 0 This Week

Last Update: 52 minutes ago
See Project
18

SwarmZero

SwarmZero's SDK for building AI agents, swarms of agents and much more

SwarmZero is an open-source platform designed for deploying and managing autonomous robot swarms. It enables collective coordination, decentralized decision-making, and real-time collaboration among large groups of autonomous agents, focusing on multi-robot systems and research in swarm robotics.

Downloads: 0 This Week

Last Update: 2025-03-13
See Project
19

Jina

Build cross-modal and multimodal applications on the cloud

Jina is a framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP, GraphQL protocols with TLS. Intuitive design pattern for high-performance microservices. ...

Downloads: 0 This Week

Last Update: 2024-11-12
See Project
20

DeepSeek R1

Open-source, high-performance AI model with advanced reasoning

DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. DeepSeek-R1's training regimen uniquely...

1 Review

Downloads: 98 This Week

Last Update: 2025-07-09
See Project
21

x-unet

Implementation of a U-net complete with efficient attention

Implementation of a U-net complete with efficient attention as well as the latest research findings. For 3d (video or CT / MRI scans).

Downloads: 0 This Week

Last Update: 2024-05-03
See Project
22

GPT Researcher

LLM based autonomous agent that does online comprehensive research

Say Hello to GPT Researcher, your AI agent for rapid insights and comprehensive research. GPT Researcher is the leading autonomous agent that takes care of everything from accurate source gathering to organization of research results.

Downloads: 20 This Week

Last Update: 6 days ago
See Project
23

Paper2GUI

Convert AI papers to GUI

...让每个人都简单方便的使用前沿人工智能技术 Paper2GUI: An AI desktop APP toolbox for ordinary people. It can be used immediately without installation. It already supports 40+ AI models, covering AI painting, speech synthesis, video frame complementing, video super-resolution, object detection, and image stylization. , OCR recognition and other fields. Support Windows, Mac, Linux systems. Paper2GUI: 一款面向普通人的 AI 桌面 APP 工具箱，免安装即开即用，已支持 40+AI 模型，内容涵盖 AI 绘画、语音合成、视频补帧、视频超分、目标检测、图片风格化、OCR 识别等领域。支持 Windows、Mac、Linux 系统。

Downloads: 5 This Week

Last Update: 2024-09-20
See Project
24

ebook2audiobook

Generate audiobooks from e-books, voice cloning & 1107+ languages

...The tool supports a wide array of underlying TTS backends (XTTSv2, Bark, VITS, Fairseq, Tacotron2, YourTTS and more), which gives flexibility depending on hardware availability, voice preference, and language. It also supports a huge number of languages — apparently “+1110 languages and dialects” in its supported set — making it suitable for eBooks in many languages.

Downloads: 22 This Week

Last Update: 1 day ago
See Project
25

Sapiens

High-resolution models for human tasks

...The project emphasizes long-horizon reasoning and cross-modal grounding—connecting language, perception, and action into a single agentic model capable of following abstract goals. It includes simulation environments, datasets, and benchmarks for testing grounded understanding, imitation learning, and decision-making. The system’s modular pipeline supports both imitation-based and reinforcement-based training strategies, allowing flexible experimentation with different embodiments and tasks.

Downloads: 0 This Week

Last Update: 2025-10-07
See Project

Previous
1
2
You're on page 3
4
5
6
7
Next

Related Searches

ai

comfyui

ocr

video ai

android

deepseek

pdf ocr

labelme

recurrent neural networks

sahi

Related Categories

Artificial Intelligence

Software Development

Multimedia

Scientific/Engineering

Business

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise

×

Thanks for helping keep SourceForge clean.

X

You seem to have CSS turned off. Please don't fill out this field.

You seem to have CSS turned off. Please don't fill out this field.

Briefly describe the problem (required):

Upload screenshot of ad (required):

Select a file, or drag & drop file here.

✔

✘

Screenshot instructions:

Click URL instructions:
Right-click on the ad, choose "Copy Link", then paste here →
(This may not be possible with some types of ads)

More information about our ad policies

Ad destination/click URL: