Page 3 | visual free download

Showing 366 open source projects for "visual"


Filters: Artificial Intelligence, Linux

1

Janus

Unified Multimodal Understanding and Generation Models

Janus is a sophisticated open-source project from DeepSeek AI that aims to unify both visual understanding and image generation in a single model architecture. Rather than having separate systems for “look and describe” and “prompt and generate”, Janus uses an autoregressive transformer framework with a decoupled visual encoder—allowing it to ingest images for comprehension and to produce images from text prompts with shared internal representations.

Downloads: 0 This Week

Last Update: 2025-10-20
See Project
2

swark.io

Create architecture diagrams from code automatically using LLMs

Swark is an open-source developer tool and Visual Studio Code extension that automatically generates software architecture diagrams directly from source code using large language models. The project aims to help developers quickly understand complex codebases by analyzing repositories and producing visual diagrams that represent system architecture, dependencies, and component relationships.

Downloads: 1 This Week

Last Update: 2026-03-06
See Project
3

LlamaGen

Autoregressive Model Beats Diffusion

LlamaGen is an open-source research project that introduces a new approach to image generation by applying the autoregressive next-token prediction paradigm used in large language models to visual generation tasks. Instead of relying on diffusion models, the framework treats images as sequences of tokens that can be generated progressively using transformer architectures similar to those used for text generation. The project explores how scaling autoregressive models and improving image tokenization techniques can produce competitive results compared with modern diffusion-based image generators. ...

Downloads: 1 This Week

Last Update: 2026-03-06
See Project
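
The LlamaGen description above treats an image as a sequence of tokens generated one at a time. A minimal sketch of that sampling loop follows. It is not LlamaGen's code: `toy_probs` is a hypothetical stand-in for a trained transformer, and the grid and vocabulary sizes are arbitrary. A real system would also decode the token grid back to pixels with an image tokenizer.

```python
import random

def sample_image_tokens(next_token_probs, grid_size, vocab_size, seed=0):
    """Generate a grid_size x grid_size grid of token ids, one token at a time."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(grid_size * grid_size):
        # Each step conditions on all tokens produced so far (the prefix).
        probs = next_token_probs(tokens, vocab_size)
        tokens.append(rng.choices(range(vocab_size), weights=probs, k=1)[0])
    # Reshape the flat sequence into rows (raster order).
    return [tokens[r * grid_size:(r + 1) * grid_size] for r in range(grid_size)]

def toy_probs(prefix, vocab_size):
    """Hypothetical model stand-in: favors repeating the previous token."""
    if not prefix:
        return [1.0] * vocab_size
    return [4.0 if t == prefix[-1] else 1.0 for t in range(vocab_size)]

token_grid = sample_image_tokens(toy_probs, grid_size=4, vocab_size=16)
for row in token_grid:
    print(row)
```
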
4

StarVector

StarVector is a foundation model for SVG generation

...The system treats vector graphics creation as a code generation problem, producing SVG code that can render detailed vector images. Its architecture combines computer vision techniques with language modeling capabilities so it can understand visual inputs and textual prompts simultaneously. The model converts raster images or text instructions into structured vector representations, enabling high-quality vectorization and design generation. This approach allows StarVector to create scalable graphics that maintain visual quality regardless of resolution, which is especially useful for design tools and illustration workflows. ...

Downloads: 1 This Week

Last Update: 2026-03-05
See Project
5

FastVLM

This repository contains the official implementation of FastVLM

...The repository documents model variants, showcases head-to-head numbers against known baselines, and explains how the encoder integrates with common LLM backbones. Apple’s research brief frames FastVLM as targeting real-time or latency-sensitive scenarios, where lowering visual token pressure is critical to interactive UX. In short, it’s a practical recipe to make VLMs fast without exotic token-selection heuristics.

Downloads: 1 This Week

Last Update: 2025-10-08
See Project
6

OpenPromptStudio

Visual editor for AI prompts with translation, categories, and tools

OpenPromptStudio is an open source visual editor designed to help users create, organize, and manage prompts for AI image generation tools. It focuses on improving the workflow for building prompts by turning them into structured, visual components that are easier to edit and rearrange. It supports the creation and classification of prompt segments, allowing users to organize them into different types such as styles, quality modifiers, commands, or general prompt elements. ...

Downloads: 1 This Week

Last Update: 4 days ago
See Project
7

ChainForge

An open-source visual programming environment

ChainForge is an open-source visual programming environment designed to help developers systematically test, compare, and evaluate prompts and outputs across multiple large language models in a structured and scalable way. Instead of relying on isolated prompt experimentation, it introduces a dataflow-based interface that allows users to create complex prompt pipelines and evaluate them across different models, parameters, and datasets simultaneously.

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
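
The ChainForge description above mentions evaluating prompts across models, parameters, and datasets at once. The core idea can be sketched as a cross product of prompt templates, input variables, and models. This is an illustration only, not ChainForge's API: `run_grid` and `fake_model` are hypothetical names, and `fake_model` stands in for a real LLM call.

```python
from itertools import product

def run_grid(templates, variables, models, call_model):
    """Fill every template with every variable set and send it to every model."""
    results = []
    for template, var, model in product(templates, variables, models):
        prompt = template.format(**var)
        results.append({
            "model": model,
            "prompt": prompt,
            "output": call_model(model, prompt),
        })
    return results

def fake_model(model, prompt):
    """Hypothetical stand-in for a real LLM call."""
    return f"[{model}] received {len(prompt)} characters"

templates = ["Summarize: {text}", "Explain simply: {text}"]
variables = [{"text": "transformers"}, {"text": "diffusion"}]
models = ["model-a", "model-b"]

grid = run_grid(templates, variables, models, fake_model)
for row in grid:
    print(row["model"], "|", row["prompt"], "|", row["output"])
```

Collecting the results as uniform records makes it straightforward to group or score outputs by model or by template afterwards.
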
8

ClaudeBar

A macOS menu bar application that monitors AI coding assistant usage

...Rather than constantly running CLI commands or navigating web dashboards, users can glance at their quota statistics for services like Claude, Codex, Gemini, GitHub Copilot, and Antigravity directly from the menu bar. The application provides real-time tracking of session, weekly, and model-specific usage percentages, using visual indicators such as color-coded progress bars to communicate when quotas are healthy, nearing limits, or depleted. It includes options to enable or disable monitoring for individual providers, supports multiple visual themes (including dark mode and a festive theme), and refreshes data at configurable intervals so users always have up-to-date information.

Downloads: 0 This Week

Last Update: 4 days ago
See Project
9

VideoRAG

VideoRAG: Chat with Your Videos

VideoRAG is a retrieval-augmented generation (RAG) framework tailored for video content that enables AI systems to answer questions, summarize, and reason over long videos by combining visual embeddings with contextual search. The system works by first breaking video into clips, extracting visual and audio-textual features, and indexing them into embeddings, then using an LLM with a retriever to pull relevant segments on demand. When a user query is received, VideoRAG locates semantically relevant moments in the video using the embedding index, retrieves associated clips or transcripts, and feeds them to a generative model to produce accurate, grounded answers or summaries. ...

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
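
The VideoRAG description above outlines a pipeline of splitting a video into clips, indexing them as embeddings, and retrieving relevant segments for a query. A minimal sketch of the retrieval step follows. It is not VideoRAG's code: the bag-of-words `embed` function is a toy replacement for learned visual and text encoders, and all names and sample clips are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding; a real system would use learned encoders."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def build_index(clips):
    """clips: list of (start_seconds, end_seconds, transcript) tuples."""
    return [(start, end, text, embed(text)) for start, end, text in clips]

def retrieve(index, query, k=2):
    """Return the k clips whose embeddings are most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda clip: cosine(q, clip[3]), reverse=True)
    return [(start, end, text) for start, end, text, _ in ranked[:k]]

clips = [
    (0, 30, "the speaker introduces the depth estimation model"),
    (30, 60, "a demo of the robot arm picking up a cup"),
    (60, 90, "questions from the audience about training data"),
]
index = build_index(clips)
top = retrieve(index, "robot arm demo", k=1)
print(top)
```

In a full pipeline, the retrieved clips or transcripts would then be passed to a generative model as context for answering the query.
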
10

LandPPT

An LLM-based presentation generation platform

...The application integrates multiple AI models from providers such as OpenAI, Anthropic, Google, and locally hosted models to generate text, images, and structured presentation layouts. It also includes template systems and style options that allow presentations to be customized for different industries, visual themes, or storytelling formats.

Downloads: 4 This Week

Last Update: 2026-04-13
See Project
11

Depth Anything 3

Recovering the Visual Space from Any Views

Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity. The model can be applied to photography, AR/VR content creation, robotics perception, and 3D reconstruction workflows, making it versatile across industries and research domains. ...

Downloads: 4 This Week

Last Update: 2026-03-21
See Project
12

video-use

Edit videos with Claude Code

...Designed to work with Claude Code, it automates the entire editing process—from cutting clips to rendering the final output—without requiring manual timelines or complex software interfaces. The system intelligently analyzes audio transcripts and visual cues to make precise, context-aware editing decisions. It supports a wide range of content types, including interviews, tutorials, montages, and talking-head videos. By combining structured text representations with on-demand visual previews, it minimizes processing overhead while maintaining high-quality results. Overall, Video Use reimagines video editing as an AI-driven, conversational workflow.

Downloads: 7 This Week

Last Update: 2026-04-23
See Project
13

Wan2.1

Wan2.1: Open and Advanced Large-Scale Video Generative Model

Wan2.1 is a foundational open-source large-scale video generative model developed by the Wan team, providing high-quality video generation from text and images. It employs advanced diffusion-based architectures to produce coherent, temporally consistent videos with realistic motion and visual fidelity. Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports text-to-video and image-to-video generation tasks with flexible resolution options suitable for various GPU hardware configurations. ...

1 Review

Downloads: 77 This Week

Last Update: 2026-03-05
See Project
14

AliceVision

3D Computer Vision Framework

...The framework is built with a strong emphasis on research-grade algorithms while maintaining the robustness required for production environments, making it suitable for industries such as visual effects, cultural heritage preservation, and robotics. AliceVision is modular, enabling developers to use individual components or customize the pipeline for specific workflows, including panorama stitching and camera tracking. It integrates with tools like Meshroom, which offers a graphical interface to simplify complex reconstruction processes for non-technical users.

Downloads: 2 This Week

Last Update: 2026-03-18
See Project
15

UFO³

Weaving the Digital Agent Galaxy

...The system allows users to issue natural language instructions that are translated into automated actions across multiple desktop applications. Using a dual-agent architecture, the framework analyzes both visual interface elements and system control structures in order to understand how applications should be manipulated. This enables the agent to navigate complex software environments and perform tasks that normally require manual interaction. UFO integrates mechanisms for task decomposition, planning, and execution so that high-level user requests can be broken down into smaller steps performed by specialized agents. ...

Downloads: 3 This Week

Last Update: 2026-03-04
See Project
16

FormCreate

The easy-to-use Vue low-code visual AI form designer

FormCreate is a low-code visual form builder built on Vue that enables developers to create complex, dynamic forms through a drag-and-drop interface rather than manual coding. It is part of the broader form-create ecosystem and leverages JSON-based schema generation to dynamically render forms, handle validation, and manage data collection workflows. The tool is designed to significantly reduce development time by allowing users to visually assemble forms while automatically generating the underlying configuration and logic. ...

Downloads: 0 This Week

Last Update: 2026-03-19
See Project
17

LLM Vision

Visual intelligence for your home.

...The project enables Home Assistant to analyze images, video files, and live camera feeds using vision-capable AI models. Instead of relying only on traditional object detection pipelines, it allows users to send prompts about visual content and receive contextual descriptions or answers about what is happening in camera footage. The system can process events from surveillance platforms such as Frigate and convert them into meaningful summaries, notifications, or structured data for automation workflows. It also maintains a timeline of analyzed camera events that can be displayed in dashboards or queried through the assistant interface.

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
18

Refly

The first open-source agent skills builder

Refly is an AI-native workflow platform that democratizes automated workflow and skills creation for both technical and non-technical users by offering a visual, natural-language-driven interface. Instead of requiring code, Refly lets creators define tasks and business logic through simple “vibes,” which are compiled into structured, reusable agent skills that can be executed on engines like Claude Code, Cursor, or other supported runtimes. With a focus on making automation accessible, it provides a visual canvas and low-code components that feel similar to drag-and-drop builders but backed by powerful AI orchestration, memory handling, and integrations with external services. ...

Downloads: 0 This Week

Last Update: 2026-02-10
See Project
19

agentation

The visual feedback tool for agents

Agentation is a visual annotation and feedback tool designed to make interacting with AI coding agents more intuitive and precise by letting developers visually click on frontend elements in a browser and annotate them with context before sending structured feedback to an agent. Instead of describing UI elements in text — like “the blue button in the sidebar” — users click directly on elements to automatically capture selectors, positions, and contextual metadata that can be consumed by AI agents to locate exact code references. ...

Downloads: 0 This Week

Last Update: 2026-03-25
See Project
20

Dafthunk

A workflow execution platform built on top of the fantastic Cloudflare

...It aims to combine the approachability of a visual editor with the practical needs of real automation: state persistence, execution history, reusable nodes, and integrations with external systems. A key appeal is that you can go from idea to running automation quickly in a hosted-like experience while still keeping the project open source and infrastructure-aware.

Downloads: 0 This Week

Last Update: 4 days ago
See Project
21

ViZDoom

Doom-based AI research platform for reinforcement learning

ViZDoom lets you develop AI bots that play Doom using only visual information (the screen buffer). It is primarily intended for research in machine visual learning, and deep reinforcement learning in particular. ViZDoom is based on ZDoom, the most popular modern source port of Doom. This brings compatibility with a huge range of tools and resources for creating custom scenarios, detailed documentation of the engine and tools, and support from the Doom community. ...

Downloads: 1 This Week

Last Update: 2026-02-11
See Project
22

LatentSync

Taming Stable Diffusion for Lip Sync

...The system leverages a U-Net diffusion backbone, using cross-attention over audio embeddings (from an audio encoder) and reference video frames to guide generation, and applies a set of loss functions (temporal, perceptual, SyncNet-based) to enforce lip-sync accuracy, visual fidelity, and temporal consistency. Across versions, LatentSync has improved temporal stability and lowered resource requirements, making inference more practical (e.g. 8 GB VRAM for earlier versions, somewhat higher for the latest models).

Downloads: 6 This Week

Last Update: 2025-12-02
See Project
23

Model Explorer

A modern model graph visualizer and debugger

Model Explorer is a visual tool for exploring, debugging, and optimizing ML models deployed on edge devices. Developed by Google AI Edge, it offers a browser-based interface to inspect layer-wise performance, memory usage, and inference timing of TensorFlow Lite and other supported models. It’s a powerful utility for developers optimizing models for constrained environments.

Downloads: 0 This Week

Last Update: 2026-02-09
See Project
24

Midscene

Vision-based AI framework for cross-platform UI automation tasks

Midscene.js is an open source AI-driven UI automation framework designed to control user interfaces across multiple platforms using natural language instructions. Instead of relying on traditional selectors, DOM structures, or accessibility attributes, it uses a vision-first approach where screenshots are analyzed by vision-language models to identify interface elements and perform actions. It allows developers to automate interactions on web applications, desktop software, and mobile devices without needing platform-specific automation logic. Developers can describe tasks such as clicking buttons, filling forms, or extracting information, and the system interprets these commands to interact with the interface accordingly. ...

Downloads: 3 This Week

Last Update: 2 hours ago
See Project
25

LTX-2

Python inference and LoRA trainer package for the LTX-2 audio–video

LTX-2 is an open-source audio-video generation model developed by Lightricks. As the tagline indicates, this package provides Python inference code and a LoRA trainer for the model. ...

Downloads: 41 This Week

Last Update: 2026-04-23
See Project