Showing 104 open source projects for "visual code studio"

View related business solutions
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI Studio. Switch between models without switching platforms.
    Start Free
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 1
    SAM 3

    SAM 3

    Code for running inference and finetuning with SAM 3 model

    SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an...
    Downloads: 105 This Week
    Last Update:
    See Project
  • 2
    ComfyUI-LTXVideo

    ComfyUI-LTXVideo

    LTX-Video Support for ComfyUI

    ComfyUI-LTXVideo is a bridge between ComfyUI’s node-based generative workflow environment and the LTX-Video multimedia processing framework, enabling creators to orchestrate complex video tasks within a visual graph paradigm. Instead of writing code to apply effects, transitions, edits, and data flows, users can assemble nodes that represent video inputs, transformations, and outputs, letting them prototype and automate video production pipelines visually. This integration empowers non-programmers and rapid-iteration teams to harness the performance of LTX-Video while maintaining the clarity and flexibility of a dataflow graph model. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 3
    AionUi

    AionUi

    Free, local, open-source Cowork for Gemini CLI, Claude Code, Codex

    AionUi is an open-source, cross-platform graphical interface that turns command-line AI tools into a unified coworking desktop for interacting with multiple local AI agents and CLI models like Gemini CLI, Claude Code, Codex, Qwen Code, and others. Instead of forcing users to work in separate terminals for each tool, AionUi automatically detects installed CLI tools and provides a central visual workspace where sessions can run in parallel, contexts are preserved, and conversations are saved locally without sending data to external servers. It enhances productivity by offering smart file management features like batch renaming, automatic organization, and intelligent file classification, thereby reducing manual overhead when working with large datasets or complex document structures. ...
    Downloads: 36 This Week
    Last Update:
    See Project
  • 4
    AutoMaker

    AutoMaker

    Start directing AI agents

    Automaker is an autonomous AI development studio designed to transform how software is built by allowing developers to describe features, then watching AI agents implement code, tests, commits, and more with minimal manual typing. Instead of writing every line of code by hand, users add feature cards to a Kanban board with natural language descriptions, and AI agents powered by the Claude Agent SDK handle multi-step tasks such as planning, generating code, running tests, and committing to an isolated git worktree. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Host LLMs in Production With On-Demand GPUs Icon
    Host LLMs in Production With On-Demand GPUs

    NVIDIA L4 GPUs. 5-second cold starts. Scale to zero when idle.

    Deploy your model, get an endpoint, pay only for compute time. No GPU provisioning or infrastructure management required.
    Try Free
  • 5
    DotVVM

    DotVVM

    Open source MVVM framework for Web Apps

    ...Save your time with GridView, FileUpload and other components shipped with the framework. Don't spend the time building an API. Just load data from the database and use data-binding to display them. DotVVM needs less than 100 kB of JavaScript code. It's smaller than other ASP.NET-based frameworks. DotVVM offers a free Visual Studio extension giving you all the comfort you are used to. DotVVM comes with ready-made components you can use in your HTML files. The state and user interactions are handled in view models - C# classes. The controls render simple HTML which can be styled easily. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Langflow

    Langflow

    Low-code app builder for RAG and multi-agent AI applications

    Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 7
    Qwen3-VL

    Qwen3-VL

    Qwen3-VL, the multimodal large language model series by Alibaba Cloud

    Qwen3-VL is the latest multimodal large language model series from Alibaba Cloud’s Qwen team, designed to integrate advanced vision and language understanding. It represents a major upgrade in the Qwen lineup, with stronger text generation, deeper visual reasoning, and expanded multimodal comprehension. The model supports dense and Mixture-of-Experts (MoE) architectures, making it scalable from edge devices to cloud deployments, and is available in both instruction-tuned and...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    Plannotator

    Plannotator

    Annotate and review coding agent plans visually, share with your team

    Plannotator is an interactive plan review and annotation tool built to support AI coding agents, offering a visual UI for markup, refinement, and team collaboration around agent-generated plans. It allows developers to annotate proposed plans, sketches, and outlines from tools like Claude Code or OpenCode with pen tools, arrows, and highlighting, seamlessly capturing feedback that can be shared across teams or pushed back to agents. Plannotator integrates with diff views so reviewers can annotate changes line-by-line in git diffs, provide structured feedback, and navigate plans visually rather than through raw text alone. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    Activepieces

    Activepieces

    Open Source AI Automation

    Activepieces is an open-source automation tool designed to build workflows that connect different apps and services without requiring extensive programming knowledge. It’s tailored for technical and non-technical users alike, enabling teams to automate repetitive tasks using a visual editor and a large library of pre-built connectors. Activepieces can be self-hosted or used via a cloud deployment, making it flexible for teams of all sizes. It supports integrations with popular services like...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • 10
    Agent S

    Agent S

    Agent S: an open agentic framework that uses computers like a human

    ...The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. Agent S combines powerful foundation models (such as GPT-5) with grounding models like UI-TARS to translate visual inputs into precise executable actions. It supports flexible deployment via CLI, SDK, or cloud, and integrates with multiple model providers including OpenAI, Anthropic, Gemini, Azure, and Hugging Face endpoints. With optional local code execution, reflection mechanisms, and compositional planning, Agent S provides a scalable and research-driven framework for building advanced computer-use agents.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 11
    FlowGram

    FlowGram

    Extensible workflow development framework

    FlowGram is an open-source, node-based workflow development framework and toolkit aimed at helping developers build custom AI-workflow platforms or automation systems through a visual, drag-and-drop interface. Instead of shipping as a ready-made product, it provides the building blocks — a canvas for wiring together nodes, a form engine for configuring node parameters, a variable-scope and type-inference engine, and a set of “materials” (pre-built node types such as code execution, conditional logic, LLM calls, etc.) that can be composed into larger workflows. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    Taipy

    Taipy

    Turns Data and AI algorithms into production-ready web applications

    ...Struggle with sluggish performance and excessive memory usage, as every data point demands processing. Large datasets become cumbersome, complicating the user experience and data analysis. Scenarios are made easy with Taipy Studio. A powerful VS Code extension that unlocks a convenient graphical editor. Get your methods invoked at a certain time or intervals. Enjoy a variety of predefined themes or build your own.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    Depth Anything 3

    Depth Anything 3

    Recovering the Visual Space from Any Views

    Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity. The model can be applied to photography,...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    GLM-4.6V

    GLM-4.6V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    Wan2.1

    Wan2.1

    Wan2.1: Open and Advanced Large-Scale Video Generative Model

    Wan2.1 is a foundational open-source large-scale video generative model developed by the Wan team, providing high-quality video generation from text and images. It employs advanced diffusion-based architectures to produce coherent, temporally consistent videos with realistic motion and visual fidelity. Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports...
    Downloads: 56 This Week
    Last Update:
    See Project
  • 16
    LTX-2

    LTX-2

    Python inference and LoRA trainer package for the LTX-2 audio–video

    ...The framework targets both interactive graphical applications and media-rich experiences, making it a solid foundation for games, creative tools, or visualization systems that demand both performance and flexibility. While being low-level, it also provides sensible defaults and helper abstractions that reduce boilerplate and help teams maintain clear, maintainable code.
    Downloads: 38 This Week
    Last Update:
    See Project
  • 17
    FramePack

    FramePack

    Lets make video diffusion practical

    FramePack explores compact representations for sequences of image frames, targeting tasks where many near-duplicate frames carry redundant information. The idea is to “pack” frames by detecting shared structure and storing differences efficiently, which can accelerate training or inference on video-like data. By reducing I/O and memory bandwidth, datasets become lighter to load while models still see the essential temporal variation. The repository demonstrates both packing and unpacking...
    Downloads: 17 This Week
    Last Update:
    See Project
  • 18
    A.I.G

    A.I.G

    Full-stack AI Red Teaming platform

    AI-Infra-Guard is a powerful open-source security platform from Tencent’s Zhuque Lab designed to assess the safety and resilience of AI infrastructures, codebases, and components through automated scanning and evaluation tools. It brings together AI infrastructure vulnerability scanning, MCP server risk analysis, and jailbreak evaluation into a unified workflow so that enterprises and individuals can identify critical security issues without relying on external services. Users can deploy it...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    diff2html

    diff2html

    Pretty diff to html javascript library (diff2html)

    Each diff provides a comprehensive visualization of the code changes, helping developers identify problems and better understand the changes. Each diff features a line-by-line and side-by-side preview of your changes. All the code changes are syntax highlighted using highlight.js, providing more readability. Similar lines are paired, allowing for easier change tracking. We work hard to make sure you can have your diffs in a simple and flexible way. The AI community building the future....
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    Better Chatbot

    Better Chatbot

    Just a Better Chatbot. Powered by MCP Client & Workflows

    Better‑chatbot is an AI chatbot framework powered by MCP protocols and workflows, allowing developers to deploy and integrate AI-powered chat systems with ease. Integrates all major LLMs: OpenAI, Anthropic, Google, xAI, Ollama, and more. MCP protocol, web search, JS/Python code execution, data visualization. Custom agents, visual workflows, artifact generation. Custom agents, visual workflows, artifact generation. Realtime voice chat with full MCP tool integration.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    SuperDesign

    SuperDesign

    AI Product Design Agent

    SuperDesign is an innovative open-source AI design agent that embeds directly into your integrated development environment (IDE), enabling developers and designers to generate product interface designs from natural language prompts. It works seamlessly inside environments like VS Code, Cursor, Windsurf, and Claude Code, eliminating the need to switch between design tools and code editors. With a single prompt, SuperDesign can produce full UI mockups, reusable components, and wireframes, dramatically speeding up the early stages of product creation and iteration. Beyond simple visual generation, it lets users fork and refine designs, supporting a dynamic workflow where ideas evolve quickly within the context of your project. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Sa2VA

    Sa2VA

    Official Repo For "Sa2VA: Marrying SAM2 with LLaVA

    Sa2VA is a cutting-edge open-source multi-modal large language model (MLLM) developed by ByteDance that unifies dense segmentation, visual understanding, and language-based reasoning across both images and videos. It merges the segmentation power of a state-of-the-art video segmentation model (based on SAM‑2) with the vision-language reasoning capabilities of a strong LLM backbone (derived from models like InternVL2.5 / Qwen-VL series), yielding a system that can answer questions about...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    LatentSync

    LatentSync

    Taming Stable Diffusion for Lip Sync

    LatentSync is an open-source framework from ByteDance that produces high-quality lip-synchronization for video by using an audio-conditioned latent diffusion model, bypassing traditional intermediate motion representations. In effect, given a source video (with masked or reference frames) and an audio track, LatentSync directly generates frames whose lip motions and expressions align with the audio, producing convincing talking-head or animated lip-sync output. The system leverages a U-Net...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Tiny CUDA Neural Networks

    Tiny CUDA Neural Networks

    Lightning fast C++/CUDA neural network framework

    This is a small, self-contained framework for training and querying neural networks. Most notably, it contains a lightning-fast "fully fused" multi-layer perceptron (technical paper), a versatile multiresolution hash encoding (technical paper), as well as support for various other input encodings, losses, and optimizers. We provide a sample application where an image function (x,y) -> (R,G,B) is learned. The fully fused MLP component of this framework requires a very large amount of shared...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    MiniMind-V

    MiniMind-V

    "Big Model" trains a visual multimodal VLM with 26M parameters

    MiniMind-V is an experimental open-source project that aims to train a very small multimodal vision–language model (VLM) from scratch with extremely low compute and cost, making research and experimentation accessible to more people. The repository showcases training workflows and code designed to produce a 26-million parameter model—including both image and text capabilities—using minimal resources in very little time, reflecting a trend toward democratizing AI research. MiniMind-V combines...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB