Showing 373 open source projects for "visual-cfd"

  • 1
    Rivet

    Visual AI IDE for building agents with prompt chains and graphs

    Rivet is an open source visual AI programming environment designed to help developers build complex AI agents using a node-based interface and prompt chaining workflows. It provides a desktop application that allows users to visually construct and debug AI logic as interconnected graphs, making it easier to manage sophisticated interactions between language models and external tools.
    Downloads: 2 This Week
  • 2
    SAM 3

    Code for running inference and finetuning with SAM 3 model

    SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an open-vocabulary concept specified by a short phrase or exemplars, scaling to a vastly larger set of categories than traditional closed-set models. ...
    Downloads: 35 This Week
  • 3
    DESIGN.md

    A format specification for describing a visual identity

    design.md is an open specification created by Google Labs that defines a standardized way to describe design systems for AI coding agents. It allows developers to encode visual identity elements such as colors, typography, spacing, and components in a structured format. The file combines machine-readable design tokens with human-readable explanations, enabling agents to generate consistent user interfaces aligned with a brand. By providing persistent design context, it eliminates the need to repeatedly describe styling requirements to AI tools. ...
    Downloads: 3 This Week
  • 4
    Qwen3-VL

    Qwen3-VL, the multimodal large language model series by Alibaba Cloud

    Qwen3-VL is the latest multimodal large language model series from Alibaba Cloud’s Qwen team, designed to integrate advanced vision and language understanding. It represents a major upgrade in the Qwen lineup, with stronger text generation, deeper visual reasoning, and expanded multimodal comprehension. The model supports dense and Mixture-of-Experts (MoE) architectures, making it scalable from edge devices to cloud deployments, and is available in both instruction-tuned and reasoning-enhanced variants. Qwen3-VL is built for complex tasks such as GUI automation, multimodal coding (converting images or videos into HTML, CSS, JS, or Draw.io diagrams), long-context reasoning with support up to 1M tokens, and comprehensive video understanding. ...
    Downloads: 7 This Week
  • 5
    Qwen-Image-Layered

    Qwen-Image-Layered: Layered Decomposition for Inherent Editability

    Qwen-Image-Layered is an extension of the Qwen series of multimodal models that introduces layered image understanding, enabling the model to reason about hierarchical visual structures — such as separating foreground, background, objects, and contextual layers within an image. This architecture allows richer semantic interpretation, enabling use cases such as scene decomposition, object-level editing, layered captioning, and more fine-grained multimodal reasoning than with flat image encodings alone. ...
    Downloads: 6 This Week
  • 6
    Smart Excalidraw

    A smart, powerful, and beautiful excalidraw drawing tool

    Smart Excalidraw Next is an AI-powered diagramming tool that allows users to generate professional-quality visual diagrams directly from natural language descriptions, combining generative AI with the flexibility of the Excalidraw canvas. It leverages large language models to interpret user input and automatically produce structured diagrams such as flowcharts, architecture diagrams, ER diagrams, and mind maps with logical layouts and clean visual organization.
    Downloads: 2 This Week
  • 7
    GalTransl

    Automated translation solution for visual novels

    GalTransl is an automated translation system specifically designed for visual novels, particularly those in the “galgame” genre, leveraging large language models to streamline and enhance the translation process. It integrates support for multiple advanced LLM providers such as GPT-4, Claude, DeepSeek, and other models, enabling high-quality, context-aware translations that go beyond traditional machine translation approaches.
    Downloads: 1 This Week
  • 8
    Self-Operating Computer

    A framework to enable multimodal models to operate a computer

    ...Notably, it was the first known project to implement a multimodal model capable of viewing and controlling a computer screen. The framework supports features like Optical Character Recognition (OCR) and Set-of-Mark (SoM) prompting to enhance visual grounding capabilities. It is designed to be compatible with macOS, Windows, and Linux (with X server installed), and is released under the MIT license.
    Downloads: 8 This Week
  • 9
    OpenClaw

    Your own personal AI assistant. Any OS. Any Platform.

    OpenClaw (formerly Clawdbot/Moltbot) is an open-source, self-hosted autonomous AI assistant designed to run on user-controlled hardware and bridge conversational natural language with real-world task execution, effectively acting as a proactive digital assistant rather than a reactive chatbot. It lets you send instructions through familiar messaging platforms like WhatsApp, Telegram, Discord, Slack, Signal, iMessage, and more, and then interprets those instructions to carry out actions such...
    Downloads: 338 This Week
  • 10
    Activepieces

    Open Source AI Automation

    Activepieces is an open-source automation tool designed to build workflows that connect different apps and services without requiring extensive programming knowledge. It’s tailored for technical and non-technical users alike, enabling teams to automate repetitive tasks using a visual editor and a large library of pre-built connectors. Activepieces can be self-hosted or used via a cloud deployment, making it flexible for teams of all sizes. It supports integrations with popular services like Slack, Google Sheets, and Discord, and allows users to create custom pieces to suit unique needs. With real-time logs, version history, and scheduling, Activepieces is positioned as a compelling alternative to Zapier for open-source and privacy-conscious users.
    Downloads: 7 This Week
  • 11
    PySpur

    Visual tool for building, testing, and deploying AI agent workflows

    ...By offering a visual representation of workflows, PySpur makes it easier to debug interactions between components and identify failures in complex pipelines. It supports iterative experimentation, allowing developers to rapidly improve agents without rebuilding systems from scratch. PySpur also enables deployment of finalized workflows after testing, making it suitable for both development and production use.
    Downloads: 0 This Week
  • 12
    Puck

    Open source visual editor for building React drag-and-drop pages

    Puck is an open source visual editor designed for React applications that enables developers to build customizable drag-and-drop page editing experiences. It allows teams to create their own page builders by defining React components that can be arranged and configured through a visual interface. Puck is component-based and configuration-driven, meaning developers specify how components render and which editable fields control their properties.
    Downloads: 0 This Week
  • 13
    CogVLM

    A state-of-the-art open visual language model

    CogVLM is an open-source visual–language model suite—and its GUI-oriented sibling CogAgent—aimed at image understanding, grounding, and multi-turn dialogue, with optional agent actions on real UI screenshots. The flagship CogVLM-17B combines ~10B visual parameters with ~7B language parameters and supports 490×490 inputs; CogAgent-18B extends this to 1120×1120 and adds plan/next-action outputs plus grounded operation coordinates for GUI tasks.
    Downloads: 0 This Week
  • 14
    NVIDIA AI Blueprint

    Suite of reference architectures for building GPU-accelerated vision

    ...The project is organized around real-time video intelligence, downstream analytics, and agentic offline processing. It supports workflows such as natural-language video search, visual question answering, long-video summarization, clip retrieval, verified alerts, and incident analysis. It is designed for technical users who need deployable reference architectures for smart spaces, warehouse automation, SOP validation, monitoring, and operational video analytics. The repository includes Python agent code, Docker Compose deployment configurations, skills, scripts, and a Next.js-based UI.
    Downloads: 6 This Week
  • 15
    Playwright MCP

    Playwright MCP server

    An MCP server developed by Microsoft that offers browser automation capabilities using Playwright, enabling LLMs to interact with web pages through structured accessibility snapshots without relying on visual data.
    Downloads: 1 This Week
  • 16
    ComfyUI-3D-Pack

    An extensive node suite that enables ComfyUI to process 3D inputs

    ...It incorporates modern 3D generation technologies including neural radiance fields, Gaussian splatting, and other AI-driven reconstruction techniques. Through these nodes, users can convert images into 3D models, manipulate geometry, and experiment with generative 3D workflows inside the visual pipeline editor.
    Downloads: 2 This Week
  • 17
    StarVector

    StarVector is a foundation model for SVG generation

    ...The system treats vector graphics creation as a code generation problem, producing SVG code that can render detailed vector images. Its architecture combines computer vision techniques with language modeling capabilities so it can understand visual inputs and textual prompts simultaneously. The model converts raster images or text instructions into structured vector representations, enabling high-quality vectorization and design generation. This approach allows StarVector to create scalable graphics that maintain visual quality regardless of resolution, which is especially useful for design tools and illustration workflows. ...
    Downloads: 2 This Week
  • 18
    GLM-Image

    GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image

    ...It excels at generating images that include complex layouts and detailed text content, making it especially useful for posters, diagrams, infographics, social media graphics, and visual content that requires precise text placement and semantic alignment. Because it blends linguistic reasoning with image synthesis, GLM-Image produces visual outputs where semantic relationships and textual accuracy are prioritized alongside artistic style and realism, and its model structure enables it to handle dense visual knowledge tasks that challenge many pure diffusion models. ...
    Downloads: 2 This Week
  • 19
    LTX-2.3

    Official Python inference and LoRA trainer package

    ...Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while simultaneously producing corresponding audio elements such as speech, music, ambient sound, or effects. This unified approach allows creators to generate complete multimedia sequences where motion, timing, and sound are aligned automatically. LTX-2 is designed for both research and production workflows and can generate high-resolution video clips with precise control over structure, motion, and camera behavior.
    Downloads: 95 This Week
  • 20
    Watermark-Removal

    Machine learning image inpainting task that removes watermarks

    The Watermark-Removal repository is a machine learning project focused on removing visible watermarks from digital images using deep learning and image inpainting techniques. The system analyzes an image containing a watermark and attempts to reconstruct the underlying visual content so that the watermark is removed while preserving the original appearance of the image. The project uses neural network models inspired by research in contextual attention and gated convolution, which are methods commonly applied to image restoration tasks. Through these techniques, the model learns to identify regions of the image affected by the watermark and generate realistic replacements for the missing visual information. ...
    Downloads: 5 This Week
  • 21
    emgucv

    Cross-platform .NET wrapper for the OpenCV image processing library

    Emgu CV is a cross-platform .NET wrapper for the OpenCV image processing library, allowing OpenCV functions to be called from .NET-compatible languages. The wrapper can be compiled with Visual Studio and Unity, and it runs on Windows, Linux, macOS, iOS, and Android.
    Downloads: 16 This Week
  • 22
    video-use

    Edit videos with Claude Code

    ...Designed to work with Claude Code, it automates the entire editing process—from cutting clips to rendering the final output—without requiring manual timelines or complex software interfaces. The system intelligently analyzes audio transcripts and visual cues to make precise, context-aware editing decisions. It supports a wide range of content types, including interviews, tutorials, montages, and talking-head videos. By combining structured text representations with on-demand visual previews, it minimizes processing overhead while maintaining high-quality results. Overall, Video Use reimagines video editing as an AI-driven, conversational workflow.
    Downloads: 12 This Week
  • 23
    Autonomous Agents

    Autonomous Agents (LLMs) research papers. Updated Daily

    ...These methods allow agents to combine visual and geometric information while maintaining awareness of the spatial relationships between agents and objects.
    Downloads: 1 This Week
  • 24
    LLM Vision

    Visual intelligence for your home.

    ...The project enables Home Assistant to analyze images, video files, and live camera feeds using vision-capable AI models. Instead of relying only on traditional object detection pipelines, it allows users to send prompts about visual content and receive contextual descriptions or answers about what is happening in camera footage. The system can process events from surveillance platforms such as Frigate and convert them into meaningful summaries, notifications, or structured data for automation workflows. It also maintains a timeline of analyzed camera events that can be displayed in dashboards or queried through the assistant interface.
    Downloads: 1 This Week
  • 25
    Inkeep

    Create AI Agents in a No-Code Visual Builder or TypeScript SDK

    Inkeep is an open-source framework for building and deploying AI agent workflows and interactive assistants that operate autonomously across applications, enterprise environments, and customer engagement use cases. It lets developers and non-technical users create, manage, and orchestrate multi-agent systems using both a no-code visual builder and a full TypeScript SDK, giving two ways to define agent behaviors that stay in sync with each other. Agents built with this framework can act as real-time conversational assistants — for example, handling help desk inquiries, providing internal support to teams, or driving in-app experiences — and they can be extended to automate multi-step tasks that interact with external systems like CRMs, knowledge bases, or ticketing systems. ...
    Downloads: 0 This Week