Search Results for "visual-cfd" - Page 3

Showing 501 open source projects for "visual-cfd"

Active filter: Python

1

UFO³

Weaving the Digital Agent Galaxy

...The system allows users to issue natural language instructions that are translated into automated actions across multiple desktop applications. Using a dual-agent architecture, the framework analyzes both visual interface elements and system control structures in order to understand how applications should be manipulated. This enables the agent to navigate complex software environments and perform tasks that normally require manual interaction. UFO integrates mechanisms for task decomposition, planning, and execution so that high-level user requests can be broken down into smaller steps performed by specialized agents. ...

Downloads: 3 This Week

Last Update: 6 days ago
See Project
2

AstronRPA

Agent-ready RPA suite with a visual workflow automation engine

Astron RPA is an enterprise-grade robotic process automation platform designed to help organizations and developers build automated workflows for desktop and web applications. It provides a visual workflow designer that supports low-code and no-code development, allowing users to create automation processes through a drag-and-drop interface instead of writing extensive code. It enables automation of common desktop software and browser-based tasks, making it suitable for repetitive business operations and system integrations. ...

Downloads: 0 This Week

Last Update: 2026-03-13
See Project
3

Barfi

A Python visual Flow Based Programming library

Barfi is a Python visual Flow-Based Programming (FBP) library that provides a graphical programming interface and integrates into your existing Python workflows. A schema is built using barfi.Blocks and then executed with barfi.ComputeEngine. Each barfi.Block has properties that enable FBP and schema building.

Downloads: 0 This Week

Last Update: 2025-01-06
See Project
4

dnstwist

Detects phishing and lookalike domains using DNS fuzzing techniques

...Security teams can use the tool to discover potential threats where attackers attempt to deceive users with lookalike domains. dnstwist also helps detect phishing activity by comparing web page content and visual similarity between domains using fuzzy hashing and perceptual hashing techniques. By automating DNS fuzzing and analysis, it provides organizations with an additional source of targeted threat intelligence. The tool can output results in structured formats, making it easier to integrate with security workflows or further analyze suspicious domains.

Downloads: 4 This Week

Last Update: 2026-03-06
See Project
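
The DNS fuzzing idea mentioned in the dnstwist entry above can be illustrated with a small sketch. This is not dnstwist's implementation or API: the function name `lookalike_domains` and the three permutation rules (omission, transposition, repetition) are assumptions chosen for illustration only.

```python
from __future__ import annotations


def lookalike_domains(domain: str) -> set[str]:
    """Generate simple lookalike candidates for a domain (hypothetical helper)."""
    # Split "example.com" into the label to fuzz and the rest of the domain.
    name, _, tld = domain.partition(".")
    candidates: set[str] = set()

    # Omission: drop one character ("example" -> "exmple").
    for i in range(len(name)):
        candidates.add(name[:i] + name[i + 1:])

    # Transposition: swap two adjacent characters ("example" -> "xeample").
    for i in range(len(name) - 1):
        candidates.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])

    # Repetition: double one character ("example" -> "eexample").
    for i in range(len(name)):
        candidates.add(name[:i] + name[i] + name[i:])

    # Never report the original name or an empty label.
    candidates.discard(name)
    candidates.discard("")
    return {f"{c}.{tld}" for c in candidates}


if __name__ == "__main__":
    for candidate in sorted(lookalike_domains("example.com")):
        print(candidate)
```

A real tool would then check which candidates are registered and compare their page content; that part is omitted here.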
5

Audiblez

Generate audiobooks from e-books

Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...

Downloads: 59 This Week

Last Update: 2025-11-30
See Project
6

InternVL

A Pioneering Open-Source Alternative to GPT-4o

InternVL is a large-scale multimodal foundation model designed to integrate computer vision and language understanding within a unified architecture. The project focuses on scaling vision models and aligning them with large language models so that they can perform tasks involving both visual and textual information. InternVL is trained on massive collections of image-text data, enabling it to learn representations that capture both visual patterns and semantic meaning. The model supports a wide variety of tasks, including visual perception, image classification, and cross-modal retrieval between images and text. It can also be connected to language models to enable conversational interfaces that understand images, videos, and other visual content. ...

Downloads: 0 This Week

Last Update: 2026-03-04
See Project
7

HunyuanWorld 1.0

Generating Immersive, Explorable, and Interactive 3D Worlds

...The architecture integrates panoramic proxy generation, semantic layering, and hierarchical 3D reconstruction to produce high-quality scene-scale 3D worlds from both text and images. HunyuanWorld-1.0 surpasses existing open-source methods in visual quality and geometric consistency, demonstrated by superior scores in BRISQUE, NIQE, Q-Align, and CLIP metrics.

Downloads: 5 This Week

Last Update: 2026-04-15
See Project
8

DeepSeek VL2

Mixture-of-Experts Vision-Language Models for Advanced Multimodal

...or “Generate a caption appropriate to context”). The model supports both image understanding (vision tasks) and multimodal reasoning, and is likely used as a component in agent systems to process visual inputs as context for downstream tasks. The repository includes evaluation results (e.g. image/text alignment scores, common VL benchmarks), configuration files, and model weights (where permitted). While the internal architecture details are not fully documented publicly, the repo suggests that VL2 introduces enhancements over prior vision-language models (e.g. better scaling, cross-modal attention, more robust alignment) to improve grounding and multimodal understanding.

Downloads: 4 This Week

Last Update: 2025-10-03
See Project
9

HunyuanVideo-Foley

Multimodal Diffusion with Representation Alignment

HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, and games. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. It produces high-quality 48 kHz audio suitable for professional use, using a hybrid architecture that combines multimodal transformer blocks with unimodal refinement blocks. ...

Downloads: 2 This Week

Last Update: 2025-09-28
See Project
10

DriveLM

Driving with Graph Visual Question Answering

DriveLM is a research-oriented framework and dataset designed to explore how vision-language models can be integrated into autonomous driving systems. The project introduces a new paradigm called graph visual question answering that structures reasoning about driving scenes through interconnected tasks such as perception, prediction, planning, and motion control. Instead of treating autonomous driving as a purely sensor-driven pipeline, DriveLM frames it as a reasoning problem where models answer structured questions about the environment to guide decision making. ...

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
11

LlamaGen

Autoregressive Model Beats Diffusion

LlamaGen is an open-source research project that introduces a new approach to image generation by applying the autoregressive next-token prediction paradigm used in large language models to visual generation tasks. Instead of relying on diffusion models, the framework treats images as sequences of tokens that can be generated progressively using transformer architectures similar to those used for text generation. The project explores how scaling autoregressive models and improving image tokenization techniques can produce competitive results compared with modern diffusion-based image generators. ...

Downloads: 0 This Week

Last Update: 2026-03-06
See Project
12

GLM-4.5V

GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

GLM-4.5V is the preceding iteration in the GLM-V series that laid much of the groundwork for general multimodal reasoning and vision-language understanding. It embodies the design philosophy of mixing visual and textual modalities into a unified model capable of general-purpose reasoning, content understanding, and generation, while already supporting a wide variety of tasks: from image captioning and visual question answering to content recognition, GUI-based agents, video understanding, and long-document interpretation. GLM-4.5V emerged from a training framework that leverages scalable reinforcement learning (with curriculum sampling) to boost performance across tasks ranging from STEM problem solving to long-context reasoning, giving it broad applicability beyond narrow benchmarks. ...

Downloads: 1 This Week

Last Update: 5 days ago
See Project
13

FastVLM

This repository contains the official implementation of FastVLM

...The repository documents model variants, showcases head-to-head numbers against known baselines, and explains how the encoder integrates with common LLM backbones. Apple’s research brief frames FastVLM as targeting real-time or latency-sensitive scenarios, where lowering visual token pressure is critical to interactive UX. In short, it’s a practical recipe to make VLMs fast without exotic token-selection heuristics.

Downloads: 0 This Week

Last Update: 2025-10-08
See Project
14

ML Ferret

Refer and Ground Anything Anywhere at Any Granularity

Ferret is Apple’s end-to-end multimodal large language model designed specifically for flexible referring and grounding: it can understand references of any granularity (boxes, points, free-form regions) and then ground open-vocabulary descriptions back onto the image. The core idea is a hybrid region representation that mixes discrete coordinates with continuous visual features, so the model can fluidly handle “any-form” referring while maintaining precise spatial localization. The repo presents the vision-language pipeline, model assets, and paper resources that show how Ferret answers questions, follows instructions, and returns grounded outputs rather than just text. In practice, this enables tasks like “find that small red icon next to the chart and describe it” where both the linguistic reference and the visual region are ambiguous without fine spatial reasoning.

Downloads: 0 This Week

Last Update: 2025-10-08
See Project
15

MoCo (Momentum Contrast)

Self-supervised visual learning using momentum contrast in PyTorch

MoCo is an open source PyTorch implementation developed by Facebook AI Research (FAIR) for the papers “Momentum Contrast for Unsupervised Visual Representation Learning” (He et al., 2019) and “Improved Baselines with Momentum Contrastive Learning” (Chen et al., 2020). It introduces Momentum Contrast (MoCo), a scalable approach to self-supervised learning that enables visual representation learning without labeled data. The core idea of MoCo is to maintain a dynamic dictionary with a momentum-updated encoder, allowing efficient contrastive learning across large batches. ...

Downloads: 0 This Week

Last Update: 5 days ago
See Project
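
The momentum-updated encoder and dynamic dictionary described in the MoCo entry above can be sketched in a few lines. This is a simplified illustration on plain Python lists, not the FAIR PyTorch code: the names `momentum_update` and `build_queue` are hypothetical, and the default `m=0.999` is assumed from the value reported in the MoCo paper.

```python
from collections import deque


def momentum_update(key_params, query_params, m=0.999):
    """Return key-encoder parameters after one momentum step.

    Each key parameter moves slowly toward the matching query parameter:
    k <- m * k + (1 - m) * q
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def build_queue(batches, size):
    """Keep only the most recent `size` keys, as a fixed-length FIFO dictionary."""
    queue = deque(maxlen=size)
    for batch_keys in batches:
        # New keys are enqueued; the oldest keys fall off automatically.
        queue.extend(batch_keys)
    return list(queue)


if __name__ == "__main__":
    print(momentum_update([1.0, 2.0], [0.0, 0.0], m=0.9))
    print(build_queue([["k0", "k1"], ["k2", "k3"]], size=3))
```

Because the key encoder changes slowly, keys that were encoded several steps ago stay roughly consistent with newly encoded ones, which is what lets the queue hold more negatives than fit in a single batch.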
16

Wan2.1

Wan2.1: Open and Advanced Large-Scale Video Generative Model

Wan2.1 is a foundational open-source large-scale video generative model developed by the Wan team, providing high-quality video generation from text and images. It employs advanced diffusion-based architectures to produce coherent, temporally consistent videos with realistic motion and visual fidelity. Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports text-to-video and image-to-video generation tasks with flexible resolution options suitable for various GPU hardware configurations. ...

1 Review

Downloads: 41 This Week

Last Update: 2026-03-05
See Project
17

Book5_Essentials-Probability-Statistics

Book 5: statistics in simplicity

Book5_Essentials-of-Probability-and-Statistics is a Visualize-ML educational volume that introduces the statistical and probabilistic concepts underpinning modern data analysis and machine learning. The repository explains topics such as distributions, sampling, inference, and uncertainty using visual demonstrations and intuitive narratives. Its teaching philosophy prioritizes conceptual clarity over heavy formalism, making statistical thinking more approachable for beginners. The material connects probability theory directly to real analytical workflows, helping learners understand how statistics supports predictive modeling. ...

Downloads: 0 This Week

Last Update: 2026-05-01
See Project
18

GPT-Image2-Skill

GPT Image 2 prompt gallery, image prompt library, agentic skill

GPT-Image2-Skill is a prompt gallery, image prompt library, agent skill, and CLI for OpenAI image generation and editing workflows. It collects curated prompt examples with generated outputs so users can reuse strong visual patterns instead of starting from scratch. The project includes categories such as anime, gaming, cyberpunk, animation, character design, typography, illustration, watercolor, ink, pixel art, isometric scenes, product visuals, and food imagery. It can be installed as an agent skill for supported runtimes or used through a local CLI. ...

Downloads: 1 This Week

Last Update: 2 days ago
See Project
19

Windrecorder

Windrecorder is a memory search app that records everything

...It captures screen content locally and builds a searchable database using OCR and image understanding, allowing users to rewind and rediscover anything they have previously seen. The system indexes only meaningful visual changes, extracting text, browser data, and contextual information to improve search accuracy and reduce storage overhead. It includes a web-based interface where users can browse timelines, analyze activity, and perform semantic queries on recorded content. The tool emphasizes privacy by running entirely offline, ensuring that all captured data remains on the user’s device without external transmission. ...

Downloads: 1 This Week

Last Update: 2026-04-24
See Project
20

Depth Anything 3

Recovering the Visual Space from Any Views

Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity. The model can be applied to photography, AR/VR content creation, robotics perception, and 3D reconstruction workflows, making it versatile across industries and research domains. ...

Downloads: 1 This Week

Last Update: 2026-03-21
See Project
21

DINOv3

Reference PyTorch implementation and models for DINOv3

DINOv3 is the third-generation iteration of Meta’s self-supervised visual representation learning framework, building upon the ideas from DINO and DINOv2. It continues the paradigm of learning strong image representations without labels using teacher–student distillation, but introduces a simplified and more scalable training recipe that performs well across datasets and architectures. DINOv3 removes the need for complex augmentations or momentum encoders, streamlining the pipeline while maintaining or improving feature quality. ...

Downloads: 16 This Week

Last Update: 2026-03-30
See Project
22

Book1_Python-For-Beginners

The Iris Book: Addition, Subtraction, Multiplication, and Division

Book1_Python-For-Beginners is the introductory volume of the Visualize-ML series, designed to teach Python programming to newcomers with no prior coding experience. The repository emphasizes clarity and gradual skill building, starting from fundamental syntax and moving toward practical programming patterns. It integrates visual aids and annotated code examples to help learners understand not just how Python works but why certain patterns are used. The material is structured to support self-paced learning, making it suitable for students, career switchers, and hobbyists. Because the book is part of a larger data science pathway, it also prepares readers for later work in visualization and machine learning. ...

Downloads: 0 This Week

Last Update: 2026-05-01
See Project
23

Book3_Elements-of-Mathematics

From Addition, Subtraction, Multiplication, and Division to ML

Book3_Elements-of-Mathematics is an open learning resource in the Visualize-ML collection that introduces core mathematical foundations required for modern data science and AI. The repository presents topics such as algebra, calculus fundamentals, and mathematical reasoning using a highly visual and beginner-friendly approach. Its goal is to reduce the intimidation barrier often associated with formal mathematics by combining diagrams, structured explanations, and applied examples. The content is organized progressively so learners can build confidence before moving into more advanced quantitative subjects. It is particularly useful for self-taught developers and students transitioning into technical fields that require mathematical literacy. ...

Downloads: 0 This Week

Last Update: 2026-05-01
See Project
24

Screenshot to Code

A neural network that transforms a design mock-up into static websites

Screenshot-to-code is a tool or prototype that attempts to convert UI screenshots (e.g., of mobile or web UIs) into code representations, likely generating layouts, HTML, CSS, or markup from image inputs. It is part of a research/proof-of-concept domain in UI automation and image-to-UI code generation. Its features include mapping visual designs to code constructs, generating code or UI layout (HTML, CSS, or markup), and example/demo scripts showing the path from image to UI to code.

Downloads: 0 This Week

Last Update: 2025-09-26
See Project
25

ManiSkill

SAPIEN Manipulation Skill Framework

...Developed by Hao Su Lab, it focuses on robotic manipulation with diverse, high-quality 3D tasks designed to challenge perception, control, and planning in robotics. ManiSkill provides both low-level control and visual observation spaces for realistic learning scenarios.

Downloads: 0 This Week

Last Update: 2026-04-21
See Project