Search Results for "visual-mingw" - Page 5

Showing 559 open source projects for "visual-mingw"

1

AdalFlow

The library to build & auto-optimize LLM applications

AdalFlow is a framework for building AI-powered automation workflows, enabling users to design and execute intelligent automation pipelines with minimal coding.

Downloads: 1 This Week

Last Update: 2025-09-25
2

VOID-TOOLS

OSINT, Discord, web & network utilities

VOID-TOOLS is a Python terminal multitool suite for OSINT, Discord, web, network, social, Roblox, and utility workflows. It is built around a Rich-powered terminal dashboard with keyboard navigation, a cinematic boot flow, category panels, and a live sidebar. The tool organizes many modules into categories and marks premium tools separately from free ones. It includes bilingual setup, theme selection, fuzzy search, category filters, custom plugin loading, and remote manifest updates. The...

Downloads: 3 This Week

Last Update: 4 days ago
3

PersonaLive

Expressive Portrait Image Animation for Live Streaming

...The framework prioritizes low-latency and streamable output, making it suitable for real-time creative workflows, broadcast overlays, or interactive avatars on consumer-grade GPUs. PersonaLive’s architecture balances visual quality and efficiency by combining motion encoding, temporal modules, and hybrid implicit control signals to preserve identity and stable expression through long sequences.

Downloads: 5 This Week

Last Update: 2026-05-15
4

Open-AutoGLM

An open phone agent model & framework

...It aims to create an “AI phone agent” that can perceive on-screen content, reason about user goals, and execute sequences of taps, swipes, and text input via automated device control interfaces like ADB, enabling hands-off completion of multi-step tasks such as navigating apps, filling forms, and more. Unlike traditional automation scripts that depend on brittle heuristics, Open-AutoGLM uses pretrained large language and vision-language models to interpret visual context and natural language instructions, giving the agent robust adaptability across apps and interfaces.

Downloads: 10 This Week

Last Update: 2026-03-06
5

Label Studio

Label Studio is a multi-type data labeling and annotation tool

...The frontend part of the Label Studio app lives in the frontend/ folder and is written in React JSX. Multi-user labeling with sign-up and login: when you create an annotation, it is tied to your account. Configurable label formats let you customize the visual interface to meet your specific labeling needs. Support for multiple data types including images, audio, text, HTML, time-series, and video.

Downloads: 12 This Week

Last Update: 2026-03-13
6

ComfyUI IPAdapter plus

ComfyUI reference implementation for IPAdapter models

...It focuses on image-to-image conditioning, letting a reference image guide the subject, style, or composition of a new generation. The project treats IPAdapter like a one-image LoRA, making it useful when users want visual influence without full model training. It includes example workflows that cover the main IPAdapter functions and help users build practical ComfyUI graphs. The extension supports unified loaders, model loaders, advanced apply nodes, attention masks, reference image weighting, and different embedding strategies. It is now in maintenance-only mode, so it is best used by ComfyUI users who need established IPAdapter workflows rather than a rapidly evolving plugin.

Downloads: 6 This Week

Last Update: 2026-06-12
7

docext

An on-premises, OCR-free unstructured data extraction toolkit

...Unlike traditional document processing pipelines that rely heavily on optical character recognition, docext leverages multimodal AI models capable of understanding both visual and textual information directly from document images. This allows the system to detect and extract structured elements such as tables, signatures, key fields, and layout information while maintaining semantic understanding of the document content. The toolkit can also convert complex documents into structured markdown representations that preserve formatting and contextual relationships.

Downloads: 6 This Week

Last Update: 2026-03-12
8

Dask

Parallel computing with task scheduling

Dask is a Python library for parallel and distributed computing, designed to scale analytics workloads from single machines to large clusters. It integrates with familiar tools like NumPy, Pandas, and scikit-learn while enabling execution across cores or nodes with minimal code changes. Dask excels at handling large datasets that don’t fit into memory and is widely used in data science, machine learning, and big data pipelines.

Downloads: 2 This Week

Last Update: 2026-06-11
9

PaddleX

PaddlePaddle End-to-End Development Toolkit

PaddleX is a full-process deep learning development tool based on the core framework, development kits, and tool components of Paddle. It has three characteristics: it opens up the whole process, integrates industrial practice, and is easy to use and integrate. Image classification is the most basic and simplest labeling task. Users only need to put pictures belonging to the same category in the same folder. When the model is trained, we need to divide the training set, the...

Downloads: 6 This Week

Last Update: 2026-06-11
10

armory

3D Engine with Blender Integration

...Powered by the Armory engine, ArmorPaint is stand-alone software designed for physically-based texture painting. Drag and drop your 3D models and start painting. Receive instant visual feedback in the viewport as you paint. Also powered by the Armory engine, ArmorLab is stand-alone software designed for AI-powered texture authoring. Generate PBR materials by dragging and dropping your photos. In development! Armory is an open-source 3D game engine with full Blender integration. The engine is currently available as an early preview.

Downloads: 6 This Week

Last Update: 2026-02-16
11

HTTPie Desktop

Cross-platform API testing client for humans

HTTPie Desktop is a graphical API client built on top of the popular HTTPie terminal tool, offering a user-friendly interface for testing and interacting with APIs. It combines the simplicity of HTTPie’s CLI with a modern desktop and web UI for a more visual workflow. Developers can easily build, send, and preview HTTP requests without needing to memorize commands or write scripts. The platform supports organizing work into spaces, collections, and tabs, making it ideal for managing multiple APIs and projects. It also includes AI-assisted features to help streamline request creation and improve productivity. ...

Downloads: 6 This Week

Last Update: 2025-03-12
12

Frontend Slides

Create beautiful slides on the web using Claude's frontend skills

Frontend Slides is a lightweight tool that enables users to create visually appealing, animation-rich web presentations without requiring knowledge of CSS or JavaScript by leveraging a guided, interactive workflow. It operates on a “show, don’t tell” philosophy, generating visual previews of styles so users can select their preferred design rather than describing it abstractly. The system produces fully self-contained HTML presentations with inline CSS and JavaScript, eliminating the need for external dependencies, build tools, or frameworks. It also supports converting existing PowerPoint files into web-based presentations while preserving content such as images, text, and structure. ...

Downloads: 0 This Week

Last Update: 2026-05-26
13

GitDiagram

AI tool that converts GitHub repositories into interactive diagrams

GitDiagram is an open source web application designed to help developers quickly understand the structure and architecture of GitHub repositories by automatically generating interactive diagrams. It analyzes repository metadata such as the file tree and project documentation to build a visual representation of how different components of a project relate to one another. It uses an AI-powered pipeline to interpret repository structure and transform that information into system design diagrams rendered with Mermaid visualization. These diagrams provide a high-level overview of a codebase, making it easier for developers to explore unfamiliar projects or understand large and complex repositories. ...

Downloads: 0 This Week

Last Update: 5 days ago
14

PaperBanana

Extension of Google Research’s PaperBanana

PaperBanana is an open-source agentic framework designed to automatically generate publication-quality academic diagrams and statistical plots directly from text descriptions. The project focuses on helping researchers, educators, and data scientists transform conceptual descriptions of figures into structured visual outputs suitable for research papers, presentations, and technical reports. Instead of manually designing charts or diagrams using traditional visualization tools, users can describe the desired figure in natural language and allow the system to generate the visual representation automatically. PaperBanana integrates modern multimodal AI models capable of interpreting instructions and producing graphics that follow academic conventions. ...

Downloads: 0 This Week

Last Update: 2026-06-12
15

AppAgent

Multimodal Agents as Smartphone Users, an LLM-based multimodal agent

AppAgent is an open-source multimodal agent framework designed to enable large language models to operate smartphone applications through natural interactions with graphical user interfaces. The system allows an AI agent to interpret visual information from the screen and translate natural language instructions into actions such as tapping, swiping, and navigating between application screens. Instead of requiring backend access to application APIs, the framework interacts with apps the same way a human user would, making it compatible with a wide variety of mobile applications. ...

Downloads: 0 This Week

Last Update: 2026-03-04
16

VGGT-Ω

[CVPR 2026 Oral] VGGT Omega

...The repository also provides a Gradio demo that can visualize predicted cameras and depth-unprojected point clouds as a GLB scene. VGGT-Omega is best suited for researchers and developers working on 3D reconstruction, visual geometry, and image-based scene understanding.

Downloads: 4 This Week

Last Update: 2026-05-26
17

OSWorld

Benchmarking Multimodal Agents for Open-Ended Tasks

...It provides real computer environments in which multimodal agents carry out open-ended tasks. OSWorld emphasizes multimodal interaction, enabling agents to ground their actions in what they observe on screen.

Downloads: 2 This Week

Last Update: 2025-03-13
18

Zerox OCR

PDF to Markdown with vision models

A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all, with weird layouts, tables, charts, etc., so vision models just make sense. ZeroX is an open-source machine learning framework designed for fast experimentation and production deployment, optimized for speed and ease of use.

Downloads: 1 This Week

Last Update: 2024-12-18
19

Slither

Static Analyzer for Solidity

Slither is a Solidity static analysis framework written in Python 3. It runs a suite of vulnerability detectors, prints visual information about contract details, and provides an API to easily write custom analyses. Slither enables developers to find vulnerabilities, enhance their code comprehension, and quickly prototype custom analyses. Slither is the first open-source static analysis framework for Solidity. Slither is fast and precise; it can find real vulnerabilities in a few seconds without user intervention. ...

Downloads: 6 This Week

Last Update: 2026-01-16
20

SIA

AI framework to autonomously improve the performance of any AI system

...It is aimed at research and experimentation across tasks such as machine learning benchmarks, legal classification, code optimization, and scientific workflows. It includes built-in tasks, a command-line runner, and a visual dashboard for following generations as they evolve. It also lets users define custom providers, profiles, seed agents, and task directories without changing the core code.

Downloads: 4 This Week

Last Update: 6 days ago
21

Oasis

Inference script for Oasis 500M

Open-Oasis provides inference code and released weights for Oasis 500M, an interactive world model that generates gameplay frames conditioned on user keyboard input. Instead of rendering a pre-built game world, the system produces the next visual state via a diffusion-transformer approach, effectively “imagining” the world response to your actions in real time. The project focuses on enabling action-conditional frame generation so developers can experiment with interactive, model-generated environments rather than static video generation alone. Because it’s an inference-focused repository, it’s especially useful as a practical reference for running the model, wiring inputs, and producing the autoregressive sequence of gameplay frames. ...

Downloads: 5 This Week

Last Update: 2026-01-06
22

Agent S

Agent S: an open agentic framework that uses computers like a human

...The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. Agent S combines powerful foundation models (such as GPT-5) with grounding models like UI-TARS to translate visual inputs into precise executable actions. It supports flexible deployment via CLI, SDK, or cloud, and integrates with multiple model providers including OpenAI, Anthropic, Gemini, Azure, and Hugging Face endpoints. With optional local code execution, reflection mechanisms, and compositional planning, Agent S provides a scalable and research-driven framework for building advanced computer-use agents.

Downloads: 1 This Week

Last Update: 2025-12-16
23

Droidrun

Powerful framework for controlling Android and iOS devices

Droidrun is a native mobile agent platform that gives users natural-language control over real Android devices to automate any mobile app workflow, from logins and bookings to purchases and data extraction, including access to mobile-only content behind app logins, rate limits, or platform restrictions. Its cloud offering lets users spin up agents in seconds with preinstalled apps, run tasks in parallel across multiple devices, and compose complex, multi-step conditional workflows using...

Downloads: 6 This Week

Last Update: 4 days ago
24

gopro-dashboard-overlay

Programs to process GoPro MP4 & Generic GPX/FIT files

...The tool can also convert metadata into formats like GPX or CSV for further analysis. It is designed for both post-processing workflows and automated video generation pipelines. Overall, it enhances action footage by adding synchronized visual data overlays.

Downloads: 3 This Week

Last Update: 2026-05-02
25

Gemma

Gemma open-weight LLM library, from Google DeepMind

...This repository provides the official implementation of the Gemma PyPI package, a JAX-based library that enables users to load, interact with, and fine-tune Gemma models. The framework supports both text and multi-modal input, allowing natural language conversations that incorporate visual content such as images. It includes APIs for conversational sampling, parameter management, and integration with fine-tuning methods like LoRA. The Gemma library can operate efficiently on CPUs, GPUs, or TPUs, with recommended configurations depending on model size. Through included tutorials and Colab notebooks, users can explore examples covering sampling, multi-modal interactions, and fine-tuning workflows. ...

Downloads: 5 This Week

Last Update: 2026-05-20