Showing 559 open source projects for "visual-mingw"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Your monitoring isn't a stack. It's a pile. Fix that. Icon
    Your monitoring isn't a stack. It's a pile. Fix that.

    Errors, performance, logs, uptime. One install, one invoice, one UI.

    Replace Datadog, New Relic, and Sentry without adding three more dashboards.
    Free 30 days.
  • 1
    gopro-dashboard-overlay

    gopro-dashboard-overlay

    Programs to process GoPro MP4 & Generic GPX/FIT files

    ...The tool can also convert metadata into formats like GPX or CSV for further analysis. It is designed for both post-processing workflows and automated video generation pipelines. Overall, it enhances action footage by adding synchronized visual data overlays.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    OSWorld

    OSWorld

    Benchmarking Multimodal Agents for Open-Ended Tasks

    ...It provides a richly simulated 3D world where multiple agents can interact, perform tasks, and learn complex behaviors. OSWorld emphasizes multi-modal interaction, enabling agents to process visual, auditory, and symbolic data for grounded learning in a simulated world.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    PPT Builder Skill

    PPT Builder Skill

    AI-friendly PPT builder skill: 17 hand-polished Chinese PPTX templates

    ...It can also work with user-provided templates by inspecting slide images and shape structures before applying non-destructive changes. For original presentations, it guides agents toward clean layouts with simple typography, generous spacing, and restrained visual elements. The project is intended for personal learning and research use, not commercial use.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    nunif

    nunif

    Misc; latest version of waifu2x; 2D video to stereo 3D video

    ...The project provides a collection of AI-powered utilities designed primarily for anime-style artwork, illustrations, and high-quality image restoration workflows. It includes command-line tools and graphical interfaces for applying trained neural models to improve image resolution and visual clarity while minimizing artifacts. nunif supports GPU acceleration and batch processing, making it suitable for creators, archivists, and enthusiasts handling large image collections. The framework is highly modular, allowing developers to experiment with custom models, inference pipelines, and image-processing workflows. Its emphasis on anime and illustration enhancement has made it especially popular in digital art and media preservation communities.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 5
    Web Dev for Beginners

    Web Dev for Beginners

    About 24 Lessons, 12 Weeks, Get Started as a Web Developer

    ...Each lesson includes a mix of pre-lecture quizzes, written content, assignments, challenges, and post-lecture quizzes to reinforce learning. The course also offers global accessibility with translations in more than 40 languages and built-in support for running in GitHub Codespaces or locally in Visual Studio Code. This makes it a practical and engaging way for beginners to gain a solid foundation in web development.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    ERAlchemy

    ERAlchemy

    Entity Relation Diagrams generation tool

    ERAlchemy is a tool that generates Entity-Relationship (ER) diagrams from databases or SQLAlchemy models and vice versa. It’s useful for database documentation, reverse engineering, and understanding complex schemas. ERAlchemy can export diagrams in formats like Graphviz and Mermaid, making it easy to include in reports or markdown files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Microsoft Azure CLI

    Microsoft Azure CLI

    Azure command-line interface

    ...We support tab completion for groups, commands, and some parameters. You can use the --query parameter and the JMESPath query syntax to customize your output. With the Azure CLI Tools Visual Studio Code extension, you can create .azcli files and use these features. IntelliSense for commands and their arguments. Snippets for commands, inserting required arguments automatically. Run the current command in the integrated terminal. Run the current command and show its output in a side-by-side editor. Show documentation on mouse hover. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Tally

    Tally

    Let agents classify your bank transactions

    Tally is an open-source, AI-assisted tool designed to automate the classification of personal financial transactions, helping users turn raw bank data into meaningful categories without manual tagging. At its core, Tally pairs a local rule engine with large language models so that an AI assistant (like Claude Code, Copilot, or any CLI agent) interprets, suggests, and categorizes expenses, savings, subscriptions, and income events based on your own rules and behavior. It generates...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 9
    Stable Virtual Camera

    Stable Virtual Camera

    Stable Virtual Camera: Generative View Synthesis with Diffusion Models

    ...Unlike traditional methods that require complex reconstruction or scene-specific optimization, this model allows users to generate novel views from any number of input images and define custom camera trajectories, enabling dynamic exploration of scenes. It supports various aspect ratios and can produce 3D-consistent videos up to 1,000 frames, making it a versatile tool for creators seeking to enhance visual storytelling. ​
    Downloads: 2 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 10
    Flagsmith

    Flagsmith

    Open source feature flagging and remote config service

    Release features with confidence; manage feature flags across web, mobile, and server-side applications. Use our hosted API, deploy to your own private cloud, or run on-premises. Flagsmith provides an all-in-one platform for developing, implementing, and managing your feature flags. Whether you are moving off an in-house solution or using toggles for the first time, you will be amazed by the power and efficiency gained by using Flagsmith. Flagsmith makes it easy to create and manage feature...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    Phi-3-MLX

    Phi-3-MLX

    Phi-3.5 for Mac: Locally-run Vision and Language Models

    Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Qwen3-Omni

    Qwen3-Omni

    Qwen3-omni is a natively end-to-end, omni-modal LLM

    ...It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality. The model supports 119 text languages, 19 speech input languages, and 10 speech output languages. It achieves state-of-the-art results: across 36 audio and audio-visual benchmarks, it hits open-source SOTA on 32 and overall SOTA on 22, outperforming or matching strong closed-source models such as Gemini-2.5 Pro and GPT-4o. To reduce latency, especially in audio/video streaming, Talker predicts discrete speech codecs via a multi-codebook scheme and replaces heavier diffusion approaches.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Browser Use MCP Server

    Browser Use MCP Server

    Browse the web, directly from Cursor etc.

    A browser automation server implementing the Model Context Protocol, designed to allow AI assistants to browse the web directly from applications like Cursor. It supports natural language commands for web navigation and interaction. ​
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    ImageReward

    ImageReward

    [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences

    ImageReward is the first general-purpose human preference reward model (RM) designed for evaluating text-to-image generation, introduced alongside the NeurIPS 2023 paper ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation. Trained on 137k expert-annotated image pairs, ImageReward significantly outperforms existing scoring methods like CLIP, Aesthetic, and BLIP in capturing human visual preferences. It is provided as a Python package (image-reward) that enables quick scoring of generated images against textual prompts, with APIs for ranking, scoring, and filtering outputs. Beyond evaluation, ImageReward supports Reward Feedback Learning (ReFL), a method for directly fine-tuning diffusion models such as Stable Diffusion using human-preference feedback, leading to demonstrable improvements in image quality.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    VisPy

    VisPy

    Main repository for Vispy

    Vispy is an open-source, high-performance interactive visualization library in Python, designed for creating scientific visualizations and interactive plots. It leverages the power of modern Graphics Processing Units (GPUs) through OpenGL to render large datasets efficiently. Vispy supports a wide range of visualization types, including 2D plots, 3D visualizations, volume rendering, and more, making it suitable for scientific research, data analysis, and educational purposes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Elyra

    Elyra

    Elyra extends JupyterLab with an AI centric approach

    Elyra is a set of AI-centric extensions to JupyterLab Notebooks. The Elyra Getting Started Guide includes more details on these features. A version-specific summary of new features is located on the releases page.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Videomass

    Videomass

    Videomass is a free, open source and cross-platform GUI for FFmpeg

    Videomass is a free, open-source graphical interface for FFmpeg designed to make advanced video and audio processing accessible to both beginners and experienced users. Built in Python using wxPython, it provides a cross-platform environment for managing encoding, conversion, and editing tasks through a visual interface. The software supports multitasking operations, allowing users to process multiple media files simultaneously. It offers extensive configuration options while also providing presets to simplify common workflows. Videomass integrates closely with FFmpeg, exposing powerful capabilities such as transcoding, filtering, and format conversion without requiring command-line interaction. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    RAG Anything

    RAG Anything

    RAG-Anything: All-in-One RAG Framework

    RAG-Anything is an open-source unified framework that extends the Retrieval-Augmented Generation (RAG) paradigm to fully multimodal document and knowledge retrieval, enabling systems to ingest, parse, represent, and query rich content that includes text, images, tables, formulas, and other structured or visual elements. Traditional RAG systems are typically limited to text and cannot effectively work across heterogeneous document layouts, but RAG-Anything addresses this by modeling multimodal content in ways that preserve cross-modal relationships and semantic context, often treating content elements as interconnected knowledge entities rather than separate data silos. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    LongCat-Image

    LongCat-Image

    Foundation model for image generation

    LongCat-Image is an open-source foundation model for image generation and editing created by the LongCat team at Meituan, designed to deliver high-quality visual outputs while remaining efficient and accessible for developers and researchers. Rather than relying on massive parameter counts typical of many cutting-edge models, LongCat-Image achieves strong photorealism, stable structure, and accurate bilingual (Chinese and English) text rendering with a more compact ~6-billion parameter architecture, making it competitive with much larger alternatives despite its relatively lean design. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    TimeTracker

    TimeTracker

    Open-source and free to self-host

    TimeTracker by DRYTRIX is a comprehensive self-hosted time tracking and project management application that helps individuals, teams, and small businesses take control of their productivity and billing. Built on a modern web stack (Python/Flask backend with a responsive frontend), TimeTracker enables users to track hours, manage projects and clients, visualize data with interactive charts, and even generate professional invoices directly from tracked time. It embraces privacy and flexibility...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Android Use

    Android Use

    Automate native Android apps with AI using accessibility APIs

    android-action-kernel is an open source Python library designed to let AI agents control and automate native Android applications running on real devices or emulators. It fills a gap in automation tooling by focusing on mobile-first workflows where traditional browser or desktop-based automation doesn’t work; such as logistics, gig work, field operations, and other industries reliant on phones or tablets. The project works by using Android’s accessibility API to extract structured UI state...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    TorchAudio

    TorchAudio

    Data manipulation and transformation for audio signal processing

    The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. The benefits of PyTorch can be seen in torchaudio through having all the computations be through PyTorch...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    MLJAR Studio

    MLJAR Studio

    Python package for AutoML on Tabular Data with Feature Engineering

    We are working on new way for visual programming. We developed a desktop application called MLJAR Studio. It is a notebook-based development environment with interactive code recipes and a managed Python environment. All running locally on your machine. We are waiting for your feedback. The mljar-supervised is an Automated Machine Learning Python package that works with tabular data.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    MolmoWeb

    MolmoWeb

    Open multimodal web agent built by Ai2

    ...Unlike traditional automation tools that rely on structured HTML parsing or predefined APIs, MolmoWeb operates directly from screenshots of web pages, interpreting visual content in the same way a human user would. This approach allows it to generalize across different websites without requiring site-specific integrations, making it highly adaptable to diverse web environments.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    Nexent

    Nexent

    Zero-code platform for building AI agents from natural language input

    Nexent is an open source platform designed to enable users to create intelligent agents using natural language instead of traditional programming or visual orchestration tools. It focuses on a zero-code approach, allowing users to define workflows and agent behavior purely through language prompts, significantly lowering the barrier to entry for AI development. Built on the MCP ecosystem, Nexent integrates a wide range of tools, models, and data sources into a unified environment for agent creation and execution. ...
    Downloads: 1 This Week
    Last Update:
    See Project
Auth0 Logo