Showing 559 open source projects for "visual-mingw"

  • 1
    Agent Sprite Forge

    Agent skill for generating 2D sprite sheets and maps as transparent PNGs

    ...The system supports multi-frame sprite generation, animation sequencing, and transparent background rendering for easier integration into game engines. Its architecture is designed around automation and repeatability, enabling developers to generate large batches of visual assets through structured prompt workflows. Overall, agent-sprite-forge acts as an AI-assisted creative tool for accelerating 2D game art production and experimentation.
    Downloads: 1 This Week
  • 2
    MolmoWeb

    Open multimodal web agent built by Ai2

    ...Unlike traditional automation tools that rely on structured HTML parsing or predefined APIs, MolmoWeb operates directly from screenshots of web pages, interpreting visual content in the same way a human user would. This approach allows it to generalize across different websites without requiring site-specific integrations, making it highly adaptable to diverse web environments.
    Downloads: 1 This Week
  • 3
    Nexent

    Zero-code platform for building AI agents from natural language input

    Nexent is an open source platform designed to enable users to create intelligent agents using natural language instead of traditional programming or visual orchestration tools. It focuses on a zero-code approach, allowing users to define workflows and agent behavior purely through language prompts, significantly lowering the barrier to entry for AI development. Built on the MCP ecosystem, Nexent integrates a wide range of tools, models, and data sources into a unified environment for agent creation and execution. ...
    Downloads: 1 This Week
  • 4
    Unstract

    No-code LLM Platform to launch APIs and ETL Pipelines

    Unstract is a powerful open-source, no-code platform built to automate the extraction and structuring of unstructured documents using large language models and flexible workflows, enabling developers and data teams to turn messy files into organized JSON content without complex coding. It integrates a visual Prompt Studio environment where users can iteratively design extraction schemas, compare outputs from different models, and monitor costs and accuracy side by side, making it easier to refine prompts and extraction logic before deploying at scale. Unstract supports deploying structured extraction as REST API endpoints or embedding it into data engineering ETL pipelines, which allows it to plug directly into data warehouses, cloud storage, or downstream analytics systems. ...
    Downloads: 1 This Week
  • 5
    City Map Poster Generator

    Transform your favorite cities into beautiful, minimalist designs

    maptoposter is a code-driven poster generator that turns any city into a minimalist, print-style map artwork with consistent typography and themed color palettes. It is built around a simple command-line flow where you pass a city and country, and the tool fetches the relevant map geometry and renders it into a clean composition that looks like a design product rather than a raw GIS export. The repository includes a library of predefined themes that change the overall look (for example,...
    Downloads: 1 This Week
  • 6
    Paper2Slides

    From Paper to Presentation in One Click

    ...It is designed to replace the repetitive work of turning dense technical documents into presentation-friendly structure by extracting key points, figures, and data into a coherent visual narrative. The system supports multiple input formats, so you can process PDFs and common office documents rather than being locked to a single file type. It uses an extraction approach intended to capture critical insights comprehensively, including important visuals and data points that often get missed in naive summarization. ...
    Downloads: 1 This Week
  • 7
    Pycorrector

    Pycorrector is a toolkit for text error correction

    Pycorrector is a Python toolkit for Chinese text error correction. It focuses on common error types such as similar-sounding characters, visually similar characters, grammar issues, proper noun errors, missing words, extra words, wrong words, and word-order problems. The project implements multiple correction approaches, including KenLM, ConvSeq2Seq, BERT, MacBERT, ELECTRA, ERNIE, GPT-style models, and newer Qwen-based correction models. It is designed for use cases such as input method...
    Downloads: 0 This Week
  • 8
    OpenSwarm

    Claude Code for everything except coding

    ...The included agents can handle research, data analysis, slide decks, documents, images, videos, scheduling, messaging, and other productivity tasks. It is designed for outputs like pitch decks, market research, SEO content, quarterly reports, launch campaigns, visual assets, and multimedia projects. The project can connect to external services through integrations and can be customized into purpose-specific swarms for areas such as SEO, sales, marketing, finance, customer support, or research. Its main appeal is giving technical users a forkable, terminal-based framework for building agent teams that produce polished business and creative deliverables.
    Downloads: 0 This Week
  • 9
    Viral-Clips-Crew

    Your CrewAI Powered Video Editing Assistant

    ...The project focuses on content repurposing, helping users adapt long videos into formats suitable for platforms like TikTok and YouTube Shorts. Its modular design allows customization of each processing stage, including selection logic and visual formatting. Overall, it serves as a tool for automating short-form content creation.
    Downloads: 0 This Week
  • 10
    AutoCrop-Vertical

    Smart video converter using YOLOv8 and FFmpeg

    ...It uses computer vision techniques and AI models such as YOLOv8 to analyze each frame, detect subjects, and dynamically adjust cropping decisions. Instead of applying a static center crop, the system intelligently tracks people or key objects to preserve visual focus and composition. When cropping would degrade the scene, it can switch to alternative layouts such as letterboxing to maintain context. The tool integrates FFmpeg for encoding and rendering, ensuring efficient processing and compatibility with standard video workflows. It supports multiple output aspect ratios and quality settings, allowing customization for different platforms. ...
    Downloads: 0 This Week
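
    The subject-tracking crop with a letterbox fallback described for AutoCrop-Vertical can be illustrated with a minimal sketch. This is not the project's code: the function name `plan_vertical_crop`, its parameters, and the rule "letterbox when the subject is wider than the crop window" are assumptions made for illustration, and the subject box is assumed to come from some detector.

```python
def plan_vertical_crop(frame_w, frame_h, box_left, box_right, aspect=9 / 16):
    """Hypothetical helper: choose a horizontal crop window for one frame.

    Returns ("crop", left, width) when the subject fits inside a window of
    the target aspect ratio, or ("letterbox", 0, frame_w) when it does not.
    """
    # Width of a full-height window with the target aspect ratio.
    crop_w = min(frame_w, int(frame_h * aspect))

    # Assumed fallback rule: a subject wider than the window would be cut
    # off, so keep the whole frame and pad it instead.
    if box_right - box_left > crop_w:
        return ("letterbox", 0, frame_w)

    # Center the window on the subject, then clamp it to the frame edges.
    center = (box_left + box_right) / 2
    left = int(center - crop_w / 2)
    left = max(0, min(left, frame_w - crop_w))
    return ("crop", left, crop_w)


# A 1920x1080 frame with a subject near the right edge.
print(plan_vertical_crop(1920, 1080, 1800, 1900))
```

    A real pipeline would also smooth the window position across frames to avoid jitter; that step is omitted here.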
  • 11
    alive-progress

    A new kind of Progress Bar, with real-time throughput, ETA

    alive-progress is an advanced Python progress bar library that introduces a highly animated and adaptive approach to tracking long-running tasks. Unlike traditional static progress indicators, it dynamically adjusts spinner speed and visual feedback based on actual throughput, giving users a more intuitive sense of activity. The library is designed with performance efficiency in mind, using multithreaded updates that minimize CPU overhead and terminal noise. It includes sophisticated ETA estimation powered by exponential smoothing algorithms, improving prediction accuracy for variable workloads. ...
    Downloads: 0 This Week
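
    As a rough illustration of ETA estimation by exponential smoothing, the sketch below smooths the observed throughput and divides the remaining work by it. It does not use or reproduce the alive-progress API: the class `SmoothedEta`, its method names, and the smoothing factor `alpha` are hypothetical.

```python
class SmoothedEta:
    """Hypothetical helper: ETA from an exponentially smoothed throughput."""

    def __init__(self, total, alpha=0.3):
        self.total = total    # total number of items to process
        self.alpha = alpha    # weight given to the newest rate sample
        self.done = 0         # items processed so far
        self.rate = None      # smoothed items per second

    def update(self, items, seconds):
        """Record `items` processed in `seconds`; return the ETA in seconds."""
        if seconds <= 0:
            raise ValueError("seconds must be positive")
        self.done += items
        sample = items / seconds
        if self.rate is None:
            # First sample: nothing to smooth against yet.
            self.rate = sample
        else:
            # Exponential smoothing: blend the new sample with the history.
            self.rate = self.alpha * sample + (1 - self.alpha) * self.rate
        remaining = max(self.total - self.done, 0)
        return remaining / self.rate if self.rate > 0 else float("inf")


eta = SmoothedEta(total=100, alpha=0.5)
print(eta.update(10, 1.0))  # 10 items/s so far, 90 items left
print(eta.update(10, 2.0))  # a slower batch pulls the smoothed rate down
```

    A larger `alpha` reacts faster to changes in throughput, while a smaller one gives a steadier estimate.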
  • 12
    Python Progressbar

    Progressbar 2 - A progress bar for Python 2 and Python 3

    A text progress bar is typically used to display the progress of a long-running operation, providing a visual cue that processing is underway. The progressbar is based on the old Python progressbar package that was published on the now-defunct Google Code. Since that project was completely abandoned by its developer and the developer did not respond to my email, I decided to fork the package. This package is still backward compatible with the original progressbar package so you can safely use it as a drop-in replacement for existing projects. ...
    Downloads: 0 This Week
  • 13
    sparkmagic

    Jupyter magics and kernels for working with remote Spark clusters

    ...Sparkmagic interacts with remote Spark clusters through a REST server. It automatically visualizes SQL queries in the PySpark, Spark, and SparkR kernels, with a visual interface for constructing visualizations interactively without code. It can capture the output of SQL queries as pandas DataFrames for use with other Python libraries (e.g. matplotlib), send local files or DataFrames to a remote cluster (e.g. sending a pretrained local ML model straight to the Spark cluster), and authenticate to Livy via Basic Access authentication or Kerberos.
    Downloads: 0 This Week
  • 14
    ydata-profiling

    Create HTML profiling reports from pandas DataFrame objects

    ydata-profiling's primary goal is to provide a one-line Exploratory Data Analysis (EDA) experience that is consistent and fast. Like the handy pandas df.describe() function, ydata-profiling delivers an extended analysis of a DataFrame and allows the analysis to be exported in formats such as HTML and JSON.
    Downloads: 0 This Week
  • 15
    MiniMax-01

    Large-language-model & vision-language-model based on Linear Attention

    MiniMax-01 is the official repository for two flagship models: MiniMax-Text-01, a long-context language model, and MiniMax-VL-01, a vision-language model built on top of it. MiniMax-Text-01 uses a hybrid attention architecture that blends Lightning Attention, standard softmax attention, and Mixture-of-Experts (MoE) routing to achieve both high throughput and long-context reasoning. It has 456 billion total parameters with 45.9 billion activated per token and is trained with advanced parallel...
    Downloads: 2 This Week
  • 16
    InfiniteYou

    Flexible Photo Recrafting While Preserving Your Identity

    ...Using an architecture built around diffusion transformers (DiTs), InfiniteYou introduces a component called InfuseNet that injects identity features derived from reference images into the generation process via residual connections, so that the output closely matches the person's identity without sacrificing visual quality or text-image alignment. The team uses a multi-stage training strategy with synthetic multi-sample data per identity to fine-tune for both identity consistency and aesthetic quality. Compared to prior methods, InfiniteYou significantly improves identity similarity, text-prompt adherence, and overall image quality, and avoids common problems such as face copy-paste artifacts.
    Downloads: 1 This Week
  • 17
    shot-scraper

    A command-line utility for taking automated screenshots of websites

    shot-scraper is a command-line utility for taking automated screenshots of web pages using a headless browser engine. After installation, a single command can capture a full-page screenshot of a URL and save it to a file, making it ideal for documentation, monitoring, and visual regression tasks. Under the hood it uses a modern browser (installed via a one-time shot-scraper install step) and exposes options for viewport size, full-page versus clipped screenshots, and device emulation. Beyond simple captures, it can run custom JavaScript before taking the shot, allowing you to open menus, scroll, or manipulate the DOM so the screenshot reflects the desired state. ...
    Downloads: 1 This Week
  • 18
    GPTImage2Skill

    GPT Image 2 prompt gallery, image prompt library, agentic skill

    ...It provides reusable image prompts across creative, technical, academic, interface, design, photography, typography, gaming, anime, map, tattoo, and reference-editing use cases. The project is designed to help agents and users produce stronger visual outputs without starting from a blank prompt every time. Its gallery is organized into category files so an agent can load only the relevant prompt references instead of overwhelming the context window. It also includes installation paths for skill-capable environments such as Claude Code, Codex, OpenClaw, and other agent runtimes. ...
    Downloads: 0 This Week
  • 19
    Videomass

    Videomass is a free, open source and cross-platform GUI for FFmpeg

    Videomass is a free, open-source graphical interface for FFmpeg designed to make advanced video and audio processing accessible to both beginners and experienced users. Built in Python using wxPython, it provides a cross-platform environment for managing encoding, conversion, and editing tasks through a visual interface. The software supports multitasking operations, allowing users to process multiple media files simultaneously. It offers extensive configuration options while also providing presets to simplify common workflows. Videomass integrates closely with FFmpeg, exposing powerful capabilities such as transcoding, filtering, and format conversion without requiring command-line interaction. ...
    Downloads: 0 This Week
  • 20
    dots.ocr

    Multilingual Document Layout Parsing in a Single Vision-Language Model

    ...It achieves state-of-the-art performance on document parsing benchmarks while maintaining a relatively compact model size, demonstrating efficiency without sacrificing accuracy. Beyond standard OCR tasks, it extends its capabilities to parse complex visual elements such as charts, diagrams, and web interfaces, converting them into structured outputs like SVG code.
    Downloads: 0 This Week
  • 21
    SimpleHTR

    Handwritten Text Recognition (HTR) system implemented with TensorFlow

    ...The project focuses on converting images of handwritten text into machine-readable digital text using neural networks. The system uses a combination of convolutional neural networks and recurrent neural networks to extract visual features and model sequential character patterns in handwriting. It also employs connectionist temporal classification (CTC) to align predicted character sequences with input images without requiring character-level segmentation. The repository provides code for training models, performing inference on handwritten text images, and evaluating recognition accuracy. ...
    Downloads: 0 This Week
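
    To show why CTC needs no character-level segmentation, here is a minimal sketch of greedy (best-path) CTC decoding: consecutive repeated labels are merged and the blank label is dropped. It covers only the decoding rule, not the CTC loss, and is not taken from the SimpleHTR repository; the function name, the `alphabet` layout, and the blank index are assumptions.

```python
from itertools import groupby


def ctc_greedy_decode(frame_labels, alphabet, blank=0):
    """Hypothetical helper: turn per-frame best labels into text.

    `frame_labels` holds the most likely label index for each time step;
    `alphabet[i]` is the character for index i, with `blank` reserved.
    """
    # Step 1: merge runs of the same label emitted on consecutive frames.
    collapsed = [label for label, _ in groupby(frame_labels)]
    # Step 2: drop blanks, which only separate genuinely repeated characters.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Index 0 is the blank; "-" is just a placeholder for it in the alphabet.
alphabet = "-helo"
frames = [1, 1, 0, 2, 3, 3, 0, 3, 4, 4]
print(ctc_greedy_decode(frames, alphabet))  # → hello
```

    The blank between the two runs of `3` is what lets the double "l" survive the merge step.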
  • 22
    SteadyDancer

    Harmonized and Coherent Human Image Animation

    ...The system can be used both in preprocessing pipelines for content creators and in live feedback loops for performers, giving dancers and videographers a tool to refine their visual outputs. It supports integration with standard video formats and includes customizable parameters so users can tune stabilization aggressiveness.
    Downloads: 0 This Week
  • 23
    Context Engineering

    A frontier, first-principles handbook

    ...It takes inspiration from thought leaders like Andrej Karpathy and bridges theory with practical examples, offering structured guidance on context orchestration, memory, retrieval, and state control within AI workflows. With extensive materials drawn from research, surveys, and visual explanations, the project acts as both a learning resource and a reference for practitioners looking to improve model behavior by engineering richer inputs.
    Downloads: 0 This Week
  • 24
    Grounded-Segment-Anything

    Marrying Grounding DINO with Segment Anything & Stable Diffusion

    Grounded-Segment-Anything is a research-oriented project that combines powerful open-set object detection with pixel-level segmentation and subsequent creative workflows, effectively enabling detection, segmentation, and high-level vision tasks guided by free-form text prompts. The core idea behind the project is to pair Grounding DINO — a zero-shot object detector that can locate objects described by natural language — with Segment Anything Model (SAM), which can produce detailed masks for...
    Downloads: 0 This Week
  • 25
    Wan Move

    Motion-controllable Video Generation via Latent Trajectory Guidance

    Wan Move is an open-source research codebase for motion-controllable video generation that focuses on enabling fine-grained control of motion within generative video models. It is designed to guide the temporal evolution of visual content by leveraging latent trajectory guidance, allowing users to manipulate how objects move over time without modifying the underlying generative architecture. By representing motion information as dense point trajectories and integrating them into the latent space of an image-to-video model, the project produces videos with more precise and controllable motion behavior than many existing methods. ...
    Downloads: 0 This Week