Showing 71 open source projects for "gui builder python"

  • 1
    AskUI Vision Agent

    Enable AI to control your desktop, mobile and HMI devices

    ...The repository presents a feature overview, sample media, and frequent release notes, which show ongoing improvements such as CORS checks and other operational tweaks. The broader AskUI documentation covers the Python Vision Agent along with suite services and inference APIs, indicating a productized ecosystem rather than a single library. Community-curated lists also recognize Vision Agent as part of the broader “GUI agents” landscape, placing it among other computer-use agents.
    Downloads: 9 This Week
  • 2
    npcpy

    The AI toolkit for the AI developer

    npcpy is a Python-based agent framework and command-line toolkit (the NPC Shell) for developers to build, test, and integrate AI agents into their workflows, including both command-line and GUI interfaces via NPC Studio. Welcome to npcpy, the core library of the NPC Toolkit that supercharges natural language processing pipelines and agent tooling. npcpy is a flexible framework for building state-of-the-art applications and conducting novel research with LLMs.
    Downloads: 1 This Week
  • 3
    AICodeBot

    AI-powered tool for developers, simplifying coding tasks

    AICodeBot is a terminal-based coding assistant designed to make your coding life easier. Think of it as your AI version of a pair programmer. It can perform code reviews, create helpful commit messages, debug problems, and help you think through building new features: a team member that accelerates the pace of development and helps you write better code. We plan to build out multiple interfaces for interacting with AICodeBot. To start, it's a command-line tool that you can install...
    Downloads: 7 This Week
  • 4
    Agent S

    Agent S: an open agentic framework that uses computers like a human

    Agent S is an open-source agentic framework designed to enable autonomous computer use through an Agent-Computer Interface (ACI). Built to operate graphical user interfaces like a human, it allows AI agents to perceive screens, reason about tasks, and execute actions across macOS, Windows, and Linux systems. The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. Agent S combines...
    Downloads: 11 This Week
  • 5
    GLM-V

    GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning

    GLM-V is an open-source vision-language model (VLM) series from ZhipuAI that extends the GLM foundation models into multimodal reasoning and perception. The repository provides both GLM-4.5V and GLM-4.1V models, designed to advance beyond basic perception toward higher-level reasoning, long-context understanding, and agent-based applications. GLM-4.5V builds on the flagship GLM-4.5-Air foundation (106B parameters, 12B active), achieving state-of-the-art results on 42 benchmarks across image,...
    Downloads: 0 This Week
  • 6
    UFO³

    Weaving the Digital Agent Galaxy

    UFO is an open-source framework developed by Microsoft for building intelligent agents that automate interactions with graphical user interfaces on the Windows operating system. The system allows users to issue natural language instructions that are translated into automated actions across multiple desktop applications. Using a dual-agent architecture, the framework analyzes both visual interface elements and system control structures in order to understand how applications should be...
    Downloads: 2 This Week
  • 7
    OmniParser

    A simple screen parsing tool towards pure vision based GUI agent

    OmniParser is a comprehensive method for parsing user interface screenshots into structured elements, significantly enhancing the ability of multimodal models like GPT-4 to generate actions accurately grounded in corresponding regions of the interface. It reliably identifies interactable icons within user interfaces and understands the semantics of various elements in a screenshot, associating intended actions with the correct screen regions. To achieve this, OmniParser curates an...
    Downloads: 2 This Week
  • 8
    GLM-4.5V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.5V is the preceding iteration in the GLM-V series that laid much of the groundwork for general multimodal reasoning and vision-language understanding. It embodies the design philosophy of mixing visual and textual modalities into a unified model capable of general-purpose reasoning, content understanding, and generation, while already supporting a wide variety of tasks: from image captioning and visual question answering to content recognition, GUI-based agents, video understanding,...
    Downloads: 0 This Week
  • 9
    Meta Agents Research Environments (ARE)

    Meta Agents Research Environments is a comprehensive platform

    Meta Agents Research Environments (ARE) is a simulation and benchmarking platform designed to evaluate AI agents on dynamic, evolving, multi-step tasks. Unlike static benchmarks, ARE supports environments where agents must adapt to changes over time and reason over sequences of actions while interacting with applications and facing uncertainty. The included Gaia2 benchmark offers 800 scenarios across multiple “universes” and can test reasoning, memory, tool use, and adaptability....
    Downloads: 0 This Week
  • 10
    StreamSpeech

    StreamSpeech is a seamless model for offline speech recognition

    StreamSpeech is an “all-in-one” speech model designed to perform offline and simultaneous speech recognition, speech translation, and speech synthesis within a single unified architecture. Developed as part of an ACL 2024 paper, it targets streaming and low-latency scenarios where intermediate results and final translations or synthetic speech must be produced continuously as audio is being received. The model supports eight tasks: offline ASR, speech-to-text translation, speech-to-speech...
    Downloads: 1 This Week
  • 11
    AnyTool

    AnyTool: Universal Tool-Use Layer for AI Agents

    AnyTool is an open-source universal tool-use layer for AI agents that addresses the critical problem of how autonomous agents reliably interact with external tools and environments. Rather than having each agent handle tool invocation logic on its own, AnyTool provides a standardized interface and orchestrator that intelligently selects and manages tools, reduces context overhead, and improves execution reliability across diverse capabilities like web APIs, local commands, and GUI...
    Downloads: 0 This Week
  • 12
    ollama_manager_gui

    A graphical manager for ollama that can manage your LLMs

    This app helps you install ollama and LLMs through its GUI. It checks for ollama at launch and, if ollama is not found, takes you to the ollama site to download it. The app has been heavily upgraded and now also works properly on Linux. It now has progress bars and many other improvements. It can launch an LLM when you click its link, and it can launch multiple LLMs in separate windows. It can also remove an installed LLM. There is a confirmation...
    Downloads: 7 This Week
  • 13
    GLM-4.1V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.1V, often described as a smaller, lighter member of the GLM-V family, offers a more resource-efficient option for users who want multimodal capabilities without large compute resources. Though smaller in scale, GLM-4.1V remains competitive for a model of its size: on a number of multimodal reasoning and vision-language tasks it outperforms some much larger models from other families. It represents a...
    Downloads: 0 This Week
  • 14
    GLM-4.6V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...
    Downloads: 0 This Week
  • 15
    Deface GUI - Face Anonymization Tool

    Graphical User Interface Face Anonymization Tool

    This application is a professional tool with a graphical user interface for anonymizing faces using the Deface engine. It is cross-platform (Linux and Windows). NOTE: To use it on Windows, first install Python. Then, if necessary, run “pip install deface”.
    Downloads: 10 This Week
  • 16
    MuJoCo MPC

    Real-time behaviour synthesis with MuJoCo, using Predictive Control

    ...In addition to its C++ core, MJPC includes an experimental Python API, enabling integration with custom models and MuJoCo tasks for flexible scripting and experimentation.
    Downloads: 2 This Week
  • 17
    CodinIT.dev

    Free, local, open-source AI app builder

    CodinIT.dev is a free, local, open source AI app builder that lets you go from idea to full-stack application entirely on your machine, no coding required, just chat with AI. You can build unlimited apps with real-time previews, instant undo, and responsive, frictionless workflows. Deep Supabase integration means you can create UI and backend logic in one cohesive environment, while the model-agnostic architecture lets you connect to any AI, whether cloud-based (Gemini 3 Pro, GPT-5,...
    Downloads: 71 This Week
  • 18
    Whisper Batch Transcriber

    Unlimited, private and free Speech-To-Text program

    ... Notes:
    - It's 2 GB in size and also requires 2-6 GB of GPU VRAM (basically, you need at least a mid-range gaming PC to use it).
    - It's fairly slow to start (10 min) and to transcribe; this is normal behavior.
    - It includes a Python installer that installs Python on your computer, so you can run the 'whisper_transcriber.py' file directly by double-clicking it, as you would an .exe. (I did this because compiling to an exe made it slower.)
    - I made it as easy as possible for a layperson to use, so despite its crude looks, it's as good as a GUI application experience. ...
    Downloads: 14 This Week
  • 19
    RetroScheme- Get Your Retrosynthesis

    RetroScheme is used for molecule sketching and retrosynthesis

    RetroScheme is designed to help chemists identify potential starting materials through retrosynthetic analysis. The app is a GUI wrapper for the AiZynthFinder library from AstraZeneca. It is coupled with a molecular sketching tool for drawing your compound. It was made to be easy to use and can be used without limit to assist in potential new drug synthesis.
    Downloads: 2 This Week
  • 20
    Warlock-Studio

    AI Suite for upscaling, interpolating & restoring images/videos

    v6.0. Warlock-Studio is a Windows application that uses the Real-ESRGAN, BSRGAN, IRCNN, GFPGAN, RealESRNet, RealESRAnime and RIFE artificial intelligence models to upscale, restore faces, interpolate frames and reduce noise in images and videos. The application supports GPU acceleration (including multi-GPU setups) and offers batch processing for large workloads. It includes drag-and-drop handling for single or multiple files, optional pre-resize functions, and an automatic tiling system...
    Downloads: 19 This Week
  • 21
    QuizSolver

    AI-powered quiz solver for Windows. Free to use, easy to set up.

    QuizSolver is a free Windows app that uses AI vision to automatically read and answer quiz questions on your screen. It takes a screenshot, detects the answer buttons, sends the question to an AI model, and clicks the correct answer in seconds. Built-in support for Quizalize and Quipper. A Custom mode is available for other quiz sites, though results may vary.

    HOW TO SET UP:
    1. Download and unzip QuizSolver (no installation needed)
    2. Get a free API key from Groq at...
    Downloads: 0 This Week
  • 22
    CC2.TV / CC2 - Audio- und TV-Datenbank

    Meta-database application for the audio and TV broadcasts of CC2.TV

    This program provides a meta-database application for the audio and video broadcasts of CC2.TV on GNU/Linux systems. It lets you search, manage, and play the extensive content of the CC2.TV audiocast and videocast. The goal is to make the more than 3000 audiocast topics and more than 1000 videocast topics, which focus on computing, technology, and social issues, conveniently accessible. For full functionality,...
    Downloads: 1 This Week
  • 23
    Liveliness and Face Identification

    Leading free and open-source liveliness check & face recognition system

    ...Essentially, it is an application that can be used as a standalone server or deployed in the cloud. You don’t need prior machine learning skills to set it up and use it. The application has a customizable, mobile-friendly React-based UI and a Python-based backend. The program is a real-time face detection application. It lets you detect faces using your webcam and displays the video feed with an oval drawn around the detected faces. When you run the program, a GUI window appears for the liveliness check and face detection. The description guides you to adjust the settings and click the "Start" button to begin face detection. ...
    Downloads: 1 This Week
  • 24
    DragGAN

    Official Code for DragGAN (SIGGRAPH 2023)

    DragGAN is a research-driven image editing system that enables precise manipulation of GAN-generated images through interactive point dragging. The project introduces a novel workflow where users move specific points in an image and the model intelligently deforms the content while preserving realism. Built on top of StyleGAN architectures, the tool operates directly on the learned generative manifold to maintain photorealistic consistency. It combines feature-based motion supervision with a...
    Downloads: 1 This Week
  • 25
    Lyrebird

    Simple and powerful voice changer for Linux, written with Python & GTK

    Simple and powerful voice changer for Linux, written with Python & GTK.
    Downloads: 53 This Week