Showing 13 open source projects for "graphical user interface"

  • 1
    Open Interface

    Control Any Computer Using LLMs

    Open Interface is a cross-platform application that lets users control their computers with large language models (LLMs). It sends each user request to an LLM backend, which determines the necessary steps, and then executes those steps by simulating keyboard and mouse input. The system can adjust its actions based on real-time feedback, providing a self-driving computer experience.
    Downloads: 0 This Week
  • 2
    Hermes Agent

    The agent that grows with you

    ...The agent interfaces with messaging platforms like Telegram, Discord, Slack, and WhatsApp through a single gateway process, and also offers an interactive terminal user interface with history, autocomplete, and streamable tool output. It supports scheduled automation in natural language, allowing users to set up recurring tasks such as daily briefings or system audits that it runs unattended.
    Downloads: 55 This Week
  • 3
    OmniParser

    A simple screen parsing tool towards pure vision based GUI agent

    OmniParser is a comprehensive method for parsing user interface screenshots into structured elements, significantly enhancing the ability of multimodal models like GPT-4 to generate actions accurately grounded in corresponding regions of the interface. It reliably identifies interactable icons within user interfaces and understands the semantics of various elements in a screenshot, associating intended actions with the correct screen regions.
    Downloads: 0 This Week
  • 4
    Magentic UI

    A research prototype of a human-centered web agent

    Magentic-UI is a research prototype developed by Microsoft that serves as a human-centered interface powered by a multi-agent system. It enables users to automate complex web tasks, such as browsing, form filling, and data analysis, while maintaining control over the process. The system emphasizes transparency and user involvement, making it suitable for tasks requiring both automation and human oversight.
    Downloads: 0 This Week
  • 5
    MAI-UI

    Real-World Centric Foundation GUI Agents

    MAI-UI is a cutting-edge open-source project that implements a family of foundation GUI (Graphical User Interface) agent models capable of interpreting natural language and performing real-world GUI navigation and control tasks across mobile and desktop environments. Developed by Tongyi-MAI (Alibaba’s research initiative), the MAI-UI models are multimodal agents trained to understand user instructions and corresponding screenshots, grounding those instructions to on-screen elements and generating sequences of GUI actions such as taps, swipes, text input, and system commands. ...
    Downloads: 0 This Week
  • 6
    Agent S

    Agent S: an open agentic framework that uses computers like a human

    Agent S is an open-source agentic framework designed to enable autonomous computer use through an Agent-Computer Interface (ACI). Built to operate graphical user interfaces like a human, it allows AI agents to perceive screens, reason about tasks, and execute actions across macOS, Windows, and Linux systems. The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. ...
    Downloads: 4 This Week
  • 7
    Gemini Fullstack LangGraph Quickstart

    Get started w/ building Fullstack Agents using Gemini 2.5 & LangGraph

    ...The project features a React (Vite) frontend and a LangGraph/FastAPI backend designed to work together seamlessly for real-time research and reasoning tasks. The backend agent dynamically generates search queries based on user input, retrieves information via the Google Search API, and performs reflective reasoning to identify knowledge gaps. It then iteratively refines its search until it produces a comprehensive, well-cited answer synthesized by the Gemini model. The repository provides both a browser-based chat interface and a command-line script (cli_research.py) for executing research queries directly. ...
    Downloads: 2 This Week
  • 8
    OpenAdapt

    Open Source Generative Process Automation

    OpenAdapt is the open source software adapter between Large Multimodal Models (LMMs) and traditional desktop and web Graphical User Interfaces (GUIs). OpenAdapt learns to automate your desktop and web workflows by observing your demonstrations. Spend less time on repetitive tasks and more on work that truly matters. Boost team productivity in HR operations. Automate candidate sourcing using LinkedIn Recruiter, LinkedIn Talent Solutions, GetProspect, Reply.io, outreach.io, Gmail/Outlook, and more. ...
    Downloads: 1 This Week
  • 9
    GELab-Zero

    GUI Exploration Lab. One of the best GUI agent solutions

    GELab-Zero is an open-source “GUI Agent” framework that aims to automate interactions with graphical user interfaces (GUIs). It combines the agent model with all supporting infrastructure (inference, input orchestration, and GUI automation logic) in a plug-and-play package that runs locally, without cloud dependencies. The idea is to let developers or users harness an AI agent that can click, type, read UI elements, and interact with apps in a human-like way via the GUI, which can enable tasks like automated testing, scriptable workflows, or even autonomous usage of GUI-based applications. ...
    Downloads: 0 This Week
  • 10
    AgentPilot

    A versatile workflow automation platform to create AI workflows

    AgentPilot is a versatile workflow automation platform designed to help users create, organize, and execute AI-driven workflows. It supports everything from simple tasks using a single large language model (LLM) to complex multi-step processes. The platform features a user-friendly interface that allows for real-time interaction with workflows, and it supports flexible configurations, including branching workflows and customizable user interfaces. Users can also schedule tasks based on natural language time expressions and integrate various tools to enhance their workflows.
    Downloads: 7 This Week
  • 11
    OWL

    Optimized Workforce Learning for General Multi-Agent Assistance

    OWL (Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation) is an advanced framework designed to enhance multi-agent collaboration, improving task automation across various domains. By utilizing dynamic agent interactions, OWL aims to streamline and optimize complex workflows, making AI collaboration more natural, efficient, and adaptable. It is built on...
    Downloads: 0 This Week
  • 12
    SuperAGI

    A dev-first open source autonomous AI agent framework

    SuperAGI is an open-source autonomous AI framework that lets you develop and deploy useful autonomous agents quickly and reliably. Join a community of developers constantly contributing to make SuperAGI better. Access your agents through a graphical user interface. Interact with agents by giving them input, permissions, etc. Agents typically learn and improve their performance over time with feedback loops. Run multiple agents simultaneously to improve efficiency and productivity. Connect to multiple vector DBs to enhance your agent’s performance. Each agent is unique, so you can use different models of your choice. ...
    Downloads: 0 This Week
  • 13
    AI-Agent-Host

    The AI Agent Host is a module-based development environment.

    The AI Agent Host integrates several advanced technologies and offers a unique combination of features for the development of language model-driven applications. The AI Agent Host is a module-based environment designed to facilitate rapid experimentation and testing. It includes a docker-compose configuration with QuestDB, Grafana, Code-Server and Nginx. The AI Agent Host provides a seamless interface for managing and querying data, visualizing results, and coding in real-time. The AI...
    Downloads: 0 This Week