Showing 5 open source projects for "screen"

View related business solutions
  • Retool your internal operations Icon
    Retool your internal operations

    Generate secure, production-grade apps that connect to your business data. Not just prototypes, but tools your team can actually deploy.

    Build internal software that meets enterprise security standards without waiting on engineering resources. Retool connects to your databases, APIs, and data sources while maintaining the permissions and controls you need. Create custom dashboards, admin tools, and workflows from natural language prompts—all deployed in your cloud with security baked in. Stop duct-taping operations together, start building in Retool.
    Build an app in Retool
  • Outgrown Windows Task Scheduler? Icon
    Outgrown Windows Task Scheduler?

    Free diagnostic identifies where your workflow is breaking down—with instant analysis of your scheduling environment.

    Windows Task Scheduler wasn't built for complex, cross-platform automation. Get a free diagnostic that shows exactly where things are failing and provides remediation recommendations. Interactive HTML report delivered in minutes.
    Download Free Tool
  • 1
    Self-Operating Computer

    Self-Operating Computer

    A framework to enable multimodal models to operate a computer

    The Self-Operating Computer Framework is an innovative system that enables multimodal models to autonomously operate a computer by interpreting the screen and executing mouse and keyboard actions to achieve specified objectives. This framework is compatible with various multimodal models and currently integrates with GPT-4o, o1, Gemini Pro Vision, Claude 3, and LLaVa. Notably, it was the first known project to implement a multimodal model capable of viewing and controlling a computer screen.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 2
    Open-AutoGLM

    Open-AutoGLM

    An open phone agent model & framework

    Open-AutoGLM is an open-source framework and model designed to empower autonomous mobile intelligent assistants by enabling AI agents to understand and interact with phone screens in a multimodal manner, blending vision and language capability to control real devices. It aims to create an “AI phone agent” that can perceive on-screen content, reason about user goals, and execute sequences of taps, swipes, and text input via automated device control interfaces like ADB, enabling hands-off completion of multi-step tasks such as navigating apps, filling forms, and more. Unlike traditional automation scripts that depend on brittle heuristics, Open-AutoGLM uses pretrained large language and vision-language models to interpret visual context and natural language instructions, giving the agent robust adaptability across apps and interfaces.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 3
    MAI-UI

    MAI-UI

    Real-World Centric Foundation GUI Agents

    ...Developed by Tongyi-MAI (Alibaba’s research initiative), the MAI-UI models are multimodal agents trained to understand user instructions and corresponding screenshots, grounding those instructions to on-screen elements and generating sequences of GUI actions such as taps, swipes, text input, and system commands. Unlike traditional UI frameworks, MAI-UI emphasizes realistic deployment by supporting agent–user interaction (clarifying ambiguous instructions), integration with external tool APIs using MCP calls, and a device–cloud collaboration mechanism that dynamically routes computation to on-device or cloud models based on task state and privacy constraints.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 4
    OAGI Python SDK

    OAGI Python SDK

    Python SDK for the Computer Use model Lux, developed by OpenAGI

    ...It exposes the OAGI API in an ergonomic way, letting you trigger Lux in three main modes: Tasker for precise scripted sequences, Actor for fast one-shot tasks, and Thinker for open-ended, multi-step objectives. The SDK is designed around “computer use” as a paradigm, where the AI actually navigates interfaces, clicks, types, scrolls, and reads the screen through screenshots instead of only calling APIs. It provides high-level asynchronous agents (like AsyncDefaultAgent and AsyncActor) that encapsulate the loop of capturing screenshots, sending them to Lux, interpreting responses, and executing UI actions with PyAutoGUI. Multiple installation flavors let you choose between a minimal oagi-core package or variants that bundle desktop automation and FastAPI/Socket.IO server capabilities.
    Downloads: 4 This Week
    Last Update:
    See Project
  • Atera all-in-one platform IT management software with AI agents Icon
    Atera all-in-one platform IT management software with AI agents

    Ideal for internal IT departments or managed service providers (MSPs)

    Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.
    Learn More
  • 5
    Robocode

    Robocode

    Robocode is a programming tank game for Java

    Robocode is a programming game, where the goal is to develop a robot battle tank to battle against other tanks with Java. The robot battles are running in real-time and on-screen. The motto of Robocode is: Build the best, destroy the rest!
    Leader badge
    Downloads: 595 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next