Showing 6 open source projects for "screen recorder for windows"

View related business solutions
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 1
    Self-Operating Computer

    Self-Operating Computer

    A framework to enable multimodal models to operate a computer

    The Self-Operating Computer Framework is an innovative system that enables multimodal models to autonomously operate a computer by interpreting the screen and executing mouse and keyboard actions to achieve specified objectives. This framework is compatible with various multimodal models and currently integrates with GPT-4o, o1, Gemini Pro Vision, Claude 3, and LLaVa. Notably, it was the first known project to implement a multimodal model capable of viewing and controlling a computer screen.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 2
    Open-AutoGLM

    Open-AutoGLM

    An open phone agent model & framework

    Open-AutoGLM is an open-source framework and model designed to empower autonomous mobile intelligent assistants by enabling AI agents to understand and interact with phone screens in a multimodal manner, blending vision and language capability to control real devices. It aims to create an “AI phone agent” that can perceive on-screen content, reason about user goals, and execute sequences of taps, swipes, and text input via automated device control interfaces like ADB, enabling hands-off...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 3
    OmniParser

    OmniParser

    A simple screen parsing tool towards pure vision based GUI agent

    OmniParser is a comprehensive method for parsing user interface screenshots into structured elements, significantly enhancing the ability of multimodal models like GPT-4 to generate actions accurately grounded in corresponding regions of the interface. It reliably identifies interactable icons within user interfaces and understands the semantics of various elements in a screenshot, associating intended actions with the correct screen regions. To achieve this, OmniParser curates an...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    OAGI Python SDK

    OAGI Python SDK

    Python SDK for the Computer Use model Lux, developed by OpenAGI

    OAGI Python SDK is a Python client library for the Lux computer-use model that turns Lux into a programmable automation layer for operating human-facing software via vision and actions. It exposes the OAGI API in an ergonomic way, letting you trigger Lux in three main modes: Tasker for precise scripted sequences, Actor for fast one-shot tasks, and Thinker for open-ended, multi-step objectives. The SDK is designed around “computer use” as a paradigm, where the AI actually navigates...
    Downloads: 5 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    MAI-UI

    MAI-UI

    Real-World Centric Foundation GUI Agents

    MAI-UI is a cutting-edge open-source project that implements a family of foundation GUI (Graphical User Interface) agent models capable of interpreting natural language and performing real-world GUI navigation and control tasks across mobile and desktop environments. Developed by Tongyi-MAI (Alibaba’s research initiative), the MAI-UI models are multimodal agents trained to understand user instructions and corresponding screenshots, grounding those instructions to on-screen elements and...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Open Exchange (OpEx)

    Open Exchange (OpEx)

    The open source Algorithmic Trading System

    OpEx is an application suite that includes the main building blocks of commercial electronic trading systems. All OpEx applications run on distributed system architectures.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB