Showing 7 open source projects for "mobile web browser"

  • 1
    Browser Use

    Make websites accessible for AI agents

    Browser-Use is a framework that makes websites accessible for AI agents, enabling automated interactions and data extraction from web pages.
    Downloads: 5 This Week
  • 2
    Android Use

    Automate native Android apps with AI using accessibility APIs

    android-action-kernel is an open-source Python library that lets AI agents control and automate native Android applications running on real devices or emulators. It fills a gap in automation tooling by focusing on mobile-first workflows where traditional browser or desktop-based automation doesn’t work, such as logistics, gig work, field operations, and other industries reliant on phones or tablets. The project uses Android’s accessibility API to extract structured UI state (as XML) from the device. That state is fed to a large language model (LLM), such as one of OpenAI’s models, for decision-making, and the resulting actions are executed via the Android Debug Bridge (ADB). ...
    Downloads: 8 This Week
  • 3
    AskUI Vision Agent

    Enable AI to control your desktop, mobile and HMI devices

    AskUI’s Vision Agent is an automation framework that allows you—and AI agents—to control real desktops, mobile devices, and HMI systems by perceiving the UI and performing actions like clicking, typing, scrolling, and drag-and-drop. It is designed for multi-platform compatibility and supports multiple AI models so you can tailor perception and decision-making to your workload. The repository presents a feature overview, sample media, and frequent release notes, which show ongoing...
    Downloads: 3 This Week
  • 4
    CogAgent

    An open-source end-to-end VLM-based GUI agent

    CogAgent is a 9B-parameter bilingual vision-language GUI agent model based on GLM-4V-9B, trained with staged data curation, optimization, and strategy upgrades to improve perception, action prediction, and generalization across tasks. It focuses on operating real user interfaces from screenshots plus text, and follows a strict input–output format that returns structured actions, grounded operations, and optional sensitivity annotations. The model is designed for agent-style execution rather...
    Downloads: 2 This Week
  • 5
    MolmoWeb

    Open multimodal web agent built by Ai2

    MolmoWeb is an open-source multimodal web agent designed to autonomously navigate and interact with web browsers using vision-language models, representing a significant step toward fully agentic AI systems that can operate in real-world digital environments. The system takes natural language instructions and translates them into sequences of browser actions such as clicking, typing, scrolling, and navigating, effectively performing tasks on behalf of the user. ...
    Downloads: 0 This Week
  • 6
    CoPaw

    Your Personal AI Assistant; easy to install, deploy locally or in the cloud

    CoPaw is a personal AI assistant designed to run on your own machine or in the cloud, giving you full control over memory, models, and data. Built by the AgentScope team, it connects to multiple chat platforms—including DingTalk, Feishu, QQ, Discord, iMessage, and more—through a single unified assistant. CoPaw supports both cloud-based LLM providers and fully local models such as llama.cpp, MLX, and Ollama, allowing you to operate without API keys if preferred. It includes a browser-based...
    Downloads: 18 This Week
  • 7
    Gemini Fullstack LangGraph Quickstart

    Get started with building Fullstack Agents using Gemini 2.5 & LangGraph

    gemini-fullstack-langgraph-quickstart is a fullstack reference application from Google DeepMind’s Gemini team that demonstrates how to build a research-augmented conversational AI system using LangGraph and Google Gemini models. The project features a React (Vite) frontend and a LangGraph/FastAPI backend designed to work together seamlessly for real-time research and reasoning tasks. The backend agent dynamically generates search queries based on user input, retrieves information via the...
    Downloads: 4 This Week