Page Agent is an open-source, in-page AI agent framework that lets developers and users control web interfaces with natural language, directly in the browser. Unlike traditional browser automation tools, it runs entirely as in-page JavaScript, so the core agent needs no browser extension, headless browser, or external automation environment. Users issue text commands that the agent translates into DOM operations, so multi-step workflows such as form filling, navigation, and UI interaction can be driven by plain natural language instructions. Page Agent is designed to be embedded in existing web applications, which makes it possible to add an AI copilot to a SaaS platform without major backend changes. It takes a bring-your-own-LLM approach: developers connect their preferred language model to power the agent's reasoning.
Features
- In-page JavaScript agent with no need for headless browsers or extensions
- Natural language control for interacting with web interfaces and DOM elements
- Bring-your-own-LLM support for customizable AI reasoning
- Text-based DOM manipulation without reliance on screenshots or vision models
- Human-in-the-loop interface for monitoring and guiding actions
- Optional multi-page automation support via browser extension
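
To make the text-based, bring-your-own-LLM idea more concrete, here is a minimal TypeScript sketch of one agent step. It is not Page Agent's actual API: every name in it (`UiElement`, `Action`, `Llm`, `serializeElements`, `parseAction`, `applyAction`, `runStep`) is hypothetical, and plain objects stand in for real DOM nodes so the example runs outside a browser. It assumes the model is asked to reply with a single JSON action that addresses an element by its index.

```typescript
// Hypothetical sketch: none of these names come from Page Agent's real API.

// Minimal stand-in for an interactive DOM element.
// A real in-page agent would collect these from `document` instead.
interface UiElement {
  tag: string;      // e.g. "input" or "button"
  label: string;    // visible text or accessible name
  value: string;    // current value for inputs, "" otherwise
  clicked: boolean; // records whether a click action was applied
}

// An action the model may request, addressed by element index.
type Action =
  | { kind: "click"; index: number }
  | { kind: "type"; index: number; text: string };

// Bring-your-own-LLM: any function that maps a prompt to a reply string.
type Llm = (prompt: string) => Promise<string>;

// Describe the page as indexed text lines, so no screenshot is needed.
function serializeElements(elements: UiElement[]): string {
  return elements
    .map((el, i) => {
      const valuePart = el.value ? ` = "${el.value}"` : "";
      return `[${i}] <${el.tag}> ${el.label}${valuePart}`;
    })
    .join("\n");
}

// Parse and validate the model's JSON reply; return null if it is malformed
// or points at an element index that does not exist.
function parseAction(reply: string, elementCount: number): Action | null {
  let data: unknown;
  try {
    data = JSON.parse(reply);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const obj = data as Record<string, unknown>;
  const index = obj.index;
  if (typeof index !== "number" || !Number.isInteger(index)) return null;
  if (index < 0 || index >= elementCount) return null;
  if (obj.kind === "click") return { kind: "click", index };
  if (obj.kind === "type" && typeof obj.text === "string") {
    return { kind: "type", index, text: obj.text };
  }
  return null;
}

// Apply a validated action to the element model.
function applyAction(elements: UiElement[], action: Action): void {
  const target = elements[action.index];
  if (action.kind === "click") {
    target.clicked = true;
  } else {
    target.value = action.text;
  }
}

// One agent step: describe the page as text, ask the model, apply its action.
// Returns false when the reply could not be turned into a valid action.
async function runStep(
  instruction: string,
  elements: UiElement[],
  llm: Llm,
): Promise<boolean> {
  const prompt =
    `Task: ${instruction}\nElements:\n${serializeElements(elements)}\n` +
    `Reply with one JSON action.`;
  const action = parseAction(await llm(prompt), elements.length);
  if (action === null) return false;
  applyAction(elements, action);
  return true;
}
```

Because `Llm` is just a function type, any model client can be plugged in, and a stub that returns a fixed JSON string is enough for testing. Validating the reply before applying it also gives a natural place for a human-in-the-loop confirmation step.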