Showing 61 open source projects for "browser"

  • 1
    Browser Use

    Make websites accessible for AI agents

    Browser Use is an AI-powered browser automation framework designed to let agents interact with websites just like humans do. It enables developers and AI systems to perform complex online tasks such as form filling, data extraction, and navigation through natural language instructions. Built with Python and compatible with modern LLMs, it integrates seamlessly with tools like ChatBrowserUse, Google Gemini, and Anthropic models.
    Downloads: 6 This Week
  • 2
    Browser Harness

    Self-healing browser harness that enables LLMs to complete any task

    Browser Harness is a self-healing browser control system built to give language models direct and flexible access to a real Chrome browser through the Chrome DevTools Protocol. Its main philosophy is minimalism: instead of imposing a rigid framework, it exposes a very thin bridge so the agent can perform browser tasks with almost no abstraction in the way.
    Downloads: 3 This Week
  • 3
    Browser Use MCP Server

    Browse the web, directly from Cursor etc.

    A browser automation server implementing the Model Context Protocol, designed to allow AI assistants to browse the web directly from applications like Cursor. It supports natural language commands for web navigation and interaction.
    Downloads: 4 This Week
  • 4
    Applio

    A simple, high-quality voice conversion tool focused on ease of use

    Applio is a high-quality voice conversion toolkit designed to make modern RVC/VITS-based voice cloning accessible to non-experts. It focuses strongly on ease of use: installation scripts for Windows, Linux, and macOS set up dependencies and then launch a browser-based Gradio interface. Within that interface, users can train and run voice conversion models for tasks like singing conversion, speech-to-speech transformation, and voice cloning. The project is structured to be flexible through plugins and configurations so users can extend functionality without touching the core code. Applio is considered stable and mature; ongoing development is now centered on security patches, dependency maintenance, and occasional improvements, which makes it attractive for production or repeatable workflows. ...
    Downloads: 119 This Week
  • 5
    LaVague

    Framework for building AI agents that automate complex web tasks

    ...It can use browser automation tools such as Selenium or Playwright to interact with websites programmatically. Developers can integrate various language models and configure the agent’s reasoning and execution behavior to suit different automation scenarios.
    Downloads: 2 This Week
  • 6
    Index

    The SOTA Open-Source Browser Agent

    Index is an open-source browser automation agent designed to autonomously perform complex tasks across websites by transforming web interfaces into programmable APIs. The system enables developers to instruct an AI agent to interact with web pages using natural language rather than traditional automation scripts. Instead of writing detailed browser automation code, users can describe the desired task and allow the agent to interpret the page structure, interact with elements, and complete multi-step workflows automatically. ...
    Downloads: 0 This Week
  • 7
    Every Code

    Local AI coding agent CLI with multi-agent orchestration tools

    ...Every Code enhances the traditional coding assistant model by introducing multi-agent orchestration, allowing multiple AI agents to collaborate, compare solutions, and refine outputs in parallel. It supports integration with various AI providers, enabling users to route tasks across different models depending on their needs. Every Code also includes browser integration and automation capabilities, extending its usefulness beyond simple code generation into more complex development tasks. Customization is a key focus, with support for theming, configurable settings, and reasoning controls that allow developers to fine-tune how the agent behaves.
    Downloads: 21 This Week
  • 8
    Notte

    Open-source browser-using agents

    Notte is an open-source browser framework that enables the development and deployment of web-based AI agents. It introduces a perception layer that transforms web pages into structured, navigable maps described in natural language, allowing agents to interact with the internet more effectively. Notte is designed for building scalable and efficient browser-based AI applications.
    Downloads: 0 This Week
  • 9
    Preswald

    Python tool for browser-based interactive data apps in one file

    Preswald is an open source Python-based framework and static-site generator designed for building interactive data applications that run entirely in the browser. It packages application logic, data processing, and user interface components into a single self-contained output, enabling easy sharing and deployment without requiring local dependencies. Preswald leverages a WebAssembly runtime along with technologies like Pyodide and DuckDB to execute Python code directly in the browser environment. ...
    Downloads: 0 This Week
  • 10
    web-eval-agent MCP Server

    An MCP server that autonomously evaluates web applications

    ...Marketing and README material emphasize supercharging local debugging loops by combining live browser execution with LLM-driven hypotheses and fixes. Activity on the repo shows steady iteration, with issues and PRs centered on reliability and developer experience. In short, it wraps autonomous, in-editor web testing and diagnosis behind a predictable MCP interface.
    Downloads: 2 This Week
  • 11
    Khoj

    An AI personal assistant for your digital brain

    ...Khoj is a desktop application to search and chat with your notes, documents, and images. It is an offline-first, open-source AI personal assistant that is accessible from Emacs, Obsidian, or your web browser. Khoj is a thinking tool that is transparent, fun, and easy to engage with. You can build faster and better by using Khoj to search and reason across all your data sources. Khoj learns from your notes and documents to function as an extension of your brain, so that you can stay focused on doing what matters. Khoj started with the founding principle that a personal assistant should be understandable, accessible, and hackable. ...
    Downloads: 9 This Week
  • 12
    Whisper-WebUI

    A web UI for easy subtitle generation using the Whisper model

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools.
    Downloads: 3 This Week
  • 13
    AstronRPA

    Agent-ready RPA suite with a visual workflow automation engine

    ...It provides a visual workflow designer that supports low-code and no-code development, allowing users to create automation processes through a drag-and-drop interface instead of writing extensive code. It enables automation of common desktop software and browser-based tasks, making it suitable for repetitive business operations and system integrations. Astron RPA includes a large library of reusable components that handle tasks such as user interface operations, data processing, and system interactions, allowing workflows to be assembled from modular building blocks. Astron RPA also integrates with intelligent agent systems so that automated processes and AI-driven workflows can work together in broader automation scenarios.
    Downloads: 3 This Week
  • 14
    Skyvern

    Automate browser-based workflows with LLMs and Computer Vision

    Skyvern uses a combination of computer vision and AI to understand content on a webpage, making it adaptable to any website. Skyvern takes instructions in natural language, allowing it to execute complex objectives with simple commands. Skyvern is an API-first product. Workflows execute in the cloud, allowing it to run hundreds of workflows at the same time. Skyvern's AI decisions come with built-in explanations, providing clear summaries and justifications for every action. Support for...
    Downloads: 3 This Week
  • 15
    StableSwarmUI

    Multi-user UI for managing and running Stable Diffusion workflows

    ...It focuses on enabling multiple users to interact with shared resources, making it suitable for collaborative or server-based deployments. It provides a centralized system where users can submit, monitor, and manage generation tasks through a browser interface. It abstracts much of the complexity involved in running diffusion models by offering a structured environment for handling prompts, outputs, and processing queues. StableSwarmUI is built to work alongside backend systems that execute the actual image generation, allowing separation between user interaction and compute workloads. ...
    Downloads: 4 This Week
  • 16
    KeepChatGPT

    Browser userscript that enhances ChatGPT reliability and usability

    KeepChatGPT is an open source browser userscript designed to enhance the reliability, usability, and efficiency of the ChatGPT web interface. It runs through userscript managers and injects additional functionality directly into the page, allowing users to improve their workflow without requiring a backend service or separate application. It focuses on solving common problems experienced during AI conversations, such as session timeouts, network errors, message failures, and interruptions during long chats. ...
    Downloads: 2 This Week
  • 17
    Qwen-Agent

    Agent framework and applications built upon Qwen>=3.0

    ...It provides components for instruction following, tool usage (function calling), planning, memory, RAG (retrieval-augmented generation), a code interpreter, and more. It ships with example applications (Browser Assistant, Code Interpreter, Custom Assistant) and supports GUI front-ends, backends, and server setups. Agent workflows can maintain context and memory to perform multi-turn or more complex logic over time. It acts as the backend for Qwen Chat, among other use cases. A built-in Code Interpreter tool can execute code locally as part of agent workflows.
    Downloads: 5 This Week
  • 18
    OpenManus

    Open-source AI agent framework

    OpenManus is an open-source AI agent framework designed to autonomously execute complex, multi-step tasks by combining reasoning, planning, and tool use. It enables developers to build agents that can think, act, and iterate toward goals rather than simply responding to prompts. The platform emphasizes task decomposition, allowing agents to break down objectives into smaller steps and execute them sequentially or recursively. OpenManus supports integration with external tools, APIs, and...
    Downloads: 37 This Week
  • 19
    Integuru v0

    The first AI agent that builds permissionless integrations

    ...Instead of relying on official developer documentation or publicly available APIs, the system analyzes network traffic generated by user interactions within a web application. Developers capture browser requests and authentication data, which the agent then uses to infer the structure of the platform’s internal API endpoints. Based on this information, the system generates executable code that can replicate the original action programmatically. This approach allows developers to automate workflows and build integrations with services that do not provide official APIs or developer tools. ...
    Downloads: 1 This Week
  • 20
    Upsonic

    The most reliable AI agent framework that supports MCP

    Upsonic is a reliability-focused AI agent framework designed for real-world applications. It enables the development of trusted agent workflows within organizations by incorporating advanced reliability features, such as verification layers and output evaluation systems. The framework supports the Model Context Protocol (MCP), facilitating integration with various tools and enhancing agent capabilities.
    Downloads: 7 This Week
  • 21
    AIHawk

    AIHawk aims to ease the job hunt process by automating job applications

    AIHawk is an AGPL‑licensed AI agent focused on automating job applications. It scrapes job listings from corporate sites (or LinkedIn in forks) and uses LLMs to generate tailored applications, streamlining the process across multiple platforms—dubbed “revolutionary” by mainstream tech outlets.
    Downloads: 6 This Week
  • 22
    UI-TARS

    UI-TARS-desktop version that can operate on your local personal device

    UI-TARS is an open-source multimodal “GUI agent” created by ByteDance: a model designed to perceive raw screenshots (or rendered UI frames), reason about what needs to be done, and then perform real interactions with graphical user interfaces (GUIs) — like clicking, typing, navigating menus — across desktop, browser, mobile, or game environments. Rather than relying on rigid, manually scripted UI automation, UI-TARS uses a unified vision-language model (VLM) that integrates perception, reasoning, grounding, and action into one end-to-end framework: it “thinks before acting,” enabling flexible, general-purpose automation. This allows it to perform complex, multi-step tasks such as filling forms, downloading files, navigating applications, and even controlling in-game actions — all by understanding the UI as a human would. ...
    Downloads: 2 This Week
  • 23
    BrowserGym

    A Gym environment for web task automation

    BrowserGym is an open framework for web task automation research that exposes browser interaction as a Gym-style environment for training and evaluating agents. It is intended for researchers building web agents rather than for end users looking for a consumer automation product. The project provides a common environment where agents can interact with websites, execute tasks, and be evaluated against standardized benchmarks.
    Downloads: 0 This Week
  • 24
    Reader 3

    Quick illustration of how one can easily read books together with LLMs

    ...While it lacks advanced features like built-in annotations or rich media support, its simplicity is intentional, enabling users to quickly load EPUBs, view them in a browser, and even repurpose text for downstream tasks.
    Downloads: 0 This Week
  • 25
    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...
    Downloads: 16 This Week